What the heck is a GPU, why does it need a fridge, and why should you care?
A plain-English guide to the technology powering the AI boom
Not investment advice. Just the clearest explanation we could write.
$575B+ flowing into this infrastructure in 2026. Here's where it goes.
Think of the entire AI industry as a restaurant.
AI apps are the meal. Without great hardware, energy, and data, the meal doesn't exist. You can't serve a Michelin dinner with a microwave and no electricity.
CPU: 8-16 powerful cores
GPU: 10,000+ simple cores
Teaching AI is like teaching a baby. You show it billions of examples.
The 3-step loop:
Show AI millions of cat photos. Say "this is a cat."
AI guesses, gets it wrong, adjusts its "weights."
Repeat BILLIONS of times until it gets it right.
Each weight adjustment = thousands of math operations. That's why GPUs matter.
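Here's that loop as a toy Python sketch, just to make it concrete. One pretend weight and made-up numbers; a real model does this with billions of weights at once.

```python
# Toy version of the training loop: guess, measure the error, nudge the weight.
# A real model has billions of weights; this one has a single weight `w`.

examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # (input, correct answer) pairs

w = 0.0              # the model's one "weight" -- it starts out knowing nothing
learning_rate = 0.01

for step in range(10_000):                 # real training: billions of steps
    for x, target in examples:
        guess = w * x                      # 1. the model guesses
        error = guess - target             # 2. how wrong was it?
        gradient = 2 * error * x           # slope of the squared error w.r.t. w
        w -= learning_rate * gradient      # 3. nudge the weight to be less wrong

print(f"learned weight: {w:.3f}")          # converges toward 2.0
```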
Training taught the AI. Inference is it using what it learned.
Training: Studying for years for an exam. Expensive, slow, done once (or rarely).
Inference: The exam itself. Every time ChatGPT answers you, that's inference. Happens billions of times/day.
Not all chips are created equal.
Memory isn't a processor. It's the storage component. Think of it as the workspace where data lives while the GPU works on it. More on this next.
Why your GPU is useless without the right memory.
The Problem:
A GPU can crunch data incredibly fast. But data has to travel from memory → GPU → back. If memory is slow, the GPU just... sits there waiting.
Like having the world's fastest chef but handing them ingredients one at a time through a tiny door. 🚪
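A rough back-of-envelope sketch of that waiting problem, with purely hypothetical numbers:

```python
# Back-of-envelope: is the chef (GPU) stuck waiting at the door (memory)?
# Every number below is a hypothetical round figure, for illustration only.

compute_ops_per_sec = 1000e12   # math operations the GPU could do per second
memory_bytes_per_sec = 3e12     # bytes the memory can hand over per second

# Suppose the workload only does ~2 math operations per byte it reads
# (roughly the shape of generating text one token at a time).
ops_per_byte = 2

ops_memory_can_feed = memory_bytes_per_sec * ops_per_byte
utilization = min(1.0, ops_memory_can_feed / compute_ops_per_sec)

print(f"GPU actually busy: {utilization:.1%}")   # ~0.6% -- the rest is waiting
```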
The Solution: HBM (High Bandwidth Memory)
Instead of memory sitting far away, HBM stacks memory directly on top of the GPU chip. Like giving the chef a massive prep station right next to the stove.
Why it's rare: HBM requires advanced "3D stacking" manufacturing. Very few companies can do it. SK Hynix dominates. Hence: shortage and high prices.
Where chips come from, and the hard part of connecting them.
The chip-making process:
CoWoS: TSMC's Secret Weapon
Chip on Wafer on Substrate: places the GPU die and HBM stacks side by side on a silicon interposer inside the same package, so data never has to leave the package to reach memory. Incredibly fast, incredibly hard to make.
Advanced packaging is like building a tiny city on a postage stamp where every building needs to connect perfectly. One mistake = the whole thing fails. CoWoS demand >> supply. This is a real bottleneck.
Why NVDA isn't just another chip company.
Architecture Evolution (The Relentless Roadmap):
NVIDIA sells complete systems, not just chips. DGX = GPU + networking + software + cooling all designed together. Buyers get "just turn it on and train."
CUDA + ML framework integration + DGX + enterprise support = lock-in. It's not just the chip. It's the entire ecosystem. Competitors sell chips; NVIDIA sells solutions.
Who's making the memory that makes AI possible.
🇰🇷 Samsung: The Giant That's Behind
Everyone can make regular DRAM. Almost nobody can stack HBM. It requires 3D stacking at 10-micron precision. SK Hynix commands ~50-55% of HBM revenue. Until Samsung and Micron catch up, HBM capacity = GPU supply = AI growth rate.
72 GPUs walk into a data center... and they need to talk to each other.
8 GPUs per server, each mostly independent. Like 8 chefs with their own kitchens.
72 GPUs in one system, all sharing work on the same AI task. Like 72 chefs in one giant kitchen.
What 72 GPUs need to share constantly:
If the connection between GPUs is slow, 71 GPUs sit idle waiting for data from GPU #1. The bottleneck stops being the GPU. It becomes the network.
It's like having 72 chefs in a kitchen. If they can't communicate, they make 72 different dishes instead of one great one.
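A toy calculation (hypothetical numbers again) of how quickly a slow link eats the cluster's useful work:

```python
# Toy model of one training step across a shared-work GPU cluster.
# Each step = do some math, then wait to swap results with the other GPUs.
# All numbers are hypothetical.

compute_ms = 10.0   # milliseconds of useful math per step on each GPU

for network_ms in (1.0, 10.0, 100.0):        # time spent waiting on the link
    total_ms = compute_ms + network_ms
    busy = compute_ms / total_ms
    print(f"network wait {network_ms:>5.0f} ms -> GPUs useful {busy:.0%} of the time")

# Fast links keep the GPUs ~91% busy; slow links drop them to ~9%.
```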
Why AI networking is ditching DSPs, and who benefits no matter which standard wins.
The New Optics Taxonomy (LPO / XPO / NPO / ACC):
NVDA Goes Vertical into Optics:
⚠️ Marvell (MRVL): Under Siege
The translator between copper and light. The unsung hero of AI networking.
A transceiver does two things:
📡 Transmit
Electrical signal → light → sends through fiber
📥 Receive
Light from fiber → electrical signal → feeds to chip
800G: The sweet spot now. Massive deployments in AI clusters. Upgrade from 400G underway.
1.6T: Next gen. Starts shipping 2026. Much higher ASPs. Each data center will need tens of thousands.
The company that owns AI networking without most people knowing.
AVGO's moat: they're inside everything
Merchant silicon: Off-the-shelf networking chips. Every switch/router in every data center.
Custom ASICs: "Give us your requirements, we'll design a chip exactly for you." Google's TPU networking? Broadcom. Meta's AI engine? Broadcom.
Computing as a utility. Like renting power from the grid instead of buying a generator.
Buying compute vs renting compute is like buying a generator vs using the power grid. The grid is cheaper, more reliable, and always available when you need more power.
Why Microsoft Wins:
The technology behind ChatGPT, explained in 60 seconds.
LLM = Large Language Model
The scale is what makes it magic:
The drive-through window of software.
An API is like a restaurant's drive-through. You don't need to go inside (understand the code). You just say what you want at the window, and you get it. That's it.
How AI APIs work:
You send a request:
"Write a haiku about NVIDIA"
The API runs it through GPT-4 on remote servers
Returns the result.
You pay per 1,000 tokens (~750 words)
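That whole exchange is only a few lines of code. This is a generic sketch, not any particular provider's real endpoint or pricing; the URL, model name, and key are placeholders.

```python
# A minimal sketch of calling an AI API over HTTP.
# The URL, model name, and key are placeholders, not any real provider's values.
import requests

response = requests.post(
    "https://api.example-ai-provider.com/v1/chat",         # placeholder endpoint
    headers={"Authorization": "Bearer YOUR_API_KEY"},       # placeholder key
    json={
        "model": "some-large-model",
        "messages": [{"role": "user", "content": "Write a haiku about NVIDIA"}],
    },
)

print(response.json())   # the reply text, plus a count of tokens used
# You're billed for the tokens in your request and in the model's answer.
```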
Why APIs matter for investment:
"AI on your data" and the shift from chatbot to coworker.
The Problem with Public AI:
The Solution:
AI Agents: The Next Phase
Real Agent Example (ServiceNow):
IT ticket: "My laptop is slow." → Agent diagnoses via device logs → checks if update is needed → schedules overnight update → tells user when done. Zero humans involved.
Why the money is moving from training to inference.
How API pricing actually works:
The margin structure:
Training was a one-time $100M check. Inference is a recurring revenue stream, like a toll booth. Every ChatGPT query = money. Every API call = money. And it scales with usage.
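The toll-booth math, sketched with invented numbers (not any real provider's pricing or traffic):

```python
# Hypothetical toll-booth math: what inference earns at scale.
# Every number below is invented for illustration.

queries_per_day = 1_000_000_000    # a ChatGPT-scale service
tokens_per_query = 1_000           # prompt + answer, roughly 750 words
price_per_1k_tokens = 0.002        # dollars charged per 1,000 tokens

daily_revenue = queries_per_day * (tokens_per_query / 1_000) * price_per_1k_tokens
print(f"${daily_revenue:,.0f} per day")          # $2,000,000 per day
print(f"${daily_revenue * 365:,.0f} per year")   # $730,000,000 per year
```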
The Inference Shift:
Your data center has a power problem.
To put that in perspective:
The 2030 Problem:
Some estimates suggest the US may need dozens of new power plants just for AI.
The fundamental problem: the US grid needs 2-3× expansion, but transmission lines take 5-10 years to build. Every AI data center is racing to find power now.
The Established Grid Giant (~$237B market cap)
The On-Site Power Disruptor (~$37B market cap)
The Nuclear Pure-Play (~$108B market cap)
NuScale (SMR) & Oklo (OKLO): Pre-revenue, longer-term
The Power Spectrum: From Safe to Speculative
GPUs run hot. Very hot. And they need a fridge.
Why heat is a problem:
Who benefits:
Vertiv (VRT) makes the precision cooling, power distribution units (PDUs), and liquid cooling infrastructure that every AI data center needs. Every rack in every hyperscaler is a potential Vertiv customer, and the shift to liquid cooling is a forced upgrade cycle. This is a multi-hundred-billion-dollar infrastructure opportunity.
What to own at each layer and why.
The Energy Layer: Four Players
Why some companies are almost impossible to displace.
The bull case is clear. Here's what could break it.
Where to go deeper on AI infrastructure.
Technical Deep Dives
YouTube & Video Explainers
Deep Tech Research
Built with love by Janel (with the help of Pip) · Not financial advice · Do your own research