The AI Stack Explained

Haris Ansari

June 30, 2026

The AI Stack Explained: How the AI Economy Actually Works — and Where the Bottlenecks Are

Most people picture AI as one chatbot from one company. The reality is a twelve-layer industrial supply chain. This is a plain-language map of how it actually works, where the real bottlenecks are, and the one debate running underneath all of it — written to be useful whether or not you work in the industry.

Introduction

For most people, "AI" means a chat window. You type a question, an answer appears, and the company behind the window feels like the whole story. That picture isn't wrong so much as incomplete. Underneath every model response is a deep industrial supply chain: chips, memory, factories, advanced packaging, networking, optics, power, cooling, and the software that ties it all together.

Understanding that supply chain is the difference between hearing AI headlines and actually following what they mean. This paper is a plain-language map of that economy — the roughly twelve layers, the bottlenecks that keep migrating from one layer to the next, and the one debate now running through the whole thing.

It is written for two kinds of reader. The first is the person who keeps hearing about AI, chips, and data centers and wants a clear picture of how the pieces actually fit together — no engineering background required. The second is the person who works in or near the industry, knows their own corner of it well, and wants to see the whole stack laid out in one place. Either way, the goal is the same: to replace a vague sense that "AI is big" with a real map you can reason from.

Q: Why does AI look like one company but actually run on a dozen layers?

The popular mental model is a straight line: OpenAI builds ChatGPT, ChatGPT is AI, end of story. In reality, every model response travels through a stack of roughly a dozen industrial layers, each one solving a different bottleneck. Capital providers fund it. Accelerators compute it. Memory feeds it. Foundries fabricate the chips, and packaging assembles them. Networking and optics connect them into clusters. Servers turn the parts into deployable systems, and power and cooling keep those systems alive. At the top, foundation models turn all that infrastructure into intelligence, and applications turn that intelligence into products people pay for.

The dollars start with the capital providers: a hyperscaler's spending decision cascades through every infrastructure layer it buys from. The value, though, doesn't necessarily stick where the spending starts. The further down the stack you go, toward chips, power, and cooling, the closer you get to the picks and shovels that make the whole thing possible. The further up, toward models and applications, the closer you get to the end customer. Where value ultimately concentrates is the entire investing question, and it is not yet settled.

Table 1: The Twelve Layers of the AI Stack

Layer	What Happens Here	Example Companies (Illustrative)
1. Capital providers	Fund the entire build-out, spending tens of billions a year each on AI infrastructure.	Microsoft, Amazon, Google, Meta, Oracle
2. AI accelerators	The chips that do the math — the compute engine of the whole stack.	NVIDIA, AMD; in-house chips from Google, Amazon, Microsoft, Meta
3. HBM memory	High-bandwidth memory feeds data into the chip fast enough to keep it busy.	SK Hynix, Micron, Samsung
4. Foundries	Physically manufacture the chips a designer only draws up.	TSMC, Samsung, Intel
5. Advanced packaging	Assemble processor, memory, and interconnects into one working unit.	TSMC, ASE, Amkor
6. Networking	Let tens of thousands of chips behave like a single computer.	NVIDIA, Arista, Broadcom, Cisco
7. Optics	Move data as light when copper becomes too slow.	Coherent, Lumentum, Fabrinet
8. Servers & ODMs	Assemble the components into deployable systems.	Dell, HPE, Supermicro, Foxconn
9. Power	Deliver electricity to data centers now planned at hundreds of megawatts.	Eaton, Schneider Electric, GE Vernova, Siemens Energy
10. Cooling	Remove the heat — increasingly with liquid as racks outgrow air.	Vertiv, Modine
11. Foundation models	Turn infrastructure into intelligence. They consume everything below.	OpenAI, Anthropic, Google DeepMind, xAI, Meta
12. Applications	Turn intelligence into products people pay for.	GitHub Copilot, Cursor, Harvey, Salesforce

Company examples are illustrative only — included to show the kinds of businesses operating in each layer, not as recommendations. Many companies appear in more than one layer.

Q: Why were the hyperscalers the first beneficiaries?

When AI broke into the mainstream, the first names the market rewarded were the big cloud companies: Microsoft, Amazon, Google, Meta, and Oracle. The intuitive read is that they won because they sell AI services. The more structural read is that they sit at the head of the spending chain as the capital providers: they fund the build-out of every infrastructure layer they buy from.

The numbers are large enough to be hard to picture. A handful of hyperscalers are each spending tens of billions of dollars a year or more on AI infrastructure; Microsoft has publicly guided to roughly $145 billion of capital spending in its 2026 fiscal year alone. Those dollars cascade downward: into chip makers, memory suppliers, networking vendors, and the companies that power and cool the buildings. The hyperscalers were early beneficiaries not only because they sell the service to the end customer, but because they are the railroads paying to lay the track. When their capital spending rises, every layer that sells to them feels it. When it pauses, those layers feel that too, which is one reason the entire trade tends to move on hyperscaler capex signals.

Q: What is an AI accelerator, and why has NVIDIA been at the center?

At the heart of the stack is the accelerator — the chip that does the actual math. Underneath the marketing, AI models are giant matrix-multiplication machines, and the accelerator is the engine that runs those calculations. Without it there is no training and no inference. NVIDIA has been at the center of this layer because it sells that compute to everyone — every hyperscaler, enterprise, and government — rather than building only for itself.
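
To make "giant matrix-multiplication machines" concrete, here is a minimal sketch in plain Python of one neural-network layer: a matrix multiply followed by a simple cutoff at zero. The function names, sizes, and numbers are invented for illustration and are not taken from any real model. Real models perform the same operation on matrices holding billions of values, which is the work an accelerator is built to do quickly.

```python
def matmul(a, b):
    """Multiply matrix a (m x k) by matrix b (k x n), each given as a list of rows."""
    inner, cols = len(b), len(b[0])
    assert all(len(row) == inner for row in a), "inner dimensions must match"
    return [
        [sum(row[k] * b[k][j] for k in range(inner)) for j in range(cols)]
        for row in a
    ]


def relu(matrix):
    """Replace every negative value with zero (a common, simple nonlinearity)."""
    return [[max(0.0, value) for value in row] for row in matrix]


def layer(inputs, weights):
    """One illustrative layer: matrix multiply, then the nonlinearity."""
    return relu(matmul(inputs, weights))


def multiply_add_count(m, k, n):
    """Multiply-add operations needed for an (m x k) by (k x n) product."""
    return m * k * n


# Toy example: one input with 3 features, a layer with 2 outputs.
inputs = [[1.0, 2.0, 3.0]]
weights = [[0.5, -1.0], [0.25, 0.0], [-1.0, 2.0]]
outputs = layer(inputs, weights)
print(outputs)                               # [[0.0, 5.0]]
print(multiply_add_count(1, 3, 2))           # 6 multiply-adds for the toy layer
print(multiply_add_count(4096, 4096, 4096))  # tens of billions for one larger product
```

The toy layer needs six multiply-adds. A single product of two 4096-by-4096 matrices needs about 69 billion, and a model performs many such products for every response.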

But the reason NVIDIA has been hard to dislodge isn't really the hardware. It's CUDA, the software platform developers have built on for well over a decade. Frameworks were optimized for CUDA, tools were built around CUDA, and an entire generation of engineers learned to program GPUs through it. That accumulated ecosystem is the moat, and it is far stickier than any single chip. The cleanest way to think about it: evaluating NVIDIA purely as a chip company is like evaluating Apple purely as a phone company. The visible product is the hardware. The durable advantage is the software that everything else runs on. That is a statement about the structure of the stack, not a view on the stock.

Q: If NVIDIA is so dominant, what is the actual threat — and why is this the central debate?

The instinct is to assume NVIDIA's real competition is other merchant chip designers such as AMD. The more important pressure comes from its own biggest customers. Google (TPU), Amazon (Trainium and Inferentia), Microsoft (Maia), and Meta (MTIA) are all designing custom chips to run their own workloads, and several of those run on software paths that bypass CUDA entirely.

The logic is a genuine trade-off rather than a clear win for either side. A general-purpose GPU is flexible, and flexibility costs money. A hyperscaler running the same workload millions of times can build a specialized chip that is cheaper to operate — at the cost of that flexibility. So there are two plausible futures, and serious people disagree about which dominates.

The Merchant-GPU Path	The Custom-Silicon Path
Buyers keep purchasing flexible, general-purpose chips from a merchant designer.	Hyperscalers design their own chips for their own repeated workloads.
The software ecosystem (CUDA) keeps developers on one platform.	In-house software stacks bypass that ecosystem for internal use.
Compute dollars concentrate with the merchant designer.	Compute economics get internalized inside the largest buyers.
Flexibility and breadth win.	Cost-per-workload at massive scale wins.
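
The trade-off between the two paths can be sketched as a simple break-even calculation. This is a rough illustration under stated assumptions, not a model of any company's economics: every figure below is hypothetical, and the idea of a single one-time design cost is a simplification introduced here. Costs are expressed per million runs of a workload.

```python
def breakeven_million_runs(design_cost, merchant_cost_per_million, custom_cost_per_million):
    """Millions of runs needed before a custom chip repays its one-time design cost.

    Returns None when the custom chip is not cheaper per run, in which case
    it never pays back.
    """
    saving = merchant_cost_per_million - custom_cost_per_million
    if saving <= 0:
        return None
    return design_cost / saving


def cheaper_path(million_runs, design_cost, merchant_cost_per_million, custom_cost_per_million):
    """Compare total cost of the two paths at a given workload volume."""
    merchant_total = million_runs * merchant_cost_per_million
    custom_total = design_cost + million_runs * custom_cost_per_million
    return "custom" if custom_total < merchant_total else "merchant"


# Hypothetical figures, chosen only to make the arithmetic easy to follow.
DESIGN_COST = 500_000_000   # one-time cost to design the custom chip, in dollars
MERCHANT_COST = 10_000      # dollars per million runs on a general-purpose GPU
CUSTOM_COST = 6_000         # dollars per million runs on the custom chip

print(breakeven_million_runs(DESIGN_COST, MERCHANT_COST, CUSTOM_COST))  # 125000.0
print(cheaper_path(100_000, DESIGN_COST, MERCHANT_COST, CUSTOM_COST))   # merchant
print(cheaper_path(200_000, DESIGN_COST, MERCHANT_COST, CUSTOM_COST))   # custom
```

Under these made-up numbers the custom chip only wins past 125 billion runs. Below that volume, or if the workload changes before the design cost is repaid, the flexible chip is the cheaper choice, which is why volume and workload stability sit at the center of the debate.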

This is the single most important unresolved question in AI right now, and we do not pretend to know the answer. We watch it closely, because the resolution reshapes where value accrues across the entire stack — not just for chips, but for everyone who sells into them.

Q: Why did memory suddenly become a bottleneck?

For years the working assumption was that compute was the constraint — build a faster chip, get faster AI. Increasingly, the binding constraint isn't how fast the chip can calculate. It's how fast data can reach it. That is the job of high-bandwidth memory, or HBM, supplied by a short list of manufacturers — SK Hynix, Micron, and Samsung.

Every generation of AI hardware needs more accelerators and more memory per accelerator, so HBM demand has compounded faster than almost anything else in the stack. When a layer of the supply chain has only a few credible suppliers and demand that keeps climbing, that is the textbook definition of a bottleneck. And bottlenecks are where pricing power tends to sit — which is exactly why this layer has drawn the investment and attention it has.
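
A back-of-the-envelope sketch shows why data delivery, rather than raw calculation speed, can be the binding constraint. It assumes a step takes as long as the slower of two things: doing the math, or moving the data from memory. The chip speeds and model size below are hypothetical round numbers, and the rule of thumb of two operations per model parameter is a simplifying assumption.

```python
def step_times(flops, bytes_moved, peak_flops_per_s, mem_bytes_per_s):
    """Return (compute_seconds, memory_seconds) for one step of work."""
    compute_s = flops / peak_flops_per_s
    memory_s = bytes_moved / mem_bytes_per_s
    return compute_s, memory_s


def step_time(flops, bytes_moved, peak_flops_per_s, mem_bytes_per_s):
    """The step is limited by whichever is slower: the math or the data movement."""
    return max(step_times(flops, bytes_moved, peak_flops_per_s, mem_bytes_per_s))


def binding_constraint(flops, bytes_moved, peak_flops_per_s, mem_bytes_per_s):
    compute_s, memory_s = step_times(flops, bytes_moved, peak_flops_per_s, mem_bytes_per_s)
    return "memory" if memory_s > compute_s else "compute"


# Hypothetical workload: producing one word of output from a model with
# 70 billion parameters stored at 2 bytes each.
PARAMS = 70e9
BYTES_MOVED = PARAMS * 2   # every parameter is read from memory once
FLOPS = PARAMS * 2         # assumed: about two operations per parameter

# Hypothetical chip: 1e15 operations per second, 3e12 bytes per second of memory bandwidth.
PEAK = 1e15
BANDWIDTH = 3e12

base = step_time(FLOPS, BYTES_MOVED, PEAK, BANDWIDTH)
faster_chip = step_time(FLOPS, BYTES_MOVED, 2 * PEAK, BANDWIDTH)
faster_memory = step_time(FLOPS, BYTES_MOVED, PEAK, 2 * BANDWIDTH)

print(binding_constraint(FLOPS, BYTES_MOVED, PEAK, BANDWIDTH))  # memory
print(faster_chip == base)    # True: doubling compute speed changes nothing
print(faster_memory / base)   # about 0.5: doubling memory bandwidth halves the time
```

In this sketch the chip spends far longer waiting for data than calculating, so a faster processor alone buys nothing while faster memory helps directly.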

Q: Why does so much of this route through a handful of foundries and packaging houses?

Designers like NVIDIA don't manufacture their own chips. The physical chips are built by foundries (overwhelmingly TSMC, with Samsung and Intel also in the picture) and then assembled by advanced-packaging operations that combine the processor, the memory stacks, and the interconnects into one functioning unit.

Packaging is the least-understood part of the stack and one of the tightest constraints in it. Even when there are enough raw wafers, a shortage of packaging capacity (TSMC's CoWoS is the marquee example) can cap how many finished accelerators actually ship. If NVIDIA is the Ferrari brand, TSMC is the factory that builds every Ferrari, and packaging is the final assembly line that, when it backs up, leaves finished engines waiting for cars that cannot be completed. The concentration here is striking: an enormous share of the world's most advanced AI chips depends on a very small number of facilities.

Q: What about the parts nobody talks about — networking, optics, power, and cooling?

Most coverage stops at the chip. But a modern AI cluster isn't one chip, or even a thousand — it's tens of thousands of accelerators that have to behave like a single computer. The unglamorous layers that make that possible have quietly become bottlenecks in their own right.

Networking (high-end InfiniBand and Ethernet from a small group of vendors) and optics (converting electrical signals into light when copper gets too slow) determine whether all those chips actually work together or sit idle waiting on one another. Poor networking means lower utilization, slower training, and higher cost: picture fifty thousand workers with no way to coordinate. Power and cooling, meanwhile, have become hard physical limits. Individual data centers are now planned at hundreds of megawatts, and racks that once drew a few kilowatts can now draw a hundred kilowatts or more, which is why liquid cooling is steadily displacing air.
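
The power figures above translate into simple arithmetic. The sketch below estimates how many racks a facility can feed, assuming that some of the site's electricity goes to cooling and other overhead rather than to the computing equipment itself. The 300-megawatt facility, the 5-kilowatt older rack, and the overhead ratio of 1.25 are hypothetical values chosen for illustration.

```python
import math


def racks_supported(facility_mw, rack_kw, overhead_ratio=1.25):
    """Estimate how many racks a facility can power.

    overhead_ratio is total facility power divided by the power that reaches
    the computing equipment (assumed here to be 1.25). It cannot be below 1.
    """
    if overhead_ratio < 1.0:
        raise ValueError("overhead_ratio must be at least 1.0")
    if rack_kw <= 0:
        raise ValueError("rack_kw must be positive")
    equipment_kw = facility_mw * 1000 / overhead_ratio
    return math.floor(equipment_kw / rack_kw)


# Hypothetical 300 MW site.
print(racks_supported(300, 5))    # 48000 racks at a few kilowatts each
print(racks_supported(300, 100))  # 2400 racks at a hundred kilowatts each
```

The same building that once fed tens of thousands of low-power racks feeds only a few thousand modern ones, and each of those concentrates twenty times the heat in the same floor space. That concentration is what pushes operators from air toward liquid cooling.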

These layers have absorbed enormous investment for a simple reason: you cannot run a frontier model on chips you can't power, connect, or keep from overheating. And it points to the most important pattern in the whole stack — the bottleneck keeps migrating downward. The binding question has moved from "can we get the chips" toward "can we power and cool the building they live in." Where that constraint sits next is one of the things we pay the most attention to.

Q: Where does the money actually get made — the models or the applications?

Near the top of the value chain sit the foundation-model companies (OpenAI, Anthropic, Google DeepMind, xAI, Meta) and, one layer above them, the applications: coding assistants, healthcare scribes, customer-service tools, and the long tail of software being rebuilt around AI. The models consume everything below them, and the applications turn the intelligence into something people will actually pay for.

Here's the honest tension. Today, much of the visible profit sits lower in the stack, where infrastructure providers sell picks and shovels into a gold rush. The open question — and it is the one that decides the long-run winners — is whether durable, defensible value eventually concentrates up at the application layer, the way it did in prior technology cycles once the infrastructure got commoditized. "Where the spending is today" and "where the profit lands tomorrow" are different questions. The gap between them is precisely where investors form their opinions, and where they most often disagree.

Q: What's the single best mental model for all of this?

Twelve layers is a lot to hold in your head. The cleanest analogy is the oil industry — a business almost everyone already understands intuitively.

Oil Industry	AI Industry
Capital	Hyperscalers
Drilling equipment	AI accelerators
Fuel delivery	HBM memory
Refineries	Foundries
Pipelines	Networking & optics
Power grid	Power infrastructure
Processing plants	Data centers
Refined oil products	Foundation models
Consumer products	AI applications

The further down the stack you go, the closer you get to the picks, shovels, and infrastructure that make the whole economy possible. The further up, the closer you get to the end customer. Where you choose to stand in that stack is itself an investment decision — and most people have never consciously made it.

Q: How should a non-specialist make sense of all this?

You don't need to predict winners to understand the landscape — and predicting winners is genuinely hard, because the bottleneck keeps moving down the stack. The layer that mattered most last year may not be the one that matters most next year. So rather than asking "which AI company should I care about," the more durable habit is to ask where in the stack a given piece of news actually sits. A chip shortage, a data-center power deal, a memory supplier's results, and a splashy new chatbot are events at four different layers, involving four different sets of companies — and they don't all mean the same thing.

There's also a quieter point worth making, and it applies whether or not you work in tech. Most people already have some exposure to this stack without having chosen it. The largest names in it — the hyperscalers and chip makers — now make up an outsized share of the major stock indexes, so anyone holding a broad index fund owns a slice of this trade by default. For those who work in the industry, employer stock can stack a second, much larger layer of the same exposure on top.
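
The way exposure stacks up is easy to see with a little arithmetic. The sketch below is illustrative only: the dollar amounts and the 30 percent index share are hypothetical, not measurements of any real index or portfolio, and it assumes the employer is itself a company in the AI stack.

```python
def ai_stack_exposure(index_fund_value, index_ai_share, employer_stock_value=0.0,
                      other_assets_value=0.0):
    """Fraction of a portfolio tied to AI-stack companies.

    index_ai_share is the assumed fraction of the index fund held in
    AI-stack names. Employer stock is counted as fully exposed.
    """
    total = index_fund_value + employer_stock_value + other_assets_value
    if total <= 0:
        raise ValueError("portfolio value must be positive")
    ai_dollars = index_fund_value * index_ai_share + employer_stock_value
    return ai_dollars / total


# Hypothetical investor holding only a broad index fund.
index_only = ai_stack_exposure(400_000, 0.30)

# Same investor, plus employer stock from a company in the stack.
with_employer = ai_stack_exposure(400_000, 0.30, employer_stock_value=200_000)

print(round(index_only, 3))     # 0.3
print(round(with_employer, 3))  # 0.533
```

In this made-up case, adding employer stock lifts the exposure from under a third of the portfolio to more than half, without the investor ever having chosen an "AI" investment.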

None of that is a reason to do anything in particular. It's a reason to understand what you already have before deciding whether it reflects what you actually want. The map is useful precisely because it turns "AI" from a single headline into a set of distinct, understandable parts — and understanding usually beats reacting.

Q: What are the bottlenecks you're paying attention to?

People ask us which layer is the "winner." We don't frame it that way, because the honest answer is that the picture is still being written and the bottleneck is still moving. What we pay attention to instead is a short list of questions that we think actually matter.

Will the hyperscalers' custom silicon meaningfully erode the merchant-GPU position, or remain a complement to it? Which physical bottlenecks — memory, advanced packaging, power, cooling — stay tight enough to hold pricing power, and which get relieved as new supply comes online? Where does durable profit ultimately settle: in the infrastructure layers selling the picks and shovels, or up at the model and application layers closer to the customer? And, for any individual: how concentrated is your existing exposure to this stack, and does it match the risk you actually want to carry?

None of those questions has a clean answer yet. But they are the right questions, and asking them deliberately beats reacting to whichever layer happened to make headlines this week.

Want to Make Sense of Where AI Fits — for You?

Understanding the stack is interesting on its own. It also raises a practical question most people never stop to ask: where does AI already show up in what I own, and does that match what I actually want? For someone holding broad index funds, the answer is usually "more than you'd guess." For someone who also works in the industry, it can be a good deal more than that.

We offer an AI Landscape Review — a straightforward conversation, not a sales pitch. We walk through the stack as it applies to your situation and, if you'd find it useful, show you where your existing holdings intersect with it. We don't pick stocks and we don't predict which layer wins. The goal is simply to replace a vague sense of "AI is everywhere" with a clear picture you can make decisions from — including the decision to leave things exactly as they are.

To schedule an AI Landscape Review, contact Haris Ansari at hansari@pcrg.com or visit ascentwealthsolutions.com.

Securities and Investment Advisory Services are offered through Osaic Wealth, Inc., member FINRA/SIPC. Osaic Wealth is separately owned and other entities and/or marketing names, products or services referenced here are independent of Osaic Wealth. Osaic Wealth does not offer tax or legal advice. Company names and examples are illustrative only and do not constitute investment recommendations. All investing involves risk, including the possible loss of principal.