NVIDIA GTC 2026: Feynman's Silicon Photonics, Vera Rubin in Production, and the Agentic AI Pivot

Three Days Out, and the Picture Is Getting Sharper
Jensen Huang's Monday keynote at GTC 2026 is now less than 72 hours away, and the drip of pre-event leaks has turned into a steady stream. Over 30,000 attendees from 190 countries will descend on San Jose's convention center starting March 16, and Huang has promised to unveil "a few new chips the world has never seen before." That kind of line from most CEOs would be dismissed as showmanship. From the man whose company supplies the computational backbone of the AI revolution, it carries weight.
What has changed since the initial preview coverage is how much clearer the details have become. We now have hard numbers on Vera Rubin's production specs, concrete technical parameters for Feynman's silicon photonics, and a much better sense of how NVIDIA plans to redefine what AI infrastructure means. Let's break it down.
Vera Rubin: 336 Billion Transistors, Now Shipping
The Vera Rubin platform is no longer a slide deck; it has entered production, with partner availability expected in H2 2026. The numbers are staggering: 336 billion transistors packed into a single platform, built around custom "Olympus" Armv9 CPU cores designed entirely in-house, and paired with HBM4 memory for bandwidth that previous generations could not touch.
This is a meaningful departure from Blackwell. Where Blackwell relied on NVIDIA's Grace CPUs (themselves based on standard Arm Neoverse cores), Vera Rubin introduces Olympus, a ground-up custom CPU design that gives NVIDIA tighter integration between processor and accelerator. It is the same playbook Apple used when it moved from Intel to its own silicon: when you design both the CPU and the GPU, you can optimize the data path between them in ways that off-the-shelf components cannot match.
For the hyperscalers placing orders right now, the question is not whether Vera Rubin is impressive; it is how fast they can get it racked and running. Every quarter of delay in deploying next-gen hardware is a quarter where competitors might gain ground in training the next frontier model.
Feynman: Light Speed Computing Arrives in 2028
If Vera Rubin is the present, Feynman is the future NVIDIA wants everyone to start planning around. Built on TSMC's A16 process at 1.6 nanometers, Feynman will be the first NVIDIA chip to incorporate silicon photonics, a technology that replaces electrical interconnects with optical ones.
The numbers NVIDIA has shared are striking. Feynman's co-packaged optics will deliver 1.6 terabits per second of bandwidth at roughly 3.5 times lower energy per bit than today's pluggable optical modules. NVIDIA also claims the system will be 10 times more resilient to failures, which matters enormously when you are running clusters of tens of thousands of GPUs, where a single faulty interconnect can stall an entire training run.
Why does this matter? Because the dirty secret of modern AI training is that the GPUs themselves are not the bottleneck anymore; the interconnects are. Moving data between chips burns enormous power and introduces latency that limits how efficiently you can scale training across thousands of nodes. Silicon photonics attacks that problem at the physics level, replacing electrons with photons that move faster and generate less heat. If Feynman delivers on these specs, it changes the economics of AI infrastructure in fundamental ways.
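A quick back-of-envelope calculation shows why the energy claim matters at cluster scale. The 1.6 Tbps bandwidth and the 3.5x energy reduction are NVIDIA's figures; the picojoules-per-bit baseline for pluggable optics below is an illustrative assumption, not a published spec.

```python
# Back-of-envelope link power comparison. Only the 1.6 Tbps bandwidth
# and the 3.5x reduction come from NVIDIA's pre-GTC figures; the
# pJ/bit baseline is an assumed placeholder for illustration.

PLUGGABLE_PJ_PER_BIT = 15.0                    # assumed pluggable-module energy cost
CPO_PJ_PER_BIT = PLUGGABLE_PJ_PER_BIT / 3.5    # implied by the claimed 3.5x reduction
LINK_BANDWIDTH_TBPS = 1.6                      # NVIDIA's stated per-link bandwidth

def link_power_watts(pj_per_bit: float, tbps: float) -> float:
    """Power = energy per bit x bits per second."""
    joules_per_bit = pj_per_bit * 1e-12
    bits_per_second = tbps * 1e12
    return joules_per_bit * bits_per_second

pluggable_w = link_power_watts(PLUGGABLE_PJ_PER_BIT, LINK_BANDWIDTH_TBPS)
cpo_w = link_power_watts(CPO_PJ_PER_BIT, LINK_BANDWIDTH_TBPS)

print(f"Pluggable optics: {pluggable_w:.1f} W per 1.6 Tbps link")   # 24.0 W
print(f"Co-packaged optics: {cpo_w:.1f} W per 1.6 Tbps link")
# Multiplied across a hypothetical cluster with 100,000 such links:
print(f"Savings at 100k links: {(pluggable_w - cpo_w) * 100_000 / 1e6:.1f} MW")
```

Under these assumed inputs, a single link drops from 24 W to under 7 W, and the delta across a large cluster lands in the megawatt range, which is where the power and cooling bills actually live.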
The Cost Collapse Nobody Is Talking About Enough
Here is the number that should make every AI company sit up: agentic AI token costs on Feynman will drop to one-tenth of what they are on Blackwell. Let that sink in. Running an AI agent that currently costs you a dollar on Blackwell hardware would cost ten cents on Feynman.
The training side is equally dramatic. Mixture-of-Experts model training on Feynman will require only one-quarter the GPUs compared to current architectures. That is not a marginal improvement; it is a structural shift in who can afford to train large models. Today, only a handful of companies with billions in capital can train frontier models. If Feynman's efficiency gains hold, the barrier to entry drops substantially.
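The arithmetic behind those two ratios is worth making concrete. In the sketch below, only the 10x inference-cost reduction and 4x training-GPU reduction come from NVIDIA's pre-GTC figures; the dollar amounts and cluster size are hypothetical examples.

```python
# Projecting NVIDIA's claimed efficiency ratios onto hypothetical
# workloads. The 0.1 and 0.25 ratios are NVIDIA's claims; every other
# number here is an illustrative assumption.

TOKEN_COST_RATIO = 0.1      # claimed: Feynman agentic token cost vs Blackwell
TRAINING_GPU_RATIO = 0.25   # claimed: GPUs needed for MoE training vs current

def feynman_inference_cost(blackwell_cost_usd: float) -> float:
    """Project an agent workload's inference bill onto the claimed Feynman pricing."""
    return blackwell_cost_usd * TOKEN_COST_RATIO

def feynman_training_fleet(current_gpus: int) -> int:
    """Project the GPU count needed for the same MoE training run on Feynman."""
    return int(current_gpus * TRAINING_GPU_RATIO)

# A hypothetical agent deployment spending $2M/month on Blackwell:
print(feynman_inference_cost(2_000_000))   # -> 200000.0
# A hypothetical 16,384-GPU MoE training cluster:
print(feynman_training_fleet(16_384))      # -> 4096
```

The point of the exercise: a tenfold cut in inference cost compounds with a fourfold cut in training fleet size, which is why the claim, if it holds, reshapes who can afford to compete rather than merely trimming budgets.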
This cost trajectory is what makes NVIDIA's roadmap so consequential. It is not just about raw performance; it is about making AI capabilities accessible to a much wider range of organizations. The companies that plan their infrastructure strategy around these cost curves will have a significant advantage over those that do not.
NemoClaw and the Agentic AI Pivot
Beyond hardware, GTC 2026 is shaping up as the event where NVIDIA makes its clearest push into the software layer of AI. The centerpiece is NemoClaw, an open-source AI agent platform designed for enterprise deployment.
NemoClaw represents NVIDIA's bet that the next phase of AI is not about chatbots or image generators; it is about autonomous agents that can take actions, make decisions, and interact with real-world systems on behalf of businesses. Think of it as the operating system for AI agents, providing the tools and frameworks that enterprises need to deploy agents that can actually do things, not just talk about them.
This is backed by serious money. NVIDIA is investing up to $26 billion in open-weight AI models, a figure that positions the company as not just a hardware vendor but a full-stack AI platform provider. It is a strategic move to ensure that the software ecosystem running on NVIDIA hardware is rich enough that switching to a competitor's chips means giving up tools and frameworks you have built your workflow around.
Huang's "Five-Layer Cake" Framework
One of the more interesting strategic signals coming out of pre-GTC briefings is Huang's new conceptual framework for AI infrastructure. He describes AI as a "Five-Layer Cake": energy at the base, then chips, infrastructure, models, and applications at the top.
The framework is not just a presentation gimmick. It reflects how NVIDIA thinks about its addressable market. The company already dominates the chip layer. With Vera Rubin and Feynman, it is reinforcing that position. But by articulating the full stack from energy to applications, Huang is signaling that NVIDIA intends to play at every level, whether through its own products, partnerships, or the ecosystems it cultivates.
Huang put it plainly in a recent interview: AI is "no longer a single breakthrough or application; it is essential infrastructure." That framing is deliberate. Essential infrastructure gets funded differently than experimental technology. It gets regulated differently. It gets built at a different scale. And NVIDIA is positioning itself as the company that builds the infrastructure of the infrastructure.
GTC as Industry Barometer
The conference itself has become something of an economic indicator for the AI sector. With 30,000 attendees from 190 countries, the composition of the audience tells you where the money and attention are flowing. Huang has called GTC "the epicenter of the AI industrial era," and while that is self-serving, it is also not wrong.
This year's event matters more than most because the AI industry is at an inflection point. The easy gains from scaling up models are getting harder to come by. Inference costs, not training costs, are becoming the binding constraint for most deployments. And the entire sector is pivoting from "build the model" to "deploy the agent," a shift that requires fundamentally different infrastructure than what has been built over the past three years.
NVIDIA's ability to address that shift, not just with faster chips but with the software, networking, and cost economics that make agentic AI practical at scale, will determine whether the company maintains its grip on the industry or starts to see its dominance erode.
What to Watch on Monday
Jensen Huang takes the stage Monday, March 16 at 8 AM Pacific for what promises to be a three-hour keynote. Three things deserve particular attention.
First, the mystery chips. Huang has explicitly promised to reveal things "the world has never seen before." Given that Vera Rubin and Feynman are already known, the surprise likely involves either an extreme-performance variant, a new product category entirely, or a Feynman timeline acceleration that nobody expected.

Second, watch the Feynman photonics demonstration closely. If NVIDIA shows working silicon photonics prototypes rather than just slides, it confirms the technology is further along than most analysts assumed.

Third, pay attention to NemoClaw's architecture and partner ecosystem. The companies that show up on stage to announce NemoClaw integrations will signal where agentic AI adoption is heading first.
The AI industrial era that Huang keeps talking about is not some distant future. It is being built right now, one chip reveal at a time. Monday will tell us how fast it is arriving.