Nvidia wants to own your AI data center from end to end

Nvidia presented an image of 40 rectangular server racks at its GTC event, implying that a single company could supply every layer of an AI data center. CEO Jensen Huang used the keynote to expand the company’s chip and system lineup, naming its Vera CPU and Rubin GPU families and unveiling a new LPX rack designed for ultra-fast inference.

The LPX will use silicon based on intellectual property Nvidia licensed from Groq for $20 billion, pairing the Groq 3 LPU with Rubin GPUs. The original Groq LPU includes 500 megabytes of on-chip SRAM that can hold model weights and the KV cache, reducing off-chip DRAM requests and cutting latency.
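To get a feel for why 500 MB of on-chip SRAM matters, here is a minimal back-of-the-envelope sketch of whether a model's weights and KV cache fit in that budget. The 500 MB figure comes from the article; the model dimensions, precision, and sequence length below are illustrative assumptions, not details Nvidia or Groq have published.

```python
# Back-of-the-envelope check: can a model's weights and KV cache fit in
# 500 MB of on-chip SRAM? All model parameters below are illustrative
# assumptions, not figures from Nvidia or Groq.

SRAM_BYTES = 500 * 1024**2  # 500 MB of on-chip SRAM per LPU (from the article)

def kv_cache_bytes(layers, kv_heads, head_dim, seq_len, batch, bytes_per_value=1):
    """Size of the KV cache: keys + values for every layer and token."""
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_value

def weight_bytes(params, bytes_per_param=1):
    """Size of the model weights at a given precision (1 byte = INT8/FP8)."""
    return params * bytes_per_param

# Hypothetical small model: 3B parameters, 32 layers, 8 KV heads of dim 128,
# serving one 8K-token sequence at 8-bit precision.
weights = weight_bytes(3_000_000_000)
kv = kv_cache_bytes(layers=32, kv_heads=8, head_dim=128, seq_len=8192, batch=1)

total = weights + kv
print(f"weights: {weights / 1e9:.2f} GB, kv cache: {kv / 1e6:.1f} MB")
print("fits in one LPU's SRAM" if total <= SRAM_BYTES
      else f"needs ~{total / SRAM_BYTES:.0f} LPUs' worth of SRAM")
```

Under these assumptions the weights alone exceed a single chip's SRAM, which is why this style of design typically shards a model across many LPUs so every weight and cache entry stays in fast on-chip memory rather than DRAM.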

“Things that took day-long queries are going to be produced in less than an hour,” Ian Buck, Nvidia’s head of hyperscale and HPC, told reporters. Nvidia highlighted efficiency gains from that local SRAM. TechInsights reported the LPU’s energy-per-bit for memory access is about one third of a picojoule—roughly 20 times lower than a GPU’s 6 picojoules to access DRAM.
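The picojoule figures cited above are easy to sanity-check with quick arithmetic; the exact ratio works out to about 18x, in line with the "roughly 20 times" claim. The sketch below uses only the numbers from the article, plus a purely hypothetical traffic volume to show what the gap means in joules.

```python
# Quick arithmetic on the energy-per-bit figures cited from TechInsights.
# The picojoule numbers come from the article; the traffic volume below
# is an illustrative assumption.

SRAM_PJ_PER_BIT = 1 / 3   # ~0.33 pJ per bit for LPU on-chip SRAM access
DRAM_PJ_PER_BIT = 6.0     # ~6 pJ per bit for GPU off-chip DRAM access

ratio = DRAM_PJ_PER_BIT / SRAM_PJ_PER_BIT
print(f"DRAM costs ~{ratio:.0f}x more energy per bit than on-chip SRAM")

# Example: reading 500 MB of weights once per generated token,
# over 1,000 tokens (hypothetical workload).
bits_moved = 500 * 1024**2 * 8 * 1_000
sram_joules = bits_moved * SRAM_PJ_PER_BIT * 1e-12
dram_joules = bits_moved * DRAM_PJ_PER_BIT * 1e-12
print(f"SRAM: {sram_joules:.1f} J vs DRAM: {dram_joules:.1f} J for the same traffic")
```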
