Nvidia unveils Rubin supercomputing platform to cut LLM costs
Nvidia used a CES 2026 press conference to introduce Rubin, a new AI supercomputing platform the company says is designed to reduce the cost of training and running large language models. According to Nvidia, Rubin can deliver up to a 10x reduction in inference token costs and needs four times fewer GPUs to train mixture-of-experts models than the older Blackwell platform.
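Nvidia's headline figures can be illustrated with a quick back-of-envelope calculation. The 10x and 4x factors come from the announcement; the baseline dollar cost and cluster size below are purely hypothetical placeholders, not Nvidia numbers.

```python
# Back-of-envelope sketch of the claimed savings. The 10x token-cost and
# 4x GPU-count factors are from Nvidia's announcement; the baseline values
# here are invented for illustration only.

def scaled_token_cost(baseline_cost: float, reduction: float) -> float:
    """Scale a per-million-token inference cost by a claimed reduction factor."""
    return baseline_cost / reduction

def gpus_needed(baseline_gpus: int, reduction: int) -> int:
    """Scale a training cluster's GPU count by a claimed reduction factor."""
    return baseline_gpus // reduction

blackwell_cost = 2.00  # hypothetical $ per million tokens on Blackwell
print(f"Rubin token cost: ${scaled_token_cost(blackwell_cost, 10):.2f}/M tokens")

blackwell_gpus = 1024  # hypothetical MoE training cluster on Blackwell
print(f"Rubin GPUs needed: {gpus_needed(blackwell_gpus, 4)}")
```

Under these placeholder inputs the claimed factors would cut a $2.00-per-million-token workload to $0.20 and a 1,024-GPU training cluster to 256 GPUs.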
Nvidia said it built Rubin using an "extreme codesign" approach that integrates six chips: a central Nvidia Vera CPU with 88 custom Olympus cores and Armv9.2 compatibility, a Rubin GPU with a third-generation Transformer Engine capable of up to 50 petaflops of NVFP4 compute, an NVLink 6 Switch, ConnectX-9 SuperNICs, BlueField-4 DPUs, and a Spectrum-6 Ethernet switch.
Rubin will be offered in multiple configurations; Nvidia cited the Vera Rubin NVL72 as an example, which combines 36 Vera CPUs, 72 Rubin GPUs, an NVLink 6 switch, and multiple ConnectX-9 SuperNICs and BlueField-4 DPUs. Nvidia said the platform aims to make large-scale AI deployment more practical and to accelerate mainstream adoption of advanced models, particularly in the consumer space, by lowering hardware and token costs.
Nvidia said the first Rubin platforms will roll out to partners in the second half of 2026, with Amazon Web Services, Google Cloud and Microsoft among the initial partners.
Key Topics
Tech, Nvidia, Rubin Platform, Vera CPU, Rubin GPU, Amazon Web Services