Jin Daily AI Trivia – Ever wonder how Nvidia names their GPU generations?
If you’re shopping around to rent an Nvidia GPU for AI training, you’ve probably seen a confusing mix of names: 4090, L40S, V100, H200 and so on.
Ever wondered what those names actually mean and where they come from?
Here’s a quick trivia tour through Nvidia’s GPU architectures and naming.
Nvidia’s codename story really starts with the Riva TNT 2D/3D accelerator card, whose architecture was retroactively named Fahrenheit.
This kicked off a whole era of architectures named after scientists behind temperature scales: Celsius (GeForce 256 / GeForce 2), Kelvin (GeForce 3 / 4), Rankine (GeForce FX).
The theme then shifted to Curie (GeForce 6 / 7), named after Marie Curie, still in the more “classic GPU” era before modern CUDA-style compute.
The real pivot came with Tesla in the late 2000s. Tesla ditched the old fixed-function pipeline and brought in unified shaders, the programmable units we now casually call CUDA cores.
The Tesla architecture spanned a wide range of GeForce cards (GeForce 8 / 9 / 100 / 200 / 300 series) and, more importantly,
Nvidia’s first serious GPGPU lineup, branded as the Nvidia Tesla series. This is where “GPU for compute” started to become a proper product category, not just a hack by researchers.
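For a feel of what “GPU for compute” means in practice today, here’s a minimal sketch, assuming PyTorch and a CUDA-capable GPU (how old a generation still works depends on your PyTorch build):

```python
import torch

# "GPU for compute" in a nutshell: run an arbitrary elementwise op on
# thousands of CUDA cores instead of a handful of CPU cores.
x = torch.randn(10_000_000, device="cuda")  # tensor allocated in GPU VRAM
y = torch.sin(x) * x + 1.0                  # executed as CUDA-core kernels
print(y.sum().item())
```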
Next came Fermi (GeForce 400 / 500), then Kepler (GeForce 600 / 700), and then Maxwell (GeForce 900 series, plus the mobile 800M parts).
Nvidia also shipped datacenter GPGPU variants for these: Fermi (M2090 6 GB), Kepler (K40 12 GB), Maxwell (M40 24 GB).
These cards all used GDDR5, which by today’s AI standards means limited bandwidth and VRAM, so they’re pretty impractical for modern large-scale AI workloads.
Fun fact: AlexNet was trained on two GTX 580s, which are Fermi-generation cards. Deep learning’s breakout ImageNet moment literally ran on what was essentially “gaming GPU plus vibes”.
Then we entered the high-memory-bandwidth era with Pascal (the GTX 10 series).
Pascal landed in 2016 and introduced HBM2 on the P100 16 GB datacenter GPU, and those P100s are still usable for many AI workloads today despite being nearly a decade old.
They won’t compete with H100s, but for smaller models and classic DL work, they’re still workable.
After the AI boom picked up, Nvidia split the consumer and datacenter lines more aggressively.
The next architecture, Volta, was datacenter-only, with the flagship V100 32 GB. Volta was the first Nvidia GPU family with Tensor Cores, a hardware block specifically designed for deep learning acceleration.
Nvidia bundled these datacenter GPUs into the DGX-1, its first “AI supercomputer in a box”. OpenAI and Elon Musk famously received one of the earliest DGX-1 units from Jensen Huang; the original DGX-1 was Pascal-based and was later refreshed with Volta, jumping from around 170 TFLOPS to roughly 960 TFLOPS of deep learning performance.
With Tensor Cores in place, Nvidia could pair ray tracing with AI upscaling (DLSS), making real-time ray tracing practical, and that led to Turing (RTX 2000 / GTX 1600 series).
Turing added RT Cores for ray tracing and brought Tensor Cores into the consumer lineup, while some GDDR6-based Turing parts were turned into low-end datacenter GPUs like the T4 16 GB.
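Quick aside on what Tensor Cores actually buy you: a minimal PyTorch sketch, assuming a Volta-or-newer Nvidia GPU (the TF32 knob additionally needs Ampere or newer):

```python
import torch

# fp16/bf16 matmuls are dispatched to Tensor Core kernels automatically
# on Volta-or-newer GPUs; this is the hardware doing the heavy lifting
# in modern training runs.
a = torch.randn(4096, 4096, device="cuda", dtype=torch.float16)
b = torch.randn(4096, 4096, device="cuda", dtype=torch.float16)
c = a @ b

# On Ampere and newer, ordinary fp32 matmuls can also be routed through
# Tensor Cores via the TF32 format (slightly lower precision, big speedup).
torch.backends.cuda.matmul.allow_tf32 = True
```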
Nvidia then realised that maintaining completely different generations for consumer and datacenter made their software stack messy.
So they unified again.
In 2020, Ampere (RTX 3000 series) launched, and the A100 80 GB datacenter GPU quickly became the “cheap enough, big enough” workhorse for modern LLMs and AI training.
To this day, A100s are still everywhere because they hit a useful sweet spot of price, performance, and VRAM.
In 2022, ChatGPT arrived and shocked the world, kicking off the AI arms race. Nvidia responded by doubling down on a single historical figure: US Navy Rear Admiral Grace Hopper.
They launched two product lines named after her: the Grace CPU family and the Hopper GPU family. Grace CPUs exist to feed GPU clusters properly, providing more memory capacity and high-bandwidth NVLink-C2C connectivity that traditional x86 server CPUs lacked at the time.
Hopper is what the AI world is still running on today: H100 80/96 GB and H200 141 GB with HBM3e. Because of post-Covid supply constraints and insane datacenter demand, there was effectively no true consumer GeForce generation built on Hopper.
Nvidia tasted the full power of the AI money printer, and the next step was the slightly chaotic Ada Lovelace era (RTX 4000 series). Nvidia wanted every segment: consumer, workstation, and datacenter. So we ended up with three almost identical GPU dies sold at different price points: RTX 4090, RTX 6000 Ada, and the L40 / L40S 48 GB datacenter GPU, all fundamentally the same architecture with different fuses and memory configurations. Nvidia literally disables extra cores and memory lanes on the cheaper SKUs to create segmentation.
Ada Lovelace, however, does not have an HBM-based flagship datacenter GPU; it sticks with GDDR6/X for its main products.
As AI demand kept exploding, Nvidia announced Blackwell (RTX 5000 series on the consumer side) and finally delivered an HBM3e monster for datacenters: the B200 192 GB. On top of that, they introduced a full turnkey rack-scale AI system, the GB200 NVL72 – a complete rack with 72 Blackwell-class GPUs paired with Grace CPUs, totaling around 13.5 TB of GPU VRAM in one system. It’s basically “AI cluster as a product SKU”.
Greedy-mode Nvidia didn’t stop there. They reused the “6000” branding and rebadged the Blackwell-era workstation card as RTX Pro 6000, which is effectively a slightly tuned version of what you’d think of as a “5090-class” chip. Same naming trick, new generation.
In the most recent keynotes, Nvidia has already shown the next wave: Vera Rubin. This time they go all in: the product lines are named the Vera CPU and the Rubin GPU outright, rather than keeping Rubin as just an internal codename.
For this generation, there’s no traditional consumer GeForce product in sight; Rubin is positioned squarely as an AI-optimised GPU architecture, with low-precision AI math in mind, and Vera is the next-gen Arm server CPU with even more memory bandwidth and capacity for massive GPU clusters.
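If you want to check which of these architectures the GPU in your cloud instance belongs to, here’s a minimal sketch using PyTorch; the capability-to-architecture tables below are hand-maintained for this post, not an official Nvidia API:

```python
import torch

# CUDA compute capability -> architecture code name. (7, 5) is Turing and
# (8, 9) is Ada Lovelace; otherwise the major version identifies the
# generation.
SPECIAL = {(7, 5): "Turing", (8, 9): "Ada Lovelace"}
BY_MAJOR = {
    1: "Tesla", 2: "Fermi", 3: "Kepler", 5: "Maxwell", 6: "Pascal",
    7: "Volta", 8: "Ampere", 9: "Hopper",
    10: "Blackwell", 12: "Blackwell",  # consumer Blackwell reports 12.x
}

def gpu_architecture(device: int = 0) -> str:
    major, minor = torch.cuda.get_device_capability(device)
    arch = SPECIAL.get((major, minor), BY_MAJOR.get(major, "unknown"))
    return f"{torch.cuda.get_device_name(device)}: sm_{major}{minor} -> {arch}"

if torch.cuda.is_available():
    print(gpu_architecture())  # e.g. "NVIDIA A100-SXM4-80GB: sm_80 -> Ampere"
```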
That’s all for today – hopefully now those weird Nvidia names in your cloud GPU dropdown make a lot more sense.
Side note: the embedded lineage is just as fun.
- Tegra X1 is based on Maxwell and powers the Nintendo Switch and Jetson Nano.
- Tegra X2 is Pascal-based (Jetson TX2).
- Xavier is Volta-based (Jetson Xavier NX), and was Nvidia’s first serious autonomous driving SoC family.
- Orin is Ampere-based (Jetson Orin Nano and friends).
- Thor is tied to Blackwell and underpins the Jetson AGX Thor generation.
