The chip developer confirms that each of the first three RTX 4000 models presented is based on a different piece of silicon, and that there is still potential for a performance increase of more than 12 percent in a future RTX 4090 Ti.
The buzz surrounding NVIDIA’s new generation of video cards, for better and for worse, refuses to fade, even though the products themselves are not yet available in any store in the world. After receiving prices and basic technical specifications for the first three models at the GTC 2022 announcement, we now have a breakdown of the cores on which they will be based: AD104, AD103 and AD102. In all three cases the jump in transistor density is simply amazing.
The AD104 core that will be used by the GeForce RTX 4080 12GB packs 35.8 billion transistors into a physical area of 294.5 square millimeters, for a density of 121.1 million transistors per square millimeter. It offers 60 streaming multiprocessors (SMs) with 7,680 CUDA cores, plus 240 tensor cores, 60 ray tracing cores and 48MB of L2 cache. These numbers make it clear that the RTX 4080 12GB already uses everything the hardware has to give, so any further use of the AD104 will be in cut-down, cheaper RTX 4070 models.
The AD103 core that will be used by the GeForce RTX 4080 16GB, which by any reasonable consideration should have received a different model name to highlight how significantly it differs from the RTX 4080 12GB in every respect, includes 45.9 billion transistors on a chip with a surface area of 378.6 square millimeters. Its density is roughly the same as the AD104’s, and it carries 80 streaming multiprocessors (SMs) with 10,240 CUDA cores, 320 tensor cores, 80 RT cores and a total of 64MB of L2 cache.
The RTX 4080 16GB itself has 76 active streaming multiprocessors and 9,728 CUDA cores, which means that another RTX 4080 model (or an RTX 4080 Ti) could theoretically use the same AD103 core, this time fully enabled, for a small performance improvement of approximately 5 percent. It is worth noting that this is pure speculation at this point, as there are historical examples of cores that never reached a product in fully enabled form because of yield and other manufacturing considerations.
The formidable AD102 core has 76.3 billion transistors spread over 608 square millimeters, with an even better density of 125.5 million transistors per square millimeter. In total it offers 144 streaming multiprocessors (SMs) with 18,432 CUDA cores, 576 tensor cores, 144 RT cores and 96MB of L2 cache.
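The unit counts quoted for the three cores scale linearly with the number of clusters (SMs) on each die. The short Python sketch below checks this using only the figures from the article; the dictionary layout and the helper name `per_sm` are illustrative choices, not anything NVIDIA publishes.

```python
# Full-die configurations as quoted in the article.
FULL_DIES = {
    "AD104": {"sms": 60, "cuda": 7_680, "tensor": 240, "rt": 60},
    "AD103": {"sms": 80, "cuda": 10_240, "tensor": 320, "rt": 80},
    "AD102": {"sms": 144, "cuda": 18_432, "tensor": 576, "rt": 144},
}


def per_sm(die: dict) -> tuple:
    """Return (CUDA cores, tensor cores, RT cores) per SM for one die."""
    sms = die["sms"]
    return (die["cuda"] / sms, die["tensor"] / sms, die["rt"] / sms)


# Hypothetical helper output: one ratio triple per die.
ratios = {name: per_sm(die) for name, die in FULL_DIES.items()}

for name, (cuda, tensor, rt) in ratios.items():
    print(f"{name}: {cuda:.0f} CUDA, {tensor:.0f} tensor, {rt:.0f} RT per SM")
```

All three dies come out at the same per-SM ratios, which suggests the cores differ mainly in how many identical building blocks they contain rather than in the blocks themselves.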
These are monstrous figures in the full sense of the word, and the GeForce RTX 4090 uses only about 89 percent of them (128 of the 144 streaming multiprocessors). That already sketches the outline of a future RTX 4090 Ti model that could use 100 percent of the hardware on the core, or close to it.
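The headroom estimates for both the AD103 and the AD102 follow from simple arithmetic on active versus total units. The sketch below is a rough upper bound only: it assumes performance scales linearly with unit count at equal clocks, which real cards rarely achieve. The 128 active units for the RTX 4090 are taken from NVIDIA's published specification and are an assumption here, since the article does not state that number; the function names are hypothetical.

```python
def active_share_percent(active: int, total: int) -> float:
    """Share of the full die that a product actually enables, in percent."""
    return active / total * 100.0


def headroom_percent(active: int, total: int) -> float:
    """Extra units, in percent, gained by enabling the whole die."""
    return (total / active - 1.0) * 100.0


# (active units, total units) per product.
PRODUCTS = {
    "RTX 4080 16GB on AD103": (76, 80),   # both figures from the article
    "RTX 4090 on AD102": (128, 144),      # 128 is an assumption, see lead-in
}

for name, (active, total) in PRODUCTS.items():
    share = active_share_percent(active, total)
    extra = headroom_percent(active, total)
    print(f"{name}: {share:.1f}% active, up to {extra:.1f}% more units")
```

Under these assumptions the AD103 leaves about 5 percent on the table and the AD102 about 12.5 percent, in line with the "more than 12 percent" mentioned for a possible RTX 4090 Ti.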
The most striking figure for all three Ada Lovelace cores is that each has more transistors than the big GA102 core of the Ampere generation used by the RTX 3090 and RTX 3090 Ti, which has “only” 28.3 billion transistors in an area of 628 square millimeters, for a density of 45.1 million transistors per square millimeter. In other words, NVIDIA’s jump from Samsung’s 8nm manufacturing process to TSMC’s 4N process allowed it to nearly triple (!) the transistor density of its cores. That helps explain, among other things, why the RTX 4080 12GB, whose core has less than half the area of GA102 and which carries half the graphics memory, should offer similar or higher performance than the RTX 3090 Ti within a power envelope of 285 watts instead of 450 watts.
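The density comparison can be reproduced from the transistor counts and die areas quoted in the article. The sketch below does so in Python; because it works from rounded inputs, the results can differ slightly from the quoted densities (AD104, for example, comes out nearer 121.6 than 121.1). The names `DIES` and `density_mtr_per_mm2` are illustrative.

```python
# (transistors in billions, die area in mm^2), as quoted in the article.
DIES = {
    "GA102": (28.3, 628.0),
    "AD104": (35.8, 294.5),
    "AD103": (45.9, 378.6),
    "AD102": (76.3, 608.0),
}


def density_mtr_per_mm2(transistors_billion: float, area_mm2: float) -> float:
    """Transistor density in millions of transistors per square millimeter."""
    return transistors_billion * 1000.0 / area_mm2


densities = {name: density_mtr_per_mm2(*spec) for name, spec in DIES.items()}

for name, value in densities.items():
    print(f"{name}: {value:.1f} MTr/mm^2")

# Generation-over-generation density gain, flagship die to flagship die.
density_gain = densities["AD102"] / densities["GA102"]
# Area of the RTX 4080 12GB core relative to GA102.
area_ratio = DIES["AD104"][1] / DIES["GA102"][1]

print(f"AD102 vs GA102 density: {density_gain:.2f}x")
print(f"AD104 area as a share of GA102: {area_ratio:.0%}")
```

The flagship-to-flagship gain lands a little under 2.8x, which is what "approaching triple" refers to, and the AD104 occupies a bit less than half the area of GA102.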
There is no doubt that the relatively long wait between the RTX 3000 generation and the RTX 4000 generation, in the shadow of the hardware world’s availability crisis, helped make the jump between generations all the more exciting. Now it will be interesting to find out whether competitor AMD, which is moving to a slightly older 5nm production process for the Radeon RX 7000 generation, will be able to keep up.