Technology
Danish Kapoor

Nvidia leaves technology giants behind with new AI chips

Nvidia has announced a major step forward in the artificial intelligence chip industry. While the company’s previously announced H100 AI chip has already been a huge market success, the new Blackwell generation aims to extend that lead with the B200 GPU and the GB200 “superchip”. These new technologies have the potential to make Nvidia a multi-trillion-dollar company.

Features of Nvidia B200 and GB200 super chips

Nvidia’s B200 GPU delivers 20 petaflops of FP4 compute from 208 billion transistors. The GB200, meanwhile, combines two B200 GPUs with a single Grace CPU and promises up to 30 times the performance of the H100 on large language model (LLM) workloads, while cutting cost and energy consumption by up to 25 times.


Training a model with 1.8 trillion parameters previously required 8,000 Hopper GPUs drawing 15 megawatts of power. According to Nvidia’s CEO, 2,000 Blackwell GPUs can now do the same job while consuming only four megawatts.
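
A back-of-the-envelope check on those keynote figures (taking the quoted GPU counts and power draws at face value) shows the gain comes from total-cluster efficiency rather than lower per-GPU power:

```python
# Quoted Hopper vs. Blackwell setups for the 1.8T-parameter training run
# (all figures are Nvidia's keynote claims, not measurements).
hopper_gpus, hopper_mw = 8_000, 15.0
blackwell_gpus, blackwell_mw = 2_000, 4.0

gpu_reduction = hopper_gpus / blackwell_gpus    # 4.0x fewer GPUs
power_reduction = hopper_mw / blackwell_mw      # 3.75x less total power

# Per-GPU draw barely changes; the saving is in needing far fewer chips.
kw_per_hopper = hopper_mw * 1_000 / hopper_gpus           # 1.875 kW per GPU
kw_per_blackwell = blackwell_mw * 1_000 / blackwell_gpus  # 2.0 kW per GPU

print(gpu_reduction, power_reduction, kw_per_hopper, kw_per_blackwell)
```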

In a GPT-3 benchmark with 175 billion parameters, the GB200 delivers seven times the performance of the H100 and trains four times faster.


Nvidia credits several key improvements for this leap. A second-generation transformer engine doubles compute, bandwidth, and supported model size by representing each neuron with four bits instead of eight. In addition, when these GPUs are deployed in large numbers, the new-generation NVLink switch lets up to 576 GPUs communicate with one another at 1.8 terabytes per second of bidirectional bandwidth.
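
The model-size and bandwidth gains from FP4 follow directly from the narrower data type: halving the bits per weight halves the memory each parameter occupies. A minimal sketch (the parameter count is just an illustrative example):

```python
def weight_bytes(params: int, bits_per_param: int) -> int:
    """Raw storage needed for a model's weights at a given precision."""
    return params * bits_per_param // 8

params = 175_000_000_000  # illustrative GPT-3-scale parameter count

fp8_gb = weight_bytes(params, 8) / 1e9  # 175.0 GB at FP8
fp4_gb = weight_bytes(params, 4) / 1e9  # 87.5 GB at FP4 -- the same memory now fits a 2x larger model

print(fp8_gb, fp4_gb)
```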

That required Nvidia to develop a new network switch chip containing 50 billion transistors and packing 3.6 teraflops of FP8 compute of its own. Previously, a cluster of just 16 GPUs spent 60 percent of its time communicating and only 40 percent doing actual computation.
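
That 60/40 split is why the switch matters: if communication cannot overlap with math, effective throughput falls well under half of peak. A simple model of the effect (the peak figures here are hypothetical, not from Nvidia):

```python
def effective_petaflops(peak_pf: float, comm_fraction: float) -> float:
    """Throughput delivered when a fraction of wall-clock time goes to communication."""
    return peak_pf * (1.0 - comm_fraction)

# Hypothetical 16-GPU cluster with a 20-petaflop peak per GPU
peak = 16 * 20.0
print(effective_petaflops(peak, 0.60))  # 60% comm time leaves 128 of 320 petaflops
```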


Nvidia is targeting bulk purchases, offering these GPUs in larger designs such as the GB200 NVL72, which packs 36 CPUs and 72 GPUs into a single liquid-cooled rack. The design delivers a total of 720 petaflops of AI training performance, or 1,440 petaflops (1.44 exaflops) for inference.
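
The rack-level figures are consistent with the per-GPU numbers: 72 B200s at 20 petaflops of FP4 each yield the quoted inference total, with training at the wider FP8 precision running at half that rate (the FP8-is-half-FP4 ratio is inferred from the quoted totals, not stated by Nvidia):

```python
gpus_per_rack = 72
fp4_pf_per_gpu = 20  # B200 peak FP4 throughput, per the announcement

inference_pf = gpus_per_rack * fp4_pf_per_gpu  # 1440 petaflops = 1.44 exaflops
training_pf = inference_pf // 2                # 720 petaflops at FP8

print(inference_pf, training_pf)
```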

Companies such as Amazon, Google, Microsoft, and Oracle are reported to be planning to offer these rack systems through their cloud services. Nvidia emphasizes that the systems can scale to tens of thousands of GB200 superchips, connected at 800 Gb/s over the new Quantum-X800 InfiniBand or Spectrum-X800 Ethernet.


These announcements came at Nvidia’s GPU Technology Conference, which focused almost entirely on AI and GPU computing, so no gaming GPU news was expected. However, the Blackwell GPU architecture could also power a future RTX 50 series of desktop graphics cards.
