Inside today’s Azure AI cloud data centers


Azure CTO Mark Russinovich’s annual Azure infrastructure presentations at Build are always fascinating as he explores the past, present, and future of the hardware that underpins the cloud. This year’s talk was no different, focusing on the same AI platform touted in the rest of the event.

Over the years it’s been clear that Azure’s hardware has grown increasingly complex. At the start, it was a prime example of utility computing, using a single standard server design. Now it uses many different server types, able to support all classes of workloads. GPUs were added, and now AI accelerators.

That last innovation, introduced in 2023, shows how much Azure’s infrastructure has evolved along with the workloads it hosts. Russinovich’s first slide showed how quickly modern AI models were growing, from 110 million parameters with GPT in 2018 to over a trillion in today’s GPT-4o. That growth has led to the development of massive distributed supercomputers to train these models, along with hardware and software to make them efficient and reliable.

Building the AI supercomputer

The scale of the systems needed to run these AI platforms is enormous. Microsoft’s first big AI-training supercomputer was detailed in May 2020. It had 10,000 Nvidia V100 GPUs and clocked in at number five in the global supercomputer rankings. Only three years later, in November 2023, the latest iteration had 14,400 H100 GPUs and ranked third.

As of June 2024, Microsoft has more than 30 similar supercomputers in data centers around the world. Russinovich talked about the open source Llama-3-70B model, which takes 6.4 million GPU hours to train. On one GPU that would take 730 years, but with one of Microsoft’s AI supercomputers, a training run takes roughly 27 days.
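The arithmetic behind those figures can be checked with a short sketch. It assumes ideal linear scaling across GPUs, which real training runs do not achieve because of communication overhead and hardware failures. The function name and the 10,000-GPU count are illustrative assumptions, not details from the talk.

```python
HOURS_PER_YEAR = 24 * 365

# GPU-hour figure quoted in the talk for Llama-3-70B.
LLAMA3_70B_GPU_HOURS = 6.4e6


def training_time_days(gpu_hours: float, num_gpus: int) -> float:
    """Wall-clock days for a run, assuming perfect linear scaling (an idealization)."""
    return gpu_hours / num_gpus / 24


# Time on a single GPU, expressed in years.
single_gpu_years = LLAMA3_70B_GPU_HOURS / HOURS_PER_YEAR

# Time on a hypothetical 10,000-GPU cluster (assumed size, for illustration).
days_on_10k = training_time_days(LLAMA3_70B_GPU_HOURS, 10_000)

print(f"One GPU: about {single_gpu_years:.1f} years")
print(f"10,000 GPUs: about {days_on_10k:.1f} days")
```

Under these assumptions, a single GPU needs a little over 730 years, and 10,000 GPUs need just under 27 days, which is in line with the quoted figures. The text does not say which machine the 27-day estimate refers to, so treat the GPU count as a guess.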

Training is only part of the problem. Once a model has been built, it needs to be used, and although inference doesn’t need the supercomputer-level compute used for training, it still needs a lot of power. As Russinovich notes, a single floating-point parameter needs two bytes of memory, a one-billion-parameter model needs 2GB of RAM, and a 175-billion-parameter model requires 350GB. That’s before you add in any necessary overhead, such as caches, which can add more than 40% to already-hefty memory requirements.
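The memory figures follow from a simple multiplication, sketched below. It assumes two bytes per parameter as stated in the talk, decimal gigabytes (1GB = 10^9 bytes), and a flat 40% overhead as a lower bound for caches. The function name and the flat-overhead model are illustrative simplifications.

```python
BYTES_PER_PARAM = 2  # 16-bit floating point, per the talk


def model_memory_gb(num_params: float, overhead: float = 0.0) -> float:
    """Estimated memory in decimal GB (1 GB = 1e9 bytes).

    `overhead` is a fractional add-on for caches and similar costs
    (0.40 means 40% extra); a flat fraction is a simplifying assumption.
    """
    return num_params * BYTES_PER_PARAM * (1 + overhead) / 1e9


for params in (1e9, 175e9):
    weights_only = model_memory_gb(params)
    with_overhead = model_memory_gb(params, overhead=0.40)
    print(
        f"{params / 1e9:.0f}B parameters: {weights_only:.0f} GB for weights, "
        f"about {with_overhead:.0f} GB with 40% overhead"
    )
```

With these assumptions, the 175-billion-parameter case grows from 350GB to roughly 490GB once overhead is included. That is far more than a single GPU's memory can hold.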

All this means that Azure needs a lot of GPUs with very specific characteristics to push through a lot of data as quickly as possible. Models like GPT-4 require significant amounts of…


