Amazon Web Services has unveiled a new generation of GPU-powered cloud computing instances aimed squarely at customers running machine learning applications.
The P2’s a major step up from the previous generation of GPU-powered AWS instances, and it has plenty of memory to burn. But it’s built on an older generation of GPU, so it’s less suited to the bleeding-edge machine learning work that depends on the most recent advances in GPU hardware.
New hotness …
The prior generation of GPU-equipped AWS instances, the G2, maxed out at four GPUs with 4GB of video RAM apiece and 60GB of system memory per instance. Amazon currently bills the G2 as suitable for “graphics-intensive applications,” rather than machine learning specifically.
The P2, on the other hand, is definitely for machine learning. P2 instances start with four virtual CPUs, a single Nvidia K80 GPU with 12GB of memory, and 61GB of system memory. At the top end, they provide 64 virtual CPUs, 16 K80 GPUs, and 732GB of RAM. With that much memory on board, both as system RAM and on each GPU, the P2 is far better suited to the fast-growing demands of modern machine learning applications.
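If you’re curious what a P2 instance actually exposes, a few lines of CUDA runtime code will tell you. The sketch below is illustrative only and assumes a stock CUDA toolkit is installed on the instance; on a p2.16xlarge it should enumerate 16 devices, each reporting roughly 12GB of global memory:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Illustrative sketch: enumerate the GPUs visible to a CUDA process and
// report each one's memory. On a p2.16xlarge this should list 16 devices,
// each a K80 GPU with roughly 12GB of global memory.
int main() {
    int count = 0;
    if (cudaGetDeviceCount(&count) != cudaSuccess || count == 0) {
        fprintf(stderr, "no CUDA devices found\n");
        return 1;
    }
    for (int i = 0; i < count; ++i) {
        cudaDeviceProp prop;
        cudaGetDeviceProperties(&prop, i);
        printf("GPU %d: %s, %.1f GB global memory\n", i, prop.name,
               prop.totalGlobalMem / (1024.0 * 1024.0 * 1024.0));
    }
    return 0;
}
```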
Another advantage of the P2 line: It supports Nvidia’s GPUDirect technology, which lets GPUs exchange data directly with each other and with other devices, such as network adapters and storage. That reduces contention for bus resources and host memory and, thus, speeds operations that require multiple GPUs to work in concert.
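At the application level, GPUDirect’s GPU-to-GPU path surfaces through CUDA’s standard peer-to-peer calls. Here’s a minimal sketch, assuming at least two visible devices; once peer access is enabled, a cudaMemcpyPeer can move data directly between the GPUs rather than staging it through host memory:

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Minimal sketch of CUDA peer-to-peer access, the application-facing side
// of GPUDirect's GPU-to-GPU transfers. Assumes GPUs 0 and 1 exist.
int main() {
    int canAccess = 0;
    cudaDeviceCanAccessPeer(&canAccess, 0, 1);  // can GPU 0 reach GPU 1?
    if (!canAccess) {
        fprintf(stderr, "peer access not supported between GPUs 0 and 1\n");
        return 1;
    }

    cudaSetDevice(0);
    cudaDeviceEnablePeerAccess(1, 0);           // flags must be 0

    // Allocate a buffer on each device, then copy directly between them.
    const size_t bytes = 1 << 20;
    void *src = nullptr, *dst = nullptr;
    cudaSetDevice(1);
    cudaMalloc(&src, bytes);
    cudaSetDevice(0);
    cudaMalloc(&dst, bytes);
    cudaMemcpyPeer(dst, 0, src, 1, bytes);      // device 1 -> device 0
    cudaDeviceSynchronize();
    printf("copied %zu bytes GPU 1 -> GPU 0 peer-to-peer\n", bytes);
    return 0;
}
```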
As with the G2, Amazon offers reserved-instance and spot pricing for P2 instances. On-demand Linux pricing tops out at $14.40 per hour for the p2.16xlarge, versus $2.60 per hour for the g2.8xlarge. That works out to about $0.90 per GPU per hour against $0.65, a modest premium for considerably more capable hardware.
… or old and busted?
However, the P2 is potentially behind the curve due to the GPU it uses: the Nvidia K80, built on the Kepler architecture that first appeared in 2012.
On the face of it, this isn’t a bad choice. Kepler GPUs have been a staple building block for machine learning systems. But Kepler is already showing its age. Nvidia’s more recent Pascal architecture has been touted as delivering substantial speed improvements over its predecessors.
Some of Pascal’s advantages come from hardware design improvements, such as NVLink, another method of speeding communications between GPUs and the CPU without tying up the PCIe bus. But the Pascal line also sports new GPU instructions designed to accelerate machine learning operations, such as native half-precision (FP16) arithmetic; in fact, Nvidia’s cuDNN deep neural network library is already taking advantage of them.
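To make that concrete, here’s a small, illustrative kernel using CUDA’s packed half-precision intrinsics, which Pascal-class hardware executes at full rate. Notably, it won’t run on the P2’s K80s at all: __hfma2 requires compute capability 5.3 or later (compile with nvcc -arch=sm_60 for Pascal), which underscores the article’s point. The sizes and values here are arbitrary, for demonstration only.

```cuda
#include <cstdio>
#include <cuda_fp16.h>
#include <cuda_runtime.h>

// Illustrative sketch of packed FP16 math on Pascal-class GPUs.
// __hfma2 performs two half-precision fused multiply-adds per instruction.
__global__ void fma_half2(const __half2* a, const __half2* b,
                          __half2* out, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        out[i] = __hfma2(a[i], b[i], out[i]);  // out = a * b + out, two lanes
}

// Fill a buffer with the same pair of FP16 values in every element.
__global__ void fill(__half2* p, float lo, float hi, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n)
        p[i] = __floats2half2_rn(lo, hi);      // pack two floats into FP16x2
}

int main() {
    const int n = 1024, threads = 256, blocks = (n + threads - 1) / threads;
    __half2 *a, *b, *out;
    cudaMallocManaged(&a, n * sizeof(__half2));
    cudaMallocManaged(&b, n * sizeof(__half2));
    cudaMallocManaged(&out, n * sizeof(__half2));

    fill<<<blocks, threads>>>(a, 1.5f, 2.5f, n);
    fill<<<blocks, threads>>>(b, 2.0f, 2.0f, n);
    fill<<<blocks, threads>>>(out, 0.0f, 0.0f, n);
    fma_half2<<<blocks, threads>>>(a, b, out, n);
    cudaDeviceSynchronize();

    // Expect (3.0, 5.0): 1.5*2.0 + 0 and 2.5*2.0 + 0.
    printf("out[0] = (%g, %g)\n",
           __half2float(out[0].x), __half2float(out[0].y));
    return 0;
}
```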
The pressure’s on
Amazon has been facing competition in cloud-based GPU resources. Back in August, Microsoft unveiled several new Azure VMs powered by Nvidia Tesla GPUs, although they also use the older Kepler architecture, not Pascal. In addition, they come outfitted with less system memory than the P2s (224GB of RAM at the high end, versus 732GB for the P2) and fewer GPUs (four K80s versus the P2’s 16).
Google also recently revamped its GPU-powered Google Cloud Machine Learning service, but it differs from both Amazon’s and Microsoft’s offerings in that it’s opaque. It trains and deploys algorithms, but it shields the user from having to provision machines or GPUs, and it performs its own performance tuning.
Not everyone is fond of this hands-off approach, so there’s room for both that and the hands-on Amazon/Microsoft models to flourish. Amazon’s plan is to put raw power — even if it isn’t the most recent generation — into developers’ hands.