NVIDIA Announces PCIe A100 Accelerator: 250 Watt Ampere In A…

With the launch of their Ampere architecture and new A100 accelerator barely a month behind them, NVIDIA this morning is announcing the PCIe version of their accelerator as part of the start of the now-virtual ISC Digital conference for high performance computing. The more straight-laced counterpart to NVIDIA’s flagship SXM4 version of the A100 accelerator, the PCIe version of the A100 is designed to offer A100 in a more traditional form factor for customers who need something that they can plug into standardized servers. Overall the PCIe A100 offers the same peak performance as the SXM4 A100; however, with a lower 250 Watt TDP, real-world performance won’t be quite as high.

The obligatory counterpart to NVIDIA’s SXM form factor accelerators, NVIDIA’s PCIe accelerators serve to flesh out the other side of NVIDIA’s accelerator lineup. While NVIDIA would gladly sell everyone SXM-based accelerators (which would include the pricey NVIDIA HGX carrier board), there are still numerous customers who need to be able to use GPU accelerators in standard, PCIe-based rackmount servers. And for smaller workloads, customers don’t need the kind of 4-way and higher scalability offered by SXM form factor accelerators. So with their PCIe cards, NVIDIA can serve the rest of the accelerator market that their SXM products can’t reach.

The PCIe A100, in turn, is a full-fledged A100, just in a different form factor and with a more appropriate TDP. In terms of peak performance, the PCIe A100 is just as fast as its SXM4 counterpart; NVIDIA this time isn’t shipping this as a cut-down configuration with lower clockspeeds or fewer functional blocks than the flagship SXM4 version. As a result the PCIe card brings everything A100 offers to the table, with the same heavy focus on tensor operations, including the new higher precision TF32 and FP64 formats, as well as even faster integer inference.

NVIDIA Accelerator Specification Comparison

| | A100 (PCIe) | A100 (SXM4) | V100 (PCIe) | P100 (PCIe) |
|---|---|---|---|---|
| FP32 CUDA Cores | 6912 | 6912 | 5120 | 3584 |
| Boost Clock | 1.41GHz | 1.41GHz | 1.38GHz | 1.3GHz |
| Memory Clock | 2.4Gbps HBM2 | 2.4Gbps HBM2 | 1.75Gbps HBM2 | 1.4Gbps HBM2 |
| Memory Bus Width | 5120-bit | 5120-bit | 4096-bit | 4096-bit |
| Memory Bandwidth | 1.6TB/sec | 1.6TB/sec | 900GB/sec | 720GB/sec |
| VRAM | 40GB | 40GB | 16GB/32GB | 16GB |
| Single Precision | 19.5 TFLOPs | 19.5 TFLOPs | 14.1 TFLOPs | 9.3 TFLOPs |
| Double Precision | 9.7 TFLOPs (half of FP32 rate) | 9.7 TFLOPs (half of FP32 rate) | 7 TFLOPs (half of FP32 rate) | 4.7 TFLOPs (half of FP32 rate) |
| INT8 Tensor | 624 TOPs | 624 TOPs | N/A | N/A |
| FP16 Tensor | 312 TFLOPs | 312 TFLOPs | 112 TFLOPs | N/A |
| TF32 Tensor | 156 TFLOPs | 156 TFLOPs | N/A | N/A |
| Relative Performance (SXM Version) | 90% | 100% | N/A | N/A |
| Interconnect | NVLink 3, 12 Links (600GB/sec) | NVLink 3, 12 Links (600GB/sec) | NVLink 2, 4 Links (200GB/sec) | NVLink 1, 4 Links (160GB/sec) |
| GPU | GA100 (826mm²) | GA100 (826mm²) | GV100 (815mm²) | GP100 (610mm²) |
| Transistor Count | 54.2B | 54.2B | 21.1B | 15.3B |
| TDP | 250W | 400W | 250W | 300W |
| Manufacturing Process | TSMC 7N | TSMC 7N | TSMC 12nm FFN | TSMC 16nm FinFET |
| Interface | PCIe 4.0 | SXM4 | PCIe 3.0 | SXM |
| Architecture | Ampere | Ampere | Volta | Pascal |
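
For readers who want to sanity-check the table, the headline figures can be roughly reproduced from the core counts, clocks and bus widths listed in it. The Python sketch below is illustrative only: the function names are made up for this example, and it assumes the usual rule of thumb that each FP32 CUDA core retires one fused multiply-add (2 FLOPs) per clock. The inputs are the rounded values from the table, so the results are approximations rather than official NVIDIA numbers.

```python
# Rough sanity check of the spec table. All helper names are hypothetical,
# and the "2 FLOPs per core per clock" factor is an assumption (one FMA).

def peak_fp32_tflops(cuda_cores: int, boost_ghz: float) -> float:
    """Peak FP32 throughput in TFLOPs, assuming one FMA per core per clock."""
    return cuda_cores * 2 * boost_ghz / 1000.0

def mem_bandwidth_gbs(mem_gbps_per_pin: float, bus_width_bits: int) -> float:
    """Memory bandwidth in GB/sec: per-pin data rate times bus width, in bytes."""
    return mem_gbps_per_pin * bus_width_bits / 8.0

def relative_perf_per_watt(perf_a: float, tdp_a: float,
                           perf_b: float, tdp_b: float) -> float:
    """Performance per watt of card A relative to card B."""
    return (perf_a / tdp_a) / (perf_b / tdp_b)

# Values copied from the table: (cores, boost GHz, memory Gbps, bus bits)
cards = {
    "A100 (PCIe/SXM4)": (6912, 1.41, 2.4, 5120),
    "V100 (PCIe)": (5120, 1.38, 1.75, 4096),
    "P100 (PCIe)": (3584, 1.3, 1.4, 4096),
}

for name, (cores, boost, mem_rate, bus) in cards.items():
    tflops = peak_fp32_tflops(cores, boost)
    bandwidth = mem_bandwidth_gbs(mem_rate, bus)
    print(f"{name}: {tflops:.1f} TFLOPs FP32, {bandwidth:.0f} GB/sec")

# PCIe A100 (90% relative performance at 250W) vs. SXM4 A100 (100% at 400W)
ratio = relative_perf_per_watt(0.90, 250, 1.00, 400)
print(f"PCIe A100 perf-per-watt vs. SXM4: {ratio:.2f}x")
```

With the table's rounded inputs this works out to roughly 19.5, 14.1 and 9.3 TFLOPs of FP32, matching the Single Precision row. The A100's rounded 2.4Gbps memory clock gives about 1.5TB/sec, slightly under the 1.6TB/sec listed, which suggests the table figures are themselves rounded. Taking the 90% relative performance figure at face value, the 250W card would deliver about 1.44x the performance per watt of the 400W SXM4 part.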

But because the dual-slot add-in card form factor is designed for lower TDP products, offering less room for cooling and typically less access to power as well, the PCIe version of the A100 does have to ratchet down its TDP from 400W to 250W. That’s a large…


