NVIDIA DGX A100

The World's First System for data science,
deep learning, and inference.

EXPLORE THE POWERFUL COMPONENTS OF DGX A100

01.

8X NVIDIA A100 GPUS WITH 320 GB TOTAL GPU MEMORY

12 NVLinks/GPU, 600 GB/s GPU-to-GPU Bi-directional Bandwidth

02.

6X NVIDIA NVSWITCHES

4.8 TB/s Bi-directional Bandwidth, 2X More than Previous Generation NVSwitch

03.

9X MELLANOX CONNECTX-6 200 Gb/s NETWORK INTERFACES

450 GB/s Peak Bi-directional Bandwidth

04.

DUAL 64-CORE AMD CPUs AND 1 TB SYSTEM MEMORY

3.2X More Cores to Power the Most Intensive AI Jobs

05.

15 TB GEN4 NVME SSD

25 GB/s Peak Bandwidth, 2X Faster than Gen3 NVMe SSDs
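The headline figures in the component list can be cross-checked with simple arithmetic. The sketch below is illustrative only: the totals come from the list above, while the per-GPU memory, the per-link bandwidth, and the reading of 4.8 TB/s as eight GPUs each driving 600 GB/s are assumptions obtained by dividing or multiplying the stated totals evenly. The function name `spec_summary` and the dictionary keys are hypothetical.

```python
# Cross-check of the headline numbers in the component list.
# Inputs are the totals stated in the list; every derived value is an
# assumption that follows from splitting those totals evenly.

GPUS = 8
TOTAL_GPU_MEMORY_GB = 320
NVLINKS_PER_GPU = 12
NVLINK_GB_PER_S_PER_GPU = 600   # bidirectional, per GPU
NICS = 9
NIC_GBIT_PER_S = 200            # per interface, per direction


def spec_summary():
    """Return derived figures implied by the stated totals."""
    # Network: gigabits -> gigabytes (divide by 8), summed over all interfaces.
    network_one_way_gb_per_s = NICS * NIC_GBIT_PER_S / 8
    return {
        "memory_per_gpu_gb": TOTAL_GPU_MEMORY_GB / GPUS,
        "nvlink_gb_per_s_per_link": NVLINK_GB_PER_S_PER_GPU / NVLINKS_PER_GPU,
        # Aggregate GPU-to-GPU bandwidth in TB/s (1 TB = 1000 GB).
        "aggregate_nvlink_tb_per_s": GPUS * NVLINK_GB_PER_S_PER_GPU / 1000,
        "network_one_way_gb_per_s": network_one_way_gb_per_s,
        # Counting both directions gives the bidirectional peak.
        "network_bidirectional_gb_per_s": 2 * network_one_way_gb_per_s,
    }


if __name__ == "__main__":
    for name, value in spec_summary().items():
        print(f"{name}: {value}")
```

Under these assumptions the derived values line up with the list: 40 GB per GPU, 50 GB/s per NVLink, 4.8 TB/s in aggregate, and 450 GB/s of bidirectional network bandwidth.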

THE TECHNOLOGY INSIDE NVIDIA DGX A100

GPU A100

The NVIDIA A100 Tensor Core GPU delivers unprecedented acceleration for AI, data analytics, and high-performance computing (HPC) to tackle the world’s toughest computing challenges. With third-generation NVIDIA Tensor Cores providing a huge performance boost, the A100 GPU can efficiently scale up to thousands of GPUs or, with Multi-Instance GPU (MIG), be partitioned into as many as seven smaller, dedicated instances to accelerate workloads of all sizes.

MULTI-INSTANCE GPU

With MIG, the eight A100 GPUs in DGX A100 can be configured into as many as 56 GPU instances, each fully isolated with its own high-bandwidth memory, cache, and compute cores. This allows administrators to right-size GPUs with guaranteed quality of service (QoS) for multiple workloads.
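The figure of 56 follows from eight GPUs with up to seven instances each. As a rough illustration, the sketch below enumerates those slots as plain data. It does not talk to any GPU or driver, and the names `mig_slots`, `GPUS_PER_SYSTEM`, and `MAX_MIG_INSTANCES_PER_GPU` are hypothetical.

```python
# Illustrative enumeration of MIG slots on one system.
# This is pure bookkeeping: no driver or hardware calls are made.

GPUS_PER_SYSTEM = 8             # from the text
MAX_MIG_INSTANCES_PER_GPU = 7   # from the text


def mig_slots(gpus=GPUS_PER_SYSTEM, per_gpu=MAX_MIG_INSTANCES_PER_GPU):
    """Return every (gpu_index, instance_index) pair available at full partitioning."""
    return [(gpu, inst) for gpu in range(gpus) for inst in range(per_gpu)]


if __name__ == "__main__":
    slots = mig_slots()
    print(len(slots))  # 56
```

A scheduler could hand each slot to a separate workload, which is the right-sizing scenario the paragraph describes.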

NVLINK & NVSWITCH

The third generation of NVIDIA® NVLink™ in DGX A100 doubles the GPU-to-GPU direct bandwidth to 600 gigabytes per second (GB/s), almost 10X higher than PCIe Gen4. DGX A100 also features next-generation NVIDIA NVSwitch™, which is 2X faster than the previous generation.
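The "almost 10X" comparison can be reproduced approximately. The sketch below assumes the comparison is against a PCIe Gen4 x16 link counted in both directions, using the standard 16 GT/s per-lane rate and 128b/130b encoding. The page does not state which PCIe configuration it compares against, so the lane count is an assumption, and the function name `pcie_gen4_gb_per_s` is hypothetical.

```python
# Rough comparison of NVLink bandwidth with a PCIe Gen4 link.
# Assumption: the baseline is a x16 link, counted bidirectionally.

NVLINK_GB_PER_S = 600.0  # bidirectional GPU-to-GPU bandwidth, from the text


def pcie_gen4_gb_per_s(lanes=16):
    """Bidirectional PCIe Gen4 bandwidth in GB/s for the given lane count."""
    transfers_per_s = 16e9   # 16 GT/s per lane
    encoding = 128 / 130     # 128b/130b line encoding overhead
    one_way_bytes_per_s = transfers_per_s * encoding / 8 * lanes
    return 2 * one_way_bytes_per_s / 1e9


if __name__ == "__main__":
    ratio = NVLINK_GB_PER_S / pcie_gen4_gb_per_s()
    print(round(ratio, 1))  # 9.5
```

With these assumptions a x16 link provides about 63 GB/s in both directions combined, so the ratio comes out at roughly 9.5.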

InfiniBand

DGX A100 features the latest Mellanox ConnectX-6 VPI HDR InfiniBand/Ethernet adapters, each running at 200 gigabits per second (Gb/s) to create a high-speed fabric for large-scale AI workloads.

Optimized Software Stack

DGX A100 integrates a tested and optimized DGX software stack, including an AI-tuned base operating system, all necessary system software, and GPU-accelerated applications, pre-trained models, and more from NGC™.

Security

DGX A100 delivers the most robust security posture for AI deployments, with a multi-layered approach stretching across the baseboard management controller (BMC), CPU board, GPU board, self-encrypted drives, and secure boot.

Major deep learning frameworks pre-installed

ESSENTIAL BUILDING BLOCK OF THE AI DATA CENTER

Universal AI system

NVIDIA DGX A100 is the universal system for all AI infrastructure, from analytics to training to inference. It sets a new bar for compute density, packing 5 petaFLOPS of AI performance into a 6U form factor, replacing legacy infrastructure silos with one platform for every AI workload.
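To put the density claim in per-unit terms, the sketch below divides the stated 5 petaFLOPS across the 6U chassis and the eight GPUs. The page does not say which numeric precision the 5 petaFLOPS figure refers to, so these are peak figures by simple division, not measured throughput. The function name `compute_density` is hypothetical.

```python
# Per-unit view of the stated compute density.
# Assumption: the 5 petaFLOPS total is split evenly across GPUs and rack units.


def compute_density(total_pflops=5.0, rack_units=6, gpus=8):
    """Return the stated peak performance per GPU and per rack unit."""
    return {
        "tflops_per_gpu": total_pflops * 1000 / gpus,       # 1 PFLOPS = 1000 TFLOPS
        "pflops_per_rack_unit": total_pflops / rack_units,
    }


if __name__ == "__main__":
    density = compute_density()
    print(density["tflops_per_gpu"])                   # 625.0
    print(round(density["pflops_per_rack_unit"], 2))   # 0.83
```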

DGXperts: Integrated Access to AI Expertise

NVIDIA DGXperts are a global team of 14,000+ AI-fluent professionals who have built a wealth of experience over the last decade to help you maximize the value of your DGX investment.

Fastest Time To Solution

NVIDIA DGX A100 is the world’s first AI system built on the NVIDIA A100 Tensor Core GPU. Integrating eight A100 GPUs, the system provides unprecedented acceleration and is fully optimized for NVIDIA CUDA-X™ software and the end-to-end NVIDIA data center solution stack.

Unmatched Data Center Scalability

NVIDIA DGX A100 features Mellanox ConnectX-6 VPI HDR InfiniBand/Ethernet network adapters with a combined 450 gigabytes per second (GB/s) of peak bi-directional bandwidth. This is one of the many features that make DGX A100 the foundational building block for large AI clusters such as NVIDIA DGX SuperPOD™, the enterprise blueprint for scalable AI infrastructure.

GAME-CHANGING PERFORMANCE

Analytics

PageRank

Faster Analytics Means Deeper Insights to Fuel AI Development

Graph Edges per Second (Billions)

3,000 CPU servers vs. 4 DGX A100 systems. Published Common Crawl data set: 128B edges, 2.6 TB graph.

Training

NLP: BERT-Large

Faster Training Enables the Most Advanced AI Models

Sequences per Second

BERT pre-training throughput using PyTorch, with a mix of 2/3 Phase 1 and 1/3 Phase 2. Phase 1 Seq Len = 128, Phase 2 Seq Len = 512. V100: DGX-1 with 8X V100 using FP32 precision. DGX A100: DGX A100 with 8X A100 using TF32 precision.

Inference

Peak Compute

Faster Inference Increases ROI Through Maximized System Utilization

TeraOPS (Trillions of Operations per Second)

CPU Server: 2X Intel Platinum 8280 using INT8. DGX A100: DGX A100 with 8X A100 using INT8 with Structural Sparsity.