Amazon Web Services and NVIDIA have announced an expansion of their strategic collaboration to deliver the most advanced infrastructure, software and services to power customers’ generative artificial intelligence (AI) innovations.
The companies will bring together the best of NVIDIA and AWS technologies, from NVIDIA’s newest multi-node systems featuring next-generation GPUs, CPUs and AI software to the AWS Nitro System for advanced virtualization and security, the Elastic Fabric Adapter (EFA) interconnect, and UltraCluster scalability. Together, these technologies are well suited to training foundation models and building generative AI applications.
The expanded collaboration builds on a longstanding relationship that has fueled the generative AI era by offering early machine learning (ML) pioneers the compute performance required to advance the state of the art in these technologies.
As part of the expanded collaboration to supercharge generative AI across all industries:
AWS will be the first cloud provider to bring NVIDIA® GH200 Grace Hopper Superchips with new multi-node NVLink™ technology to the cloud. The NVIDIA GH200 NVL32 multi-node platform connects 32 Grace Hopper Superchips with NVIDIA NVLink and NVSwitch™ technologies into one instance. The platform will be available on Amazon Elastic Compute Cloud (Amazon EC2) instances connected with Amazon’s powerful networking (EFA), supported by advanced virtualization (AWS Nitro System), and hyper-scale clustering (Amazon EC2 UltraClusters), enabling joint customers to scale to thousands of GH200 Superchips.
NVIDIA and AWS will collaborate to host NVIDIA DGX™ Cloud, NVIDIA’s AI-training-as-a-service offering, on AWS. It will be the first DGX Cloud featuring GH200 NVL32, providing developers with the largest shared memory in a single instance. DGX Cloud on AWS will accelerate training of cutting-edge generative AI and large language models that can reach beyond 1 trillion parameters.
NVIDIA and AWS are partnering on Project Ceiba to design the world’s fastest GPU-powered AI supercomputer: an at-scale system with GH200 NVL32 and Amazon EFA interconnect, hosted by AWS for NVIDIA’s own research and development team. This first-of-its-kind supercomputer, featuring 16,384 NVIDIA GH200 Superchips and capable of 65 exaflops of AI processing, will be used by NVIDIA to propel its next wave of generative AI innovation.
AWS will introduce three new Amazon EC2 instances: P5e instances, powered by NVIDIA H200 Tensor Core GPUs, for large-scale and cutting-edge generative AI and HPC workloads; and G6 and G6e instances, powered by NVIDIA L4 GPUs and NVIDIA L40S GPUs, respectively, for a wide range of applications such as AI fine-tuning, inference, graphics and video workloads. G6e instances are particularly suitable for developing 3D workflows, digital twins and other applications using NVIDIA Omniverse™, a platform for connecting and building generative AI-enabled 3D applications.
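For readers who want to put the Project Ceiba figures quoted above in proportion, the short Python sketch below works through the arithmetic. It uses only the numbers stated in this announcement (16,384 GH200 Superchips, 32 Superchips per GH200 NVL32 platform, 65 exaflops of AI). It assumes the system is built entirely from NVL32 platforms, and the per-Superchip figure is a plain division of the quoted totals rather than a published specification. The function and constant names are illustrative only.

```python
# Figures quoted in the announcement for Project Ceiba.
SUPERCHIPS_TOTAL = 16_384     # GH200 Superchips in the full system
SUPERCHIPS_PER_NVL32 = 32     # Superchips connected in one GH200 NVL32 platform
AI_EXAFLOPS_TOTAL = 65        # quoted aggregate AI performance


def ceiba_breakdown(total=SUPERCHIPS_TOTAL,
                    per_platform=SUPERCHIPS_PER_NVL32,
                    exaflops=AI_EXAFLOPS_TOTAL):
    """Return (NVL32 platform count, leftover Superchips, petaflops per Superchip).

    Assumption: the whole system is made of NVL32 platforms, and the
    aggregate figure divides evenly across Superchips.
    """
    platforms, leftover = divmod(total, per_platform)
    petaflops_per_chip = exaflops * 1000 / total  # 1 exaflop = 1000 petaflops
    return platforms, leftover, petaflops_per_chip


platforms, leftover, pf_per_chip = ceiba_breakdown()
print(f"NVL32 platforms: {platforms} (leftover Superchips: {leftover})")
print(f"Implied AI petaflops per Superchip: {pf_per_chip:.2f}")
```

Under these assumptions, the quoted totals correspond to 512 NVL32 platforms and roughly 4 petaflops of AI processing per Superchip. The announcement does not state the numeric precision behind the 65-exaflop figure, so the per-Superchip number should be read as indicative only.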