NVIDIA and Microsoft unveil hyperscale GPU accelerator blueprint

Providing hyperscale data centers with a fast, flexible path for AI, the new HGX-1 hyperscale GPU accelerator is an open-source design released in conjunction with Microsoft’s Project Olympus.

HGX-1 does for cloud-based AI workloads what ATX – Advanced Technology eXtended – did for PC motherboards when it was introduced more than two decades ago. It establishes an industry standard that can be rapidly and efficiently embraced to help meet surging market demand.
 
The new architecture is designed to meet the exploding demand for AI computing in the cloud – in fields such as autonomous driving, personalized healthcare, superhuman voice recognition, data and video analytics, and molecular simulations.
 
“AI is a new computing model that requires a new architecture,” said Jen-Hsun Huang, founder and chief executive officer of NVIDIA. “The HGX-1 hyperscale GPU accelerator will do for AI cloud computing what the ATX standard did to make PCs pervasive today. It will enable cloud-service providers to easily adopt NVIDIA GPUs to meet surging demand for AI computing.”
 
“The HGX-1 AI accelerator provides extreme performance scalability to meet the demanding requirements of fast-growing machine learning workloads, and its unique design allows it to be easily adopted into existing data centers around the world,” wrote Kushagra Vaid, general manager and distinguished engineer, Azure Hardware Infrastructure, Microsoft, in a blog post.
 
For the thousands of enterprises and startups worldwide that are investing in AI and adopting AI-based approaches, the HGX-1 architecture provides unprecedented configurability and performance in the cloud.
 
Powered by eight NVIDIA® Tesla® P100 GPUs in each chassis, it features an innovative switching design – based on NVIDIA NVLink™ interconnect technology and the PCIe standard – enabling a CPU to dynamically connect to any number of GPUs. This allows cloud service providers that standardize on the HGX-1 infrastructure to offer customers a range of CPU and GPU machine instance configurations.
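As a rough illustration of what offering "a range of CPU and GPU machine instance configurations" could mean in practice, the sketch below carves one eight-GPU chassis into instances of different sizes. Only the figure of eight GPUs per chassis comes from the article; the function name carve_instances, the instance names and the simple first-fit allocation are hypothetical, and the sketch says nothing about how the HGX-1 switching fabric actually assigns GPUs.

```python
from typing import Dict, List

GPUS_PER_CHASSIS = 8  # from the article: eight Tesla P100 GPUs per chassis


def carve_instances(
    requests: Dict[str, int], total_gpus: int = GPUS_PER_CHASSIS
) -> Dict[str, List[int]]:
    """Assign disjoint GPU indices to each named instance.

    Hypothetical model for illustration only; not the HGX-1 mechanism.
    """
    if any(count < 0 for count in requests.values()):
        raise ValueError("GPU counts must be non-negative")
    needed = sum(requests.values())
    if needed > total_gpus:
        raise ValueError(f"requested {needed} GPUs, chassis has {total_gpus}")

    allocation: Dict[str, List[int]] = {}
    next_gpu = 0
    for name, count in requests.items():
        # Hand out the next `count` GPU indices in order (first-fit).
        allocation[name] = list(range(next_gpu, next_gpu + count))
        next_gpu += count
    return allocation


if __name__ == "__main__":
    # One four-GPU instance and two two-GPU instances on one chassis.
    print(carve_instances({"training": 4, "inference-a": 2, "inference-b": 2}))
```

Under this toy model, a provider could back one large instance and two smaller ones with the same chassis, and a request that exceeds eight GPUs is rejected rather than silently truncated.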
 
Cloud workloads are more diverse and complex than ever. AI training, inferencing and HPC workloads run optimally on different system configurations, with a CPU attached to a varying number of GPUs. The highly modular design of the HGX-1 allows for optimal performance no matter the workload. It provides up to 100x faster deep learning performance compared with legacy CPU-based servers, and is estimated at one-fifth the cost for conducting AI training and one-tenth the cost for AI inferencing.
 
With its flexibility to work with data centers across the globe, HGX-1 offers existing hyperscale data centers a quick, simple path to be ready for AI.
 
Collaboration to Bring Industry Standard to Hyperscale
Microsoft, NVIDIA and Ingrasys (a Foxconn subsidiary) collaborated to architect and design the HGX-1 platform. The companies are sharing it widely as part of Microsoft’s Project Olympus contribution to the Open Compute Project, a consortium whose mission is to apply the benefits of open source to hardware and rapidly increase the pace of innovation in, near and around the data center and beyond.
 
Sharing the reference design with the broader Open Compute Project community means that enterprises can easily purchase and deploy the same design in their own data centers.