By Saurabh Kulkarni, Head of Engineering for North America, Graphcore; Alex Tsyplikhin, Senior Manager, Field AI Engineering, Graphcore; and Mazhar Memon, Director of R&D, VMware

We are excited to share that VMware's Project Radium will support Graphcore IPUs as part of its hardware-disaggregation initiative. This will enable pooling and sharing of Intelligence Processing Unit (IPU) resources over the primary datacenter network in virtualized, multi-tenant environments, without pushing complexity onto users or management software. The network-disaggregated, scale-out architecture of IPU-PODs, coupled with the flexible virtualization features of Project Radium, will unlock new frontiers in training very large models at scale and in deploying models to reliable production environments for AI-based services.

A closer look at the IPU

The IPU, exemplified by Graphcore's Colossus™ MK2 GC200, is a new type of parallel processor designed specifically around the computational requirements of modern AI models. The IPU offers a high degree of fine-grained parallelism at the hardware level: it supports single- and half-precision floating-point arithmetic and excels at sparse compute, without depending on any particular sparsity in the underlying data. The processor is well suited to both training and inference of deep neural networks, the workhorses of contemporary machine-learning (ML) workloads.

Instead of adopting a conventional single-instruction, multiple-data (SIMD) or single-instruction, multiple-threads (SIMT) architecture, the IPU uses a multiple-instruction, multiple-data (MIMD) architecture with ultra-high-bandwidth on-chip memory and low-latency, high-bandwidth interconnects for efficient intra- and inter-chip communication. This makes IPUs an ideal target for parallelizing ML models at datacenter scale.

IPU-PODs and the power of disaggregation

Scaling out from one to thousands of IPUs is seamless, thanks to the IPU-POD architecture. IPU-PODs are network-disaggregated clusters of IPUs that scale elastically over the network, based on workload needs and independently of the CPU resources to which they are connected. This allows users to dial the CPU:IPU ratio up or down, in hyperscale or on-premises enterprise environments, through simple resource-binding constructs. The IPU-POD architecture also enables near bare-metal performance in virtualized environments.

The flexibility afforded by this independent scaling of CPU and IPU resources helps users meet workload-specific compute demands in a cost-optimized manner. For example, ML models for natural-language processing are generally not CPU-intensive, whereas computer-vision workloads can be, owing to steps such as image pre-processing and augmentation. This is especially valuable in cloud environments, where CPU resources can be spun up and down easily, allowing customers to reap economies of scale.
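
To make the resource-binding idea concrete, the following is a minimal sketch of how an independent CPU:IPU ratio might be declared in a Kubernetes cluster via the official Python client. The extended-resource name (gc.graphcore.ai/ipus) and the container image are assumptions for illustration only; the exact names depend on the IPU device plugin deployed in the cluster.

    from kubernetes import client, config

    # Illustrative pod spec binding 4 host CPUs to 16 IPUs. The IPU resource
    # key below is an assumed device-plugin name, not an official constant.
    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(name="nlp-training-job"),
        spec=client.V1PodSpec(
            restart_policy="Never",
            containers=[
                client.V1Container(
                    name="trainer",
                    image="example.org/poplar-pytorch:latest",  # hypothetical image
                    resources=client.V1ResourceRequirements(
                        # CPU and IPU counts scale independently: an NLP job
                        # might keep CPUs low, while a vision job raises them.
                        requests={"cpu": "4", "gc.graphcore.ai/ipus": "16"},
                        limits={"cpu": "4", "gc.graphcore.ai/ipus": "16"},
                    ),
                )
            ],
        ),
    )

    config.load_kube_config()  # local kubeconfig
    client.CoreV1Api().create_namespaced_pod(namespace="default", body=pod)

Because the IPUs live in a network-attached pool rather than on the host, raising or lowering either number in this spec is purely a scheduling decision, with no change to the physical servers.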

Software considerations

Graphcore's Poplar SDK has been co-designed with the processor since Graphcore's inception. It supports standard ML frameworks, including PyTorch and TensorFlow, as well as containerization, orchestration, and deployment technologies such as Docker and Kubernetes.
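
As a flavor of what targeting an IPU from a standard framework looks like, below is a minimal training sketch using PopTorch, the Poplar SDK's PyTorch integration. The model, tensor shapes, and option values are illustrative only; consult the SDK documentation for the canonical flow.

    import torch
    import poptorch

    # In PopTorch, the loss is computed inside the model's forward pass so
    # the whole training step can be compiled for, and run on, the IPU.
    class Classifier(torch.nn.Module):
        def __init__(self):
            super().__init__()
            self.fc = torch.nn.Linear(128, 10)
            self.loss = torch.nn.CrossEntropyLoss()

        def forward(self, x, labels=None):
            out = self.fc(x)
            if labels is None:
                return out                      # inference path
            return out, self.loss(out, labels)  # training path

    opts = poptorch.Options()
    opts.deviceIterations(4)  # batches the IPU consumes per host invocation

    model = Classifier()
    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    training_model = poptorch.trainingModel(model, options=opts, optimizer=optimizer)

    # 16 samples = 4 device iterations x micro-batch of 4 (illustrative shapes).
    x = torch.randn(16, 128)
    labels = torch.randint(0, 10, (16,))
    out, loss = training_model(x, labels)  # one compiled training step on the IPU

In this sketch, the module itself is ordinary PyTorch; wrapping it with poptorch.trainingModel is the only IPU-specific step.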

Beyond support for core ML frameworks, integration with virtualization, orchestration, and scheduling software is crucial if customers are to use IPUs easily and at scale in enterprise environments. Multi-tenancy, isolation, and security are key tenets that solution providers must uphold when operating in hyperscale environments. Resource-management components in Graphcore's software stack facilitate easy integration with a variety of cloud provisioning and management stacks, such as the one offered by VMware, enabling frictionless operation across public-cloud, hybrid-cloud, and on-premises infrastructure.

About Project Radium

A big step towards disaggregated computation optimized for AI, Project Radium enables remoting, pooling, and sharing of resources across a wide range of hardware architectures, including Graphcore IPUs and IPU-PODs. Device virtualization and remoting capabilities are delivered across a multitude of high-performance AI accelerators without explicit code changes or user intervention, so developers can concentrate fully on their models rather than on hardware-specific compilers, drivers, or software optimizations. By dynamically attaching to hardware such as IPU-PODs over a standard network, users will be able to leverage high-performance architectures like the IPU to accelerate more demanding use cases at scale.

Enterprise AI made easy

The combination of these technologies makes enterprise AI capabilities much more accessible. Radium lets users leverage the network-disaggregated architecture of IPU-PODs while addressing the multi-tenancy, isolation, and security needs of the most demanding enterprise environments. Whether in a public cloud, a hybrid cloud, or on premises, VMware's Project Radium combined with Graphcore IPUs is a compelling solution for bringing virtualized IPUs to the enterprise.
