Transforming the Data Center with CXL

Steven Woo

Fellow and Distinguished Inventor

November 2022

Key Data Center Memory Challenges

  • Decreasing memory bandwidth per core (Source: Meta, OCP Summit Presentation, Nov. '21)
  • Huge latency and capacity gap in the server memory hierarchy
  • Stranded memory resources and low utilization across CPUs, GPUs, and xPUs with direct‐attached memory

[Figure: Server memory hierarchy, plotting capacity and latency against bandwidth]

CXL‐attached memory is the current focus of the CXL ecosystem


CXL Memory Tiers Span the Latency Gap

Memory tiers, from lowest to highest latency:

  • Direct‐attached native DRAM
  • Direct‐attached CXL DRAM
  • Pooled CXL DRAM
  • CXL switch/fabric‐attached memory & SCM
  • Solid State Drives

  • Memory tiering is being introduced to the Data Center, much like storage tiering before it
  • The industry is now working on software infrastructure to take advantage of these tiers
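The payoff of placing hot data in the near tiers can be sketched with a toy average-latency model. All latency figures and hit fractions below are hypothetical placeholders chosen for illustration; they are not from the slides.

```python
# Hypothetical memory tiers spanning the latency gap (illustrative numbers only).
# Each tier: (name, rough load-to-use latency in nanoseconds).
TIERS = [
    ("direct-attached native DRAM", 100),
    ("direct-attached CXL DRAM",    200),
    ("pooled CXL DRAM",             400),
    ("fabric-attached memory/SCM",  800),
    ("solid state drive",        100_000),
]

def average_latency_ns(hit_fractions):
    """Weighted average access latency, given the fraction of accesses
    served by each tier (fractions must sum to 1)."""
    assert abs(sum(hit_fractions) - 1.0) < 1e-9
    return sum(f * lat for f, (_, lat) in zip(hit_fractions, TIERS))

# If tiering software keeps hot pages near the CPU, average latency stays
# close to native DRAM even though most capacity sits in farther tiers.
hot_placement = [0.80, 0.15, 0.04, 0.009, 0.001]
print(f"{average_latency_ns(hot_placement):.0f} ns")
```

This is the same math that made storage tiering work: the far tiers contribute little to average latency as long as the page-placement software keeps their hit rates low.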


Benefits of CXL‐Attached Memory

[Figure: Classic server vs. CXL‐enabled server]

  1. Increase memory bandwidth & capacity
  2. Improve bandwidth per unit of capacity
  3. Media independence
    • For the first time, a CPU will be able to use a prior generation of DDR memory
  4. Improve support for persistent memory technologies
  5. Lower solution costs
    • Roughly 1/3 the pins are required for the same memory bandwidth
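The pin-efficiency claim can be checked with back-of-the-envelope arithmetic. The figures below are rough, illustrative approximations (pin counts especially vary by implementation), not numbers from the slides.

```python
# Back-of-the-envelope bandwidth-per-pin comparison (illustrative figures;
# signal-pin counts are rough approximations, not from any datasheet).

# A DDR5-6400 channel: 64 data bits at 6400 MT/s, over roughly 120 signal
# pins (data, strobes, command/address, clocks).
ddr5_gbps = 64 * 6400e6 / 8 / 1e9   # ~51.2 GB/s
ddr5_pins = 120

# CXL over PCIe Gen5 x16: ~32 GT/s per lane, ~64 GB/s per direction,
# 16 lanes x (TX pair + RX pair) = 64 high-speed signal pins.
cxl_gbps = 32e9 * 16 / 8 / 1e9      # ~64 GB/s per direction
cxl_pins = 16 * 4

print(f"DDR5: {ddr5_gbps / ddr5_pins:.2f} GB/s per pin")
print(f"CXL : {cxl_gbps / cxl_pins:.2f} GB/s per pin")
```

Under these assumptions the serial CXL/PCIe interface delivers a few times more bandwidth per pin than parallel DDR5, which is the effect behind the "1/3 the pins" benefit.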

[Figure: CXL device types attached to a CPU with HBM and native DRAM]

  • Type 1: Smart NIC, connecting over PCIe/CXL
  • Type 2: AI accelerator with HBM, connecting over PCIe/CXL
  • Type 3: CXL memory module (CXL memory controller + DDR DRAM), expanding the CPU's native DRAM
Addition of CXL DRAM provides >2x the memory capacity and 1.3‐1.5x the memory bandwidth (GB/s), all while leveraging the existing PCIe electrical interface.

CXL enables new memory alternatives and lower solution costs


Scaling CXL‐Attached Memory

CXL Memory Pooling

CXL Switch/Fabric‐Attached Memory

[Figure: Three topologies for scaling CXL‐attached memory. Left: compute nodes (CPU 0, CPU 1) connect over CXL directly to a CXL pooling memory controller serving a shared pool of memory (M). Middle: compute nodes (C) connect through a CXL switch to memory nodes (M). Right: compute nodes and memory nodes connect through a fabric of CXL switches.]
  • CXL Pooling: on‐demand access to a shared pool of memory for high utilization and improved TCO
  • CXL Pooling with Switch: increased scale with a modest latency penalty
  • CXL Pooling with Switch Fabric: highest scalability, enabling new architectures at higher latency

CXL provides mechanisms for CPUs to allocate/deallocate memory from a common pool
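The allocate/deallocate flow can be sketched as a toy bookkeeping model. The class and method names below are invented for illustration; real systems perform this through CXL fabric-management interfaces, not a Python API.

```python
# Toy model of a shared CXL memory pool (hypothetical API for illustration).
class CXLMemoryPool:
    def __init__(self, capacity_gb):
        self.capacity_gb = capacity_gb
        self.allocations = {}               # host id -> GB currently held

    def free_gb(self):
        return self.capacity_gb - sum(self.allocations.values())

    def allocate(self, host, gb):
        """Grant `gb` of pool memory to `host`; refuse if the pool is exhausted."""
        if self.free_gb() < gb:
            return False
        self.allocations[host] = self.allocations.get(host, 0) + gb
        return True

    def deallocate(self, host, gb):
        """Return up to `gb` from `host` to the pool for other hosts to use."""
        held = self.allocations.get(host, 0)
        self.allocations[host] = max(0, held - gb)
        return min(gb, held)

# Hosts borrow capacity on demand instead of each over-provisioning locally,
# which is the utilization/TCO argument for pooling.
pool = CXLMemoryPool(capacity_gb=1024)
assert pool.allocate("host-A", 512)
assert pool.allocate("host-B", 400)
assert not pool.allocate("host-C", 200)    # refused: only 112 GB free
pool.deallocate("host-A", 256)
assert pool.allocate("host-C", 200)        # succeeds once memory is returned
```

Without a pool, each of the three hosts would need its own worst-case provisioning; with one, the same 1024 GB is shared and reclaimed as demand shifts.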



Disclaimer

Rambus Inc. published this content on 04 November 2022 and is solely responsible for the information contained therein. Distributed by Public, unedited and unaltered, on 04 November 2022 19:11:09 UTC.