Oracle and AMD announced that AMD Instinct™ MI355X GPUs will be available on Oracle Cloud Infrastructure (OCI), giving customers more choice and more than 2X better price-performance for large-scale AI training and inference workloads compared to the previous generation. Oracle will offer zettascale AI clusters accelerated by the latest AMD Instinct processors, with up to 131,072 MI355X GPUs, enabling customers to build, train, and run inference on AI models at scale.

To support new AI applications that require larger and more complex datasets, customers need AI compute solutions designed specifically for large-scale AI training. The zettascale OCI Supercluster with AMD Instinct MI355X GPUs meets this need with a high-throughput, ultra-low-latency RDMA cluster network architecture for up to 131,072 MI355X GPUs. The AMD Instinct MI355X delivers nearly triple the compute power and 50% more high-bandwidth memory than the previous generation.

AMD Instinct MI355X-powered shapes are designed for superior value, cloud flexibility, and open-source compatibility, making them ideal for customers running today's largest language models and AI workloads. With AMD Instinct MI355X on OCI, customers will be able to benefit from:

Significant performance boost: Helps customers increase performance for AI deployments with up to 2.8X higher throughput. To enable AI innovation at scale, customers can expect faster results, lower latency, and the ability to run larger AI workloads.

Larger, faster memory: Allows customers to run large models entirely in memory, improving inference and training speeds for models that require high memory bandwidth. The new shapes offer 288 gigabytes of high-bandwidth memory (HBM3) and up to eight terabytes per second of memory bandwidth (see the memory-fit sketch after this list).

New FP4 support: Allows customers to deploy modern large language and generative AI models cost-effectively with support for the new 4-bit floating point compute (FP4) standard, enabling ultra-efficient, high-speed inference.

Dense, liquid-cooled design: Enables customers to maximize performance density at 125 kilowatts per rack for demanding AI workloads. With 64 GPUs per rack at 1,400 watts each (roughly 89.6 kW of GPU power within the rack's power envelope), customers can expect faster training times with higher throughput and lower latency.

Built for production-scale training and inference: Supports customers deploying new agentic applications with faster time-to-first-token (TTFT) and high tokens-per-second throughput (see the measurement sketch after this list). Customers can expect improved price-performance for both training and inference workloads.

Powerful head node: Helps customers optimize GPU performance through efficient job orchestration and data processing, with an AMD Turin high-frequency CPU and up to three terabytes of system memory.

Open-source stack: Enables customers to leverage flexible architectures and easily migrate their existing code with no vendor lock-in through AMD ROCm, as illustrated in the final sketch below. AMD ROCm is an open software stack that includes popular programming models, tools, compilers, and libraries, supporting AI workloads from training and data processing to inference.
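
To make the "larger, faster memory" and FP4 claims concrete, here is a minimal memory-fit sketch in plain Python. The model sizes and the 20% activation/KV-cache overhead are illustrative assumptions, not OCI or AMD specifications; only the 288 GB per-GPU HBM capacity comes from the announcement.

```python
# Rough memory-fit arithmetic for a single 288 GB MI355X GPU.
# Model sizes and the 20% overhead factor are illustrative assumptions.

HBM_GB = 288  # per-GPU high-bandwidth memory, from the announcement

BYTES_PER_PARAM = {
    "FP16/BF16": 2.0,
    "FP8": 1.0,
    "FP4": 0.5,  # the new 4-bit floating point format
}

def weights_gb(params_billions: float, bytes_per_param: float) -> float:
    """Memory needed just for the weights, in gigabytes."""
    return params_billions * 1e9 * bytes_per_param / 1e9

for params in (70, 180, 405):  # hypothetical model sizes, in billions
    for fmt, bpp in BYTES_PER_PARAM.items():
        need = weights_gb(params, bpp) * 1.2  # +20% headroom (assumption)
        fits = "fits" if need <= HBM_GB else "needs sharding"
        print(f"{params}B @ {fmt}: ~{need:.0f} GB -> {fits}")
```

The arithmetic shows why FP4 matters for single-GPU inference: a hypothetical 405B-parameter model needs roughly 972 GB at FP16 but only about 243 GB at FP4 with the assumed headroom, which fits within 288 GB of HBM.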
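The TTFT and tokens-per-second figures above are straightforward to measure against any streaming inference endpoint. The harness below is a generic sketch: it works with any iterable of tokens, and the fake stream at the bottom is purely a placeholder standing in for a real streaming client.

```python
import time
from typing import Iterable, Tuple

def measure_ttft_and_throughput(token_stream: Iterable[str]) -> Tuple[float, float]:
    """Return (time-to-first-token in seconds, tokens per second).

    `token_stream` is any iterable that yields generated tokens as
    they arrive, e.g. from a streaming inference client.
    """
    start = time.perf_counter()
    ttft = None
    count = 0
    for _ in token_stream:
        now = time.perf_counter()
        if ttft is None:
            ttft = now - start  # latency until the first token arrives
        count += 1
    elapsed = time.perf_counter() - start
    return ttft or 0.0, count / elapsed if elapsed > 0 else 0.0

# Placeholder stream simulating a model that decodes a token every 10 ms:
def fake_stream():
    for _ in range(32):
        time.sleep(0.01)
        yield "tok"

ttft, tps = measure_ttft_and_throughput(fake_stream())
print(f"TTFT: {ttft * 1000:.1f} ms, throughput: {tps:.1f} tok/s")
```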
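Finally, a minimal sketch of what "no vendor lock-in" can look like in practice: ROCm builds of PyTorch expose the familiar torch.cuda API (backed by HIP on AMD GPUs), so device-agnostic code of this shape typically runs unchanged. The layer and batch sizes here are arbitrary examples.

```python
import torch

# On a ROCm build of PyTorch, torch.cuda.is_available() reports True
# and the "cuda" device maps to the AMD GPU via HIP, so existing
# CUDA-targeted code generally needs no source changes.
device = "cuda" if torch.cuda.is_available() else "cpu"

model = torch.nn.Linear(4096, 4096).to(device)
x = torch.randn(8, 4096, device=device)

with torch.no_grad():
    y = model(x)

print(f"ran on {device}: output shape {tuple(y.shape)}")
```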