
AMD Datacenter & AI Conference Recap: All Eyes on AI and Cost Optimization

The News: AMD recently held its AMD Datacenter & AI Conference, where the company made a flurry of announcements spanning its EPYC data center processors, its upcoming Instinct accelerators, and its growing software ecosystem. Read AMD’s Press Release for more information.

Analyst Take: More organizations are embarking on digital transformation (DX) strategies in which AI permeates the company, in hopes of gaining productivity improvements, gross and operating margin accretion, supply chain risk mitigation, and much more. The results of Futurum Group’s most recent Digital Transformation Index for 2023 support that viewpoint. Thus, AMD’s Datacenter & AI Conference announcements were on the mark with market needs.

AMD made a flurry of product announcements, including new 4th Generation EPYC server processors (Bergamo and Genoa-X), its Instinct MI300 Series accelerators, and updates to its networking portfolio. The following are the key announcements we were tracking.

4th GEN AMD EPYC 97X4 (Bergamo): Targeting CSP Cloud-Native Workloads

The company’s EPYC 97X4 processors (e.g., 9754, 9754S, and 9734) are built on AMD’s Zen 4c architecture using TSMC’s 5-nanometer process and are designed for higher density and power efficiency, targeting cloud-native workloads (e.g., NoSQL, content and collaboration).

The EPYC 9754 processor has 128 cores and 256 threads in a single socket. The chip has a base clock frequency of 2.25 GHz, while its max boost clock is 3.1 GHz. The processor supports up to 6144 GB of memory across 12 memory channels, with a maximum memory bandwidth of 461.0 GB/s.

Image Source: AMD
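
As a quick sanity check on the memory figure, peak theoretical DDR5 bandwidth follows from channel count, transfer rate, and bus width. A minimal sketch, assuming DDR5-4800 and a 64-bit (8-byte) bus per channel:

```python
# Peak theoretical memory bandwidth = channels x transfer rate x bytes per transfer.
# Assumes DDR5-4800 (4800 MT/s) and a 64-bit (8-byte) bus per channel.
channels = 12
transfers_per_sec = 4800e6  # 4800 MT/s
bytes_per_transfer = 8      # 64-bit channel

bandwidth_gb_s = channels * transfers_per_sec * bytes_per_transfer / 1e9
print(f"Peak bandwidth: {bandwidth_gb_s:.1f} GB/s")  # 460.8 GB/s, in line with AMD's 461.0 GB/s
```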

Like rivals such as Ampere that target the fast-growing cloud service provider (CSP) server segment, AMD prioritized greater core density over clock speed, a trade-off that promises to make the economics much richer for CSPs.

Fundamentally, this gives CSPs, which run thousands of servers in their rapidly growing data centers, the ability to deploy fewer servers while consuming less power, which translates into CAPEX and OPEX savings.

Backing its claims, AMD illustrated that the EPYC 9754 versus an Ampere-based alternative required 55% fewer servers, consumed 39% less power annually, and delivered 39% lower operating expenses and 19% lower total cost of ownership (TCO). In essence, more virtual CPUs (vCPUs) can run on a single server instance.

Image Source: AMD
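
The consolidation arithmetic behind claims like these is straightforward to model. The sketch below uses hypothetical, illustrative platform figures (the vCPU and power numbers are placeholders, not AMD’s published test configuration) to show how higher thread density per socket cascades into fewer servers and lower annual energy use:

```python
import math

def servers_needed(target_vcpus: int, vcpus_per_server: int) -> int:
    """Servers required to host a fixed vCPU footprint."""
    return math.ceil(target_vcpus / vcpus_per_server)

# Hypothetical figures for illustration only -- not AMD's published test setup.
TARGET_VCPUS = 10_000
platforms = {
    "2P EPYC 9754":    {"vcpus": 512, "watts": 1000},  # 2 sockets x 128 cores x 2 threads
    "2P Ampere-based": {"vcpus": 256, "watts": 900},   # 2 sockets x 128 single-threaded cores
}

for name, spec in platforms.items():
    n = servers_needed(TARGET_VCPUS, spec["vcpus"])
    kwh_per_year = n * spec["watts"] * 24 * 365 / 1000
    print(f"{name}: {n} servers, {kwh_per_year:,.0f} kWh/year")
```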

4th GEN AMD EPYC (Genoa-X) 3D V-Cache Targeting Technical Computing

The Genoa-X line of processors (e.g., 9684X, 9384X, and 9184X) targets technical computing workloads such as design automation, fluid dynamics, and finite element analysis. The 4th Gen AMD EPYC 9684X has 96 cores and 192 threads. The chip has a base clock frequency of 2.55 GHz, while its max boost clock is up to 3.7 GHz. The processor supports up to 6144 GB of DDR5-4800 memory across 12 memory channels and has a maximum memory bandwidth of 461.0 GB/s. Its thermal design power (TDP) is 400 W.

Image Source: AMD

The chip is a descendant of Milan-X, which introduced 3D V-Cache, tripling the available L3 cache by stacking an additional die on an existing Zen 3 chip. The target customers are those who need to maximize per-core performance or run workloads that benefit from the extra cache, such as CAD/CAM, finite element analysis, and electronic design automation (EDA).

Accelerator Market: Instinct MI300A & MI300X

Like key rivals such as NVIDIA, AMD sees a massive opportunity in the $30 billion data center AI accelerator market, which is expected to grow to $150 billion-plus by 2027, a >50% compound annual growth rate (CAGR). The market will continue to expand as more companies see the promise of generative AI tools such as ChatGPT, among other factors.

Image Source: AMD
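
Those endpoints imply the stated growth rate; a quick check, assuming a 2023 baseline of $30 billion and a four-year horizon to 2027:

```python
# CAGR = (end / start) ** (1 / years) - 1
start, end, years = 30e9, 150e9, 4   # assumes a 2023 baseline growing through 2027
cagr = (end / start) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")   # ~49.5%; north of 50% if the market tops $150B
```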

AMD is targeting generative AI workloads with its Instinct MI300A and MI300X accelerators, both based on its CDNA 3 GPU architecture. The MI300A is a combined CPU+GPU device, while the MI300X is GPU-only.

The AMD Instinct MI300A is an accelerated processing unit (APU) that combines a CPU and GPU in one package: 24 Zen 4 CPU cores alongside AMD’s CDNA 3 graphics engine, with 128 GB of HBM3 (high bandwidth memory) across 8 stacks. The company noted that the MI300A will be deployed in Lawrence Livermore National Laboratory’s El Capitan supercomputer.

The AMD Instinct MI300X will be AMD’s flagship for large-scale AI. The device is a GPU-only accelerator targeting large language models (LLMs) and generative AI, designed to go toe-to-toe with NVIDIA’s H100 Hopper accelerator. It carries a total of 192 GB of HBM3 memory delivering 5.2 TB/s of memory bandwidth and comprises 153 billion transistors in a 12-chiplet design.

AMD Infinity Architecture Platform

AMD also introduced its Instinct Platform. It is an 8-way design powered by eight MI300X accelerators with 1.5 TB of total HBM3 memory, built to an industry-standard Open Compute Project (OCP) server design that will help companies strengthen AI training and inferencing. The platform is intended to compete with NVIDIA’s DGX supercomputer platform for AI applications.

Image Source: AMD
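
The platform’s aggregate memory figure follows directly from the per-device spec; a quick check:

```python
# Aggregate HBM3 across the 8-way Instinct Platform.
hbm_per_mi300x_gb = 192
accelerators = 8
total_tb = hbm_per_mi300x_gb * accelerators / 1024
print(f"Total HBM3: {total_tb:.1f} TB")  # 1.5 TB, matching AMD's platform figure
```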

Overall, the AMD Datacenter & AI Conference did not disappoint. The company is laser-focused on capturing a greater share of the server market, targeting cloud-native (CSP) and technical computing workloads with its EPYC 97X4 (Bergamo) and Genoa-X CPUs. Fundamentally, AMD recognizes that hyperscale providers have been using their own Arm-based CPUs, along with others such as Ampere’s, to achieve greater density per server, since more virtual CPU (vCPU) cores can be packed onto a single server. The company backed up its claims versus Ampere by showing it could use fewer servers and less power, thereby saving on CAPEX and OPEX.

AMD clearly has NVIDIA in its sights and wants a sizable slice of the $30 billion data center AI accelerator market that is pegged to grow to $150 billion-plus by 2027, a >50% CAGR, as companies continue to see the benefits of generative AI. AMD also announced its MI300A (CPU+GPU), MI300X (GPU), and AMD Infinity Architecture Platform. To be sure, a key part of the company’s strategy for taking a portion of the market (and trying to thwart NVIDIA) is partnering with the open source development community (e.g., PyTorch, Hugging Face) around ROCm, its open-source alternative to NVIDIA’s CUDA, which bodes well for the company. The dynamic is analogous to BI teams standardizing on R or Python: an open ecosystem makes global talent much easier to find and encourages best-practice sharing.
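
Part of why the PyTorch tie-in matters: ROCm builds of PyTorch expose the familiar torch.cuda API (backed by AMD’s HIP runtime), so most model code written for NVIDIA GPUs typically runs unmodified on Instinct hardware. A minimal device-agnostic sketch, assuming a PyTorch installation with either a CUDA or ROCm backend:

```python
import torch

# On ROCm builds of PyTorch, torch.cuda is backed by AMD GPUs via HIP,
# so "cuda" device code generally runs unchanged on Instinct accelerators.
device = "cuda" if torch.cuda.is_available() else "cpu"

x = torch.randn(1024, 1024, device=device)
y = torch.randn(1024, 1024, device=device)
z = x @ y  # dispatched to cuBLAS on NVIDIA, rocBLAS/hipBLAS on AMD
print(z.device, z.shape)
```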

Disclosure: The Futurum Group is a research and advisory firm that engages or has engaged in research, analysis, and advisory services with many technology companies, including those mentioned in this article. The author does not hold any equity positions with any company mentioned in this article.

Analysis and opinions expressed herein are specific to the analyst individually and data and other information that might have been provided for validation, not those of The Futurum Group as a whole.

Other insights from The Futurum Group:

AMD and Hugging Face Team Up to Democratize AI Compute – Shrewd Alliance Could Lead to AI Compute Competition, Lower AI Costs

AMD Datacenter & AI Event

AMD 4th Gen Epyc CPUs Now Optimized for Embedded Systems
