AMD Instinct MI300 AI accelerator takes aim at Nvidia GPUs

Data center-grade GPUs and accelerators for enterprise customers and cloud vendors are the new battleground for AI hardware. AMD and Google advance the race with new chips.

By Don Fluckinger, Senior News Writer

Published: 06 Dec 2023

Both AMD and Google released AI accelerators today: AMD Instinct MI300 and Google TPU v5e. Both are data center-grade processors that speed AI tasks, such as training large language models.

AMD is playing catch-up to Nvidia, which has parlayed its gaming tech expertise into an AI processing superpower. AI typically runs on chips adjacent to CPUs; AMD's accelerator is a GPU, while Google's is a proprietary tensor processing unit (TPU) that powers AI in the Google Cloud.

What do the 153 billion transistors in AMD's MI300 accelerator -- and its claimed 17TB/second bandwidth -- get enterprise IT buyers? The Instinct MI300 chips run AI operations much faster, AMD CEO Lisa Su said at a launch event.

AMD customers and partners at the event, including Dell, HPE, Microsoft, Meta, Oracle and Databricks, said they either have the chips running in their products and services, are testing them or plan to use them soon. Not only are the chips faster than their predecessors, but they can be combined to further improve performance.

"Generative AI is the most demanding data center workload ever," Su said. "It requires tens of thousands of accelerators to train and refine models with billions of parameters. And that same infrastructure is also needed to answer the millions of queries from everyone around the world.

A graphic representation of the AMD Instinct MI300 GPU accelerator.

"It's very simple: The more compute you have, the more capable the model, the faster the answers are generated. And the GPU is at the center of this generative AI world," she said.

The software that runs on AI accelerators has become a key feature of the chips themselves, said Daniel Newman, Futurum Research founder. The competition is no longer just about speeds and feeds but also about open source platforms that let developers build software and connect their large language models to the hardware.

"Today is all about AMD entering with valid, competitive capabilities and products using open source in the era of an incredibly strong or even dominant Nvidia in the AI training [chip] and overall AI chip," Newman said. "It isn't just about performance. It is also about availability, viability, capability, and the world understanding that open-source collaborative ecosystems for AI are important."

Enterprise AI buyers, take note

Many companies still field their own GPUs in their data centers or colocations -- even in the cloud-first era -- Gartner analyst Chirag Dekate said. Data privacy regulations or the need for intellectual property protection force companies to take a hybrid approach that mixes their own data centers and public clouds such as Google, AWS and Microsoft.

In some cases, an enterprise might run its proprietary LLM in its own data center to keep it off a public cloud.

The AMD GPU accelerators will be adopted not only by large public clouds but also by individual enterprise customers, Dekate predicted. The combination of hardware, software and partnerships will help those customers set up their AI operations faster.

"What AMD is announcing today is not just a GPU that can be deployed in the data center," Dekate said. "They're also announcing cloud partnerships. They're announcing platforms and software stacks. [Together they will] enable enterprises to hit the ground running with an AMD-native strategy."

Google delivers new AI accelerators

Amid the release of its Gemini general AI model and the unveiling of plans to be the first manufacturer to put generative AI on smartphones, Google also released the TPU v5e, its latest AI accelerator. TPUs power Google's own AI in apps such as Maps, YouTube and Gmail, and the company hopes Google Cloud Platform customers will follow suit.

In the future, it's likely that enterprise cloud services buyers will have different AI services powered by different manufacturers' chips, Dekate said. Some enterprise applications and operations will work best -- or cheapest -- on one chipmaker's array compared to the others. It will depend on the scale and bandwidth required for a job, such as training a large enterprise language model.

Competition will be the key to keeping AI chips viable and to keeping advancements moving in the AI hardware race as each manufacturer tries to outdo the others, Newman said.

"Ultimately we need a highly competitive marketplace for AI infrastructure, chipsets, software, and more," Newman said. "[Generative AI represents] the biggest transformation our world has seen technologically, and a healthy, vibrant, competitive ecosystem is critical."

Don Fluckinger covers digital experience management, end-user computing, CPUs and assorted other topics for TechTarget Editorial.
