Storage's role in generative AI

Generative AI continues to create industry buzz. Experts say storage plays a critical role in making the technology run.

By

Adam Armstrong, News Writer

Published: 09 May 2023

Generative AI is rising in popularity thanks to converging advances in IT infrastructure, including storage.

Generative AI relies on deep learning, compute and GPUs, all of which have matured in the last ten years. It also needs high-IOPS storage that provides fast access to large datasets, a capability tech vendors have been refining for decades as IT has continued to evolve. Storage tools such as object storage, which can scale for large datasets, and distributed parallel file systems, which provide high-performance, low-latency data processing, have been the backbone of cloud computing and the big data movement.

Now storage is becoming an underlying foundation for AI. Some AI models are small enough to execute in memory, putting more of a spotlight on compute, according to Mike Matchett, an analyst at Small World Big Data, an IT analyst firm. But large language models (LLMs) like ChatGPT require, in some cases, billions of nodes, which are too costly to keep in memory.

"You're not holding [billions of] nodes in memory. The storage becomes a lot more important," Matchett said.

Memory such as RAM is faster than storage but also more expensive, according to Steve McDowell, an analyst and founding partner at NAND Research.

"You're always going to be limited by the cost of RAM, and it's always going to be a balance [with storage]," McDowell said.

He said LLMs would need a parallel file system, such as Weka or Panasas, sitting on top of a high-performance scalable storage system, such as Dell's PowerMax, Vast Data's Universal Storage and Pure Storage's FlashBlade.
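The memory-versus-storage trade-off the analysts describe can be made concrete with back-of-the-envelope arithmetic. The sketch below is illustrative only: it treats the "nodes" mentioned above as model parameters, assumes two bytes per parameter (16-bit weights) and counts only the weights themselves, not training data or optimizer state. The function names are hypothetical and do not come from any vendor tool.

```python
def model_footprint_gib(num_params: int, bytes_per_param: int = 2) -> float:
    """Approximate size of the model weights alone, in GiB.

    Assumption: every parameter occupies `bytes_per_param` bytes
    (2 bytes corresponds to 16-bit weights).
    """
    return num_params * bytes_per_param / 2**30


def fits_in_memory(num_params: int, memory_gib: float,
                   bytes_per_param: int = 2) -> bool:
    """Return True if the weights alone fit in the given memory budget."""
    return model_footprint_gib(num_params, bytes_per_param) <= memory_gib


if __name__ == "__main__":
    # Hypothetical model sizes and a hypothetical 80 GiB memory budget.
    for params in (1_000_000_000, 7_000_000_000, 175_000_000_000):
        size = model_footprint_gib(params)
        verdict = "fits" if fits_in_memory(params, 80) else "does not fit"
        print(f"{params:,} parameters: about {size:,.1f} GiB, {verdict} in 80 GiB")
```

Under these assumptions, the footprint grows linearly with parameter count, so models with hundreds of billions of parameters quickly outgrow a single memory budget and push more of the work onto the storage layer.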

Storage's role in generative AI

Generative AI can only produce a good outcome after being trained on reams of data, according to Khalid Eidoo, co-founder and CTO of Crater Labs, an AI and machine learning company based in Toronto that works with businesses to solve specific problems using AI. One method Crater employs is a type of generative AI called generative adversarial networks (GANs), which it used to identify potential structural defects in welds when constructing a nuclear power plant.

In this case, the GAN, which uses four different neural networks, produces images that then get reconciled. Out of the hundreds of thousands of images generated, only five or six meet the high quality level needed, Eidoo said.

To support this functionality, Crater needed high-throughput storage that could read and write simultaneously, and it chose Pure Storage's FlashBlade product. "When dealing with generative networks, you're simultaneously reading millions of images to write millions of images," Eidoo said.

GPUs play an important role in generative AI by accelerating the training of models. But when working with millions of images, the GPU buffer quickly fills up and images need to be written quickly to storage, Eidoo said. High-throughput storage can reduce the potential for a data bottleneck.
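A rough way to picture the bottleneck Eidoo describes is a buffer that fills at the rate the GPU produces data and drains at the rate storage can absorb it. The sketch below is a simplified model under that assumption. The function name and all figures are hypothetical, and real pipelines involve batching, caching and bursts that this ignores.

```python
import math


def seconds_until_full(buffer_gb: float, produce_gb_per_s: float,
                       drain_gb_per_s: float) -> float:
    """Time until a buffer fills, given steady produce and drain rates.

    Returns math.inf when storage drains data at least as fast as the
    GPU produces it, meaning the buffer never fills in this model.
    """
    if buffer_gb < 0 or produce_gb_per_s < 0 or drain_gb_per_s < 0:
        raise ValueError("sizes and rates must be non-negative")
    net_fill_rate = produce_gb_per_s - drain_gb_per_s
    if net_fill_rate <= 0:
        return math.inf
    return buffer_gb / net_fill_rate


if __name__ == "__main__":
    # Hypothetical figures: an 80 GB buffer, 10 GB/s produced by the GPU.
    for drain in (2.0, 8.0, 10.0):
        t = seconds_until_full(80, 10.0, drain)
        print(f"storage at {drain} GB/s: buffer full after {t} s")
```

In this model, raising storage throughput toward the GPU's output rate lengthens the time before the buffer fills, and matching it removes the stall entirely.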

Flash not necessary, but optimal

High-IOPS storage can provide a user experience more like high-performance computing, according to Matchett, and that performance does not strictly require flash.

"You can do parallel file systems on a large number of spinning disks in aggregate," Matchett said.

A parallel file system feeds data for the LLMs to the GPUs, as in DDN's A3I, which combines DDN's EXAScaler parallel file system with NVIDIA's DGX, Matchett said.

A hybrid version of EXAScaler could be used for generative AI, but it caches and tiers storage, potentially affecting performance, McDowell said. Because the GPUs can't sit idle, data on the aggregated HDDs is cached on SSDs, which operate faster than HDDs.
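Matchett's point about aggregating spinning disks comes down to simple division: the slower each drive, the more drives a parallel file system must stripe across to reach a target throughput. The sketch below assumes ideal linear scaling with no file system or network overhead. The function name and the per-drive figures are hypothetical placeholders, not measured or vendor-published numbers.

```python
import math


def drives_needed(target_gb_per_s: float, per_drive_mb_per_s: float) -> int:
    """Minimum number of drives to reach a target aggregate throughput.

    Assumption: throughput scales linearly with drive count, which real
    systems only approximate. Uses 1 GB = 1,000 MB.
    """
    if per_drive_mb_per_s <= 0:
        raise ValueError("per-drive throughput must be positive")
    return math.ceil(target_gb_per_s * 1000 / per_drive_mb_per_s)


if __name__ == "__main__":
    # Hypothetical per-drive throughput values, for illustration only.
    target = 100  # GB/s the GPUs are assumed to need
    for label, mb_per_s in (("HDD", 200), ("SSD", 5000)):
        count = drives_needed(target, mb_per_s)
        print(f"{label} at {mb_per_s} MB/s each: {count} drives for {target} GB/s")
```

The drive count is what drives footprint and power, which is why the choice between aggregated HDDs and flash is as much about density as about raw speed.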

"[Those] that are serious about large language models, they're buying high-end flash storage," McDowell said.

Flash provides high IOPS in denser footprints and can also provide LLMs with aggregated performance, Eidoo said. It's possible to use millions of HDDs, but footprint matters. Flash storage is denser, higher performing and uses less power than HDDs. Technology that reduces power consumption now will benefit generative AI in the future.

"GPUs use power like there's no tomorrow," Eidoo said.

Cloud vs. on premises

LLMs also need space for model training. Whether that is on premises, in the public cloud or a hybrid of the two depends on the size of the model and the performance and control needed, Matchett said.

If generative AI is used for research, storing LLMs in the cloud is ideal because users can get the scale required without capital investment in infrastructure. However, Matchett predicts vendors will offer generative AI applications that will become core to their business platforms. For those that are dependent on performance and security, on-premises storage will be key.

"As an enterprise operation, you've got production workloads that are running at some level of continuity, and that can get expensive," Matchett said.

Crater Labs worked with AWS and Google Cloud before moving to a hybrid infrastructure for reasons of speed, security and cost. It also considered NetApp and HPE before choosing Pure Storage.

Now, Crater Labs uses a combination of on-premises FlashBlade storage and FlashBlade's built-in connection to an S3 object store bucket, according to Eidoo. Crater generates terabytes of data per week, which is inefficient to store solely on premises. Using the S3 object store lets Crater access images in the cloud for modeling.

"We knew very quickly as we started developing these generative models that the performance we were getting in the cloud wasn't adequate," Eidoo said.

Adam Armstrong is a TechTarget Editorial news writer covering file and block storage hardware and private clouds. He previously worked at StorageReview.com.
