Cloud Field Day Report: WEKA Throws the Data Gauntlet

I was excited for the WEKA presentation at Cloud Field Day, and the WEKIEs did not disappoint. Launched in 2017 at re:Invent, WEKA provides data services aimed at HPC workloads, whether in the public cloud or in an on-prem private cloud. Their customers are scientists exploring everything from drug discovery and genome sequencing to advances in autonomous transportation and electronic design automation. And yes, their solutions extend to AI, as many of the infrastructure principles from HPC have formed the foundation for tightly coupled AI clusters.

When you consider optimization of an AI data pipeline, each stage of the pipeline introduces different challenges for data storage. WEKA has unified this optimization through a single software approach that spans every stage and is further tuned to the different IO characteristics across AI models.

WEKA discusses super-efficient HPC storage optimization

WEKA runs across the four major cloud players, and the performance is eye-opening: 5 million IOPS in AWS, 120 fps rendering from the cloud, 40X faster model deployment, 2 TB/s of throughput in OCI, and 10X faster research. This performance also yields cost savings through fewer duplicate copies and pay-for-what-you-use storage, and it is estimated to have saved WEKA customers a collective 260 million tons of carbon emissions.

They deliver this in part by modernizing storage tiering: keeping metadata in the flash tier, chopping large files into small objects, and packing tiny files into larger objects. They also pre-stage data on SSD while waiting for the current process to complete. Objects are kept on the hotter SSD tier until the space is needed, lowering overall latency and improving application performance. They walked through an S3 example but apply similar approaches across the other public cloud offerings.
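To make the chunk-and-pack idea concrete, here is a minimal sketch of the general technique. This is not WEKA's actual implementation; the 64 MB object size, function names, and packing policy are assumptions chosen purely for illustration.

```python
# Illustrative sketch only: not WEKA's implementation. The object size and
# naming scheme are arbitrary assumptions used to show the chunk/pack concept.
import os

OBJECT_SIZE = 64 * 1024 * 1024  # hypothetical 64 MB object size

def chop_large_file(path, object_size=OBJECT_SIZE):
    """Split a large file into fixed-size chunks, each becoming one object."""
    objects = []
    with open(path, "rb") as f:
        index = 0
        while chunk := f.read(object_size):
            objects.append((f"{os.path.basename(path)}.part{index}", chunk))
            index += 1
    return objects

def pack_small_files(paths, object_size=OBJECT_SIZE):
    """Pack many tiny files into larger objects to reduce per-object overhead."""
    objects, current, current_size = [], [], 0
    for path in paths:
        with open(path, "rb") as f:
            data = f.read()
        if current and current_size + len(data) > object_size:
            objects.append(current)
            current, current_size = [], 0
        current.append((os.path.basename(path), data))
        current_size += len(data)
    if current:
        objects.append(current)
    return objects
```

The payoff of this kind of scheme is that the object tier only ever sees uniformly sized objects, regardless of whether the source data was a handful of huge files or millions of tiny ones.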

WEKA also drives data portability within hybrid and multi-cloud environments. They do this by pulling data from on-prem, snapping it to object storage, encrypting the data, and moving it to the public cloud. The same mechanism supports bursting, where analysis runs in the cloud and the data is then moved back on-prem so that public cloud services can be spun down.
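As a rough illustration of that snap-encrypt-move flow, here is a minimal sketch using boto3 and Fernet symmetric encryption. This is not WEKA's mechanism; the bucket, object key, key management, and function name are all hypothetical.

```python
# Illustrative sketch only: not WEKA's mechanism. Bucket name, object key,
# and key handling are hypothetical; in practice keys would come from a KMS.
import boto3
from cryptography.fernet import Fernet

s3 = boto3.client("s3")
fernet = Fernet(Fernet.generate_key())  # placeholder key management

def push_encrypted_snapshot(local_path: str, bucket: str, object_key: str) -> None:
    """Encrypt a snapshot file and copy it to object storage ahead of a cloud burst."""
    with open(local_path, "rb") as f:
        ciphertext = fernet.encrypt(f.read())
    s3.put_object(Bucket=bucket, Key=object_key, Body=ciphertext)

# Example: push_encrypted_snapshot("/mnt/snapshots/run42.tar", "burst-bucket", "run42.tar.enc")
```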

The WEKIE team also discussed zero-footprint storage: running applications and WEKA on the same servers to deliver cost savings and value at scale to the customer. This is employed within GPU farms, where IO resources are under-utilized, and on other hardware with excess IO capacity.

TechArena’s take: cloud providers are desperate to drive AI clusters to higher levels of performance and efficiency. That was the #1 topic at OCP this week, and it’s no surprise that all of the big players have integrated WEKA into their offerings. With more organizations moving HPC workloads to the cloud and more enterprises expanding their AI workloads in the cloud when they can get the GPU cycles, WEKA has an opportunity for major growth in the days ahead. I’m most intrigued by the converged-mode, zero-footprint solution as an innovative path to increased resource and cost efficiency. I’ll be watching this space.
