Q&A for Accelerating Gen AI Dataflow Bottlenecks

Generative AI is front page news everywhere you look. With advancements happening so quickly, it is hard to keep up. The SNIA Networking Storage Forum recently convened a panel of experts from a wide range of backgrounds to talk about Gen AI in general and specifically discuss how dataflow bottlenecks can constrain Gen AI application performance well below optimal levels. If you missed this session, “Accelerating Generative AI: Options for Conquering the Dataflow Bottlenecks,” it’s available on-demand at the SNIA Educational Library.

We promised to provide answers to our audience questions, and here they are.

Q: If ResNet-50 is a dinosaur from 2015, which model would you recommend using instead for benchmarking?

A: Setting aside the unfair aspersions being cast on the venerable ResNet-50, which is still used for inference benchmarks 😊, we suggest checking out the MLCommons website. In the benchmarks section you’ll find use cases covering both training and inference, with multiple benchmarks that can tell you more about how effectively your infrastructure will handle your intended workload.
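As a point of reference, the sketch below shows the kind of throughput measurement those suites formalize: timed, batched inference over a model under test. It assumes PyTorch and torchvision are installed; the model choice, batch size, and iteration counts are arbitrary placeholders, and this is not an MLPerf implementation (MLCommons rules also cover accuracy targets, latency percentiles, and reporting, which this omits).

```python
# Minimal inference-throughput sketch (illustrative only; not an MLPerf
# implementation). Model, batch size, and iteration counts are placeholders.
import time
import torch
import torchvision.models as models

model = models.resnet50(weights=None).eval()  # any model under test
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)

batch = torch.randn(32, 3, 224, 224, device=device)  # synthetic input

with torch.no_grad():
    for _ in range(5):                 # warm-up iterations
        model(batch)
    if device == "cuda":
        torch.cuda.synchronize()       # flush pending GPU work before timing
    start = time.perf_counter()
    iters = 50
    for _ in range(iters):
        model(batch)
    if device == "cuda":
        torch.cuda.synchronize()
    elapsed = time.perf_counter() - start

print(f"{iters * batch.shape[0] / elapsed:.1f} images/sec on {device}")
```

Read More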

Accelerating Generative AI

Generative AI workloads built on large language models are frequently throttled by insufficient resources (e.g., memory, storage, compute, or network dataflow bottlenecks). If not identified and addressed, these dataflow bottlenecks can constrain Gen AI application performance well below optimal levels.
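As a rough illustration of how such a bottleneck can be spotted, the sketch below (assuming a PyTorch DataLoader; the dataset and the “compute” step are synthetic stand-ins) times how long each training step spends waiting on the input pipeline versus computing:

```python
# Rough diagnostic: compare the time a training loop spends blocked on the
# input pipeline vs. in compute. A large wait fraction points to a dataflow
# bottleneck (storage, decode, or host-to-device transfer), not the processor.
import time
import torch
from torch.utils.data import DataLoader, TensorDataset

if __name__ == "__main__":
    data = TensorDataset(torch.randn(2048, 1024), torch.randint(0, 10, (2048,)))
    loader = DataLoader(data, batch_size=64, num_workers=2)

    wait = compute = 0.0
    t0 = time.perf_counter()
    for x, _ in loader:
        t1 = time.perf_counter()
        wait += t1 - t0          # time blocked waiting for the next batch
        (x @ x.T).sum()          # stand-in for the real forward/backward pass
        t0 = time.perf_counter()
        compute += t0 - t1

    print(f"input-pipeline wait: {100 * wait / (wait + compute):.0f}% of step time")
```

If the wait fraction dominates, the remedy lies in the storage, decode, or transfer path rather than in more compute.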

Given the compelling uses across natural language processing (NLP), video analytics, document resource development, image processing, image generation, and text generation, running these workloads efficiently has become critical to many IT and industry segments. The resources that contribute to generative AI performance and efficiency include CPUs, DPUs, GPUs, and FPGAs, plus memory and storage controllers. Read More

A Q&A on the Open Programmable Infrastructure (OPI) Project

Last month, the SNIA Networking Storage Forum hosted several experts leading the Open Programmable Infrastructure (OPI) project with a live webcast, “An Introduction to the OPI (Open Programmable Infrastructure) Project.” The project has been created to address a new class of cloud and datacenter infrastructure component. This new infrastructure element, often referred to as a Data Processing Unit (DPU), Infrastructure Processing Unit (IPU) or, generically, an xPU, takes the form of a server-hosted PCIe add-in card or on-board chip(s), containing one or more ASICs or FPGAs, usually anchored around a single powerful SoC device.

Our OPI experts provided an introduction to the OPI Project and then explained lifecycle provisioning, APIs, use cases, the proof of concept, and the developer platform. If you missed the live presentation, you can watch it on demand and download a PDF of the slides at the SNIA Educational Library. The attendees at the live session asked several interesting questions. Here are answers to them from our presenters.

Q. Are there any plans for OPI to use GraphQL for API definitions since GraphQL has a good development environment, better security, and a well-defined, typed, schema approach?
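For readers unfamiliar with the schema-first style the question refers to, here is a minimal sketch using the graphql-core Python package; the Dpu type and its fields are hypothetical and not part of any OPI API definition:

```python
# Illustration of GraphQL's typed, schema-first style referred to in the
# question (using the graphql-core package). Type and field names here are
# hypothetical; this is not an OPI API.
from graphql import build_schema, graphql_sync

schema = build_schema("""
    type Dpu {
        id: ID!
        vendor: String!
        firmwareVersion: String
    }

    type Query {
        dpus: [Dpu!]!
    }
""")

# A trivial resolver backed by static data, just to show a typed query.
root = {"dpus": [{"id": "dpu-0", "vendor": "ExampleCorp", "firmwareVersion": "1.2"}]}
result = graphql_sync(schema, "{ dpus { id vendor } }", root_value=root)
print(result.data)
```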

Read More

FAQ on CXL and SDXI

How are Compute Express Link™ (CXL™) and the SNIA Smart Data Accelerator Interface (SDXI) related? It’s a topic we covered in detail at our recent SNIA Networking Storage Forum webcast, “What’s in a Name? Memory Semantics and Data Movement with CXL and SDXI,” where our experts, Rita Gupta and Shyam Iyer, introduced both SDXI and CXL, highlighted the benefits of each, discussed data movement needs in a CXL ecosystem, and covered SDXI advantages in a CXL interconnect. If you missed the live session, it is available in the SNIA Educational Library along with the presentation slides. The session was highly rated by the live audience, who asked several interesting questions. Here are answers to them from our presenters, Rita and Shyam.
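As a conceptual illustration of the descriptor-based data movement SDXI standardizes, the toy sketch below models a copy descriptor and a software “mover” that drains a ring of them. The field names and layout are illustrative only; the actual SDXI specification defines precise descriptor formats, rings, and doorbell semantics that this does not reproduce:

```python
# Conceptual model of descriptor-based data movement in the spirit of SDXI.
# Field names and layout are illustrative only -- the real SDXI specification
# defines exact descriptor formats, rings, and doorbells that this toy
# software "mover" does not reproduce.
from dataclasses import dataclass

@dataclass
class CopyDescriptor:
    src_offset: int   # source location (offset into a flat buffer here)
    dst_offset: int   # destination location
    length: int       # bytes to move

def software_mover(memory: bytearray, ring: list[CopyDescriptor]) -> None:
    """Drain a ring of descriptors, emulating an accelerator's copy engine."""
    for d in ring:
        memory[d.dst_offset:d.dst_offset + d.length] = \
            memory[d.src_offset:d.src_offset + d.length]

mem = bytearray(b"hello, sdxi!" + b"\x00" * 12)
software_mover(mem, [CopyDescriptor(src_offset=0, dst_offset=12, length=12)])
print(mem)  # source bytes copied to the destination region
```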

Q. Now that SDXI v1.0 is out, can application implementations use SDXI today?

Read More

An Overview of the Linux Foundation OPI (Open Programmable Infrastructure)

A new class of cloud and datacenter infrastructure component is emerging in the marketplace. This new infrastructure element, often referred to as a Data Processing Unit (DPU), Infrastructure Processing Unit (IPU) or, generically, an xPU, takes the form of a server-hosted PCIe add-in card or on-board chip(s), containing one or more ASICs or FPGAs, usually anchored around a single powerful SoC device.

The Open Programmable Infrastructure (OPI) project has been created to address the configuration, operation, and lifecycle of these devices. It also has the goal of fostering an open software ecosystem for DPUs/IPUs covering edge, datacenter, and cloud use cases. The project intends to delineate what a DPU/IPU is; to define frameworks and architectures for DPU/IPU-based software stacks applicable to any vendor’s hardware solution; to create a rich open-source application ecosystem; to integrate with existing open-source projects aligned to the same vision, such as the Linux kernel, IPDK.io, DPDK, DASH, and SPDK; and to create new APIs for interaction with and between the elements of the DPU/IPU ecosystem.
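To make that last goal concrete, here is a hypothetical sketch of the kind of vendor-neutral abstraction such APIs enable: orchestration code written once against a common interface, with each vendor supplying its own backend. All class and method names below are invented for illustration; OPI’s actual API definitions live in the project’s repositories.

```python
# Hypothetical sketch of the vendor-neutral abstraction OPI aims for:
# one interface, multiple vendor backends. All class and method names are
# invented for illustration; see the OPI repositories for the project's
# actual API definitions.
from abc import ABC, abstractmethod

class DpuProvisioner(ABC):
    """Vendor-neutral lifecycle interface for a DPU/IPU."""

    @abstractmethod
    def boot(self, firmware_image: str) -> None: ...

    @abstractmethod
    def status(self) -> str: ...

class VendorADpu(DpuProvisioner):
    def boot(self, firmware_image: str) -> None:
        print(f"vendor-A: flashing and booting {firmware_image}")

    def status(self) -> str:
        return "running"

def provision(dpu: DpuProvisioner, image: str) -> None:
    # Orchestration depends only on the neutral interface,
    # never on a specific vendor's SDK.
    dpu.boot(image)
    assert dpu.status() == "running"

provision(VendorADpu(), "fw-v1.0.bin")
```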

Read More