This article is based on Santosh Radha’s brilliant talk at the AI Accelerator Summit in San Jose. As an AIAI member, you can enjoy the complete recording here. For more exclusive content, head to your membership dashboard.
Generative AI is revolutionizing how we interact with technology. From chatbots that converse like humans to image generators producing stunning visuals, this incredible tech is transforming our world.
But beneath these mind-blowing capabilities lies a massive computing infrastructure packed with technical complexities that often go unnoticed.
In this article, we’ll dive into the realm of high-performance computing (HPC) and the challenges involved in productionizing generative AI applications like digital twins. We’ll explore the explosive growth in computing demands, the limitations of traditional HPC setups, and the innovative solutions emerging to tackle these obstacles head-on.
But first, let me quickly introduce myself. I’m Santosh, and my background is in physics. Today, I head research and product at Covalent, where we focus on orchestrating large-scale computing for AI, model development, and other related domains.
Now, let’s get into it.
The rise of generative AI
Recently, at NVIDIA's GTC conference, Jensen Huang made an interesting observation: he called generative AI the “defining technology of our time” and termed it the fourth industrial revolution. I’m sure you’d all agree that generative AI is indeed the next big thing.
We’ve already had the first industrial revolution with steam-powered machines, followed by the advent of electricity, and then, of course, computers and the internet. Now, we’re witnessing a generative AI revolution that’s transforming how we interact with technology and touching almost every sector imaginable.
We’ve moved beyond traditional machine learning; generative AI is making inroads into numerous domains. It’s used in climate tech, health tech, software and data processing, enterprise AI, robotics, and digital twins. It’s these digital twins that we’re going to focus on today.
Digital twins: Bridging the physical and virtual worlds
In case you’re not familiar with digital twins, let me explain the concept. A digital twin is a virtual representation of a physical system or process. It involves gathering numerical data from the real-world system and feeding it into a digital model that mirrors it.
For instance, let’s consider robotics and manufacturing applications. Imagine a large factory with numerous robots operating autonomously. Computer vision models track the locations of robots, people, and objects within the facility. The goal is to feed this numerical data into a database that a foundational AI model can understand and reason with.
With this virtual replica of the physical environment, the AI model can comprehend the real-world scenario unfolding. If an unexpected event occurs – say, a box falls from a shelf – the model can simulate multiple future paths for the robot and optimize its recommended course of action.
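To make that loop a little more concrete, here’s a minimal sketch in Python of the idea: the twin holds the latest tracked state of the factory floor, and candidate robot actions are rolled forward in simulation and scored before one is recommended. This is purely illustrative – the `TrackedObject` record, the noisy dynamics, and the risk score are assumptions for the example, not Covalent’s or anyone’s production pipeline.

```python
from dataclasses import dataclass
import random

# Hypothetical record for one tracked entity in the digital twin
# (in practice this state would come from computer vision pipelines).
@dataclass
class TrackedObject:
    object_id: str
    kind: str          # e.g. "robot", "person", "box"
    x: float
    y: float

# The "twin" is simply the latest known state of the factory floor.
twin_state = [
    TrackedObject("robot-1", "robot", x=2.0, y=3.0),
    TrackedObject("box-17", "box", x=2.5, y=3.2),   # the unexpectedly fallen box
]

def simulate_path(state, candidate_move, steps=10):
    """Roll the twin forward under a candidate robot move and
    return a crude risk score (lower is better)."""
    robot = next(o for o in state if o.kind == "robot")
    x, y = robot.x, robot.y
    risk = 0.0
    for _ in range(steps):
        x += candidate_move[0] + random.gauss(0, 0.05)  # noisy dynamics
        y += candidate_move[1] + random.gauss(0, 0.05)
        for obstacle in state:
            if obstacle.kind != "robot":
                dist = ((x - obstacle.x) ** 2 + (y - obstacle.y) ** 2) ** 0.5
                risk += max(0.0, 1.0 - dist)  # penalize near-collisions
    return risk

# Evaluate a few candidate moves and recommend the safest one.
candidates = [(0.1, 0.0), (0.0, 0.1), (-0.1, 0.0), (0.0, -0.1)]
best = min(candidates, key=lambda m: simulate_path(twin_state, m))
print("Recommended move:", best)
```

In a real deployment, the hand-rolled simulation and scoring above would be replaced by a physics engine and a foundation model reasoning over the twin’s state, but the shape of the loop – observe, update the twin, simulate futures, recommend an action – stays the same.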
Another powerful application is in healthcare. Patient data from vital signs and other medical readings could feed into a foundational model, enabling it to provide real-time guidance and recommendations to doctors based on the patient’s current condition.
The potential of digital twins is immense. However, taking this concept into real-world production or healthcare environments presents numerous technical challenges that need to be addressed.
The computing power behind the scenes
Let’s shift our focus now to what powers these cutting-edge AI applications and use cases – the immense computing resources required.
A few years ago, giants like Walmart were spending the most on cloud computing services from providers like AWS and GCP – hundreds of millions of dollars every year. However, in just the last couple of years, it’s the new AI startups that have emerged as the biggest consumers of cloud computing resources.
For example, training GPT-3, the model behind the original ChatGPT, reportedly cost around $4 million in computing power alone. Its successor, GPT-4, skyrocketed to an estimated $75 million in computing costs. And Google’s recently launched Gemini Ultra is said to have racked up nearly $200 million in computing expenditure.