Artificial intelligence, like any software, relies on two fundamental components: the AI programs, often referred to as models, and the computational hardware, or chips, that runs them. Until now, the focus in AI development has been on refining the models, while the hardware has typically been treated as a standard component supplied by third parties. Recently, however, this approach has started to change, with major AI firms such as Google, Meta, and Amazon developing their own AI chips. This in-house development of custom AI chips heralds a new era in AI advancement. This article explores the reasons behind the shift and highlights the latest developments in this evolving area.
Why In-house AI Chip Development?
The shift toward in-house development of custom AI chips is being driven by several critical factors, which include:
Increasing Demand for AI Chips
Creating and using AI models demands significant computational resources to handle large volumes of data and generate precise predictions or insights. Traditional general-purpose chips cannot keep up with the computational demands of training on trillions of data points. This limitation has driven the creation of cutting-edge AI chips specifically designed to meet the high performance and efficiency requirements of modern AI applications. As AI research and development continue to grow, so does the demand for these specialized chips.
Nvidia, the leader in advanced AI chips and well ahead of its competitors, faces demand that greatly exceeds its manufacturing capacity. Waitlists for its AI chips have stretched to several months, and the delay keeps growing as demand surges. Moreover, the chip market, which includes major players like Nvidia and Intel, faces production bottlenecks stemming from dependence on the Taiwanese manufacturer TSMC for chip fabrication. This reliance on a single manufacturer prolongs lead times for these advanced chips.
Making AI Computing Energy-efficient and Sustainable
The current generation of AI chips, designed for heavy computational tasks, consumes a lot of power and generates significant heat, giving the training and use of AI models substantial environmental implications. OpenAI researchers note that since 2012, the computing power required to train advanced AI models has doubled every 3.4 months, and some projections suggest that by 2040, emissions from the Information and Communications Technology (ICT) sector could account for 14% of global emissions. Another study showed that training a single large-scale language model can emit up to 284,000 kg of CO2, roughly equivalent to the lifetime emissions of five cars. Moreover, the energy consumption of data centers is estimated to grow 28 percent by 2030. These findings emphasize the necessity of striking a balance between AI development and environmental responsibility. In response, many AI companies are now investing in the development of more energy-efficient chips, aiming to make AI training and operations more sustainable and environmentally friendly.
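To make that doubling rate concrete, the short Python sketch below compounds it over a few horizons. The 3.4-month doubling interval comes from the OpenAI observation above; the rest is simple arithmetic, not a forecast.

```python
# Back-of-the-envelope growth implied by compute doubling every 3.4 months.
MONTHS_PER_DOUBLING = 3.4

def compute_growth(months: float) -> float:
    """Multiplicative growth in training compute after `months` months."""
    return 2 ** (months / MONTHS_PER_DOUBLING)

print(f"1 year:  {compute_growth(12):,.1f}x")   # ~11.5x
print(f"2 years: {compute_growth(24):,.1f}x")   # ~133x
print(f"5 years: {compute_growth(60):,.0f}x")   # ~205,000x
```

Growth of this shape is why hardware efficiency, not just model design, has become a first-order concern.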
Tailoring Chips for Specialized Tasks
Different AI processes have varying computational demands. For instance, training deep learning models requires significant computational power and high throughput to handle large datasets and execute complex calculations quickly. Chips designed for training are optimized to enhance these operations, improving speed and efficiency. On the other hand, the inference process, where a model applies its learned knowledge to make predictions, requires fast processing with minimal energy use, especially on edge devices like smartphones and IoT devices. Chips for inference are engineered to optimize performance per watt, ensuring prompt responsiveness and battery conservation. Tailoring chip designs separately for training and inference allows each chip to be precisely adjusted for its intended role, enhancing performance across different devices and applications. This kind of specialization not only supports more robust AI functionalities but also promotes greater energy efficiency and cost-effectiveness more broadly.
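As a rough illustration of this training-versus-inference split, the following PyTorch sketch trains a small network in 32-bit floating point and then applies dynamic int8 quantization for a lower-power inference deployment. The model and dimensions are arbitrary placeholders, not any vendor's actual workload.

```python
import torch
import torch.nn as nn

# A small placeholder network; real workloads are far larger.
model = nn.Sequential(nn.Linear(128, 64), nn.ReLU(), nn.Linear(64, 10))

# Training: high-throughput fp32 math, the workload training chips optimize for.
optimizer = torch.optim.Adam(model.parameters())
x, y = torch.randn(32, 128), torch.randint(0, 10, (32,))
loss = nn.functional.cross_entropy(model(x), y)
loss.backward()
optimizer.step()

# Inference: dynamic int8 quantization shrinks the model and cuts the energy
# spent per prediction, the performance-per-watt metric inference chips target.
quantized = torch.ao.quantization.quantize_dynamic(
    model, {nn.Linear}, dtype=torch.qint8
)
with torch.no_grad():
    prediction = quantized(torch.randn(1, 128)).argmax(dim=1)
```

Quantization is only one of the techniques (alongside pruning, distillation, and ahead-of-time compilation) that inference-oriented chips are built to exploit.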
Reducing Financial Burdens
The financial burden of computing for AI model training and operations remains substantial. OpenAI, for instance, has relied on a massive supercomputer built by Microsoft for both training and inference since 2020. Training GPT-3 cost OpenAI about $12 million, and the expense surged to $100 million for GPT-4. According to a report by SemiAnalysis, OpenAI needs roughly 3,617 HGX A100 servers, totaling 28,936 GPUs, to support ChatGPT, putting the average cost per query at approximately 0.36 cents. With these high costs in mind, Sam Altman, CEO of OpenAI, is reportedly seeking significant investments to build a worldwide network of AI chip production facilities, according to a Bloomberg report.
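The GPU count in that estimate is straightforward arithmetic, and the per-query figure follows from dividing daily serving cost by daily query volume. The sketch below reproduces the calculation; the daily cost and query volume used here are illustrative assumptions, not figures taken from the SemiAnalysis report.

```python
# GPU count: each HGX A100 server carries 8 GPUs.
SERVERS = 3_617
GPUS_PER_SERVER = 8
print(SERVERS * GPUS_PER_SERVER)  # 28,936 GPUs, matching the figure above

# Per-query cost = daily serving cost / daily query volume.
# Both inputs below are assumed values for illustration only.
daily_cost_usd = 700_000
daily_queries = 195_000_000
cost_per_query = daily_cost_usd / daily_queries
print(f"~${cost_per_query:.4f} per query")  # ~$0.0036, i.e. ~0.36 cents
```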
Harnessing Control and Innovation
Third-party AI chips often come with limitations. Companies relying on these chips may find themselves constrained by off-the-shelf solutions that don’t fully align with their unique AI models or applications. In-house chip development allows for customization tailored to specific use cases. Whether it’s for autonomous cars or mobile devices, controlling the hardware enables companies to fully leverage their AI algorithms. Customized chips can enhance specific tasks, reduce latency, and improve overall performance.
Latest Advances in AI Chip Development
This section delves into the latest strides made by Google, Meta, and Amazon in building AI chip technology.
Google’s Axion Processors
Google has been steadily progressing in the field of AI chip technology since the introduction of the Tensor Processing Unit (TPU) in 2015. Building on this foundation, Google recently launched the Axion Processors, its first custom CPUs designed for data centers and AI workloads. These processors are based on the Arm architecture, known for its power efficiency and compact design, and aim to enhance the efficiency of CPU-based AI training and inference while maintaining low energy consumption. The chips also deliver a significant performance improvement for general-purpose workloads, including web and app servers, containerized microservices, open-source databases, in-memory caches, data analytics engines, media processing, and more.
Meta’s MTIA
Meta is pushing forward in AI chip technology with its Meta Training and Inference Accelerator (MTIA). The accelerator is designed to boost the efficiency of training and inference processes, especially for ranking and recommendation algorithms. Recently, Meta outlined how the MTIA is a key part of its strategy to strengthen its AI infrastructure beyond GPUs. Although the chip was initially slated to launch in 2025, Meta has already put both versions of the MTIA into production, reflecting a quicker pace in its chip development plans. While the MTIA currently focuses on training certain types of algorithms, Meta aims to expand its use to include training for generative AI, such as its Llama language models.
Amazon’s Trainium and Inferentia
Since introducing its custom Nitro chip in 2013, Amazon has significantly expanded its AI chip development. The company recently unveiled two AI chips, Trainium and Inferentia. Trainium is specifically designed to accelerate AI model training and is set to be incorporated into EC2 UltraClusters. These clusters, capable of hosting up to 100,000 chips, are optimized for training foundation models and large language models in an energy-efficient way. Inferentia, on the other hand, is tailored for inference tasks, where trained models are actively applied; it focuses on reducing latency and cost during inference to better serve the millions of users interacting with AI-powered services.
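For developers, targeting these chips typically means compiling an existing model through the AWS Neuron SDK rather than redesigning it. The minimal sketch below shows the general shape of that workflow using torch_neuronx; the model is a stand-in, and it assumes the code runs on a Neuron-equipped instance such as Inf2 or Trn1.

```python
import torch
import torch_neuronx  # AWS Neuron SDK for PyTorch; available on Inf2/Trn1 instances

# A placeholder model; any traceable PyTorch module follows the same path.
model = torch.nn.Sequential(torch.nn.Linear(128, 10)).eval()
example_input = torch.randn(1, 128)

# Ahead-of-time compilation for the NeuronCore accelerators.
neuron_model = torch_neuronx.trace(model, example_input)

# Inference then executes on the accelerator instead of the host CPU.
with torch.no_grad():
    output = neuron_model(example_input)
```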
The Bottom Line
The movement towards in-house development of custom AI chips by major companies like Google, Meta, and Amazon reflects a strategic shift to address the increasing computational needs of AI technologies. This trend highlights the necessity for solutions specifically tailored to the unique demands of AI models. Even as surging demand lifts the market valuations of established suppliers like Nvidia, these in-house efforts underline the vital role that custom chips play in advancing AI innovation. By creating their own chips, these tech giants are enhancing the performance and efficiency of their AI systems while promoting a more sustainable and cost-effective future. This evolution is setting new standards in the industry, driving technological progress and competitive advantage in a rapidly changing global market.