A New Compute Platform for Generative AI?

Is generative AI big enough to spark the creation of a new compute platform?

Next Week in The Sequence:

Edge 361: Our current series about LLM reasoning explores the tree-of-thought method including its original paper. We also dive into LangChain’s LangSmith tool for LLM debugging and evaluation.
Edge 362: We review one of my favorite papers of last year. DeepMind’s FunSearch is a method that was able to discover new math and computer science algorithms.

📝 Editorial: Would Generative AI Require New Hardware Platforms?

One of the best-known secrets in tech investing is that any sufficiently large tech trend can create a new computing platform. The advancements in microprocessor design in the late ’70s sparked the creation of personal computing. The evolution of the internet enabled the creation of the web browser in the ’90s. Similarly, advancements in mobile computing led to the smartphone revolution a few years ago. You can also make the case that devices like Alexa or Google Home have become new compute platforms on a smaller scale.

Will generative AI trigger the creation of a new compute platform?

The answer is far from trivial and depends on the balance between the revolutionary impact of generative AI and the footprint of existing computing platforms. Because the technology market has grown at an exponential clip, each new generation of compute platforms must work harder to disrupt incumbents with a well-established footprint. In the case of generative AI, a new compute platform would need to capture a significant percentage of use cases that won’t take place on existing platforms such as web browsers, smartphones, or home devices.

Over the last few months, we have seen the inception of efforts such as OpenAI working with famous designer Jony Ive to design a new device for generative AI. Also, initial efforts such as the Humane Pin are showcasing new interaction paradigms with generative AI.

One of the most interesting announcements at last week’s CES came from Rabbit, which unveiled a new generative AI device called R1. The device has a simple design that includes a 2.88-inch touchscreen, a rotating camera for taking photos and videos, and a scroll wheel. However, the most intriguing feature of R1 is that it relies on a new foundation model based on the Large Action Model (LAM) paradigm. LAMs are language models optimized for performing actions on external systems. Rabbit seems to have developed an entire tech stack, dubbed Rabbit OS, for the development of LAMs.

The release of R1 is one of the most complete examples of how a new platform for generative AI could look. A new compute platform for generative AI is a seductive idea but also a massive undertaking.

🔎 ML Research

Fair LLM Serving

A group of stellar researchers from UC Berkeley, Stanford University and Duke University published a paper proposing a technique for LLM serving fairness. Specifically, the algorithm, called Virtual Token Counter (VTC), uses a cost function based on the number of input and output tokens —> Read more.
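
To make the idea concrete, here is a minimal Python sketch of token-based fair scheduling. It is not the paper's implementation: the class name, the method names, and the input/output weights are all hypothetical, and the sketch leaves out details a real serving system would need, such as batching and handling clients that return after being idle. It only illustrates the core idea of charging each client a cost based on input and output tokens and serving whichever waiting client has been charged the least so far.

```python
from collections import deque


class VirtualTokenCounterSketch:
    """Simplified fair scheduler: serve the waiting client with the lowest token cost so far."""

    def __init__(self, input_weight=1.0, output_weight=2.0):
        # Weights are illustrative assumptions, not values taken from the paper.
        self.input_weight = input_weight
        self.output_weight = output_weight
        self.counters = {}  # client id -> accumulated weighted token cost
        self.queues = {}    # client id -> deque of pending request ids

    def submit(self, client, request_id):
        # Register the client on first use and enqueue its request.
        self.counters.setdefault(client, 0.0)
        self.queues.setdefault(client, deque()).append(request_id)

    def next_request(self):
        # Only clients with pending requests compete for the next slot.
        waiting = [c for c, q in self.queues.items() if q]
        if not waiting:
            return None
        client = min(waiting, key=lambda c: self.counters[c])
        return client, self.queues[client].popleft()

    def record_usage(self, client, input_tokens, output_tokens):
        # Cost function: weighted sum of input and output tokens.
        self.counters[client] += (
            self.input_weight * input_tokens + self.output_weight * output_tokens
        )


# Usage: a heavy client queues three requests, a light client queues one.
sched = VirtualTokenCounterSketch()
for i in range(3):
    sched.submit("heavy", f"h{i}")
sched.submit("light", "l0")

order = []
while (item := sched.next_request()) is not None:
    client, request_id = item
    order.append(request_id)
    sched.record_usage(client, input_tokens=100, output_tokens=50)
```

In this toy run, the light client's single request is served right after the heavy client's first one, instead of waiting behind the whole backlog.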

TinyLlama

Researchers from the StatNLP Research Group at the Singapore University of Technology and Design and others published a paper unveiling TinyLlama, a 1.1B-parameter LLM pretrained on one trillion tokens. TinyLlama shows the potential of small LLMs by performing incredibly well across different tasks —> Read more.

Diffusion DPO

Salesforce Research published a paper detailing Diffusion DPO to streamline the adoption of human feedback in text-to-image models. Diffusion DPO brings the efficient Direct Preference Optimization (DPO) training method to text-to-image models —> Read more.

CosMo

Researchers from Microsoft and the National University of Singapore published a paper detailing CosMo, a new pretraining method for vision-language models. The method works efficiently for both image and video models —> Read more.

Responsible AI

Microsoft Research published a series of papers outlining their latest work in responsible AI. The collection includes areas such as privacy, testing, human feedback, transparency and several others —> Read more.