Gemini and Mistral MoE: Both Impactful Although Very Different Releases

Created Using DALL-E

Next Week in The Sequence:

  • Edge 351: Presents a detailed summary of our series about fine-tuning in foundation models.

  • Edge 352: Will dive into LinkedIn’s embedding architecture that powers its semantic search capabilities.

You can subscribe below:

TheSequence is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

📝 Editorial: Gemini and Mistral MoE: Both Impactful Although Very Different Releases

This week’s generative AI news was dominated by Google’s announcement of Gemini, its new multimodal model. Unfortunately, much of the attention was diverted to a controversy surrounding their promotional video, which was apparently heavily edited. This is regrettable because Gemini appears to be a very impressive model. According to the technical report, Gemini can process interleaved input sequences of text, images, audio, code, and video, making it quite unique. The model was designed with multimodality in mind from the outset. Ultra, the top model in the Gemini family, seems to push the boundaries of reasoning tasks, which could be one of the next frontiers in generative AI.

While Google’s Gemini release made a big splash, Mistral quietly dropped a torrent link to a version of its model based on a Mixture of Experts (MoE) architecture. Specifically, Mistral 8x7B (87 GB in size) combines eight 7B expert models, yet it is considerably smaller than eight separate copies of the original Mistral 7B would be (roughly 120 GB combined). This reduction in size might be due to the experts sharing the attention layers. For each token, Mistral 8x7B routes inference through only two of the eight experts. Its release has been called a “scaled-down GPT-4,” given the alleged similarities with the GPT-4 architecture.
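The routing idea behind an MoE layer is simple to sketch. The snippet below is a toy illustration, not the actual Mistral 8x7B implementation: the "experts" are stand-in linear maps, and the sizes and gating weights are made up for demonstration. What it shows is the key property the paragraph describes: only two of the eight experts run per token.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy mixture-of-experts layer: 8 experts, route each token to the top 2.
# Sizes and weights are illustrative, not the real architecture.
NUM_EXPERTS, TOP_K, D_MODEL = 8, 2, 16

# Each "expert" here is a random linear map standing in for a 7B feed-forward block.
experts = [rng.standard_normal((D_MODEL, D_MODEL)) for _ in range(NUM_EXPERTS)]
router = rng.standard_normal((D_MODEL, NUM_EXPERTS))  # gating network weights

def moe_forward(token: np.ndarray) -> np.ndarray:
    logits = token @ router                       # score every expert
    top = np.argsort(logits)[-TOP_K:]             # pick the 2 highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                      # softmax over the chosen 2
    # Only 2 of the 8 experts execute for this token, which is why
    # per-token inference cost stays close to a single dense model.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, top))

out = moe_forward(rng.standard_normal(D_MODEL))
print(out.shape)  # (16,)
```

Because the gate is computed per token, different tokens in the same sequence can exercise different experts, while total parameters stay at roughly eight experts' worth.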

The releases of both Gemini and Mistral 8x7B mark relevant milestones in the evolution of foundation models, but they also highlight the contrast between the open-source and closed-source ethos in this space. One is more commercial and polished, the other more scrappy and hacker-ish. As George Hotz succinctly put it: ‘Google released a press release and a fake demo. Mistral released a torrent.’


Vector Transformation Made Easy and Fast

Have you struggled with getting your text documents converted into high-quality vector embeddings? Now you can do it right from the Zilliz Cloud vector database platform. Eliminate the headache of creating vectors for your AI-powered search application with Zilliz Cloud Pipelines. Learn more ->


🔎 ML Research

Starling-7B

Researchers from UC Berkeley published a paper discussing Starling-7B, an open source LLM fine-tuned using reinforcement learning with AI feedback (RLAIF). The paper also details Nectar, a dataset to benchmark RLAIF capabilities —> Read more.

Audiobox

Meta AI published details around Audiobox, a new model for text-to-audio generation. The model builds on the previous research around Voicebox to unify audio generation and editing capabilities in a single model —> Read more.

Ego-Exo4D

Meta AI published a paper detailing Ego-Exo4D, a dataset for video learning. The dataset includes over 1400 hours of video and the corresponding annotations —> Read more.

LLMLingua

Microsoft Research published a paper detailing LLMLingua, a prompt compression technique. The method removes unimportant tokens from prompts in order to accelerate inference —> Read more.
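The general shape of this kind of prompt compression can be sketched in a few lines. To be clear, the snippet below is a hypothetical stand-in, not the actual LLMLingua method: the real technique scores tokens with a small language model's perplexity, while the toy `toy_score` function here just uses word length as a crude importance proxy.

```python
# Sketch of importance-based prompt compression: score each token, keep the
# top fraction, and preserve the original order. The scoring function is a
# hypothetical stand-in; LLMLingua itself uses a small LM's token perplexity.
def compress(tokens: list[str], score, keep_ratio: float = 0.5) -> list[str]:
    ranked = sorted(range(len(tokens)), key=lambda i: score(tokens[i]), reverse=True)
    keep = set(ranked[: max(1, int(len(tokens) * keep_ratio))])
    return [t for i, t in enumerate(tokens) if i in keep]  # preserve order

# Toy importance score: longer words are treated as more informative.
toy_score = len

prompt = "please kindly summarize the quarterly revenue figures".split()
print(compress(prompt, toy_score, keep_ratio=0.5))
# ['summarize', 'quarterly', 'revenue']
```

The compressed prompt is shorter, so the downstream LLM processes fewer tokens per request, which is where the inference speedup comes from.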

Elo

Researchers from Cohere published a paper discussing Elo, a scoring method for LLM evaluation. The method draws inspiration from the player-rating technique used in competitive games such as chess —> Read more.
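The Elo update rule itself is compact. Below is a minimal sketch of the standard chess-style formula applied to pairwise model comparisons; the K-factor and starting ratings are illustrative choices, not values taken from the Cohere paper.

```python
# Standard Elo update applied to a head-to-head LLM comparison.
# K-factor of 32 and starting ratings of 1000 are illustrative choices.
K = 32

def expected(r_a: float, r_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(r_a: float, r_b: float, a_wins: bool) -> tuple[float, float]:
    """Return both ratings after one pairwise judgment."""
    e_a = expected(r_a, r_b)
    s_a = 1.0 if a_wins else 0.0
    return r_a + K * (s_a - e_a), r_b + K * ((1 - s_a) - (1 - e_a))

# Two models start even; model A wins one pairwise comparison.
ra, rb = update(1000.0, 1000.0, a_wins=True)
print(round(ra), round(rb))  # 1016 984
```

One appealing property for LLM leaderboards is that ratings update incrementally as new pairwise judgments arrive, so the ranking can absorb fresh comparisons without re-scoring every past matchup.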

🤖 Cool AI Tech Releases

Gemini

Google introduced Gemini, its highly anticipated GPT-4 competitor —> Read more.

Mistral MoE

Mistral released (via torrent) a new model consisting of a mixture of experts built from eight 7B models —> Read more.

TensorRT-LLM

NVIDIA announced the latest enhancements to its TensorRT-LLM acceleration library —> Read more.

🛠 Real World ML

Evolving GitHub Copilot

GitHub discusses the LLM experiments behind the evolution of its Copilot platform —> Read more.

📡 AI Radar
