Gemini and Mistral MoE: Both Impactful Although Very Different Releases

Created Using DALL-E

Next Week in The Sequence:

  • Edge 351: Presents a detailed summary of our series about fine-tuning in foundation models.

  • Edge 352: Will dive into LinkedIn’s embedding architecture that powers its semantic search capabilities.

You can subscribe below:

TheSequence is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

📝 Editorial: Gemini and Mistral MoE: Both Impactful Although Very Different Releases

This week’s generative AI news was dominated by Google’s announcement of Gemini, its new multimodal model. Unfortunately, much of the attention was diverted to a controversy surrounding their promotional video, which was apparently heavily edited. This is regrettable because Gemini appears to be a very impressive model. According to the technical report, Gemini can process interleaved input sequences of text, images, audio, code, and video, making it quite unique. The model was designed with multimodality in mind from the outset. Ultra, the top model in the Gemini family, seems to push the boundaries of reasoning tasks, which could be one of the next frontiers in generative AI.

While Google’s Gemini release made a big splash, Mistral quietly dropped a torrent link to a version of its model based on a Mixture of Experts (MoE) architecture. Specifically, Mistral 8x7B (87 GB in size) combines eight 7B expert models, yet it is considerably smaller than eight separate copies of the original Mistral 7B would be (roughly 120 GB combined). This reduction in size might be due to the experts sharing the attention layers. For each token, Mistral 8x7B routes inference through only two of the eight experts. Its release has been called a “scaled-down GPT-4,” given the alleged similarities with the GPT-4 architecture.
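The routing idea behind an MoE layer is simple to sketch. The snippet below is a toy illustration, not the actual Mistral 8x7B implementation: the "experts" are stand-in linear maps, and the sizes and gating weights are made up for demonstration. What it shows is the key property the paragraph describes: only two of the eight experts run per token.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy mixture-of-experts layer: 8 experts, route each token to the top 2.
# Sizes and weights are illustrative, not the real architecture.
NUM_EXPERTS, TOP_K, D_MODEL = 8, 2, 16

# Each "expert" here is a random linear map standing in for a 7B feed-forward block.
experts = [rng.standard_normal((D_MODEL, D_MODEL)) for _ in range(NUM_EXPERTS)]
router = rng.standard_normal((D_MODEL, NUM_EXPERTS))  # gating network weights

def moe_forward(token: np.ndarray) -> np.ndarray:
    logits = token @ router                       # score every expert
    top = np.argsort(logits)[-TOP_K:]             # pick the 2 highest-scoring experts
    weights = np.exp(logits[top])
    weights /= weights.sum()                      # softmax over the chosen 2
    # Only 2 of the 8 experts execute for this token, which is why
    # per-token inference cost stays close to a single dense model.
    return sum(w * (token @ experts[i]) for w, i in zip(weights, top))

out = moe_forward(rng.standard_normal(D_MODEL))
print(out.shape)  # (16,)
```

Because the gate is computed per token, different tokens in the same sequence can exercise different experts, while total parameters stay at roughly eight experts' worth.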

The releases of both Gemini and Mistral 8x7B mark relevant milestones in the evolution of foundation models, but they also highlight the contrast between the open-source and closed-source ethos in this space. One is more commercial and polished, the other more scrappy and hacker-ish. As George Hotz succinctly put it: ‘Google released a press release and a fake demo. Mistral released a torrent.’


Vector Transformation Made Easy and Fast

Have you struggled with getting your text documents converted into high-quality vector embeddings? Now you can do it right from the Zilliz Cloud vector database platform. Eliminate the headache of creating vectors for your AI-powered search application with Zilliz Cloud Pipelines. Learn more ->


🔎 ML Research

Starling-7B

Researchers from UC Berkeley published a paper discussing Starling-7B, an open source LLM fine-tuned using reinforcement learning with AI feedback (RLAIF). The paper also details Nectar, a dataset to benchmark RLAIF capabilities —> Read more.

Audiobox

Meta AI published details around Audiobox, a new model for text-to-audio generation. The model builds on the previous research around Voicebox to unify audio generation and editing capabilities in a single model —> Read more.

Ego-Exo4D

Meta AI published a paper detailing Ego-Exo4D, a dataset for video learning. The dataset includes over 1400 hours of video and the corresponding annotations —> Read more.

LLMLingua

Microsoft Research published a paper detailing LLMLingua, a prompt compression technique. The method removes unimportant tokens from prompts in order to accelerate inference —> Read more.
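The general shape of this kind of prompt compression can be sketched in a few lines. To be clear, the snippet below is a hypothetical stand-in, not the actual LLMLingua method: the real technique scores tokens with a small language model's perplexity, while the toy `toy_score` function here just uses word length as a crude importance proxy.

```python
# Sketch of importance-based prompt compression: score each token, keep the
# top fraction, and preserve the original order. The scoring function is a
# hypothetical stand-in; LLMLingua itself uses a small LM's token perplexity.
def compress(tokens: list[str], score, keep_ratio: float = 0.5) -> list[str]:
    ranked = sorted(range(len(tokens)), key=lambda i: score(tokens[i]), reverse=True)
    keep = set(ranked[: max(1, int(len(tokens) * keep_ratio))])
    return [t for i, t in enumerate(tokens) if i in keep]  # preserve order

# Toy importance score: longer words are treated as more informative.
toy_score = len

prompt = "please kindly summarize the quarterly revenue figures".split()
print(compress(prompt, toy_score, keep_ratio=0.5))
# ['summarize', 'quarterly', 'revenue']
```

The compressed prompt is shorter, so the downstream LLM processes fewer tokens per request, which is where the inference speedup comes from.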

Elo

Researchers from Cohere published a paper discussing Elo, a scoring method for LLM evaluation. The method draws inspiration from the player-rating technique used in competitive games such as chess —> Read more.
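The Elo update rule itself is compact. Below is a minimal sketch of the standard chess-style formula applied to pairwise model comparisons; the K-factor and starting ratings are illustrative choices, not values taken from the Cohere paper.

```python
# Standard Elo update applied to a head-to-head LLM comparison.
# K-factor of 32 and starting ratings of 1000 are illustrative choices.
K = 32

def expected(r_a: float, r_b: float) -> float:
    """Probability that model A beats model B under the Elo model."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400))

def update(r_a: float, r_b: float, a_wins: bool) -> tuple[float, float]:
    """Return both ratings after one pairwise judgment."""
    e_a = expected(r_a, r_b)
    s_a = 1.0 if a_wins else 0.0
    return r_a + K * (s_a - e_a), r_b + K * ((1 - s_a) - (1 - e_a))

# Two models start even; model A wins one pairwise comparison.
ra, rb = update(1000.0, 1000.0, a_wins=True)
print(round(ra), round(rb))  # 1016 984
```

One appealing property for LLM leaderboards is that ratings update incrementally as new pairwise judgments arrive, so the ranking can absorb fresh comparisons without re-scoring every past matchup.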

🤖 Cool AI Tech Releases

Gemini

Google introduced Gemini, its highly anticipated GPT-4 competitor —> Read more.

Mistral MoE

Mistral released (via torrent) a new model consisting of a mixture of experts built from eight 7B models —> Read more.

TensorRT-LLM

NVIDIA announced the latest enhancements to its TensorRT-LLM acceleration library —> Read more.

🛠 Real World ML

Evolving GitHub Copilot

GitHub discusses the LLM experiments behind the evolution of its Copilot platform —> Read more.

📡 AI Radar
