Black Forest Labs

The startup powering image generation for xAI’s Grok.

Image Credit: Black Forest Labs

Next Week in The Sequence:

  • Edge 425: Our series about SSMs dives into Mamba, the best-known SSM. We review the original Mamba paper by researchers at Carnegie Mellon University and Princeton and explore the Griptape framework for building LLM apps.

  • Edge 426: We discuss Gemma Scope and ShieldGemma, two new tools for interpretability and guardrailing released by Google DeepMind.

You can subscribe to The Sequence below:

TheSequence is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

📝 Editorial: Black Forest Labs

One of the ideas I like about The Sequence is that it helps bring awareness to AI labs that may not have the media profile or the billions in fundraising of the big AI incumbents but are truly doing unique work in AI research. Today, I’d like to talk about a small startup called Black Forest Labs, which is loaded with world-class AI talent. Even though you might not have heard of Black Forest Labs, there’s a chance you’ve interacted with their work.

Have you used the new image generation features of xAI’s Grok in X? If so, then you’ve been using Black Forest Labs’ models. Grok-2’s image generation capabilities are powered by a model called FLUX.1, created by Black Forest Labs. Who are these guys? Well, what if I told you that they were part of the team behind the famous Stable Diffusion model and also contributed to research breakthroughs like VQGAN and Latent Diffusion?

Black Forest Labs’ flagship model is FLUX.1, which comes in three variants:

  • FLUX.1 [schnell]: The fastest model, mostly used for local development and personal use.

  • FLUX.1 [dev]: An open-weight model for non-commercial usage.

  • FLUX.1 [pro]: The largest, state-of-the-art image generation model available via APIs.

The company recently raised $31 million from marquee firms like Andreessen Horowitz and General Catalyst, with participation from renowned angel investors such as Michael Ovitz and Garry Tan. Given its research talent, top-tier backers, and partnership with xAI, Black Forest Labs is one of the new startups likely to make some noise in the near future. For now, Grok-2 images are incredibly entertaining.

🔎 ML Research

Phi 3.5

Microsoft published the technical report on the Phi-3.5 family of small language models. The new release includes Phi-3.5-MoE as well as new versions of Phi-3.5-mini and Phi-3.5-vision —> Read more.

FermiNet

Google DeepMind published a paper discussing FermiNet, a neural network architecture that can solve fundamental equations of quantum mechanics. DeepMind describes FermiNet as the first demonstration of deep learning for computing the energy of atoms and molecules from first principles with useful accuracy —> Read more.

DeepSeek-Prover-V1.5

DeepSeek-AI published a paper unveiling DeepSeek-Prover-V1.5, an LLM optimized for theorem proving. The model uses DeepSeekMath-Base as a baseline and fine-tunes it for theorem proving and proof generation using reinforcement learning —> Read more.

xGen-MM (BLIP-3)

Salesforce Research published a paper introducing xGen-MM, also known as BLIP-3, a framework for developing multimodal LLMs. The model showcases strong in-context learning capabilities and includes versions fine-tuned for instructions and safety —> Read more.

Hermes 3

Nous Research published the technical report behind its Hermes 3 family of models, which specialize in reasoning and creative capabilities. Hermes 3 scales up to 405B parameters and leverages a 128k context window —> Read more.

Speculative RAG

Google Research published a paper detailing Speculative RAG, a technique that tries to address the effectiveness vs. efficiency dilemma in RAG solutions. The method uses an LLM fine-tuned for RAG to complement a generalist LLM in RAG workflows —> Read more.

🤖 AI Tech Releases

Jamba 1.5

AI21 released Jamba 1.5, a hybrid SSM-Transformer model built for long-context handling —> Read more.

NVIDIA Llama-3.1 Minitron

NVIDIA open-sourced Minitron, a set of 4B and 8B distilled versions of Llama 3.1 —> Read more.

🛠 Real World AI

Google AI Edge’s MediaPipe

Google provided a deep dive into the techniques for serving 7B-parameter models in the browser —> Read more.

AI Infrastructure Videos

The videos from the @Scale AI infrastructure conference are now online —> Read more.

📡AI Radar
