The Most Open Open Source Generative AI Release

AllenAI just released all the components of its OLMo LLM.

Created Using DALL-E

Next Week in The Sequence:

  • Edge 367: We dive into multi-chain reasoning in LLMs, including the original research paper on the topic published by Allen AI, and explore Gradio as a very effective tool for demoing LLM apps.

  • Edge 368: We review one of my favorite frameworks for autonomous agents: MemGPT!

You can subscribe below!

TheSequence is a reader-supported publication. To receive new posts and support my work, consider becoming a free or paid subscriber.

📝 Editorial: The Most Open Open Source Generative AI Release

Open source innovation is one of the most impressive aspects of the recent generative AI revolution. We constantly hear about new models being open-sourced to rival closed alternatives from companies such as OpenAI or Anthropic. However, you might be surprised to learn that most of these releases are not completely open source. A more intriguing question is what open source actually means in the context of foundation models.

It’s important to understand that, when it comes to foundation models, the source code itself is very small and quite similar from one model to another. When you hear ‘open source’ in the context of foundation models, most of the time it refers to the weights of the model: huge files containing the network’s learned parameters. Understanding a model just from its weights is nearly impossible, so the value of making them open lies mostly in reproducibility. Other aspects of a foundation model, such as the training pipeline, evaluation code, and fine-tuning code (for instance, in instruction-following models), typically remain closed.

So, let’s just say that we have been using the term ‘open source’ a bit lightly in generative AI, to say the least.

Last week, researchers from the Allen Institute for Artificial Intelligence (AllenAI) released all the components of their OLMo LLM in a truly open fashion. You could call this release the most open open-source release in generative AI. The release includes:

  • Datasets used to train OLMo.

  • Full model weights.

  • Evaluation code.

  • Fine-tuning code.

  • Training code and training metrics.

Needless to say, getting OLMo to work and reproduce the results claimed in its technical report is way simpler than with other models. Let’s hope more open source AI proponents follow this practice. Let’s build a real open source generative AI.
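To make the “full model weights” point concrete, here is a minimal sketch of how one might load and sample from an openly released checkpoint with the Hugging Face transformers library. The `allenai/OLMo-7B` identifier and the `trust_remote_code` flag are assumptions based on the public release; check the model card for the exact model name and any extra dependencies.

```python
# Minimal sketch (not from the OLMo repo): "open weights" in practice means a
# checkpoint you can download and run. Identifier and flags are assumptions;
# consult the official model card for current requirements.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "allenai/OLMo-7B"  # assumed Hub identifier from the public release
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

prompt = "Open source in generative AI means"
inputs = tokenizer(prompt, return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=50, do_sample=False)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```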

🔎 ML Research

Fuyu-Heavy

Adept published details about Fuyu-Heavy, the newest version of its multimodal model optimized for autonomous agent scenarios. The model scores close to GPT-4V and Gemini Ultra while being 10-20 times smaller —> Read more.

BootPIG

Salesforce Research published a paper detailing BootPIG, a model architecture and training pipeline for subject-driven image generation. BootPIG extends text-to-image models with additional layers that allow them to accept reference images of the subject at inference time —> Read more.

MobileDiffusion

Google Research published a paper detailing MobileDiffusion, a text-to-image model optimized for mobile devices. The model relies on techniques such as DiffusionGAN and can generate 512×512 images in roughly half a second —> Read more.

Multimodal LLMs

Researchers from Tencent AI Lab and Kyoto University published a paper detailing recent advancements in multimodal LLMs. The paper reviews the architectures, training methods, and recent developments across more than 20 multimodal LLMs —> Read more.

Time Series Forecasting Decoder

Google Research published a paper introducing TimesFM, a foundation model for time series forecasting. TimesFM was pretrained on 100 billion time points and has 200M parameters —> Read more.

🤖 Cool AI Tech Releases

OLMo

Researchers from the Allen Institute for AI open sourced the model weights, training data, training code, and evaluation code for their OLMo LLM —> Read more.

PyTorch 2.2

A new version of PyTorch has been released with several incremental updates —> Read more.

llmware

A new open source framework, along with a set of models, optimized for enterprise AI patterns such as RAG and semantic search —> Read more.

ImageFX and MusicFX

Google released previews of two generative AI tools: ImageFX and MusicFX —> Read more.

🛠 Real World ML

Fast ML at Meta

Meta engineers discuss some of the packaging and distribution practices used to optimize their ML models —> Read more.

📡AI Radar
