Yi has achieved remarkable performance across language and image tasks.
The Chinese ecosystem around foundation models has been on fire recently, with releases from Alibaba, DeepSeek, Smaug, and several others. One of the most ambitious foundation model efforts in China comes from 01, the startup founded by former Microsoft and Google researcher Kai-Fu Lee. 01's first iteration came in the form of the Yi models, a series of multimodal models optimized for both English and Chinese datasets. A few days ago, 01 published a technical report about the Yi models, and we thought it would be interesting to share some details.
The Yi series models stand out for their bilingual capabilities. They are trained on a massive 3 trillion-token multilingual corpus, positioning them among the top-performing large language models globally. Yi made headlines when the Yi-34B-Chat variant clinched the second spot, right after GPT-4 Turbo, surpassing competitors like GPT-4, Mixtral, and Claude on the AlpacaEval Leaderboard (as of January 2024). Furthermore, the Yi-34B model ranked highest among open-source models, outperforming Falcon-180B, Llama-2 70B, and even Claude in both English and Chinese across benchmarks such as the Hugging Face Open LLM Leaderboard and C-Eval (as of November 2023).
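Since the Yi weights are openly available on the Hugging Face Hub, trying the chat variant takes only a few lines. Below is a minimal sketch using the transformers library; the model ID "01-ai/Yi-34B-Chat" is the published hub name, while the prompt and generation settings are illustrative assumptions rather than 01's recommended configuration.

```python
# Minimal sketch: querying the open-source Yi-34B-Chat checkpoint with
# Hugging Face transformers. Requires transformers, torch, and accelerate.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "01-ai/Yi-34B-Chat"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",   # spread layers across available GPUs
    torch_dtype="auto",  # use the dtype stored in the checkpoint
)

# Build the chat prompt with the template shipped in the tokenizer config.
messages = [{"role": "user", "content": "Summarize the Yi models in one sentence."}]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output_ids = model.generate(input_ids, max_new_tokens=128)
# Decode only the newly generated tokens, skipping the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```

Note that a 34B-parameter model needs roughly 70 GB of GPU memory in 16-bit precision, so quantized variants are the practical route on a single consumer GPU.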