The new method represents an important evolution of reasoning for SLMs.
Welcome to our five-hundredth edition! What a ride it has been, and this year is already looking like it's going to be our best with our expanded content coverage. I regularly hear that The Sequence is in a category of its own when it comes to AI deep tech coverage. Thanks a lot for your support.
The battle between SLMs and big LLMs is one of the most interesting trends in generative AI. We are always fascinated by claims of smaller models beating larger competitors on different benchmarks. Recently, this has become even trendier as areas such as reasoning gain relevance. For a while, reasoning was considered a byproduct of the scaling laws, but now we are seeing emerging SLMs able to reason across different domains. One of the most impressive examples came a few days ago when Microsoft published a paper outlining rStar-Math, a method showing that SLMs can match or even outperform models like OpenAI's o1 on math reasoning without any distillation.
rStar-Math is a novel approach that significantly boosts the mathematical reasoning capabilities of small language models (SLMs). This innovative system enables SLMs to achieve performance levels comparable to, and even exceeding, OpenAI’s o1, despite a significantly smaller model size. This is accomplished through a self-evolved System 2 deep thinking process that leverages Monte Carlo Tree Search (MCTS) guided by a carefully crafted Process Preference Model (PPM).
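To make the idea of MCTS guided by a process-level reward more concrete, here is a minimal sketch of step-level search where candidate reasoning steps are scored by a preference model. This is not the paper's implementation: the `propose_steps` and `ppm_score` functions below are hypothetical stand-ins for the policy SLM and the Process Preference Model, and the search loop is a generic UCB-based MCTS.

```python
# Minimal sketch (assumed, not rStar-Math's actual code): step-level MCTS
# where a process preference model (PPM) scores partial reasoning traces.
import math
import random

def propose_steps(partial_solution, n=4):
    # Stand-in for the policy SLM: sample n candidate next reasoning steps.
    return [f"{partial_solution} -> step_{random.randint(0, 999)}" for _ in range(n)]

def ppm_score(partial_solution):
    # Stand-in for the process preference model: score the partial trace
    # (higher is better). A real PPM would be a trained reward model.
    return random.random()

def select_ucb(candidates, visits, values, c=1.4):
    # Standard UCB1 selection over candidate steps.
    total = sum(visits[ch] for ch in candidates) + 1
    def ucb(ch):
        if visits[ch] == 0:
            return float("inf")
        return values[ch] / visits[ch] + c * math.sqrt(math.log(total) / visits[ch])
    return max(candidates, key=ucb)

def mcts_reasoning(question, depth=3, rollouts=16):
    visits, values, children = {}, {}, {}
    root = question
    for _ in range(rollouts):
        node, path = root, [root]
        # Selection / expansion down to a fixed depth of reasoning steps.
        for _ in range(depth):
            if node not in children:
                children[node] = propose_steps(node)
                for ch in children[node]:
                    visits[ch], values[ch] = 0, 0.0
            node = select_ucb(children[node], visits, values)
            path.append(node)
        # Evaluation: score the partial trace with the PPM instead of a rollout.
        reward = ppm_score(node)
        # Backpropagation along the visited path.
        for n in path[1:]:
            visits[n] += 1
            values[n] += reward
    # Return the most-visited first step as the preferred continuation.
    return max(children[root], key=lambda ch: visits[ch])

print(mcts_reasoning("Solve: 3x + 5 = 20"))
```

The design point this illustrates is that the "deep thinking" happens at the level of individual reasoning steps rather than whole answers: the search expands alternative next steps and lets the preference model steer which branches get explored further.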