AI’s Trillion-Dollar Problem

As we enter 2025, the artificial intelligence sector stands at a crucial inflection point. While the industry continues to attract unprecedented levels of investment and attention—especially within generative AI—several underlying market dynamics suggest we’re heading toward a major shift in the coming year.

Drawing from my experience leading an AI startup and observing the industry’s rapid evolution, I believe this year will bring several fundamental changes: large concept models (LCMs) emerging as serious competitors to large language models (LLMs), the rise of specialized AI hardware, and Big Tech companies beginning major AI infrastructure build-outs that will finally put them in a position to outcompete startups like OpenAI and Anthropic—and, who knows, maybe even secure their AI monopoly after all.

Unique Challenge of AI Companies: Neither Software nor Hardware

The fundamental issue lies in how AI companies operate in a previously unseen middle ground between traditional software and hardware businesses. Unlike pure software companies that primarily invest in human capital with relatively low operating expenses, or hardware companies that make long-term capital investments with clear paths to returns, AI companies face a unique combination of challenges that make their current funding models precarious.

These companies require massive upfront capital expenditure for GPU clusters and infrastructure, spending $100-200 million annually on computing resources alone. Yet unlike hardware companies, they can’t amortize these investments over extended periods. Instead, they operate on compressed two-year cycles between funding rounds, each time needing to demonstrate exponential growth and cutting-edge performance to justify their next valuation markup.
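
To make that squeeze concrete, here’s a back-of-envelope sketch in Python. It takes the $100-200 million annual compute figure above and assumes, purely for illustration, a two-year funding cycle versus a ten-year, hardware-style amortization horizon; all values are illustrative assumptions rather than reported figures.

```python
# Back-of-envelope sketch of the amortization squeeze described above.
# All inputs are illustrative assumptions, not reported figures.

annual_compute_spend = 150e6        # midpoint of the $100-200M range cited above
funding_cycle_years = 2             # typical gap between funding rounds
hardware_amortization_years = 10    # horizon a hardware business might use

spend_per_cycle = annual_compute_spend * funding_cycle_years

# Cost burden per year if the spend could be amortized like traditional hardware...
amortized_annual_cost = spend_per_cycle / hardware_amortization_years
# ...versus the reality of having to justify it within a single funding cycle.
compressed_annual_cost = spend_per_cycle / funding_cycle_years

print(f"Spend per funding cycle: ${spend_per_cycle/1e6:.0f}M")
print(f"Annual burden, 10-year amortization: ${amortized_annual_cost/1e6:.0f}M")
print(f"Annual burden, 2-year funding cycle:  ${compressed_annual_cost/1e6:.0f}M")
```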

The LLM Differentiation Problem

Adding to this structural challenge is a concerning trend: the rapid convergence of large language model (LLM) capabilities. Startups like the unicorn Mistral AI have demonstrated that open-source models can achieve performance comparable to their closed-source counterparts, and as a result, the technical differentiation that previously justified sky-high valuations is becoming increasingly difficult to maintain.

In other words, while every new LLM boasts impressive performance on standard benchmarks, no truly significant shift in the underlying model architecture is taking place.

Current limitations stem from three critical areas: data availability, as we’re running out of high-quality training material (a point Elon Musk has recently echoed); curation methods, as most labs adopt similar human-feedback approaches pioneered by OpenAI; and computational architecture, as they all rely on the same limited pool of specialized GPU hardware.

What’s emerging is a pattern where gains increasingly come from efficiency rather than scale. Companies are focusing on compressing more knowledge into fewer tokens and developing better engineering artifacts, such as graph-based retrieval-augmented generation (RAG) systems. Essentially, we’re approaching a natural plateau where throwing more resources at the problem yields diminishing returns.
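
To make the idea concrete, here is a minimal, self-contained sketch of graph-based RAG: retrieval walks a small knowledge graph around an entity in the question and injects the surrounding facts into the prompt. The toy graph and helper functions are hypothetical, not any particular vendor’s API.

```python
# Minimal sketch of the graph-RAG idea mentioned above: retrieve not just the
# matching fact but its graph neighborhood, then prepend it to the prompt.
# The tiny knowledge graph and query below are hypothetical examples.

knowledge_graph = {
    "Groq": [("builds", "inference-specific hardware")],
    "inference-specific hardware": [("competes with", "GPGPU clusters")],
    "GPGPU clusters": [("are bought by", "Big Tech")],
}

def retrieve_neighborhood(entity: str, depth: int = 2) -> list[str]:
    """Walk the graph up to `depth` hops from `entity` and return facts as sentences."""
    facts, frontier, seen = [], [entity], {entity}
    for _ in range(depth):
        next_frontier = []
        for node in frontier:
            for relation, target in knowledge_graph.get(node, []):
                facts.append(f"{node} {relation} {target}.")
                if target not in seen:
                    seen.add(target)
                    next_frontier.append(target)
        frontier = next_frontier
    return facts

def build_prompt(question: str, entity: str) -> str:
    """Assemble the retrieved neighborhood into context for the model."""
    context = " ".join(retrieve_neighborhood(entity))
    return f"Context: {context}\nQuestion: {question}"

print(build_prompt("Who buys GPGPU clusters?", "Groq"))
```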

Due to the unprecedented pace of innovation in the last two years, this convergence of LLM capabilities is happening faster than anyone anticipated, creating a race against time for companies that raised funds.

Based on the latest research trends, the next frontier for addressing this issue is the emergence of large concept models (LCMs) as a new, ground-breaking architecture competing with LLMs in their core domain: natural language processing (NLP).

Technically speaking, LCMs are expected to offer several advantages, including the potential for better performance with fewer iterations and the ability to achieve similar results with smaller teams. I believe these next-gen LCMs will be developed and commercialized by spin-off teams: the familiar ‘ex-Big Tech’ mavericks founding new startups to spearhead this revolution.

Monetization Timeline Mismatch

The compression of innovation cycles has created another critical issue: the mismatch between time-to-market and sustainable monetization. While we’re seeing unprecedented speed in the verticalization of AI applications – with voice AI agents, for instance, going from concept to revenue-generating products in mere months – this rapid commercialization masks a deeper problem.

Consider this: an AI startup valued at $20 billion today will likely need to generate around $1 billion in annual revenue within 4-5 years to justify going public at a reasonable multiple. This requires not just technological excellence but a dramatic transformation of the entire business model, from R&D-focused to sales-driven, all while maintaining the pace of innovation and managing enormous infrastructure costs.
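
The arithmetic behind that figure is easy to check. The sketch below assumes, for illustration only, a 20x revenue multiple at IPO and a hypothetical $100 million in current annual revenue; neither number describes any specific company.

```python
# Rough sanity check of the valuation math above. Inputs are assumptions
# chosen only to illustrate the argument, not figures for any real company.

valuation = 20e9            # today's valuation, as in the example above
target_multiple = 20        # assumed "reasonable" revenue multiple at IPO
years_to_ipo = 5
current_revenue = 100e6     # hypothetical starting annual revenue

required_revenue = valuation / target_multiple
required_cagr = (required_revenue / current_revenue) ** (1 / years_to_ipo) - 1

print(f"Revenue needed at IPO: ${required_revenue/1e9:.1f}B")
print(f"Implied revenue CAGR over {years_to_ipo} years: {required_cagr:.0%}")
```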

In that sense, the new LCM-focused startups that emerge in 2025 will be better positioned to raise capital, since their lower initial valuations make them more attractive targets for investors.

Hardware Shortage and Emerging Alternatives

Let’s take a closer look at infrastructure specifically. Today, every new GPU cluster is purchased by the big players before it’s even built, forcing smaller players to either commit to long-term contracts with cloud providers or risk being shut out of the market entirely.

But here’s what is really interesting: while everyone is fighting over GPUs, a fascinating shift in the hardware landscape is still largely being overlooked. The prevailing approach, general-purpose GPU computing (GPGPU), is incredibly inefficient for what most companies actually need in production. It’s like using a supercomputer to run a calculator app.

This is why I believe specialized AI hardware is going to be the next big shift in our industry. Companies like Groq and Cerebras are building inference-specific hardware that’s 4-5 times cheaper to operate than traditional GPUs. Yes, there’s a higher engineering cost upfront to optimize your models for these platforms, but for companies running large-scale inference workloads, the efficiency gains are clear.
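
A rough break-even calculation shows why. The sketch below assumes a hypothetical monthly GPU serving bill and a hypothetical one-time porting cost, and uses the midpoint of the 4-5x operating-cost advantage claimed above; every number is an illustrative assumption.

```python
# Simple break-even sketch for the trade-off described above: specialized
# inference hardware is assumed to be ~4-5x cheaper to run, but requires a
# one-time engineering effort to port and optimize the models.
# All numbers are illustrative assumptions.

monthly_gpu_inference_cost = 2_000_000   # hypothetical GPU serving bill per month
cost_ratio = 4.5                         # midpoint of the 4-5x claim above
one_time_porting_cost = 3_000_000        # assumed engineering investment

monthly_specialized_cost = monthly_gpu_inference_cost / cost_ratio
monthly_savings = monthly_gpu_inference_cost - monthly_specialized_cost
breakeven_months = one_time_porting_cost / monthly_savings

print(f"Monthly savings: ${monthly_savings/1e6:.2f}M")
print(f"Break-even after ~{breakeven_months:.1f} months")
```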

Data Density and the Rise of Smaller, Smarter Models

Moving to the next innovation frontier in AI will likely require not only greater computational power – especially for large models like LCMs – but also richer, more comprehensive datasets.

Interestingly, smaller, more efficient models are starting to challenge larger ones by capitalizing on how densely they are trained on available data. For example, models like Microsoft’s Phi-3 or Google’s Gemma 2B operate with far fewer parameters—often around 2 to 3 billion—yet achieve performance levels comparable to much larger models with 8 billion parameters.

These smaller models are increasingly competitive because of their high data density, making them robust despite their size. This shift toward compact, yet powerful, models aligns with the strategic advantages companies like Microsoft and Google hold: access to massive, diverse datasets through platforms such as Bing and Google Search.
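
One crude way to express that ‘data density’ intuition is tokens seen per parameter during training. The sketch below uses entirely hypothetical token budgets (not published training details for any real model) to show how a smaller model can end up more densely trained than a larger one.

```python
# Sketch of the "data density" intuition above: training tokens per parameter
# as a crude proxy for how densely a model is trained. The token budgets below
# are hypothetical placeholders, not published details for any real model.

models = {
    "small_model (2B params)": {"params": 2e9, "training_tokens": 2e12},
    "large_model (8B params)": {"params": 8e9, "training_tokens": 4e12},
}

for name, spec in models.items():
    tokens_per_param = spec["training_tokens"] / spec["params"]
    print(f"{name}: {tokens_per_param:,.0f} training tokens per parameter")
```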

This dynamic reveals two critical “wars” unfolding in AI development: one over compute power and another over data. While computational resources are essential for pushing boundaries, data density is becoming equally—if not more—critical. Companies with access to vast datasets are uniquely positioned to train smaller models with unparalleled efficiency and robustness, solidifying their dominance in the evolving AI landscape.

Who Will Win the AI War?

In this context, everyone wonders who in the current AI landscape is best positioned to come out on top. Here’s some food for thought.

Major technology companies have been pre-purchasing entire GPU clusters before construction, creating a scarcity environment for smaller players. Oracle’s 100,000+ GPU order and similar moves by Meta and Microsoft exemplify this trend.

Having invested hundreds of billions in AI initiatives, these companies require thousands of specialized AI engineers and researchers. This creates an unprecedented demand for talent that can only be satisfied through strategic acquisitions – likely resulting in many startups being absorbed in the upcoming months.

While these players will spend 2025 on large-scale R&D and infrastructure build-outs, by 2026 they’ll be in a position to strike like never before, thanks to unrivaled resources.

This isn’t to say that smaller AI companies are doomed—far from it. The sector will continue to innovate and create value. Some key innovations, like LCMs, are likely to be led by smaller, emerging players in the year to come, alongside Meta, Google/Alphabet, OpenAI, and Anthropic, all of which are working on exciting projects at the moment.

However, we’re likely to see a fundamental restructuring of how AI companies are funded and valued. As venture capital becomes more discriminating, companies will need to demonstrate clear paths to sustainable unit economics – a particular challenge for open-source businesses competing with well-resourced proprietary alternatives.

For open-source AI companies specifically, the path forward may require focusing on specific vertical applications where their transparency and customization capabilities provide clear advantages over proprietary solutions.