Many of us innovating in the AI space are working in uncharted territory. Given how quickly AI companies are developing new technologies, it is easy to take for granted the dogged work behind the scenes. But in a field like XR, where the mission is to blur the lines between the real and digital worlds, there is little historical data or research to lean on, so we need to think outside the box.
While it is most convenient to rely on conventional machine learning wisdom and tried-and-true practices, that often isn't possible (or isn't the full solution) in emerging fields. Problems that have never been solved before need to be approached in new ways.
It’s a challenge that forces you to remember why you entered the engineering, data science, or product development field in the first place: a passion for discovery. I experience this every day in my role at Ultraleap, where we develop software that can track and respond to movements of the human hand in a mixed reality environment. So much of what we thought we knew about training machine learning models gets turned on its head in our work, as the human hand — along with the objects and environments it encounters — is extremely unpredictable.
Here are a few approaches my team and I have taken to reimagine experimentation and data science, and to bring the digital world intuitive interaction that is accurate and feels as natural as it would in the real world.
Innovating within the lines
When innovating in a nascent space, you are often faced with constraints that seem to be at odds with one another. My team is tasked with capturing the intricacies of hand and finger movements, and of how hands and fingers interact with the world around them. All of this is packaged into hand-tracking models that still fit into XR hardware on constrained compute. This means that our models, while sophisticated and complex, must take up significantly less storage and consume significantly less energy (to the tune of 1/100,000th) than the massive LLMs dominating headlines. It is an exciting challenge, requiring ruthless experimentation and evaluation of our models in their real-world application.
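To make that ratio concrete, here is a back-of-envelope sketch in Python. The parameter count and byte widths are illustrative assumptions, not Ultraleap's actual figures; the point is what a 1/100,000th budget implies for a model that has to live on a headset.

```python
# Back-of-envelope comparison (illustrative numbers, not Ultraleap's figures):
# what a ~1/100,000th storage-and-energy budget implies for an edge model.

LLM_PARAMS = 1_000_000_000_000    # assume a ~1T-parameter LLM
BYTES_PER_PARAM_FP16 = 2          # fp16 weights

llm_bytes = LLM_PARAMS * BYTES_PER_PARAM_FP16
edge_bytes = llm_bytes / 100_000  # the kind of reduction XR hardware demands

print(f"LLM footprint: {llm_bytes / 1e9:,.0f} GB")  # -> 2,000 GB
print(f"Edge budget:   {edge_bytes / 1e6:.0f} MB")  # -> 20 MB

# With int8 quantization (1 byte per weight), that budget buys roughly:
int8_params = edge_bytes / 1
print(f"~{int8_params / 1e6:.0f}M int8 parameters for full hand articulation")
```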
But the countless tests and experiments are worth it: a powerful model that still delivers low inference cost, power consumption, and latency is a marvel that can be applied in edge computing even outside of the XR space.
The constraints we run into while experimenting show up in other industries as well. Some businesses face unique challenges because of subtleties in their application domains, while others may have limited data to work with because they serve a niche market that large tech players haven't touched.
While one-size-fits-all solutions may suffice for some tasks, many application domains need to solve real, challenging problems specific to their task. For example, automotive assembly lines implement ML models for defect inspection. These models have to grapple with the very high-resolution imagery needed to identify small defects over the large surface area of a car. In this case, the application demands high accuracy, and the problem to solve is how to build a model that preserves that resolution, even at the cost of a low frame rate.
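One common way to make that trade, offered here as a sketch rather than a description of any particular production system, is tiled inference: run the detector over overlapping full-resolution patches so no defect is lost to downscaling, accepting that a full pass over the panel takes longer. The `detect_fn` below is a hypothetical stand-in for any per-tile model.

```python
import numpy as np

def tiled_inference(image: np.ndarray, detect_fn, tile: int = 1024, overlap: int = 128):
    """Run a defect detector over a very high-resolution image in overlapping tiles.

    Trades frame rate for resolution: every pixel is inspected at native
    resolution, so small defects on a large panel are not lost to downscaling.
    `detect_fn(patch)` is a stand-in for any per-tile model; it should yield
    (x, y, score) tuples in tile-local coordinates.
    """
    h, w = image.shape[:2]
    stride = tile - overlap
    detections = []
    for y in range(0, max(h - overlap, 1), stride):
        for x in range(0, max(w - overlap, 1), stride):
            patch = image[y:y + tile, x:x + tile]
            for px, py, score in detect_fn(patch):
                # Map tile-local detections back into full-image coordinates.
                detections.append((x + px, y + py, score))
    return detections
```

In practice you would also merge duplicate detections in the overlap regions; the overlap exists so that a defect straddling a tile border is seen whole at least once.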
Evaluating model architectures to drive innovation
A good dataset is the driving force behind any successful AI breakthrough. But what makes a dataset "good" for a particular objective, anyway? And when you are solving previously unsolved problems, how can you trust that existing data will be relevant? We cannot assume that metrics which serve one ML task well translate into performance on another, specific business task. This is where we are called to go against commonly held ML "truths" and instead actively explore how we label, clean, and apply both simulated and real-world data.
By nature, our domain is challenging to evaluate and requires manual quality assurance, done by hand. We aren't just looking at the quality metrics of our data. We iterate on our datasets and data sources and evaluate them based on the qualities of the models they produce in the real world. When we reevaluate how we grade and classify our data, we often find datasets or trends that we might otherwise have overlooked. With those datasets, and the countless experiments that showed us which data not to rely on, we've unlocked avenues we were missing before.
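In spirit, that workflow looks something like the sketch below, where the unit of experimentation is the dataset variant rather than the model architecture. All the names here are hypothetical stand-ins, and the training recipe is held fixed so that differences in the task score can be attributed to the data.

```python
# Minimal sketch of dataset-centric evaluation (all names are hypothetical).
# Hold the training recipe fixed and grade each dataset variant by the model
# it produces on the real task, not by intrinsic data metrics alone.

def evaluate_dataset_variants(variants, train_fn, task_metric):
    """variants: mapping of name -> training data (relabeled, cleaned,
    simulated, or mixed sim/real sets). Returns {name: task score}."""
    scores = {}
    for name, data in variants.items():
        model = train_fn(data)             # identical recipe for every variant
        scores[name] = task_metric(model)  # e.g. tracking accuracy while the
                                           # user is holding an object
    return scores

def rank_variants(scores):
    """Rank variants; the losers tell us which data *not* to rely on."""
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```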
Ultraleap’s latest hand-tracking platform, Hyperion, is a great example of this. Advancements in our datasets helped us to develop more sophisticated hand tracking that is able to accurately track microgestures as well as hand movements even while the user is holding an object.
One small step back, one big leap ahead
While the pace of innovation seemingly never slows, we can. We're in the business of experimenting, learning, and developing, and when we take the time to do just that, we often create something of far more value than when we go by the book and rush to put out the next tech innovation. There is no substitute for the breakthroughs that occur when we explore our data annotations, question our data sources, and redefine the quality metrics themselves. And the only way to do this is to experiment in the real application domain and measure model performance against the task. Rather than seeing uncommon requirements and constraints as limiting, we can take these challenges and turn them into opportunities for innovation and, ultimately, a competitive advantage.