Bryon Jacob is the CTO and co-founder of data.world – on a mission to build the world’s most meaningful, collaborative, and abundant data resource. Prior to data.world, he spent ten years in roles of increasing responsibility at HomeAway.com, culminating in a VP of Tech / Technical Fellow role. Bryon has also previously worked at Amazon, and is a long-time mentor at Capital Factory. He has a BS/MS in computer science from Case Western Reserve University.
What initially attracted you to computer science?
I’ve been hooked on coding since I got my hands on a Commodore 64 at age 10. I started with BASIC and quickly moved on to assembly language. For me, computer science is like solving a series of intricate puzzles with the added thrill of automation. It’s this problem-solving aspect that has always kept me engaged and excited.
Can you share the genesis story behind data.world?
data.world was born from a series of brainstorming sessions among our founding team. Brett, our CEO, reached out to Jon and Matt, both of whom he had worked with before. They began meeting to toss around ideas, and Jon brought a few of those concepts to me for a tech evaluation. Although those ideas didn’t pan out, they sparked discussions that aligned closely with my own work. Through these conversations, we hit upon the idea that eventually became data.world. Our shared history and mutual respect allowed us to quickly build a great team, bringing in the best people we’d worked with in the past, and to lay a solid foundation for innovation.
What inspired data.world to develop the AI Context Engine, and what specific challenges does it address for businesses?
From the beginning, we knew a Knowledge Graph (KG) would be critical for advancing AI capabilities. With the rise of generative AI, our customers wanted AI solutions that could interact with their data conversationally. A significant challenge in AI applications today is explainability. If you can’t show your work, the answers are less trustworthy. Our KG architecture grounds every response in verifiable facts, providing clear, traceable explanations. This enhances transparency and reliability, enabling businesses to make informed decisions with confidence.
How does the knowledge graph architecture of the AI Context Engine enhance the accuracy and explainability of LLMs compared to SQL databases alone?
In our groundbreaking paper, we demonstrated a threefold improvement in answer accuracy when an LLM works over a Knowledge Graph (KG) rather than directly over a traditional relational database. KGs use semantics to represent data as real-world entities and relationships, which gives an LLM far more to reason with than the bare tables and columns of a SQL schema. For explainability, KGs allow us to link answers back to term definitions, data sources, and metrics, providing a verifiable trail that enhances trust and usability.
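To make the contrast concrete, here is a minimal sketch using the open-source rdflib library (not data.world’s own stack) of how a fact can live in a knowledge graph as an entity-relationship statement, alongside the definition and source that make an answer explainable. The example.org namespace and property names are illustrative assumptions, not data.world’s schema.

```python
# Minimal sketch with rdflib: a business fact, its definition, and its source
# live in one graph. The EX namespace and property names are illustrative.
from rdflib import Graph, Literal, Namespace, RDF, RDFS

EX = Namespace("http://example.org/")
g = Graph()

# A fact as an entity-relationship statement rather than a table row.
g.add((EX.Acme, RDF.type, EX.Customer))
g.add((EX.Acme, EX.annualRevenue, Literal(1_200_000)))

# The same graph carries the metric's business definition and data source,
# which is what lets an answer link back to "how do we know this?".
g.add((EX.annualRevenue, RDFS.comment,
       Literal("Total recognized revenue over the trailing 12 months")))
g.add((EX.annualRevenue, EX.sourcedFrom, EX.finance_revenue_table))

# Query the fact and its explanation together.
query = """
SELECT ?revenue ?definition ?source WHERE {
  ?customer a ex:Customer ;
            ex:annualRevenue ?revenue .
  ex:annualRevenue rdfs:comment ?definition ;
                   ex:sourcedFrom ?source .
}
"""
for row in g.query(query, initNs={"ex": EX, "rdfs": RDFS}):
    print(f"revenue={row.revenue}, definition={row.definition}, source={row.source}")
```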
Can you share some examples of how the AI Context Engine has transformed data interactions and decision-making within enterprises?
The AI Context Engine is designed as an API that integrates seamlessly with customers’ existing AI applications, be they custom GPTs, co-pilots, or bespoke solutions built with LangChain. This means users don’t need to switch to a new interface – instead, we bring the AI Context Engine to them. This integration enhances user adoption and satisfaction, driving better decision-making and more efficient data interactions by embedding powerful AI capabilities directly into existing workflows.
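To illustrate that integration pattern (this is not the actual AI Context Engine API; the endpoint URL, payload fields, and response shape below are hypothetical), an existing AI application can hide a context-aware answering service behind a single function and keep whatever interface users already know:

```python
# Hypothetical sketch of calling a context-aware answering API from an existing
# AI app. The URL, payload shape, and response fields are assumptions made for
# illustration, not the documented AI Context Engine interface.
import os
import requests

API_URL = "https://api.example.com/v1/answer"   # placeholder endpoint
API_TOKEN = os.environ.get("CONTEXT_API_TOKEN", "")

def ask_with_context(question: str) -> dict:
    """Send a natural-language question and return the answer plus its trace."""
    resp = requests.post(
        API_URL,
        headers={"Authorization": f"Bearer {API_TOKEN}"},
        json={"question": question},
        timeout=30,
    )
    resp.raise_for_status()
    return resp.json()  # e.g. {"answer": ..., "sources": [...], "query": ...}

# A custom GPT, co-pilot, or LangChain tool can wrap this function so end users
# never have to leave the interface they already work in.
if __name__ == "__main__":
    print(ask_with_context("What was total revenue by region last quarter?"))
```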
In what ways does the AI Context Engine provide transparency and traceability in AI decision-making to meet regulatory and governance requirements?
The AI Context Engine ties into our Knowledge Graph and data catalog, leveraging capabilities around lineage and governance. Our platform tracks data lineage, offering full traceability of data and transformations. AI-generated answers are connected back to their data sources, providing a clear trace of how each piece of information was derived. This transparency is crucial for regulatory and governance compliance, ensuring every AI decision can be audited and verified.
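One generic way to picture that traceability (a sketch, not data.world’s internal model; the field names and example values are illustrative) is to return every AI-generated answer together with the lineage records behind it, so an auditor can walk from the answer back to the datasets and transformations involved:

```python
# Generic sketch: pair each AI-generated answer with the lineage that produced
# it. Field names and example values are illustrative, not data.world's model.
from dataclasses import dataclass, field
from typing import List

@dataclass
class LineageStep:
    dataset: str          # e.g. "warehouse.sales_orders"
    transformation: str   # e.g. "summed amount, grouped by region"

@dataclass
class TracedAnswer:
    question: str
    answer: str
    generated_query: str                      # the query that produced the answer
    lineage: List[LineageStep] = field(default_factory=list)

    def audit_trail(self) -> str:
        """Render a human-readable trail for governance review."""
        steps = "\n".join(f"  - {s.dataset}: {s.transformation}" for s in self.lineage)
        return (f"Q: {self.question}\nA: {self.answer}\n"
                f"Query: {self.generated_query}\nSources:\n{steps}")

trail = TracedAnswer(
    question="What was Q2 revenue by region?",
    answer="EMEA led with $4.1M (illustrative value).",
    generated_query="SELECT region, SUM(amount) ...",
    lineage=[LineageStep("warehouse.sales_orders", "summed amount, grouped by region")],
)
print(trail.audit_trail())
```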
What role do you see knowledge graphs playing in the broader landscape of AI and data management in the coming years?
Knowledge Graphs (KGs) are becoming increasingly important with the rise of generative AI. By formalizing facts into a graph structure, KGs provide a stronger foundation for AI, enhancing both accuracy and explainability. We’re seeing a shift from standard Retrieval Augmented Generation (RAG) architectures, which rely on unstructured data, to Graph RAG models. These models convert unstructured content into KGs first, leading to significant improvements in recall and accuracy. KGs are set to play a pivotal role in driving AI innovations and effectiveness.
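As a rough illustration of that Graph RAG pattern (a toy sketch; in practice the triples would be extracted from documents by an information-extraction or LLM pipeline, and the retrieved facts would be passed to the model as grounding context):

```python
# Toy Graph RAG sketch: triples extracted from unstructured content go into a
# graph, and retrieval returns the facts around entities named in the question.
# In a real pipeline, triple extraction and answer generation are model-driven.
import networkx as nx

# Pretend these triples were extracted from a pile of documents.
triples = [
    ("data.world", "offers", "AI Context Engine"),
    ("AI Context Engine", "builds_on", "knowledge graph"),
    ("knowledge graph", "improves", "answer accuracy"),
    ("knowledge graph", "enables", "explainability"),
]

kg = nx.DiGraph()
for subj, rel, obj in triples:
    kg.add_edge(subj, obj, relation=rel)

def retrieve_context(question: str) -> list[str]:
    """Return facts about any graph entity mentioned in the question."""
    facts = []
    for node in kg.nodes:
        if node.lower() in question.lower():
            for _, neighbor, data in kg.out_edges(node, data=True):
                facts.append(f"{node} {data['relation']} {neighbor}")
    return facts

# These facts would be prepended to the LLM prompt as grounding context.
print(retrieve_context("How does a knowledge graph help with accuracy?"))
```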
What future enhancements can we expect for the AI Context Engine to further improve its capabilities and user experience?
The AI Context Engine improves with use, as context flows back into the data catalog, making it smarter over time. From a product standpoint, we’re focusing on developing agents that perform advanced knowledge engineering tasks, turning raw content into richer ontologies and knowledge bases. We continuously learn from patterns that work and quickly integrate those insights, providing users with a powerful, intuitive tool for managing and leveraging their data.
How is data.world investing in research and development to stay at the forefront of AI and data integration technologies?
R&D on the AI Context Engine is our single biggest investment area. We’re committed to staying at the bleeding edge of what’s possible in AI and data integration. Our team, experts in both symbolic AI and machine learning, drives this commitment. The robust foundation we’ve built at data.world enables us to move quickly and push technological boundaries, ensuring we consistently deliver cutting-edge capabilities to our customers.
What is your long-term vision for the future of AI and data integration, and how do you see data.world contributing to this evolution?
My vision for the future of AI and data integration has always been to move beyond simply making it easier for users to query their data; the goal is to eliminate the need for users to write queries at all. That means seamlessly integrating an organization’s data with its knowledge, encompassing both metadata about data systems and logical models of real-world entities.
By achieving this integration in a machine-readable knowledge graph, AI systems can truly fulfill the promise of natural language interactions with data. With the rapid advancements in generative AI over the past two years and our efforts to integrate it with enterprise knowledge graphs, this future is becoming a reality today. At data.world, we are at the forefront of this evolution, driving the transformation that allows AI to deliver unprecedented value through intuitive and intelligent data interactions.
Thank you for the great interview. Readers who wish to learn more should visit data.world.