Bridging the Gap Between Language and Action: How Buffaly is Revolutionizing AI
The rapid advancement of Large Language Models (LLMs) has brought remarkable progress in natural language processing, empowering AI systems to understand and generate text with unprecedented fluency. Yet, these systems face a critical limitation: while they excel at processing language, they struggle to execute concrete actions or provide actionable insights grounded in real-world scenarios. This gap between language comprehension and practical execution is a fundamental challenge in AI development.
Enter Buffaly, powered by the groundbreaking Ontology-Guided Augmented Retrieval (OGAR) framework. Buffaly redefines how AI systems access, analyze, and act upon data by combining the structured clarity of ontologies with the dynamic reasoning capabilities of modern LLMs.
Why Buffaly Matters
Traditional LLMs operate as black boxes, generating outputs based on statistical patterns from vast datasets. While powerful, these systems often fall short when required to:
- Handle complex reasoning.
- Integrate structured and unstructured data sources.
- Execute actions grounded in real-world contexts.
Buffaly addresses these limitations by introducing ontology-based AI, which brings structure, control, and transparency to AI systems. With Buffaly, organizations can seamlessly bridge the divide between language understanding and action execution, unlocking new possibilities in fields like healthcare, finance, and aerospace.
How Buffaly Works
Buffaly’s OGAR framework is built around three core innovations:
- Structured Ontologies
Buffaly uses ontologies — graph-based representations of knowledge — to define concepts, relationships, and rules in a precise and transparent manner. This structure provides a foundation for reasoning and decision-making, enabling Buffaly to interpret and act on complex queries with clarity and accuracy.
- ProtoScript
At the heart of Buffaly lies ProtoScript, a C#-based scripting language designed to create and manipulate ontologies programmatically. ProtoScript allows developers to map natural language inputs into structured actions, bridging the gap between language and execution effortlessly.
- Dual Learning Modes
Buffaly handles both structured data (e.g., database schemas) and unstructured data (e.g., emails, PDFs) with equal ease. This dual capability allows Buffaly to populate its knowledge base dynamically, learning incrementally without the need for costly retraining.
What Sets Buffaly Apart?
Unlike traditional AI solutions, Buffaly integrates:
- Actionability: Translates language into executable actions for real-world systems.
- Dynamic Reasoning: Combines LLM insights with ontology-driven logic for advanced decision-making.
- Industry-Specific Applications: Tailors solutions for sensitive fields, ensuring secure, domain-specific results.
By serving as both a semantic and operational bridge, Buffaly creates a transparent interface that not only interprets language but also understands its implications and executes relevant actions.
A Glimpse Into the Future
The integration of Buffaly’s structured ontology with the power of LLMs represents a paradigm shift in AI. It paves the way for systems that are not only capable of understanding human language but also of acting on it with precision and accountability. Over the next series of blog posts, we’ll explore Buffaly’s unique features, diving deeper into its transformative potential and how it is shaping the future of AI applications.
Are you ready to see what’s next? Stay tuned as we unravel the layers of Buffaly’s OGAR framework and its implications for AI innovation!
If you want to learn more about the OGAR framework, download the OGAR White Paper at OGAR.ai.
When Retrieval Augmented Generation (RAG) Fails
Retrieval Augmented Generation (RAG) sounds like a dream come true for anyone working with AI language models. The idea is simple: enhance models like ChatGPT with external data so they can provide answers based on information beyond their original training. Need your AI to answer questions about your company's internal documents or recent events not covered in its training data? RAG seems like the perfect solution.
But when we roll up our sleeves and implement RAG in the real world, things get messy. Let's dive into why RAG isn't always the magic fix we hope for and explore the hurdles that can trip us up along the way.
The Allure of RAG
At its heart, RAG is about bridging gaps in an AI's knowledge:
- Compute Embeddings: Break down your documents into chunks and convert them into embeddings—numerical representations that capture the essence of the text.
- Store and Retrieve: Keep these embeddings in a database. When a question comes in, find the chunks whose embeddings are most similar to the question.
- Augment the AI: Feed these relevant chunks to the AI alongside the question, giving it the context it needs to generate an informed answer.
In theory, this means your AI can tap into any knowledge source you provide, even if that information isn't part of its original training.
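To make those three steps concrete, here is a minimal sketch in Python. The embed() function below is a toy bag-of-words stand-in, not a real embedding model, and the chunks are invented for illustration; in practice you would call whatever embedding API you use.

```python
import numpy as np

VOCAB_DIM = 512

def embed(text: str) -> np.ndarray:
    """Toy embedding: hash each word into a fixed-size count vector.
    Stand-in for a real embedding model."""
    vec = np.zeros(VOCAB_DIM)
    for word in text.lower().split():
        vec[hash(word) % VOCAB_DIM] += 1.0
    return vec

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-9))

# 1. Compute embeddings: break documents into chunks and embed each one.
chunks = [
    "Refunds are accepted within 30 days of purchase.",
    "Standard shipping takes 5 to 7 business days.",
]
index = [(chunk, embed(chunk)) for chunk in chunks]

# 2. Store and retrieve: find the chunks most similar to the question.
question = "How long does shipping take?"
q_vec = embed(question)
ranked = sorted(index, key=lambda item: cosine(q_vec, item[1]), reverse=True)

# 3. Augment the AI: prepend the retrieved chunks to the prompt.
context = "\n".join(chunk for chunk, _ in ranked[:1])
prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
print(prompt)
```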
The Reality Check
Despite its promise, implementing RAG isn't all smooth sailing. Here are some of the bumps you might hit on the road.
1. The Ever-Changing Embeddings
Embeddings are the foundation of RAG—they're how we represent text in a way that the AI can understand and compare. But here's the catch: embedding models keep evolving. New models offer better performance, but they come with their own embeddings that aren't compatible with the old ones.
So, you're faced with a dilemma:
- Recompute All Embeddings: Every time a new model comes out, you could reprocess your entire document library to generate new embeddings. But if you're dealing with millions or billions of chunks, that's a hefty computational bill.
- Stick with the Old Model: You might decide to keep using the old embeddings to save on costs. But over time, you miss out on improvements and possibly pay more for less efficient models.
- Mix and Match: Use new embeddings for new documents and keep the old ones for existing data. But now your database is fragmented, and searching across different embedding spaces gets complicated.
There's no perfect solution. Some platforms, like SemDB.ai, try to ease the pain by allowing multiple embeddings in the same database, but the underlying challenge remains.
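One pragmatic pattern, roughly what supporting multiple embeddings in one database implies, is to tag each stored vector with the model that produced it and only compare vectors within the same embedding space. Here is a minimal sketch; the model names and storage layout are assumptions for illustration, not SemDB's actual design:

```python
from dataclasses import dataclass, field
import numpy as np

@dataclass
class StoredChunk:
    text: str
    vector: np.ndarray
    model: str  # e.g. "embedder-v1" or "embedder-v2" (illustrative names)

@dataclass
class VersionedIndex:
    chunks: list = field(default_factory=list)

    def add(self, text: str, vector: np.ndarray, model: str) -> None:
        self.chunks.append(StoredChunk(text, vector, model))

    def search(self, query_vec: np.ndarray, model: str, k: int = 5) -> list:
        # Only score chunks embedded by the same model as the query vector;
        # vectors from different models live in incompatible spaces.
        same_space = [c for c in self.chunks if c.model == model]

        def score(c: StoredChunk) -> float:
            return float(np.dot(query_vec, c.vector) /
                         (np.linalg.norm(query_vec) * np.linalg.norm(c.vector) + 1e-9))

        return [c.text for c in sorted(same_space, key=score, reverse=True)[:k]]
```

The trade-off described above still applies: a query embedded with the new model only sees chunks re-embedded with it, so you either search each space separately and merge the results, or gradually re-embed your old content.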
2. The Pronoun Problem
Language is messy. People use pronouns, references, and context that computers struggle with. Let's look at an example:
Original Text: "Chocolate cookies are made from the finest imported cocoa. They sell for $4 a dozen."
When we break this text into chunks for embeddings, we might get:
Chunk 1: "Chocolate cookies are made from the finest imported cocoa."
Chunk 2: "They sell for $4 a dozen."
Now, if someone asks, "How much do chocolate cookies cost?", the system searches for embeddings similar to the question. But Chunk 2 doesn't mention "chocolate cookies" explicitly—it uses "they." The AI might miss this chunk because the embedding doesn't match well with the question.
Solving It
One way to tackle this is by cleaning up the text before creating embeddings:
Chunk 1: "Chocolate cookies are made from the finest imported cocoa."
Chunk 2: "Chocolate cookies sell for $4 a dozen."
By replacing pronouns with the nouns they refer to, we make each chunk self-contained and easier for the AI to match with questions.
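A preprocessing pass can do that substitution before chunking. Production systems usually rely on a proper coreference-resolution model; the hand-written rule below is only a sketch of the shape of the transformation, using the cookie example from above.

```python
import re

def resolve_pronouns(sentences: list) -> list:
    """Toy coreference pass: replace a leading 'They' or 'It' with the
    subject guessed from the previous sentence. Real systems would use a
    coreference-resolution model instead of this heuristic."""
    resolved = []
    last_subject = None
    for sentence in sentences:
        if last_subject and re.match(r"^(They|It)\b", sentence):
            sentence = re.sub(r"^(They|It)\b", last_subject, sentence, count=1)
        match = re.match(r"^([A-Z][a-z]+(?: [a-z]+)?)", sentence)  # crude subject guess
        if match:
            last_subject = match.group(1)
        resolved.append(sentence)
    return resolved

print(resolve_pronouns([
    "Chocolate cookies are made from the finest imported cocoa.",
    "They sell for $4 a dozen.",
]))
# ['Chocolate cookies are made from the finest imported cocoa.',
#  'Chocolate cookies sell for $4 a dozen.']
```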
3. Navigating Domain-Specific Knowledge
Things get trickier with specialized or branded products. Imagine you have a product description like this:
"Introducing Darlings—the ultimate cookie experience that brings together the timeless flavors of vanilla and chocolate in perfect harmony... And at just $5 per dozen, indulgence has never been so affordable."
Extracting key facts:
Darlings are cookies.
Darlings combine vanilla and chocolate.
Darlings cost $5 per dozen.
Now, if someone asks, "How much are the chocolate and vanilla cookies?", they might not mention "Darlings" by name. The embeddings might prioritize more general chunks about chocolate or vanilla cookies, missing the specific info about Darlings.
4. The Limits of Knowledge Graphs
To overcome these issues, some suggest using Knowledge Graphs alongside RAG. Knowledge Graphs store information as simple relationships:
(Darlings, are, cookies)
(Darlings, cost, $5)
(Darlings, contain, chocolate and vanilla)
In theory, this structure makes it easy to retrieve specific facts. But reality isn't so tidy.
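To see where it creaks, it helps to look at what a triple store actually gives you. A minimal in-memory sketch (the API here is invented for illustration, not any particular graph database):

```python
class TripleStore:
    def __init__(self):
        self.triples = []  # list of (subject, predicate, object) tuples

    def add(self, subject: str, predicate: str, obj: str) -> None:
        self.triples.append((subject, predicate, obj))

    def query(self, subject=None, predicate=None, obj=None) -> list:
        """Return triples matching every field that is not None."""
        return [(s, p, o) for (s, p, o) in self.triples
                if (subject is None or s == subject)
                and (predicate is None or p == predicate)
                and (obj is None or o == obj)]

store = TripleStore()
store.add("Darlings", "are", "cookies")
store.add("Darlings", "cost", "$5")
store.add("Darlings", "contain", "chocolate and vanilla")

print(store.query(subject="Darlings", predicate="cost"))
# [('Darlings', 'cost', '$5')]
```

Point lookups like the price of Darlings are trivial. The trouble starts with sentences like the one below, where time, causation, and motivation all have to be squeezed into, or dropped from, that three-column shape.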
The Complexity of Real-World Information
Not all knowledge fits neatly into simple relationships. Consider:
"Bob painted the room red on Tuesday because he was feeling inspired."
Trying to capture all the nuances of this sentence in a simple graph gets complicated quickly. You need more than just triplets—you need context, causation, and temporal information.
Conflicting Information
Knowledge Graphs also struggle with contradictions or exceptions. For example:
(Richard Nixon, is a, Quaker)
(Quakers, are, pacifists)
(Richard Nixon, escalated, the Vietnam War)
Does the graph conclude that Nixon is a pacifist? Real-world logic isn't always straightforward, and AI can stumble over these nuances.
5. The Human vs. Machine Conundrum
Humans are flexible thinkers. We handle ambiguity, context, and exceptions with ease. Computers, on the other hand, need clear, structured data. When we try to force the richness of human language and knowledge into rigid formats, we lose something important.
The Database Dilemma
All these challenges highlight a broader issue: how we store and retrieve data for AI systems. Balancing the need for detailed, accurate information with the limitations of current technology isn't easy.
Embedding databases can become unwieldy as they grow. Knowledge Graphs can help organize information but may oversimplify complex concepts. We're still searching for the best way to bridge the gap between human language and machine understanding.
So, What Now?
RAG isn't a lost cause—it just isn't a one-size-fits-all solution. To make it work better, we might need to:
- Develop Smarter Preprocessing: Clean and prepare text in ways that make it easier for AI to understand, like resolving pronouns and simplifying sentences.
- Embrace Hybrid Approaches: Combine embeddings with other methods, like traditional search algorithms or domain-specific rules, to improve accuracy (see the sketch after this list).
- Accept Imperfection: Recognize that AI has limitations and set realistic expectations about what it can and can't do.
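As a sketch of the hybrid idea from the list above, you can blend keyword overlap with embedding similarity, so an exact product name like "Darlings" still counts even when the embedding match is weak. The scoring and the alpha weight here are assumptions for illustration:

```python
import numpy as np

def keyword_score(query: str, chunk: str) -> float:
    """Fraction of query words that appear verbatim in the chunk."""
    q_words = set(query.lower().split())
    c_words = set(chunk.lower().split())
    return len(q_words & c_words) / max(len(q_words), 1)

def hybrid_score(query: str, chunk: str,
                 q_vec: np.ndarray, c_vec: np.ndarray,
                 alpha: float = 0.5) -> float:
    """Blend embedding cosine similarity with keyword overlap.
    alpha balances the two signals and would need tuning per dataset."""
    cos = float(np.dot(q_vec, c_vec) /
                (np.linalg.norm(q_vec) * np.linalg.norm(c_vec) + 1e-9))
    return alpha * cos + (1 - alpha) * keyword_score(query, chunk)
```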
Final Thoughts
Retrieval Augmented Generation holds a lot of promise, but it's not a magic wand. By understanding its limitations and working to address them, we can build better AI systems that come closer to meeting our needs. It's an ongoing journey, and with each challenge, we learn more about how to bridge the gap between human knowledge and artificial intelligence.
SemDB: Solving the Challenges of Graph RAG
In the beginning there was keyword search.
Eventually word embeddings came along and we got Vector Databases and Retrieval Augmented Generation (RAG). They were good for writing blog posts about topics that sounded smart, but didn’t actually work well in the real world. Fast forward a few years and some VC-hungry individuals bolted Graph Databases onto the Vector Databases, and Graph RAG was born.
It’s still great for blog posts. Still doesn’t work well in the real world.
Enter SemDB.ai.
SemDB is an abbreviation for Semantic Database. It’s a database of “semantics” – a database of meaning. SemDB strives to go beyond mathematical tricks and triples. It stores “meaning”. It allows us to index, retrieve, and act upon data by its meaning – not just its cosine similarity.
Behind the scenes, SemDB uses Ontology-Guided Augmented Retrieval (OGAR), a leap forward that enables faster, more cost-effective, and scalable solutions for real-world applications.
In this post we will focus on a few shortcomings of the Graph RAG approach and how SemDB solves them. For an overview of both Graph RAG and some of its problems, take a look at the article "Graph RAG Has Awesome Potential, But Currently Has Serious Flaws" by Troyusrex.
Advantages of Graph RAG
Graph RAG is a huge advance over traditional Vector search.
- Enhanced Contextual Understanding: By leveraging graph structures, Graph RAG can capture complex relationships between entities, leading to more accurate and context-aware information retrieval. This is particularly useful for tasks requiring deep understanding and reasoning.
- Improved Retrieval Precision: Graph RAG can improve retrieval precision by using graph-based indexing and retrieval methods. This ensures that the most relevant information is retrieved, even if it is buried within a large dataset.
- Mitigation of Hallucination: Traditional language models sometimes generate "hallucinated" information, which is not accurate or relevant. Graph RAG helps mitigate this issue by referencing structured knowledge bases, ensuring the generated content is grounded in factual data.
- Domain-Specific Knowledge: Graph RAG can be tailored to specific domains by incorporating domain-specific knowledge graphs, making it highly effective for specialized applications such as legal research, medical diagnostics, and technical documentation.
Problems with Graph RAG
But real-world Graph RAG applications have several significant problems:
- Speed: Graph RAG is horrendously slow for real-world applications, often taking minutes to respond.
- Cost: Data preparation can cost many thousands of dollars for moderately sized datasets.
- Scalability: The reliance on clustered communities makes scaling challenging.
- Accuracy: Testing has shown little increase in search accuracy compared to traditional RAG.
SemDB to the Rescue
If the progression has been
Keyword Search → Vector Search (RAG) → Graph Search (Graph RAG)
Then let’s skip ahead a few progressions and get to the end:
Keyword Search → Vector Search (RAG) → Graph Search (Graph RAG) → ??? → OGAR (Ontology-Guided Augmented Retrieval)
You gotta admit, it’s an awesome acronym, right? OGAR…. Grrr.
Vector Search and Graph RAG attempt to let us search by meaning. Before the arrival of ChatGPT, scientists used to think about questions like "How do we represent meaning? What does it mean 'to mean'?" There is a rich history of meaning representation that goes beyond word embeddings (vectors) and triples (graphs). Unfortunately, it’s now easier to outsource every task to a multi-hundred-gigabyte neural network than it is to write code. When all you have is an LLM, everything looks like a prompt engineering task.
In contrast to Graph RAG, Semantic Database (SemDB) is designed to handle complexity effortlessly. Its ontology-driven framework and Local Understanding solve the problems of Graph RAG.
Local Understanding
As I previously mentioned, not everything needs to be outsourced to ChatGPT. SemDB is able to understand somewhere around 80-90% of sentence inputs without the use of an LLM. That means it can do 80-90% of the processing work without paying a per-token fee.
One of the greatest challenges with traditional Graph RAG systems is the prohibitively high cost of entity extraction, driven by heavy reliance on LLMs. Each data chunk and cluster requires multiple LLM calls, quickly adding up to tens of thousands of dollars for large datasets. SemDB, however, does most of this work locally, without involving Big Brother OpenAI.
Why is that important?
- Cost: Fewer LLM calls mean less $$$.
- Accuracy: Local Understanding allows for Organization Specific vocabularies.
- Speed: Local Understanding means local processing… and that’s fast.
- Security: Not every piece of data needs to be sent to our AI overlords so that they can use it to train their next models.
- Note: OpenAI and Google both super-duper promise not to ever use your data to train their models. Seriously, they pinky-swore and everything.
Cost Advantages of Local Understanding
With Local Understanding, SemDB significantly reduces the dependency on costly LLM calls, allowing organizations to process larger datasets at a fraction of the price:
- Reduced External LLM Calls:
- Traditional systems require 1 LLM call per data chunk and 1 per cluster. SemDB’s Local Understanding handles these tasks algorithmically, bypassing the need for external calls entirely.
- This approach slashes costs, making large-scale projects financially viable.
- Scalable Data Extraction:
- Because Local Understanding operates within the organization’s infrastructure, there is no incremental cost for scaling. SemDB can handle datasets with millions of entities without ballooning expenses.
- For example, where traditional methods might cost $60,000 for a million records, SemDB achieves the same results at a fraction of the cost, with no ceiling on dataset size or complexity (a rough back-of-the-envelope estimate follows this list).
- Optimized Processing for Domain-Specific Graphs:
- By tailoring its Local Understanding capabilities to the specific needs of the organization, SemDB enables the creation of more complex, richly detailed graphs without incurring additional costs.
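A rough back-of-the-envelope comparison makes the scale of the difference clear. The chunk counts, per-call price, and local coverage below are assumptions for illustration, not measured figures:

```python
# Illustrative assumptions: 1,000,000 chunks, 10,000 clusters, and an average
# LLM extraction call costing about $0.05 in tokens.
chunks = 1_000_000
clusters = 10_000
cost_per_call = 0.05  # assumed average dollars per extraction call

llm_only_cost = (chunks + clusters) * cost_per_call
print(f"All-LLM extraction: ~${llm_only_cost:,.0f}")      # ~$50,500

# If Local Understanding handles roughly 85% of chunks without an LLM call:
local_fraction = 0.85
hybrid_cost = (chunks * (1 - local_fraction) + clusters) * cost_per_call
print(f"With Local Understanding: ~${hybrid_cost:,.0f}")  # ~$8,000
```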
Beyond Cost Savings: Enabling Richer Graphs
SemDB’s ability to extract more data for less cost doesn’t just save money—it also empowers organizations to build bigger, more detailed, and more accurate graphs:
- Incorporating Nuanced Relationships: Local Understanding allows SemDB to detect subtle, domain-specific relationships that external systems might overlook, enriching the knowledge graph with deeper insights.
- Expanding Data Coverage: By lowering costs, organizations can afford to process larger datasets, capturing more entities and relationships that drive value.
- Iterative Improvement: SemDB’s architecture allows for ongoing refinement of graphs as new data becomes available, further enhancing accuracy and depth.
- Organization Specific Vocabularies: Every company has their own lingo, vocabulary, and internal speak that the LLMs don’t fully understand. SemDB is able to capture that meaning, store it, and operate upon it like any other semantic nugget.
Conclusion
At Intelligence Factory we use SemDB as the backbone of our applications. It allows us to build complex graphs for various domains. Honestly, our customers don’t care one bit about the advantages of Ontologies over Graphs. Some projects we’ve built on SemDB:
- HIPAA Compliant Chat Bots: That don’t hallucinate or give dieting advice to anorexics.
- Sales Tools: To mine thousands of conversations for missed opportunities.
What’s most important, however, is that you can take advantage of these technologies with our consumer-focused products: FeedingFrenzy.ai and SemDB.ai. Both are built on this infrastructure and offer features that make running your business easier. For the more technical side of things, feel free to check out Buffa.ly.