OGAR

Ontology-Guided Augmented Retrieval

Large language models are excellent at producing language. They are not, by default, reliable systems for retrieving domain-specific truth, enforcing constraints, or taking actions safely in regulated environments. OGAR is an architecture pattern for closing that gap: it uses an explicit ontology to constrain retrieval, validate outputs, and separate what the user meant from what the system is allowed to do.

The problem OGAR is built for

Most retrieval-augmented systems optimize for plausibility. In regulated domains, a plausible-but-wrong answer is not an acceptable failure mode.

The core failure is structural:

  • Similarity search is not meaning. Embedding-based retrieval can confuse queries that share words but imply different intent (for example, “receiving test kits” versus “receiving too many test kits”).
  • Vector retrieval is hard to audit. The dimensions are not interpretable, chunking choices change outcomes, and it is difficult to explain why a particular source was selected.
  • LLMs are brittle control surfaces. Small changes in phrasing can produce materially different behavior, and prompt injection is a real operational risk.
  • Language to action is not a safe default. LLMs can generate instructions that are syntactically valid but semantically wrong, infeasible, or unsafe—especially when downstream actions have compliance or revenue implications.
  • Incremental knowledge updates are awkward. Most LLM-centric systems treat learning as retraining; real operations require continuous updates without destabilizing existing behavior.
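The first failure mode is easy to demonstrate with a toy bag-of-words cosine similarity, a deliberately simplified stand-in for embedding retrieval (real embedding models are more sophisticated, but short queries with heavy word overlap tend to score similarly high):

```python
from collections import Counter
from math import sqrt

def cosine_bow(a: str, b: str) -> float:
    """Cosine similarity over bag-of-words counts (toy stand-in for embeddings)."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    norm = sqrt(sum(v * v for v in ca.values())) * sqrt(sum(v * v for v in cb.values()))
    return dot / norm

q1 = "receiving test kits"
q2 = "receiving too many test kits"  # opposite operational intent

print(round(cosine_bow(q1, q2), 2))  # → 0.77: high lexical similarity, different meaning
```

A similarity-only retriever ranks these two queries as near-duplicates; a system that must route one to fulfillment and the other to an over-shipment workflow cannot rely on that signal alone.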

OGAR starts from a different premise: if the output drives a decision, a workflow, or a billable action, you need an explicit representation of the domain and an execution boundary that is deterministic, inspectable, and versionable.

What “correct” must guarantee

OGAR is not “better RAG.” It is a correctness posture. A system implementing OGAR should guarantee:

  • Constrained answer space. Important outputs are selections over known entities, rules, or allowed values—not free-form inventions.
  • Separation of interpretation from execution. The system represents meaning explicitly, then uses that meaning to drive deterministic transforms and actions. The LLM may assist interpretation, but it does not own execution.
  • Deterministic validation and rejection. Claims that matter are checked against domain constraints. If proof does not close, the system rejects, asks for clarification, or escalates.
  • Auditability and reconstruction. The system emits a trace: what was retrieved, what candidates existed, what was selected, what validations ran, and why the final output is permitted.
  • Data sovereignty by design. Sensitive data and structured knowledge can remain within your boundary. If an LLM is used, you control what is shared and what is never exported.

These are not features. They are acceptance criteria for deploying language systems where correctness, compliance, and defensibility are required.

What OGAR is

OGAR (Ontology-Guided Augmented Retrieval) is a pattern for combining:

  • A domain ontology (concepts, entities, relationships, constraints, allowed actions)
  • A meaning representation layer (what the user meant, in terms the system can reason over)
  • A retrieval layer that is guided by ontology structure (not only similarity)
  • A transform and execution layer that produces deterministic outputs and actions
  • Optional LLM assistance where ambiguity exists (disambiguation, parsing, language normalization), without granting the model authority over final decisions

In short: OGAR treats the ontology as the control surface and uses models as helpers, not judges.
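A minimal in-memory sketch of the first component, assuming a graph-style ontology with per-concept allowed actions (all field and method names are invented for illustration), shows where authority lives:

```python
from dataclasses import dataclass, field

@dataclass
class Concept:
    name: str
    allowed_actions: set[str] = field(default_factory=set)

@dataclass
class Ontology:
    concepts: dict[str, Concept] = field(default_factory=dict)
    relations: set[tuple[str, str, str]] = field(default_factory=set)  # (subject, predicate, object)

    def add(self, c: Concept) -> None:
        self.concepts[c.name] = c

    def relate(self, subj: str, pred: str, obj: str) -> None:
        assert subj in self.concepts and obj in self.concepts, "entities must be known"
        self.relations.add((subj, pred, obj))

    def action_allowed(self, concept: str, action: str) -> bool:
        """Authority lives in the ontology, not in the model."""
        c = self.concepts.get(concept)
        return c is not None and action in c.allowed_actions

onto = Ontology()
onto.add(Concept("Order", {"cancel", "reship"}))
onto.add(Concept("Patient", set()))
onto.relate("Order", "belongs_to", "Patient")

print(onto.action_allowed("Order", "reship"))    # True
print(onto.action_allowed("Patient", "reship"))  # False
```

Whatever a model proposes, the final answer to "is this action permitted on this entity?" is a lookup, not a generation.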

How OGAR works at a high level

A typical OGAR loop looks like this:

  • Interpretation (language to candidates). Parse the user request into a set of candidate meanings tied to ontology concepts (for example, entities, intents, time windows, constraints). This can use deterministic heuristics plus model assistance for ambiguous cases.
  • Ontology-guided retrieval (candidates to grounded context). Retrieve relevant entities, relationships, rules, and examples from the ontology and connected systems (databases, APIs, logs), guided by structure and constraints rather than pure similarity.
  • Constrained selection (grounded context to allowed outputs). Convert the problem from “generate an answer” into “select among allowed candidates” or “assemble an output from validated components.”
  • Validation (selection to proof). Validate the result against ontology rules and any required external ground truth. If validation fails, reject or request clarification.
  • Execution and trace (proof to action or artifact). Execute actions through deterministic transforms and emit a trace sufficient for review and reconstruction.

This architecture is designed to bridge the gap between language proficiency and reliable action in real systems.
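The five steps above can be sketched as a single pipeline. Everything here is illustrative scaffolding under simplifying assumptions (substring matching for interpretation, a trivial selection heuristic, a caller-supplied rule function), not an actual OGAR API:

```python
def run_ogar_loop(query: str, ontology: dict, rules) -> dict:
    # 1. Interpretation: map language to candidate meanings tied to ontology concepts.
    candidates = [c for c in ontology if c in query.lower()]
    if not candidates:
        return {"status": "needs_clarification", "trace": {"query": query, "candidates": []}}

    # 2. Ontology-guided retrieval: pull grounded context for each candidate.
    context = {c: ontology[c] for c in candidates}

    # 3. Constrained selection: choose among allowed candidates; never free-generate.
    selection = max(candidates, key=lambda c: len(ontology[c]))

    # 4. Validation: check the selection against domain rules; reject on failure.
    if not rules(selection, context):
        return {"status": "rejected", "trace": {"query": query, "candidates": candidates}}

    # 5. Execution and trace: deterministic output plus a reconstructable record.
    return {
        "status": "accepted",
        "output": selection,
        "trace": {"query": query, "candidates": candidates,
                  "retrieved": sorted(context), "validated_by": rules.__name__},
    }

# Illustrative domain: concept -> grounded facts
ONTO = {"refund": ["policy-7"], "shipment": ["carrier-x", "sla-2"]}
def always_ok(selection, context): return True

result = run_ogar_loop("where is my shipment?", ONTO, always_ok)
print(result["status"], result["output"])  # accepted shipment
```

Note that every branch, including clarification and rejection, returns a trace: the loop has no path that produces an answer without also producing the evidence for it.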

What you get when this pattern is implemented correctly

OGAR produces operational artifacts that typical RAG systems struggle to provide:

  • Stable outputs under paraphrase: wording changes should not create meaningfully different outcomes when underlying facts are the same.
  • Explicit provenance: which sources and ontology relationships supported the output.
  • Reasoning traces: why an interpretation was chosen, and why an output was permitted.
  • Deterministic transforms: outputs that can be replayed, diffed, and versioned.
  • Controlled failure modes: rejection and escalation are first-class outcomes, not exceptions.
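"Replayed, diffed, and versioned" falls out of determinism: if a transform is a pure function of its recorded inputs, re-running the trace must reproduce the output exactly. A sketch, with an invented hashing scheme for the trace digest:

```python
import hashlib, json

def transform(inputs: dict) -> dict:
    """A deterministic transform: output depends only on recorded inputs."""
    return {"total": sum(inputs["quantities"]), "code": inputs["code"]}

def emit_trace(inputs: dict) -> dict:
    output = transform(inputs)
    blob = json.dumps({"inputs": inputs, "output": output}, sort_keys=True)
    return {"inputs": inputs, "output": output,
            "digest": hashlib.sha256(blob.encode()).hexdigest()}

def replay(trace: dict) -> bool:
    """Audit check: re-running the recorded inputs must reproduce the recorded output."""
    return transform(trace["inputs"]) == trace["output"]

t = emit_trace({"quantities": [2, 3], "code": "CODE-A"})
print(replay(t))  # True
```

Two traces with the same digest are the same decision; a digest mismatch on replay is itself an auditable signal that inputs, code, or ontology version changed.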

In regulated domains, these artifacts are the product. The language interface is just the front door.

Where OGAR fits (and where it does not)

OGAR is appropriate when:

  • correctness is defined by structured ground truth (codes, policies, program requirements, controlled vocabularies)
  • actions have compliance, safety, or revenue implications
  • you need to explain and reconstruct system behavior
  • knowledge evolves continuously and must be updated without destabilizing everything

OGAR is not the right tool when:

  • the task is inherently open-ended or creative
  • there is no stable ground truth to validate against
  • the goal is plausible conversation rather than defensible action

Relationship to Buffaly

Buffaly is an implementation of the OGAR pattern that was built specifically to control LLM behavior and connect language understanding to reliable retrieval and action execution. It uses an ontology-based meaning layer and a programmable transform/execution layer (ProtoScript) to keep interpretation and execution separable, inspectable, and adaptable as domains evolve.