The Ontology Decision Layer

Deterministic AI For Regulated Healthcare

We build ontology-driven decision layers that stay explainable under audit. They power the next generation of compliance infrastructure for providers and payers.

Infrastructure for Zero Trust Environments

In regulated sectors, you cannot afford "black box" decisions. Intelligence Factory builds systems where every output is traceable to a specific rule, policy, or clinical guideline.
Ontology Driven
We map complex regulatory policies into deterministic logic graphs, not probabilistic guesses.
Grounded Validation
Buffaly validates and constrains model output against an ontology-backed policy graph and verified sources, producing an auditable trace.
Fully Auditable
Every decision comes with a complete reasoning trace, ready for payer audits or compliance review.
System Agnostic
Deploys on top of your existing EHR or data lake. Data sovereignty remains with you.
The Commercial Proof
We don't just build theory. Our technology powers FairPath, processing millions in remote care claims with 98% payment success. We proved the stack works so you don't have to guess.
Core Technologies

The Intelligence Factory Stack

We expose our internal engineering stack for partners and enterprise teams building next-generation healthcare compliance tools. Our systems use language models as optional components, while Buffaly governs decisions through ontology-backed policy validation, deterministic guardrails, and auditable traces.
FairPath
Flagship Commercial Application
What It Is:
The end-to-end OS for remote care programs. FairPath uses the Intelligence Factory stack to automate billing, eligibility, and clinical necessity checks without human error.

For:
Medical Practices, RPM/RTM Providers.
Go to FairPath.ai →
Buffaly
Ontology Engine
What It Is:
A medical-grade ontology engine that transforms messy notes and alerts into clean, structured compliance data. It handles the logic mapping between ICD-10, CPT, and payer rules.

For:
Developers & Data Architects.
Learn More →
SemDB
Semantic Data Retrieval
What It Is:
A semantic database layer for complex, regulated environments. SemDB combines ontology mapping, hybrid retrieval, and local integration so teams can query legacy data deterministically with auditable results.

For:
Compliance Operations, Data Teams, and System Integrators.
Learn More →
The Intelligence Factory Difference

What makes Intelligence Factory different?

Not all AI is created equal. In an era where everyone claims to be "AI-powered," the technology beneath the surface matters more than ever. We build systems that stay reliable, transparent, and actionable in environments where mistakes are unacceptable, with Buffaly enforcing grounded validation and policy control over LLM-assisted workflows.
Battle-tested across industries for 16 years
Since 2009, we've been solving complex problems with AI in transportation systems, clinical environments, aviation operations, supply chain monitoring, and beyond. This cross-industry experience means our platform has been stress-tested against diverse requirements, from split-second logistics decisions to life-critical healthcare protocols. We've weathered the entire evolution of AI technology and emerged with solutions that actually work in the real world.
Not a prompt wrapper. LLMs used safely under control.
The AI boom made language models widely accessible, and with it came a wave of systems built entirely on prompt engineering. We build systems where language models are optional components, not the control plane.

Our core capability is Buffaly, an ontology-driven decision layer that constrains, validates, and explains every action. LLMs may assist with language understanding, summarization, or proposal generation, but compliance-critical decisions are executed and verified against deterministic policy graphs and structured domain knowledge.

This architecture provides:
Model-agnostic deployment: integrate frontier APIs, private models, or no LLM at all for sensitive paths
Evidence by design: every output is tied to a policy graph and produces an auditable reasoning trace
Deterministic guardrails: actions occur within typed contracts and explicit constraints
Operational accountability: systems remain inspectable, testable, and reviewable over time
Explainable, auditable, deterministic AI
Generic LLMs operate as black boxes that generate plausible-sounding text, sometimes accurate and sometimes fabricated. Buffaly's grounding and policy validation layer checks model-assisted outputs against the ontology and applies deterministic guardrails, so those outputs remain policy-verified and traceable.
This gives you:
Data sovereignty
Sensitive workflows can run with private models or no LLM path when policy requires it

Security assurance
Model integrations stay behind explicit contracts, validation checks, and rollback-safe controls

Performance optimization
Technology tuned to your specific domain, not trained on general internet knowledge

Future-proof architecture
You're not locked into someone else's technology roadmap or pricing model
The practical difference:
Deterministic guardrails
LLM-assisted outputs are validated and constrained by Buffaly's ontology-backed policy layer, producing deterministic traces

Complete transparency
Every output includes the reasoning and sources behind it

Regulatory compliance
Audit trails and documentation that satisfy even the strictest requirements

Expert control
Your domain specialists define what the AI knows and how it applies that knowledge
When your teams can trace exactly how the AI reached each conclusion, adoption accelerates and trust builds naturally.
Case Studies

Deep Tech in Action

How we apply ontology-driven decision making to real-world chaos.
Turning Medical Chaos into Structure
Ontology-driven integration across 30+ EHR systems.
We used Buffaly to normalize inputs from Epic, eClinicalWorks, and legacy databases into a single coherent model for eligibility checks.
Read Case Study →
Multi-Armed Bandits for Care
Allocating clinical time using adaptive algorithms.
Using reinforcement learning to help clinicians prioritize patients based on risk and compliance probability, not just alphabetically.
Read Case Study →
Scalable Eligibility Engines
High-volume coverage checks without the fees.
Our ontology-driven engine delivers high-accuracy checks across insurers and program types, fully auditable and designed for underserved providers.
Read Case Study →

Build with Intelligence Factory

We partner with enterprise healthcare organizations and compliance teams to build explainable, auditable AI infrastructure powered by our Buffaly grounding and policy validation layer.

What makes us different? Our foundation in neurosymbolic AI keeps agents deterministic, traceable, and safe in high-trust environments.

Learn more about our Foundations

Looking for the FairPath product? Go here.

Recent Updates

Most tool-using AI agents suffer from a static ceiling. They start with a fixed toolbox. They can choose from functions their developer explicitly registered, but when they encounter a novel problem, they cannot invent a new tool to solve it. The action set is fundamentally static.

Live walkthrough: Buffaly builds a new C# tool, loads it into the executable graph, and uses it without restarting the agent.

Buffaly is built around a different architecture: the agent's environment is a typed executable graph. A tool is just one kind of node in that graph. A workflow, a data object, a prompt, a compiled helper, and a scope rule can also be graph nodes. Because the graph is typed, Buffaly can search it by meaning, but it can also traverse it by kind, relationship, parent, child, or exact name. Because the graph is executable, Buffaly can extend it while work is in progress.

That is the central claim: Buffaly does not merely call tools. It creates new tools, activates the new capabilities in the executable graph, and then uses them in the same ongoing body of work.

This is not a theoretical capability. It is happening in real usage.

To prove this, I analyzed 1.2 million messages and over 380,000 tool calls from a real-world Buffaly instance. The telemetry shows the system isn't just generating dead code. It is actively identifying gaps, authoring new capabilities to fill them, activating those capabilities in its graph, and using the new tools immediately.

Out of hundreds of dynamic graph mutations, 70% of the newly created tools were used in the same session in which they were written. These tools were created to solve an immediate problem, which shows that Buffaly can work past roadblocks by building its own ad hoc tools on the fly.

This is the failure mode the design was built to catch: an agent blocked by a missing capability should be able to build that capability and keep going.

A tool is a typed node in an executable graph

Diagram comparing a standard AI fixed toolbox with Buffaly's executable graph

Static agents choose from a flat toolbox. Buffaly navigates and extends a typed executable graph.

In Buffaly, a tool is not just an API endpoint. It is a typed capability represented in the system's executable graph.

The deeper point is that Buffaly is not a text-only agent with a bag of external functions. It runs on native execution. Its environment is described in ProtoScript, a language built so the environment can represent itself, compile executable graph changes, and reprogram parts of its own tool surface on the fly.

That means a tool is more than a label plus a JSON schema. It can carry a natural-language purpose, typed parameters, result-rendering expectations, implementation rules, and executable behavior. Some tools are direct operations. Some are reusable procedures. Some wrap compiled C# code. Some are behavior overlays that change how the agent works in a context.

For a new reader, the useful mental model is simple: Buffaly's "tools" include both functions and workflows. A function might read a file, compile a project, query a database, or transform data. A workflow might guide the agent through onboarding, local task management, or a specialized troubleshooting procedure. Both can be represented as nodes in the same graph and discovered when needed.

Those capabilities live in the same graph as skills, entities, action roots, prompt definitions, semantic phrases, and runtime scope rules. That matters because discovery is not limited to searching a flat list of functions.

Buffaly can ask: what action best matches this intent? But it can also ask: what descendants does this action root have? What tools belong to this skill? What prototypes inherit this parent? What entity is this tool meant to operate on? What tools are already loaded? What tools are installed but not loaded? What prompt actions exist for this workflow family?

The result is progressive discovery over an extensible typed graph. Semantic search is one access path. Traversing the graph by type, parent, child, relationship, tool family, or runtime status is another. As the graph grows, the discovery surface grows with it.
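
The structural side of that discovery can be illustrated with a few lines of Python. This is a minimal sketch, not Buffaly's implementation: the `Node` and `TypedGraph` classes, the kind labels, and every node name except TroubleshootStagingDeployment are assumptions made up for the example. It shows only the idea that the same set of nodes can be walked by parent, filtered by kind, or filtered by load state.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    name: str
    kind: str                      # hypothetical kinds: "action_root", "tool", "prompt_action"
    parent: Optional[str] = None   # name of the parent node, if any
    loaded: bool = False           # whether the capability is active in the session


class TypedGraph:
    def __init__(self) -> None:
        self.nodes: dict[str, Node] = {}

    def add(self, node: Node) -> None:
        self.nodes[node.name] = node

    def children(self, name: str) -> list[Node]:
        return [n for n in self.nodes.values() if n.parent == name]

    def descendants(self, name: str) -> list[Node]:
        # Depth-first walk over parent -> child links.
        found, stack = [], [name]
        while stack:
            for child in self.children(stack.pop()):
                found.append(child)
                stack.append(child.name)
        return found

    def by_kind(self, kind: str) -> list[Node]:
        return [n for n in self.nodes.values() if n.kind == kind]

    def installed_not_loaded(self) -> list[Node]:
        return [n for n in self.nodes.values()
                if n.kind in ("tool", "prompt_action") and not n.loaded]


graph = TypedGraph()
graph.add(Node("FileAction", "action_root"))
graph.add(Node("DeploymentAction", "action_root"))
graph.add(Node("ToReadFile", "tool", parent="FileAction", loaded=True))
graph.add(Node("ToParseLegacyLog", "tool", parent="FileAction"))
graph.add(Node("TroubleshootStagingDeployment", "prompt_action", parent="DeploymentAction"))

print([n.name for n in graph.descendants("FileAction")])
print([n.name for n in graph.installed_not_loaded()])
```

Adding a node automatically makes it reachable through every one of these access paths, which is the sense in which the discovery surface grows with the graph.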

Buffaly does not merely retrieve tools. It navigates a living executable environment.

How Buffaly creates new tools

Buffaly extends the graph by authoring new graph nodes. Some of those nodes are memories or entities. Others are executable capabilities.

The most direct path is a ProtoScript tool. A small ProtoScript declaration gives the tool a name, a family, a description of what it does, typed inputs, and executable code.

A simplified shape looks like this:

protoscript
[SemanticProgram.InfinitivePhrase("to do something useful")]
prototype ToDoSomethingUseful : SomeSkillAction
{
    Description = "input - typed input used by the action.";

    function Execute(string input) : string
    {
        // implementation
    }
}

The declaration is both graph structure and executable code. It gives the environment a new typed node and gives the runtime something it can expose as a callable tool.

Buffaly has internal authoring actions to handle this. When the agent defines a new prototype, the system parses it, writes it into the active project, and inserts it into the graph.

Because the graph is flexible, this single authoring mechanism can create many different things: a new executable tool, a prompt-guided workflow, a base data type, or a new routing skill.

Not every useful tool is a direct function. Some capabilities are repeatable procedures: review this codebase, maintain a local task, onboard a user, troubleshoot a deployment, perform a safe release, or follow a domain-specific workflow.

Buffaly represents these as prompt-backed tools. A prompt workflow usually has two parts: a graph node that makes the workflow discoverable and a markdown prompt file containing the procedure. These can be discovered and called like other tools, but what they execute is reusable guidance rather than a single imperative function body.
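
The two-part shape of a prompt workflow can be sketched in Python. This is an illustration under assumptions, not Buffaly's code: the `PromptAction` class, its fields, the file name, and the procedure text are all hypothetical. The point is only that the discoverable node is small and the procedure itself lives in a separate markdown file that is read when the action is called.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class PromptAction:
    name: str            # graph node name used for discovery
    phrase: str          # natural-language phrase describing the workflow
    prompt_file: Path    # markdown file that holds the procedure

    def execute(self) -> str:
        # "Calling" the action returns reusable guidance, not a computed result.
        return self.prompt_file.read_text(encoding="utf-8")


# Hypothetical procedure text written to a temporary directory for the demo.
workdir = Path(tempfile.mkdtemp())
prompt_path = workdir / "troubleshoot_staging.md"
prompt_path.write_text(
    "# Troubleshoot a failed staging deployment\n"
    "1. Read the most recent deployment log.\n"
    "2. Confirm which step failed before changing anything.\n",
    encoding="utf-8",
)

action = PromptAction(
    name="TroubleshootStagingDeployment",
    phrase="to troubleshoot a staging deployment",
    prompt_file=prompt_path,
)
guidance = action.execute()
print(guidance)
```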

This is how the graph contains both functions and procedures.

ProtoScript is not the only implementation layer. Buffaly can also load compiled .NET code and expose it through ProtoScript wrappers. Complex IO, service clients, binary integrations, data transformations, and performance-sensitive code often belong in C#. Buffaly can import DLLs into a skill, add references/imports, and expose typed wrapper tools over the compiled implementation.

The concrete authoring surface also includes workflows for importing DLLs, installing compiled capabilities into a tool family, and creating new compiled capabilities from scratch. Per-tool provenance for those compiled-code paths has not been attributed yet and is a target for the next analysis pass, but the paths are already visible in the session database as part of Buffaly's tool-creation machinery.

That gives the system an escape hatch from prompt-only or script-only behavior. It can synthesize procedural tools, script tools, and compiled-code-backed tools.

The retained database shows these authoring paths in use. One pass analyzed 1,010 authoring rows, including 734 prototype insert/update calls, 156 prompt-workflow artifact calls, and 89 DLL/external-code workflow calls across the tracked authoring tools. That is why the article treats tool creation as a system behavior rather than a rare manual event.

The local data also shows direct ProtoScript file changes, generated files, and broader project edits. They still matter: once compiled and loaded, they become part of the executable graph.

Concrete Examples in Practice

What does this look like in an actual session? Here are three real patterns where dynamic tool creation breaks the static ceiling:

1. The Missing Parser

What was happening: An agent is tasked with debugging a system failure and encounters a legacy, undocumented application log format.

What usually goes wrong: A traditional agent halts. It lacks a tool to read the proprietary format and asks the human to extract the data.

What Buffaly caught and did: Instead of stopping, Buffaly authored a custom string-parsing tool, loaded it into the graph, and parsed the logs into a structured native DataTable.

Why it matters: It converted an unreadable artifact into queryable evidence and solved the root issue without human intervention.

2. The Domain-Specific Routine

What was happening: An agent successfully stepped through a complex, manual troubleshooting process for a failed staging deployment.

What usually goes wrong: That hard-won operational knowledge evaporates when the session ends. The next time it happens, the agent starts from scratch.

What Buffaly caught and did: Recognizing a repeatable workflow, the agent authored a new Prompt Action—a reusable procedure—encapsulating the exact diagnostic steps.

Why it matters: The agent taught itself a new workflow. Future sessions can now natively discover and execute the TroubleshootStagingDeployment action.

3. The API Escape Hatch

What was happening: The agent needed to extract specific telemetry from a subsystem, but the standard reporting tools didn't expose the required fields.

What usually goes wrong: The agent falls back to generic, high-risk command-line scripting or gives up.

What Buffaly caught and did: Buffaly imported a compiled C# telemetry library, wrote a typed ProtoScript wrapper over the exact method needed, and exposed it as a new, safe tool.

Why it matters: It bypassed a limitation safely by synthesizing a structured, typed capability rather than relying on brittle shell scripts.

Diagram showing Buffaly hitting a blocker, creating a tool, hot-loading it, and continuing the same session context

Buffaly can hit a blocker, create a missing tool, hot-load it, and keep working in the same session.

Hot-swapping the graph

Creating a graph node is not enough. The new capability has to become available to the running agent.

Buffaly handles activation without restarting the entire agent. A specific action can be loaded into a session, while larger graph changes can refresh the executable graph so newly authored tools, prompt workflows, imports, or wrappers become discoverable and callable.

The source evidence matches the runtime behavior. Candidate tools can be already loaded, automatically loaded, or marked as requiring a manual load. A registrar projects executable graph nodes into callable runtime tools, including descriptions and parameter schemas. In short: the function-tool surface is projected from the executable graph.
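
To make "projected from the executable graph" concrete, here is a minimal Python sketch of a registrar that derives a callable tool description and parameter schema from a typed definition. It is an analogy under assumptions, not Buffaly's registrar: `project_tool`, the JSON-schema-like output shape, and the example tool are hypothetical. It uses Python's standard `inspect` module to read the typed signature.

```python
import inspect

# Mapping from Python annotations to JSON-schema-style type names.
_JSON_TYPES = {str: "string", int: "integer", float: "number", bool: "boolean"}


def project_tool(func) -> dict:
    """Build a tool description and parameter schema from a typed function."""
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        if param.annotation not in _JSON_TYPES:
            # Fail fast: an untyped parameter cannot be projected safely.
            raise TypeError(f"parameter {name!r} needs a supported type annotation")
        properties[name] = {"type": _JSON_TYPES[param.annotation]}
        if param.default is inspect.Parameter.empty:
            required.append(name)
    return {
        "name": func.__name__,
        "description": inspect.getdoc(func) or "",
        "parameters": {"type": "object", "properties": properties, "required": required},
    }


def to_parse_legacy_log(path: str, max_lines: int = 1000) -> str:
    """Parse a legacy log file into structured rows."""
    return f"parsed {path} (up to {max_lines} lines)"


spec = project_tool(to_parse_legacy_log)
print(spec)
```

Because the schema is derived rather than hand-written, a newly authored definition needs no separate registration step to get a description and parameter contract.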

Agent profiles scope that projection. A profile defines the root action and root entity types that an agent is allowed to see. Other profiles can expose a narrower surface, such as watcher sessions. This is how the graph can be larger than any one agent's visible toolbox.
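
Profile scoping can be pictured as a filter over the projected tools. The sketch below is an assumption-laden illustration: the `Tool` and `AgentProfile` classes, the root names, and the profile names are invented for the example and do not come from Buffaly's source.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tool:
    name: str
    root: str   # action root the tool descends from


@dataclass(frozen=True)
class AgentProfile:
    name: str
    allowed_roots: frozenset

    def visible(self, tools):
        # Only tools under an allowed root are projected into this agent's toolbox.
        return [t.name for t in tools if t.root in self.allowed_roots]


ALL_TOOLS = [
    Tool("ToReadFile", "FileAction"),
    Tool("ToParseLegacyLog", "FileAction"),
    Tool("ToReviewTurnDigest", "WatcherAction"),
]

executor = AgentProfile("executor", frozenset({"FileAction"}))
watcher = AgentProfile("watcher", frozenset({"WatcherAction"}))

print(executor.visible(ALL_TOOLS))
print(watcher.visible(ALL_TOOLS))
```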

This is where the architecture becomes more than a static ontology. It is an executable graph that can be updated and activated.

That does not mean the entire system has to stop. Because Buffaly uses a distributed runtime model, workers can be recycled or rehydrated while the broader session and ecosystem continue. In practical terms, the agent changes the graph, brings the new graph state online, and continues the task with the new capabilities available.

The operational point is simple: graph edits do not stay inert. They become callable capabilities that the agent can use to continue the work.
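
A rough Python analogy for "create, activate, use" in a single running process follows. It is not how Buffaly compiles ProtoScript or loads C#; the `hot_load` helper, the `REGISTRY` dictionary, and the log format are assumptions for illustration. The sketch compiles newly authored source at runtime, checks that it defines the expected entry point, and registers it so it can be called without restarting anything. Executing generated source this way is only appropriate inside a sandbox or other trusted boundary.

```python
import types

REGISTRY = {}   # hypothetical table of callable tools, keyed by name


def hot_load(module_name: str, source: str, entry_point: str):
    """Compile source text, validate it, and register its entry point as a tool."""
    # compile() raises SyntaxError before anything is registered,
    # so a broken definition never becomes callable.
    code = compile(source, f"<{module_name}>", "exec")
    module = types.ModuleType(module_name)
    exec(code, module.__dict__)
    func = getattr(module, entry_point, None)
    if not callable(func):
        raise TypeError(f"{entry_point!r} is not a callable in {module_name!r}")
    REGISTRY[entry_point] = func
    return func


# Source text an agent might author mid-session (hypothetical log format).
NEW_TOOL_SOURCE = '''
def parse_legacy_line(line: str) -> dict:
    """Split 'LEVEL|component|message' into named fields."""
    level, component, message = line.split("|", 2)
    return {"level": level, "component": component, "message": message}
'''

hot_load("authored_tools", NEW_TOOL_SOURCE, "parse_legacy_line")
row = REGISTRY["parse_legacy_line"]("ERROR|billing|claim rejected")
print(row)
```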

Discovery is graph navigation, not just search

Buffaly can find tools in several ways.

It can use semantic search over action descriptions. It can list tool families and the tools inside them. It can inspect installed runtime capabilities. It can bind target entities separately from actions. It can walk descendants, inspect parents, and use scoped roots. It can ask what is already loaded. It can load a candidate tool when needed.

The user-facing discovery tools make those paths explicit. One search path finds candidate actions by operational meaning and reports whether they are loaded. Another searches separately for target objects, accounts, repositories, projects, environments, or other entities. Listing and prototype-inspection tools expose the graph-structured side of discovery.

The catalog code backs this up. It builds tool families, resolves each family's root, finds executable descendants, finds prompt-backed workflows, and extracts human-readable action phrases for display and discovery. This is graph navigation plus semantic retrieval, not a single flat search index.

The master prompt turns that architecture into operating policy: bind to ontology first, prefer typed domain actions over shell, search candidate actions with multiple phrasings when the route is unclear, search candidate entities separately, use skill/action listing as secondary discovery, and only ask a clarifying question after tool-assisted discovery fails.

These mechanisms compose. A user request might begin as a semantic phrase, resolve to an action candidate, bind to an entity, inspect the skill tree, load a tool, and then execute it. If the right tool does not exist yet, the same graph can be extended.
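
The chain from phrase to execution can be sketched in Python. This is a simplified stand-in: word overlap replaces real semantic search, and the `Action` class, the `search_candidate_actions` and `resolve_and_run` functions, and the two example actions are hypothetical. It shows the order of operations only: try several phrasings, pick a candidate, load it if it is not loaded, then execute.

```python
from dataclasses import dataclass
from typing import Callable

STOP_WORDS = {"to", "a", "the"}


@dataclass
class Action:
    name: str
    phrase: str                   # action phrase used for matching
    run: Callable[[str], str]     # implementation behind the action
    loaded: bool = False


ACTIONS = [
    Action("ToReadFile", "to read a file", lambda arg: f"read {arg}", loaded=True),
    Action("ToCompileProject", "to compile a project", lambda arg: f"compiled {arg}"),
]


def _words(text: str) -> set:
    return set(text.lower().split()) - STOP_WORDS


def search_candidate_actions(phrasings):
    """Rank actions by their best word overlap across several phrasings."""
    scored = []
    for action in ACTIONS:
        score = max(len(_words(action.phrase) & _words(p)) for p in phrasings)
        if score:
            scored.append((score, action))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [action for _, action in scored]


def resolve_and_run(phrasings, arg: str) -> str:
    candidates = search_candidate_actions(phrasings)
    if not candidates:
        # In the article's terms, this is the point where a new tool would be authored.
        raise LookupError("no candidate action found")
    action = candidates[0]
    if not action.loaded:
        action.loaded = True      # stand-in for loading the tool into the session
    return action.run(arg)


result = resolve_and_run(["build the project", "compile a project"], "Billing.sln")
print(result)
```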

That makes discovery progressive in two senses: the agent progressively resolves intent into a typed tool call, and the graph itself can progressively grow new tool nodes over time.

Execution and persistence

Once a tool is loaded, the runtime exposes it as a callable function tool. The agent emits a tool call. The runtime dispatches that call to the appropriate implementation.

For a ProtoScript tool, dispatch enters the Execute(...) function. That function may call other ProtoScript helpers, C# imports, JSON web services, process wrappers, database helpers, or other tools.

For an OpsAction, execution is backed by runtime or host code. For a C#-backed tool, a ProtoScript wrapper may dispatch into an imported assembly. For a prompt action, the execution is a reusable prompt-guided workflow rather than a simple code body.

That implementation split is intentional. ProtoScript is the graph-native declaration and glue layer: it names the capability, places it in the hierarchy, attaches semantic phrases, declares typed parameters, and exposes Execute(...). C# is the heavier implementation layer for validation, IO, service clients, indexing, transformations, and other behavior that should not live in a .pts wrapper. DLL-backed workflows extend the same pattern to compiled capabilities created or imported at runtime.

Tool results are not just raw text. They can be large values handled through StringRef, or structured UI envelopes containing metadata formats, result types, and payloads so the UI can render specialized outputs natively. The contract is typed and fail-fast: tools prefer typed parameters and explicit diagnostics over silent normalization.
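
As an illustration of a typed, fail-fast contract with a structured result, here is a small Python sketch. The `ResultEnvelope` fields and the `to_summarize_rows` tool are assumptions; the article does not specify Buffaly's actual envelope layout. The tool rejects a wrongly typed argument with an explicit diagnostic instead of silently converting it.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class ResultEnvelope:
    result_type: str   # hypothetical field: tells the UI what kind of result this is
    format: str        # hypothetical field: how the payload is encoded
    payload: Any


def to_summarize_rows(rows: list, limit: int) -> ResultEnvelope:
    # Fail fast: do not coerce "5" to 5 or True to 1.
    if not isinstance(limit, int) or isinstance(limit, bool):
        raise TypeError(f"limit must be an int, got {type(limit).__name__}")
    if limit <= 0:
        raise ValueError("limit must be positive")
    return ResultEnvelope(result_type="table", format="json", payload=rows[:limit])


envelope = to_summarize_rows([{"id": 1}, {"id": 2}, {"id": 3}], limit=2)
print(envelope)
```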

Crucially, every tool call and result is persisted. The session database records tool-call rows and tool-result rows, tracking arguments, timestamps, turn identifiers, and sequence numbers. This durable memory is exactly why we can reconstruct the agent's tool-creation behavior from local telemetry.
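
To show how persisted rows make this kind of reconstruction possible, here is a minimal sketch using Python's built-in `sqlite3` module. The table names, column names, and the three sample rows are assumptions and synthetic data, not Buffaly's schema or telemetry. The query counts how many created tools were called later in the same session in which they were created.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE tool_creations (session_id TEXT, seq INTEGER, tool_name TEXT);
CREATE TABLE tool_calls (
    session_id TEXT, turn_id INTEGER, seq INTEGER,
    tool_name TEXT, arguments TEXT, created_at TEXT
);
""")

# Synthetic sample rows for illustration only.
conn.executemany("INSERT INTO tool_creations VALUES (?, ?, ?)", [
    ("s1", 10, "ToParseLegacyLog"),
    ("s1", 30, "ToExportReport"),
    ("s2", 5, "ToReadTelemetry"),
])
conn.executemany("INSERT INTO tool_calls VALUES (?, ?, ?, ?, ?, ?)", [
    ("s1", 4, 12, "ToParseLegacyLog", "{}", "2024-01-01T10:00:00"),
    ("s3", 1, 2, "ToExportReport", "{}", "2024-01-02T09:00:00"),   # different session
    ("s2", 2, 9, "ToReadTelemetry", "{}", "2024-01-01T11:00:00"),
])

# A creation counts as same-session reuse if a later call in that session uses the tool.
SAME_SESSION_REUSE = """
SELECT COUNT(*) FROM tool_creations AS c
WHERE EXISTS (
    SELECT 1 FROM tool_calls AS t
    WHERE t.session_id = c.session_id
      AND t.tool_name = c.tool_name
      AND t.seq > c.seq
)
"""
reused = conn.execute(SAME_SESSION_REUSE).fetchone()[0]
total = conn.execute("SELECT COUNT(*) FROM tool_creations").fetchone()[0]
print(f"{reused} of {total} created tools were reused in the same session")
```

Sequence numbers matter here: without them, a call that happened before the tool was rewritten could be miscounted as use of the new version.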

Prompt actions are tools too, but a different kind

A direct executable tool performs an operation. Read a file. Query a database. Compile a project. Call a service. Transform a table. Search a folder.

A prompt action performs a procedure. It gives the agent a reusable operating mode or workflow. It might tell the agent how to maintain a local task artifact, onboard a user, review commits, troubleshoot a deployment, or follow a domain-specific playbook.

The session itself used examples of this pattern: LocalTaskPromptAction provides the durable local-task workflow, and OnboardingPromptAction provides guided onboarding behavior. They are called like tools, but their job is to load procedural guidance that changes how the agent performs a multi-step task.

Both are tools in the graph, but they occupy different parts of the capability spectrum.

That distinction is important because Buffaly is not only extending a function library. It is also extending its procedural memory. It can add a new command-like action and it can add a new way of working.

Context prompts are different again. A ContextPrompt is a situational behavior overlay. It is not the work product, and it is not the same as a direct executable tool. It shapes how the agent should behave in a context, such as coding, onboarding, or a specialized workflow.

Constraints keep graph extension from becoming chaos

A self-extending tool graph needs constraints.

Buffaly uses several layers:

  • agent profiles define root action and entity scopes;
  • skills group related capabilities;
  • action roots constrain families of tools;
  • typed parameters constrain calls;
  • source and runtime paths are scoped;
  • secrets are handled separately from ordinary ontology facts;
  • prompt guidance prefers typed authoring tools over direct .pts edits;
  • compile and activation checks validate the graph before use;
  • Plan, Scratch, and durable task artifacts preserve working state;
  • non-destructive defaults reduce accidental damage.

The master prompt also constrains decision-making: use ontology binding before freeform guessing, prefer typed tools, use ToSearchCandidateActions and ToSearchCandidateEntities before asking for clarification, and pass through the strict Question Gate only when ambiguity is real, consequential, and not resolvable with available tools.
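
The gate described above reduces to a conjunction of three conditions. The following Python sketch is one reading of that rule; the `Ambiguity` class and its field names are hypothetical, and the real gate lives in prompt guidance rather than in code like this.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ambiguity:
    is_real: bool                 # more than one plausible interpretation remains
    is_consequential: bool        # a wrong guess would be costly or hard to undo
    resolvable_with_tools: bool   # discovery tools could still settle it


def passes_question_gate(ambiguity: Ambiguity) -> bool:
    """Ask the user only when all three conditions from the policy hold."""
    return (ambiguity.is_real
            and ambiguity.is_consequential
            and not ambiguity.resolvable_with_tools)


# Two similarly named repositories exist, but an entity search can tell them apart.
print(passes_question_gate(Ambiguity(True, True, True)))
# Discovery is exhausted and the choice decides which environment gets changed.
print(passes_question_gate(Ambiguity(True, True, False)))
```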

The same mechanism that creates tools also creates normal ontology objects, prompt actions, memories, and project facts. So analysis has to classify changes carefully. A prototype insertion is not automatically a new tool. It might be a remembered environment, a Visual Studio project, a database entity, an action root, or a prompt action.

That is why the article distinguishes callable tools from ordinary ontology objects and other graph mutations.

The Data: Active Tool Creation

The runtime catalog reveals how much the executable graph has grown in practice.

The baseline source repository started with a few hundred core capabilities. Today, the active runtime instance has accumulated over 1,600 active prototype declarations across nearly 200 .pts files. It currently offers 744 tool-like capabilities (610 executable tools and 134 prompt actions).

These capabilities are a dynamic mix of ProtoScript functions, procedural prompt workflows, and compiled C# code wrappers, all authored by the agent and surfaced through typed discovery and execution paths. The retained database records 695 distinct tool names actually being called in production.

The operating pattern is straightforward: Buffaly encounters a gap, authors a capability, registers it as a callable node, and uses that new capability to keep working. The retained telemetry shows newly created callable tools and prompt actions becoming operational rather than sitting as dead definitions.

Design conclusions

Several design conclusions fall out of this.

First, tool creation is not a side feature. It is an essential capability. If the agent can only call a fixed toolbox, it is bounded by whatever its developer anticipated. If the agent can extend the executable graph, it can adapt its action vocabulary to new domains, new workflows, and new integrations.

Second, the graph model matters. Semantic search alone would not be enough. Buffaly can search by meaning, but it can also inspect type, parent, descendant, skill, action root, entity relationship, and runtime load state. That is what makes discovery progressive over an extensible graph rather than lookup over a static list.

Third, prompt skills and executable tools are complementary. Some capabilities are best expressed as direct typed functions. Others are best expressed as reusable procedures. A mature agent needs both.

Fourth, ProtoScript gives the system an incremental extension layer. It is close enough to the ontology to describe capabilities as graph nodes, but executable enough to run real work. C# and DLL integration then provide a path to heavier compiled implementations.

Fifth, persistence matters. Because tool calls, results, turn IDs, sequence numbers, and timestamps are stored, the system's evolution can be measured. The database does not just record conversation history. It records the growth of the executable graph.

The Limit of the Static Ceiling

While this data reflects a specific observational window—and does not capture the creation history of every legacy tool—the operational pattern is undeniable.

When we compare the baseline source repository to the active runtime project, the active graph has accumulated nearly double the files and over 1,000 additional prototype declarations. The runtime is significantly larger and more capable than the source it started with, entirely driven by the agent adapting to its own roadblocks.

Conclusion

Static toolboxes limit AI agents to the imagination of their developers.

Buffaly breaks out of that trap by treating tools as nodes in an ontology-backed, hot-swappable executable graph. It progressively resolves intent, navigates its capabilities, identifies gaps, writes missing logic, activates the new capability, and keeps working.

The local data proves the core capability. This instance created real callable tools and prompt actions, registered them as executable capabilities, and then used many of those capabilities soon afterward. That is the difference between a static tool-using agent and an agent that can grow its own action vocabulary.

The agent is not just using a toolbox. It is growing one.

I gave my AI a conscience.

Not feelings. Not morality. Not some sci-fi spark of awareness. Just a second architectural layer that watches the first one work and quietly reminds it when it is drifting, guessing, stopping short, or forgetting what it was supposed to be doing.

The name stuck because it captures the feel of the thing. A conscience remembers the obligation even when the moment-to-moment pressure of getting work done wants to push it aside. That is exactly what this component does.

The System 1/System 2 comparison is useful here. In Daniel Kahneman's terms, System 1 is fast, fluent, automatic, and usually efficient. System 2 is slower, more deliberate, and more willing to check whether the easy answer actually satisfies the problem.

AI agents have a version of the same problem. A normal agent loop is fast, fluent, and action-oriented. It reads the context, picks the next tool, updates the plan, summarizes the result, and keeps going. That is useful. It is also exactly why long-running agent work fails in predictable ways.

Once the task gets long enough, the agent has to preserve intent, remember unfinished work, validate evidence, survive compaction, and notice when a plausible success signal is not the real success condition. That is where the conscience layer matters.

To see whether this was actually happening in real work, I analyzed roughly 1.3 million local Buffaly timeline messages across working sessions and watcher sessions. The point was not to prove a laboratory result. It was to see how the conscience behaves when agents are doing messy, real, tool-using work.

Executive summary

  • 21%: Level 2 intervenes in about one out of five watcher digest turns.
  • 65%: Strict interventions are followed by real corrective tool work most of the time.
  • Planning and task accounting: Most corrections are not glamorous. They repair plans, verify evidence, inspect files, and close loops.
  • Longer tasks benefit more: The value rises in long and compacted sessions, exactly where context and obligations are easiest to lose.

The Problem with a Single Agent

Most agent setups are a single prompt trying to do everything at once.

The agent is told to move fast but be thorough. Make reasonable assumptions but never guess. Work to completion but do not overdo it. Use tools, but do not use them unnecessarily. For short tasks, these contradictions rarely matter. For longer or messier work, they surface constantly.

The executor lives inside the work. It sees the latest message, the current file, the fresh tool output, the error that just appeared. Its natural bias is forward motion, which is usually good. But that same bias produces predictable failures.

Common single-agent failure modes

  • Wrong target. It assumes the obvious file or entity and charges ahead on a stale assumption.
  • Wrong depth. It turns a five-minute request into a deep investigation because its instructions say to work to completion.
  • Motion without action. It narrates what it is about to do instead of doing it.
  • Premature victory. It declares success after partial progress because the answer sounds right.
  • Loose ends. It forgets to update the plan, check in changes, or verify the result.

I watched these patterns repeat across real tasks. A single model, no matter how capable, struggles to calibrate its own autonomy in real time. The right behavior changes mid-task, and the executor is too close to the work to notice when it has quietly switched modes.

Level 1 and Level 2

So I split the system.

Level 1

The executor

System 1 in the analogy. It has world authority. It reads files, writes code, calls tools, runs tests, talks to APIs, updates the plan, and ships results. Speed and momentum live here.

Level 2

The conscience

System 2 in the analogy. It has no world authority. It cannot edit files, call outside-world tools, or do the user's work. Its job is to watch completed turns and change Level 1's attention when necessary.

The boundary is strict and deliberate.

Level 1 changes the environment. Level 2 changes Level 1's attention.

That separation prevents the usual two-agent chaos of conflicting edits, duplicated work, and race conditions.
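One way to picture that boundary is as two tool registries with different capabilities. The sketch below is illustrative only: `ToolRegistry`, `write_file`, `read_file`, and `send_nudge` are hypothetical names chosen for the example, not Buffaly's actual API.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToolRegistry:
    """Hypothetical registry: an agent can only call the tools it was given."""
    tools: Dict[str, Callable[..., object]] = field(default_factory=dict)

    def call(self, name: str, *args: object) -> object:
        if name not in self.tools:
            raise PermissionError(f"tool not available at this level: {name}")
        return self.tools[name](*args)


# Shared state standing in for the real environment and the worker's session.
files: Dict[str, str] = {}
level1_inbox: List[str] = []


def write_file(path: str, text: str) -> None:
    files[path] = text


def read_file(path: str) -> str:
    return files[path]


def send_nudge(message: str) -> None:
    level1_inbox.append(message)


# Level 1 has world authority: it can read and change the environment.
level1 = ToolRegistry({"read_file": read_file, "write_file": write_file})

# Level 2 can read evidence and message Level 1, but cannot change the world.
level2 = ToolRegistry({"read_file": read_file, "send_nudge": send_nudge})

level1.call("write_file", "post.md", "draft")
level2.call("send_nudge", "Verify post.md is the intended file.")
```

In this sketch a call such as `level2.call("write_file", ...)` raises `PermissionError`, so the watcher's only lever is the message it places in Level 1's inbox.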

What It Catches

Here is a simple example.

A Gemini worker was asked to update a blog article. It made an edit and reported completion. But it had patched the wrong file.

A separate GPT-5.5 watcher compared the worker's claim against the session memory, the intended file, and the artifact that actually changed. It caught the mismatch before the wrong edit became the new baseline for the rest of the work.

A separate watcher catches a wrong-file edit. Gemini was the worker; GPT-5.5 was the watcher. The important point is not the specific models. It is that one memory loop was able to check another.

This is the kind of mistake that makes agent systems hard to trust. The worker did not need more intelligence in the abstract. It needed another process to notice that its claimed completion did not match the durable record of the task.

The interventions are rarely profound. They are usually boring, high-leverage corrections.

Typical Level 2 nudges

You did good work in that file, but you never checked in the changes.

You are on the same plan step for the third turn with no material progress. Next action is X.

You wrote the code, but you did not run the test that would prove it works.

This is a simple config tweak. No need for a full investigation.

You are narrating the next step. Just do it.

That sounds plausible. Verify it against the actual file or runtime state.

That is exactly where a fast executor starts to lose the thread. Level 2 supplies the operational memory that keeps obligation intact.
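The "same plan step for the third turn" nudge can be approximated with a very small check. This is a minimal sketch, assuming each turn is recorded as a hypothetical `(plan_step, made_progress)` pair; how "material progress" is actually judged is not specified in this post.

```python
from typing import List, Optional, Tuple

# Hypothetical record of a turn: (active plan step, whether it made progress).
Turn = Tuple[str, bool]


def stalled_step(turns: List[Turn], limit: int = 3) -> Optional[str]:
    """Return the step name if the last `limit` turns share one step with no progress."""
    if len(turns) < limit:
        return None
    recent = turns[-limit:]
    step = recent[0][0]
    if all(name == step and not progressed for name, progressed in recent):
        return step
    return None  # Either the step changed or something moved forward.


history = [("fix test", False), ("fix test", False), ("fix test", False)]
print(stalled_step(history))  # prints "fix test"
```

The check returns `None` whenever the step changes or any recent turn made progress, which matches the idea that the watcher should stay quiet unless the worker is genuinely stuck.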

How the Watcher Actually Works

Level 2 is not just another prompt pasted onto the end of the chat. It is a separate agent session wired into the runtime.

When Level 1 finishes a turn, Buffaly emits a completed-turn event. A deterministic callback builds a compact digest of what just happened: the user request, tool calls, outputs, assistant response, and relevant session state. That digest is sent to Level 2.
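A deterministic digest builder of this kind might look like the sketch below. The field names follow the list in the paragraph above; the types (`CompletedTurn`, `ToolCall`), the `build_digest` function, and the 200-character clip are assumptions made for illustration, not the Buffaly implementation.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


@dataclass(frozen=True)
class ToolCall:
    name: str
    output: str


@dataclass(frozen=True)
class CompletedTurn:
    user_request: str
    tool_calls: List[ToolCall]
    assistant_response: str
    session_state: Dict[str, str]


def build_digest(turn: CompletedTurn, max_chars: int = 200) -> Dict[str, Any]:
    """Deterministically compress a completed turn; no model call is involved."""
    def clip(text: str) -> str:
        return text if len(text) <= max_chars else text[:max_chars] + "..."

    return {
        "user_request": clip(turn.user_request),
        "tool_calls": [
            {"name": c.name, "output": clip(c.output)} for c in turn.tool_calls
        ],
        "assistant_response": clip(turn.assistant_response),
        # Sort keys so the same turn always yields the same digest.
        "session_state": dict(sorted(turn.session_state.items())),
    }


turn = CompletedTurn(
    user_request="Update the blog article",
    tool_calls=[ToolCall("patch_file", "x" * 500)],
    assistant_response="Done.",
    session_state={"target": "post.md", "plan_step": "edit article"},
)
digest = build_digest(turn)
```

Because the builder is plain code rather than a model call, the same turn always produces the same digest, and long tool output is clipped before it reaches the watcher.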

Then Level 2 runs its loop:

  1. Reconstruct the real user goal and current phase.
  2. Check the plan, scratch notes, and task artifacts.
  3. Resolve the entities involved.
  4. Find the governing structure, such as the active plan step, workflow, prompt skill, or semantic action path.
  5. Compare what Level 1 actually did against what should have happened.

Most of the time it stays silent. When it intervenes, it sends a short, labeled message back into Level 1's session.
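The five steps can be sketched as a single review function that returns either nothing or one short labeled message. In the real system Level 2 is a separate agent session; the rule-based `review_turn` below is only a stand-in, and the digest key `changed_files` and the plan fields `name`, `status`, and `target` are hypothetical.

```python
from typing import Any, Dict, List, Optional


def review_turn(digest: Dict[str, Any], plan: List[Dict[str, str]]) -> Optional[str]:
    """Return a short labeled nudge, or None to stay silent."""
    # 1. Reconstruct the goal (here: simply read it from the digest).
    goal = str(digest["user_request"])

    # 2. Check the plan: find the step currently marked active.
    active = next((s for s in plan if s["status"] == "active"), None)
    if active is None:
        return f"[Level 2] No active plan step for goal: {goal}. Start one."

    # 3. Resolve entities: which file was intended, and which files changed?
    intended = active["target"]
    changed = [str(p) for p in digest.get("changed_files", [])]

    # 4. The governing structure in this sketch is just the active plan step.
    # 5. Compare what happened against what should have happened.
    if changed and intended not in changed:
        return (
            f"[Level 2] Step '{active['name']}' targets {intended}, "
            f"but the turn changed {', '.join(changed)}."
        )
    return None  # Most turns: nothing to say.


plan = [{"name": "edit article", "status": "active", "target": "post.md"}]
quiet = review_turn({"user_request": "Update the blog", "changed_files": ["post.md"]}, plan)
nudge = review_turn({"user_request": "Update the blog", "changed_files": ["draft.md"]}, plan)
```

Here `quiet` is `None` because the intended file was the one that changed, while `nudge` flags the wrong-file edit. Returning `None` by default makes silence the normal case rather than a special one.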

The second loop in one view. System 1 moves quickly; System 2 reads durable memory and only intervenes when the path drifts.

The Boundary Is the Design

Level 2 is intentionally weaker than Level 1. That is not a limitation to hide; it is the design. If the watcher could browse the web, edit files, deploy code, call arbitrary APIs, and complete the task itself, it would simply be a second, competing executor. Level 2 would stop being Level 2.

Its job is narrower: read the evidence trail, compare Level 1's current path to the intended path, and send one correction when the next action is likely to go wrong.

Level 2 can inspect the memory around the work: the current plan, scratch notes, task artifacts, session history, tool traces, ontology bindings, and the digest of the completed turn.

The slower questions

What was the user actually asking for?

What phase is the task in?

What evidence exists?

What plan step is stale?

What validation is still missing?

Is Level 1 stopping because the work is complete, or because the current answer merely sounds complete?

The watcher has a brain and durable memory, but not hands. That is the point.

System 1 and System 2 are separate memory loops. The worker carries active working memory; the watcher carries durable supervisory memory.

The Intervention Contract

Level 2 is not allowed to ramble.

When it speaks, the format is constrained on purpose:

Why now

Why it is speaking in this turn.

Active step

What work Level 1 is supposed to be doing.

Next action

Exactly one concrete correction.

Evidence

What would prove that action worked.

No essays. No alternate full plans. No style notes. No while-you-are-here suggestions.

Just a nudge back onto the path.

That restraint is what keeps the watcher useful. It is not valuable because it talks more. It is valuable because it interrupts the exact moment when Level 1 is about to turn partial progress into false completion.
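The four-part contract maps naturally onto a small validated record. This is a sketch under stated assumptions: the `Intervention` class, the single-line rule, and the 240-character cap are illustrative choices, since the post specifies only the four labels and the "no essays" constraint.

```python
from dataclasses import dataclass

MAX_FIELD_CHARS = 240  # Assumed cap; the post only says "no essays".


@dataclass(frozen=True)
class Intervention:
    why_now: str      # Why it is speaking in this turn.
    active_step: str  # What work Level 1 is supposed to be doing.
    next_action: str  # Exactly one concrete correction.
    evidence: str     # What would prove that action worked.

    def __post_init__(self) -> None:
        for name in ("why_now", "active_step", "next_action", "evidence"):
            value = getattr(self, name).strip()
            if not value:
                raise ValueError(f"{name} must not be empty")
            if "\n" in value or len(value) > MAX_FIELD_CHARS:
                raise ValueError(f"{name} must be one short line")

    def render(self) -> str:
        """Format the message that is sent back into Level 1's session."""
        return "\n".join([
            f"Why now: {self.why_now}",
            f"Active step: {self.active_step}",
            f"Next action: {self.next_action}",
            f"Evidence: {self.evidence}",
        ])


nudge = Intervention(
    why_now="Completion was claimed but the tests were never run.",
    active_step="Verify the parser fix",
    next_action="Run the parser test suite.",
    evidence="Test output showing all parser tests passing.",
)
print(nudge.render())
```

Rejecting empty or multi-line fields at construction time is one way to make the "one correction, no alternate plans" rule hard to violate by accident.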

What It Catches in Practice

False success signal

In one migration, Level 1 treated HTTP 200 as success. Level 2 caught that the user had asked for working pages, not reachable shell routes. The route existed, but the actual UI had not been migrated.

Named, not done

In another task, Level 1 named the next diagnostic step ("inspect the logs") and then stopped without doing it. Level 2 pushed it back into the actual evidence-gathering step.

Stale ledger

In a validation task, Level 1 gathered useful API evidence but left the plan stale and skipped the more important staging validation. Level 2 repaired the task ledger before the next turn resumed from the wrong state.

Real agent failure is often not a grand reasoning collapse. It is a missed check, a stale plan, an unverified assumption, a wrong target, or a plausible answer accepted too soon.
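The false-success case is easy to state in code: a reachable route is a weaker condition than a migrated page. The sketch below is hypothetical throughout; the function names and the `claims-grid` marker are invented for illustration, and a real check would inspect whatever the migrated UI is actually supposed to contain.

```python
from typing import Iterable


def route_is_reachable(status_code: int) -> bool:
    """The weak signal: the route answers."""
    return status_code == 200


def page_is_migrated(status_code: int, body: str, required_markers: Iterable[str]) -> bool:
    """The stronger condition: the route answers and the expected UI is present."""
    return route_is_reachable(status_code) and all(m in body for m in required_markers)


# An empty shell route versus a page that contains the expected UI element.
shell = "<html><body><div id='app'></div></body></html>"
migrated = "<html><body><div id='app'><table id='claims-grid'></table></div></body></html>"

print(route_is_reachable(200))                        # True for both pages
print(page_is_migrated(200, shell, ["claims-grid"]))  # False: shell only
print(page_is_migrated(200, migrated, ["claims-grid"]))  # True
```

Both pages pass the HTTP 200 check, but only one satisfies what the user asked for, which is the gap the watcher caught.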

What the Data Shows

The point of Level 2 is not that it feels clever. The question is whether it changes behavior.

I analyzed local Buffaly session telemetry to look for that signal. This is not a controlled experiment, but it is real operational evidence from watcher digests, structured intervention messages, tool calls, and compaction lifecycle events.

| What happened | Count |
| --- | --- |
| System 1 / System 2 session pairs | 624 |
| Pairs with at least one watcher digest | 388 |
| Watcher digest messages | 14,021 |
| Strict structured interventions | 2,977 |
| Strict interventions as share of watcher digests | 21.23% |
| Strict interventions with same-turn tool follow-up | 1,947 |
| Same-turn tool follow-up share | 65.40% |
System 2 watches many turns, intervenes selectively, and often sends System 1 back to concrete work.

The important shape is this: Level 2 watches many turns, intervenes in a minority of them, and when it does intervene the worker often returns immediately to concrete work.

That is what I want from a conscience layer: mostly quiet, occasionally decisive.

The telemetry does not prove that every intervention caused the next tool call. It does show that the supervisory loop is active and behaviorally correlated with renewed work: plan repair, evidence inspection, validation, file edits, and source checks.
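The two headline shares follow directly from the counts reported above. As a quick arithmetic check (the variable names are just labels for those counts):

```python
watcher_digests = 14_021       # Watcher digest messages
strict_interventions = 2_977   # Strict structured interventions
tool_follow_ups = 1_947        # Strict interventions with same-turn tool follow-up

intervention_share = strict_interventions / watcher_digests
follow_up_share = tool_follow_ups / strict_interventions

print(f"{intervention_share:.2%}")  # prints 21.23%
print(f"{follow_up_share:.2%}")     # prints 65.40%
```

Note that the second share is taken over strict interventions, not over all digests: it describes what happens after Level 2 speaks, not how often it speaks.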

The Boring Accounting

The first action after a strict intervention is especially revealing:

| First action after strict intervention | Count | Share |
| --- | --- | --- |
| Set plan explanation | 402 | 20.7% |
| Read session plan | 186 | 9.6% |
| Add plan task | 179 | 9.2% |
| Complete plan task | 136 | 7.0% |
| Search source/files | 108 | 5.6% |
| Initialize plan/scratch | 84 | 4.3% |
| File write/copy/patch operation | 74 | 3.8% |
| Start plan task | 71 | 3.7% |
| Read targeted file block | 70 | 3.6% |
| File/source inspection | 64 | 3.3% |
| Git/diff/status inspection | 59 | 3.0% |
After System 2 intervenes, the first response is often continuity repair: plan accounting, scratch state, source inspection, and then concrete work.

This is the unglamorous part that matters.

A lot of agent failure is not exotic reasoning failure. It is ledger failure. The goal was not preserved. The plan was stale. The validation was missing. The diff was not checked. The worker remembered motion, but not obligation.

Level 2 often fixes that by forcing the system back to the ledger.

Why This Matters More as Tasks Get Longer

The longer the task, the more valuable the watcher becomes.

Buffaly agents compact their working context so long-running sessions can continue. That is necessary. But compaction also creates risk. Reasons, constraints, targets, and unfinished obligations can become blurry.

That is the agent version of the System 1/System 2 problem. In people, the fast process loses track of constraints because its attention is limited. The agent loses track because its context is finite, summarized, and compacted.

In the local telemetry, sessions with source compaction lifecycle events accounted for 12,658 observed digest turns and 2,651 structured interventions. Sessions without source compaction lifecycle events accounted for 1,343 observed digest turns and 323 structured interventions.

That does not prove a specific compaction caused a specific intervention. But it matches the real shape of the problem: the longer and more compressed the work gets, the more valuable a separate supervisory memory becomes.

Compaction is where the split between fast working memory and slow supervisory memory matters most.

Design Lessons

The pattern is bigger than Buffaly.

Do not ask one context window to be executor, planner, historian, validator, and conscience all at once.

Let the worker move quickly when quick action is enough. Put the slower check at the completed-turn boundary, where the system can compare action against durable memory.

Keep the watcher bounded. Give it plans, notes, artifacts, session history, tool traces, and ontology. Do not give it the same powers as the worker unless you want a second executor.

And make silence part of the contract. A watcher that talks constantly becomes noise. A useful watcher speaks only when the next action needs to change.

Conclusion

Level 2 has turned out to be one of the highest-leverage pieces I have built.

The extra token cost is low because it only sees compact turn digests and does not do the heavy tool-using work itself. The payoff is large. It catches drift, forgotten check-ins, unverified assumptions, stale plans, wrong targets, and premature stops before they become expensive mistakes.

As an aside, I used to run Level 2 on a smaller, faster reasoning model. Once I saw how minimal the actual token usage was, I switched it to a top-tier reasoning model. The quality jump was immediate and worth every token. It now catches subtler issues and gives sharper, better-calibrated guidance.

I have also been iterating on how much semantic and ontology awareness belongs inside the watcher itself. My current view is that Level 2 should focus on plan governance, entity resolution, and action-path supervision, while automatic ontology updates and online learning should run as separate processes at different timescales. That is a bigger topic for another post.

In the end, giving the agent this kind of operational conscience has not made it slower or more bureaucratic. It has made it dramatically more reliable at staying on task when the work gets long, messy, and ambiguous.

That single change has shifted the system from usually impressive to quietly trustworthy.

Read More

Executable Graph Agents vs. Text-Based Agents

6/13/26

A technical explanation of executable graph agents: how semantic identity, typed objects, runtime actions, native code, and self-extending capability change what agents can learn and execute....

Read more

Evaluating Local Embedding Models for Buffaly Semantic Retrieval

6/9/26

A practical evaluation of local embedding models for Buffaly's short action/entity semantic retrieval workload, including methodology changes, run IDs, EmbeddingIDs, storage caveats, and reproducibility notes....

Read more

Goodnight Moon and the Long Road to Buffaly

5/14/26

How language acquisition, dual-channel learning, ontology, ProtoScript, and executable memory shaped the long path to Buffaly....

Read more

Buffaly: A Real Alternative to Scaling Forever

5/12/26

A case for building runtime-first systems around frontier models instead of asking larger prompts to become memory, execution, policy, and control....

Read more

Introducing Buffaly

5/11/26

Why traditional LLM agents are an operational dead end in medical administration: and why we built a neurosymbolic alternative....

Read more

Buffaly: Agents That Remember in Code

5/11/26

A different kind of agent: one that turns language into executable structure instead of keeping everything in text prompts....

Read more

What Millions of Patient Interactions Taught Us About Voice AI

2/8/26

After millions of patient interactions, we found voice AI belongs in constrained roles, while the largest gains come from automating documentation, compliance checks, extraction, and quality oversight around clinical work....

Read more

Slim Margins in Independent Pharmacies: Why Diversification Is Essential and How to Build a Clinical Revenue Engine Without Becoming a “Vendor” or a Call Center

1/27/26

Independent pharmacy owners are being asked to run a real business on economics that do not behave like a real business....

Read more

APCM “Year-End Reporting” Is Not Hard. Reconstruction Is.

1/21/26

By Justin Brochetti, CEO of Intelligence Factory & FairPath | January 20, 2026 Originally Published: https://fairpath.ai/resources/apcm-reporting...

Read more

The Plain-English Insurance Era Is Dawning: Why Practices Need Plain-English Operations to Thrive

1/16/26

You've likely caught the buzz on X and news feeds about President Trump's "The Great Healthcare Plan" framework, released today (January 15, 2026). Per the official White House fact sheet, it calls for insurers to publish "plain-English" comparisons of rates, coverage, denial ra…...

Read more

CMS Rural Health Transformation Program (RHT): What the $50B Awards Mean for Rural Clinics, FQHCs, and RHCs in 2026

12/30/25

On December 29, 2025, CMS announced $50 billion in Rural Health Transformation (RHT) awards across all 50 states. This is a five-year initiative with $10 billion available each year from 2026 through 2030....

Read more

Anthem Updates Its RPM and RTM Coverage for 2026: What Changed and Why It Matters

12/23/25

Anthem has quietly made an important update to its clinical policy CG-MED-91 , effective December 18, 2025. The change aligns Anthem with CMS’s 2026 Physician Fee Schedule and formally recognizes the new “short-cycle” Remote Physiologic Monitoring (RPM) and Remote Therapeutic Mo…...

Read more

UnitedHealthcare just postponed its RPM coverage rollback. Their official policy PDFs still say Jan 1.

12/20/25

Originally published at: https://fairpath.ai/resources/uhc-postpones-2026-rpm-rollback...

Read more

The Enrollment Spike Trap: Why Fast RPM Growth Is a 2026 Audit Risk

12/19/25

If you run an independent practice, rapid RPM growth probably still feels like a win....

Read more

The RPM 16-Day Rule: Two "Clever" Ways to Circumvent It (And Why They Will Get You Audited)

12/16/25

If you manage a Remote Physiological Monitoring (RPM) program, CPT code 99454 is likely your biggest source of revenue and, also likely, your biggest headache. This code, which reimburses for the supply of the device and data transmission, has long carried a notorious "all-or-no…...

Read more

Is UnitedHealthcare’s RPM Crackdown Really “Evidence-Based”?

12/5/25

Beginning January 1, 2026, UnitedHealthcare (UHC) will dramatically narrow coverage for Remote Physiologic Monitoring (RPM) across its commercial, Medicare Advantage, and exchange plans....

Read more

ROGUE-Zip: Recursive Ontology-Guided Sparse Zipping Protocol

12/4/25

Artificial Intelligence is currently fractured between two powerful but incompatible paradigms....

Read more

Red Alert: UnitedHealthcare Restricting RPM Coverage to Heart Failure & Pregnancy (Effective Jan 1, 2026)

12/3/25

If you are billing RPM for Diabetes, Hypertension, or COPD under UHC, your claims will likely be denied starting January 1st....

Read more

What CMS Is Actually Doing With RPM And APCM

12/1/25

Originally published at: https://fairpath.ai/resources/cms-rpm-apcm-2025-26 ‍...

Read more

The Hidden Pressure No One Talks About in RPM: What Happens at 18 Minutes

11/25/25

Most independent practices didn’t launch remote care programs so they could track timers, chase scattered documentation, or argue with spreadsheets at the end of every month. They adopted RPM and CCM because they believed these programs would keep patients out of the hospital, c…...

Read more

Inside the Remote Care Collapse — and the Path to Recovery

11/4/25

Over the past several years, I’ve heard it all. Remote patient care is a scam. It doesn’t work. RPM is designed to fail. I’ve listened to the frustrations from doctors, managers, and administrators who swear that remote care is nothing but another profit scheme wrapped in good i…...

Read more

The 8% Problem: Why State-of-the-Art LLMs Are Useless for High-Stakes Precision Tasks

10/30/25

In the race to solve complex problems with AI, the default strategy has become brute force: bigger models, more data, larger context windows. We put that assumption to the ultimate test on a critical healthcare task, and the results didn’t just challenge the “bigger is better” m…...

Read more

CMS’s 2026 Updates Signal a New Era for In-House Remote Care Coordination

10/21/25

Healthcare is on the brink of a fundamental shift. The forthcoming 2026 CMS Physician Fee Schedule updates are far more significant than mere billing adjustments, they signal a new era in remote care coordination. Practices that adapt early will not only enhance patient care but…...

Read more

CMS Brings Behavioral Health into the APCM Model: What It Means for Primary Care

10/9/25

‍ CMS is quietly reshaping how primary care teams can be paid for mental and emotional health support. Starting in 2026 (if finalized), practices using the new Advanced Primary Care Management (APCM) codes will be able to add small, monthly payments for behavioral health integra…...

Read more

Stop Choosing Between APCM and Your RPM/RTM Revenue

10/7/25

If your practice adopted APCM by shutting down RPM and RTM programs, you left money on the table. If you're running all three programs separately, you're burning cash on duplicate documentation and exposing yourself to compliance risk....

Read more

APCM vs. CCM Explained: Medicare’s 2025 Coding Shift Every Primary Care Leader Must Understand

10/1/25

On January 1, CMS introduced a brand-new benefit called Advanced Primary Care Management (APCM), a monthly payment designed to roll up the core elements of care coordination under a single code. For primary care leaders, this changes the landscape in profound ways. APCM overlaps…...

Read more

Neurosymbolic Ontologies with Buffaly

9/24/25

This blog outlines a groundbreaking proof of concept for reimagining medical ontologies and artificial intelligence. Buffaly demonstrates how large language models (LLMs) can unexpectedly enable symbolic methods to reach unprecedented levels of effectiveness. This fusion deliver…...

Read more

APCM and the “Coordination of Care Transitions” Requirement: How To Get It Right

9/23/25

Advanced Primary Care Management (APCM) represents one of the more meaningful changes in the CMS Physician Fee Schedule. As of January 1, 2025, practices that adopt this model will be reimbursed through monthly, risk-stratified codes rather than only episodic, time-based billing…...

Read more

APCM, Explained: What It Is, Why It Matters, What Patients Gain

9/18/25

Primary care is carrying more risk, more responsibility, and more expectation than ever. The opportunity is that we finally have a model that pays for the work most teams already do between visits. The risk is jumping into tooling and tactics before we agree on the basics. Advan…...

Read more

Noncompete Clauses In Healthcare: The FTC Warning, APCM Staffing, And Platform Partnerships

9/16/25

The Federal Trade Commission’s Sept. 12 warning to healthcare employers is a simple message with real operational consequences. Overbroad noncompetes, no‑poach language, and “de facto” restraints chill worker mobility and can limit patients’ ability to choose their clinicians. F…...

Read more

The APCM Quick Start Guide: Converting Medicare's Complex Care Program Into Practice Growth

9/9/25

Advanced Primary Care Management represents Medicare's most ambitious attempt to transform primary care economics. Unlike previous programs that nibbled at the margins, APCM fundamentally restructures how practices organize, deliver, and bill for comprehensive care....

Read more

13 Things You Need To Implement Advanced Primary Care Management (APCM)

9/5/25

Advanced Primary Care Management (APCM) is Medicare’s newest program, introduced in 2025 with three billing codes: G0556, G0557, and G0558. This represents a pivotal shift toward value-based primary care by offering monthly reimbursements for delivering continuous, patient-focus…...

Read more

When Women's Health Can't Wait: How Remote Care Creates Presence in Life's Most Critical Moments

8/26/25

At 2 AM, a new mother in rural Alabama feels her heart racing. She's two weeks postpartum, alone with a newborn while her husband works the night shift. Her blood pressure reading on the home monitor shows 158/95. Within minutes, her care team receives an alert. By 6 AM, a nurse…...

Read more

Medical Remote Care: How Vendor Models Shift Margin and When to Bring RPM In-House

8/18/25

Many health systems pay $40–$80 per patient per month (PMPM) for full-service remote patient monitoring while Medicare's 2025 national averages reimburse approximately $91–$129 monthly depending on engagement time. When clinical teams can deliver the same services internally, th…...

Read more

Why 73% of Practices Still Fear Remote Care and How the Winning 27% Think Differently

8/11/25

A few months ago, a physician at a 12-doctor practice in rural California called me frustrated. His practice was hemorrhaging money on readmissions, his nurses were burning out from phone tag with chronic disease patients, and his administrator was getting pressure from their he…...

Read more

Reclaiming Revenue: How Smart Medical Executives Are Transforming Remote Care into Sustainable Profit Centers

8/6/25

Medical executives today face an uncomfortable reality: while navigating shrinking margins and mounting operational pressures, many are unknowingly surrendering millions in Medicare reimbursements to third-party vendors. The culprit? Poorly structured Remote Patient Monitoring (…...

Read more

RPM’s $16.9B Gold Rush: Why 88% of Claims Skip CMS Review (And How Industry Leaders Are Responding)

7/23/25

Remote Patient Monitoring (RPM) has rapidly evolved from emerging healthcare innovation into a strategic necessity. Driven aggressively by CMS reimbursement policies, RPM adoption has accelerated at unprecedented rates, reshaping market dynamics and creating compelling strategic…...

Read more

Medicare's $4.5 Billion Wake-Up Call: What the VBID Sunset Reveals About Risk, Equity, and the Next Era of Value

7/17/25

In a single December blog post, CMS just rewrote the playbook for $400 billion in annual Medicare Advantage spending. The termination of the Medicare Advantage Value-Based Insurance Design (VBID) Model (after it generated $4.5 billion in excess costs over two years) isn't just a…...

Read more

Why the AMA’s 2026 RPM Changes Are Exactly What Your Practice Needs

7/8/25

If you've spent any time managing a remote patient monitoring (RPM) program, you already know the drill: juggling the 16-day rule, keeping track of clinical minutes, chasing compliance, and often wondering if this is really what patient-centered care was meant to feel like....

Read more

Healthcare Needs a Group Chat, And Digital Twins Are the Invite

7/1/25

Let’s be honest. Managing your health today feels like trying to coordinate a group project where nobody checks their messages. Your cardiologist, endocrinologist, and PCP are all working on the same assignment, but nobody’s sharing notes. The result? Confusion, overlap, and som…...

Read more

The Great Code Shift: Turning the ICD-11 Mandate into a Competitive Advantage

6/25/25

The healthcare industry still has scars from the ICD-9 to ICD-10 transition. The stories are legendary in Health IT circles: coder productivity plummeting, claim denials surging, and revenue cycles seizing up for months. It was a painful lesson in underestimation....

Read more

Beyond the Box: Finding the Signal in RPM's Next Chapter

6/19/25

In my work with healthcare organizations across the country, I see two distinct patient profiles coming into focus. They represent the past and future of remote care, and every successful practice must now build a bridge between them. The first is the patient for whom technology…...

Read more

The Living Echo: How Digital Twins Are Reshaping Personalized Healthcare and Operational Excellence

6/11/25

The healthcare landscape is continuously evolving, and among the most profound shifts emerging is the concept of the Digital Twin for Patients. This technology isn't merely an abstract idea; it represents a fundamental change in how we approach individual health and broader heal…...

Read more

Why the MIPS MVP Model is the Future—and How Your Practice Can Win

6/2/25

Change is inevitable in healthcare. Often, it feels overwhelming—but occasionally, a new shift arrives that genuinely makes things simpler. The upcoming CMS shift toward the MIPS Value Pathways (MVPs) represents precisely that kind of beneficial change....

Read more

Does RPM Miss What Patients Really Need?

5/27/25

It starts with a data spike… a sudden drop in movement, a rise in reported pain. The alert pings the provider dashboard, hinting at deterioration. But what if that signal isn’t telling the whole truth?...

Read more

Transforming Chronic Pain: The Power of RPM, RTM, and CCM

5/19/25

Chronic pain isn’t just a condition, it’s a thief. It steals time, joy, and freedom from over 51 million Americans, according to the CDC, costing the economy $560 billion a year. As someone passionate about healthcare innovation, I’ve seen how this silent struggle affects patien…...

Read more

Introduction: Demystifying Ontology—Returning to the Roots

5/16/25

In the tech industry today, we frequently toss around sophisticated terms like "ontology" , often treating them like magic words that instantly confer depth and meaning. Product managers, software engineers, data scientists—everyone seems eager to invoke "ontology" to sound info…...

Read more

APCM Codes: The Quiet Revolution in Primary Care

5/13/25

Picture Mary, 62, balancing a job and early diabetes. Her doctor, Dr. Patel, is her anchor—reviewing labs, coordinating with a nutritionist, tweaking her care plan. But until 2025, Dr. Patel wasn’t paid for this invisible work. It was just “what doctors do.” If you’re in healthc…...

Read more

It Always Starts Small: Lessons from the Front Lines of Healthcare Audits

4/28/25

In healthcare, most of the time, trouble doesn't announce itself with sirens and red flags. It starts quietly. A free dinner here. A paid talk there. An event that feels more like networking than education....

Read more

Unveiling RPM Fraud Risks—A Technical Dive into OIG Findings and FairPath’s AI Fix

4/24/25

The Office of Inspector General’s (OIG) 2024 report, Additional Oversight of Remote Patient Monitoring in Medicare Is Needed (OEI-02-23-00260) , isn't just an alert—it's a detailed playbook exposing critical vulnerabilities in Medicare’s Remote Patient Monitoring (RPM) system. R…...

Read more

The Cost of Shortcuts: Lessons From a $4.9 Million Mistake

4/21/25

When the Department of Justice announces settlements, many of us glance at the headlines and move on. Yet, behind those headlines are real stories about real decisions, choices that felt minor at the time but led to serious consequences. Like the recent settlement involving Live…...

Read more

One Biller, One Gap: How a Missing Piece Reshapes Everything

4/14/25

There’s a quiet agreement most of us make in business. It’s not in a contract. It’s not written on a whiteboard. But it runs everything: trust. ‍ We trust that what worked yesterday will still work tomorrow. We trust that people we’ve known for years will keep showing up the way…...

Read more

The System Is Rigged: How AI Helps Independent Docs Fight Back

4/10/25

Feeling like you’re drowning in regulations designed by giants, for giants? If you're running a small practice in today's healthcare hellscape, it damn sure feels that way. And maybe "feeling" isn't the right word – maybe it's just reality....

Read more

Trust Is the Real Technology: A Lesson in Healthcare Partnerships

4/7/25

When people ask me what Intelligence Factory does, they often expect to hear about AI, automation, or billing systems. And while we do all those things—we do them well—I’ve come to believe something deeper: we’re in the business of trust. And in healthcare, that’s the most valua…...

Read more

Million Dollar Surprise

4/3/25

“They’re going to put me out of business. They want over a million dollars. I don’t have a million dollars”, his voice cracked over the phone....

Read more

Unlocking AI: A Practical Guide for IT Companies Ready to Make the Leap

12/22/24

Artificial intelligence isn’t just a buzzword anymore—it’s a transformative force reshaping industries worldwide. Yet for many IT companies, the question isn’t whether to adopt AI but how . If you're scratching your head wondering where to start, you're not alone. For businesses…...

Agentic RAG: Separating Hype from Reality

12/18/24

Agentic AI is rapidly gaining traction as a transformative technology with the potential to revolutionize how we interact with and utilize artificial intelligence. Unlike traditional AI systems that passively respond to commands, agentic AI systems operate autonomously, making d…...

From Black Boxes to Clarity: Buffaly's Transparent AI Framework

11/27/24

Large Language Models (LLMs) have ushered in a new era of artificial intelligence, enabling systems to generate human-like text and engage in complex conversations. However, their extraordinary capabilities come with significant limitations, particularly when it comes to predict…...

Bridging the Gap Between Language and Action: How Buffaly is Revolutionizing AI

11/26/24

The rapid advancement of Large Language Models (LLMs) has brought remarkable progress in natural language processing, empowering AI systems to understand and generate text with unprecedented fluency. Yet, these systems face a critical limitation: while they excel at processing l…...

When Retrieval Augmented Generation (RAG) Fails

11/25/24

Retrieval Augmented Generation (RAG) sounds like a dream come true for anyone working with AI language models. The idea is simple: enhance models like ChatGPT with external data so they can provide answers based on information beyond their original training. Need your AI to answ…...

SemDB: Solving the Challenges of Graph RAG

11/21/24

In the beginning there was keyword search. Eventually word embeddings came along and we got Vector Databases and Retrieval Augmented Generation (RAG). They were good for writing blog posts about topics that sounded smart, but didn’t actually work well in the real world. Fast f…...

Metagraphs and Hypergraphs with ProtoScript and Buffaly

11/20/24

In Volodymyr Pavlyshyn's article, the concepts of Metagraphs and Hypergraphs are explored as a transformative framework for developing relational models in AI agents’ memory systems. The article highlights how these metagraphs can act as a semantic backbone, enabling AI to reta…...

Chunking Strategies for Retrieval-Augmented Generation (RAG): A Deep Dive into SemDB’s Approach

11/19/24

In the ever-evolving landscape of AI and natural language processing, Retrieval-Augmented Generation (RAG) has emerged as a cornerstone technology. RAG systems allow large language models (LLMs) to access vast knowledge bases by retrieving relevant snippets of information, or "c…...

Is Your AI a Toy or a Tool? Here’s How to Tell (And Why It Matters)

11/7/24

As artificial intelligence (AI) becomes a powerful part of our daily lives, it’s amazing to see how many directions the technology is taking. From creative tools to customer service automation, AI can be both a powerhouse and, at times, a bit of a playground. At Intelligence Fac…...

Stop Going Solo: Why Tech Founders Need a Business-Savvy Co-Founder (And How to Find Yours)

10/24/24

Hey everyone, Justin Brochetti here, Co-founder of Intelligence Factory. We're all about building cutting-edge AI solutions, but I'm not here to talk about that today. Instead, I want to share some hard-earned wisdom about a challenge that I see many tech founders facing: findin…...

Why Buffaly is the Future of AI-Driven Data Retrieval

9/26/24

When it comes to data retrieval, most organizations today are exploring AI-driven solutions like Retrieval-Augmented Generation (RAG) paired with Large Language Models (LLMs). These systems have certainly made strides in helping businesses pull information from large datasets an…...

The AI Mirage: How Broken Systems Are Undermining the Future of Business Innovation

9/18/24

You’ve heard the pitch: AI will revolutionize your operations, cut costs, and deliver results you didn’t even know you needed. But after the vendor leaves, and the system is plugged in, reality hits hard. Companies are discovering that AI solutions too often fail to live up to t…...

A Sales Manager’s Perspective on AI: Boosting Efficiency and Saving Time

8/14/24

AI-driven call routing can analyze incoming calls in real time and direct them to the most appropriate agent based on skill set, availability, and past interactions. This ensures customers are connected with the right person quickly, improving satisfaction and reducing wait time…...

Prioritizing Patients for Clinical Monitoring Through Exploration

7/1/24

RPM (Remote Patient Monitoring) CPT codes are a way for healthcare providers to get reimbursed for monitoring patients' health remotely using digital devices. Think of it like having a virtual nurse keeping an eye on you between doctor visits. These codes cover the time spent se…...

10X Your Outbound Sales Productivity with Intelligence Factory's AI for Twilio: A VP of Sales Perspective

6/28/24

As VP of Sales, I'm constantly on the lookout for ways to empower my team and maximize their productivity. In today's competitive B2B landscape, every interaction counts. That's why I'm here to share a game-changer: integrating Intelligence Factory's AI package with our existing…...

Practical Application of AI in Business

6/24/24

In the rapidly evolving tech landscape, the excitement around AI is palpable. But beyond the hype, practical application is where true value lies. As someone who relishes crafting customized solutions for clients and building internal tools, I've found immense value in creati…...

AI: What the Heck is Going On?

6/19/24

We all grew up with movies about AI, and it always seemed to be decades off. Then ChatGPT was announced and suddenly it's everywhere....

SQL for JSON

4/22/24

Everything old is new again. A few years back, the world was on fire with key-value storage systems. I think it was Google's introduction of MapReduce that set the fire. It's funny because I remember reading in the '90s that the debate had been settled and that relational databa…...

Telemedicine App Ends Gender Preference Issues with AWS Powered AI

4/19/24

Mount Dora, Florida, 2019: AWS machine learning enhances MEDEK telemedicine solution to ease gender bias for sensitive online doctor visits. Visiting a doctor is personal, and now Medek Health Systems (MEDEK) along with Amazon Web Services (AWS) is using AI to make it a b…...
