Open-sourcing YedaFlow: the pipeline substrate for governed agents

A public Foundry pipeline pattern for resolving messy order data, detecting cited exceptions, and giving governed agents clean operational state.

May 31, 2026 · 6 min read · Yeda AI Team

A bureau system, an office-goods system, and a customer master disagree on who the customer is, when an order is due, and who owns it. YedaFlow_Palantir, the public slice we just open-sourced, turns that disagreement into one typed ontology that an agent can safely read on Palantir Foundry. It is the substrate the Yoca and Yopa agents stand on.

The use case is ordinary on purpose. Today, an operations or data engineer exports both feeds, fuzzy-matches rows by hand, files one ticket per anomaly, and hopes the next ingest does not silently hide a missing source. YedaFlow makes that reconciliation a code-reviewed DAG with cited exceptions and a run inspector.

The repo maps cleanly onto Foundry surfaces: Pipeline Builder mirrors in pipelines, ontology specs, deterministic Functions, AIP Logic flows, a React operator app over an OSDK fixture, Automate sketches, an operator companion, and evals. It runs locally with synthetic data and deterministic fallbacks, so you can inspect the pattern without a Foundry tenant or an API key.

What the pipeline proves

Multi-source unification

Four Pipeline Builder mirrors resolve customers, normalize orders, detect typed exceptions, and emit quality signals from two order feeds plus a customer master.
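To make the customer-resolution step concrete, here is a minimal sketch that matches a raw customer name from an order feed against a customer master using Python's standard difflib. The function names, the 0.85 threshold, and the sample records are hypothetical and not taken from the repo; the actual Pipeline Builder mirrors may resolve identity differently.

```python
from difflib import SequenceMatcher
from typing import Optional


def normalize_name(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and collapse whitespace."""
    kept = "".join(ch if ch.isalnum() or ch.isspace() else " " for ch in name.lower())
    return " ".join(kept.split())


def resolve_customer(
    raw_name: str,
    master: dict[str, str],
    threshold: float = 0.85,  # hypothetical cutoff, not a tuned value
) -> Optional[str]:
    """Return the best-matching master customer id, or None below the threshold."""
    target = normalize_name(raw_name)
    best_id, best_score = None, 0.0
    for customer_id, master_name in master.items():
        score = SequenceMatcher(None, target, normalize_name(master_name)).ratio()
        if score > best_score:
            best_id, best_score = customer_id, score
    return best_id if best_score >= threshold else None


# Hypothetical customer master: id -> canonical name.
master = {"C-1": "Acme Office Supply, Inc.", "C-2": "Northwind Bureau"}
match = resolve_customer("ACME Office Supply Inc", master)
no_match = resolve_customer("Totally Unrelated Ltd", master)
```

Returning None instead of the nearest weak match keeps the step deterministic and leaves unresolved rows visible, so they can surface as exceptions rather than being silently merged.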

Governed ontology

Five typed objects (Customer, Order, OrderException, PipelineRun, and DataQualitySignal) become the contract between pipelines, operators, Automate, and agents.
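One way to picture that contract is as a set of immutable record types. The sketch below uses Python dataclasses; only the five object names, source_refs, and the partial run status come from this post. Every other field name, and the "complete" status, is an assumption made for illustration and may not match the ontology specs in the repo.

```python
from dataclasses import dataclass
from enum import Enum


class RunStatus(Enum):
    COMPLETE = "complete"  # assumed name for a run with all inputs present
    PARTIAL = "partial"    # a source feed was missing


@dataclass(frozen=True)
class Customer:
    customer_id: str
    name: str


@dataclass(frozen=True)
class Order:
    order_id: str
    customer_id: str
    due_date: str  # ISO 8601 date string (assumed representation)
    owner: str


@dataclass(frozen=True)
class OrderException:
    exception_id: str
    order_id: str
    kind: str
    source_refs: tuple[str, ...]  # citations back to source rows and fields


@dataclass(frozen=True)
class PipelineRun:
    run_id: str
    status: RunStatus
    missing_inputs: tuple[str, ...] = ()


@dataclass(frozen=True)
class DataQualitySignal:
    run_id: str
    metric: str
    value: float


run = PipelineRun(run_id="run-1", status=RunStatus.PARTIAL,
                  missing_inputs=("office_goods_orders",))
```

Frozen dataclasses reject attribute assignment after construction, which mirrors the idea that consumers read the contract rather than edit it in place.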

Eval-gated AI logic

AIP Logic drafts explanations, run summaries, and threshold-tuning suggestions while deterministic functions own identity, scoring, and persisted rationales.

Read-only agent hand-off

YedaAgents read the cleaned ontology through AgentReadSurface. The operator companion can recommend a route or acknowledge step, but the operator applies it.
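The hand-off can be sketched as a surface that only exposes lookups, plus a recommendation that stays unapplied until an operator acts on it. ReadOnlySurface below is a hypothetical stand-in for AgentReadSurface, and recommend, operator_apply, and all field names are assumptions for illustration, not the repo's API.

```python
from dataclasses import dataclass, replace
from types import MappingProxyType
from typing import Mapping, Optional


@dataclass(frozen=True)
class Recommendation:
    exception_id: str
    step: str                         # "route" or "acknowledge"
    applied_by: Optional[str] = None  # stays None until an operator applies it


class ReadOnlySurface:
    """Hypothetical stand-in for AgentReadSurface: lookups only, no setters."""

    def __init__(self, exceptions: Mapping[str, Mapping[str, str]]):
        # Copy the input, then wrap it so the mapping itself cannot be mutated.
        self._exceptions = MappingProxyType(
            {key: dict(value) for key, value in exceptions.items()}
        )

    def get_exception(self, exception_id: str) -> dict:
        # Return a copy so callers cannot edit the surface's state.
        return dict(self._exceptions[exception_id])


def recommend(surface: ReadOnlySurface, exception_id: str) -> Recommendation:
    """The companion proposes a step; nothing is written here."""
    exc = surface.get_exception(exception_id)
    step = "acknowledge" if exc.get("owner") else "route"
    return Recommendation(exception_id=exception_id, step=step)


def operator_apply(rec: Recommendation, operator: str) -> Recommendation:
    """Only this explicit operator action marks a recommendation as applied."""
    return replace(rec, applied_by=operator)


surface = ReadOnlySurface({"EX-1": {"kind": "due_date_mismatch", "owner": ""}})
rec = recommend(surface, "EX-1")
applied = operator_apply(rec, "op-7")
```

In this sketch the agent side never holds a writable reference: the recommendation is a new value, and applying it produces another new value attributed to the operator.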

The trust rules are the product

The interesting part is not that an LLM can explain an exception. The interesting part is that the exception already exists as typed, cited operational state before the model shows up. The model can summarize, explain, or suggest a threshold tune; it cannot invent a row, hide a partial run, or write through an action boundary.

No uncited exceptions: every OrderException carries non-empty source_refs back to the rows and fields that produced it.
No silent partial runs: a missing source feed sets PipelineRun.status to partial and names the missing input for the operator.
No LLM writeback: model flows draft language and suggestions; ontology writes still go through typed actions.
No free-text parsing: streaming chunks are typed, malformed chunks are dropped, and the UI parses fields instead of prose.
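Three of these rules are easy to express as small deterministic checks. The sketch below is illustrative only: the feed identifiers, the "complete" status, the JSON chunk shape, and every function name are assumptions, not the repo's implementation.

```python
import json
from typing import Optional

# Hypothetical identifiers for the two order feeds and the customer master.
EXPECTED_FEEDS = ("bureau_orders", "office_goods_orders", "customer_master")


def require_citations(exception: dict) -> dict:
    """Reject any exception that lacks non-empty source_refs."""
    if not exception.get("source_refs"):
        raise ValueError("OrderException must carry non-empty source_refs")
    return exception


def run_status(received_feeds: set[str]) -> dict:
    """Mark the run partial and name the missing inputs when a feed is absent."""
    missing = [feed for feed in EXPECTED_FEEDS if feed not in received_feeds]
    return {"status": "partial" if missing else "complete", "missing_inputs": missing}


def parse_chunk(raw: str) -> Optional[dict]:
    """Parse one streamed chunk (assumed to be JSON); return None if malformed."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    field, value = data.get("field"), data.get("value")
    if not isinstance(field, str) or not isinstance(value, str):
        return None
    return {"field": field, "value": value}


status = run_status({"bureau_orders", "customer_master"})
chunks = ['{"field": "summary", "value": "Due dates disagree"}', "not json", '{"field": 1}']
typed = [c for c in map(parse_chunk, chunks) if c is not None]
```

Dropping malformed chunks instead of repairing them means the UI only ever renders fields that passed the type check.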

That is why the public repo ships with 142 tests, 16 golden cases, smoke evals, a prompt-injection check, and a replayable operator-companion path. The commercial version adds tuned thresholds, real connectors, larger benchmark grids, tenant scaffolding, and enterprise auth; the public version keeps the architecture inspectable.