Building trust in numbers, one reconciliation at a time
Explore why automated, deterministic reconciliation is the bedrock of trust in any data system, from classic data warehouses to modern LLM-powered agentic systems.
The conference room was always too cold. A senior vice president was staring at a printout of my team’s dashboard, his finger tapping on a single number. “This doesn’t match what finance sent me,” he said. It wasn’t a question. It was a verdict. In that moment, the elegant architecture and clever data models were irrelevant. Trust was broken, and the only way to rebuild it was to go back to the beginning, line by line.
The Gulf Between Source and Summary
Every data system has a fundamental divide. On one side, you have the systems of record—the transactional databases, the event logs, the application APIs. This is the ground truth. On the other, you have the summary—the dashboard, the analytical model, the quarterly report. This is the abstraction we use to make decisions.
The gulf between them is where trust dies. Every join, aggregation, and transformation is a place for error to creep in. A slightly different definition of an "active user," a mishandled timezone, a null treated as zero—these small missteps compound. The final number is the product of hundreds of tiny decisions, and a single flaw can invalidate the entire chain.
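To make two of those missteps concrete, here is a minimal sketch. The order amounts, the timestamp, and the fixed UTC-8 offset are invented for illustration; they are not taken from any real system.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical order amounts; None means the amount was never recorded.
amounts = [100.0, None, 50.0, None]

# Option 1: ignore missing values when averaging.
known = [a for a in amounts if a is not None]
avg_skipping_nulls = sum(known) / len(known)

# Option 2: silently treat missing values as zero.
avg_nulls_as_zero = sum(a or 0.0 for a in amounts) / len(amounts)

print(avg_skipping_nulls, avg_nulls_as_zero)  # 75.0 37.5

# One event lands on different reporting days depending on the timezone.
event_utc = datetime(2024, 3, 1, 2, 30, tzinfo=timezone.utc)
pacific = timezone(timedelta(hours=-8))  # fixed offset, for illustration only
day_utc = event_utc.date()
day_local = event_utc.astimezone(pacific).date()

print(day_utc, day_local)  # 2024-03-01 2024-02-29
```

Neither result is wrong on its own terms. The problem appears when one system makes the first choice and another makes the second, and in this example the timezone choice even moves the event into a different month.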
Stakeholders who have been burned before instinctively know this. Their demand to "see the raw data" isn't an insult; it's a rational response to the inherent fragility of abstraction. They don't trust the pipeline because they’ve seen too many pipelines break.
A Brutal Lesson in the Weeds
I was working on a large retail system with a sales reporting warehouse that was the pride of the engineering team. But one quarter, our numbers were consistently a few percentage points off from the third-party payment gateway’s reports. The gap was small enough to be subtle, but large enough to erode all confidence.
My task was to find the discrepancy. It was a week of pure data archaeology, pulling gigabytes of raw transaction logs from both systems and writing ad-hoc scripts to normalize them for a line-by-line comparison. The work was brutal.
The error wasn't one big thing. It was a dozen tiny edge cases the ETL process never accounted for: a specific type of partial refund, transactions authorized but not captured before midnight UTC, promotional codes applied post-tax in one system and pre-tax in another. The big-picture logic was correct, but the details that only reconciliation could expose were wrong.
From Manual Audits to Deterministic Guardrails
That experience taught me a core architectural principle, one hammered home by the foundational work of practitioners like Ralph Kimball in the early days of data warehousing: reconciliation cannot be an occasional fire drill. It must be an automated, first-class feature of any data system.
This approach stands in contrast to some modern data observability platforms, which excel at using machine learning to detect *unknown unknowns*: unexpected drifts and anomalies. Those tools are valuable. But they don't replace the non-negotiable need for deterministic checks that validate *known knowns*. Your business has hard rules, like "revenue in the warehouse must match the payment gateway, period." Enforcing a rule like that requires an explicit, automated reconciliation job, not just anomaly detection.
The pattern is simple. For any two datasets that are supposed to match, a separate job runs on a schedule. It starts with high-level checks: row counts, sums of financial columns, counts of distinct categories. If any of these fail, the job escalates, firing an alert directly to the engineering team. This isn't business intelligence; it's system intelligence.
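A minimal sketch of that pattern might look like the following. The row shape `(txn_id, amount, currency)`, the `Summary` fields, and the `alert` placeholder are assumptions made for illustration; a real job would read from the two systems and page the team through whatever alerting is already in place.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Summary:
    row_count: int
    total_amount: Decimal
    distinct_currencies: int


def summarize(rows):
    """Reduce rows of (txn_id, amount, currency) to cheap aggregates."""
    return Summary(
        row_count=len(rows),
        # Decimal avoids float rounding noise in financial sums.
        total_amount=sum((Decimal(amount) for _, amount, _ in rows), Decimal("0")),
        distinct_currencies=len({currency for _, _, currency in rows}),
    )


def reconcile(source_rows, target_rows):
    """Return human-readable mismatches; an empty list means the sides agree."""
    src, tgt = summarize(source_rows), summarize(target_rows)
    failures = []
    for field in ("row_count", "total_amount", "distinct_currencies"):
        a, b = getattr(src, field), getattr(tgt, field)
        if a != b:
            failures.append(f"{field}: source={a} target={b}")
    return failures


def alert(failures):
    # Placeholder: a real job would page or post to the team's channel.
    for failure in failures:
        print("RECONCILIATION FAILURE:", failure)


# Hypothetical data: the warehouse is missing transaction t2.
gateway = [("t1", "19.99", "USD"), ("t2", "5.00", "USD"), ("t3", "42.50", "EUR")]
warehouse = [("t1", "19.99", "USD"), ("t3", "42.50", "EUR")]

failures = reconcile(gateway, warehouse)
if failures:
    alert(failures)
```

In this example the row count and the total both fail while the distinct-currency count passes, which already narrows the search to a missing USD-or-EUR row rather than a mislabeled one. The comparison is exact on purpose: a hard rule either holds or it does not.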
Reconciliation in the Age of Agents
This principle is more relevant than ever with agentic systems. An LLM-based agent that summarizes legal documents or classifies support tickets is a new, powerful, and probabilistic transformation pipeline. You can’t trace its reasoning like a SQL query, but you absolutely can and must audit its results.
The modern MLOps discipline has formalized this. Instead of generic sampling, we use patterns like continuous evaluation against a "golden set" of human-verified examples. As described in Google's frameworks for MLOps and continuous delivery, tools can automatically compare an agent's output to this ground truth and measure metrics like precision and recall. If an agent categorizing sentiment suddenly drifts, a deterministic check on its output distribution flags the problem before it pollutes downstream systems.
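As a rough sketch of both checks, the code below scores predictions against a golden set and measures how far the current label distribution has moved from a baseline. The labels, the sample data, and the 0.2 threshold are invented for illustration. Total variation distance is one simple choice of drift measure here, not one prescribed by any particular framework.

```python
from collections import Counter


def precision_recall(golden, predicted, positive):
    """Precision and recall for one label, comparing predictions to golden labels."""
    pairs = list(zip(golden, predicted))
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def distribution_shift(baseline, current):
    """Total variation distance between label distributions (0 = same, 1 = disjoint)."""
    if not baseline or not current:
        raise ValueError("both label lists must be non-empty")
    base, cur = Counter(baseline), Counter(current)
    labels = set(base) | set(cur)
    return 0.5 * sum(
        abs(base[label] / len(baseline) - cur[label] / len(current))
        for label in labels
    )


# Hypothetical golden set and agent output for a sentiment classifier.
golden = ["pos", "pos", "pos", "pos", "neg", "neg", "neu", "neu"]
predicted = ["pos", "pos", "pos", "neg", "pos", "neg", "neu", "neu"]
precision, recall = precision_recall(golden, predicted, positive="pos")
print(precision, recall)  # 0.75 0.75

# Hypothetical later batch in which the agent leans heavily negative.
drifted = ["neg"] * 6 + ["pos", "neu"]
shift = distribution_shift(golden, drifted)
DRIFT_THRESHOLD = 0.2  # assumed value; tune per use case
print(shift, shift > DRIFT_THRESHOLD)  # 0.5 True
```

The agent being checked is probabilistic, but these checks are not: the same inputs always produce the same verdict, so a failure is reproducible and auditable.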
The agent operates on probabilities, but it must be contained by a world of deterministic, auditable checks. The reconciliation process is the anchor that connects the agent's powerful, fuzzy capabilities back to the ground truth the business requires.
What This Means for Builders
Trust isn't a feature you add at the end. It's a property of a well-architected system, built on the unglamorous work of reconciliation. The takeaways are simple and durable:
- Treat reconciliation as a product, not a task. This idea echoes Zhamak Dehghani’s concept of Data as a Product; the systems that verify your data are as critical as the systems that produce it. They deserve a place in the backlog and their own monitoring.
- Check at every major seam. Don't just reconcile the final output against the source. Build checks at each boundary—staging, integration, presentation—to catch errors closer to where they occur.
- Embrace the boring. A simple SQL query that counts rows and sums a critical column, running every hour forever, will save you more pain than the most sophisticated alternative. Start there.
- Apply the same skepticism to AI. The probabilistic nature of LLMs doesn't absolve us of our responsibility to verify. The more powerful the transformation, the more rigorous our deterministic checks must be. Trust is an output of a process, not an input.
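The "check at every seam" and "embrace the boring" points can be sketched together using an in-memory SQLite database. The table names and the `amount_cents` column are hypothetical. The sketch also assumes every layer keeps the same grain (one row per order); an aggregated presentation layer would compare sums only.

```python
import sqlite3

# Hypothetical layer tables, ordered from source side to consumer side.
LAYERS = ["staging_orders", "integration_orders", "presentation_orders"]


def layer_totals(conn, table):
    # Table names come from the fixed LAYERS list above, never from user input.
    row = conn.execute(
        f"SELECT COUNT(*), COALESCE(SUM(amount_cents), 0) FROM {table}"
    ).fetchone()
    return row[0], row[1]


def check_seams(conn, layers=LAYERS):
    """Compare each layer with the next; return the seams where totals diverge."""
    broken = []
    for upstream, downstream in zip(layers, layers[1:]):
        up, down = layer_totals(conn, upstream), layer_totals(conn, downstream)
        if up != down:
            broken.append((upstream, downstream, up, down))
    return broken


conn = sqlite3.connect(":memory:")
for table in LAYERS:
    conn.execute(f"CREATE TABLE {table} (order_id TEXT, amount_cents INTEGER)")

orders = [("o1", 1999), ("o2", 500), ("o3", 4250)]
conn.executemany("INSERT INTO staging_orders VALUES (?, ?)", orders)
conn.executemany("INSERT INTO integration_orders VALUES (?, ?)", orders)
# Simulate a transformation that silently dropped one order.
conn.executemany("INSERT INTO presentation_orders VALUES (?, ?)", orders[:2])

broken_seams = check_seams(conn)
for upstream, downstream, up, down in broken_seams:
    print(f"{upstream} -> {downstream}: {up} != {down}")
```

Because each seam is checked separately, the failure points at the integration-to-presentation step instead of leaving you to search the whole pipeline.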
That cold conference room taught me that the most beautiful architecture is worthless if the numbers are wrong. Trust isn't built with vendor logos. It's built in the weeds, with systems designed to prove their own correctness, one painstaking reconciliation at a time.