The unglamorous truth about data quality

The number on the slide was 14.3%. The Senior VP of Sales stopped me mid-sentence. "That can't be right," he said. "My team's dashboard shows churn is under 10%. Where are you getting that number?"

The air in the conference room went still. Every head turned from the screen to me. This is the moment data practitioners dread. It’s not a technical debate; it’s a crisis of confidence. My number contradicted his reality, and in that room, his reality had more political capital. For the next ten minutes, I wasn't a systems architect. I was a trial lawyer, and my client was a single, stubborn floating-point number. This scenario isn’t just about dashboards; it’s the exact same challenge faced by any LLM agent or autonomous system that operates on data it can’t defend.

The Number Is an Artifact, Not the Truth

We spend so much time building systems to produce numbers that we forget the most important truth: the number itself is just the end-product of a long, complex manufacturing line. It’s an artifact. When someone challenges it, they aren't just questioning the number; they are questioning the entire factory that produced it.

In my experience, a dashboard has two audiences. The first is the business user who consumes the output. The second, and far more critical, is the skeptic who audits it. We almost always build for the first and are caught unprepared by the second. The "unglamorous truth" is that the value of a data system isn't in its final output, but in its ability to prove how that output came to be. For AI systems, this takes on an even sharper edge: an agent's "reasoning" is only as credible as the data it consumes and the processes that shaped it.

Trust isn't a feature you add at the end. It has to be designed in from the beginning. It's the sum of a thousand small, deliberate architectural choices that favor defensibility over simple delivery.

Architecture for Defense, Not Just Delivery

When I was starting out, a "data pipeline" felt like a one-way street: get data from A, do something to it, and put it in B. It worked, until it didn't. The moment a number was questioned, the only answer I had was to manually dig through logs and scripts—a process that could take hours or days and looked, from the outside, like I was trying to find an answer that fit the question.

A defensible architecture looks different. It's built on three pillars that have nothing to do with speed or scale, and everything to do with proof.

Verifiable Lineage: For any given metric, you must be able to trace its journey back to the source transaction or event. This isn't just a nice-to-have diagram. It needs to be a queryable, automated capability of the platform. The goal is to answer "Where did this come from?" in seconds, not days.
Explicit Contracts: Data crossing a boundary between systems—or even teams—needs a formal contract. This isn't a document in a wiki; it's a schema, a set of assertions, and a clear owner. As Zhamak Dehghani advocates with Data Mesh principles, treating data as a product means defining clear interfaces and agreements. When an upstream application team decides to change a `status` field from an integer to a string, the contract breaks, an alert fires, and the pipeline halts. It turns a future business argument into a present-day engineering incident, which is infinitely easier to solve.
Deep Observability: We’re good at monitoring infrastructure (is the database up?). We’re less good at monitoring the data itself. A defensible system profiles the data at every stage. It knows the expected distribution, cardinality, and freshness. A sudden drop in `orders_processed` from an average of 10,000 per hour to 100 isn't something the system should silently pass along; it's a probable data quality bug that must be caught and explained before it pollutes the final number.
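
The contract and observability pillars can be sketched in a few lines of plain Python. This is a minimal illustration under assumptions, not a production design: the `Contract` class, the `orders_v1` name, the `checkout-team` owner, and the `min_ratio` threshold are all hypothetical. It shows a `status` field changing from an integer to a string being rejected at the boundary, and an hourly volume collapse being flagged.

```python
from dataclasses import dataclass
from statistics import mean


class ContractViolation(Exception):
    """Raised when a record breaks the contract; the pipeline should halt."""


@dataclass(frozen=True)
class Contract:
    name: str
    owner: str    # hypothetical owning team, named in every violation
    fields: dict  # field name -> exact Python type required

    def validate(self, record: dict) -> None:
        for field_name, expected_type in self.fields.items():
            if field_name not in record:
                raise ContractViolation(f"{self.name}: missing '{field_name}'")
            actual_type = type(record[field_name])
            if actual_type is not expected_type:
                raise ContractViolation(
                    f"{self.name}: '{field_name}' is {actual_type.__name__}, "
                    f"expected {expected_type.__name__} (owner: {self.owner})"
                )


def volume_dropped(recent_hourly_counts, current_count, min_ratio=0.5):
    """Return True when the current count is below min_ratio of the recent mean."""
    baseline = mean(recent_hourly_counts)
    return current_count < baseline * min_ratio


# Hypothetical contract for an orders feed.
ORDERS_CONTRACT = Contract(
    name="orders_v1",
    owner="checkout-team",
    fields={"order_id": int, "status": int},
)

ORDERS_CONTRACT.validate({"order_id": 1, "status": 2})  # passes silently
# A string status, e.g. {"order_id": 1, "status": "shipped"}, raises ContractViolation.

print(volume_dropped([10_000, 9_800, 10_200], 100))  # True
```

The type check is deliberately strict (`type(...) is not ...`) so that an upstream change surfaces as an exception at the boundary rather than as a quietly different number downstream.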

Good Fences Make Good Data Neighbors

Back in that conference room, the discrepancy between my 14.3% and the sales team's sub-10% wasn't because of a bug. It was because we had different definitions of "churn." My system, serving a financial planning use case, defined a customer as churned the day their subscription expired without renewal. The sales system, designed for tracking account manager performance, didn't count a customer as churned until a 30-day grace period had passed.

Both numbers were correct within their own contexts. The problem was that these contexts were implicit. A data contract would have made it explicit. The finance data model would have a field named `churn_date_financial`, governed by a contract that clearly defined the event. The sales model would have `churn_date_sales_ops`. The two would never be confused.
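
To make the two definitions concrete, here is a small sketch of how each field might be derived from the same subscription record. The function signatures, the `as_of` parameter, and the handling of late renewals are assumptions for illustration; only the field names and the 30-day grace period come from the story above.

```python
from datetime import date, timedelta
from typing import Optional

SALES_GRACE_PERIOD = timedelta(days=30)


def churn_date_financial(
    expires_on: date, renewed_on: Optional[date], as_of: date
) -> Optional[date]:
    """Finance view: churned on the expiry date if not renewed by then."""
    if as_of < expires_on:
        return None  # subscription has not expired yet
    if renewed_on is not None and renewed_on <= expires_on:
        return None  # renewed on time
    return expires_on


def churn_date_sales_ops(
    expires_on: date, renewed_on: Optional[date], as_of: date
) -> Optional[date]:
    """Sales view: churned only once the grace period ends without renewal."""
    grace_end = expires_on + SALES_GRACE_PERIOD
    if as_of < grace_end:
        return None  # still inside the grace period (assumed boundary rule)
    if renewed_on is not None and renewed_on <= grace_end:
        return None  # renewed before the grace period ran out
    return grace_end


expires = date(2024, 1, 31)
mid_february = date(2024, 2, 15)

# Same customer, same day, two legitimate answers.
print(churn_date_financial(expires, None, mid_february))  # 2024-01-31
print(churn_date_sales_ops(expires, None, mid_february))  # None
```

On any given reporting date, every customer sitting inside the grace period counts as churned in one model and not in the other, which is enough to produce two different churn rates from identical source data.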

This is the "socio-technical" part of the problem: data quality issues are often organizational boundary disputes disguised as technical problems. Organizations that prize a single source of truth may try to force one monolithic definition, which leads to endless political battles and to lost trust when no single number can serve every need. A better architectural pattern is to build good fences. Use contracts and clear naming conventions to let different contexts co-exist safely, making their assumptions explicit and their outputs unambiguous.

You Can't Test Quality In at the End

We learned decades ago in software that you can't write an entire application and then hand it to a QA team to "test the quality in." Quality is the result of continuous, automated checks throughout the development lifecycle. The same principle applies to data.

A post-mortem report on a data quality failure arrives too late; by then the wrong number has already been presented. The real work is in the automated checks that run every time the pipeline executes. This shift from reactive fixes to proactive monitoring and alerting is the core of data observability, a field increasingly championed by practitioners like Barr Moses and her team at Monte Carlo. It's about monitoring the health of your data, not just your infrastructure.

At Ingestion: Does the data match the source schema contract? Is it fresh?
After Staging: Are key fields populated? Are values within expected ranges? Does the distribution look normal?
After Transformation: Do the joins produce the expected number of rows? Does the business logic hold (e.g., `shipping_date` must be after `order_date`)?
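
The stage-by-stage questions above can be written as small functions that run on every pipeline execution. The sketch below uses plain Python and lists of dicts as a stand-in for real tables; the function names, the one-hour freshness window, and the `DataQualityError` type are assumptions for illustration, not the API of any particular tool.

```python
from datetime import date, datetime, timedelta


class DataQualityError(Exception):
    """Raised when one or more checks fail, stopping the run."""


def is_fresh(latest_loaded_at: datetime, now: datetime,
             max_age: timedelta = timedelta(hours=1)) -> bool:
    """Ingestion: the newest record is no older than max_age."""
    return now - latest_loaded_at <= max_age


def is_populated(rows: list, field: str) -> bool:
    """Staging: a key field is present and non-null on every row."""
    return all(row.get(field) is not None for row in rows)


def is_in_range(rows: list, field: str, low, high) -> bool:
    """Staging: every value lies within the expected bounds."""
    return all(low <= row[field] <= high for row in rows)


def join_preserved_rows(left_rows: list, joined_rows: list) -> bool:
    """Transformation: a one-to-one join neither drops nor duplicates rows."""
    return len(joined_rows) == len(left_rows)


def shipped_after_ordered(rows: list) -> bool:
    """Transformation: shipping_date must be after order_date, as stated above."""
    return all(row["shipping_date"] > row["order_date"] for row in rows)


def run_checks(results: dict) -> None:
    """Raise with the names of all failed checks; return quietly otherwise."""
    failed = [name for name, passed in results.items() if not passed]
    if failed:
        raise DataQualityError("failed checks: " + ", ".join(failed))


orders = [
    {"order_id": 1, "amount": 40.0,
     "order_date": date(2024, 3, 1), "shipping_date": date(2024, 3, 3)},
    {"order_id": 2, "amount": 15.5,
     "order_date": date(2024, 3, 2), "shipping_date": date(2024, 3, 4)},
]

run_checks({
    "order_id populated": is_populated(orders, "order_id"),
    "amount in range": is_in_range(orders, "amount", 0, 10_000),
    "shipped after ordered": shipped_after_ordered(orders),
})
```

Collecting every failure before raising means a single run reports all the broken assumptions at once, instead of revealing them one rerun at a time.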

Tools like dbt tests or libraries like Great Expectations make this declarative and routine. Every one of these checks is a small act of building confidence. When they all pass, the final number inherits that confidence. It becomes defensible.

The AI Connection and the Boring Truth

The rise of LLM agents and sophisticated AI systems doesn't diminish the importance of data quality; it amplifies it. An autonomous agent making decisions or generating insights is only as reliable as its underlying data. If a customer service agent is trained on flawed data, its responses will be flawed. If an automated trading agent bases its decisions on an incorrect market signal, the consequences can be catastrophic. The "garbage in, garbage out" principle holds more weight than ever.

Deterministic automation and agentic work must cooperate. The deterministic pipelines, with their rigorous data quality gates, provide the reliable, defensible context upon which agents can act. Without this foundational layer of trustworthy data, AI becomes a black box of unpredictable outcomes, impossible to audit or defend when challenged.
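
One way to picture that cooperation is a gate that refuses to hand a metric to an agent unless its checks passed, and that attaches the provenance needed to defend it. This is a hypothetical sketch: `DefensibleMetric`, `publish_for_agent`, and the table name `billing.subscriptions` are invented for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DefensibleMetric:
    name: str
    value: float
    definition: str        # the business rule, stated explicitly
    source_tables: tuple   # where the number came from
    checks_passed: tuple   # which quality gates it cleared


def publish_for_agent(name, value, definition, source_tables, check_results):
    """Return a metric with provenance, or refuse if any quality gate failed."""
    failed = [check for check, passed in check_results.items() if not passed]
    if failed:
        raise ValueError(f"{name} withheld from agent; failed: {', '.join(failed)}")
    return DefensibleMetric(
        name=name,
        value=value,
        definition=definition,
        source_tables=tuple(source_tables),
        checks_passed=tuple(check_results),
    )


metric = publish_for_agent(
    name="churn_rate_financial",
    value=14.3,
    definition="subscription expired without renewal",
    source_tables=["billing.subscriptions"],  # hypothetical table name
    check_results={"schema contract": True, "freshness": True},
)
print(metric.checks_passed)  # ('schema contract', 'freshness')
```

The agent never sees a bare number. Whatever it does with `metric.value`, the definition, sources, and passed checks travel with it, so the output can still be audited when someone asks where it came from.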

The Real Takeaway: Confidence, Not Perfection

I was able to defend my 14.3% that day. Not by arguing that my number was "right" and his was "wrong," but by walking him through the system that produced it. I showed him the exact source table in the production database, the explicit business rule in the transformation code that defined churn, and the test cases that verified it.

He didn't have to like the number, but he couldn't deny its process. We agreed to add a new metric to the finance dashboard that reflected the sales definition, giving both views visibility. The crisis became a collaboration.

The goal is never perfect data; such a thing doesn't exist. The goal is a system that produces numbers with a high, and provable, degree of confidence. This is the bedrock. Whether you're building a simple dashboard or a complex LLM-based agent, the quality of the output is bounded by the integrity of the data system beneath it. It’s unglamorous, painstaking work. But it's the only work that matters when the room goes quiet and all eyes turn to you.