The data contract that ended a long-running argument

Software

How a machine-readable data contract, enforced in CI, ended the chronic conflict between data producers and consumers by shifting accountability upstream.

I remember the weekly data quality meeting. It always had the same, sinking rhythm. The analytics team would present a broken dashboard, pointing to a column of null values that had appeared overnight. The application team, who produced the data, would counter that they’d only shipped a small feature. Fingers were pointed, Slack messages were screenshotted, and we’d end with a vague promise to “be more careful.”

The problem wasn't our people; it was our process. We were treating a systems problem as a communication problem. The real fix wasn't another meeting. It was to make the implicit agreement between our systems explicit and, most importantly, automatically enforceable.

[Figure: Shifting the Point of Failure. Code pushed to CI, contract validation fails, developer fixes before merge.]

The Anatomy of a Doomed Argument

The conflict was baked into our architecture. On one side, application engineers work to ship user-facing features. The data their service emits is often just exhaust—a side effect. They might change a field from an integer to a string to accommodate a new input, and from their perspective, it’s a trivial change.

On the other side are the data consumers: analytics, machine learning, and data science teams. For them, that stream isn't exhaust; it’s the primary input. A silent change from order_value as an integer to a string like "19.99" would break their ingestion pipeline or, worse, silently cast values to null, corrupting every downstream model and report.
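
To make the silent-null failure mode concrete, here is a minimal Python sketch. The `lenient_int` helper and the sample events are hypothetical; they stand in for any ingestion step that swallows cast errors instead of raising them.

```python
from typing import Any, Optional


def lenient_int(value: Any) -> Optional[int]:
    """Cast to int, returning None on failure, as a forgiving loader might."""
    try:
        return int(value)
    except (TypeError, ValueError):
        return None


# Hypothetical events before and after the producer's "trivial" change.
before = [{"order_value": 20}, {"order_value": 35}]
after = [{"order_value": "19.99"}, {"order_value": "35.00"}]

parsed_before = [lenient_int(e["order_value"]) for e in before]
parsed_after = [lenient_int(e["order_value"]) for e in after]

print(parsed_before)  # [20, 35]
print(parsed_after)   # [None, None]
```

Nothing crashes and nothing is logged, which is exactly why the problem surfaces as a dashboard full of nulls rather than as an error.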

The resulting arguments were unwinnable. Producers would say, "We documented the change in the wiki." Consumers would reply, "That document is six months out of date." There was no authoritative source of truth for both humans and machines. We were operating on handshake agreements, and handshakes don't prevent production outages at 3 a.m.

From Handshake to Enforced Contract

The turning point came when we decided to stop talking about the data's structure and start codifying it. We adopted the concept of a formal, machine-readable data contract, an idea that practitioners like Chad Sanderson have helped popularize. This wasn't a PDF. It was a YAML file that lived inside the source code repository of the data-producing service.

The schema inside that file followed JSON Schema, which can be written in YAML as readily as in JSON, and was similar in spirit to the Open Data Contract Standard (ODCS). That made it language-agnostic and easy to validate. This is a fundamentally deterministic pattern: the same event either passes or fails the same check, every time. That predictability is the bedrock for any advanced automation or agentic work. If your foundational data streams are brittle, any AI system you build on them will simply inherit that chaos.

By placing a data-contract.yml file in the producer's git repository, we established ownership. The contract evolved with the code, and any change to it was a formal, auditable commit. Data was no longer an afterthought; it was part of the service's explicit interface.
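
As an illustration of what such a file might hold, the sketch below shows a contract as the Python dict a YAML parser would produce from it. The event name, owner, version, and field list are all hypothetical; only the keywords under `schema` (`type`, `required`, `properties`, `minimum`, `additionalProperties`) are standard JSON Schema.

```python
from typing import Any, Dict

# Hypothetical contents of data-contract.yml, shown in parsed form.
ORDER_EVENT_CONTRACT: Dict[str, Any] = {
    "contract": "order_created",  # hypothetical event name
    "version": "1.0.0",           # hypothetical version
    "owner": "checkout-team",     # hypothetical owning team
    "schema": {
        "type": "object",
        "required": ["order_id", "user_id", "order_value"],
        "properties": {
            "order_id": {"type": "string"},
            "user_id": {"type": "string"},
            "order_value": {"type": "integer", "minimum": 0},
        },
        "additionalProperties": False,
    },
}


def field_types(contract: Dict[str, Any]) -> Dict[str, str]:
    """Summarize the declared type of each field in the contract."""
    properties = contract["schema"]["properties"]
    return {name: spec["type"] for name, spec in properties.items()}


print(field_types(ORDER_EVENT_CONTRACT))
```

Because the file sits next to the producer's code, a change to any of these fields shows up in the same diff as the code change that motivated it.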

The Contract's Teeth: The CI Pipeline

A contract without enforcement is just a suggestion. The real power came from wiring it into the producer's continuous integration (CI) pipeline. We added a mandatory step to their build process in GitLab CI: "Validate Contract."

In a new unit test, the application would generate a sample event. The validation step would then check that event against the contract's schema. If a developer changed a user_id to be an integer instead of a string, the test would fail. The build would turn red. The merge request would be blocked from merging.
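
The check described above can be sketched in a few lines of Python. This is not the validator from the original pipeline: it is a hand-rolled check covering only a small subset of JSON Schema (`required` and `type`), and the schema, `build_sample_event`, and `contract_violations` are hypothetical names. A real setup would more likely load the schema from data-contract.yml and use a full JSON Schema validator.

```python
from typing import Any, Dict, List

# Hypothetical schema; assumed to mirror what data-contract.yml declares.
SCHEMA: Dict[str, Any] = {
    "type": "object",
    "required": ["user_id", "order_value"],
    "properties": {
        "user_id": {"type": "string"},
        "order_value": {"type": "integer"},
    },
}

# Subset of JSON Schema type names mapped to Python types.
PY_TYPES = {"string": str, "integer": int, "number": (int, float), "boolean": bool}


def contract_violations(event: Dict[str, Any], schema: Dict[str, Any]) -> List[str]:
    """Return human-readable violations; an empty list means the event conforms."""
    errors: List[str] = []
    for name in schema.get("required", []):
        if name not in event:
            errors.append(f"missing required field: {name}")
    for name, spec in schema.get("properties", {}).items():
        if name not in event:
            continue
        value = event[name]
        # bool is a subclass of int in Python, so reject it for non-boolean fields.
        is_stray_bool = isinstance(value, bool) and spec["type"] != "boolean"
        if is_stray_bool or not isinstance(value, PY_TYPES[spec["type"]]):
            errors.append(f"{name}: expected {spec['type']}, got {type(value).__name__}")
    return errors


def build_sample_event() -> Dict[str, Any]:
    """Stand-in for the producer code that emits an event."""
    return {"user_id": "u-123", "order_value": 1999}


def test_sample_event_matches_contract() -> None:
    # This is the assertion that turns the build red when the shape drifts.
    assert contract_violations(build_sample_event(), SCHEMA) == []


test_sample_event_matches_contract()
print(contract_violations({"user_id": 123, "order_value": 1999}, SCHEMA))
# ['user_id: expected string, got int']
```

The useful property is that the error message names the field and the expected type, so the developer sees what to fix without having to ask the consuming team.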

The impact on our workflow was immediate. The argument was no longer between two teams in a meeting room; it was between a developer and their failing build. The feedback loop shrank from days to minutes. The debate was no longer about who was right, but about an objective, automated check. "It passed the contract validation" became the standard for shipping a change.

Why This Pattern Won Out

I've seen organizations try to solve this problem in other ways. They usually fall into two camps, both of which I find less effective.

One approach is heavyweight governance: a central committee that manually reviews all schema changes. This creates a bottleneck, slows everything down, and still relies on people not making mistakes. The other is to build hyper-resilient, defensive consumers that can handle any garbage data thrown at them. This creates massive duplication of effort and is purely reactive—it accepts that bad data will enter the system and tries to clean up the mess afterward.

Enforcing the contract at the source, in CI, is proactive. It is the cheapest and fastest place to catch an error. It treats the cause, not the symptom.

The Honest Trade-offs

This pattern doesn't solve everything. It adds friction to the development process, and it's important to be honest about that. To make a breaking change, a producer now had to version the contract and coordinate with consumers. This feels like a slowdown at first, but it's the planned, predictable "slowness" of engineering discipline, not the chaotic, unpredictable "slowness" of cleaning up a production incident.
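
One way to make "version the contract" mechanical is to compare the old and new schemas and demand a version bump when the difference could break a consumer. The sketch below assumes semantic-version strings and a major-version bump for breaking changes; neither detail comes from the original setup, and the function names are hypothetical.

```python
from typing import Any, Dict, List

Schema = Dict[str, Any]


def breaking_changes(old: Schema, new: Schema) -> List[str]:
    """List changes that could break a consumer reading the old shape."""
    problems: List[str] = []
    old_props = old.get("properties", {})
    new_props = new.get("properties", {})
    for name in sorted(old_props):
        if name not in new_props:
            problems.append(f"field removed: {name}")
        elif new_props[name].get("type") != old_props[name].get("type"):
            problems.append(f"type changed: {name}")
    # A field consumers could count on being present has become optional.
    dropped = set(old.get("required", [])) - set(new.get("required", []))
    for name in sorted(dropped):
        if name in new_props:
            problems.append(f"no longer required: {name}")
    return problems


def major(version: str) -> int:
    """Extract the major component of a 'MAJOR.MINOR.PATCH' string."""
    return int(version.split(".")[0])


def change_is_allowed(old: Schema, new: Schema, old_version: str, new_version: str) -> bool:
    """Allow non-breaking changes freely; breaking ones need a major bump."""
    return not breaking_changes(old, new) or major(new_version) > major(old_version)


v1 = {"required": ["order_value"], "properties": {"order_value": {"type": "integer"}}}
v2 = {"required": ["order_value"], "properties": {"order_value": {"type": "string"}}}

print(breaking_changes(v1, v2))                      # ['type changed: order_value']
print(change_is_allowed(v1, v2, "1.0.0", "1.1.0"))   # False
print(change_is_allowed(v1, v2, "1.0.0", "2.0.0"))   # True
```

Adding a new optional field passes without a major bump, so the friction lands only on changes that consumers actually need to hear about.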

It also validates only structure and simple value constraints, not semantics. The contract can ensure an order_total is a positive number, but it can't know if the business logic for calculating it is correct. It raises the quality floor, but it doesn't guarantee perfection.
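
A small Python sketch shows the gap. The `line_items` field and both helper functions are hypothetical; the point is only that an event can satisfy a "positive number" rule while still being wrong.

```python
from typing import Any, Dict


def passes_contract(event: Dict[str, Any]) -> bool:
    """Contract-style check: order_total must be a positive number."""
    total = event.get("order_total")
    is_number = isinstance(total, (int, float)) and not isinstance(total, bool)
    return is_number and total > 0


def total_is_consistent(event: Dict[str, Any]) -> bool:
    """Semantic check the contract cannot express: total equals the sum of items."""
    return event["order_total"] == sum(event["line_items"])


# Hypothetical event with a calculation bug: the items sum to 45, not 50.
event = {"order_total": 50, "line_items": [20, 25]}

print(passes_contract(event))      # True
print(total_is_consistent(event))  # False
```

Checks like the second one belong in the producer's own business-logic tests; the contract only guarantees that whatever number comes out has the agreed shape.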

[Figure: Contract-Enforced Data Architecture. Data sources (application events, database CDC, third-party APIs); ingestion and validation (event stream, contract validator in CI/CD, schema registry, dead letter queue); processing and storage (deterministic pipelines, agentic systems, data warehouse); serving layer (analytical APIs, BI dashboards, ML model training).]

The Real Win: Predictability at Scale

Despite the overhead, the stability we gained was worth it. The consumer teams could finally build with confidence. The endless, defensive code to handle unexpected data types began to disappear. They could trust the data's shape.

More importantly, it shifted the producer teams' culture. Data became a first-class product with a formal, versioned API they were responsible for. This mindset, treating data with the same respect as a REST API, is a core tenet of architectural patterns like Data Mesh. It's the boring, essential work that makes the exciting work—like building reliable agentic systems—possible.

The long-running argument didn't end because one team won. It ended because we replaced the argument with an automated system. We took an ambiguous, high-conflict process and made it a clear, low-friction, objective one. In my experience, that's as close to a win as you can get in enterprise architecture.

Concrete Takeaways

  • Codify your interfaces. Don't rely on documentation or meetings. If two systems communicate, their agreement should be a machine-readable artifact in source control.
  • Enforce contracts at the source. The cost of catching a breaking change is lowest in the producer's CI pipeline, before code is ever merged.
  • Shift the conversation from people to process. A failing build is objective evidence. It removes blame and focuses effort on the fix, not the fault.
  • Accept the friction as the cost of stability. The upfront work of contract negotiation is far cheaper than a production outage and the erosion of trust between teams.
Juan Cardena
Enterprise Architect, Data & AI

Enterprise architect with 25 years across web, software, data, and AI. MIT CDAO ’25. Writing on agentic AI in production.