The first time I unified data from systems that disagreed

The printout felt hot in my hand. On the left, data from our billing system. On the right, the customer record from our CRM. They both described the same company, a multi-million-dollar account, but they listed different corporate headquarters. My first thought was simple: which one is wrong?

That question sent me down a three-month rabbit hole that permanently changed how I think about data architecture. The answer, it turned out, was that neither was wrong. They were both right. And that was a much bigger problem.

The Myth of the Single Source of Truth

We're often taught to pursue a "Single Source of Truth" (SSOT). The goal is noble: find the authoritative source, cleanse duplicates, and forge a golden record for clarity. For some data, this is absolutely correct. There should only be one official quarterly revenue number for SEC filings. But for operational data in a complex enterprise, the SSOT is often a dangerous oversimplification.

My project was to build a unified customer view for a large B2B company. The relationship with any single customer was managed by at least three departments, each with its own valid model of that customer:

Finance cared about the legal entity for invoicing. Their system held the tax-registered address.
Logistics cared about the physical warehouse for deliveries. Their system had the loading dock address, which might change quarterly.
Sales cared about the regional office their contacts worked in. Their CRM held that specific office park address.

Forcing these three legitimate, purposeful, and conflicting addresses into a single address field wasn't unifying data; it was destroying valuable context.
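To make that loss concrete, here is a minimal Python sketch of a naive "last write wins" merge. The record shapes, field names, and addresses are invented for illustration; they are not the real systems' schemas.

```python
# Hypothetical records from the three systems, all describing the same customer.
billing = {"customer_id": "cust-123", "address": "1 Registered Office Way"}
logistics = {"customer_id": "cust-123", "address": "Dock 4, 200 Warehouse Rd"}
crm = {"customer_id": "cust-123", "address": "Suite 12, 9 Office Park Dr"}


def naive_golden_record(*records):
    """Collapse records into one flat dict; later records overwrite earlier ones."""
    merged = {}
    for record in records:
        merged.update(record)
    return merged


sources = (billing, logistics, crm)
golden = naive_golden_record(*sources)

# Whichever system happened to be merged last "wins" the single address field.
surviving = golden["address"]
lost = [r["address"] for r in sources if r["address"] != surviving]

print("kept:", surviving)
print("silently dropped:", lost)
```

Two of the three addresses vanish without any error, and which one survives depends only on merge order, not on what the consumer needs it for.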

An Architecture of Context

The breakthrough came when we stopped asking "Which address is true?" and started asking "True for what purpose?" We abandoned the single, flat customer record. Instead, we designed a model that made context a first-class citizen.

The core was a stripped-down, immutable CustomerIdentity entity. It contained only universally true information: a unique internal ID, the legal company name, and a creation timestamp. Linked to this identity were multiple CustomerContextProfile records, one for each source system's view. A profile contained the source data and, crucially, metadata about its purpose, like INVOICING, SHIPPING, or SALES_OUTREACH.

This wasn't a golden record. It was a federation of truths, held together by a common identity but never collapsed into a single, compromised version.
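A rough sketch of that shape in Python, using frozen dataclasses. Only the two entity names and the three purpose tags come from the design described above; the individual field names (source_system, address, and so on) and the sample values are assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class Purpose(Enum):
    INVOICING = "INVOICING"
    SHIPPING = "SHIPPING"
    SALES_OUTREACH = "SALES_OUTREACH"


@dataclass(frozen=True)  # frozen: attributes cannot be reassigned after creation
class CustomerIdentity:
    customer_id: str
    legal_name: str
    created_at: datetime


@dataclass(frozen=True)
class CustomerContextProfile:
    customer_id: str      # link back to the shared identity
    purpose: Purpose      # what this view of the customer is valid for
    source_system: str    # assumed field: which system supplied the data
    address: str          # assumed field: the source's own address


identity = CustomerIdentity(
    customer_id="cust-123",
    legal_name="Example Corp",
    created_at=datetime(2020, 1, 1, tzinfo=timezone.utc),
)

# One profile per source system; none of them overwrites another.
profiles = [
    CustomerContextProfile("cust-123", Purpose.INVOICING, "billing", "1 Registered Office Way"),
    CustomerContextProfile("cust-123", Purpose.SHIPPING, "logistics", "Dock 4, 200 Warehouse Rd"),
    CustomerContextProfile("cust-123", Purpose.SALES_OUTREACH, "crm", "Suite 12, 9 Office Park Dr"),
]
```

The identity carries nothing that the departments could disagree about, so all three addresses can coexist, each labeled with the purpose it serves.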

A Simple, Deterministic Router

With this data model, the integration became surprisingly simple and robust. We built a Customer Data API that refused to return "the customer." Instead, the endpoint required a purpose parameter in the request.

A call looked like: GET /api/customer/cust-123?purpose=INVOICING

The API's logic wasn't a complex system trying to infer intent. It was a dead-simple, deterministic router. It received the request, found the CustomerIdentity, looked up the CustomerContextProfile that matched the requested purpose, and returned that specific record. The technical solution was simple. The hard part was getting the three department heads to agree on the official purpose tags, a negotiation that took weeks.

The final system was boring, and it worked flawlessly at 3 AM. We weren't guessing the truth; we built a pipeline to deliver the correct truth for a given task.
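The routing logic can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the production code: the in-memory PROFILES table, the error class names, the handle_get helper, and the sample data are all hypothetical. Only the URL shape and the purpose tags come from the description above.

```python
from enum import Enum
from typing import Optional
from urllib.parse import parse_qs, urlsplit


class Purpose(Enum):
    INVOICING = "INVOICING"
    SHIPPING = "SHIPPING"
    SALES_OUTREACH = "SALES_OUTREACH"


class UnknownPurposeError(ValueError):
    """The request named no purpose, or one that was never agreed on."""


class ProfileNotFoundError(LookupError):
    """No profile exists for this customer and purpose."""


# Hypothetical stand-in for the profile store, keyed by (customer, purpose).
PROFILES = {
    ("cust-123", Purpose.INVOICING): {"source": "billing", "address": "1 Registered Office Way"},
    ("cust-123", Purpose.SHIPPING): {"source": "logistics", "address": "Dock 4, 200 Warehouse Rd"},
    ("cust-123", Purpose.SALES_OUTREACH): {"source": "crm", "address": "Suite 12, 9 Office Park Dr"},
}


def route_customer_request(customer_id: str, purpose: Optional[str]) -> dict:
    """Return the profile for exactly the requested purpose, or fail loudly."""
    if purpose is None:
        raise UnknownPurposeError("a purpose parameter is required")
    try:
        tag = Purpose(purpose)
    except ValueError:
        raise UnknownPurposeError(f"unknown purpose: {purpose!r}") from None
    try:
        return PROFILES[(customer_id, tag)]
    except KeyError:
        raise ProfileNotFoundError(f"no {tag.value} profile for {customer_id}") from None


def handle_get(url: str) -> dict:
    """Parse a URL like /api/customer/cust-123?purpose=INVOICING and route it."""
    parts = urlsplit(url)
    customer_id = parts.path.rsplit("/", 1)[-1]
    purpose = parse_qs(parts.query).get("purpose", [None])[0]
    return route_customer_request(customer_id, purpose)


print(handle_get("/api/customer/cust-123?purpose=INVOICING"))
```

There is no fallback and no default purpose in this sketch. A request that omits the purpose or asks for an unknown one gets an error rather than a best guess, which is the property that keeps the behavior predictable.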

The Real Lesson: Context is King

This experience taught me that the hardest part of data architecture isn't technology; it's epistemology. This is the core challenge that Domain-Driven Design addresses. As Eric Evans describes, each of those departments was operating in its own "Bounded Context," with its own "Ubiquitous Language." Our mistake was trying to force one context's model onto the others. Martin Fowler has written a great primer on the idea.

This principle also animates modern architectures like the Data Mesh, where Zhamak Dehghani argues for treating data as a product owned by its domain. Instead of a central team creating one truth, each domain publishes its own truths for others to consume. Her original article is also on Fowler's site.

This pattern becomes critical when building with LLM agents. An agent tasked with sending a legal notice needs 100% certainty it has the legal address, not the loading dock. Feeding an agent a flattened "golden record" without context is begging for expensive, real-world failures.

Concrete Takeaways

These principles have held up for years, from data warehousing to building RAG systems for LLMs today.

Question the Golden Record. Before merging data, ask if you are destroying context. A "duplicate" might be a different, valid perspective. But acknowledge that for some data, like officially reported financial figures, a canonical record is non-negotiable.
Model Context Explicitly. Make "purpose" or "domain" a first-class field in your data model. Build your architecture around it.
Favor Deterministic Systems. When contexts are clear, you don't need fuzzy logic. Simple, rule-based routing is more reliable, observable, and cheaper to run than any "smart" system trying to guess intent.
Architects as Cartographers. Our job isn't to be philosopher-kings, decreeing the one true record. It's to be cartographers, mapping the different business realities and making them legible.