
The first system I designed that outlived its original purpose

Software

Discover the architectural pattern that allowed a data integration system to outlive its purpose and how it applies to building reliable AI agentic systems today.

Most code I’ve written is gone, and that’s a good thing. Software is a tool for a specific time and problem, not a monument. But I was recently reminded of a system I designed over a decade ago and discovered it was not only still running but doing a job we never imagined for it. The feeling wasn't nostalgia. It was the quiet satisfaction of seeing an architectural bet pay off long after the original players had left the table.

The system outlived its purpose. It survived not because it was clever, but because it was simple, decoupled, and built around a principle more relevant today than ever: stable contracts between unstable components.

The Messy Inbox Problem

The original mission was painfully mundane. I was leading a project for a large retailer that needed to ingest product catalog data from dozens of suppliers. Each had their own format: CSVs from an FTP server, esoteric XML dialects, weird fixed-width files. The business logic for each was a hornet's nest of exceptions. "If the supplier is X, the 'price' column is the wholesale price, unless the 'category' is Y." It was a classic data integration nightmare.

The pressure was to just write a script for Supplier A, then copy, paste, and modify it for Supplier B. It would have worked, for a while. But it would have created a tangled mess of one-off jobs, each a liability. That approach meant we'd need a new developer for every five suppliers just to keep the lights on. I argued for a different path, one that felt slower at the start: building a generic ingestion engine, not a collection of custom scripts.

The Canonical Model Bet

The core of the design was a non-negotiable, internal data model. My team called it the Canonical Product Record. It was a strict, idealized representation of what "a product" meant to our business, regardless of how any single supplier saw it. This was our source of truth.
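To make the idea concrete, here is a minimal Python sketch of what such a record could look like. It is not the original implementation: the field names and validation rules are assumptions chosen for illustration. The point is only that the record refuses to exist unless it meets the contract.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class CanonicalProductRecord:
    """Hypothetical canonical record; every field name here is illustrative."""
    sku: str
    name: str
    category: str
    retail_price: Decimal  # always the retail price, whatever the supplier sent
    currency: str
    supplier_id: str

    def __post_init__(self) -> None:
        # Enforce the contract at construction time, so an invalid
        # record can never reach the queue.
        if not self.sku.strip():
            raise ValueError("sku must be non-empty")
        if self.retail_price < 0:
            raise ValueError("retail_price must be non-negative")
        if len(self.currency) != 3 or not self.currency.isupper():
            raise ValueError("currency must be a 3-letter uppercase code")


record = CanonicalProductRecord(
    sku="ABC-123",
    name="Widget",
    category="tools",
    retail_price=Decimal("9.99"),
    currency="USD",
    supplier_id="supplier-x",
)
```

Making the record immutable is a design choice in this sketch: once an adapter has produced a record, nothing downstream can quietly reshape it.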

The system was then composed of two distinct parts, separated by a message queue:

  1. Adapters: A set of small, isolated services, one for each supplier. An adapter’s only job was to receive a supplier's messy file, translate it into a perfect Canonical Product Record, and put that record onto the queue. In hindsight, these were a textbook example of what Eric Evans describes as an Anti-Corruption Layer in Domain-Driven Design, protecting our core model.
  2. The Core Processor: A single, stable service that pulled clean records from the queue. It applied universal business logic and loaded data into the main e-commerce platform, never needing to know about supplier-specific exceptions.
Figure: Original Ingestion Flow (Supplier Data → Adapter → Message Queue → Core Processor → E-commerce DB)
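The split between adapters and the core processor can be sketched in a few lines of Python. This is a single-process illustration under stated assumptions, not the production design: `queue.Queue` stands in for the real message queue, a dict stands in for the e-commerce database, and the CSV column names and the 1.5 markup rule for "Supplier X" are invented for the example.

```python
import csv
import io
import queue
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class CanonicalProductRecord:
    sku: str
    name: str
    retail_price: Decimal


def supplier_x_adapter(raw_csv: str, out: queue.Queue) -> None:
    """Hypothetical adapter: Supplier X sends wholesale prices in a CSV."""
    markup = Decimal("1.5")  # assumed supplier-specific rule; it lives only here
    for row in csv.DictReader(io.StringIO(raw_csv)):
        out.put(CanonicalProductRecord(
            sku=row["item_no"].strip(),
            name=row["desc"].strip(),
            retail_price=Decimal(row["price"]) * markup,
        ))


def core_processor(inbox: queue.Queue, catalog: dict) -> None:
    """Drains the queue; knows nothing about supplier formats."""
    while True:
        try:
            record = inbox.get_nowait()
        except queue.Empty:
            break
        catalog[record.sku] = record  # stand-in for the e-commerce DB load


bus: queue.Queue = queue.Queue()
catalog: dict = {}
supplier_x_adapter("item_no,desc,price\nABC-123,Widget,10.00\n", bus)
core_processor(bus, catalog)
```

Onboarding another supplier in this sketch means writing another adapter function; `core_processor` stays untouched.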

This pattern, which Gregor Hohpe and Bobby Woolf codified as the Canonical Data Model in their book Enterprise Integration Patterns, required upfront discipline. It felt abstract and unnecessarily formal at first. To be honest, for a tiny startup with only two data sources, it would have been over-engineering. But for the scale we were facing, it was a bet on managing future complexity.

The First Unexpected Pivot

About two years later, the business launched a marketplace for third-party sellers. Suddenly, we needed to onboard hundreds of smaller sellers with far lower data quality and even more unpredictable formats. This would have been a crisis for a system of one-off scripts.

For the ingestion engine, it was just another day. The core processor didn't change at all. The Canonical Product Record remained our stable contract. All the team had to do was write new adapters for the new seller formats. The system scaled to meet a business need we never anticipated because its components were loosely coupled.

The message queue between the adapters and the processor also proved to be a critical shock absorber. When a new, buggy seller adapter started sending malformed data, we could just shut that one adapter off without affecting the flow of data from any other source. Resilience was a side effect of the design.
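A rough sketch of that isolation, assuming a single-process runner rather than the separate services we actually had: each adapter is invoked independently, can be switched off by name, and publishes nothing if it fails partway through. The function name `run_adapters` and the status strings are hypothetical.

```python
import queue
from typing import Callable, Dict, Iterable, Set

Adapter = Callable[[], Iterable[dict]]


def run_adapters(
    adapters: Dict[str, Adapter], disabled: Set[str], bus: queue.Queue
) -> Dict[str, str]:
    """Run every enabled adapter; one bad source never blocks the others."""
    status: Dict[str, str] = {}
    for source, adapter in adapters.items():
        if source in disabled:
            status[source] = "disabled"
            continue
        try:
            # Materialize first so a failure mid-file publishes nothing.
            records = list(adapter())
        except Exception as exc:
            status[source] = f"failed: {exc}"
            continue
        for record in records:
            bus.put(record)
        status[source] = "ok"
    return status


def good_seller() -> Iterable[dict]:
    return [{"sku": "A-1"}]


def buggy_seller() -> Iterable[dict]:
    raise RuntimeError("malformed file")


def paused_seller() -> Iterable[dict]:
    return [{"sku": "Z-9"}]


bus: queue.Queue = queue.Queue()
status = run_adapters(
    {"good": good_seller, "buggy": buggy_seller, "paused": paused_seller},
    disabled={"paused"},
    bus=bus,
)
```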

A Second Life as a Data API

The truly surprising evolution came years later. A new data science team needed a clean feed of all product data for a recommendation engine. Then, a finance team needed to run analytics on product profitability. Instead of building new pipelines from the original messy sources, they all plugged into the system we'd built.

They added new consumers to the message queue that was already carrying the stream of Canonical Product Records. The system I designed for one-way data ingestion had, without any changes to its core, become a de facto event bus and source-of-truth API for product information. The most valuable asset we created wasn't the code; it was the stable, trusted data contract at its center.
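The fan-out can be illustrated with a tiny in-process publish/subscribe class. This is a sketch, not how a real broker works: an actual message bus would give each consumer its own subscription and delivery guarantees. The `RecordStream` class, the consumer functions, and the `price_cents`/`cost_cents` fields are all hypothetical.

```python
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Consumer = Callable[[Record], None]


class RecordStream:
    """In-process stand-in for the queue; every subscriber sees every record."""

    def __init__(self) -> None:
        self._consumers: List[Consumer] = []

    def subscribe(self, consumer: Consumer) -> None:
        self._consumers.append(consumer)

    def publish(self, record: Record) -> None:
        for consumer in self._consumers:
            consumer(record)


catalog: Dict[str, Record] = {}
recommender_skus: List[str] = []
margin_cents: Dict[str, int] = {}


def load_catalog(record: Record) -> None:
    """Original consumer: loads the e-commerce catalog."""
    catalog[record["sku"]] = record


def feed_recommender(record: Record) -> None:
    """Later consumer: the data science feed."""
    recommender_skus.append(record["sku"])


def track_margin(record: Record) -> None:
    """Later consumer: finance profitability analytics."""
    margin_cents[record["sku"]] = record["price_cents"] - record["cost_cents"]


stream = RecordStream()
for consumer in (load_catalog, feed_recommender, track_margin):
    stream.subscribe(consumer)

stream.publish({"sku": "ABC-123", "price_cents": 1500, "cost_cents": 1000})
```

Neither the publisher nor the first consumer changes when the second and third are added, which is the property that let new teams plug in.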

Durability in the Age of Agents

This story isn't about a specific technology; the original tools are obsolete. The lesson is in the pattern. I wasn't smart enough to predict the company's future, but my team was disciplined enough not to hardcode its present into the architecture.

I see this same pattern as the primary challenge in building agentic AI systems today. An LLM agent is, by nature, an unpredictable, non-deterministic component. It’s like an infinitely variable data supplier. The key to building reliable systems that use them is to create the same architectural separation. We need a deterministic, stable core—the business logic, the databases, the APIs that don't fail—and a loosely coupled layer of adapters that mediate the chaos of the agentic fringe.

The contract between them becomes a kind of "Canonical Action Record." Instead of a product, it’s a request for a well-defined action. Modern LLM APIs already support this through features like OpenAI's function calling, which has the model return a structured JSON object shaped by a schema you declare. That object is the contract. It might look like {"action": "update_inventory", "params": {"sku": "ABC-123", "quantity": 50}, "trace_id": "agent-run-xyz"}. The agent's job is just to produce that record; the deterministic core's job is to validate it and execute it safely.
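Here is a minimal sketch of the deterministic side of that contract, using only the Python standard library. It assumes the agent's output arrives as a raw JSON string in the shape shown above. The names `parse_action`, `execute`, and `HANDLERS`, along with the specific validation rules, are illustrative assumptions and not part of any vendor API.

```python
import json
from dataclasses import dataclass
from typing import Any, Callable, Dict


@dataclass(frozen=True)
class CanonicalActionRecord:
    action: str
    params: Dict[str, Any]
    trace_id: str


def parse_action(raw: str) -> CanonicalActionRecord:
    """Turn untrusted agent output into a record, or raise ValueError."""
    data = json.loads(raw)  # json.JSONDecodeError is a ValueError subclass
    if not isinstance(data, dict) or set(data) != {"action", "params", "trace_id"}:
        raise ValueError("record must have exactly: action, params, trace_id")
    if not isinstance(data["action"], str) or not isinstance(data["trace_id"], str):
        raise ValueError("action and trace_id must be strings")
    if not isinstance(data["params"], dict):
        raise ValueError("params must be an object")
    return CanonicalActionRecord(data["action"], data["params"], data["trace_id"])


inventory: Dict[str, int] = {}  # stand-in for the real inventory store


def update_inventory(params: Dict[str, Any]) -> None:
    sku, quantity = params.get("sku"), params.get("quantity")
    if not isinstance(sku, str) or not sku:
        raise ValueError("sku must be a non-empty string")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(quantity, bool) or not isinstance(quantity, int) or quantity < 0:
        raise ValueError("quantity must be a non-negative integer")
    inventory[sku] = quantity


# Allowlist: the agent can only request actions the core has registered.
HANDLERS: Dict[str, Callable[[Dict[str, Any]], None]] = {
    "update_inventory": update_inventory,
}


def execute(raw: str) -> str:
    """Validate, dispatch, and return the trace_id for auditing."""
    record = parse_action(raw)
    handler = HANDLERS.get(record.action)
    if handler is None:
        raise ValueError(f"unknown action: {record.action}")
    handler(record.params)
    return record.trace_id


trace = execute(
    '{"action": "update_inventory", '
    '"params": {"sku": "ABC-123", "quantity": 50}, '
    '"trace_id": "agent-run-xyz"}'
)
```

The allowlist is the important design choice here: anything the agent invents that is not a registered action is rejected before it touches business logic.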

Figure: Durable Architecture for Hybrid Systems
  • Volatile & agentic sources: External APIs, User Prompts, Unstructured Files, LLM Agents
  • Decoupling & contracts: Adapters (ACL), Canonical Models, Message Bus, Function Calling
  • Deterministic core: Business Logic, Databases, Core APIs, Validation Rules
  • Consumers & outputs: Applications, Analytics, Monitoring, User Interfaces

The goal is to design systems where one part can be fluid, experimental, and intelligent, while the other remains boring, reliable, and correct. That's how you build something that might just outlive its purpose.

Key Takeaways

  • Isolate volatility. Your greatest architectural leverage is the boundary between what you control (your core logic) and what you don't (external data, user input, LLM agents).
  • Define the stable contract. A canonical data model or action model is the non-negotiable interface at that boundary. It is the most valuable asset you will build.
  • Apply the pattern to AI. Treat an LLM agent as an unpredictable "adapter" that produces a canonical action request. Your deterministic core then executes that request safely.
  • Acknowledge the upfront cost. This approach is slower initially than writing a one-off script, but it pays dividends in resilience, scalability, and adaptability to futures you can't predict.
Juan Cardena
Enterprise Architect, Data & AI

Enterprise architect with 25 years across web, software, data, and AI. MIT CDAO ’25. Writing on agentic AI in production.