The first time a stranger emailed to say my site helped them

Juan Cardena discusses a real-world problem of fragile state management between LLM agents and deterministic pipelines, and how a classic event-log pattern provided a resilient solution, emphasizing d

Every so often, an email cuts through the noise. Not a marketing pitch, not a notification, but a genuine, human-to-human message. I remember receiving one recently, titled simply "A quick thank you." It detailed how an old post of mine had helped a data architect on the other side of the world solve a persistent production bug. The surprising part wasn't just the gratitude, but the enduring relevance of the solution: not a shiny new tool, but a boring, durable architectural pattern.

The Appeal of Direct State Transfer

The architect described a common challenge: orchestrating deterministic data pipelines with an LLM-based agent. Their system had a rules-based workflow that prepared a dataset, handed it to an agent for complex summarization and entity extraction, and then expected the agent to pass control and mutated data back to the deterministic pipeline for validation and storage.

The problem was state. The workflow engine would serialize a massive JSON object representing the job's state and pass it to the agent. The agent, in turn, was expected to modify that object and return it. This direct, RPC-style state transfer felt clean and simple on the whiteboard—a straightforward function call. For trivial cases, it often works. It minimizes intermediate components and can offer lower latency in single request-response flows.

But in production, it was brittle. The agent would occasionally drop a field, malform the JSON, or return a state that the rigid deterministic process couldn't parse. Jobs would fail without a clear audit trail inside the agent's black box, and retrying them was a gamble. The elegant direct call was a single point of failure and opacity.
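
To make those failure modes concrete, here is a minimal sketch of the check the deterministic side ends up needing when it receives the agent's returned state. The field names in REQUIRED_FIELDS and the function name check_agent_reply are hypothetical, chosen only for illustration; the architect's actual schema was not shared.

```python
import json

# Hypothetical schema: the fields the deterministic pipeline expects back.
REQUIRED_FIELDS = {"job_id", "records", "summary"}


def check_agent_reply(raw_reply: str) -> list[str]:
    """Return a list of problems found in the state the agent handed back."""
    try:
        state = json.loads(raw_reply)
    except json.JSONDecodeError as exc:
        # Failure mode 1: the agent produced text that is not valid JSON.
        return [f"malformed JSON: {exc.msg}"]
    if not isinstance(state, dict):
        return ["reply is not a JSON object"]
    # Failure mode 2: the agent silently dropped a field.
    missing = sorted(REQUIRED_FIELDS - set(state))
    return [f"missing field: {name}" for name in missing]


if __name__ == "__main__":
    print(check_agent_reply('{"job_id": 1, "records": []}'))
    print(check_agent_reply('{"job_id": 1,'))
```

Even with a check like this, a direct call only tells you that the reply was bad. Nothing records what was sent or what came back, which is the gap the rest of this post is about.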

[Figure: Fragile Direct State Transfer. Diagram nodes: Pipeline Prepares, Serializes State, JSON, Agent Mutates, Agent Returns, Corrupt, Pipeline Fails.]

A Durable Pattern for State Management

The post that helped him wasn't about AI at all. It described a pattern I'd used years ago for integrating legacy batch systems, rooted in principles formalized decades ago in works like Gregor Hohpe and Bobby Woolf's Enterprise Integration Patterns. The core idea was simple: stop passing a giant, mutable state object directly. Instead, treat state transfer as a series of immutable facts written to a log, a concept elegantly explored in Martin Kleppmann's Designing Data-Intensive Applications.

This means System A doesn't "call" System B and expect a direct mutation. It writes an event—TASK_REQUESTED—with all necessary data to an append-only log. System B (the LLM agent) subscribes to that event, processes it, and when finished, writes its own event—TASK_COMPLETED or TASK_FAILED—with the results as its payload. The original system listens for the response event and proceeds.
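
The exchange described above can be sketched in a few lines. This is an in-memory illustration under stated assumptions, not the implementation from the original post: the Event and EventLog classes, the task_id field, and the agent_step function are hypothetical names, and a real log would be durable storage rather than a Python list.

```python
from dataclasses import dataclass
from typing import Any, Callable, Mapping


@dataclass(frozen=True)
class Event:
    # frozen=True blocks attribute reassignment; the payload dict itself
    # is not deep-frozen in this sketch.
    event_type: str          # "TASK_REQUESTED", "TASK_COMPLETED", "TASK_FAILED"
    task_id: str             # hypothetical field tying a reply to its request
    payload: Mapping[str, Any]


class EventLog:
    """Append-only log: events can be added and read, never edited or removed."""

    def __init__(self) -> None:
        self._events: list[Event] = []

    def append(self, event: Event) -> int:
        self._events.append(event)
        return len(self._events) - 1  # offset of the new event

    def read_from(self, offset: int) -> list[Event]:
        return self._events[offset:]  # a copy, so callers cannot alter the log


def agent_step(log: EventLog, offset: int,
               work: Callable[[Mapping[str, Any]], Any]) -> int:
    """System B: handle requests found at or after `offset`; return the next offset."""
    events = log.read_from(offset)
    for event in events:
        if event.event_type != "TASK_REQUESTED":
            continue
        try:
            result = work(event.payload)
            log.append(Event("TASK_COMPLETED", event.task_id, {"result": result}))
        except Exception as exc:
            log.append(Event("TASK_FAILED", event.task_id, {"error": str(exc)}))
    return offset + len(events)
```

System A never calls System B here. It appends a TASK_REQUESTED event and later reads the log for a TASK_COMPLETED or TASK_FAILED event carrying the same task_id.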

The two systems don't need to know anything about each other's internal state beyond the event contract. The log becomes the single source of truth and the contract. The architect realized his "deterministic pipeline" and "LLM agent" were just modern incarnations of System A and System B, and this pattern applied directly.

From Fragility to Resilience

He implemented the change. The pipeline now writes a job request to a lightweight log (a simple database table used as a queue). The agent polls for new records, processes them, and writes its output back as a new, separate record linked by a correlation ID. The main pipeline then picks up the result. This shift from mutable state to immutable facts is fundamental to building resilient distributed systems, and it echoes the case for idempotent, retry-tolerant messaging that Pat Helland made in his paper "Life beyond Distributed Transactions."
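
A table-as-queue setup along these lines might look like the sketch below. It uses SQLite from the Python standard library purely for illustration; the architect's actual database, table layout, and column names were not shared, so job_log, kind, and every function name here are assumptions.

```python
import json
import sqlite3
import uuid

# Hypothetical table: one row per immutable fact, never updated in place.
SCHEMA = """
CREATE TABLE IF NOT EXISTS job_log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    correlation_id TEXT NOT NULL,
    kind TEXT NOT NULL CHECK (kind IN ('request', 'result')),
    payload TEXT NOT NULL
)
"""


def open_log(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def _insert(conn: sqlite3.Connection, correlation_id: str, kind: str, payload: dict) -> None:
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT INTO job_log (correlation_id, kind, payload) VALUES (?, ?, ?)",
            (correlation_id, kind, json.dumps(payload)),
        )


def write_request(conn: sqlite3.Connection, payload: dict) -> str:
    """Pipeline side: record a job request and return its correlation ID."""
    correlation_id = str(uuid.uuid4())
    _insert(conn, correlation_id, "request", payload)
    return correlation_id


def poll_pending(conn: sqlite3.Connection) -> list[tuple[str, dict]]:
    """Agent side: requests that have no result row yet, oldest first."""
    rows = conn.execute(
        """
        SELECT r.correlation_id, r.payload FROM job_log AS r
        WHERE r.kind = 'request'
          AND NOT EXISTS (
              SELECT 1 FROM job_log AS d
              WHERE d.kind = 'result' AND d.correlation_id = r.correlation_id
          )
        ORDER BY r.id
        """
    ).fetchall()
    return [(cid, json.loads(body)) for cid, body in rows]


def write_result(conn: sqlite3.Connection, correlation_id: str, payload: dict) -> None:
    """Agent side: record output as a new row; the request row is untouched."""
    _insert(conn, correlation_id, "result", payload)


def read_result(conn: sqlite3.Connection, correlation_id: str):
    """Pipeline side: the result for a request, or None if not ready."""
    row = conn.execute(
        "SELECT payload FROM job_log WHERE kind = 'result' AND correlation_id = ? "
        "ORDER BY id LIMIT 1",
        (correlation_id,),
    ).fetchone()
    return json.loads(row[0]) if row else None
```

The design choice worth noticing is that nothing is ever updated. A request is "done" only because a result row with the same correlation ID exists, so the original input stays available for inspection or retry.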

The benefits were immediate and precisely what the pattern is designed for:

  • Enhanced Resilience: If the agent fails, the input event is still in the log. The job can be retried from that recorded request without losing data, as long as the consumer treats a repeated request idempotently.
  • Improved Observability: They now had an immutable audit trail of every state transition. Debugging became far easier because they could see the exact data that went into the agent and the exact, raw data that came out.
  • Decoupled Evolution: The agent and pipeline could now be updated independently. As long as the event schema remained consistent, the other side didn't care about internal changes.
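
The first two benefits can be sketched together. The example below is a simplified, in-memory illustration with hypothetical names (attempt_task, run_with_retries, the required_keys parameter); it assumes the agent returns a raw JSON string and that a plain Python list stands in for the durable log.

```python
import json


def already_completed(log: list, task_id: str) -> bool:
    return any(e["type"] == "TASK_COMPLETED" and e["task_id"] == task_id for e in log)


def attempt_task(log: list, task_id: str, request: dict, agent, required_keys) -> bool:
    """Run one attempt and append its outcome. Returns True once the task is done."""
    if already_completed(log, task_id):
        return True  # idempotent: finished work is never redone
    raw = agent(request)  # assumed to return a raw JSON string
    try:
        parsed = json.loads(raw)
        if not isinstance(parsed, dict):
            raise ValueError("reply is not a JSON object")
        missing = sorted(set(required_keys) - set(parsed))
        if missing:
            raise ValueError(f"missing keys: {missing}")
    except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
        # Keep the raw output so the failure can be inspected later.
        log.append({"type": "TASK_FAILED", "task_id": task_id,
                    "raw": raw, "error": str(exc)})
        return False
    log.append({"type": "TASK_COMPLETED", "task_id": task_id,
                "raw": raw, "result": parsed})
    return True


def run_with_retries(log: list, task_id: str, request: dict, agent,
                     required_keys, max_attempts: int = 3) -> bool:
    """Retry from the same unchanged request until success or attempts run out."""
    for _ in range(max_attempts):
        if attempt_task(log, task_id, request, agent, required_keys):
            return True
    return False
```

Because every attempt appends a record with the raw agent output, the log doubles as the audit trail: a failed job leaves behind exactly what the agent returned each time, and the request itself is never modified between attempts.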

This is durability in action. The initial "elegant" RPC-style call proved fragile, but the "boring" event-based pattern, despite adding an intermediary component, delivered a system robust enough to survive real-world production demands like traffic spikes or malformed responses from model updates.

First Principles Endure

That email was a powerful signal. It reinforced that a relentless focus on architectural first principles is never a wasted effort. The specific technologies change rapidly—from web services to microservices, data warehouses to lakehouses, and now deterministic software to agentic systems. But the underlying challenges of state management, system coupling, and failure domains are eternal.

The patterns that solve these challenges are durable. An event log that reliably decoupled mainframes decades ago is the same pattern that can decouple a Kubernetes service from an LLM agent today. The implementation details evolve, but the architecture of reliability and intellectual honesty remains remarkably constant.

[Figure: Resilient Event-Driven Architecture. Sources: User Apps, IoT Events, External Feeds. Processing: Ingest Pipeline, Event Bus, LLM Agents, Validation, Storage DB. Serving: API Gateway, Analytics, Dashboard, Notifications.]

What that stranger's email truly conveyed was that real value often lies not in chasing the latest trend, but in a quiet explanation of something that works—something battle-tested. This is why I keep writing: to document and share these durable blueprints for the boring, reliable systems that actually get the job done when the demos are over and the alarms go off at 3 AM.

Key Takeaways

  • Prefer Immutable Events for State Transfer: When integrating disparate systems, especially unpredictable LLM agents, move away from direct, mutable state passing.
  • Embrace Event Logs for Resilience: An append-only log provides a robust contract, an immutable audit trail, and enables safe retries and clearer debugging.
  • Acknowledge Trade-offs: Direct integration might seem simpler initially, but durable patterns often introduce necessary indirection to achieve production-grade reliability and observability.
  • Ground Work in Established Principles: Connect your architectural decisions to canonical works and established patterns; this reinforces credibility and provides a deeper understanding.
Juan Cardena
Enterprise Architect, Data & AI

Enterprise architect with 25 years across web, software, data, and AI. MIT CDAO ’25. Writing on agentic AI in production.