Clean architecture inside a codebase older than my teammates

The git blame on the file was older than the engineer sitting next to me. It was an archaeological artifact, with strata laid down by every architectural trend of the last two decades. For years, the challenge was simply to keep it running. Today, the challenge is different: how do you connect a non-deterministic LLM agent to a foundation this fragile and expect anything other than chaos?

You can't build reliable agentic systems on a foundation of mud. Before you can let an agent call tools and manipulate business logic, that logic must be isolated, testable, and completely deterministic. The real work of modern AI architecture isn't just about wiring up APIs to models; it's about the painstaking work of carving out a clean, deterministic core from a legacy monolith.

The New Failure Mode: Agents on a Brittle Core

In old systems, the primary failure mode is a profound lack of boundaries. Business rules are welded to SQL queries. UI event handlers contain complex financial calculations. The blast radius of any change is unknowable, and every deployment is a high-risk event. This is a maintenance headache we've learned to live with.

But when you introduce an LLM agent, this headache becomes a catastrophic liability. An agent is a probabilistic system. It might hallucinate an input, call functions in an unexpected order, or misinterpret a result. If the underlying code it calls is a tangled mess of side effects, you have an unpredictable system controlling another unpredictable system. It’s a recipe for corrupted data and production incidents that are impossible to debug.

Carving Out a Deterministic Core

To build a safe and effective AI system, the agentic, probabilistic parts must be constrained by a rigid, deterministic core. The agent can propose actions, but the core validates and executes them. This requires a clean separation of concerns that simply does not exist in most legacy codebases.
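To make "the agent proposes, the core validates and executes" concrete, here is a minimal sketch. Everything in it is an illustrative assumption, not code from our system: the `apply_discount` action, the in-memory `KNOWN_PRODUCTS` catalog, and the 50% discount cap are all hypothetical. The point is only the shape: untrusted agent output is parsed into a typed command or rejected, and only typed commands reach the deterministic function.

```python
from dataclasses import dataclass

# Hypothetical catalog (prices in cents); a real core would own this data.
KNOWN_PRODUCTS = {"sku-1": 1000, "sku-2": 2500}
MAX_DISCOUNT_PERCENT = 50  # assumed business policy, for illustration only


class RejectedAction(Exception):
    """Raised when a proposed action fails the core's validation."""


@dataclass(frozen=True)
class ApplyDiscount:
    """A validated command. Only the core can construct a meaningful one."""
    product_id: str
    percent: int


def parse_proposal(raw: dict) -> ApplyDiscount:
    """Turn untrusted agent output into a typed command, or reject it."""
    if raw.get("action") != "apply_discount":
        raise RejectedAction(f"unknown action: {raw.get('action')!r}")
    product_id = raw.get("product_id")
    if not isinstance(product_id, str) or product_id not in KNOWN_PRODUCTS:
        raise RejectedAction(f"unknown product: {product_id!r}")
    percent = raw.get("percent")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(percent, bool) or not isinstance(percent, int):
        raise RejectedAction(f"percent must be an integer: {percent!r}")
    if not 0 <= percent <= MAX_DISCOUNT_PERCENT:
        raise RejectedAction(f"discount out of policy: {percent}")
    return ApplyDiscount(product_id, percent)


def execute(command: ApplyDiscount) -> int:
    """Deterministic: the same command always yields the same price in cents."""
    base = KNOWN_PRODUCTS[command.product_id]
    return base * (100 - command.percent) // 100
```

With these assumed values, a well-formed proposal such as `{"action": "apply_discount", "product_id": "sku-1", "percent": 10}` executes to 900 cents, while a hallucinated product or a 500% discount raises `RejectedAction` before any business logic runs.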

A Beachhead of Determinism

The goal isn't a big-bang rewrite, which is almost always a fatal mistake. The goal is to create a beachhead of sanity. We picked one critical, buggy domain—pricing calculations—and set out to isolate it. The approach is a direct application of principles from Robert C. Martin's Clean Architecture, with a heavy debt to the tactical patterns of Domain-Driven Design.

The central rule is the Dependency Rule: core business logic must not depend on infrastructure details. Our pricing logic, the pure essence of the domain, should have no knowledge of a database, a web framework, or an API client.

The process was surgical. We defined a `CalculatePrice` use case and modeled its entities (like `Product` and `User`) as plain data structures with no framework or database types in them. When the use case needed data, it didn't call a database driver; it called an interface we defined, like `IProductRepository`. The domain depended only on this abstract contract.
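A minimal sketch of what that domain layer can look like, written in Python for illustration. The names `CalculatePrice`, `Product`, `User`, and `IProductRepository` come from the text above; the fields, the `get_product` method, and the 10% member discount are assumptions made up for the example, not our actual pricing rules.

```python
from dataclasses import dataclass
from decimal import Decimal
from typing import Protocol


@dataclass(frozen=True)
class Product:
    id: str
    base_price: Decimal


@dataclass(frozen=True)
class User:
    id: str
    is_member: bool  # assumed attribute, for illustration


class IProductRepository(Protocol):
    """The abstract contract the domain depends on. No database types here."""

    def get_product(self, product_id: str) -> Product: ...


class CalculatePrice:
    """Use case: pure pricing logic, with data access behind the interface."""

    MEMBER_DISCOUNT = Decimal("0.10")  # hypothetical rule

    def __init__(self, products: IProductRepository) -> None:
        self._products = products

    def execute(self, user: User, product_id: str, quantity: int) -> Decimal:
        if quantity < 1:
            raise ValueError("quantity must be at least 1")
        product = self._products.get_product(product_id)
        total = product.base_price * quantity
        if user.is_member:
            total -= total * self.MEMBER_DISCOUNT
        return total.quantize(Decimal("0.01"))
```

Nothing in this module imports a driver, an ORM, or a web framework. Any object with a matching `get_product` method satisfies the `Protocol`, which is what makes the use case easy to exercise in isolation.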

Then, in the outer infrastructure layer, we built the bridge: a `LegacyDbProductRepository` class that implemented our interface. This class contained all the ugly, old code to fetch data from the ancient database. The dependency was inverted: the infrastructure now depended on an interface owned by the domain, not the other way around. It clicked when we wrote the first unit test. We could mock the repository and test every nuance of the pricing logic in milliseconds, completely isolated from the database. We had created a testable, deterministic island.
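Here is a hedged sketch of both halves: the adapter and the isolated test. The `LegacyDbProductRepository` name is from the text; everything else is assumed for illustration, including the `PRODMAST` table, its `PROD_CD` and `PRC_CENTS` columns, and the use of `sqlite3` as a stand-in for the real legacy database. The pricing logic is reduced to a single function to keep the example short.

```python
import sqlite3
from dataclasses import dataclass
from decimal import Decimal
from typing import Protocol


@dataclass(frozen=True)
class Product:
    id: str
    base_price: Decimal


class IProductRepository(Protocol):
    def get_product(self, product_id: str) -> Product: ...


class LegacyDbProductRepository:
    """Infrastructure adapter: hides the legacy schema behind the interface."""

    def __init__(self, conn: sqlite3.Connection) -> None:
        self._conn = conn

    def get_product(self, product_id: str) -> Product:
        row = self._conn.execute(
            "SELECT PROD_CD, PRC_CENTS FROM PRODMAST WHERE PROD_CD = ?",
            (product_id,),
        ).fetchone()
        if row is None:
            raise KeyError(product_id)
        # Assumed legacy quirk: prices are stored as integer cents.
        return Product(id=row[0], base_price=Decimal(row[1]) / 100)


class FakeProductRepository:
    """Test double: satisfies the same interface with an in-memory dict."""

    def __init__(self, products: dict[str, Product]) -> None:
        self._products = products

    def get_product(self, product_id: str) -> Product:
        return self._products[product_id]


def calculate_price(
    products: IProductRepository, product_id: str, quantity: int
) -> Decimal:
    """Simplified stand-in for the pricing use case."""
    product = products.get_product(product_id)
    return (product.base_price * quantity).quantize(Decimal("0.01"))


def test_price_is_unit_price_times_quantity() -> None:
    # No database, no network: only the rule under test.
    repo = FakeProductRepository({"p1": Product("p1", Decimal("19.99"))})
    assert calculate_price(repo, "p1", 3) == Decimal("59.97")
```

The pricing function cannot tell the two repositories apart. That is the whole trick: the test uses the fake, production wires in the legacy adapter, and the schema quirks stay in one class.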

Scaling with an Internal Strangler Fig

This first success became our playbook. For each new feature or bug fix in the monolith, we applied an internal variant of a pattern that Martin Fowler calls the Strangler Fig Application. Instead of strangling a whole service over the network, we were strangling tangled procedural code from the inside with clean domain boundaries.

Over months, the new, testable code grew. The old code was slowly demoted, relegated to thin data-access adapters behind repository interfaces. It's still there, but its role is diminished: a dumber, thinner layer that can eventually be replaced piece by piece, without a massive flag-day migration. This growing deterministic core becomes the trusted set of tools our new AI agents can safely use.
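One way to picture the internal strangling, as a sketch under stated assumptions: the old entry point keeps its name and signature so existing call sites don't change, but its body now delegates to the new core, and the old data-access code survives only inside an adapter. The names `calc_price`, `legacy_fetch_row`, `IPriceSource`, and the `PRC_CENTS` field are hypothetical.

```python
from decimal import Decimal
from typing import Protocol

# --- Legacy code, left in place (stand-in for the old data access) ---------
_LEGACY_ROWS = {"p1": {"PRC_CENTS": 1999}}


def legacy_fetch_row(product_id: str) -> dict:
    return _LEGACY_ROWS[product_id]


# --- New core: pure logic behind an interface ------------------------------
class IPriceSource(Protocol):
    def unit_price(self, product_id: str) -> Decimal: ...


def calculate_price(prices: IPriceSource, product_id: str, quantity: int) -> Decimal:
    if quantity < 1:
        raise ValueError("quantity must be at least 1")
    return (prices.unit_price(product_id) * quantity).quantize(Decimal("0.01"))


# --- Adapter: the old code, demoted to thin data access --------------------
class LegacyPriceSource:
    def unit_price(self, product_id: str) -> Decimal:
        return Decimal(legacy_fetch_row(product_id)["PRC_CENTS"]) / 100


# --- Old entry point: same name and signature, new body --------------------
def calc_price(product_id: str, qty: int) -> float:
    """Existing callers are untouched; the logic now lives in the core."""
    return float(calculate_price(LegacyPriceSource(), product_id, qty))
```

Each function migrated this way shrinks the legacy surface by one call path, and the adapter can later be swapped for a different data source without touching `calculate_price` or its tests.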

This isn't just refactoring for its own sake. It is the essential groundwork for the next architectural wave. Each isolated domain becomes a robust, reliable capability that can be exposed to either a traditional API or an LLM agent with equal confidence.

Architecture for Hybrid AI/Deterministic Systems

The Honest Costs and The Real Payoff

This approach is an investment, and it is not free. Its costs are real, not just perceived. The first is a tax on velocity, especially at the beginning. Explaining that you're building interfaces and adapters before adding the new button is a hard sell. You are deliberately slowing down to go faster and safer later.

The second is the boilerplate. The ceremony of interfaces, dependency injection, and data mapping adds more files and a learning curve for developers accustomed to transaction scripts. For simple CRUD operations, it can be over-engineering. A failure mode here is creating "anemic domain models" where the clean layer is just a collection of data bags with no real logic. Discipline is required to ensure the complexity buys you real, valuable decoupling.

But the payoff is a system that is easier to maintain and safe to put behind an agent. The real takeaways from this journey are about building foundations:

Isolate logic before automating it. Before you hand a capability to an LLM agent, make sure it is a well-defined, testable, and deterministic tool.
Let testability be your guide. The primary reason for this separation is to gain confidence through fast, isolated tests. If you can't test it easily, the boundary is wrong.
This is a process, not a project. Grooming the core of a system is continuous work. It's a commitment to craftsmanship that enables the next generation of software, where deterministic and agentic systems must safely cooperate.