Three lives in one year: day work, building, and learning
A practitioner's guide to de-risking agentic AI for enterprise use. How to use a personal lab to test concepts like deterministic outputs and turn theory into credible, production-ready architectural
The pressure to integrate LLM agents into production systems is immense. So is the risk. The demos look magical, but enterprise architecture runs on reliability, cost control, and predictable failure modes—three things notoriously absent from the average agentic proof-of-concept. This creates a dilemma: how do you develop a credible, earned opinion on these systems without betting the company's production stability on a non-deterministic black box?
For me, the answer has been to run three parallel lives: the enterprise architect, the student, and the hands-on builder in my own lab. This isn't a productivity strategy; it's an architectural one for building expertise.

The Software 2.0 Dilemma
The neat separation between software, data, and ML is gone. As Andrej Karpathy articulated in his essay "Software 2.0," we are increasingly writing systems where behavior is learned from data, not explicitly programmed. This convergence demands a composite skill set. Yet our best practices for high-value work often preach the opposite. The philosophy of "deep work," popularized by Cal Newport's book Deep Work, argues for long, uninterrupted focus on a single, hard problem. And it's right: you can't architect a complex system by skimming headlines.
Herein lies the tension. Deep work is essential, but focusing on a single domain—only the production code, or only the research papers—creates blind spots. The architect who only reads papers can't speak to cost curves. The engineer who only builds can't see the next platform shift coming. The only sustainable path is to build a system that integrates all three.

De-Risking Agents in the Lab
My approach is to treat this as a flow of value. The day job provides the most valuable resource: real, high-stakes problems. I don't try to "learn AI"; I scope a specific, urgent question from my professional work. For example: "How can we guarantee an LLM agent returns valid, structured JSON without hallucinating fields or breaking the schema?"
This question becomes the focus of my learning. I'm not aimlessly browsing; I'm hunting for specific solutions. This leads me to papers and documentation on techniques like constrained decoding, function calling, and managing agentic state with finite-state machines. The theory provides the patterns.
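
To make the finite-state-machine idea concrete, here is a minimal sketch of how agent state could be pinned to an explicit transition table. The state names and the `AgentStateMachine` class are illustrative assumptions, not taken from any particular framework.

```python
from enum import Enum, auto


class State(Enum):
    AWAIT_INPUT = auto()
    CALL_MODEL = auto()
    VALIDATE = auto()
    DONE = auto()
    FAILED = auto()


# Allowed transitions (hypothetical): any move not listed here is rejected.
TRANSITIONS = {
    State.AWAIT_INPUT: {State.CALL_MODEL},
    State.CALL_MODEL: {State.VALIDATE, State.FAILED},
    State.VALIDATE: {State.DONE, State.CALL_MODEL, State.FAILED},
    State.DONE: set(),
    State.FAILED: set(),
}


class AgentStateMachine:
    """Tracks the agent's current state and refuses illegal transitions."""

    def __init__(self):
        self.state = State.AWAIT_INPUT
        self.history = [self.state]

    def advance(self, new_state):
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(
                f"illegal transition {self.state.name} -> {new_state.name}"
            )
        self.state = new_state
        self.history.append(new_state)
```

The point of the table is that the agent can loop from VALIDATE back to CALL_MODEL for a retry, but it can never skip validation or resume after reaching a terminal state.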
But theory isn't enough. The next step is to build a small, disposable prototype in the lab whose only purpose is to make that theory tangible. This is my proving ground for ideas too new or risky for a production roadmap.

From Theory to a Testable System
For the JSON problem, the lab project becomes a simple API endpoint. It takes an unstructured text blob and uses an agent to parse it into a rigid, predefined JSON object. Here, I can implement the patterns I just learned about. I build the orchestration logic, define the function calls the model can use, and add a validation layer that acts as a circuit breaker.
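
A validation layer that acts as a circuit breaker might look roughly like the sketch below. The schema fields (`name`, `amount`, `currency`), the `CircuitBreaker` class, and the failure threshold are all assumptions made for illustration; a real lab project would likely use a proper schema library.

```python
import json

# Hypothetical target schema: field name -> accepted Python type(s).
SCHEMA = {"name": str, "amount": (int, float), "currency": str}


def validate_payload(raw):
    """Return the parsed dict if it matches SCHEMA exactly, else raise ValueError."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("top-level value must be an object")
    extra = set(data) - set(SCHEMA)      # hallucinated fields
    missing = set(SCHEMA) - set(data)    # fields the model dropped
    if extra or missing:
        raise ValueError(f"extra fields {sorted(extra)}, missing fields {sorted(missing)}")
    for field, expected in SCHEMA.items():
        value = data[field]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            raise ValueError(f"field {field!r} has wrong type {type(value).__name__}")
    return data


class CircuitBreaker:
    """Stops processing after too many consecutive validation failures."""

    def __init__(self, max_consecutive_failures=3):
        self.max_consecutive_failures = max_consecutive_failures
        self.consecutive_failures = 0

    @property
    def is_open(self):
        return self.consecutive_failures >= self.max_consecutive_failures

    def check(self, raw):
        if self.is_open:
            raise RuntimeError("circuit open: refusing to process further model output")
        try:
            data = validate_payload(raw)
        except ValueError:
            self.consecutive_failures += 1
            raise
        self.consecutive_failures = 0  # a success resets the counter
        return data
```

Rejecting unknown fields outright, rather than silently dropping them, is a deliberate choice in this sketch: it makes hallucinated fields show up in the failure rate instead of hiding them.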
Most importantly, I instrument everything. I measure the latency, the token cost, and the failure rate. I see exactly what happens when the model, despite constraints, refuses to comply. My lab projects are messy and have zero tests. They aren't products; they are high-fidelity learning tools designed to be thrown away. Their goal is to generate one thing: a hard-won, specific insight.
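
The instrumentation described above can be as small as a wrapper around each model call. This is a sketch under assumptions: `model_fn` is a hypothetical callable returning `(text, tokens_used)`, and the per-token price is a placeholder you would supply yourself.

```python
import time
from dataclasses import dataclass, field


@dataclass
class CallMetrics:
    latencies_ms: list = field(default_factory=list)
    total_tokens: int = 0
    failures: int = 0

    @property
    def calls(self):
        return len(self.latencies_ms)

    @property
    def failure_rate(self):
        return self.failures / self.calls if self.calls else 0.0

    def estimated_cost(self, price_per_1k_tokens):
        # price_per_1k_tokens is an assumed input, not a real vendor price.
        return self.total_tokens / 1000 * price_per_1k_tokens


def instrumented_call(metrics, model_fn, prompt):
    """Run model_fn(prompt), recording latency, token usage, and failures."""
    start = time.perf_counter()
    try:
        text, tokens = model_fn(prompt)
    except Exception:
        metrics.failures += 1
        raise
    else:
        metrics.total_tokens += tokens
        return text
    finally:
        # Latency is recorded for failed calls too, so timeouts stay visible.
        metrics.latencies_ms.append((time.perf_counter() - start) * 1000)
```

Because the wrapper records latency in the `finally` branch, slow failures count toward the latency picture rather than disappearing from it.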

The Payoff: An Earned Opinion
This process culminates in the ability to contribute a credible, specific opinion back in my professional role. The next time the topic of agentic systems comes up, I'm not just repeating vendor marketing. I can state an earned position. I can say, "In my tests with a mid-sized model, forcing a JSON schema with function calling was highly reliable for structured data extraction, but it consistently added over 100 ms of latency per call and required explicit retry logic for the 5% of cases where the model would return a valid function call with invalid parameters."
This is the payoff. It's a specific, defensible, and practical piece of knowledge grounded in all three contexts: the real-world problem, the theoretical solution, and the practical implementation. It respects the complexity and avoids the hype.
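
The explicit retry logic mentioned in that finding could be sketched as follows. Everything here is an assumption for illustration: `model_fn` is a hypothetical callable that returns the function-call arguments as a JSON string, and `validate` is whatever parameter check the endpoint applies.

```python
import json


def call_with_retry(model_fn, prompt, validate, max_attempts=3):
    """Call model_fn until validate() accepts the arguments or attempts run out.

    Returns (args, attempts_used). validate(args) must raise ValueError
    when the parameters are invalid.
    """
    last_error = None
    for attempt in range(1, max_attempts + 1):
        raw = model_fn(prompt)
        try:
            args = json.loads(raw)
            validate(args)
            return args, attempt
        except ValueError as exc:  # json.JSONDecodeError is a ValueError subclass
            last_error = exc
            # Feed the rejection back so the next attempt can correct itself.
            prompt = (
                f"{prompt}\n\nPrevious output was rejected: {exc}. "
                "Return corrected arguments."
            )
    raise RuntimeError(
        f"no valid arguments after {max_attempts} attempts"
    ) from last_error
```

Returning the attempt count alongside the arguments makes it easy to log how often retries are actually needed, which is the number that backs up a claim like the one quoted above.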

An Integrated System for Expertise
This integrated cycle is demanding. It requires ruthlessly prioritizing what to learn and ignoring 90% of the industry's weekly hype. It means being comfortable with building messy, disposable things instead of polished side-projects. But the result is a durable skill stack, one that's grounded in the timeless realities of production systems.
When you've personally built and observed the failure modes of these new architectural patterns, you develop an intuition for what will last. You know what it takes to keep a system running at 3 a.m. In an industry drowning in abstraction and hype, that integrated, hands-on perspective is the most valuable asset an architect can build.