Prompt engineering for a SQL brain

For twenty years, my world was one of predictable structure. If I needed to know which customers in the `west` region had a lifetime value over $10,000, I knew exactly what to write. A `SELECT` statement is a contract with a well-defined schema. The database doesn't get creative; it returns the rows that satisfy the predicates, every single time.

My first months with large language models felt like the opposite. It was gut feel and fuzzy incantations, more like negotiating with a clever but unreliable intern than engineering a system. The breakthrough wasn't a new prompting trick; it was realizing I already had the right mental framework, just with the wrong vocabulary.

Context as a Transient Schema

In a relational database, the schema is the law. It dictates tables, columns, and data types, guaranteeing consistency. In an LLM-based system, the context you provide in the prompt serves the same purpose for a single request. It defines the world as the model should see it for that one interaction.

When I started thinking of context as a transient schema, my approach changed. Instead of just asking a question, I began by architecting the context—the system prompt, few-shot examples, and retrieved data—to constrain the model's world so tightly that the correct answer was the most probable outcome.

Mapping SQL Concepts to LLM Prompts

Constraints as Probabilistic Predicates

Once you have a table, you use a `WHERE` clause to filter results. These are hard, non-negotiable predicates. In a prompt, constraints like "Respond only in JSON format" or "The summary must be under 100 words" are the equivalent. They are strong suggestions, not inviolable rules.

In production, I've found the most reliable systems don't just state these constraints once. They reinforce them in the system prompt, reiterate them in the user prompt, and demonstrate them in few-shot examples. It's like adding an index to your database to make sure the query planner uses the right predicate efficiently.

RAG as a Semantic JOIN

The power of SQL is in the `JOIN`, which combines related data from multiple tables. Retrieval-Augmented Generation (RAG) is the architectural pattern that serves as the LLM's `JOIN` clause. The user's query is the key, used to perform a semantic search against a vector database and retrieve relevant context. This pattern was first formalized in the 2020 paper Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks by Lewis et al., and it forms the backbone of most production AI systems today.

Thinking of RAG as a `JOIN` clarifies its failure modes. A bad `JOIN` brings back wrong data; a bad retrieval step does the same, causing hallucinations. The engineering challenge shifts from "prompt tuning" to the more durable problem of "retrieval strategy."

Where Mental Models Collide

This analogy is a powerful scaffold, but it's essential to know its limits. SQL is deterministic. An LLM is probabilistic. That difference is where systems break. The first time an otherwise reliable model started returning un-parseable JSON after a minor provider patch, I learned that the validation layer isn't optional. It’s the most important part of the deterministic shell around the agent.

The SQL analogy focuses on data structure, but it's not the only one. Andrej Karpathy frames the LLM as a new kind of operating system, a non-deterministic CPU that executes programs written in natural language. This model is also useful, focusing more on process and flow control, where patterns like the one described in the ReAct paper show how to synergize reasoning with deterministic tool use. My SQL-brain model helps structure the *data* for the agent; the OS model helps structure the *work* it performs. Both are needed.

Building the Deterministic Shell

We trade the iron-clad predictability of code for the flexibility of natural language. But to make it reliable, we must wrap the probabilistic core in deterministic logic. We trust, but we verify with code. This isn't just a `try/except` block; it's a formal contract.

This is where tools become critical. Forcing an LLM to return structured output is a common failure point. A library like Jason Liu's Instructor provides a practical way to enforce this, using Pydantic models to define the expected JSON schema. If the model's output doesn't validate against the Pydantic object, the library can automatically handle the retry logic. This turns a fuzzy "please give me JSON" request into a hard, verifiable contract.

Architecture for a Deterministic Agent

Takeaways for the Structured Mind

Shifting from a deterministic mindset doesn't mean abandoning decades of experience. It means adapting our principles to a new kind of component.

Architect Your Context Rigorously. Treat your system prompt, examples, and RAG pipeline with the same seriousness you would a database schema. It is the foundation.
View RAG as a Semantic JOIN. If your agent gives bad answers, investigate your retrieval first. The problem is often not the LLM's reasoning but the data it was given to reason about.
Acknowledge Competing Models. Your mental model is a tool, not the territory. Understanding other frames, like the "LLM as an OS," provides a more complete architectural view.
Build a Deterministic Shell. Never trust the model's output directly. Wrap your LLM calls in code that validates, parses, and retries using concrete schema definitions. This is how you build a reliable system from an unreliable part.