Governance before hype: my rule for adopting any new tool

Software

My rule for adopting new data and AI tools: answer three governance questions on provenance, access, and retention before any hype-driven PoC. A lesson from production reality.

The demo ends and the room is buzzing. The vendor’s new agent framework is impossibly fast. Someone is already talking about a "paradigm shift." My engineers are itching to get their hands on the API. I just get a familiar, sinking feeling. We are all staring at a beautiful cathedral, and no one is asking about the foundation.

In these moments, I have to be the voice of deliberate architecture. I have learned, the hard way, that the most exciting features are rarely the ones that determine a system's long-term success. The real determinants are the unglamorous mechanics of control and accountability.

The Pull of Hype vs. the Anchor of Reality

Technology hype creates its own gravitational field. It pulls teams toward a shiny object, promising to solve complex problems with elegant simplicity. The proof-of-concept is the first stage, designed to be a frictionless experience with clean sample data and wide-open permissions. It almost always works.

The problem is that production is nothing like a PoC. Production is a messy, high-stakes environment full of sensitive data, strict regulations, and auditors. A tool that shines in a sandbox can become a black hole for compliance and security. The hype cycle wants speed, but production reliability demands structure.

[Figure: The Reactive Hype-First Cycle. Hype and Demo → Rapid PoC → Production Use → Governance Crisis]

My Expensive Lesson in Ungoverned Data

This isn't a theoretical concern. Early in my career, I led a project to stand up a new, powerful data store for ingesting semi-structured logs. We ran a successful PoC in two weeks and got the green light. It felt like a massive win.

Eight months into production, we went through a routine audit. The auditor asked, "This data contains user activity. Can you provide a list of every individual who has accessed it in the last 90 days?" We couldn't. The tool’s native security model was primitive, with no granular, user-level audit logs. Another question followed: "A customer exercised their 'right to be forgotten.' Can you prove you have deleted their records from this system and its backups?" Again, we couldn't.

What followed was a painful six-month fire drill that halted all new feature development. We had to retroactively build a custom access-control proxy and a baroque process for handling deletions. The "fast and flexible" tool had become a costly, brittle piece of our infrastructure: an entirely self-inflicted wound born from chasing a cool demo.

A Three-Question Governance Litmus Test

That experience burned a rule into my professional DNA: governance before hype. Now, before any new data-handling tool gets a PoC, I require the team to answer three questions. They aren't meant to block innovation, but to frame it within a sustainable, production-ready context.

  • 1. Provenance: How do we trace the data? We need to show how we will track data lineage. If an analyst finds a strange value, can we trace it to its source? Tools like the open-source OpenLineage project set a high standard here. If a new system obscures lineage, it’s a non-starter.
  • 2. Access Control: How do we secure the data? We need a clear plan for role-based access control (RBAC) that integrates with our central identity provider. Can we grant access to a specific dataset or column, or is it all-or-nothing? Managing a separate set of users is a security silo waiting to happen.
  • 3. Retention: How do we dispose of the data? We must define the data's lifecycle. What is the technical process for ensuring data is permanently removed to comply with regulations like GDPR Article 17, the "right to be forgotten"? A system without a reliable DELETE verb is a liability.
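To make the access and retention questions concrete, here is a minimal Python sketch of the two capabilities the auditor asked for: a per-user audit trail with column-level grants, and a delete-by-subject operation. Everything in it is an assumption for illustration. `GovernedStore`, its fields, and its method names are hypothetical and not any product's API. The store is in-memory only, and a real system would also have to cover backups and derived copies.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import Optional


@dataclass
class GovernedStore:
    # Hypothetical policy shape: role -> columns that role may read.
    column_grants: dict[str, set[str]]
    rows: list[dict] = field(default_factory=list)
    audit_log: list[dict] = field(default_factory=list)

    def read(self, user: str, role: str, columns: list[str],
             now: Optional[datetime] = None) -> list[dict]:
        now = now or datetime.now(timezone.utc)
        allowed = self.column_grants.get(role, set())
        denied = [c for c in columns if c not in allowed]
        # Record every attempt, granted or not, so access can be reported later.
        self.audit_log.append(
            {"user": user, "columns": list(columns), "at": now, "allowed": not denied}
        )
        if denied:
            raise PermissionError(f"role {role!r} may not read {denied}")
        return [{c: row[c] for c in columns} for row in self.rows]

    def accessed_by(self, days: int, now: Optional[datetime] = None) -> list[str]:
        # Answers "who accessed this data in the last N days?"
        now = now or datetime.now(timezone.utc)
        cutoff = now - timedelta(days=days)
        users = {e["user"] for e in self.audit_log if e["allowed"] and e["at"] >= cutoff}
        return sorted(users)

    def erase_subject(self, user_id: str) -> int:
        # Removes live rows for one data subject and returns how many were removed.
        # Backups and downstream copies are out of scope for this sketch.
        before = len(self.rows)
        self.rows = [r for r in self.rows if r["user_id"] != user_id]
        return before - len(self.rows)
```

If a candidate tool cannot support something equivalent to `accessed_by(90)` and `erase_subject(...)` out of the box, that gap is the thing to cost out before the PoC, not after the audit.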

If we have credible answers, we proceed. If the answers are "we'll figure it out later," the project is paused.
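The decision rule can be written down as a small gate. This is a sketch under stated assumptions: the `GovernanceAnswers` record, the `DEFERRALS` phrases, and the "non-empty and not a deferral" notion of credibility are all hypothetical simplifications. In practice a human reviews the answers; the code only shows the shape of the checkpoint.

```python
from dataclasses import dataclass

# Hypothetical phrases that count as "no answer yet".
DEFERRALS = ("figure it out later", "tbd", "todo")


@dataclass
class GovernanceAnswers:
    provenance: str      # how data lineage will be tracked
    access_control: str  # how RBAC ties into the identity provider
    retention: str       # how data is permanently deleted


def is_credible(answer: str) -> bool:
    # Crude stand-in for a real review: non-empty and not an explicit deferral.
    text = answer.strip().lower()
    return bool(text) and not any(phrase in text for phrase in DEFERRALS)


def poc_decision(answers: GovernanceAnswers) -> tuple[str, list[str]]:
    # Returns ("proceed", []) or ("paused", [names of the unanswered questions]).
    missing = [name for name, value in vars(answers).items() if not is_credible(value)]
    return ("paused" if missing else "proceed", missing)
```

Returning the list of unanswered questions alongside the decision keeps the conversation on what is missing rather than on who said no.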

This Applies Tenfold to Agentic Systems

This governance-first principle is more critical than ever as AI agents and deterministic pipelines work together. An LLM-powered agent designed to query internal databases magnifies every governance weakness. Imagine an agent tasked with synthesizing a quarterly sales forecast. Without lineage, it might unknowingly pull from a deprecated database, creating a confident but utterly wrong projection. Without RBAC, it could leak details of an unannounced re-org from an HR system in response to a general query.
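One way to guard against both failure modes is to put a check between the agent and every data source. The sketch below assumes a hypothetical catalog in which each dataset records its upstream lineage, a deprecation flag, and the roles allowed to read it, and it assumes the agent acts with the roles of the person asking rather than with a privileged service account. The names `Dataset`, `GovernanceError`, and `authorize_agent_query` are illustrative, not part of any agent framework.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Dataset:
    name: str
    upstream: tuple[str, ...]       # lineage: where this dataset comes from
    allowed_roles: frozenset[str]   # roles permitted to read it
    deprecated: bool = False


class GovernanceError(Exception):
    """Raised when an agent query fails a lineage or access check."""


def authorize_agent_query(catalog: dict[str, Dataset], dataset: str,
                          caller_roles: set[str]) -> Dataset:
    entry = catalog.get(dataset)
    if entry is None:
        # Unknown lineage is treated as a refusal, not a warning.
        raise GovernanceError(f"{dataset}: not in catalog, lineage unknown")
    if entry.deprecated:
        raise GovernanceError(f"{dataset}: deprecated source")
    if not (caller_roles & entry.allowed_roles):
        raise GovernanceError(f"{dataset}: caller lacks a permitted role")
    return entry
```

With a catalog like this, a forecast request from someone in finance can reach the current sales data, while the deprecated sales database and the HR re-org table are both refused before the model ever sees a row.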

And if the data it was fine-tuned on is subject to a deletion request, how do you honor it once it's been absorbed into the model’s weights? Short of retraining the model without those records, you can't. The governance failure becomes effectively permanent.

A Pragmatic Filter, Not a Dogma

My three-question test is a simple, centralized checkpoint. It's a pragmatic filter, not a universal architecture. I recognize that the industry is exploring more decentralized approaches. In her foundational work on Data Mesh, Zhamak Dehghani argues for distributing data ownership and governance to domain teams, a philosophy detailed in her article "Data Mesh Principles and Logical Architecture."

I see these ideas as compatible. A Data Mesh is a sophisticated strategy for scale. My test is the gate you must pass to even begin playing the game. Whether your governance is centralized or federated, you still must be able to trace, secure, and delete your data. Answering these questions first ensures you have the technical capabilities to build responsibly, at any scale.

[Figure: A Governance-First Data and AI Architecture. Sources: Operational DBs, Event Streams, Third-Party APIs. Governance & Ingestion: Provenance Tracking, Access Control Layer, Retention Policy Engine. Processing & Storage: Deterministic Pipelines, Agentic Systems, Governed Data Store. Serving Layer: Business Intelligence, APIs for Apps, Agent Responses.]

Guardrails Are What Enable Speed

Pushing for governance feels like slowing things down. The business wants features, and you're talking about compliance frameworks. But this initial, deliberate friction is what enables durability and speed later. It’s the architectural equivalent of "slow is smooth, and smooth is fast."

Good data governance isn't a set of rules designed to say "no." It's the technical and procedural foundation that allows an organization to say "yes" to innovation with confidence. The next time you see a demo that feels like magic, applaud the achievement. Then, be the one to ask the boring questions. Six months later, when the system is running smoothly at 3 a.m., everyone will be glad you did.

Juan Cardena
Enterprise Architect, Data & AI

Enterprise architect with 25 years across web, software, data, and AI. MIT CDAO ’25. Writing on agentic AI in production.