dltHub Blog

So you vibe coded a data stack, now what?

  • Adrian Brudaru,
    Co-Founder & CDO

The Stack is Generated. Now What?

The tl;dr:

Yes, you can prompt your way to a data stack. It works! Great!

Until it doesn’t. Not great!

Why does it stop working, and how can you make it work?

In this blog post, I will describe the actual, hard, real-world barriers that make your LLM setup collapse, and propose principles for making your systems work.

Finally, I am inviting you to try our pre-release LLM-native data platform, dltHub Pro, our answer to high-data-quality LLM workflows, scheduled for release in Q2.

The demo works. Let's be clear about that.

“In one word? GREAT! In two words? NOT GREAT!”

If you have clean source systems with public or well-documented schemas, like Stripe, Salesforce, Shopify, or a Postgres database with good naming, prompting a data stack is now genuinely fast, clean, and easy. You can go from concept to deployment in a single prompt, and you can even achieve “correctness” if you can define clear tests, or “good enough” if you had no strong opinion to begin with.

This isn't 2023 "AI writes SQL" anymore. This is closer to: you describe the shape of your data product, and a working project shows up in your repo, your pipeline gets deployed and dashboards are created.

LLMs aren’t “stupid”: they have highly complex world models and can do many tasks faster or better than humans. So when do they fall apart?

In practice…

Most demos are reproducible, but business reality is not. The difficult part about simulating reality is that most of these demos assume all the information needed to create a data stack is known and available at the time of data stack creation.

If that were true, then all we would have to fix is the coding gap. Alas, code was never the main problem, just a means to an end.

Why it doesn’t work: Uncertainty, complexity and contradiction are kings of the world, and your data model is clueless.

Your data model might be clean, but the real world just hits different. There’s no single source of truth in the world.

Most requests addressed to analysts are undefined because the stakeholders haven't normalized their own definitions yet. My job as a data person isn't just fetching data, but mapping the logic in their heads to a problem in the world and solving for that.

The data model is a map that assumes the territory is static, but the real world is ontologically fluid. We face a double-blind semantic gap: the user lacks a precise definition, and the system lacks the context to correct them.

Ontologies are to the world what semantics are to your data model, except there are many of them: overlapping, conflicting, and often unknown.

If you have a small data model with only one finance-sounding column, “price”, and you ask an LLM about “Revenue”, it will probably sum up the price of items sold.

But if you ask the finance department, they’ll say that to calculate revenue from sales you have to wait for payments, returns, and cancellations. If you ask Marketing, they’ll say they feed revenue into Google Analytics with every sale.

The difference? That’s semantics vs. ontology. Semantics defines a calculation over a data model; ontology talks about the real world. The real world has complexity that lies beyond any single definition.
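To make this concrete, here is a toy sketch (invented rows, hypothetical column names) of how the same tiny sales table yields three different “revenues” depending on whose ontology you adopt:

```python
# Hypothetical sales rows: same data model, three ontologies of "revenue".
sales = [
    {"order_id": 1, "price": 100.0, "status": "paid"},
    {"order_id": 2, "price": 250.0, "status": "pending_payment"},
    {"order_id": 3, "price": 80.0,  "status": "returned"},
]

# The naive LLM answer: "revenue" = sum of the only money-looking column.
naive_revenue = sum(row["price"] for row in sales)

# Finance's ontology: revenue is recognized only once payment clears,
# excluding returns and cancellations.
finance_revenue = sum(row["price"] for row in sales if row["status"] == "paid")

# Marketing's ontology: report every non-returned sale at checkout time,
# as it would be sent to an analytics tool.
marketing_revenue = sum(row["price"] for row in sales if row["status"] != "returned")

print(naive_revenue, finance_revenue, marketing_revenue)  # 430.0 100.0 350.0
```

Three correct numbers, one word. Without an anchored ontology, the model simply picks whichever “revenue” its prompt trajectory lands on.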

The Cyc Project: The Warning from the 80s

In 1984, a team of researchers launched the Cyc Project. Their goal was to solve AI by building the world’s first and only Universal Ontology. They believed that if they could just sit down and manually code every single rule of "common sense" into a logical database, the computer would finally "understand" the world.

The Cyc team hit a wall that every data person eventually hits: The Contradiction. They found that if they wrote a rule saying "Humans cannot be in two places at once," it worked for a payroll system but broke for a "Logic of Travel" model or a "Fiction" model. Every time they tried to make a "Global Truth," they broke a "Local Truth."

To save the project, they had to invent Micro-theories. They realized that Global Consistency is a myth. To make the system work, they had to "wall off" different parts of the brain.

This realization created the AI we have today.

Modern AI (LLMs) dealt with the "Cyc Problem" by simply giving up on definitions. While Cyc tried to build a rigid skeleton of the world, modern AI builds a statistical map of how humans talk about the world.

How AI handles "Local Ontologies" today:

In the era of projects like Cyc, AI operated through “Micro-theories”, or what is perhaps best described as “local ontologies”: essentially digital “rooms” where specific rules applied. If you are in the “Medical Room” or the “Sociology Room”, the word “culture” means a different thing. To maintain accuracy, the system had to manually switch between these discrete silos to ensure it didn’t apply the wrong set of logic to a problem.

Today’s LLMs have replaced those physical rooms with a high-dimensional Latent Space. Instead of a user manually flipping a switch, the "room" is determined entirely by the vector trajectory of your prompt. The AI doesn't "know" it's in a different department; it simply follows the statistical gravity of your vocabulary.

For example, in the image below, by asking for a “desk and chair” vs. a “table and chair”, we get a different type of chair, one suitable for an ontology containing a desk or a table, respectively.

“desk and chair” vs a “table and chair” anchor models in a different ontology, producing a rolling chair vs a dining chair.

This fluid nature creates what we might call the Hallucination Gap, a particular trap for the "vibe coder" or the casual business user. Because the AI is predicting the most likely ontology based on public training data, it defaults to a "Generic Average".

To give an example, if we give the above image to an LLM and ask, “Based on this image, calculate the ‘Productive Capacity’ of this house for a startup with 2 employees,” it will tell us it’s suitable for about 12 hours of work. What? That doesn’t make any sense, but it does if you think like an LLM whose head jumps from one dimension to another within the same thought. To give you that “12 hour” answer, it had to:

  • Step 1: Use the Medical Ontology to recognize the dining chair is bad for long-term work and only suitable for short sessions with breaks.
  • Step 2: Use the Startup/Hustle Ontology to assume that "some work is better than no work."
  • Step 3: Use a Workplace ontology to assume the desk chair will be used for 8h of work.
  • Step 4: Use a Physics/Math Ontology to sum them together as if they were the same "Type."

A good ontological answer would have been “this is currently only suitable for one person, unless you get another ergonomic chair for the second person”. Instead, by jumping from ontology to ontology, the model finds a “clever way” to work around the constraints of, well, reality, in order to give you an answer that might make you go “12h, that’s sensible” but is actually useless and misleading. If such a thought chain happened in the middle of a larger LLM answer chain, any following logic would be severely derailed.

How does this apply to our field?

To solve the Ontological Gap, we don't need to build a 'Universal Library' like Cyc. Instead, we need a variety of tools to understand what ontology an implementation used, and “right context at the right time” delivered through varying means. We need an infrastructure that forces the AI to stay within a specific 'local ontology' while solving specific problems. This can be done in many ways: when building a dlt pipeline, the ontology might be the docs of the source and of dlt, but when building a transformation pipeline, your ontology might be the business you build for, plus source metadata like the dlt-discovered schema.

Fundamentally, it means we need a few other pieces besides “vibes and prayers” to get LLMs to work well in our reality. Besides generation, we need great validation and observability, because the models make tons of assumptions, on the fly, from the start.

If your ontology is fluid, your desk chair could become a dining chair mid-prompt

Practically, it means our workflows cannot be along the lines of “vibe a clean data model, then hope for the best.” They should look more like this:

  • Bootstrap an ontology that anchors the model end to end during development
  • Test the developed code thoroughly, observe its architecture, automatically test assumptions on each subsequent refresh.
  • With a clean implementation, go back and update the ontology along with its lineage to the data model. This serves as “data literacy” for the agent along with “problem space ontology” to produce reliable answers.
  • As you add problem spaces in your data products, create and maintain problem space ontologies by similar methods.
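The second step above, automatically re-testing assumptions on each refresh, can be sketched in a few lines. Everything here is hypothetical (the `contract` dict and `check_assumptions` are illustrative names, not a dltHub API); the point is that the assumptions a generated pipeline baked in get recorded once, then verified on every run:

```python
# Hypothetical contract recording the assumptions a generated pipeline made.
contract = {
    "orders": {
        "required_columns": {"order_id", "price", "status"},
        "non_nullable": {"order_id"},
    },
}

def check_assumptions(table_name: str, rows: list[dict]) -> list[str]:
    """Return a list of violated assumptions for one refreshed table."""
    spec = contract[table_name]
    violations = []
    seen = set().union(*(row.keys() for row in rows)) if rows else set()
    missing = spec["required_columns"] - seen
    if missing:
        violations.append(f"missing columns: {sorted(missing)}")
    for col in spec["non_nullable"]:
        if any(row.get(col) is None for row in rows):
            violations.append(f"null values in non-nullable column: {col}")
    return violations

# A refresh that silently dropped a column would be caught, not vibed over:
print(check_assumptions("orders", [{"order_id": 1, "price": 9.99}]))
# -> ["missing columns: ['status']"]
```

Run on every refresh, checks like these turn the model’s on-the-fly assumptions into explicit, observable contracts.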

The Principles of AI-Ready Infrastructure

To prevent your data work from collapsing into hallucination, you don’t just need ontologies, but also software that gives LLMs the right tools to implement valid, observable code. The data platform must move from “Vibe Coding” to an infrastructure built on three architectural pillars:

1. Transparency as an Anchor

Black boxes are where context dies and hallucinations begin. A system must be code-first and declarative, ensuring that business context isn't just a hidden prompt, but a flow of metadata, schemas, and traces. Context must flow from the source through ingestion and transformation, preventing "black holes" where intent is lost. When the system is explicit and inspectable, agents can perform work using metadata alone.
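As an illustrative sketch (not dltHub Pro’s actual API; `TracedTable` and `apply` are invented names), here is what “context flows with the data” can look like: every transformation carries its schema and lineage forward instead of discarding them, so an agent can reason from metadata alone:

```python
# A minimal sketch of metadata flowing alongside data through a pipeline.
from dataclasses import dataclass, field

@dataclass
class TracedTable:
    rows: list
    schema: dict
    lineage: list = field(default_factory=list)

def apply(step_name: str, fn, table: TracedTable) -> TracedTable:
    """Run one transformation and append it to the table's lineage trace."""
    out = fn(table.rows)
    return TracedTable(rows=out, schema=table.schema,
                       lineage=table.lineage + [step_name])

raw = TracedTable(rows=[{"price": 100}, {"price": -5}],
                  schema={"price": "float"})
clean = apply("drop_negative_prices",
              lambda rows: [r for r in rows if r["price"] >= 0], raw)

# An agent (or a human) can now recover intent without reading the data:
print(clean.lineage)  # ['drop_negative_prices']
```

Because intent is stored as a trace rather than buried in a hidden prompt, nothing becomes a “black hole” between ingestion and the dashboard.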

2. Composable Primitives

Asking an agent to generate a pipeline from first principles is a recipe for unmaintainable spaghetti code. High-quality systems are built from a composition of proven, reusable building blocks. By providing modular libraries, you ensure the agent follows platform-native best practices. If the "wheel" is already defined as a standard primitive, the agent cannot hallucinate a square or flat one. This modularity produces maintainable, adaptable code that integrates with the wider ecosystem rather than creating a silo.
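A toy sketch of this idea (the registry and primitive names are hypothetical, not dltHub Pro’s API): the agent is only allowed to compose steps from a vetted registry, so its “plan” is just known names plus parameters:

```python
# A registry of reviewed, reusable building blocks the agent may compose.
REGISTRY: dict = {}

def primitive(fn):
    """Register a vetted, reusable building block."""
    REGISTRY[fn.__name__] = fn
    return fn

@primitive
def dedupe(rows, key):
    seen, out = set(), []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            out.append(row)
    return out

@primitive
def rename(rows, old, new):
    return [{(new if k == old else k): v for k, v in row.items()} for row in rows]

# An agent-generated plan is a composition of known primitives, not freeform code:
plan = [("dedupe", {"key": "id"}), ("rename", {"old": "price", "new": "amount"})]
rows = [{"id": 1, "price": 10}, {"id": 1, "price": 10}]
for name, kwargs in plan:
    rows = REGISTRY[name](rows, **kwargs)
print(rows)  # [{'id': 1, 'amount': 10}]
```

The agent can still be wrong about *which* wheel to use, but it can no longer invent a square one.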

3. Iterative Interrogation

You cannot trust a probabilistic model with deterministic logic on the first pass. The challenge is keeping development verifiable and safe from prototype through production and its subsequent maintenance. The workflow must be designed for an iterative "Generate → Inspect → Validate → Apply" loop, utilizing isolated, ephemeral workspace sandboxes. Developers must be able to inspect generated projects, traces, and execution logs in real-time. This allows the agent to recognize an ontological violation and self-correct based on human-guided instructions.
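The loop itself can be sketched in pure Python. Here `generate_candidate` is a stub standing in for an LLM call, and `validate` encodes one business rule; in a real system the validation step would run the candidate in an ephemeral sandbox against tests. All names are illustrative:

```python
# A hedged sketch of the Generate -> Inspect -> Validate -> Apply loop.
def generate_candidate(feedback):
    """Stub for an LLM call; 'fixes' its output when handed feedback."""
    if feedback is None:
        return "SELECT SUM(price) FROM orders"
    return "SELECT SUM(price) FROM orders WHERE status = 'paid'"

def validate(sql: str) -> list[str]:
    """Stub validator checking one encoded business rule."""
    if "status = 'paid'" in sql:
        return []
    return ["revenue must exclude unpaid orders"]

feedback, accepted = None, None
for _ in range(3):                             # bounded retries
    candidate = generate_candidate(feedback)   # Generate
    print("inspecting:", candidate)            # Inspect
    issues = validate(candidate)               # Validate
    if not issues:
        accepted = candidate                   # Apply
        break
    feedback = "; ".join(issues)               # feed violations back

print("applied:", accepted)
# -> applied: SELECT SUM(price) FROM orders WHERE status = 'paid'
```

The key property is that nothing reaches production without passing validation, and every rejection becomes context for the next generation attempt rather than a silent failure.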

Build with us

We are opening a limited Design Partnership for small data teams ready to move beyond "Vibe Coding" and into a production-grade AI-native infrastructure.

By joining this next wave of design partners, you’ll get exclusive early access to our LLM native data platform, dltHub Pro, including our AI Workbench.

We are looking for architect-minded users who want to stop the vibes and switch to building trustworthy pipelines. In exchange for your feedback during this private beta, you will directly influence our product roadmap and secure a significant discount on dltHub Pro at our public launch, coming this Q2.