Memory for AI Agents: Understanding Modeling for Unstructured Data
Adrian Brudaru,
Co-Founder & CDO
If you work with AI agents or RAG, you're already dealing with unstructured data at scale. For the more classic data engineering crowd, here’s an explainer of how unstructured AI memory works, through the lens of what we know from working with structured data.
A primer on structured data modeling
What is a canonical model?
A Canonical Data Model (CDM) is the technical realization of your organization's ontology—a shared, rigorous definition of what business entities like "Customer" or "Product" actually are, independent of any software vendor. Instead of relying on Salesforce’s definition of a client or SAP’s definition of a material, the CDM forces every system to map its raw data to this central, vendor-agnostic standard. For a Data Engineer, this means you are building pipelines against a stable business concept, not a fluctuating third-party API.
This approach achieves decoupling. When you replace an operational system, you don't break the downstream architecture; you simply map the new tool to the existing ontology. In a Data Warehouse, this ensures that every report reflects the same "Single Source of Truth." By strictly enforcing this ontology across the board, you eliminate data ambiguity, ensuring that "Gross Margin" means exactly the same thing to Finance as it does to Sales.
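To make this concrete, here is a toy sketch of canonical mapping in Python. The source systems and field names are hypothetical; the pattern is the point: every system maps into one shared Customer definition.

```python
# A toy illustration of canonical mapping: vendor-specific records from two
# hypothetical source systems are normalized onto one shared Customer shape.
from dataclasses import dataclass

@dataclass
class Customer:  # the canonical, vendor-agnostic definition
    customer_id: str
    legal_name: str
    country: str

def from_salesforce(record: dict) -> Customer:
    # Salesforce-style field names, assumed for the example
    return Customer(record["AccountId"], record["Name"], record["BillingCountry"])

def from_sap(record: dict) -> Customer:
    # SAP-style field names, assumed for the example
    return Customer(record["KUNNR"], record["NAME1"], record["LAND1"])

# Downstream pipelines only ever see Customer, never the source schemas.
print(from_salesforce({"AccountId": "001", "Name": "Acme", "BillingCountry": "DE"}))
print(from_sap({"KUNNR": "0000001", "NAME1": "Acme", "LAND1": "DE"}))
```

Swap out one of these systems tomorrow and only its mapping function changes; everything downstream keeps reading Customer.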
How the dimensional model fits in
The Dimensional Model (Star Schema) is the performance layer built on top of your Canonical foundation. While the Canonical Model/Ontology ensures strict data consistency across systems, it is often too complex and normalized for fast reporting—requiring dozens of slow joins to answer a simple question. The Dimensional Model solves this by intentionally denormalizing that data into Fact Tables (metrics like Revenue) and Dimension Tables (context like Region or Time). For the Data Engineer, this structure optimizes query speed; for the analytics engineer, it enables the "interactive dashboards" that clients demand.
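Here is a minimal sketch of that fact/dimension split, using pandas and made-up table names; the same shape applies whether the tables live in a warehouse or a dataframe.

```python
# A minimal star-schema sketch with hypothetical table and column names,
# illustrating the fact/dimension split rather than any specific warehouse.
import pandas as pd

# Dimension table: descriptive context (one row per region)
dim_region = pd.DataFrame({
    "region_key": [1, 2],
    "region_name": ["EMEA", "APAC"],
})

# Fact table: measures keyed to dimensions (one row per sale)
fact_sales = pd.DataFrame({
    "region_key": [1, 1, 2],
    "order_date": ["2024-01-03", "2024-01-04", "2024-01-03"],
    "revenue":    [1200.0, 800.0, 430.0],
})

# One join and one aggregation answer "revenue by region", the kind of
# question a fully normalized canonical model would need many joins to serve.
report = (
    fact_sales.merge(dim_region, on="region_key")
              .groupby("region_name", as_index=False)["revenue"].sum()
)
print(report)
```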
The semantic layer resurgence
The Semantic Layer is seeing a massive resurgence because LLMs cannot reliably query raw databases. If you point an LLM at a messy schema with columns like amt_net_v2 and ask for "churn," it will hallucinate the SQL because it doesn't know your specific business rules (e.g., that "churn" requires a user to be inactive for 31+ days). The Semantic Layer fixes this by acting as the governed context provider. It abstracts the physical table complexity into clear, defined metrics like Revenue or Active_Users that the LLM can safely invoke without guessing the underlying logic.
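A toy sketch of the idea, with hypothetical metric definitions: business rules live in one governed place, and the agent can only ask for metrics by name instead of writing SQL against raw columns.

```python
# A toy semantic-layer sketch: metric names and business rules are defined
# once, and the agent can only request metrics by name; it never writes
# raw SQL against columns like amt_net_v2. All names here are hypothetical.
METRICS = {
    "revenue": "SUM(amt_net_v2)",
    # Business rule encoded once: "churned" means inactive for 31+ days.
    "churned_users": (
        "COUNT(DISTINCT user_id) FILTER "
        "(WHERE last_seen < CURRENT_DATE - INTERVAL '31 days')"
    ),
}

def compile_metric(metric: str, table: str = "analytics.users") -> str:
    """Turn a governed metric name into SQL; reject anything undefined."""
    if metric not in METRICS:
        raise ValueError(f"Unknown metric: {metric!r}; the LLM does not get to guess the logic")
    return f"SELECT {METRICS[metric]} AS {metric} FROM {table}"

print(compile_metric("churned_users"))
```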
How This Translates to Unstructured Data
A knowledge graph is a canonical model for unstructured data
For unstructured content you don’t have rows and columns; you have blobs of text and a need to query by meaning and relationship.
- A canonical model is a set of subjects, objects, and the links between them, stored as keyed tables, bridge tables, or fact tables.
- A knowledge graph is the same idea: subjects, objects, and the links between them, stored as nodes and edges.
The mapping is direct.
- Subjects and objects = node types (Person, Company, Document, etc.).
- Join-key-style relations = one edge from one node to another (e.g. Alice works_at Acme).
- Many-to-many = either many edges of the same type between two node types, or an intermediate node (e.g. Person → Enrollment → Course) like a bridge table.
- Fact-like structure = nodes or edges that represent events or measures (e.g. a Deal node linking Account, Contact, Date, and amount; or an edge with a quantity or timestamp).
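In code, that mapping can be as small as a set of (subject, predicate, object) triples. The node and edge names below are illustrative only.

```python
# A minimal sketch of the same mapping as plain Python triples
# (subject, predicate, object). Node and edge names are illustrative.
nodes = {
    "alice":  {"type": "Person", "name": "Alice"},
    "acme":   {"type": "Company", "name": "Acme"},
    "deal_1": {"type": "Deal", "amount": 50_000, "date": "2024-06-01"},  # fact-like node
}

edges = [
    ("alice", "works_at", "acme"),           # join-key-style relation
    ("deal_1", "involves_account", "acme"),  # fact node linking its "dimensions"
    ("deal_1", "involves_contact", "alice"),
]

# Traversal replaces the SQL join: who works at the companies tied to a deal?
def contacts_for_deal(deal_id: str) -> list[str]:
    companies = {dst for src, rel, dst in edges if src == deal_id and rel == "involves_account"}
    return [src for src, rel, dst in edges if rel == "works_at" and dst in companies]

print(contacts_for_deal("deal_1"))  # ['alice']
```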
How Cognee fits: layers for AI memory
Cognee is a Python SDK and knowledge engine that builds persistent, structured memory for AI agents. It's a concrete example of the above: it turns scattered data into a single memory system with a clear model.
From a data-pipeline perspective it runs four core operations:
- add() —> Ingestion. You point it at PDFs, CSVs, audio, code, S3 URIs, etc. It normalizes, deduplicates (e.g. via hashing), and organizes content into datasets with permissions.
- cognify() —> Graph construction (the main ETL). It classifies and chunks, uses an LLM to extract entities and relationships (e.g. Alice works_at Acme), summarizes and embeds for semantic search, and stores results across three stores: graph (e.g. Kuzu) for structure, vector (e.g. LanceDB) for similarity, relational (e.g. SQLite) for provenance and metadata.
- memify() —> Doesn't involve memes, sadly. It's graph maintenance, a background process that refines the graph: prunes stale information, strengthens frequently used connections, adds inferred facts. The memory model evolves over time.
- search() —> Hybrid retrieval. Queries hit both vector and graph; Cognee supports multiple retrieval modes (from classic RAG to graph-aware, multi-hop styles). Benchmarks show these can significantly outperform basic RAG on complex questions.
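
Strung together, the four operations read like a compact pipeline. The sketch below follows the async shape of Cognee's quickstart, but treat the exact signatures, arguments, and defaults as assumptions that may differ between releases.

```python
# A rough sketch of the four operations described above. Exact signatures
# and defaults may differ between Cognee releases; treat this as illustrative.
import asyncio
import cognee

async def main():
    # add(): ingest raw content into a dataset
    await cognee.add("Alice works at Acme. Acme makes industrial robots.")

    # cognify(): extract entities and relationships, embed, and persist them
    # across the graph, vector, and relational stores
    await cognee.cognify()

    # memify(): maintenance pass that refines the graph; shown here as an
    # explicit call for illustration, though it can also run in the background
    await cognee.memify()

    # search(): hybrid retrieval over the resulting memory
    results = await cognee.search("Where does Alice work?")
    for result in results:
        print(result)

asyncio.run(main())
```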

So you get a canonical-style model for unstructured data (the graph + ontology), a "dimensional" style of access (different search strategies and grains), and a single place where semantics and structure live, without running your own graph DB, vector DB, and ETL from scratch.
This might look a lot like the layers of a classic data stack: ingestion, canonical model, data mart, and finally access.
Memory for AI Agents
Agents without memory are just toys. Our partner Cognee is turning scattered data into governed, self-improving memory for over 70 enterprises, including Bayer's scientific research teams. Backed by $7.5M from the minds behind OpenAI and FAIR, this is the new standard for Context Engineering.
Of course, Cognee uses dlt for ingestion, and we use Cognee for unstructured data work.
If this framing resonates and you want durable, structured memory for agents instead of ad-hoc RAG and vector stores, you can get started with Cognee quickly. The default config is file-based (SQLite + LanceDB + Kuzu), so no extra infrastructure is required.