Skip to content
Technology May 7, 2026 10 min read

AI Agent Memory Is Now Part of Your Security Architecture

Persistent AI memory can make assistants more useful, but it also creates a new attack surface. Companies need clear rules for what agents remember, retrieve and trust.

K

Kyluke McDougall

Software Architect & Founder

AI Agent Memory Is Now Part of Your Security Architecture

AI agents are becoming more useful because they remember things.

They remember previous conversations. They retrieve documents. They summarize tasks. They build a picture of a project, a codebase, a customer, a process or a team. They connect to Notion, Slack, Google Drive, GitHub, tickets, CRM systems, product specs and internal wikis.

That memory is what makes them feel less like chatbots and more like colleagues.

It is also what makes them risky.

Fresh security discussion around AI agents is increasingly focused on memory and context poisoning: the possibility that an agent stores, retrieves or trusts malicious, misleading or low-quality context, and then carries that influence into future work. OWASP’s Top 10 for Agentic Applications treats this as a distinct risk category: Memory & Context Poisoning.

For founders, CTOs and product owners, this is not an academic AI safety issue.

It is a software architecture issue.

If an agent can remember, memory is part of the system. If the agent can also use tools, memory becomes part of the control path. And if that memory is connected to company documents, customer data, repositories or workflows, it needs the same seriousness as any other production dependency.

Prompt injection was the first warning

Most teams have now heard of prompt injection.

An AI system reads untrusted content. That content contains instructions. The model may follow those instructions even though they came from a document, web page, email or user message rather than from the real operator.

That is already serious.

But prompt injection is often treated as a momentary problem: the model saw something bad in this interaction, so this answer or tool call may be unsafe.

Memory poisoning is more uncomfortable because it can persist.

Instead of influencing one answer, the malicious or misleading context gets written into a memory store, summarized into a project history, embedded into a vector database, added to a shared knowledge base or retained as a preference. The original input may disappear from view. The influence remains.

Later, the agent retrieves that context and treats it as useful background.

That changes the risk profile.

The problem is no longer only “Can the model ignore the instruction in this document?” It becomes “What is this system allowed to remember from this document, and how will we know whether that memory is trustworthy later?”

Memory is not neutral

It is tempting to think of AI memory as a convenience feature.

In practice, memory is a ranking system, a trust system and sometimes a decision system.

When an agent works on a software project, remembered context might tell it which architecture patterns to prefer, which service owns a domain concept, which tests are unreliable, which deployment path is normal or which customer requirement matters most.

When an internal AI assistant helps a business team, remembered context might influence which policy it applies, which customer data it surfaces, which supplier terms it considers valid or which workflow step it recommends next.

If that memory is wrong, the agent does not merely produce a wrong sentence. It may repeatedly make the wrong kind of decision.

The risk becomes sharper when memory is shared:

  • one employee uploads a document that later affects everyone;
  • a project summary carries an incorrect security assumption into future work;
  • a support conversation becomes training context for another customer;
  • a low-trust source is mixed into a trusted knowledge base;
  • an agent summarizes poisoned content into a cleaner-looking note that loses its warning signs.

That last case matters. Once untrusted content is summarized, it often looks more trustworthy than it is.

This is where normal application architecture instincts are useful. We already know that data needs source, ownership, validation, retention and access rules. AI memory is not exempt from that just because it sits next to a model.

The dangerous shortcut: “just connect the knowledge base”

A common AI-agent project starts innocently.

“Let’s connect it to our documents.”

That sounds sensible. Most companies have knowledge scattered across tools: Google Drive, SharePoint, Notion, Slack, Jira, GitHub, email, PDFs, spreadsheets and old wikis. Giving an assistant access to that context can genuinely reduce friction.

But a company knowledge base is rarely clean.

It contains drafts, obsolete processes, duplicated policies, conflicting decisions, customer-specific exceptions, old credentials pasted into documents, test data, private HR notes, sales promises, screenshots, comments, TODOs and opinions that were never meant to become operational truth.

Humans handle this mess through judgement. We know that a 2022 document may be stale, that a Slack message may be casual, that a draft proposal is not a signed contract, and that one engineer’s note is not necessarily the architecture.

An agent needs that judgement designed into the system.

If every retrievable item has the same weight, the agent will not reliably know the difference between:

  • approved policy and outdated draft;
  • production architecture and brainstorming note;
  • customer-specific workaround and general rule;
  • trusted source and uploaded attacker-controlled content;
  • internal fact and text copied from the public internet.

The result can look impressive in demos and still be unsafe in production.

What should be designed before persistent memory goes live

AI memory does not need a huge governance programme before any useful work can happen. But it does need some deliberate boundaries.

The first question is simple:

What is this agent allowed to remember?

That should be narrower than “anything it sees.” Many useful agents do not need permanent memory for every conversation, file or task. Some need only short-lived task context. Some need project memory but not customer memory. Some should retrieve documents but never write new long-term facts without approval.

The second question is:

Who or what is allowed to write memory?

There is a large difference between a human explicitly saving a project decision and an agent automatically summarizing an untrusted document into a long-term store. Automatic memory writes are convenient, but they are also a place where poisoned content can become persistent.

The third question is:

How is trust represented?

A memory entry should not be just text. It should carry provenance: source, timestamp, owner, scope, confidence, retention rule and, where relevant, approval state. A current signed contract should not be treated like a three-year-old meeting note. A production incident runbook should not be treated like an AI-generated summary that nobody reviewed.

The fourth question is:

Can memory be audited, corrected and rolled back?

If an agent starts behaving strangely, the team needs to inspect what it has been retrieving and what was recently added. Without logs and rollback, memory problems become difficult to diagnose. The agent may give plausible answers while quietly relying on bad context.

The fifth question is:

Which actions require fresh verification?

Even trusted memory should not be enough for high-impact actions. If an agent is about to deploy, delete, invoice, approve, email, modify permissions or expose customer data, it should verify against current authoritative sources and, in many cases, ask for human approval.

Memory is useful. It is not authority.

A practical memory architecture for internal agents

For many companies, a safe starting point is not complicated. It is structured.

Separate memory into categories.

There is working memory: temporary context for the current task. It can be useful and messy because it expires quickly.

There is project memory: decisions, architecture notes, accepted terminology, open risks and operational constraints. This should have provenance and review rules.

There is retrieved knowledge: documents the agent can search, but not necessarily trust as truth. Retrieval should respect access permissions, tenant boundaries and document freshness.

There is authoritative data: systems of record such as the CRM, ERP, billing system, production database or source repository. The agent may read these through controlled APIs, but it should not replace their authority with remembered summaries.

These categories should not blur together.

A bad pattern is one global memory bucket that stores everything: chats, summaries, uploaded documents, generated notes, customer-specific facts and operational instructions. It may be easy to build. It is hard to reason about.

A better pattern is memory with clear boundaries:

  • per-user and per-tenant isolation;
  • separate stores for temporary and durable context;
  • approval before important memory writes become durable;
  • source attribution on retrieved snippets;
  • expiry rules for low-confidence or temporary material;
  • quarantine for untrusted uploads;
  • logs for memory reads and writes;
  • tests that simulate poisoned documents and misleading summaries.

This does not make the system perfect. It makes the risk visible and manageable.

Why this matters for AI-assisted software development

Coding agents are an especially interesting case.

They work better when they understand the codebase. They need context: conventions, architecture decisions, failed approaches, test strategy, deployment rules and product constraints. A coding agent without memory wastes time rediscovering the same things.

But a coding agent with uncontrolled memory can also carry bad assumptions forward.

Imagine it remembers that a failing test is “probably flaky” when it is actually protecting a critical edge case. Or that direct database writes are normal because it saw an old migration script. Or that a service boundary can be ignored because one emergency patch did it six months ago. Or that a dependency is approved because a draft technical note said so.

The agent may not break the system in one dramatic action. It may slowly normalize the wrong pattern.

That is why memory design belongs next to architecture and review design. The team should decide which conventions are official, which notes are only historical, which tests are mandatory, which environments are off limits and which changes require human review.

McDougall Digital’s view is simple: AI-assisted development is valuable when it increases delivery capacity without weakening architectural ownership. Persistent context can help, but only when the team can see where that context came from and decide how much authority it has.

The client question: are you building an assistant or an operational actor?

The level of caution depends on what the AI system can do.

If an assistant only drafts internal text and has no persistent memory, the risk is limited.

If it retrieves company documents, the risk increases.

If it writes memory that affects later answers, the risk increases again.

If it can call tools, update systems, create pull requests, send messages, change data, approve workflow steps or interact with customers, memory becomes part of operational security.

This is the point where teams should stop treating the project as “an AI feature” and start treating it as software architecture.

Before connecting an agent to real business workflows, ask:

  • What sources can it read?
  • Which sources are authoritative?
  • What can it remember permanently?
  • Who can poison those sources, intentionally or accidentally?
  • How is memory separated between users, customers and projects?
  • What gets logged when memory changes?
  • Who can review or delete a memory entry?
  • Which actions require current source checks rather than remembered context?
  • What happens if the memory store is wrong?

These are not theoretical questions. They are the difference between a useful internal assistant and a system that quietly accumulates operational risk.

How McDougall Digital can help

AI agents can be genuinely useful in serious products and internal systems. They can reduce repeated work, make company knowledge easier to use, support development teams and automate narrow operational workflows.

But they need a clear architecture around memory, tools and responsibility.

McDougall Digital helps teams review AI-agent and RAG-based systems before they become production dependencies. That can include mapping trusted and untrusted context, separating temporary from durable memory, defining retrieval boundaries, checking access control, designing approval gates, adding audit trails and hardening the surrounding software.

The goal is not to make AI adoption slow.

The goal is to make it trustworthy enough to matter.

If your company is connecting agents to documents, repositories, customer data or internal tools, the right question is no longer only “Does the model answer well?”

The better question is:

What is this agent allowed to remember, and why should we trust it?

Continue Reading