Manthan

Towards Human like Memory for AI Agents

· Manthan Gupta

Memory has become one of the most talked about topics in the AI community. Every few weeks, we see new papers, frameworks, and posts, each claiming to “solve” memory for LLMs. Yet most of them are glorified caches that store tokens, not thoughts. Today’s memory systems rely on brute force retrieval: dumping everything into a vector database or rank fusing it with BM25. These approaches scale clutter, not cognition.

If we want agents that truly understand and evolve with users, we need something deeper: metacognition. In this blog post, I will share how I believe the memory layer of agents will evolve to become more human-like.

Why current memory systems fail?

Most memory systems in today’s AI applications are built on a simple idea: Store every message, event, or summary, and retrieve when needed. It sounds reasonable until you scale it. Soon, these vector stores become cluttered with redundant, low value embeddings, each competing for relevance. The result is context bloat, incoherent responses, and rising retrieval latency.

Most people equate “memory” with “retrieval”. But retrieval is only one piece of the puzzle. Whether it’s BM25, RRF, or hybrid search, the pipeline is still reactive. It finds data; it doesn’t understand or organize it.

Even when systems claim “long-term personalization”, what they really do is re-fetch old facts. There is no mechanism for memory consolidation or refinement. Over time, agents become stale mirrors of early data rather than adaptive representations of the user.

Real intelligence depends on what we choose to forget. Without selective forgetting, retrieval becomes slower, relevance drifts, and personalization breaks down. The system ends up remembering everything except what’s important.

How the Human Brain Actually Manages Memory?

Every piece of information we encounter goes through a pipeline of encoding -> storage -> consolidation -> retrieval -> forgetting.

Core Memory

Before we even think, perceive, or reason, we carry a sense of self, a stable, persistent representation of who we are. This core memory defines our personality, preferences, and biases. For humans, this includes core beliefs (“I am optimistic”), personality traits (“I am emotional, and not analytical”), and stable facts (“I live in Bangalore”). It’s what makes our memories coherent.

An agent should have its own core memory per user - a structured, evolving representation of the user’s identity and behavioral tendencies.

Sensory Memory

Every thought begins with a signal from our senses. Sensory memory is brief, lasting only for a few seconds, but acts as a filter.

Agents need a sensory layer that captures and filters signals from all modalities: messages, files, events, images, audio, etc., deciding what’s worth noticing before ingestion

Short-Term Memory

Once information is filtered, it enters the short-term memory. Here we reason, compare, and combine thoughts in real-time.

A human-inspired approach would actively manage focus, relevance, and topic continuity across reasoning sessions.

Long-Term Memory

Information that proves meaningful or repeatedly reinforced is encoded into long-term memory.

It’s categorized into:

  • Semantic memory: Facts, concepts, and relationships
  • Episodic memory: Personal experiences and events

Unlike vector dumps, these memories form conceptual graphs that grow over time.

Consolidation and Forgetting

At night, the human brain replays, reorganizes, and compresses the day’s experiences. This consolidation decides what to store permanently and what to discard. Meanwhile, forgetting prunes unused pathways, freeing up cognitive space.

Agents can run periodic memory consolidation cycles, deciding what becomes part of core memory, and what moves to long-term memory, and what’s deleted.

The Agentic Memory Architecture

If we think of an agent as a brain, it’s memory shouldn’t just store information, it should understand, reorganize, and evolve with the user. The architecture I envision follows how the human mind processes experience over time, moving from fleeting impressions to stable knowledge, guided by identity.

It consists of five interconnected layers: Sensory Memory, Short-Term Memory, Long-Term Memory, Memory Managers, and Core Memory.

Sensory Memory

Every interaction starts here, the sensory layer. It is responsible for capturing raw input and deciding what even deserves attention.

  • Filters noise using lightweight relevance heuristics or LLM scoring
  • Extract entities, topics, and relations from messages
  • Passes salient content to short-term memory for temporary storage

This step prevents vector database overload and ensures only meaningful units of cognition enter memory.

Short-Term Memory

Once sensory data is filtered, it lives in short-term memory - a buffer where active topics, reasoning traces, and transient context are stored.

  • Maintains continuity across short spans of interaction
  • Group related messages under shared topics
  • If a topic persists across multiple interactions, it’s promoted to long-term memory.

Long-Term Memory

When a topic stabilises, it’s transferred to long-term memory. Here information becomes indexed, semantic, and retrievable across sessions.

  • Stored consolidated knowledge of the user’s world
  • Organizes information by entity, topic, and relationship
  • Enables behavioral reuse and personalization

Memory Managers

This is the agent’s sleep cycle - a periodic background process that keeps memory balanced.

  • Periodically reviews short-term and long-term memory
  • Promotes significant topics to core memory or demotes outdated ones
  • Archives or deletes irrelevant memories to prevent clutter
  • Generates reflections and meta-summaries for reasoning improvement

Core Memory

At the foundation lies core memory, the most persistent layer that defines who the user is. It gives every retrieval and reasoning cycle a consistent perspective.

  • Stores stable facts, personality traits, and behavioral tendencies
  • Evolves slowly as new long-term patterns emerge
  • Shapes how all other memories are interpreted and retrieved

Putting it all together

  1. Input enters the sensory layer, gets filtered and tagged
  2. Short-term memory holds the topics temporarily
  3. If relevance persists, long-term memory encodes it permanently
  4. Memory managers periodically refine, compress, and prune memory
  5. Core memory anchors everything, shaping how the agent perceives, recalls, and adapts

Conclusion

The evolution of agentic memory won’t be about storing more. It will be about thinking better. Once agents can reflect on their own memories, deciding what helped, what didn’t, and what to learn, memory stops being passive storage and becomes an active part of reasoning.

That’s where intelligence begins: not in retrieval, but in reflection.

If you liked the blog, don’t forget to share it and follow me on Twitter, LinkedIn, and Peerlist, and tag me. You may also write to me at guptaamanthan01[at]gmail[dot]com to share your thoughts about the blog.