How to Build a Universal Long-Term Memory Framework for AI Agents Using Mem0 and OpenAI

In this tutorial, we build a long-term memory layer for AI agents using Mem0, OpenAI models, and ChromaDB. We build a system that extracts structured memories from natural conversations, stores them as vector embeddings, retrieves them semantically, and injects them directly into the responses of a personalized agent. We go beyond simple chat history to persistent, user-scoped memory with full CRUD control, semantic search, multi-user segmentation, and custom configuration. Finally, we assemble a production-ready, memory-augmented agent architecture that demonstrates how modern AI systems can build on accumulated context rather than start from scratch each session.

!pip install mem0ai openai rich chromadb -q


import os
import getpass
from datetime import datetime


print("=" * 60)
print("🔐  MEM0 Advanced Tutorial — API Key Setup")
print("=" * 60)


OPENAI_API_KEY = getpass.getpass("Enter your OpenAI API key: ")
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY


print("\n✅ API key set!\n")


from openai import OpenAI
from mem0 import Memory
from rich.console import Console
from rich.panel import Panel
from rich.table import Table
from rich.markdown import Markdown
from rich import print as rprint
import json


console = Console()
openai_client = OpenAI()


console.rule("[bold cyan]MODULE 1: Basic Memory Setup[/bold cyan]")


memory = Memory()


print(Panel(
    "[green]✓ Memory instance created with default config[/green]\n"
    "  • LLM: gpt-4.1-nano (OpenAI)\n"
    "  • Vector Store: ChromaDB (local)\n"
    "  • Embedder: text-embedding-3-small",
    title="Memory Config", border_style="cyan"
))

We install all necessary dependencies and securely configure our OpenAI API key. We then initialize the Mem0 Memory instance, the OpenAI client, and Rich console utilities. This establishes the foundation of our long-term memory system, with automatic configuration powered by ChromaDB and OpenAI embeddings.

console.rule("[bold cyan]MODULE 2: Adding & Retrieving Memories[/bold cyan]")


USER_ID = "alice_tutorial"


print("\n📝 Adding memories for user:", USER_ID)


conversations = [
   [
       {"role": "user", "content": "Hi! I'm Alice. I'm a software engineer who loves Python and machine learning."},
       {"role": "assistant", "content": "Nice to meet you Alice! Python and ML are great areas to be in."}
   ],
   [
       {"role": "user", "content": "I prefer dark mode in all my IDEs and I use VS Code as my main editor."},
       {"role": "assistant", "content": "Good to know! VS Code with dark mode is a popular combo."}
   ],
   [
       {"role": "user", "content": "I'm currently building a RAG pipeline for my company's internal docs. It's for a fintech startup."},
       {"role": "assistant", "content": "That's exciting! RAG pipelines are really valuable for enterprise use cases."}
   ],
   [
       {"role": "user", "content": "I have a dog named Max and I enjoy hiking on weekends."},
       {"role": "assistant", "content": "Max sounds lovely! Hiking is a great way to recharge."}
   ],
]


results = []
for i, convo in enumerate(conversations):
   result = memory.add(convo, user_id=USER_ID)
   extracted = result.get("results", [])
   for mem in extracted:
       results.append(mem)
   print(f"  Conversation {i+1}: {len(extracted)} memory(ies) extracted")


print(f"\n✅ Total memories stored: {len(results)}")

We simulate realistic multi-turn conversations and store them using Mem0's default memory extraction pipeline. We add structured conversational data for a given user and let the LLM extract valuable long-term facts. We then count how many memories were created, verifying that semantic information is successfully persisted.
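Depending on the Mem0 version, `memory.add` may return a bare list of extracted memories or a dict wrapping them under a `results` key (as the code above assumes). A small normalizing helper, purely illustrative and not part of the Mem0 API, makes the extraction count robust to both shapes:

```python
def extracted_memories(result) -> list:
    """Normalize memory.add's return value to a list of memory dicts.

    Handles both shapes seen across Mem0 versions: a bare list,
    or a dict wrapping the list under a "results" key.
    """
    if isinstance(result, dict):
        return result.get("results", [])
    return list(result or [])

# Works on either shape, and tolerates None:
print(len(extracted_memories({"results": [{"memory": "Loves Python"}]})))  # 1
print(len(extracted_memories([{"memory": "Uses VS Code"}])))               # 1
print(len(extracted_memories(None)))                                       # 0
```

Inside the loop, `extracted = extracted_memories(result)` would then replace the direct `result.get("results", [])` call.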

console.rule("[bold cyan]MODULE 3: Semantic Search[/bold cyan]")


queries = [
   "What programming languages does the user prefer?",
   "What is Alice working on professionally?",
   "What are Alice's hobbies?",
   "What tools and IDE does Alice use?",
]


for query in queries:
   search_results = memory.search(query=query, user_id=USER_ID, limit=2)
   table = Table(title=f"🔍 Query: {query}", show_lines=True)
   table.add_column("Memory", style="white", max_width=60)
   table.add_column("Score", style="green", justify="center")
  
   for r in search_results.get("results", []):
       score = r.get("score", "N/A")
       score_str = f"{score:.4f}" if isinstance(score, float) else str(score)
       table.add_row(r["memory"], score_str)
  
   console.print(table)
   print()


console.rule("[bold cyan]MODULE 4: CRUD Operations[/bold cyan]")


all_memories = memory.get_all(user_id=USER_ID)
memories_list = all_memories.get("results", [])


print(f"\n📚 All memories for '{USER_ID}':")
for i, mem in enumerate(memories_list):
   print(f"  [{i+1}] ID: {mem['id'][:8]}...  →  {mem['memory']}")


if memories_list:
   first_id = memories_list[0]["id"]
   original_text = memories_list[0]["memory"]
  
   print(f"\n✏️  Updating memory: '{original_text}'")
   memory.update(memory_id=first_id, data=original_text + " (confirmed)")
  
   updated = memory.get(memory_id=first_id)
   print(f"   After update: '{updated['memory']}'")

We run semantic search queries to retrieve relevant memories using natural language. We show how Mem0 ranks stored memories by embedding similarity and returns the most contextually aligned results. We also perform CRUD operations by listing, updating, and validating stored entries.
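Under the hood, this kind of semantic search rests on embedding similarity: the query and each stored memory are embedded as vectors, and results are ranked by cosine similarity. A minimal, dependency-free sketch of that ranking signal (the vectors here are toy values, not real embeddings):

```python
import math

def cosine_similarity(a: list, b: list) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": the query points in nearly the same direction as memory_1
query    = [0.9, 0.1, 0.0]
memory_1 = [0.8, 0.2, 0.1]   # e.g. "Alice loves Python"
memory_2 = [0.0, 0.1, 0.9]   # e.g. "Alice has a dog named Max"

print(cosine_similarity(query, memory_1) > cosine_similarity(query, memory_2))  # True
```

Real embedders like text-embedding-3-small produce vectors with ~1,500 dimensions, but the ranking principle is the same.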

console.rule("[bold cyan]MODULE 5: Memory-Augmented Chat[/bold cyan]")


def chat_with_memory(user_message: str, user_id: str, session_history: list) -> str:
  
   relevant = memory.search(query=user_message, user_id=user_id, limit=5)
   memory_context = "n".join(
       f"- {r['memory']}" for r in relevant.get("results", [])
   ) or "No relevant memories found."
  
   system_prompt = f"""You are a highly personalized AI assistant.
You have access to long-term memories about this user.


RELEVANT USER MEMORIES:
{memory_context}


Use these memories to provide context-aware, personalized responses.
Be natural — don't explicitly announce that you're using memories."""
  
   messages = [{"role": "system", "content": system_prompt}]
   messages.extend(session_history[-6:])
   messages.append({"role": "user", "content": user_message})
  
   response = openai_client.chat.completions.create(
       model="gpt-4.1-nano-2025-04-14",
       messages=messages
   )
   assistant_response = response.choices[0].message.content
  
   exchange = [
       {"role": "user", "content": user_message},
       {"role": "assistant", "content": assistant_response}
   ]
   memory.add(exchange, user_id=user_id)
  
   session_history.append({"role": "user", "content": user_message})
   session_history.append({"role": "assistant", "content": assistant_response})
  
   return assistant_response




session = []
demo_messages = [
   "Can you recommend a good IDE setup for me?",
   "What kind of project am I currently building at work?",
   "Suggest a weekend activity I might enjoy.",
   "What's a good tech stack for my current project?",
]


print("\n🤖 Starting memory-augmented conversation with Alice...\n")


for msg in demo_messages:
   print(Panel(f"[bold yellow]User:[/bold yellow] {msg}", border_style="yellow"))
   response = chat_with_memory(msg, USER_ID, session)
   print(Panel(f"[bold green]Assistant:[/bold green] {response}", border_style="green"))
   print()

We create a conversational loop with full memory integration that retrieves relevant memories before generating each response. We dynamically inject personal context into the system prompt and store each new exchange back into long-term memory. We simulate a multi-turn session to demonstrate continuous context and personalization in action.
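The `session_history[-6:]` slice above caps short-term context by message count. An alternative is to trim by an approximate size budget instead, so a few very long messages can't blow up the prompt. A sketch (the 2,000-character budget is an arbitrary assumption; a production version would count tokens):

```python
def trim_history(history: list, max_chars: int = 2000) -> list:
    """Keep the most recent messages whose combined content fits max_chars.

    Walks the history from newest to oldest, always keeping at least
    the most recent message even if it alone exceeds the budget.
    """
    trimmed, used = [], 0
    for msg in reversed(history):
        cost = len(msg["content"])
        if used + cost > max_chars and trimmed:
            break
        trimmed.append(msg)
        used += cost
    return list(reversed(trimmed))

history = [
    {"role": "user", "content": "a" * 1500},
    {"role": "assistant", "content": "b" * 1500},
    {"role": "user", "content": "c" * 400},
]
print(len(trim_history(history)))  # 2 — only the last two messages fit in 2000 chars
```

Swapping `session_history[-6:]` for `trim_history(session_history)` would make the context window size-aware rather than count-aware.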

console.rule("[bold cyan]MODULE 6: Multi-User Memory Isolation[/bold cyan]")


USER_BOB = "bob_tutorial"


bob_conversations = [
   [
       {"role": "user", "content": "I'm Bob, a data scientist specializing in computer vision and PyTorch."},
       {"role": "assistant", "content": "Great to meet you Bob!"}
   ],
   [
       {"role": "user", "content": "I prefer Jupyter notebooks over VS Code, and I use Vim keybindings."},
       {"role": "assistant", "content": "Classic setup for data science work!"}
   ],
]


for convo in bob_conversations:
   memory.add(convo, user_id=USER_BOB)


print("\n🔐 Testing memory isolation between Alice and Bob:\n")


test_query = "What programming tools does this user prefer?"


alice_results = memory.search(query=test_query, user_id=USER_ID, limit=3)
bob_results = memory.search(query=test_query, user_id=USER_BOB, limit=3)


print("👩 Alice's memories:")
for r in alice_results.get("results", []):
   print(f"   • {r['memory']}")


print("\n👨 Bob's memories:")
for r in bob_results.get("results", []):
   print(f"   • {r['memory']}")

We demonstrate user-level memory partitioning by introducing a second user with different preferences. We store separate chat data and ensure that every search is scoped to the correct user_id. We confirm that memory namespaces stay separated, enabling secure deployment of multi-user agents.
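One way to make forgetting the `user_id` argument impossible is a thin wrapper that binds it once. A sketch, assuming only the `add`/`search`/`get_all` methods used in this tutorial, demonstrated against a minimal fake store so it runs without API calls:

```python
class UserMemory:
    """Binds a fixed user_id to a Mem0-style memory object."""
    def __init__(self, store, user_id: str):
        self._store = store
        self._user_id = user_id

    def add(self, messages):
        return self._store.add(messages, user_id=self._user_id)

    def search(self, query: str, limit: int = 3):
        return self._store.search(query=query, user_id=self._user_id, limit=limit)

    def get_all(self):
        return self._store.get_all(user_id=self._user_id)

class FakeStore:
    """Tiny in-memory stand-in for Memory, keyed by user_id."""
    def __init__(self):
        self.data = {}
    def add(self, messages, user_id):
        self.data.setdefault(user_id, []).extend(messages)
    def search(self, query, user_id, limit):
        return {"results": self.data.get(user_id, [])[:limit]}
    def get_all(self, user_id):
        return {"results": self.data.get(user_id, [])}

store = FakeStore()
alice = UserMemory(store, "alice_tutorial")
bob = UserMemory(store, "bob_tutorial")
alice.add([{"memory": "Loves Python"}])
bob.add([{"memory": "Prefers Jupyter"}])
print(alice.get_all())  # only Alice's entries come back
```

With the real library, `UserMemory(memory, USER_ID)` would wrap the actual `Memory` instance the same way, so no call site can accidentally cross user boundaries.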

print("n✅ Memory isolation confirmed — users cannot see each other's data.")


console.rule("[bold cyan]MODULE 7: Custom Configuration[/bold cyan]")


custom_config = {
   "llm": {
       "provider": "openai",
       "config": {
           "model": "gpt-4.1-nano-2025-04-14",
           "temperature": 0.1,
           "max_tokens": 2000,
       }
   },
   "embedder": {
       "provider": "openai",
       "config": {
           "model": "text-embedding-3-small",
       }
   },
   "vector_store": {
       "provider": "chroma",
       "config": {
           "collection_name": "advanced_tutorial_v2",
           "path": "/tmp/chroma_advanced",
       }
   },
   "version": "v1.1"
}


custom_memory = Memory.from_config(custom_config)


print(Panel(
    "[green]✓ Custom memory instance created[/green]\n"
    "  • LLM: gpt-4.1-nano with temperature=0.1\n"
    "  • Embedder: text-embedding-3-small\n"
    "  • Vector Store: ChromaDB at /tmp/chroma_advanced\n"
    "  • Collection: advanced_tutorial_v2",
    title="Custom Config Applied", border_style="magenta"
))


custom_memory.add(
   [{"role": "user", "content": "I'm a researcher studying neural plasticity and brain-computer interfaces."}],
   user_id="researcher_01"
)


result = custom_memory.search("What field does this person work in?", user_id="researcher_01", limit=2)
print("\n🔍 Custom memory search result:")
for r in result.get("results", []):
   print(f"   • {r['memory']}")


console.rule("[bold cyan]MODULE 8: Memory History[/bold cyan]")


all_alice = memory.get_all(user_id=USER_ID)
alice_memories = all_alice.get("results", [])


table = Table(title=f"📋 Full Memory Profile: {USER_ID}", show_lines=True, width=90)
table.add_column("#", style="dim", width=3)
table.add_column("Memory ID", style="cyan", width=12)
table.add_column("Memory Content", style="white")
table.add_column("Created At", style="yellow", width=12)


for i, mem in enumerate(alice_memories):
   mem_id = mem["id"][:8] + "..."
   created = mem.get("created_at", "N/A")
   if created and created != "N/A":
       try:
           created = datetime.fromisoformat(created.replace("Z", "+00:00")).strftime("%m/%d %H:%M")
       except (ValueError, TypeError):
           created = str(created)[:10]
   table.add_row(str(i+1), mem_id, mem["memory"], created)


console.print(table)


console.rule("[bold cyan]MODULE 9: Memory Deletion[/bold cyan]")


all_mems = memory.get_all(user_id=USER_ID).get("results", [])
if all_mems:
   last_mem = all_mems[-1]
   print(f"\n🗑️  Deleting memory: '{last_mem['memory']}'")
   memory.delete(memory_id=last_mem["id"])
  
   updated_count = len(memory.get_all(user_id=USER_ID).get("results", []))
   print(f"✅ Deleted. Remaining memories for {USER_ID}: {updated_count}")


console.rule("[bold cyan]✅ TUTORIAL COMPLETE[/bold cyan]")


summary = """
# 🎓 Mem0 Advanced Tutorial Summary


## What You Learned:
1. **Basic Setup** — Instantiate Memory with default & custom configs
2. **Add Memories** — From conversations (auto-extracted by LLM)
3. **Semantic Search** — Retrieve relevant memories by natural language query
4. **CRUD Operations** — Get, Update, Delete individual memories
5. **Memory-Augmented Chat** — Full pipeline: retrieve → respond → store
6. **Multi-User Isolation** — Separate memory namespaces per user_id
7. **Custom Configuration** — Custom LLM, embedder, and vector store
8. **Memory History** — View full memory profiles with timestamps
9. **Cleanup** — Delete specific or all memories


## Key Concepts:
- `memory.add(messages, user_id=...)`
- `memory.search(query, user_id=...)`
- `memory.get_all(user_id=...)`
- `memory.update(memory_id, data)`
- `memory.delete(memory_id)`
- `Memory.from_config(config)`


## Next Steps:
- Swap ChromaDB for Qdrant, Pinecone, or Weaviate
- Use the hosted Mem0 Platform (app.mem0.ai) for production
- Integrate with LangChain, CrewAI, or LangGraph agents
- Add `agent_id` for agent-level memory scoping
"""


console.print(Markdown(summary))

We build a fully customized Mem0 configuration with explicit parameters for the LLM, embedder, and vector store. We test the custom memory instance and inspect memory history, timestamps, and structured profiles. Finally, we demonstrate deletion and cleanup operations, completing the full lifecycle management of long-term agent memory.
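Because the backend is config-driven, swapping ChromaDB for another vector store is mostly a config change. A hedged sketch of what a Qdrant-backed config might look like, following the same pattern as the Chroma config above (the exact key names for the Qdrant provider are an assumption; check the Mem0 docs before using):

```python
# Hypothetical Qdrant config, mirroring the Chroma config from Module 7.
qdrant_config = {
    "llm": {
        "provider": "openai",
        "config": {"model": "gpt-4.1-nano-2025-04-14", "temperature": 0.1},
    },
    "embedder": {
        "provider": "openai",
        "config": {"model": "text-embedding-3-small"},
    },
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "collection_name": "advanced_tutorial_v2",
            "host": "localhost",   # assumed locally running Qdrant instance
            "port": 6333,          # Qdrant's default port
        },
    },
}
# qdrant_memory = Memory.from_config(qdrant_config)  # same API as before
```

The rest of the tutorial code (`add`, `search`, `get_all`, and so on) would run unchanged against the new instance.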

In conclusion, we implemented a complete memory infrastructure for AI agents that uses Mem0 as the universal memory layer. We added, retrieved, updated, deleted, partitioned, and customized long-term memories while integrating them into a dynamic conversational loop, and we saw how semantic memory retrieval transforms ordinary assistants into context-aware systems capable of personalization and continuity across sessions. With this foundation in place, we are equipped to extend the architecture to multi-agent systems, enterprise deployments, alternative vector databases, and advanced agent frameworks, turning memory into a core capability rather than an afterthought.

