How to Build a Universal Long-Term Memory Framework for AI Agents Using Mem0 and OpenAI

In this tutorial, we build a long-term memory layer for AI agents using Mem0, OpenAI models, and ChromaDB. We build a system that extracts structured memories from natural conversations, stores them as vector embeddings, retrieves them semantically, and injects them directly into the responses of a personalized agent. We go beyond simple chat history to persistent, user-scoped memory with full CRUD control, semantic search, multi-user segmentation, and custom configuration. Finally, we assemble a production-ready, memory-augmented agent architecture that demonstrates how modern AI systems can build on accumulated context rather than start from scratch each session.

!pip install mem0ai openai rich chromadb -q


import os
import getpass
from datetime import datetime


print("=" * 60)
print("🔐  MEM0 Advanced Tutorial — API Key Setup")
print("=" * 60)


OPENAI_API_KEY = getpass.getpass("Enter your OpenAI API key: ")
os.environ["OPENAI_API_KEY"] = OPENAI_API_KEY


print("\n✅ API key set!\n")


from openai import OpenAI
from mem0 import Memory
from rich.console import Console
from rich.panel import Panel
from rich.table import Table
from rich.markdown import Markdown
from rich import print as rprint
import json


console = Console()
openai_client = OpenAI()


console.rule("[bold cyan]MODULE 1: Basic Memory Setup[/bold cyan]")


memory = Memory()


print(Panel(
    "[green]✓ Memory instance created with default config[/green]\n"
    "  • LLM: gpt-4.1-nano (OpenAI)\n"
    "  • Vector Store: ChromaDB (local)\n"
    "  • Embedder: text-embedding-3-small",
    title="Memory Config", border_style="cyan"
))

We install all necessary dependencies and securely configure our OpenAI API key. We then initialize the Mem0 Memory instance, the OpenAI client, and Rich console utilities. This establishes the foundation of our long-term memory system, with automatic configuration powered by ChromaDB and OpenAI embeddings.

console.rule("[bold cyan]MODULE 2: Adding & Retrieving Memories[/bold cyan]")


USER_ID = "alice_tutorial"


print("\n📝 Adding memories for user:", USER_ID)


conversations = [
   [
       {"role": "user", "content": "Hi! I'm Alice. I'm a software engineer who loves Python and machine learning."},
       {"role": "assistant", "content": "Nice to meet you Alice! Python and ML are great areas to be in."}
   ],
   [
       {"role": "user", "content": "I prefer dark mode in all my IDEs and I use VS Code as my main editor."},
       {"role": "assistant", "content": "Good to know! VS Code with dark mode is a popular combo."}
   ],
   [
       {"role": "user", "content": "I'm currently building a RAG pipeline for my company's internal docs. It's for a fintech startup."},
       {"role": "assistant", "content": "That's exciting! RAG pipelines are really valuable for enterprise use cases."}
   ],
   [
       {"role": "user", "content": "I have a dog named Max and I enjoy hiking on weekends."},
       {"role": "assistant", "content": "Max sounds lovely! Hiking is a great way to recharge."}
   ],
]


results = []
for i, convo in enumerate(conversations):
   result = memory.add(convo, user_id=USER_ID)
   extracted = result.get("results", [])
   for mem in extracted:
       results.append(mem)
   print(f"  Conversation {i+1}: {len(extracted)} memory(ies) extracted")


print(f"\n✅ Total memories stored: {len(results)}")

We simulate realistic multi-turn conversations and store them using Mem0's default memory extraction pipeline. We add structured conversational data for a given user and let the LLM extract valuable long-term facts. We then count how many memories were created, verifying that semantic information is successfully persisted.
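Depending on the Mem0 version, `memory.add` may return a bare list of extracted memories or a dict wrapping them under a `results` key (as the code above assumes). A small normalizing helper, purely illustrative and not part of the Mem0 API, makes the extraction count robust to both shapes:

```python
def extracted_memories(result) -> list:
    """Normalize memory.add's return value to a list of memory dicts.

    Handles both shapes seen across Mem0 versions: a bare list,
    or a dict wrapping the list under a "results" key.
    """
    if isinstance(result, dict):
        return result.get("results", [])
    return list(result or [])

# Works on either shape, and tolerates None:
print(len(extracted_memories({"results": [{"memory": "Loves Python"}]})))  # 1
print(len(extracted_memories([{"memory": "Uses VS Code"}])))               # 1
print(len(extracted_memories(None)))                                       # 0
```

Inside the loop, `extracted = extracted_memories(result)` would then replace the direct `result.get("results", [])` call.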

console.rule("[bold cyan]MODULE 3: Semantic Search[/bold cyan]")


queries = [
   "What programming languages does the user prefer?",
   "What is Alice working on professionally?",
   "What are Alice's hobbies?",
   "What tools and IDE does Alice use?",
]


for query in queries:
   search_results = memory.search(query=query, user_id=USER_ID, limit=2)
   table = Table(title=f"🔍 Query: {query}", show_lines=True)
   table.add_column("Memory", style="white", max_width=60)
   table.add_column("Score", style="green", justify="center")
  
   for r in search_results.get("results", []):
       score = r.get("score", "N/A")
       score_str = f"{score:.4f}" if isinstance(score, float) else str(score)
       table.add_row(r["memory"], score_str)
  
   console.print(table)
   print()


console.rule("[bold cyan]MODULE 4: CRUD Operations[/bold cyan]")


all_memories = memory.get_all(user_id=USER_ID)
memories_list = all_memories.get("results", [])


print(f"\n📚 All memories for '{USER_ID}':")
for i, mem in enumerate(memories_list):
   print(f"  [{i+1}] ID: {mem['id'][:8]}...  →  {mem['memory']}")


if memories_list:
   first_id = memories_list[0]["id"]
   original_text = memories_list[0]["memory"]
  
   print(f"\n✏️  Updating memory: '{original_text}'")
   memory.update(memory_id=first_id, data=original_text + " (confirmed)")
  
   updated = memory.get(memory_id=first_id)
   print(f"   After update: '{updated['memory']}'")

We run semantic search queries to retrieve relevant memories using natural language. We show how Mem0 ranks stored memories by embedding similarity and returns the most contextually aligned results. We also perform CRUD operations by listing, updating, and validating stored entries.
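Under the hood, this kind of semantic search rests on embedding similarity: the query and each stored memory are embedded as vectors, and results are ranked by cosine similarity. A minimal, dependency-free sketch of that ranking signal (the vectors here are toy values, not real embeddings):

```python
import math

def cosine_similarity(a: list, b: list) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy "embeddings": the query points in nearly the same direction as memory_1
query    = [0.9, 0.1, 0.0]
memory_1 = [0.8, 0.2, 0.1]   # e.g. "Alice loves Python"
memory_2 = [0.0, 0.1, 0.9]   # e.g. "Alice has a dog named Max"

print(cosine_similarity(query, memory_1) > cosine_similarity(query, memory_2))  # True
```

Real embedders like text-embedding-3-small produce vectors with ~1,500 dimensions, but the ranking principle is the same.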

console.rule("[bold cyan]MODULE 5: Memory-Augmented Chat[/bold cyan]")


def chat_with_memory(user_message: str, user_id: str, session_history: list) -> str:
  
   relevant = memory.search(query=user_message, user_id=user_id, limit=5)
   memory_context = "n".join(
       f"- {r['memory']}" for r in relevant.get("results", [])
   ) or "No relevant memories found."
  
   system_prompt = f"""You are a highly personalized AI assistant.
You have access to long-term memories about this user.


RELEVANT USER MEMORIES:
{memory_context}


Use these memories to provide context-aware, personalized responses.
Be natural — don't explicitly announce that you're using memories."""
  
   messages = [{"role": "system", "content": system_prompt}]
   messages.extend(session_history[-6:])
   messages.append({"role": "user", "content": user_message})
  
   response = openai_client.chat.completions.create(
       model="gpt-4.1-nano-2025-04-14",
       messages=messages
   )
   assistant_response = response.choices[0].message.content
  
   exchange = [
       {"role": "user", "content": user_message},
       {"role": "assistant", "content": assistant_response}
   ]
   memory.add(exchange, user_id=user_id)
  
   session_history.append({"role": "user", "content": user_message})
   session_history.append({"role": "assistant", "content": assistant_response})
  
   return assistant_response




session = []
demo_messages = [
   "Can you recommend a good IDE setup for me?",
   "What kind of project am I currently building at work?",
   "Suggest a weekend activity I might enjoy.",
   "What's a good tech stack for my current project?",
]


print("\n🤖 Starting memory-augmented conversation with Alice...\n")


for msg in demo_messages:
   print(Panel(f"[bold yellow]User:[/bold yellow] {msg}", border_style="yellow"))
   response = chat_with_memory(msg, USER_ID, session)
   print(Panel(f"[bold green]Assistant:[/bold green] {response}", border_style="green"))
   print()

We create a conversational loop with full memory integration that retrieves relevant memories before generating each response. We dynamically inject personal context into the system prompt and store each new exchange back into long-term memory. We simulate a multi-turn session to demonstrate continuous context and personalization in action.
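The `session_history[-6:]` slice above caps short-term context by message count. An alternative is to trim by an approximate size budget instead, so a few very long messages can't blow up the prompt. A sketch (the 2,000-character budget is an arbitrary assumption; a production version would count tokens):

```python
def trim_history(history: list, max_chars: int = 2000) -> list:
    """Keep the most recent messages whose combined content fits max_chars.

    Walks the history from newest to oldest, always keeping at least
    the most recent message even if it alone exceeds the budget.
    """
    trimmed, used = [], 0
    for msg in reversed(history):
        cost = len(msg["content"])
        if used + cost > max_chars and trimmed:
            break
        trimmed.append(msg)
        used += cost
    return list(reversed(trimmed))

history = [
    {"role": "user", "content": "a" * 1500},
    {"role": "assistant", "content": "b" * 1500},
    {"role": "user", "content": "c" * 400},
]
print(len(trim_history(history)))  # 2 — only the last two messages fit in 2000 chars
```

Swapping `session_history[-6:]` for `trim_history(session_history)` would make the context window size-aware rather than count-aware.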

console.rule("[bold cyan]MODULE 6: Multi-User Memory Isolation[/bold cyan]")


USER_BOB = "bob_tutorial"


bob_conversations = [
   [
       {"role": "user", "content": "I'm Bob, a data scientist specializing in computer vision and PyTorch."},
       {"role": "assistant", "content": "Great to meet you Bob!"}
   ],
   [
       {"role": "user", "content": "I prefer Jupyter notebooks over VS Code, and I use Vim keybindings."},
       {"role": "assistant", "content": "Classic setup for data science work!"}
   ],
]


for convo in bob_conversations:
   memory.add(convo, user_id=USER_BOB)


print("\n🔐 Testing memory isolation between Alice and Bob:\n")


test_query = "What programming tools does this user prefer?"


alice_results = memory.search(query=test_query, user_id=USER_ID, limit=3)
bob_results = memory.search(query=test_query, user_id=USER_BOB, limit=3)


print("👩 Alice's memories:")
for r in alice_results.get("results", []):
   print(f"   • {r['memory']}")


print("\n👨 Bob's memories:")
for r in bob_results.get("results", []):
   print(f"   • {r['memory']}")

We demonstrate user-level memory partitioning by introducing a second user with different preferences. We store separate chat data and ensure that every search is scoped to the correct user_id. We confirm that memory namespaces stay separated, enabling secure deployment of multi-user agents.
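One way to make forgetting the `user_id` argument impossible is a thin wrapper that binds it once. A sketch, assuming only the `add`/`search`/`get_all` methods used in this tutorial, demonstrated against a minimal fake store so it runs without API calls:

```python
class UserMemory:
    """Binds a fixed user_id to a Mem0-style memory object."""
    def __init__(self, store, user_id: str):
        self._store = store
        self._user_id = user_id

    def add(self, messages):
        return self._store.add(messages, user_id=self._user_id)

    def search(self, query: str, limit: int = 3):
        return self._store.search(query=query, user_id=self._user_id, limit=limit)

    def get_all(self):
        return self._store.get_all(user_id=self._user_id)

class FakeStore:
    """Tiny in-memory stand-in for Memory, keyed by user_id."""
    def __init__(self):
        self.data = {}
    def add(self, messages, user_id):
        self.data.setdefault(user_id, []).extend(messages)
    def search(self, query, user_id, limit):
        return {"results": self.data.get(user_id, [])[:limit]}
    def get_all(self, user_id):
        return {"results": self.data.get(user_id, [])}

store = FakeStore()
alice = UserMemory(store, "alice_tutorial")
bob = UserMemory(store, "bob_tutorial")
alice.add([{"memory": "Loves Python"}])
bob.add([{"memory": "Prefers Jupyter"}])
print(alice.get_all())  # only Alice's entries come back
```

With the real library, `UserMemory(memory, USER_ID)` would wrap the actual `Memory` instance the same way, so no call site can accidentally cross user boundaries.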

print("n✅ Memory isolation confirmed — users cannot see each other's data.")


console.rule("[bold cyan]MODULE 7: Custom Configuration[/bold cyan]")


custom_config = {
   "llm": {
       "provider": "openai",
       "config": {
           "model": "gpt-4.1-nano-2025-04-14",
           "temperature": 0.1,
           "max_tokens": 2000,
       }
   },
   "embedder": {
       "provider": "openai",
       "config": {
           "model": "text-embedding-3-small",
       }
   },
   "vector_store": {
       "provider": "chroma",
       "config": {
           "collection_name": "advanced_tutorial_v2",
           "path": "/tmp/chroma_advanced",
       }
   },
   "version": "v1.1"
}


custom_memory = Memory.from_config(custom_config)


print(Panel(
    "[green]✓ Custom memory instance created[/green]\n"
    "  • LLM: gpt-4.1-nano with temperature=0.1\n"
    "  • Embedder: text-embedding-3-small\n"
    "  • Vector Store: ChromaDB at /tmp/chroma_advanced\n"
    "  • Collection: advanced_tutorial_v2",
    title="Custom Config Applied", border_style="magenta"
))


custom_memory.add(
   [{"role": "user", "content": "I'm a researcher studying neural plasticity and brain-computer interfaces."}],
   user_id="researcher_01"
)


result = custom_memory.search("What field does this person work in?", user_id="researcher_01", limit=2)
print("\n🔍 Custom memory search result:")
for r in result.get("results", []):
   print(f"   • {r['memory']}")


console.rule("[bold cyan]MODULE 8: Memory History[/bold cyan]")


all_alice = memory.get_all(user_id=USER_ID)
alice_memories = all_alice.get("results", [])


table = Table(title=f"📋 Full Memory Profile: {USER_ID}", show_lines=True, width=90)
table.add_column("#", style="dim", width=3)
table.add_column("Memory ID", style="cyan", width=12)
table.add_column("Memory Content", style="white")
table.add_column("Created At", style="yellow", width=12)


for i, mem in enumerate(alice_memories):
   mem_id = mem["id"][:8] + "..."
   created = mem.get("created_at", "N/A")
   if created and created != "N/A":
       try:
           created = datetime.fromisoformat(created.replace("Z", "+00:00")).strftime("%m/%d %H:%M")
       except (ValueError, TypeError):
           created = str(created)[:10]
   table.add_row(str(i+1), mem_id, mem["memory"], created)


console.print(table)


console.rule("[bold cyan]MODULE 9: Memory Deletion[/bold cyan]")


all_mems = memory.get_all(user_id=USER_ID).get("results", [])
if all_mems:
   last_mem = all_mems[-1]
   print(f"\n🗑️  Deleting memory: '{last_mem['memory']}'")
   memory.delete(memory_id=last_mem["id"])
  
   updated_count = len(memory.get_all(user_id=USER_ID).get("results", []))
   print(f"✅ Deleted. Remaining memories for {USER_ID}: {updated_count}")


console.rule("[bold cyan]✅ TUTORIAL COMPLETE[/bold cyan]")


summary = """
# 🎓 Mem0 Advanced Tutorial Summary


## What You Learned:
1. **Basic Setup** — Instantiate Memory with default & custom configs
2. **Add Memories** — From conversations (auto-extracted by LLM)
3. **Semantic Search** — Retrieve relevant memories by natural language query
4. **CRUD Operations** — Get, Update, Delete individual memories
5. **Memory-Augmented Chat** — Full pipeline: retrieve → respond → store
6. **Multi-User Isolation** — Separate memory namespaces per user_id
7. **Custom Configuration** — Custom LLM, embedder, and vector store
8. **Memory History** — View full memory profiles with timestamps
9. **Cleanup** — Delete specific or all memories


## Key Concepts:
- `memory.add(messages, user_id=...)`
- `memory.search(query, user_id=...)`
- `memory.get_all(user_id=...)`
- `memory.update(memory_id, data)`
- `memory.delete(memory_id)`
- `Memory.from_config(config)`


## Next Steps:
- Swap ChromaDB for Qdrant, Pinecone, or Weaviate
- Use the hosted Mem0 Platform (app.mem0.ai) for production
- Integrate with LangChain, CrewAI, or LangGraph agents
- Add `agent_id` for agent-level memory scoping
"""


console.print(Markdown(summary))

We build a fully customized Mem0 configuration with explicit parameters for the LLM, embedder, and vector store. We test the custom memory instance and inspect memory history, timestamps, and structured profiles. Finally, we demonstrate deletion and cleanup operations, completing the full lifecycle management of long-term agent memory.
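Because the backend is config-driven, swapping ChromaDB for another vector store is mostly a config change. A hedged sketch of what a Qdrant-backed config might look like, following the same pattern as the Chroma config above (the exact key names for the Qdrant provider are an assumption; check the Mem0 docs before using):

```python
# Hypothetical Qdrant config, mirroring the Chroma config from Module 7.
qdrant_config = {
    "llm": {
        "provider": "openai",
        "config": {"model": "gpt-4.1-nano-2025-04-14", "temperature": 0.1},
    },
    "embedder": {
        "provider": "openai",
        "config": {"model": "text-embedding-3-small"},
    },
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "collection_name": "advanced_tutorial_v2",
            "host": "localhost",   # assumed locally running Qdrant instance
            "port": 6333,          # Qdrant's default port
        },
    },
}
# qdrant_memory = Memory.from_config(qdrant_config)  # same API as before
```

The rest of the tutorial code (`add`, `search`, `get_all`, and so on) would run unchanged against the new instance.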

In conclusion, we implemented a complete memory infrastructure for AI agents that uses Mem0 as the universal memory layer. We added, retrieved, updated, deleted, partitioned, and customized long-term memories while integrating them into a dynamic conversational loop, and we saw how semantic memory retrieval transforms ordinary assistants into context-aware systems capable of personalization and continuity across sessions. With this foundation in place, we are equipped to extend the architecture to multi-agent systems, enterprise deployments, alternative vector databases, and advanced agent frameworks, turning memory into a core capability rather than an afterthought.

