MemU is an agentic memory framework for LLM and AI agent backends. It receives multi-modal inputs, extracts memory items from them, and then organizes and summarizes those items into structured memory files.
Unlike traditional RAG systems that rely solely on embedding-based search, MemU supports non-embedding retrieval through direct file reading. The LLM comprehends natural language memory files directly, enabling deep search by progressively tracking from categories → items → original resources.
MemU offers several convenient ways to get started right away:
- One call = response + memory with the memU Response API: https://memu.pro/docs#responseapi
- Try it instantly: https://app.memu.so/quick-start
Star MemU to get notified about new releases and join our growing community of AI developers building intelligent agents with persistent memory capabilities.

Join our Discord community: https://discord.gg/memu
MemU v0.3.0 has been released! This version initializes the memorize and retrieve workflows with the new 3-layer architecture.
Starting from this release, MemU will roll out multiple features in the short- to mid-term:
- Multi-modal enhancements – Support for images, audio, and video
- Intention – Higher-level decision-making and goal management
- Multi-client support – Switch between OpenAI, DeepSeek, Gemini, etc.
- Data persistence expansion – Support for Postgres, S3, DynamoDB
- Benchmark tools – Test agent performance and memory efficiency
- …
- memU-ui – The web frontend for MemU, providing developers with an intuitive and visual interface
- memU-server – Powers memU-ui with reliable data support, ensuring efficient reading, writing, and maintenance of agent memories
Most memory systems in current LLM pipelines rely heavily on explicit modeling, requiring manual definition and annotation of memory categories. This limits AI's ability to truly understand memory and makes it difficult to support diverse usage scenarios.
MemU offers a flexible and robust alternative, inspired by hierarchical storage architecture in computer systems. It progressively transforms heterogeneous input data into queryable and interpretable textual memory.
Its core architecture consists of three layers: Resource Layer → Memory Item Layer → MemoryCategory Layer.
- Resource Layer: Multimodal raw data warehouse
- Memory Item Layer: Discrete extracted memory units
- MemoryCategory Layer: Aggregated textual memory units
Key Features:
- Full Traceability: Track from raw data → items → documents and back (see the sketch after this list)
- Memory Lifecycle: Memorization → Retrieval → Self-evolution
- Two Retrieval Methods:
  - RAG-based: Fast embedding vector search
  - LLM-based: Direct file reading with deep semantic understanding
- Self-Evolving: Adapts memory structure based on usage patterns
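To make the three layers and their links concrete, here is a minimal illustrative sketch; the class and field names (Resource, MemoryItem, MemoryCategory, resource_id, item_ids) are assumptions for exposition, not MemU's actual internal types:

```python
# Illustrative model of the three layers; names are hypothetical, not MemU internals.
from dataclasses import dataclass, field

@dataclass
class Resource:
    """Resource Layer: one piece of raw multimodal input."""
    resource_id: str
    modality: str  # e.g. "conversation", "image", "log"
    url: str       # path or URI of the original data

@dataclass
class MemoryItem:
    """Memory Item Layer: one discrete fact extracted from a resource."""
    item_id: str
    summary: str
    resource_id: str  # back-pointer: item -> original resource

@dataclass
class MemoryCategory:
    """MemoryCategory Layer: a textual summary aggregating related items."""
    name: str  # e.g. "preferences", "work_life"
    summary: str
    item_ids: list[str] = field(default_factory=list)  # category -> items
```

Full traceability falls out of the back-pointers: walking a category's item_ids to each item's resource_id recovers the raw data behind any summary, and the reverse direction shows which summaries a resource contributed to.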
```bash
pip install memu-py
```
⚠️ Important: Ensure you have Python 3.14+.
⚠️ Important: Replace "your-openai-api-key" with your actual OpenAI API key to use the service.
```python
import asyncio
import os

from memu.app import MemoryService


async def main():
    api_key = "your-openai-api-key"
    file_path = os.path.abspath("tests/example/example_conversation.json")

    # Initialize service with RAG method
    service_rag = MemoryService(
        llm_config={"api_key": api_key},
        retrieve_config={"method": "rag"},
    )

    # Memorize
    memory = await service_rag.memorize(resource_url=file_path, modality="conversation")
    for cat in memory.get('categories', []):
        print(f" - {cat.get('name')}: {(cat.get('summary') or '')[:80]}...")

    queries = [
        {"role": "user", "content": {"text": "Tell me about preferences"}},
        {"role": "user", "content": {"text": "What are their habits?"}},
    ]

    # RAG-based retrieval
    print("\n[RETRIEVED - RAG]")
    result_rag = await service_rag.retrieve(queries=queries)
    for item in result_rag.get('items', [])[:3]:
        print(f" - [{item.get('memory_type')}] {item.get('summary', '')[:100]}...")

    # Initialize service with LLM method (reuse same memory store)
    service_llm = MemoryService(
        llm_config={"api_key": api_key},
        retrieve_config={"method": "llm"},
    )
    service_llm.store = service_rag.store  # Reuse memory store

    # LLM-based retrieval
    print("\n[RETRIEVED - LLM]")
    result_llm = await service_llm.retrieve(queries=queries)
    for item in result_llm.get('items', [])[:3]:
        print(f" - [{item.get('memory_type')}] {item.get('summary', '')[:100]}...")


if __name__ == "__main__":
    asyncio.run(main())
```

Two retrieval methods are available:
- RAG-based (method="rag"): Fast embedding vector search for large-scale data
- LLM-based (method="llm"): Deep semantic understanding through direct file reading
Both support:
- Context-aware rewriting: Resolves pronouns using conversation history
- Progressive search: Categories → Items → Resources (a minimal sketch follows this list)
- Next-step suggestions: Iterative multi-turn retrieval
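As a self-contained illustration of progressive search, the sketch below walks categories → items → resources over toy in-memory data; the dictionaries and the keyword-overlap pick function are stand-ins for MemU's actual memory files and for the LLM's relevance judgment at each step:

```python
# Toy data standing in for memory files; the structure is illustrative only.
CATEGORIES = {
    "preferences": {"summary": "Likes espresso and prefers morning meetings",
                    "items": ["item-1", "item-2"]},
    "work_life":   {"summary": "Backend engineer who owns the billing service",
                    "items": ["item-3"]},
}
ITEMS = {
    "item-1": {"summary": "Drinks a double espresso every day", "resource": "conv-42"},
    "item-2": {"summary": "Schedules meetings before noon",     "resource": "conv-42"},
    "item-3": {"summary": "Maintains the billing service",      "resource": "log-7"},
}

def pick(query: str, candidates: dict) -> list[str]:
    # Stand-in for the LLM's relevance judgment: keep entries sharing a word.
    words = set(query.lower().split())
    return [key for key, value in candidates.items()
            if words & set(value["summary"].lower().split())]

def deep_search(query: str) -> list[str]:
    cats = pick(query, CATEGORIES)                    # 1. categories
    ids = [i for c in cats for i in CATEGORIES[c]["items"]]
    items = pick(query, {i: ITEMS[i] for i in ids})   # 2. items
    return [ITEMS[i]["resource"] for i in items]      # 3. original resources

print(deep_search("espresso preferences"))  # -> ['conv-42']
```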
MemU provides practical examples demonstrating different memory extraction and organization scenarios. Each example showcases a specific use case with real-world applications.
Extract and organize memory from multi-turn conversations. Perfect for:
- Personal AI assistants that remember user preferences and history
- Customer support bots maintaining conversation context
- Social chatbots building user profiles over time
Example: Process multiple conversation files and automatically categorize memories into personal_info, preferences, work_life, relationships, etc.
```bash
export OPENAI_API_KEY=your_api_key
python examples/example_1_conversation_memory.py
```
What it does:
- Processes conversation JSON files (a plausible input shape is sketched after this list)
- Extracts memory items (preferences, habits, opinions)
- Organizes into structured categories
- Generates readable markdown files for each category
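The repository's example file defines the actual schema; purely as an illustration, a conversation input of the kind memorize consumes might look like the following (the role/content field names are a plausible guess, not a documented format):

```python
import json

# Hypothetical multi-turn conversation; the real schema lives in
# tests/example/example_conversation.json and may differ.
conversation = [
    {"role": "user",      "content": "I switched to a standing desk last month."},
    {"role": "assistant", "content": "Nice! How is it working out?"},
    {"role": "user",      "content": "Great, and I still take my morning espresso."},
]

with open("my_conversation.json", "w") as f:
    json.dump(conversation, f, indent=2)

# Then, with a MemoryService instance as in the quick start:
#     await service.memorize(resource_url="my_conversation.json",
#                            modality="conversation")
```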
Extract skills and lessons learned from agent execution logs. Ideal for:
- DevOps teams learning from deployment experiences
- Agent systems improving through iterative execution
- Knowledge management from operational logs
Example: Process deployment logs incrementally, learning from each attempt to build a comprehensive skill guide.
```bash
export OPENAI_API_KEY=your_api_key
python examples/example_2_skill_extraction.py
```
What it does:
- Processes agent logs sequentially
- Extracts actions, outcomes, and lessons learned
- Demonstrates incremental learning (memory evolves with each file)
- Generates evolving skill guides (log_1.md → log_2.md → log_3.md → skill.md)
Key Feature: This example shows MemU's core strength, continuous memory updates: each file updates existing memory, and category summaries evolve progressively (see the sketch below).
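A minimal sketch of that incremental loop, reusing the memorize API from the quick start; the log file paths and the "log" modality string are assumptions for illustration:

```python
import asyncio

from memu.app import MemoryService


async def learn_from_logs():
    service = MemoryService(
        llm_config={"api_key": "your-openai-api-key"},
        retrieve_config={"method": "rag"},
    )
    # Hypothetical log files; each memorize call revises existing items and
    # category summaries in place, so the skill guide evolves incrementally.
    for path in ["logs/log_1.json", "logs/log_2.json", "logs/log_3.json"]:
        memory = await service.memorize(resource_url=path, modality="log")  # "log" is an assumed value
        print(path, "->", [c.get("name") for c in memory.get("categories", [])])


asyncio.run(learn_from_logs())
```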
Process diverse content types (documents, images, videos) into unified memory. Great for:
- Documentation systems processing mixed media
- Learning platforms combining text and visual content
- Research tools analyzing multimodal data
Example: Process technical documents and architecture diagrams together, creating unified memory categories.
```bash
export OPENAI_API_KEY=your_api_key
python examples/example_3_multimodal_memory.py
```
What it does:
- Processes multiple modalities (text documents, images)
- Extracts memory from different content types
- Unifies memories into cross-modal categories
- Creates organized documentation (technical_documentation, architecture_concepts, code_examples, visual_diagrams); a minimal ingestion sketch follows this list
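A rough sketch of mixed-media ingestion with the same memorize call from the quick start; the file paths and the "document" and "image" modality strings are assumptions, not documented values:

```python
import asyncio

from memu.app import MemoryService


async def build_multimodal_memory():
    service = MemoryService(
        llm_config={"api_key": "your-openai-api-key"},
        retrieve_config={"method": "rag"},
    )
    # Hypothetical sources; modality strings are assumed for illustration.
    sources = [
        ("docs/architecture.md", "document"),
        ("docs/architecture_diagram.png", "image"),
    ]
    for url, modality in sources:
        # Each modality is extracted into items, then merged into shared
        # cross-modal categories (e.g. architecture_concepts).
        await service.memorize(resource_url=url, modality=modality)


asyncio.run(build_multimodal_memory())
```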
By contributing to MemU, you agree that your contributions will be licensed under the Apache License 2.0.
For more information, please contact [email protected].
- GitHub Issues: Report bugs, request features, and track development. Submit an issue
- Discord: Get real-time support, chat with the community, and stay updated. Join us
- X (Twitter): Follow for updates, AI insights, and key announcements. Follow us
We're proud to work with amazing organizations:
Interested in partnering with MemU? Contact us at [email protected]
