Metadata-Version: 2.4
Name: omem-os
Version: 1.0.0
Summary: AI Memory Operating System — Graph-RAG, temporal truth maintenance, actionable schemas, selective encryption, sub-200ms hybrid retrieval.
Author-email: Mohit Kumar Rajbadi <mohitkumarrajbadi@gmail.com>
License-Expression: MIT
Project-URL: Homepage, https://github.com/mohitkumarrajbadi/omem
Project-URL: Repository, https://github.com/mohitkumarrajbadi/omem
Project-URL: Issues, https://github.com/mohitkumarrajbadi/omem/issues
Project-URL: Changelog, https://github.com/mohitkumarrajbadi/omem/releases
Keywords: ai,memory,rag,vector-search,embeddings,llm,agents,memory-os,compression,reflection,importance,multi-agent
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: 3.11
Classifier: Programming Language :: Python :: 3.12
Classifier: Programming Language :: Python :: 3.13
Classifier: Topic :: Scientific/Engineering :: Artificial Intelligence
Requires-Python: >=3.9
Description-Content-Type: text/markdown
License-File: LICENSE
Requires-Dist: numpy<2.0.0,>=1.24.0
Requires-Dist: faiss-cpu>=1.7.4
Requires-Dist: click>=8.0.0
Requires-Dist: numba>=0.58.0
Requires-Dist: xxhash>=3.0.0
Requires-Dist: mcp>=1.0.0
Requires-Dist: psycopg2-binary>=2.9.0
Provides-Extra: embeddings
Requires-Dist: sentence-transformers>=2.2.0; extra == "embeddings"
Provides-Extra: langchain
Requires-Dist: langchain>=0.1.0; extra == "langchain"
Provides-Extra: external
Requires-Dist: openai>=1.0.0; extra == "external"
Provides-Extra: dev
Requires-Dist: pytest>=7.0; extra == "dev"
Requires-Dist: pytest-cov>=4.0; extra == "dev"
Provides-Extra: all
Requires-Dist: sentence-transformers>=2.2.0; extra == "all"
Requires-Dist: langchain>=0.1.0; extra == "all"
Requires-Dist: numba>=0.58.0; extra == "all"
Requires-Dist: pytest>=7.0; extra == "all"
Requires-Dist: pytest-cov>=4.0; extra == "all"
Dynamic: license-file

<div align="center">

<img src="https://img.shields.io/badge/version-1.0.0-blueviolet?style=for-the-badge" alt="Version">
<img src="https://img.shields.io/badge/python-3.9%2B-blue?style=for-the-badge&logo=python" alt="Python">
<img src="https://img.shields.io/badge/rust-core-orange?style=for-the-badge&logo=rust" alt="Rust">
<img src="https://img.shields.io/badge/license-MIT-green?style=for-the-badge" alt="License">
<img src="https://img.shields.io/badge/MCP-compatible-purple?style=for-the-badge" alt="MCP">

<br><br>

# OMem
### The Memory Operating System for AI Agents

**Persistent · Intelligent · Blazing Fast**

*Give your AI the memory it deserves — one that learns, forgets, and thinks.*

<br>

[**Quick Start**](#quick-start) · [**Benchmarks**](#benchmarks) · [**MCP / Claude Desktop**](#integrations) · [**CLI**](#cli-reference) · [**Docs**](./DEVELOPER.md)

</div>

---

## The Problem with AI Memory Today

Your agent is brilliant in the moment, but the second the conversation ends, everything it learned is gone. You've tried:

- 🗃 **Vector databases** — Dumb storage. No lifecycle. No importance. Returns noise.
- 📜 **Long context windows** — Expensive. Slow. Hits limits. Drowns your agent in irrelevant history.
- 💾 **Conversation buffers** — Grows forever. Can't handle multi-session continuity.

**None of these are memory systems. They're storage systems.**

---

## OMem is Different

OMem is a **Memory Operating System** — a complete cognitive layer that mirrors how intelligent systems *actually* remember:

```
Store everything  →  Classify what matters  →  Retrieve what's relevant
Compress noise    →  Forget the useless      →  Resolve contradictions
```

It's not a database with a retrieval wrapper. It's a brain.

---

## Benchmarks

> *Tested on Apple M-series. Dataset: 5,000 memories, 500 queries, `all-MiniLM-L6-v2` embedding model — shared identically across all systems for a fair comparison.*

### ⚡ Head-to-Head Performance

| System | Setup | Add (ops/s) | RAG (ops/s) | RAG p99 |
| :--- | ---: | ---: | ---: | ---: |
| **OMem** | **4.0 ms** | **65 †** | **292** | **20 ms** |
| ChromaDB | 507 ms | 277 ‡ | 280 | 4 ms |
| LanceDB | 8 ms | 82,000 ‡ | 182 | 7 ms |
| **Mem0** | **15,000+ ms** | **< 1** | **18** | **638 ms** |

> **† Smart Ingestion** — OMem's `add()` performs: `embed → auto-classify → dedup-check → entity-graph sync → async persist`. ChromaDB and LanceDB store pre-computed vectors only. We do the heavy lifting so your agent doesn't have to.
>
> **‡ Raw storage** — No classification, no deduplication, no graph linking.
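
The dedup-check step of the ingestion pipeline can be pictured with a minimal sketch. This is an illustration only, not OMem's implementation: `fingerprint` and `DedupIndex` are hypothetical names, the normalization rule is an assumption, and `hashlib` stands in for whatever hash the library actually uses.

```python
import hashlib


def fingerprint(text: str) -> str:
    """Hash the text after lowercasing and collapsing whitespace."""
    normalized = " ".join(text.lower().split())
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


class DedupIndex:
    """Hypothetical exact-duplicate filter keyed by content fingerprint."""

    def __init__(self):
        self._seen = set()

    def is_new(self, text: str) -> bool:
        """Return True the first time a text is seen, False for repeats."""
        key = fingerprint(text)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


index = DedupIndex()
first = index.is_new("User prefers dark mode")
repeat = index.is_new("  user prefers   DARK mode ")  # same text after normalization
other = index.is_new("User prefers light mode")
```

A hash like this only catches exact repeats after normalization; near-duplicates would need a similarity check on embeddings instead.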

### 🏆 Why OMem Wins Where It Counts

| Metric | OMem vs Mem0 | OMem vs ChromaDB | OMem vs LanceDB |
|---|---|---|---|
| RAG throughput | **16× faster** | **1.0× (parity)** | **1.6× faster** |
| p50 recall | **0.007 ms** | 3.5 ms | 5.3 ms |
| Setup time | **3,750× faster** | **127× faster** | **2× faster** |
| Smart features | ✅ All 9 | ❌ 0/9 | ❌ 0/9 |

**The critical insight:** Mem0 runs an LLM extraction pipeline on every add, which is why its ingestion rate stays below 1 op/s in the table above. OMem replaces that with a Rust-native classification engine: zero LLM calls, zero API costs, zero network latency.
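
The throughput and setup ratios can be re-derived from the head-to-head table. The sketch below only does that arithmetic; the figures are copied from the table, and Mem0's setup time is treated as exactly 15,000 ms even though the table reports it as a lower bound.

```python
# Figures copied from the head-to-head table.
rag_ops = {"OMem": 292, "ChromaDB": 280, "LanceDB": 182, "Mem0": 18}
setup_ms = {"OMem": 4.0, "ChromaDB": 507, "LanceDB": 8, "Mem0": 15_000}

# How many times faster OMem answers RAG queries than each competitor.
rag_speedup = {
    name: round(rag_ops["OMem"] / ops, 1)
    for name, ops in rag_ops.items()
    if name != "OMem"
}

# How many times faster OMem sets up than each competitor.
setup_speedup = {
    name: round(ms / setup_ms["OMem"])
    for name, ms in setup_ms.items()
    if name != "OMem"
}

print(rag_speedup)
print(setup_speedup)
```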

### 🧩 Feature Matrix

| Feature | OMem | ChromaDB | Mem0 | LanceDB |
| :--- | :---: | :---: | :---: | :---: |
| Auto-Classification | ✅ | ❌ | ❌ | ❌ |
| Causal Graphs | ✅ | ❌ | ❌ | ❌ |
| Hybrid RAG (vector + keyword + recency + importance) | ✅ | ❌ | ❌ | ❌ |
| Forgetting & Decay | ✅ | ❌ | ❌ | ❌ |
| Memory Compression | ✅ | ❌ | ❌ | ❌ |
| Conflict Detection & TMS | ✅ | ❌ | ❌ | ❌ |
| CLI Tools | ✅ | ❌ | ❌ | ❌ |
| Zero Config | ✅ | ✅ | ❌ | ✅ |
| MCP Server (Claude/Cursor) | ✅ | ❌ | ❌ | ❌ |

---

## Quick Start

### Installation

```bash
# Clone
git clone https://github.com/mohitkumarrajbadi/omem
cd omem

# Install
SETUPTOOLS_USE_DISTUTILS=stdlib pip install -e .

# Verify
omem health
```

> **macOS / Anaconda users** — add to `~/.zshrc` once:
> ```bash
> export KMP_DUPLICATE_LIB_OK=TRUE
> export HF_HUB_OFFLINE=1
> ```

### 60-Second Example

```python
from omem import OMem

brain = OMem()

# Add memories — type and importance are detected automatically
brain.add("User prefers dark mode and Python for all backend work")
brain.add("Critical bug: race condition in payment module causes duplicate charges", importance=0.95)
brain.add("Architecture decision: migrated from REST to GraphQL for better performance")

# Retrieve what's relevant — not everything
results = brain.recall("What bugs do we have?")
print(results[0].content)
# → "Critical bug: race condition in payment module..."

# Understand exactly why this memory was returned
for exp in brain.inspect("payment bugs"):
    print(exp.explain())
# → vector=0.91, keyword=0.85, recency=0.94, importance=1.5x boost
```

### The Sleep Cycle — Let Your Agent Dream

```python
# After hours of operation, consolidate redundant memories
brain.add("User clicked login button")
brain.add("User pressed sign-in")
brain.add("User tapped the login link")

result = brain.sleep()
# → compressed: 3 → 1 ("User repeatedly accessed login (3 instances)")
# → forgotten: 12 low-value memories removed
# → reflected: 4 new insights generated
```
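
To give a feel for how near-duplicate memories could be merged during a sleep cycle, here is a minimal sketch of greedy grouping by cosine similarity. It is not OMem's compression algorithm: `group_similar`, the 0.9 threshold, and the 3-dimensional vectors are all made up for illustration, standing in for real embeddings.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def group_similar(items, threshold=0.9):
    """Greedy grouping: each item joins the first group whose seed it resembles."""
    groups = []
    for text, vec in items:
        for group in groups:
            seed_vec = group[0][1]
            if cosine(vec, seed_vec) >= threshold:
                group.append((text, vec))
                break
        else:
            groups.append([(text, vec)])
    return groups


# Toy vectors standing in for embeddings (hypothetical values).
items = [
    ("User clicked login button", [0.90, 0.10, 0.00]),
    ("User pressed sign-in", [0.85, 0.15, 0.05]),
    ("User tapped the login link", [0.88, 0.12, 0.02]),
    ("Invoice export failed", [0.05, 0.10, 0.95]),
]
groups = group_similar(items)
sizes = [len(g) for g in groups]
```

Each group with more than one member is a candidate for replacement by a single summary memory.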

---

## How It Works

```
┌─────────────────────────────────────────────────────────┐
│            Your Agent  /  Claude  /  Cursor              │
└──────────────────────────┬──────────────────────────────┘
                           │  MCP or Python SDK
                           ▼
┌─────────────────────────────────────────────────────────┐
│                    OMem Unified API                      │
│        add · recall · sleep · inspect · serve           │
└────────────┬───────────────────────────┬────────────────┘
             │                           │
             ▼                           ▼
┌─────────────────────┐     ┌────────────────────────────┐
│     Rust Core       │     │        Brain Logic          │
│                     │     │                            │
│  • SIMD scoring     │     │  • Auto-classification     │
│  • FAISS HNSW       │     │  • Importance estimation   │
│  • Hybrid ranking   │     │  • Forgetting & decay      │
│  • Write buffer     │     │  • Reflection & compress   │
│  • RW lock          │     │  • Conflict TMS            │
└─────────────────────┘     └────────────────────────────┘
             │                           │
             └─────────────┬─────────────┘
                           ▼
             ┌──────────────────────────┐
             │  SQLite · PostgreSQL     │
             │  FAISS · Knowledge Graph │
             └──────────────────────────┘
```

### The Retrieval Pipeline

Every `recall()` call combines **4 signals in a single SIMD pass**:

```
Final Score = ( (0.50 × vector_similarity)
              + (0.20 × keyword_overlap)
              + (0.15 × recency_decay)
              + (0.15 × importance_weight) )
            × status_multiplier
```

Then optionally expanded via **Graph-RAG**: top results are linked to related entities in the knowledge graph, surfacing connected memories that pure vector search would miss.
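
The scoring formula can be sketched in plain Python. The weights come from the formula in this section; everything else is an assumption made for illustration: the `Signals` dataclass is hypothetical, each signal is taken to be normalized to the range 0 to 1, the status multiplier is assumed to scale the whole weighted sum, and the 0.5 penalty for a conflicted memory is invented.

```python
from dataclasses import dataclass

# Weights taken from the formula in this section.
WEIGHTS = {"vector": 0.50, "keyword": 0.20, "recency": 0.15, "importance": 0.15}


@dataclass
class Signals:
    vector: float
    keyword: float
    recency: float
    importance: float
    status_multiplier: float = 1.0  # assumed: 1.0 for a normal memory


def final_score(s: Signals) -> float:
    """Weighted sum of the four signals, scaled by the status multiplier."""
    weighted = (
        WEIGHTS["vector"] * s.vector
        + WEIGHTS["keyword"] * s.keyword
        + WEIGHTS["recency"] * s.recency
        + WEIGHTS["importance"] * s.importance
    )
    return weighted * s.status_multiplier


normal = Signals(vector=0.91, keyword=0.85, recency=0.94, importance=1.0)
conflicted = Signals(0.91, 0.85, 0.94, 1.0, status_multiplier=0.5)  # hypothetical penalty

ranked = sorted([conflicted, normal], key=final_score, reverse=True)
```

Because the weights sum to 1.0, a memory that scores 1.0 on every signal gets a final score equal to its status multiplier.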

---

## Real-World Usage

### Customer Support Agent

```python
from omem import OMem

memory = OMem(namespace="support")

# Store rich customer context
memory.add("Customer John (john@acme.com) reported dashboard timeout on mobile Safari")
memory.add("Acme Corp is on Enterprise plan, SOC2 required by Q3")

# Later — retrieve with filters
context = memory.recall(
    "mobile issues Acme",
    context_type="bugs",    # boost bug-type memories
    time_range="recent",    # prioritize last 3 days
    k=5
)
```

### Multi-Agent System

```python
# Each agent is fully isolated
researcher = OMem(namespace="researcher")
writer     = OMem(namespace="writer")

researcher.add("Study shows 40% retention improvement with personalized onboarding")

# No cross-namespace leakage
writer.recall("retention")       # → []

# Global search when needed: project_only=False looks across namespaces
writer.recall("retention", project_only=False)  # → finds the researcher's memory
```
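
The isolation rule can be sketched with a toy in-memory store. `ToyStore` is hypothetical and uses substring matching instead of semantic search; it only illustrates how a `project_only` flag might gate cross-namespace results.

```python
class ToyStore:
    """Hypothetical shared store; every row is tagged with its namespace."""

    def __init__(self):
        self._rows = []  # list of (namespace, text) pairs

    def add(self, namespace, text):
        self._rows.append((namespace, text))

    def recall(self, namespace, query, project_only=True):
        """Substring match, limited to one namespace unless project_only=False."""
        needle = query.lower()
        return [
            text
            for ns, text in self._rows
            if needle in text.lower() and (not project_only or ns == namespace)
        ]


store = ToyStore()
store.add("researcher", "Study shows 40% retention improvement with personalized onboarding")

isolated = store.recall("writer", "retention")
global_hits = store.recall("writer", "retention", project_only=False)
own_hits = store.recall("researcher", "retention")
```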

### Conflict Detection

```python
brain.add("Python version: 3.9")
brain.add("Python version: 3.11")  # → auto-flagged as CONFLICTED

brain.resolve_conflict("Python version")
# → resolves in favor of most recent, deprecates the old one
```
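
A "most recent wins" resolution policy can be sketched as follows. This is an illustration under stated assumptions, not the library's truth maintenance code: the `Fact` dataclass, its fields, and the `CURRENT` status name are hypothetical, while `CONFLICTED` and `DEPRECATED` are the statuses named in this README.

```python
from dataclasses import dataclass


@dataclass
class Fact:
    key: str
    value: str
    timestamp: float
    status: str = "CURRENT"  # hypothetical name for a healthy fact


def resolve_conflict(facts, key):
    """Keep the newest fact for `key` and mark all older ones DEPRECATED."""
    matching = sorted((f for f in facts if f.key == key), key=lambda f: f.timestamp)
    if not matching:
        return None
    for old in matching[:-1]:
        old.status = "DEPRECATED"
    winner = matching[-1]
    winner.status = "CURRENT"
    return winner


facts = [
    Fact("Python version", "3.9", timestamp=1.0, status="CONFLICTED"),
    Fact("Python version", "3.11", timestamp=2.0, status="CONFLICTED"),
]
winner = resolve_conflict(facts, "Python version")
```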

---

## Integrations

### Claude Desktop & Cursor (MCP Server) ⭐

```bash
omem serve   # starts the MCP stdio server
```

Add to `claude_desktop_config.json`:

```json
{
  "mcpServers": {
    "omem": {
      "command": "omem",
      "args": ["serve"]
    }
  }
}
```

**What your AI gets:**

| Tool | What it does |
|---|---|
| `remember` | Store a fact, decision, or preference |
| `recall` | Semantic search with type and time filters |
| `reflect` | Generate high-level insights from memory |
| `maintain` | Compress, forget, and optimize memory |
| `resolve_conflict` | Detect and fix contradictions |
| `summarize_state` | Get a project architecture overview |

**Addressing a common concern:**

> *"Won't injecting memory into every prompt bloat my context?"*

No. OMem is a **retrieval layer**, not an injection layer. From 5,000 memories, it returns **3–5 targeted results (~200–500 tokens)**. That's 97% less context than a naive approach — while giving the agent exactly what it needs.

### LangChain

```python
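from langchain.chains import RetrievalQA
from omem import OMem
from omem.integrations.langchain import OMemRetriever

brain = OMem()
retriever = OMemRetriever(omem_instance=brain)
# `llm` is any LangChain-compatible LLM you have already constructed
chain = RetrievalQA.from_chain_type(llm=llm, retriever=retriever)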
```

---

## CLI Reference

```bash
# Setup
omem init                         # initialize at ~/.omem/brain.db
omem health                       # system health check

# Write
omem add "content" -i 0.9 -n myproject -t DECISION

# Read
omem search "query" -k 10 -c architecture -t recent
omem list -n myproject -t DECISION -l 50
omem inspect "query"              # debug retrieval scoring
omem stats && omem namespaces

# Maintenance
omem maintain --all               # compress + reflect + forget + dream

# Import / Export
omem export -f json -o dump.json
omem load dump.json -n myproject

# Integrations
omem serve                        # MCP server for Claude / Cursor
omem dashboard --port 7900        # web memory dashboard
omem demo                         # end-to-end interactive walkthrough
omem benchmark --n 10000          # performance test
```

---

## Architecture Details

### Memory Types

OMem auto-classifies every memory on ingestion:

| Type | Examples |
|---|---|
| `SEMANTIC` | Facts, general knowledge |
| `DECISION` | Choices made, preferences |
| `CAUSAL` | Bug root causes, cause-effect chains |
| `PROCEDURAL` | How-to steps, workflows |
| `EPISODIC` | Events, experiences |
| `REFLECTION` | AI-generated insights |
| `ACTIVE` | Critical / urgent items |
| `WORKING` | Temporary, current-task context |

### Scoring Signals

```
vector_similarity   — semantic closeness to query (FAISS HNSW)
keyword_overlap     — token-level BM25-style matching
recency_decay       — exponential half-life decay over time
importance_weight   — auto-scored + access-frequency boosted
status_multiplier   — CONFLICTED memories penalized, DEPRECATED skipped
```
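
The `recency_decay` signal is described as an exponential half-life decay. A minimal sketch of that curve is shown below; the function name mirrors the signal, but the 7-day half-life is an assumed value, not one documented here.

```python
def recency_decay(age_days: float, half_life_days: float = 7.0) -> float:
    """Exponential half-life decay: 1.0 when fresh, 0.5 after one half-life."""
    return 0.5 ** (age_days / half_life_days)


fresh = recency_decay(0)
one_half_life = recency_decay(7)
two_half_lives = recency_decay(14)
```

With this shape a memory never reaches exactly zero, so a separate forgetting step is still needed to remove it.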

### Storage

| Backend | Use Case |
|---|---|
| SQLite (default) | Local, single-process, zero config |
| In-memory | Testing, ephemeral agents |
| PostgreSQL | Production, multi-process, distributed |

---

## Configuration

```python
brain = OMem(
    backend="sqlite",              # "sqlite" | "memory" | "postgres"
    db_path="~/.omem/brain.db",   # custom path
    model="all-MiniLM-L6-v2",     # embedding model
    embedding_provider="local",    # "local" | "openai"
)
```

Environment variables:

```bash
HF_HUB_OFFLINE=1              # disable HuggingFace Hub network checks (faster startup)
KMP_DUPLICATE_LIB_OK=TRUE     # fix OpenMP conflict on macOS/Anaconda
TOKENIZERS_PARALLELISM=false  # suppress tokenizer warning
```
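
If you prefer to set these from Python rather than the shell, a cautious approach is to do it before anything that loads the embedding stack is imported, since such variables are typically read at import or load time. A minimal sketch, assuming nothing about OMem beyond the variable names listed above:

```python
import os

# setdefault keeps any value the user has already exported in the shell.
os.environ.setdefault("HF_HUB_OFFLINE", "1")
os.environ.setdefault("KMP_DUPLICATE_LIB_OK", "TRUE")
os.environ.setdefault("TOKENIZERS_PARALLELISM", "false")

# Import the library only after the environment is prepared, for example:
# from omem import OMem
configured = {
    name: os.environ[name]
    for name in ("HF_HUB_OFFLINE", "KMP_DUPLICATE_LIB_OK", "TOKENIZERS_PARALLELISM")
}
```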

---

## Roadmap

| Status | Feature |
|---|---|
| ✅ Released | Hybrid RAG, Auto-classification, Forgetting, Compression, MCP Server |
| ✅ Released | Truth Maintenance System, Knowledge Graph, Graph-RAG |
| ✅ Released | PostgreSQL backend, CLI, Dashboard |
| 🔄 In Progress | LOCOMO benchmark validation, distributed mode |
| 📅 Planned | Custom embedding providers (OpenAI, Cohere), Memory versioning |

---

## FAQ

**Q: Does this run an LLM internally?**  
A: No. Classification and importance scoring use lightweight heuristics and a small (~90 MB) embedding model. No LLM API calls, no external services, no costs.

**Q: How is this different from ChromaDB or Pinecone?**  
A: Those are vector storage systems. OMem is a memory *operating system* — with lifecycle (importance → decay → forget), deduplication, conflict detection, knowledge graphs, and a cognitive maintenance cycle.

**Q: Will it bloat my agent's context window?**  
A: The opposite. OMem retrieves 3–5 relevant memories per query (~300 tokens) instead of injecting your entire history. See the [Context FAQ](./DEVELOPER.md#memory-layer-faq--does-it-bloat-context).

**Q: Is it production-ready?**  
A: v1.0.0 is stable for production workloads. The SQLite backend handles hundreds of thousands of memories, and a PostgreSQL backend is available for multi-process deployments.

**Q: What about privacy?**  
A: Everything runs 100% locally by default. Your memories never leave your machine. PostgreSQL backend is self-hosted.

**Q: Do I need Rust installed?**  
A: Only if you want the SIMD-accelerated scoring path. The pure-Python path works out of the box and is still competitive.

---

## Contributing

```bash
git clone https://github.com/mohitkumarrajbadi/omem
cd omem
python -m venv .venv && source .venv/bin/activate
SETUPTOOLS_USE_DISTUTILS=stdlib pip install -e ".[dev]"
pytest tests/ -v
python benchmarks/competitor.py   # run head-to-head benchmarks
```

See [DEVELOPER.md](./DEVELOPER.md) for architecture, CLI reference, and contribution guidelines.

---

## License

MIT — see [LICENSE](./LICENSE)

---

<div align="center">

**Built for the AI developer community**

*If OMem makes your agents smarter, give it a ⭐*

[Report Bug](https://github.com/mohitkumarrajbadi/omem/issues) · [Request Feature](https://github.com/mohitkumarrajbadi/omem/issues) · [Discussions](https://github.com/mohitkumarrajbadi/omem/discussions)

</div>
