Mem0 mirrors the OpenAI client interface so you can plug memories into existing chat-completion code with minimal changes. Point your OpenAI-compatible client at Mem0, keep the same request shape, and gain persistent memory between calls.
You’ll use this when…
  • Your app already relies on OpenAI chat completions and you want Mem0 to feel familiar.
  • You need to reuse existing middleware that expects OpenAI-compatible responses.
  • You plan to switch between Mem0 Platform and the self-hosted client without rewriting code.

Features

  • Drop-in client: client.chat.completions.create(...) matches OpenAI’s method signature, so existing call sites keep working.
  • Shared parameters: Mem0 accepts messages, model, and optional memory-scoping fields (user_id, agent_id, run_id).
  • Memory-aware responses: Each call saves relevant facts so future prompts automatically reflect past conversations.
  • OSS parity: Use the same API surface whether you call the hosted proxy or run the self-hosted OSS client.
Quick check: run one request with user_id set, then send a second call with the same ID; if the reply reflects the stored memory, compatibility is confirmed (the restaurant example under “See it in action” below walks through exactly this).

Configure it

Call the managed Mem0 proxy

from mem0.proxy.main import Mem0

client = Mem0(api_key="m0-xxx")

messages = [
    {"role": "user", "content": "I love Indian food but I cannot eat pizza since I'm allergic to cheese."}
]

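# user_id scopes stored memories to this user across calls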
chat_completion = client.chat.completions.create(
    messages=messages,
    model="gpt-4.1-nano-2025-04-14",
    user_id="alice"
)
Reuse the same identifiers your OpenAI client already sends so you can switch between providers without branching logic.
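
Because the request shape is identical, a thin factory can hand back either client. A minimal sketch, assuming the official openai package is installed and MEM0_API_KEY / OPENAI_API_KEY environment variables are set; the helper name make_client is hypothetical:

import os

from mem0.proxy.main import Mem0
from openai import OpenAI

def make_client(use_mem0: bool = True):
    # Both clients expose chat.completions.create with the same
    # request shape, so callers never branch on the provider.
    if use_mem0:
        return Mem0(api_key=os.environ["MEM0_API_KEY"])
    return OpenAI(api_key=os.environ["OPENAI_API_KEY"])

client = make_client(use_mem0=True)

Keep in mind that user_id and the other memory-scoping fields are Mem0-specific, so only pass them when the Mem0 client is selected.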

Use the OpenAI-compatible OSS client

from mem0.proxy.main import Mem0

config = {
    "vector_store": {
        "provider": "qdrant",
        "config": {
            "host": "localhost",
            "port": 6333
        }
    }
}

client = Mem0(config=config)

chat_completion = client.chat.completions.create(
    messages=[{"role": "user", "content": "What's the capital of France?"}],
    model="gpt-4.1-nano-2025-04-14"
)
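
Memory scoping works the same way against the OSS client (the “OSS parity” point above); a quick sketch reusing the client and config from this section:

chat_completion = client.chat.completions.create(
    messages=[{"role": "user", "content": "I prefer vegetarian restaurants."}],
    model="gpt-4.1-nano-2025-04-14",
    user_id="alice"  # persists the preference to the local Qdrant store
)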

See it in action

Memory-aware restaurant recommendation

from mem0.proxy.main import Mem0

client = Mem0(api_key="m0-xxx")

# Store preferences
client.chat.completions.create(
    messages=[{"role": "user", "content": "I love Indian food but I'm allergic to cheese."}],
    model="gpt-4.1-nano-2025-04-14",
    user_id="alice"
)

# Later conversation reuses the memory
response = client.chat.completions.create(
    messages=[{"role": "user", "content": "Suggest dinner options in San Francisco."}],
    model="gpt-4.1-nano-2025-04-14",
    user_id="alice"
)

print(response.choices[0].message.content)
The second response should call out Indian restaurants and avoid cheese, proving Mem0 recalled the stored preference.

Verify the feature is working

  • Compare responses from Mem0 vs. OpenAI for identical prompts—both should return the same structure (choices, usage, etc.).
  • Inspect stored memories after each request to confirm the fact extraction captured the right details.
  • Test switching between hosted (Mem0(api_key=...)) and OSS configurations to ensure both respect the same request body.
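
For the second check, the standalone Mem0 client can list what was extracted. A sketch using the hosted MemoryClient; the "memory" field name assumes a v1-style response shape, and OSS users can do the same via Memory.from_config(...):

from mem0 import MemoryClient

mem_client = MemoryClient(api_key="m0-xxx")

# Fetch everything stored for alice and eyeball the extracted facts
for item in mem_client.get_all(user_id="alice"):
    print(item["memory"])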

Best practices

  1. Scope context intentionally: Pass identifiers only when you want conversations to persist; skip them for one-off calls.
  2. Log memory usage: Inspect response.metadata.memories (if enabled) to see which facts the model recalled.
  3. Reuse middleware: Point your existing OpenAI client wrappers to the Mem0 proxy URL to avoid code drift.
  4. Handle fallbacks: Keep a code path for plain OpenAI calls in case Mem0 is unavailable, then resync memory later (see the sketch after this list).
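
A minimal fallback sketch for practice 4, assuming the official openai package as the backup path; the helper name route_completion is hypothetical:

from mem0.proxy.main import Mem0
from openai import OpenAI

mem0_client = Mem0(api_key="m0-xxx")
openai_client = OpenAI()

def route_completion(messages, model, user_id=None):
    # Try the memory-aware proxy first; fall back to plain OpenAI
    # if Mem0 is unreachable, and resync the missed turns later.
    try:
        return mem0_client.chat.completions.create(
            messages=messages, model=model, user_id=user_id
        )
    except Exception:
        # Same request shape, but no memory persistence on this path
        return openai_client.chat.completions.create(
            messages=messages, model=model
        )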

Parameter reference

Parameter | Type | Purpose
user_id | str | Associates the conversation with a user so memories persist.
agent_id | str | Optional agent or bot identifier for multi-agent scenarios.
run_id | str | Optional session/run identifier for short-lived flows.
metadata | dict | Stores extra fields alongside each memory entry.
filters | dict | Restricts retrieval to specific memories while responding.
limit | int | Caps how many memories Mem0 pulls into the context (default 10).
Other request fields mirror OpenAI’s chat completion API.
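
Putting the scoping fields together in one call; a sketch reusing the hosted client from earlier, with the metadata and limit values chosen purely for illustration:

chat_completion = client.chat.completions.create(
    messages=[{"role": "user", "content": "Plan tomorrow's lunch."}],
    model="gpt-4.1-nano-2025-04-14",
    user_id="alice",
    agent_id="meal-planner",      # scope memories to this assistant
    metadata={"channel": "web"},  # stored alongside any new memories
    limit=5                       # cap retrieved memories at 5
)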