Multimodal support lets Mem0 extract facts from images alongside regular text. Add screenshots, receipts, or product photos and Mem0 will store the insights as searchable memories so agents can recall them later.
You’ll use this when…
  • Users share screenshots, menus, or documents and you want the details to become memories.
  • You already collect text conversations but need visual context for better answers.
  • You want a single workflow that handles both URLs and local image files.
Images larger than 20 MB are rejected. Compress or resize files before sending them to avoid errors.

Feature anatomy

  • Vision processing: Mem0 runs the image through a vision model that extracts text and key details.
  • Memory creation: Extracted information is stored as standard memories so search, filters, and analytics continue to work.
  • Context linking: Visual and textual turns in the same conversation stay linked, giving agents richer context.
  • Flexible inputs: Accept publicly accessible URLs or base64-encoded local files in both Python and JavaScript SDKs.
Format     | Used for                    | Notes
JPEG / JPG | Photos and screenshots      | Default option for camera captures.
PNG        | Images with transparency    | Keeps sharp text and UI elements crisp.
WebP       | Web-optimized images        | Smaller payloads for faster uploads.
GIF        | Static or animated graphics | Works for simple graphics and short loops.
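
The format table above can double as a pre-flight check before an image is sent. The sketch below is a minimal example, assuming the file extension is a reliable hint of the actual format; `SUPPORTED_FORMATS` and `mime_type_for` are hypothetical names for illustration and are not part of the Mem0 SDK.

```python
from pathlib import Path

# Hypothetical mapping derived from the format table; not part of the Mem0 SDK.
SUPPORTED_FORMATS = {
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".png": "image/png",
    ".webp": "image/webp",
    ".gif": "image/gif",
}

def mime_type_for(image_path: str) -> str:
    """Return the MIME type for a supported image, or raise ValueError."""
    suffix = Path(image_path).suffix.lower()
    try:
        return SUPPORTED_FORMATS[suffix]
    except KeyError:
        raise ValueError(f"Unsupported image format: {suffix or '(none)'}") from None

print(mime_type_for("photos/Menu.JPG"))
```

The returned MIME type can also be used to build the data:image/<type>;base64, prefix for local files instead of hardcoding image/jpeg.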

Configure it

Add image messages from URLs

from mem0 import Memory

client = Memory()

messages = [
    {"role": "user", "content": "Hi, my name is Alice."},
    {
        "role": "user",
        "content": {
            "type": "image_url",
            "image_url": {
                "url": "https://example.com/menu.jpg"
            }
        }
    }
]

client.add(messages, user_id="alice")
Inspect the response payload: the returned memories should include entries extracted from the menu image as well as from the text turns.

Upload local images as base64

import base64
from mem0 import Memory

def encode_image(image_path):
    with open(image_path, "rb") as image_file:
        return base64.b64encode(image_file.read()).decode("utf-8")

client = Memory()
base64_image = encode_image("path/to/your/image.jpg")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "text", "text": "What's in this image?"},
            {
                "type": "image_url",
                "image_url": {
                    "url": f"data:image/jpeg;base64,{base64_image}"
                }
            }
        ]
    }
]

client.add(messages, user_id="alice")
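Keep base64 payloads under 5 MB to speed up uploads and stay well below the 20 MB limit. Base64 encoding inflates a file by roughly one third, so check the encoded size, not just the size on disk.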
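
A size check before encoding might look like the sketch below. It assumes that 1 MB means 1024 × 1024 bytes and that the limit applies to the encoded payload; this guide does not specify either, so treat both as conservative guesses. `base64_length` and `check_image_size` are hypothetical helper names.

```python
import os

MAX_IMAGE_BYTES = 20 * 1024 * 1024   # 20 MB limit stated in this guide (unit size assumed)
RECOMMENDED_BYTES = 5 * 1024 * 1024  # 5 MB target suggested in this guide

def base64_length(raw_bytes: int) -> int:
    """Length of the padded base64 text produced from raw_bytes of input."""
    return 4 * ((raw_bytes + 2) // 3)

def check_image_size(image_path: str) -> int:
    """Return the estimated encoded size, or raise ValueError if over the limit."""
    encoded = base64_length(os.path.getsize(image_path))
    if encoded > MAX_IMAGE_BYTES:
        raise ValueError(f"{image_path}: ~{encoded} bytes encoded exceeds the 20 MB limit")
    if encoded > RECOMMENDED_BYTES:
        print(f"Warning: {image_path} is ~{encoded} bytes encoded; consider compressing")
    return encoded

# A 3 MB file grows to 4 MB once encoded.
print(base64_length(3 * 1024 * 1024))
```

Calling `check_image_size` before `encode_image` avoids reading and encoding a file that would be rejected anyway.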

See it in action

Restaurant menu memory

from mem0 import Memory

client = Memory()

messages = [
    {
        "role": "user",
        "content": "Help me remember which dishes I liked."
    },
    {
        "role": "user",
        "content": {
            "type": "image_url",
            "image_url": {
                "url": "https://example.com/restaurant-menu.jpg"
            }
        }
    },
    {
        "role": "user",
        "content": "I’m allergic to peanuts and prefer vegetarian meals."
    }
]

result = client.add(messages, user_id="user123")
print(result)
The response should capture both the allergy note and menu items extracted from the photo so future searches can combine them.

Document capture

from mem0 import Memory

client = Memory()

messages = [
    {
        "role": "user",
        "content": "Store this receipt information for expenses."
    },
    {
        "role": "user",
        "content": {
            "type": "image_url",
            "image_url": {
                "url": "https://example.com/receipt.jpg"
            }
        }
    }
]

client.add(messages, user_id="user123")
Combine the receipt upload with structured metadata (tags, categories) if you need to filter expenses later.
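
One way to attach that metadata is sketched below. It assumes `add` accepts a `metadata` keyword argument, as in the open-source Python SDK; confirm this against the version you run. The `build_receipt_request` helper and the metadata keys are hypothetical. Values are kept as flat strings because some vector stores only accept scalar metadata values.

```python
def build_receipt_request(image_url: str, category: str, tags: list[str]):
    """Return (messages, metadata) for a receipt upload."""
    messages = [
        {"role": "user", "content": "Store this receipt information for expenses."},
        {"role": "user", "content": {"type": "image_url", "image_url": {"url": image_url}}},
    ]
    # Hypothetical keys; pick names that match the filters you plan to use.
    metadata = {"category": category, "tags": ",".join(tags)}
    return messages, metadata

messages, metadata = build_receipt_request(
    "https://example.com/receipt.jpg", category="expenses", tags=["receipt", "travel"]
)
print(metadata)

# With a configured client, the call would look like this (assumed signature):
# client.add(messages, user_id="user123", metadata=metadata)
```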

Error handling

from mem0 import Memory
from mem0.exceptions import InvalidImageError, FileSizeError

client = Memory()

try:
    messages = [{
        "role": "user",
        "content": {
            "type": "image_url",
            "image_url": {"url": "https://example.com/image.jpg"}
        }
    }]

    client.add(messages, user_id="user123")
    print("Image processed successfully")

except InvalidImageError:
    print("Invalid image format or corrupted file")
except FileSizeError:
    print("Image file too large")
except Exception as exc:
    print(f"Unexpected error: {exc}")
Fail fast on invalid formats so you can prompt users to re-upload before losing their context.

Verify the feature is working

  • After calling add, inspect the returned memories and confirm they include image-derived text (menu items, receipt totals, etc.).
  • Run a follow-up search for a detail from the image; the memory should surface alongside related text.
  • Monitor image upload latency; large files should still complete within your acceptable response time.
  • Log file size and URL sources to troubleshoot repeated failures.
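
The first two checks can be automated with a small helper. This is a sketch under assumptions: it expects `search` to return either a list of entries or a dict with a "results" list, with each entry holding its text under a "memory" key. The response shape may differ between SDK versions, so adjust accordingly. `FakeClient` is a stand-in so the example runs offline; all helper names are hypothetical.

```python
def memory_texts(response):
    """Normalize an add/search response to a list of memory strings."""
    items = response.get("results", []) if isinstance(response, dict) else response
    return [item["memory"] for item in items if "memory" in item]

def image_detail_recalled(client, query, user_id, expected):
    """Return True if any memory returned for the query mentions `expected`."""
    response = client.search(query, user_id=user_id)
    return any(expected.lower() in text.lower() for text in memory_texts(response))

class FakeClient:
    """Offline stand-in for a configured Memory client."""
    def search(self, query, user_id):
        return {"results": [{"memory": "Menu lists a vegetarian pad thai for $12"}]}

print(image_detail_recalled(FakeClient(), "What was on the menu?", "user123", "pad thai"))
```

Replace `FakeClient()` with your real client and pick an `expected` string that only appears in the image, not in the text turns.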

Best practices

  1. Ask for intent: Prompt users to explain why they sent an image so the memory includes the right context.
  2. Keep images readable: Encourage clear photos without heavy filters or shadows for better extraction.
  3. Split bulk uploads: Send multiple images as separate add calls to isolate failures and improve reliability.
  4. Watch privacy: Avoid uploading sensitive documents unless your environment is secured for that data.
  5. Validate file size early: Check file size before encoding to save bandwidth and time.
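
Practice 3 (one add call per image) can be sketched as follows. The helper name `add_images_individually` is hypothetical, and `FlakyClient` is an offline stand-in that simulates one failing upload; with a real client you would catch the narrower exception types your SDK version raises.

```python
def add_images_individually(client, image_urls, user_id):
    """Send each image in its own add call; return (succeeded, failed)."""
    succeeded, failed = [], []
    for url in image_urls:
        message = {
            "role": "user",
            "content": {"type": "image_url", "image_url": {"url": url}},
        }
        try:
            client.add([message], user_id=user_id)
            succeeded.append(url)
        except Exception as exc:  # keep going so one bad image does not block the rest
            failed.append((url, str(exc)))
    return succeeded, failed

class FlakyClient:
    """Offline stand-in for a Memory client; rejects URLs ending in .bmp."""
    def add(self, messages, user_id):
        url = messages[0]["content"]["image_url"]["url"]
        if url.endswith(".bmp"):
            raise ValueError("unsupported format")
        return {"results": []}

ok, bad = add_images_individually(
    FlakyClient(),
    ["https://example.com/a.jpg", "https://example.com/b.bmp", "https://example.com/c.png"],
    user_id="user123",
)
print(ok)
print(bad)
```

The `failed` list gives you the exact URLs to ask the user to re-upload, while the successful images are already stored.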

Troubleshooting

Issue                     | Cause                        | Fix
Upload rejected           | File larger than 20 MB       | Compress or resize before sending.
Memory missing image data | Low-quality or blurry image  | Retake the photo with better lighting.
Invalid format error      | Unsupported file type        | Convert to JPEG or PNG first.
Slow processing           | High-resolution images       | Downscale or compress to under 5 MB.
Base64 errors             | Incorrect prefix or encoding | Ensure data:image/<type>;base64, is present and the string is valid.
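
For the base64 row, a local check can catch a bad prefix or payload before the request is sent. This is a minimal sketch using only the standard library; `validate_data_url` is a hypothetical helper, and the accepted subtypes simply mirror the formats listed in this guide.

```python
import base64
import binascii
import re

# Accept only the image subtypes listed in this guide.
DATA_URL_PATTERN = re.compile(r"^data:image/(jpeg|png|webp|gif);base64,(.+)$", re.DOTALL)

def validate_data_url(data_url: str) -> bytes:
    """Return the decoded image bytes, or raise ValueError if the data URL is malformed."""
    match = DATA_URL_PATTERN.match(data_url)
    if match is None:
        raise ValueError("Missing or malformed 'data:image/<type>;base64,' prefix")
    try:
        # validate=True rejects characters outside the base64 alphabet.
        return base64.b64decode(match.group(2), validate=True)
    except binascii.Error as exc:
        raise ValueError(f"Invalid base64 payload: {exc}") from exc

sample = "data:image/png;base64," + base64.b64encode(b"\x89PNG").decode("ascii")
print(len(validate_data_url(sample)))
```

The check confirms only that the string is well formed; it does not verify that the decoded bytes are a valid image.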