You’ll use this when…
- Users share screenshots, menus, or documents and you want the details to become memories.
- You already collect text conversations but need visual context for better answers.
- You want a single workflow that handles both URLs and local image files.
Images larger than 20 MB are rejected. Compress or resize files before sending them to avoid errors.
Feature anatomy
- Vision processing: Mem0 runs the image through a vision model that extracts text and key details.
- Memory creation: Extracted information is stored as standard memories so search, filters, and analytics continue to work.
- Context linking: Visual and textual turns in the same conversation stay linked, giving agents richer context.
- Flexible inputs: Accept publicly accessible URLs or base64-encoded local files in both Python and JavaScript SDKs.
Supported formats
Supported formats
| Format | Used for | Notes |
|---|---|---|
| JPEG / JPG | Photos and screenshots | Default option for camera captures. |
| PNG | Images with transparency | Keeps sharp text and UI elements crisp. |
| WebP | Web-optimized images | Smaller payloads for faster uploads. |
| GIF | Static or animated graphics | Works for simple graphics and short loops. |
Configure it
Add image messages from URLs
Inspect the response payload—the memories list should include entries extracted from the menu image as well as the text turns.
Upload local images as base64
Keep base64 payloads under 5 MB to speed up uploads and avoid hitting the 20 MB limit.
See it in action
Restaurant menu memory
The response should capture both the allergy note and menu items extracted from the photo so future searches can combine them.
Document capture
Combine the receipt upload with structured metadata (tags, categories) if you need to filter expenses later.
Error handling
Fail fast on invalid formats so you can prompt users to re-upload before losing their context.
Verify the feature is working
- After calling
add, inspect the returned memories and confirm they include image-derived text (menu items, receipt totals, etc.). - Run a follow-up
searchfor a detail from the image; the memory should surface alongside related text. - Monitor image upload latency—large files should still complete under your acceptable response time.
- Log file size and URL sources to troubleshoot repeated failures.
Best practices
- Ask for intent: Prompt users to explain why they sent an image so the memory includes the right context.
- Keep images readable: Encourage clear photos without heavy filters or shadows for better extraction.
- Split bulk uploads: Send multiple images as separate
addcalls to isolate failures and improve reliability. - Watch privacy: Avoid uploading sensitive documents unless your environment is secured for that data.
- Validate file size early: Check file size before encoding to save bandwidth and time.
Troubleshooting
| Issue | Cause | Fix |
|---|---|---|
| Upload rejected | File larger than 20 MB | Compress or resize before sending. |
| Memory missing image data | Low-quality or blurry image | Retake the photo with better lighting. |
| Invalid format error | Unsupported file type | Convert to JPEG or PNG first. |
| Slow processing | High-resolution images | Downscale or compress to under 5 MB. |
| Base64 errors | Incorrect prefix or encoding | Ensure data:image/<type>;base64, is present and the string is valid. |