
/ocr

| Feature | Supported |
|---|---|
| Cost Tracking | ✅ |
| Logging | ✅ (Basic Logging not supported) |
| Load Balancing | ✅ |
| Supported Providers | mistral, azure_ai |

LiteLLM Python SDK Usage

Quick Start

from litellm import ocr
import os

os.environ["MISTRAL_API_KEY"] = "sk-.."

response = ocr(
    model="mistral/mistral-ocr-latest",
    document={
        "type": "document_url",
        "document_url": "https://arxiv.org/pdf/2201.04234"
    }
)

# Access extracted text
for page in response.pages:
    print(f"Page {page.index}:")
    print(page.markdown)

Async Usage

from litellm import aocr
import os, asyncio

os.environ["MISTRAL_API_KEY"] = "sk-.."

async def test_async_ocr():
    response = await aocr(
        model="mistral/mistral-ocr-latest",
        document={
            "type": "document_url",
            "document_url": "https://arxiv.org/pdf/2201.04234"
        }
    )

    # Access extracted text
    for page in response.pages:
        print(f"Page {page.index}:")
        print(page.markdown)

asyncio.run(test_async_ocr())
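
When several documents need OCR, the async call above can be fanned out with `asyncio.gather`. The following is a minimal sketch, not part of LiteLLM: `ocr_many` and `max_concurrency` are hypothetical names, and the helper takes the OCR coroutine as a parameter (in real use, `litellm.aocr`) so the example can run against a stub without an API key.

```python
import asyncio
from types import SimpleNamespace


async def ocr_many(urls, ocr_fn, model="mistral/mistral-ocr-latest", max_concurrency=3):
    """Run OCR on several document URLs concurrently.

    ocr_fn is any coroutine function accepting the same keyword arguments
    as litellm.aocr (model=..., document=...). Results keep the order of urls.
    """
    semaphore = asyncio.Semaphore(max_concurrency)  # cap in-flight requests

    async def run_one(url):
        async with semaphore:
            return await ocr_fn(
                model=model,
                document={"type": "document_url", "document_url": url},
            )

    return await asyncio.gather(*(run_one(url) for url in urls))


# Stub standing in for litellm.aocr so the sketch runs offline.
async def fake_aocr(model, document):
    await asyncio.sleep(0)
    page = SimpleNamespace(index=0, markdown=f"text from {document['document_url']}")
    return SimpleNamespace(pages=[page])


urls = ["https://example.com/a.pdf", "https://example.com/b.pdf"]
responses = asyncio.run(ocr_many(urls, fake_aocr))
first_pages = [r.pages[0].markdown for r in responses]
print(first_pages)
```

To call the real API, pass `aocr` imported from `litellm` as `ocr_fn` in place of the stub. The semaphore is one simple way to avoid sending every request at once; the right limit depends on your provider's rate limits.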

Using Base64 Encoded Documents

import base64
from litellm import ocr

# Encode PDF to base64
with open("document.pdf", "rb") as f:
    base64_pdf = base64.b64encode(f.read()).decode('utf-8')

response = ocr(
    model="mistral/mistral-ocr-latest",
    document={
        "type": "document_url",
        "document_url": f"data:application/pdf;base64,{base64_pdf}"
    }
)
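
The encoding step above can be wrapped in a small helper that also picks the `type` field from the file extension. This is a sketch under assumptions: `build_document` and the `MIME_TYPES` table are hypothetical, and sending an image as a data URL under `image_url` is assumed to work the same way the PDF data URL does under `document_url`.

```python
import base64
from pathlib import Path

# Hypothetical extension-to-MIME table; extend it for the formats you use.
MIME_TYPES = {
    ".pdf": "application/pdf",
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
}


def build_document(filename, data):
    """Return a `document` dict that embeds raw bytes as a base64 data URL."""
    suffix = Path(filename).suffix.lower()
    if suffix not in MIME_TYPES:
        raise ValueError(f"unsupported file extension: {suffix!r}")
    mime = MIME_TYPES[suffix]
    encoded = base64.b64encode(data).decode("utf-8")
    data_url = f"data:{mime};base64,{encoded}"
    # Images use the "image_url" type; everything else uses "document_url".
    if mime.startswith("image/"):
        return {"type": "image_url", "image_url": data_url}
    return {"type": "document_url", "document_url": data_url}


pdf_doc = build_document("report.pdf", b"%PDF-1.4")
png_doc = build_document("scan.PNG", b"\x89PNG")
print(pdf_doc["type"], png_doc["type"])
```

The returned dict can be passed directly as the `document` argument of `ocr(...)`.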

Optional Parameters

from litellm import ocr

response = ocr(
    model="mistral/mistral-ocr-latest",
    document={
        "type": "document_url",
        "document_url": "https://example.com/doc.pdf"
    },
    # Optional Mistral parameters
    pages=[0, 1, 2],            # Only process specific pages (0-indexed)
    include_image_base64=True,  # Include extracted images
    image_limit=10,             # Max images to return
    image_min_size=100          # Min image size to include
)

LiteLLM Proxy Usage

LiteLLM provides a Mistral API-compatible /ocr endpoint for OCR calls.

Setup

Add this to your litellm proxy config.yaml

model_list:
  - model_name: mistral-ocr
    litellm_params:
      model: mistral/mistral-ocr-latest
      api_key: os.environ/MISTRAL_API_KEY

Start litellm

litellm --config /path/to/config.yaml

# RUNNING on http://0.0.0.0:4000

Test request

curl http://0.0.0.0:4000/v1/ocr \
  -H "Authorization: Bearer sk-1234" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "mistral-ocr",
    "document": {
      "type": "document_url",
      "document_url": "https://arxiv.org/pdf/2201.04234"
    }
  }'
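
The same test request can be issued from Python with only the standard library. The sketch below mirrors the curl call; `build_ocr_request` and `send_ocr_request` are hypothetical helper names, and the base URL and key are the placeholder values from the example above. Only the request object is built here, so the snippet runs without a proxy; sending it requires the proxy to be up.

```python
import json
import urllib.request


def build_ocr_request(base_url, api_key, model, document_url):
    """Build (but do not send) a POST request for the proxy's /v1/ocr route."""
    payload = {
        "model": model,
        "document": {"type": "document_url", "document_url": document_url},
    }
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/v1/ocr",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def send_ocr_request(request, timeout=120):
    """Send the request and decode the JSON body (needs a running proxy)."""
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


request = build_ocr_request(
    "http://0.0.0.0:4000",
    "sk-1234",
    "mistral-ocr",
    "https://arxiv.org/pdf/2201.04234",
)
print(request.get_method(), request.full_url)
```

With the proxy running, `send_ocr_request(request)` should return the parsed JSON response.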

Request/Response Format

Note: LiteLLM follows the Mistral OCR API specification. See the official Mistral OCR documentation for complete details.

Example Request

{
  "model": "mistral/mistral-ocr-latest",
  "document": {
    "type": "document_url",
    "document_url": "https://arxiv.org/pdf/2201.04234"
  },
  "pages": [0, 1, 2],
  "include_image_base64": true,
  "image_limit": 10,
  "image_min_size": 100
}

Only model and document are required. The pages, include_image_base64, image_limit, and image_min_size fields are optional.

Request Parameters

| Parameter | Type | Required | Description |
|---|---|---|---|
| model | string | Yes | The OCR model to use (e.g., "mistral/mistral-ocr-latest") |
| document | object | Yes | Document to process. Must contain a type and the matching URL field |
| document.type | string | Yes | Either "document_url" for PDFs/docs or "image_url" for images |
| document.document_url | string | Conditional | URL to the document (required if type is "document_url") |
| document.image_url | string | Conditional | URL to the image (required if type is "image_url") |
| pages | array | No | List of specific page indices to process (0-indexed) |
| include_image_base64 | boolean | No | Whether to include extracted images as base64 strings |
| image_limit | integer | No | Maximum number of images to return |
| image_min_size | integer | No | Minimum size (in pixels) for images to include |
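
The rules in the table can be checked on the client before a request is sent. The following is a minimal sketch of such a check, not something LiteLLM provides: `validate_ocr_request` is a hypothetical helper that encodes only the constraints listed above.

```python
def validate_ocr_request(payload):
    """Return a list of problems found in an OCR request payload (empty if none)."""
    errors = []
    model = payload.get("model")
    if not isinstance(model, str) or not model:
        errors.append("model must be a non-empty string")

    document = payload.get("document")
    if not isinstance(document, dict):
        errors.append("document must be an object")
    else:
        doc_type = document.get("type")
        if doc_type not in ("document_url", "image_url"):
            errors.append('document.type must be "document_url" or "image_url"')
        elif not isinstance(document.get(doc_type), str):
            # The URL field has the same name as the type value.
            errors.append(f"document.{doc_type} is required when type is {doc_type!r}")

    pages = payload.get("pages")
    if pages is not None:
        if not isinstance(pages, list) or any(
            isinstance(p, bool) or not isinstance(p, int) or p < 0 for p in pages
        ):
            errors.append("pages must be a list of non-negative integers (0-indexed)")

    for name in ("image_limit", "image_min_size"):
        value = payload.get(name)
        if value is not None and (isinstance(value, bool) or not isinstance(value, int)):
            errors.append(f"{name} must be an integer")

    flag = payload.get("include_image_base64")
    if flag is not None and not isinstance(flag, bool):
        errors.append("include_image_base64 must be a boolean")
    return errors


good = {
    "model": "mistral/mistral-ocr-latest",
    "document": {"type": "document_url", "document_url": "https://example.com/doc.pdf"},
    "pages": [0, 1, 2],
}
# Wrong URL field for the declared type, and a negative page index.
bad = {
    "model": "mistral/mistral-ocr-latest",
    "document": {"type": "image_url", "document_url": "https://example.com/a.png"},
    "pages": [-1],
}
print(validate_ocr_request(good))
print(validate_ocr_request(bad))
```

Returning a list of messages instead of raising on the first problem lets a caller report every issue in one pass.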

Document Format Examples

For PDFs and documents:

{
  "type": "document_url",
  "document_url": "https://example.com/document.pdf"
}

For images:

{
  "type": "image_url",
  "image_url": "https://example.com/image.png"
}

For base64-encoded content:

{
  "type": "document_url",
  "document_url": "data:application/pdf;base64,JVBERi0xLjQKJ..."
}

Response Format

The response follows Mistral's OCR format with the following structure:

{
  "pages": [
    {
      "index": 0,
      "markdown": "# Document Title\n\nExtracted text content...",
      "dimensions": {
        "dpi": 200,
        "height": 2200,
        "width": 1700
      },
      "images": [
        {
          "image_base64": "base64string...",
          "bbox": {
            "x": 100,
            "y": 200,
            "width": 300,
            "height": 400
          }
        }
      ]
    }
  ],
  "model": "mistral-ocr-2505-completion",
  "usage_info": {
    "pages_processed": 29,
    "doc_size_bytes": 3002783
  },
  "document_annotation": null,
  "object": "ocr"
}
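
A common next step is to stitch the per-page Markdown into a single document. The sketch below works on the JSON response as a plain dict, as returned by the proxy; `pages_to_markdown` and the separator are hypothetical choices, and the sample data is made up to match the structure above.

```python
def pages_to_markdown(response, separator="\n\n---\n\n"):
    """Join every page's markdown into one string, ordered by page index."""
    pages = sorted(response.get("pages", []), key=lambda page: page["index"])
    return separator.join(page["markdown"] for page in pages)


# Trimmed sample shaped like the response above (values are made up).
sample = {
    "pages": [
        {"index": 1, "markdown": "Second page text"},
        {"index": 0, "markdown": "# Document Title"},
    ],
    "usage_info": {"pages_processed": 2, "doc_size_bytes": 1024},
}

full_text = pages_to_markdown(sample)
print(full_text)
print("pages processed:", sample["usage_info"]["pages_processed"])
```

Sorting by `index` guards against pages arriving out of order, which matters when only a subset was requested through `pages`.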

Response Fields

| Field | Type | Description |
|---|---|---|
| pages | array | List of processed pages with extracted content |
| pages[].index | integer | Page number (0-indexed) |
| pages[].markdown | string | Extracted text in Markdown format |
| pages[].dimensions | object | Page dimensions (dpi, height, width in pixels) |
| pages[].images | array | Extracted images from the page (if include_image_base64=true) |
| model | string | The model used for OCR processing |
| usage_info | object | Processing statistics (pages processed, document size) |
| document_annotation | object | Optional document-level annotations |
| object | string | Always "ocr" for OCR responses |
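
When `include_image_base64` is set, the entries in `pages[].images` can be decoded back to raw bytes. This is a hedged sketch over the dict form of the response: `decode_page_images` is a hypothetical helper, it relies only on the `image_base64` field shown above, and the handling of a `data:...;base64,` prefix is an assumption about how the string may be delivered.

```python
import base64


def decode_page_images(response):
    """Yield (page_index, image_number, raw_bytes) for each extracted image."""
    for page in response.get("pages", []):
        for number, image in enumerate(page.get("images") or []):
            encoded = image.get("image_base64")
            if not encoded:
                continue  # no image data was returned for this entry
            # Assumption: the value may carry a "data:<mime>;base64," prefix.
            if encoded.startswith("data:") and "," in encoded:
                encoded = encoded.split(",", 1)[1]
            yield page["index"], number, base64.b64decode(encoded)


# Made-up sample: one bare base64 string, one data URL, one empty entry.
sample = {
    "pages": [
        {
            "index": 0,
            "images": [
                {"image_base64": base64.b64encode(b"raw-bytes").decode("ascii")},
                {
                    "image_base64": "data:image/png;base64,"
                    + base64.b64encode(b"png-bytes").decode("ascii")
                },
                {"image_base64": None},
            ],
        },
        {"index": 1, "images": []},
    ]
}

images = list(decode_page_images(sample))
print(images)
```

Each tuple's bytes can then be written to disk, for example with `pathlib.Path.write_bytes`.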

Supported Providers

| Provider | Link to Usage |
|---|---|
| Mistral AI | Usage |
| Azure AI | Usage |