Instead of configuring ocr_engine, segment_processing, agentic_ocr, and merge_tables individually, you can apply a preset mode that bundles a recommended combination of all four. The parser ships with three:

Fast

Optimized for speed. Best for standard documents with clean text and simple tables.

Accurate

Balanced accuracy and performance. Best for complex layouts and detailed tables.

Agentic

Maximum accuracy with per-segment re-OCR. Best for critical documents requiring the highest fidelity.

Mode Parameters

Each mode is shorthand for a specific set of parameters. To use a mode, pass the values from its column in your /parse request.

| Parameter | Fast | Accurate | Agentic |
| --- | --- | --- | --- |
| ocr_engine | UnsiloedBeta | UnsiloedBeta | UnsiloedBeta |
| segment_processing.Table.html | VLM | VLM | VLM |
| segment_processing.Table.model_id | astra_v2 | astra_v3 | astra_v3 |
| agentic_ocr | null | null | advanced |
| merge_tables | false | true | false |

Modes are preset configurations. You can override any individual parameter after applying one. See the Parse API reference for the full parameter list.
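
The preset-plus-override idea can be sketched in Python as a lookup table with a small helper. This is a minimal illustration, not part of the parser or any SDK: MODE_PRESETS and build_form_data are hypothetical names, and the encoding of null as an empty form field is an assumption that mirrors the request example on this page.

```python
import json

# Values copied from the Mode Parameters table; None stands for null.
# MODE_PRESETS and build_form_data are hypothetical helpers, not SDK names.
MODE_PRESETS = {
    "fast": {
        "ocr_engine": "UnsiloedBeta",
        "segment_processing": {"Table": {"html": "VLM", "model_id": "astra_v2"}},
        "agentic_ocr": None,
        "merge_tables": False,
    },
    "accurate": {
        "ocr_engine": "UnsiloedBeta",
        "segment_processing": {"Table": {"html": "VLM", "model_id": "astra_v3"}},
        "agentic_ocr": None,
        "merge_tables": True,
    },
    "agentic": {
        "ocr_engine": "UnsiloedBeta",
        "segment_processing": {"Table": {"html": "VLM", "model_id": "astra_v3"}},
        "agentic_ocr": "advanced",
        "merge_tables": False,
    },
}


def build_form_data(mode, **overrides):
    """Return form fields for a mode, with individual parameters overridden."""
    if mode not in MODE_PRESETS:
        raise ValueError(f"unknown mode: {mode!r}")
    # Overrides replace whole top-level parameters (a shallow merge).
    params = {**MODE_PRESETS[mode], **overrides}
    form = {}
    for key, value in params.items():
        if value is None:
            form[key] = ""  # assumption: null is sent as an empty field
        elif isinstance(value, bool):
            form[key] = "true" if value else "false"
        elif isinstance(value, dict):
            form[key] = json.dumps(value)  # nested settings travel as a JSON string
        else:
            form[key] = str(value)
    return form
```

For example, build_form_data("accurate", merge_tables=False) keeps the Accurate values but turns table merging off. Because the merge is shallow, overriding segment_processing replaces the whole nested object rather than a single key inside it.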

Applying a Mode

The submit-and-poll flow is the same as in the Quickstart; only the request body differs. The snippet below applies Fast mode in Python. For Accurate or Agentic, substitute the values from the corresponding column of the Mode Parameters table. To translate to JavaScript or cURL, change the request body shape, not the parameter values.
import os
import requests

# Read the API key from the environment rather than hard-coding it.
API_KEY = os.environ["UNSILOED_API_KEY"]
BASE_URL = "https://prod.visionapi.unsiloed.ai"

# Submit the document with the Fast mode values.
with open("document.pdf", "rb") as f:
    response = requests.post(
        f"{BASE_URL}/parse",
        headers={"api-key": API_KEY},
        files={"file": ("document.pdf", f, "application/pdf")},
        data={
            "ocr_engine": "UnsiloedBeta",
            "merge_tables": "false",
            "agentic_ocr": "",  # null in the table, sent here as an empty form field
            "segment_processing": '{"Table": {"html": "VLM", "model_id": "astra_v2"}}',
        },
    )

# Fail on an HTTP error before reading the job ID.
response.raise_for_status()
job_id = response.json()["job_id"]
print(f"Job submitted: {job_id}")
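
The polling half of the flow is not shown on this page, so the sketch below stays generic. It assumes nothing about the status endpoint or response shape (those are covered in the Quickstart); poll_until is a hypothetical helper, and the caller supplies both the function that fetches the job status and the test for completion.

```python
import time
from typing import Any, Callable


def poll_until(
    fetch: Callable[[], Any],
    is_done: Callable[[Any], bool],
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call fetch() repeatedly until is_done(result) is true or the timeout passes."""
    deadline = time.monotonic() + timeout
    while True:
        result = fetch()
        if is_done(result):
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError("job did not finish before the timeout")
        sleep(interval)  # wait before asking again
```

Here fetch would wrap whatever status request the Quickstart describes for job_id, and is_done would inspect its response. Injecting both keeps the helper independent of the mode used to submit the job.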