Overview
The v3 endpoint parses a PDF (or an archive of PDFs) and returns markdown text for each page. Compared to v1/v2 it is intentionally simpler: there are no layout, OCR-engine, or segment-analysis knobs, and no segment tree in the response. You submit a PDF and get markdown back; the pipeline picks the best model per page internally. The endpoint is async: every submission returns a job_id that you poll until the job reaches a terminal state.
Endpoint base URL: https://prod.visionapi.unsiloed.ai/v3/parse
The v3 surface has four routes:
| Route | Use it for |
|---|---|
| POST /v3/parse | Submit a single PDF in one of three body shapes (multipart, JSON URL, JSON file_id) |
| POST /v3/parse/upload | Mint a presigned PUT URL for PDFs larger than the inline cap |
| POST /v3/parse/batch | Submit a tar/tar.gz/zip archive of PDFs in one job |
| GET /v3/parse/{job_id} | Poll status and retrieve the inline markdown result |
Authentication
Every request requires an X-API-Key header. Keys are personal, rate-limited per key (100 requests/day, 2 RPS), and isolated: you can only see your own jobs.
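A minimal sketch of attaching the key and spacing calls to stay under the 2 RPS limit on the client side. The names `auth_headers` and `Throttle` are hypothetical helpers, not part of any published client, and the service enforces its own limits regardless of what the client does.

```python
import time


def auth_headers(api_key: str) -> dict:
    """Headers that every v3 request needs (header name taken from this page)."""
    if not api_key:
        raise ValueError("an API key is required")
    return {"X-API-Key": api_key}


class Throttle:
    """Space calls at least 1/rps seconds apart (a client-side courtesy only)."""

    def __init__(self, rps: float = 2.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / rps
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time at which the previous call was allowed

    def wait(self) -> float:
        """Block until the next call is allowed and return the seconds slept."""
        now = self._clock()
        delay = 0.0
        if self._last is not None:
            delay = max(0.0, self.min_interval - (now - self._last))
        if delay > 0:
            self._sleep(delay)
        self._last = now + delay
        return delay
```

Calling `Throttle().wait()` before each request keeps consecutive calls at least half a second apart. The injectable `clock` and `sleep` arguments exist only to make the helper easy to test.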
v3 API keys are issued on request — they are separate from v1/v2 keys. To get one, email aman@unsiloed.ai and andre@unsiloed.ai (or open an issue at github.com/Unsiloed-AI/unsiloed-olmocr-bench) with a one-line note about what you’re evaluating. Typical turnaround is same-day.
Guarantees
| Property | What it means for you |
|---|---|
| Per-key isolation | Polling another user’s job_id returns 404, as does trying to re-parse another user’s file_id. |
| 24-hour retention | Every job artifact — your uploaded PDF, status, result, container logs — is deleted automatically 24 hours after the job is created. Pull your results within that window. |
| No scoring on our end | The API returns markdown only. If you want to reproduce a benchmark number, run the unmodified upstream scorer against the markdown locally. |
POST /v3/parse — Submit a single PDF
POST /v3/parse accepts three body shapes (auto-detected from the Content-Type header). All three submit the same async job and return the same response.
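As a rough guide to picking a body shape, the sketch below applies the size caps given for each shape in this section. The function name, the returned labels, and the exact byte value used for the "~3 MB" inline cap are assumptions made for illustration.

```python
INLINE_CAP_BYTES = 3 * 1024 * 1024  # "~3 MB raw"; the exact cutoff is an assumption
MAX_PDF_BYTES = 52_428_800          # 50 MB, the documented pipeline limit


def choose_body_shape(size_bytes: int, already_hosted: bool = False) -> str:
    """Return "multipart", "json_url", or "json_file_id" for a PDF of this size."""
    if size_bytes <= 0:
        raise ValueError("size_bytes must be positive")
    if size_bytes > MAX_PDF_BYTES:
        raise ValueError("PDF exceeds the 50 MB limit")
    if already_hosted:
        return "json_url"       # body shape 2: one call, the service fetches the URL
    if size_bytes <= INLINE_CAP_BYTES:
        return "multipart"      # body shape 1: one call, inline upload
    return "json_file_id"       # body shape 3: presigned upload, then submit
```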
Body shape 1 — Inline multipart upload
For small PDFs (up to ~3 MB raw). Single HTTP call.
- File part: the PDF binary, sent as multipart/form-data. Capped at ~3 MB raw (≈ 4 MB after base64 encoding inside API Gateway). For larger files use body shape 2 or 3.
- Pages selector: optional query parameter on the request URL. Restricts OCR to a subset of pages.
  - "1-5": pages 1 through 5
  - "1,3,5": specific pages
  - omitted: all pages
Body shape 2 — JSON with caller-hosted URL
For PDFs up to 50 MB that you already host (S3 public-read, S3 presigned, GitHub release asset, your own web server, etc.). Single HTTP call.
- PDF URL: publicly fetchable https:// URL or s3://bucket/key reference to the PDF. We fetch it. URLs pointing at private IPs, link-local addresses, or AWS instance metadata are rejected by an SSRF guard.
Body shape 3 — JSON with file_id from a presigned upload
For PDFs up to 50 MB that you do not want to host publicly. First call POST /v3/parse/upload (below) to get a presigned upload_url and file_id; PUT your PDF to the URL; then submit the parse using the file_id.
- file_id: the file_id returned by POST /v3/parse/upload, submitted after you finish the PUT. It also acts as the job_id for subsequent polling.
You must use the same API key that minted the file_id via POST /v3/parse/upload. Cross-key submissions return 404 (hiding existence); this is how per-key isolation is enforced.
Response (any body shape)
- job_id: job identifier (32-character UUID hex). Pass it to GET /v3/parse/{job_id} to poll. When you used body shape 3, this equals the file_id you submitted.
- status: always "queued" on submission. Subsequent values: "running", then "done" or "failed".
- Creation timestamp: ISO 8601 timestamp when the job was created.
POST /v3/parse/upload — Presigned upload URL
Returns a presigned S3 PUT URL so you can upload a PDF directly (bypassing the API Gateway request size cap). Use this for the 3-call flow of body shape 3 above. No request body is required: send an empty POST with the auth header.
Response
- file_id: opaque identifier. After you PUT the PDF to upload_url, pass this back as {"file_id": "..."} to POST /v3/parse to start parsing.
- upload_url: presigned S3 PUT URL. 1-hour expiry from issuance. Send the PDF body directly to this URL with HTTP method PUT and Content-Type: application/pdf. The transfer bypasses our API Gateway entirely.
- Upload method: always "PUT".
- Upload content type: always "application/pdf". Your PUT must set the same Content-Type header.
- Size limit: maximum PDF size accepted by the pipeline after upload. Currently 52428800 (50 MB).
- Expiry: seconds until the upload_url expires (3600).
Full 3-call flow
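One way the three calls could be chained in Python, using only the standard library. This is a sketch under stated assumptions: the response keys `file_id`, `upload_url`, and `job_id` are the ones named on this page, while `http_call` and `parse_via_presigned_upload` are hypothetical names, and error handling is left out for brevity.

```python
import json
import urllib.request

BASE_URL = "https://prod.visionapi.unsiloed.ai/v3/parse"


def http_call(method, url, headers, body=None):
    """Default transport: send one request with urllib and return the raw bytes."""
    req = urllib.request.Request(url, data=body, headers=headers, method=method)
    with urllib.request.urlopen(req, timeout=60) as resp:
        return resp.read()


def parse_via_presigned_upload(pdf_bytes, api_key, call=http_call):
    """Run the upload, PUT, and submit calls; return the job_id to poll."""
    auth = {"X-API-Key": api_key}

    # Call 1: mint the presigned URL (empty POST, auth header only).
    upload = json.loads(call("POST", BASE_URL + "/upload", auth))

    # Call 2: PUT the PDF straight to S3. The presigned URL carries its own
    # signature, so no X-API-Key header is sent here.
    call("PUT", upload["upload_url"], {"Content-Type": "application/pdf"}, pdf_bytes)

    # Call 3: start the parse job by handing the file_id back.
    body = json.dumps({"file_id": upload["file_id"]}).encode("utf-8")
    headers = {**auth, "Content-Type": "application/json"}
    submitted = json.loads(call("POST", BASE_URL, headers, body))
    return submitted["job_id"]
```

The `call` parameter lets you swap in another HTTP client or a test double without changing the flow.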
POST /v3/parse/batch — Archive of PDFs
Process many PDFs in one job. You host an archive of PDFs; we fetch it and process every PDF inside.
- Archive URL: public https:// URL or s3:// reference to a .tar, .tar.gz/.tgz, or .zip archive of PDFs. The archive format is auto-detected by sniffing the first bytes of the content, not by file extension. Non-PDF files inside the archive are skipped silently.
Compared with POST /v3/parse, the finished result contains documents[] instead of pages[] (one entry per PDF in the archive); see “Polling” below.
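Because the archive type is detected from content rather than from the file extension, it can help to run a similar check locally before hosting an archive. The sketch below uses the well-known magic bytes for gzip, zip, and POSIX tar; it illustrates content sniffing in general and is not a description of the service's own detection logic. `sniff_archive` is a hypothetical name.

```python
def sniff_archive(head: bytes) -> str:
    """Guess the archive type from its leading bytes (pass at least 262 bytes)."""
    if head[:2] == b"\x1f\x8b":
        return "tar.gz"   # gzip magic; assumed here to wrap a tar stream
    if head[:4] in (b"PK\x03\x04", b"PK\x05\x06"):
        return "zip"      # local file header, or the marker of an empty zip
    if head[257:262] == b"ustar":
        return "tar"      # tar magic sits at offset 257 of the first header
    return "unknown"
```

For a local file, `sniff_archive(open(path, "rb").read(512))` is enough, since every signature sits within the first 262 bytes.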
GET /v3/parse/{job_id} — Poll status + retrieve result
Query parameters
When set to "markdown" and status is "done", returns concatenated page markdown as Content-Type: text/markdown; charset=utf-8 instead of a JSON envelope. Useful for curl ... | tee out.md. Ignored while the job is queued, running, or failed.
Response — while running
Response — single-PDF done
- status: "done" for a completed single-PDF job.
- page_count: number of pages in the PDF (after applying the pages selector, if any).
- pages[]: per-page markdown. Each entry has page (1-indexed integer) and markdown (string). Page order is ascending.
Response — batch done
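For a finished batch job, the documents[] entries described in this section can be split into per-PDF markdown and per-PDF errors. A minimal sketch: `split_batch` is a hypothetical helper, and joining pages with a blank line is an assumption about how you want the text assembled.

```python
def split_batch(result: dict):
    """Return (markdown_by_pdf, error_by_pdf) from a finished batch result."""
    markdown_by_pdf, error_by_pdf = {}, {}
    for doc in result.get("documents", []):
        if "error" in doc:
            # A failed PDF carries an error field instead of pages.
            error_by_pdf[doc["pdf"]] = doc["error"]
            continue
        pages = sorted(doc["pages"], key=lambda entry: entry["page"])
        markdown_by_pdf[doc["pdf"]] = "\n\n".join(entry["markdown"] for entry in pages)
    return markdown_by_pdf, error_by_pdf
```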
- documents[]: one entry per PDF found in the archive. Each entry has pdf (relative path inside the archive), page_count, and pages[] (same shape as single-PDF). If a particular PDF failed, the entry has an error field instead of pages.
If the JSON would exceed API Gateway’s 10 MB response cap, the response is { "job_id", "status": "done", "result_url" } instead. Fetch result_url (presigned S3 GET) to download the same JSON. The schema of the downloaded JSON is identical to the inline shape, so clients can use one code path for both.
Response — failed
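Per the error table on this page, a failed job still answers with HTTP 200 and a body of the form {"status": "failed", "error": "..."}. The sketch below turns a terminal response into either a result or an exception, and follows result_url when the inline JSON was too large. `JobFailed`, `fetch_json`, and `resolve_result` are hypothetical names.

```python
import json
import urllib.request


class JobFailed(RuntimeError):
    """Raised when the job reached the "failed" state."""


def fetch_json(url):
    """Download and decode JSON from a URL (used for the presigned result_url)."""
    with urllib.request.urlopen(url, timeout=60) as resp:
        return json.loads(resp.read())


def resolve_result(body: dict, fetch=fetch_json) -> dict:
    """Return the full result JSON for a terminal job, or raise if it failed."""
    status = body.get("status")
    if status == "failed":
        raise JobFailed(body.get("error", "unknown error"))
    if status != "done":
        raise ValueError(f"job is not finished yet (status={status!r})")
    if "result_url" in body:
        # Oversized result: same schema, fetched from S3 instead of inline.
        return fetch(body["result_url"])
    return body
```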
Polling example
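A minimal polling loop in Python, offered as a sketch. The interval and timeout defaults are arbitrary choices, and because this page does not say whether polls count toward the 100 requests/day quota, the defaults poll sparingly. `get_job` and `poll_until_terminal` are hypothetical names.

```python
import json
import time
import urllib.request

BASE_URL = "https://prod.visionapi.unsiloed.ai/v3/parse"


def get_job(job_id, api_key):
    """Fetch the current job document from GET /v3/parse/{job_id}."""
    req = urllib.request.Request(f"{BASE_URL}/{job_id}", headers={"X-API-Key": api_key})
    with urllib.request.urlopen(req, timeout=30) as resp:
        return json.loads(resp.read())


def poll_until_terminal(job_id, api_key, interval=10.0, timeout=600.0,
                        get=get_job, sleep=time.sleep, clock=time.monotonic):
    """Poll until the job is "done" or "failed" and return the last response."""
    deadline = clock() + timeout
    while True:
        body = get(job_id, api_key)
        if body.get("status") in ("done", "failed"):
            return body
        if clock() >= deadline:
            raise TimeoutError(f"job {job_id} still {body.get('status')!r} after {timeout}s")
        sleep(interval)
```

The caller decides what to do with the terminal response, for example checking whether status is "failed" before reading pages.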
Error responses
| Status | Body / Header | When | What to do |
|---|---|---|---|
| 401 | {"message": "Unauthorized"} | Missing or invalid X-API-Key | Use a personal key; request one if you don’t have it yet |
| 403 | {"message": "Forbidden"} | Key not yet propagated through API Gateway edges (within 30–60s of issuance) | Retry after a minute |
| 404 | {"error": "Job not found"} | Job doesn’t exist, or the job belongs to a different API key | Confirm you’re using the same key that submitted the job; otherwise check the job_id |
| 404 | {"error": "no upload found for file_id=..."} | The file_id you sent doesn’t have an uploaded PDF behind it, or it belongs to a different key | Make sure you completed the PUT step from /upload, and that you’re using the same key |
| 413 | {"message": "Request Too Long"} | Multipart body too big for API Gateway’s request cap | Switch to body shape 2 (JSON URL) or body shape 3 (presigned upload) |
| 429 | {"message": "Limit Exceeded"} | Per-key quota (100 requests/day) or rate limit (2 RPS / 2 burst) exceeded | Slow down and retry; quota resets daily |
| 400 | {"error": "invalid url: ..."} | URL validation failed (wrong scheme, IP-literal, link-local, RFC1918, AWS metadata, etc.) | Use an https:// or s3:// URL pointing at a public/presigned object |
| 400 | {"error": "multipart parse failed: ..."} | Malformed multipart body | Verify your client sets Content-Type: multipart/form-data; boundary=... correctly |
| 200 (job failed) | {"status": "failed", "error": "..."} | Container hit a runtime error (file isn’t a real PDF, archive contains no PDFs, fetch URL timed out, etc.) | Read the error field; fix and resubmit |
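The table suggests that only 403 (key still propagating) and 429 (rate or quota limit) are worth retrying unchanged. A sketch of that policy follows; the capped exponential backoff for 429 is an assumption, and `retry_delay` is a hypothetical helper. If the 429 comes from the daily quota rather than the per-second limit, backing off for a few seconds will not help.

```python
def retry_delay(http_status: int, attempt: int):
    """Seconds to wait before retrying, or None if the request needs fixing first.

    `attempt` counts retries already made, starting at 0.
    """
    if http_status == 403:
        return 60.0                        # new keys can take 30-60 s to propagate
    if http_status == 429:
        return min(60.0, 2.0 ** attempt)   # 1 s, 2 s, 4 s, ... capped at one minute
    return None                            # 400, 401, 404, 413: change the request
```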
Code examples
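A possible Python sketch for body shape 1 (inline multipart upload) using only the standard library. Two details are assumptions because this page does not name them: the multipart field name (`file`) and the query parameter name for the pages selector (`pages`). Check both against a working request before relying on them. The helper names are hypothetical.

```python
import urllib.parse
import urllib.request
import uuid

BASE_URL = "https://prod.visionapi.unsiloed.ai/v3/parse"
INLINE_CAP_BYTES = 3 * 1024 * 1024  # "~3 MB raw"; the exact cutoff is an assumption


def encode_multipart(pdf_bytes, field="file", filename="document.pdf"):
    """Return (content_type, body) for a one-file multipart/form-data request."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{field}"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return f"multipart/form-data; boundary={boundary}", head + pdf_bytes + tail


def build_inline_request(pdf_bytes, api_key, pages=None):
    """Build (but do not send) the POST /v3/parse request for a small PDF."""
    if len(pdf_bytes) > INLINE_CAP_BYTES:
        raise ValueError("too large for inline upload; use body shape 2 or 3")
    content_type, body = encode_multipart(pdf_bytes)
    url = BASE_URL
    if pages:
        # Assumed parameter name; urlencode percent-encodes the commas.
        url += "?" + urllib.parse.urlencode({"pages": pages})
    headers = {"X-API-Key": api_key, "Content-Type": content_type}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")
```

Passing the returned request to `urllib.request.urlopen` sends it; the JSON response then carries the job_id to poll.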
See also
- Open-source benchmark harness + client: reproduces our published olmOCR-Bench numbers across vendors and includes a thin client (clients/bench_via_api.py) that calls this endpoint, collects the returned markdown, and scores it with the unmodified upstream scorer.
- v1 Parse Document: segmented response with bounding boxes, OCR data, and per-segment processing knobs.
- v2 Parse Document (Presigned Upload): segmented response variant with presigned upload for larger files and higher throughput.

