POST /v2/parse/upload

Overview

The v2 presigned upload endpoint decouples document delivery from job creation. Instead of uploading your file through the API server, you:
  1. POST to /v2/parse/upload with your configuration — the API creates a parse job and returns a short-lived presigned URL.
  2. PUT your file directly to the presigned URL — the API server is never in the transfer path.
  3. Once the upload completes, the job is automatically enqueued for processing.
  4. Poll GET /parse/{job_id} to track progress and retrieve results — the same endpoint as v1.
This is the recommended endpoint for large files and high-throughput workloads. It supports faster uploads, larger file sizes, and higher concurrency than the standard Parse Document endpoint.
The job status starts as AwaitingUpload. It transitions to Queued once the upload completes, then to Processing, and finally Succeeded or Failed. See Get Parse Job Status for the full response schema and polling examples.
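The lifecycle above can be sketched as a small transition map. The states come from this page; the map and helper are a client-side illustration, not part of the API:

```python
# Illustrative job-status lifecycle for a v2 presigned-upload job.
# States are taken from this page; the map itself is a client-side sketch.
TRANSITIONS = {
    "AwaitingUpload": ["Queued"],           # upload completed
    "Queued": ["Processing"],               # worker picked up the job
    "Processing": ["Succeeded", "Failed"],  # terminal outcomes
    "Succeeded": [],
    "Failed": [],
}

def is_terminal(status: str) -> bool:
    """True when polling can stop."""
    return not TRANSITIONS.get(status, [])

print(is_terminal("Processing"))  # False
print(is_terminal("Succeeded"))   # True
```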

Request

file_name
string
required
File name with extension (e.g. "report.pdf"). Determines the MIME type for the presigned URL. Supported formats: PDF, images (PNG, JPEG, TIFF), and office documents (DOCX, PPTX, XLSX).
use_high_resolution
boolean
Enhanced image quality — enables high-resolution processing with upscaling algorithms. Improves OCR accuracy on low-quality scans by enhancing clarity and contrast. Latency penalty: ~2–3 seconds per page. Defaults to true.
layout_analysis
string
Layout analysis strategy: smart_layout_detection (default), page_by_page, or advanced_layout_detection.
  • "smart_layout_detection" (default): Intelligently identifies document structure, headers, sections, and content relationships across the entire document using bounding boxes.
  • "page_by_page": Analyzes each page independently as a single segment. Faster for simple documents.
  • "advanced_layout_detection": Uses a vision-language model for exhaustive page segmentation. Detects 14 element types (Caption, Footnote, Formula, ListItem, PageFooter, PageHeader, Picture, SectionHeader, Table, Text, Title, KeyValuePair, Signature, Seal). Best for visually complex or unusual layouts.
ocr_strategy
string
OCR strategy: auto_detection (default) or force_ocr.
  • "auto_detection" (default): Intelligently detects bad quality PDFs, scanned documents, and images, then applies OCR only where needed.
  • "force_ocr": Runs OCR on the entire document regardless of quality.
ocr_engine
string
OCR engine to use for text recognition: UnsiloedHawk (default, recommended), UnsiloedBeta, or UnsiloedStorm.
  • "UnsiloedHawk" (Recommended, default): Higher accuracy for complex layouts and mixed content.
  • "UnsiloedBeta": Handles rotated/warped text and irregular bounding boxes.
  • "UnsiloedStorm": Enterprise-grade accuracy optimized for 50+ languages.
agentic_ocr
string
Per-segment OCR enhancement — re-runs a dedicated agentic OCR model on each detected segment after layout detection for higher accuracy. Omit or leave empty to disable.
  • "standard": Good balance of speed and accuracy.
  • "advanced": Higher quality, best for complex layouts, rotated text, and mixed-language content.
extract_strikethrough
boolean
Strikethrough detection — detects and preserves strikethrough formatting in HTML and Markdown output. Defaults to false.
merge_tables
boolean
Cross-page table consolidation — detects and combines table segments across page breaks. Reconstructs complete table structure by matching headers and columns. Defaults to false.
merge_batch_size
integer
Maximum number of tables per merge group. Groups larger than this are split. Defaults to 20.
extract_charts
boolean
Chart data extraction — extract structured data from charts and graphs, including data points and chart type information. Defaults to false.
detect_checkboxes
boolean
Checkbox detection — detect and identify checkboxes in the document with their bounding box locations. Defaults to false.
Attach hyperlink URLs from PDF annotations to OCR results. Defaults to false.
extract_colors
boolean
Transfer text color from the PDF text layer to OCR results. Defaults to false.
xml_citation
boolean
Citation extraction — extracts academic citations from PDF documents and hyperlinks them in the markdown output. Generates structured bibliography metadata. PDFs only. Defaults to false.
page_range
string
Specify which pages to process. Leave empty to process all pages. Examples: "1-5" (pages 1 to 5), "2,4,6" (specific pages), "[1,3,5]" (array format), "1-3,7,10-15" (combination).
expires_in
integer
Seconds until the task and its output are deleted. Defaults to the plan expiration time.
export_format
array
Export format(s) to generate after processing. Currently supported: ["docx"]. The exported file is available via the exports field in the task response.
segment_type_naming
string
Segment type naming convention. "Unsiloed" (default) — e.g., PageHeader, ListItem, Picture. "Other" — alternative names e.g., Header, List Item, Figure.
segment_filter
string
Comma-separated segment types to keep in the output, or "all" to include every type. Defaults to "all".
validate_segments
array
Segment validation — uses a vision model to validate and correct segment types, fixing misclassified segments. Example: ["Table", "Formula", "Picture"].
validate_table_segments
boolean
Legacy option to validate table segments using a vision model. Prefer validate_segments instead.
chunk_processing
object
JSON object controlling how segments are grouped into chunks. See the v1 parse endpoint for full schema details.
segment_processing
object
JSON object for segment processing and analysis configuration.
segment_analysis
object
Configure how different segment types are processed — table models, image descriptions, and formula processing.
{
  "Table": {"html": "VLM", "markdown": "VLM", "model_id": "us_table_v2"},
  "Picture": {"html": "VLM", "markdown": "VLM", "model_id": "nova"},
  "Formula": {"html": "Auto", "markdown": "VLM", "model_id": "nova"}
}
Options per segment type:
  • html / markdown: "VLM" or "Auto"
  • model_id (Table): "astra", "us_table_v1", "us_table_v2"
  • model_id (Picture/Formula): "nova", "luna", "sol"
  • use_table_ocr (Table only): Advanced OCR for bordered cells and complex table layouts.
  • vlm: Custom prompt for the VLM model. Use this to give the model specific instructions for extracting or describing these segment types.
For full details, see Parse Document (v1).
llm_processing
object
JSON object for LLM processing configuration.
output_fields
object
JSON object controlling which fields are included in the output. Set fields to false to exclude them and reduce response size. Available fields: html, markdown, ocr, image, content, bbox, confidence, embed. All default to true.
error_handling
string
How to handle per-page errors during processing. "Continue" (default) — skips failed pages and continues processing the rest. "Fail" — aborts the entire job on the first error.
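As a client-side sanity check, the page_range formats documented above can be expanded locally before sending the request. This helper is our own sketch; the server does its own parsing:

```python
def expand_page_range(spec: str):
    """Expand a page_range spec into a sorted list of page numbers.

    Handles the documented formats: "1-5", "2,4,6", "[1,3,5]",
    and combinations like "1-3,7,10-15". An empty spec means
    "all pages" and is returned as None.
    """
    if not spec:
        return None
    # The array format "[1,3,5]" is just the comma form in brackets.
    spec = spec.strip().strip("[]")
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            lo, hi = part.split("-", 1)
            pages.update(range(int(lo), int(hi) + 1))
        else:
            pages.add(int(part))
    return sorted(pages)

print(expand_page_range("1-3,7,10-15"))  # [1, 2, 3, 7, 10, 11, 12, 13, 14, 15]
```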

Response

job_id
string
Unique identifier for the parse job. Use this with GET /parse/{job_id} to poll status and retrieve results.
upload_url
string
Presigned PUT URL. Upload your file directly to this URL — no api-key header needed for the upload itself. Valid until expires_at.
expires_at
string
RFC 3339 timestamp when upload_url expires. Upload must complete before this time.
upload_method
string
Always "PUT". Use an HTTP PUT request when uploading to upload_url.
upload_headers
object
Key-value headers you MUST include in your PUT request. Always includes Content-Type set to the MIME type inferred from file_name.
credit_used
integer
Number of page credits deducted as an initial reservation, reconciled on upload.
quota_remaining
integer
Remaining page credits after deduction.

Step-by-Step Guide

Step 1 — Create the parse job

POST to /v2/parse/upload with your file name and configuration. The API returns a job_id and a short-lived presigned URL.
import requests

API_KEY = "your-api-key"
BASE_URL = "https://prod.visionapi.unsiloed.ai"

response = requests.post(
    f"{BASE_URL}/v2/parse/upload",
    headers={"api-key": API_KEY, "Content-Type": "application/json"},
    json={
        "file_name": "report.pdf",
        "use_high_resolution": True,
        "layout_analysis": "smart_layout_detection",
        "ocr_strategy": "auto_detection",
        "extract_strikethrough": False,
        "merge_tables": False,
    },
)
response.raise_for_status()
data = response.json()

job_id = data["job_id"]
upload_url = data["upload_url"]
upload_headers = data["upload_headers"]

print(f"Job ID: {job_id}")
print(f"Upload URL expires: {data['expires_at']}")
The response contains the presigned URL and required headers:
{
  "job_id": "a3f1c2d4-7e8b-4a9f-b2c1-123456789abc",
  "upload_url": "https://upload.visionapi.unsiloed.ai/uploads/a3f1c2d4.../report.pdf?signature=...",
  "expires_at": "2025-10-22T07:06:16Z",
  "upload_method": "PUT",
  "upload_headers": {
    "Content-Type": "application/pdf"
  },
  "credit_used": 0,
  "quota_remaining": 1000
}

Step 2 — Upload the file

Use the upload_url and upload_headers from the response to PUT your file. No API key is needed for this request.
with open("report.pdf", "rb") as f:
    put_response = requests.put(upload_url, headers=upload_headers, data=f)
put_response.raise_for_status()

print("Upload complete — job is now queued for processing")
You must include every header listed in upload_headers. A missing or mismatched Content-Type will cause the upload to be rejected with a 403.

Step 3 — Poll for results

Once the upload completes, the job transitions from AwaitingUpload → Queued → Processing → Succeeded. Poll GET /parse/{job_id} using the same endpoint as v1.
import time

while True:
    status_response = requests.get(
        f"{BASE_URL}/parse/{job_id}",
        headers={"api-key": API_KEY},
    )
    status_response.raise_for_status()
    job = status_response.json()
    print(f"Status: {job['status']}")

    if job["status"] == "Succeeded":
        print(f"Done! {job['total_chunks']} chunks extracted.")
        break
    elif job["status"] == "Failed":
        raise RuntimeError(f"Job failed: {job.get('message')}")

    time.sleep(5)
See Get Parse Job Status for the full response schema.
The complete flow in cURL:
# Step 1: Create job and get presigned URL
RESPONSE=$(curl -s -X POST https://prod.visionapi.unsiloed.ai/v2/parse/upload \
  -H "api-key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "file_name": "report.pdf",
    "use_high_resolution": true,
    "layout_analysis": "smart_layout_detection",
    "ocr_strategy": "auto_detection",
    "extract_strikethrough": false,
    "merge_tables": false
  }')

JOB_ID=$(echo "$RESPONSE" | jq -r '.job_id')
UPLOAD_URL=$(echo "$RESPONSE" | jq -r '.upload_url')

# Step 2: Upload file (include every header from upload_headers)
curl -X PUT "$UPLOAD_URL" \
  -H "Content-Type: application/pdf" \
  --data-binary @report.pdf

# Step 3: Poll for results
curl -X GET "https://prod.visionapi.unsiloed.ai/parse/$JOB_ID" \
  -H "api-key: your-api-key"
{
  "job_id": "a3f1c2d4-7e8b-4a9f-b2c1-123456789abc",
  "upload_url": "https://upload.visionapi.unsiloed.ai/uploads/a3f1c2d4.../report.pdf?signature=...",
  "expires_at": "2025-10-22T07:06:16Z",
  "upload_method": "PUT",
  "upload_headers": {
    "Content-Type": "application/pdf"
  },
  "credit_used": 0,
  "quota_remaining": 1000
}

Why Use v2

Feature | v1 (POST /parse) | v2 (POST /v2/parse/upload)
File delivery | Through API server | Direct via presigned URL
Max file size | Limited by server upload | Up to 5 GB via direct PUT
Upload speed | Bottlenecked by API server | Full client bandwidth
Concurrency | Shared server capacity | No server contention
Best for | Quick uploads, small files | Large files, high volume

Error Handling

Status | Cause | Action
400 | Invalid or missing file_name, unsupported extension | Fix file_name and retry
401 | Missing or invalid api-key | Check your API key
402 | Insufficient quota: not enough page credits remaining | Add page credits to your account
403 | Access has been revoked | Contact support
429 | Rate limit exceeded | Back off and retry after 1 second
500 | Internal server error | Retry with exponential backoff
503 | Job queue is at capacity | Back off and retry after the Retry-After header value
403 on upload | Missing or wrong headers, expired URL | Check upload_headers, get a new URL
Job status Failed | Processing error | Check the message field in the status response

Authorizations

Authorization
string
header
required

API key for authentication. Use 'Bearer <your_api_key>'

Body

application/json

Request body for POST /v2/parse/upload. Configuration fields mirror the existing /parse multipart form fields.

file_name
string
required

File name with extension. Determines the Content-Type for the presigned upload URL.

agentic_ocr
string | null

Enable per-segment agentic OCR for higher accuracy. Pass "standard" or "advanced".

chunk_processing
any

JSON object for chunk processing configuration.

detect_checkboxes
boolean | null

Detect checkboxes in document images. Defaults to false.

error_handling
string | null

Error handling strategy: Continue (default) or Fail.

expires_in
integer<int32> | null

Seconds until the task and its output are deleted.

export_format
enum<string>[] | null

Export format(s) to generate after processing. Currently supported: ["docx"].

File format for exporting parsed results. When specified in a parse request, the pipeline generates the requested export file after processing completes. The exported file is available via the exports field in the task response.

Available options:
docx
extract_charts
boolean | null

Extract structured data from charts and graphs. Defaults to false.

extract_colors
boolean | null

Transfer text color from the PDF text layer to OCR results. Defaults to false.

Attach hyperlink URLs from PDF annotations to OCR results. Defaults to false.

extract_strikethrough
boolean | null

Preserve strikethrough formatting in HTML/Markdown output. Defaults to false.

layout_analysis
string | null

Layout analysis strategy: smart_layout_detection (default), page_by_page, or advanced_layout_detection.

llm_processing
any

JSON object for LLM processing configuration.

merge_batch_size
integer<int32> | null

Maximum number of tables per merge group. Groups larger than this are split. Defaults to 20.

merge_tables
boolean | null

Merge tables that span multiple pages into a single unified structure. Defaults to false.

ocr_engine
string | null

OCR engine: UnsiloedHawk (default, recommended), UnsiloedBeta, or UnsiloedStorm.

ocr_strategy
string | null

OCR strategy: auto_detection (default) or force_ocr.

output_fields
any

JSON object controlling which output fields are included in the response.

page_range
string | null

Page range to process. Formats: "1-5", "2,4,6", "[1,3,5]". Defaults to all pages.

segment_filter
string | null

Segment filter: comma-separated segment types to keep, or "all". Defaults to "all".

segment_processing
any

JSON object for segment processing/analysis configuration.

segment_type_naming
string | null

Segment type naming convention: Unsiloed (default) or Other.

use_high_resolution
boolean | null

Use high-resolution images for cropping and post-processing. Defaults to true.

validate_segments
any

JSON array of segment types to validate with VLM. Example: ["Table","Formula"].

validate_table_segments
boolean | null

Legacy: validate table segments using VLM. Prefer validate_segments instead.

xml_citation
boolean | null

Extract and hyperlink bibliography citations in the markdown output. Defaults to false.

Response

Upload URL created

Response from POST /v2/parse/upload.

credit_used
integer<int32>
required

Number of page credits deducted (initial reservation, reconciled on upload).

expires_at
string
required

RFC 3339 timestamp when upload_url expires.

job_id
string
required

Use this ID to poll GET /parse/{job_id} for status.

quota_remaining
integer<int32>
required

Remaining page credits after deduction.

upload_headers
object
required

Headers the client MUST include in the PUT request.

upload_method
string
required

Always "PUT".

upload_url
string
required

S3 presigned PUT URL. Valid until expires_at.