POST /v2/parse/upload

Overview

The v2 presigned upload endpoint decouples document delivery from job creation. Instead of uploading your file through the API server, you:
  1. POST to /v2/parse/upload with your configuration — the API creates a parse job and returns a short-lived presigned URL.
  2. PUT your file directly to the presigned URL — the API server is never in the transfer path.
  3. Once the upload completes, the job is automatically enqueued for processing.
  4. Poll GET /parse/{job_id} to track progress and retrieve results — the same endpoint as v1.
This is the recommended endpoint for large files and high-throughput workloads. It supports faster uploads, larger file sizes, and higher concurrency than the standard Parse Document endpoint.
The job status starts as AwaitingUpload. It transitions to Queued once the upload completes, then to Processing, and finally Succeeded or Failed. See Get Parse Job Status for the full response schema and polling examples.
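The lifecycle above can be sketched as a small transition map. The states come from this page; the map and helper are a client-side illustration, not part of the API:

```python
# Illustrative job-status lifecycle for a v2 presigned-upload job.
# States are taken from this page; the map itself is a client-side sketch.
TRANSITIONS = {
    "AwaitingUpload": ["Queued"],           # upload completed
    "Queued": ["Processing"],               # worker picked up the job
    "Processing": ["Succeeded", "Failed"],  # terminal outcomes
    "Succeeded": [],
    "Failed": [],
}

def is_terminal(status: str) -> bool:
    """True when polling can stop."""
    return not TRANSITIONS.get(status, [])

print(is_terminal("Processing"))  # False
print(is_terminal("Succeeded"))   # True
```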

Request

file_name
string
required
File name with extension (e.g. "report.pdf"). Determines the MIME type for the presigned URL. Supported formats: PDF, images (PNG, JPEG, TIFF), and office documents (DOCX, PPTX, XLSX).
use_high_resolution
boolean
Enhanced image quality — enables high-resolution processing with upscaling algorithms. Improves OCR accuracy on low-quality scans by enhancing clarity and contrast. Latency penalty: ~2–3 seconds per page. Defaults to true.
layout_analysis
string
Layout analysis strategy: smart_layout_detection (default), page_by_page, or advanced_layout_detection.
  • "smart_layout_detection" (default): Intelligently identifies document structure, headers, sections, and content relationships across the entire document using bounding boxes.
  • "page_by_page": Analyzes each page independently as a single segment. Faster for simple documents.
  • "advanced_layout_detection": Uses a vision-language model for exhaustive page segmentation. Detects 14 element types (Caption, Footnote, Formula, ListItem, PageFooter, PageHeader, Picture, SectionHeader, Table, Text, Title, KeyValuePair, Signature, Seal). Best for visually complex or unusual layouts.
ocr_strategy
string
OCR strategy: auto_detection (default) or force_ocr.
  • "auto_detection" (default): Intelligently detects bad quality PDFs, scanned documents, and images, then applies OCR only where needed.
  • "force_ocr": Runs OCR on the entire document regardless of quality.
ocr_engine
string
OCR engine to use for text recognition: UnsiloedHawk (default, recommended), UnsiloedBeta, or UnsiloedStorm.
  • "UnsiloedHawk" (Recommended, default): Higher accuracy for complex layouts and mixed content.
  • "UnsiloedBeta": Handles rotated/warped text and irregular bounding boxes.
  • "UnsiloedStorm": Enterprise-grade accuracy optimized for 50+ languages.
agentic_ocr
string
Per-segment OCR enhancement — re-runs a dedicated agentic OCR model on each detected segment after layout detection for higher accuracy. Omit or leave empty to disable.
  • "standard": Good balance of speed and accuracy.
  • "advanced": Higher quality, best for complex layouts, rotated text, and mixed-language content.
extract_strikethrough
boolean
Strikethrough detection — detects and preserves strikethrough formatting in HTML and Markdown output. Defaults to false.
merge_tables
boolean
Cross-page table consolidation — detects and combines table segments across page breaks. Reconstructs complete table structure by matching headers and columns. Defaults to false.
merge_batch_size
integer
Maximum number of tables per merge group. Groups larger than this are split. Defaults to 20.
extract_charts
boolean
Chart data extraction — extract structured data from charts and graphs, including data points and chart type information. Defaults to false.
detect_checkboxes
boolean
Checkbox detection — detect and identify checkboxes in the document with their bounding box locations. Defaults to false.
Attach hyperlink URLs from PDF annotations to OCR results. Defaults to false.
extract_colors
boolean
Transfer text color from the PDF text layer to OCR results. Defaults to false.
xml_citation
boolean
Citation extraction — extracts academic citations from PDF documents and hyperlinks them in the markdown output. Generates structured bibliography metadata. PDFs only. Defaults to false.
page_range
string
Specify which pages to process. Leave empty to process all pages. Examples: "1-5" (pages 1 to 5), "2,4,6" (specific pages), "[1,3,5]" (array format), "1-3,7,10-15" (combination).
expires_in
integer
Seconds until the task and its output are deleted. Defaults to the plan expiration time.
export_format
array
Export format(s) to generate after processing. Currently supported: ["docx"]. The exported file is available via the exports field in the task response.
segment_type_naming
string
Segment type naming convention. "Unsiloed" (default) — e.g., PageHeader, ListItem, Picture. "Other" — alternative names e.g., Header, List Item, Figure.
segment_filter
string
Comma-separated segment types to keep in the output, or "all" to include every type. Defaults to "all".
validate_segments
array
Segment validation — uses a vision model to validate and correct segment types, fixing misclassified segments. Example: ["Table", "Formula", "Picture"].
validate_table_segments
boolean
Legacy option to validate table segments using a vision model. Prefer validate_segments instead.
chunk_processing
object
JSON object controlling how segments are grouped into chunks. See the v1 parse endpoint for full schema details.
segment_processing
object
JSON object for segment processing and analysis configuration.
segment_analysis
object
Configure how different segment types are processed — table models, image descriptions, and formula processing.
{
  "Table": {"html": "VLM", "markdown": "VLM", "model_id": "us_table_v2"},
  "Picture": {"html": "VLM", "markdown": "VLM", "model_id": "nova"},
  "Formula": {"html": "Auto", "markdown": "VLM", "model_id": "nova"}
}
Options per segment type:
  • html / markdown: "VLM" or "Auto"
  • model_id (Table): "astra", "us_table_v1", "us_table_v2"
  • model_id (Picture/Formula): "nova", "luna", "sol"
  • use_table_ocr (Table only): Advanced OCR for bordered cells and complex table layouts.
  • vlm: Custom prompt for the VLM model. Use this to give the model specific instructions for extracting or describing these segment types.
For full details, see Parse Document (v1).
llm_processing
object
JSON object for LLM processing configuration.
output_fields
object
JSON object controlling which fields are included in the output. Set fields to false to exclude them and reduce response size. Available fields: html, markdown, ocr, image, content, bbox, confidence, embed. All default to true.
error_handling
string
How to handle per-page errors during processing. "Continue" (default) — skips failed pages and continues processing the rest. "Fail" — aborts the entire job on the first error.
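As a client-side sanity check, the page_range formats documented above can be expanded locally before sending the request. This helper is our own sketch; the server does its own parsing:

```python
def expand_page_range(spec: str):
    """Expand a page_range spec into a sorted list of page numbers.

    Handles the documented formats: "1-5", "2,4,6", "[1,3,5]",
    and combinations like "1-3,7,10-15". An empty spec means
    "all pages" and is returned as None.
    """
    if not spec:
        return None
    # The array format "[1,3,5]" is just the comma form in brackets.
    spec = spec.strip().strip("[]")
    pages = set()
    for part in spec.split(","):
        part = part.strip()
        if "-" in part:
            lo, hi = part.split("-", 1)
            pages.update(range(int(lo), int(hi) + 1))
        else:
            pages.add(int(part))
    return sorted(pages)

print(expand_page_range("1-3,7,10-15"))  # [1, 2, 3, 7, 10, 11, 12, 13, 14, 15]
```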

Response

job_id
string
Unique identifier for the parse job. Use this with GET /parse/{job_id} to poll status and retrieve results.
upload_url
string
Presigned PUT URL. Upload your file directly to this URL — no api-key header needed for the upload itself. Valid until expires_at.
expires_at
string
RFC 3339 timestamp when upload_url expires. Upload must complete before this time.
upload_method
string
Always "PUT". Use an HTTP PUT request when uploading to upload_url.
upload_headers
object
Key-value headers you MUST include in your PUT request. Always includes Content-Type set to the MIME type inferred from file_name.
credit_used
integer
Number of page credits deducted as an initial reservation, reconciled on upload.
quota_remaining
integer
Remaining page credits after deduction.

Step-by-Step Guide

Step 1 — Create the parse job

POST to /v2/parse/upload with your file name and configuration. The API returns a job_id and a short-lived presigned URL.
import requests

API_KEY = "your-api-key"
BASE_URL = "https://prod.visionapi.unsiloed.ai"

response = requests.post(
    f"{BASE_URL}/v2/parse/upload",
    headers={"api-key": API_KEY, "Content-Type": "application/json"},
    json={
        "file_name": "report.pdf",
        "use_high_resolution": True,
        "layout_analysis": "smart_layout_detection",
        "ocr_strategy": "auto_detection",
        "extract_strikethrough": False,
        "merge_tables": False,
    },
)
response.raise_for_status()
data = response.json()

job_id = data["job_id"]
upload_url = data["upload_url"]
upload_headers = data["upload_headers"]

print(f"Job ID: {job_id}")
print(f"Upload URL expires: {data['expires_at']}")
The response contains the presigned URL and required headers:
{
  "job_id": "a3f1c2d4-7e8b-4a9f-b2c1-123456789abc",
  "upload_url": "https://upload.visionapi.unsiloed.ai/uploads/a3f1c2d4.../report.pdf?signature=...",
  "expires_at": "2025-10-22T07:06:16Z",
  "upload_method": "PUT",
  "upload_headers": {
    "Content-Type": "application/pdf"
  },
  "credit_used": 0,
  "quota_remaining": 1000
}

Step 2 — Upload the file

Use the upload_url and upload_headers from the response to PUT your file. No API key is needed for this request.
with open("report.pdf", "rb") as f:
    put_response = requests.put(upload_url, headers=upload_headers, data=f)
put_response.raise_for_status()

print("Upload complete — job is now queued for processing")
You must include every header listed in upload_headers. A missing or mismatched Content-Type will cause the upload to be rejected with a 403.

Step 3 — Poll for results

Once the upload completes, the job transitions from AwaitingUpload → Queued → Processing → Succeeded. Poll GET /parse/{job_id} using the same endpoint as v1.
import time

while True:
    status_response = requests.get(
        f"{BASE_URL}/parse/{job_id}",
        headers={"api-key": API_KEY},
    )
    status_response.raise_for_status()
    job = status_response.json()
    print(f"Status: {job['status']}")

    if job["status"] == "Succeeded":
        print(f"Done! {job['total_chunks']} chunks extracted.")
        break
    elif job["status"] == "Failed":
        raise RuntimeError(f"Job failed: {job.get('message')}")

    time.sleep(5)
See Get Parse Job Status for the full response schema.
The complete flow in cURL:
# Step 1: Create job and get presigned URL
RESPONSE=$(curl -s -X POST https://prod.visionapi.unsiloed.ai/v2/parse/upload \
  -H "api-key: your-api-key" \
  -H "Content-Type: application/json" \
  -d '{
    "file_name": "report.pdf",
    "use_high_resolution": true,
    "layout_analysis": "smart_layout_detection",
    "ocr_strategy": "auto_detection",
    "extract_strikethrough": false,
    "merge_tables": false
  }')

JOB_ID=$(echo "$RESPONSE" | jq -r '.job_id')
UPLOAD_URL=$(echo "$RESPONSE" | jq -r '.upload_url')

# Step 2: Upload file (include every header from upload_headers)
curl -X PUT "$UPLOAD_URL" \
  -H "Content-Type: application/pdf" \
  --data-binary @report.pdf

# Step 3: Poll for results
curl -X GET "https://prod.visionapi.unsiloed.ai/parse/$JOB_ID" \
  -H "api-key: your-api-key"
{
  "job_id": "a3f1c2d4-7e8b-4a9f-b2c1-123456789abc",
  "upload_url": "https://upload.visionapi.unsiloed.ai/uploads/a3f1c2d4.../report.pdf?signature=...",
  "expires_at": "2025-10-22T07:06:16Z",
  "upload_method": "PUT",
  "upload_headers": {
    "Content-Type": "application/pdf"
  },
  "credit_used": 0,
  "quota_remaining": 1000
}

Why Use v2

Feature | v1 (POST /parse) | v2 (POST /v2/parse/upload)
File delivery | Through API server | Direct via presigned URL
Max file size | Limited by server upload | Up to 5 GB via direct PUT
Upload speed | Bottlenecked by API server | Full client bandwidth
Concurrency | Shared server capacity | No server contention
Best for | Quick uploads, small files | Large files, high volume

Error Handling

Status | Cause | Action
400 | Invalid or missing file_name, unsupported extension | Fix file_name and retry
401 | Missing or invalid api-key | Check your API key
402 | Insufficient quota: not enough page credits remaining | Add page credits to your account
403 | Access has been revoked | Contact support
429 | Rate limit exceeded | Back off and retry after 1 second
500 | Internal server error | Retry with exponential backoff
503 | Job queue is at capacity | Back off and retry after the Retry-After header value
403 on upload | Missing or wrong headers, expired URL | Check upload_headers, get a new URL
Job status Failed | Processing error | Check the message field in the status response

Authorizations

Authorization
string
header
required

API key for authentication. Use 'Bearer <your_api_key>'

Body

application/json

Request body for POST /v2/parse/upload. Configuration fields mirror the existing /parse multipart form fields.

file_name
string
required

File name with extension. Determines the Content-Type for the presigned upload URL.

agentic_ocr
string | null

Enable per-segment agentic OCR for higher accuracy. Pass "standard" or "advanced".

chunk_processing
any

JSON object for chunk processing configuration.

detect_checkboxes
boolean | null

Detect checkboxes in document images. Defaults to false.

error_handling
string | null

Error handling strategy: Continue (default) or Fail.

expires_in
integer<int32> | null

Seconds until the task and its output are deleted.

export_format
enum<string>[] | null

Export format(s) to generate after processing. Currently supported: ["docx"].

File format for exporting parsed results. When specified in a parse request, the pipeline generates the requested export file after processing completes. The exported file is available via the exports field in the task response.

Available options:
docx
extract_charts
boolean | null

Extract structured data from charts and graphs. Defaults to false.

extract_colors
boolean | null

Transfer text color from the PDF text layer to OCR results. Defaults to false.

Attach hyperlink URLs from PDF annotations to OCR results. Defaults to false.

extract_strikethrough
boolean | null

Preserve strikethrough formatting in HTML/Markdown output. Defaults to false.

layout_analysis
string | null

Layout analysis strategy: smart_layout_detection (default), page_by_page, or advanced_layout_detection.

llm_processing
any

JSON object for LLM processing configuration.

merge_batch_size
integer<int32> | null

Maximum number of tables per merge group. Groups larger than this are split. Defaults to 20.

merge_tables
boolean | null

Merge tables that span multiple pages into a single unified structure. Defaults to false.

ocr_engine
string | null

OCR engine: UnsiloedHawk (default, recommended), UnsiloedBeta, or UnsiloedStorm.

ocr_strategy
string | null

OCR strategy: auto_detection (default) or force_ocr.

output_fields
any

JSON object controlling which output fields are included in the response.

page_range
string | null

Page range to process. Formats: "1-5", "2,4,6", "[1,3,5]". Defaults to all pages.

segment_filter
string | null

Segment filter: comma-separated segment types to keep, or "all". Defaults to "all".

segment_processing
any

JSON object for segment processing/analysis configuration.

segment_type_naming
string | null

Segment type naming convention: Unsiloed (default) or Other.

use_high_resolution
boolean | null

Use high-resolution images for cropping and post-processing. Defaults to true.

validate_segments
any

JSON array of segment types to validate with VLM. Example: ["Table","Formula"].

validate_table_segments
boolean | null

Legacy: validate table segments using VLM. Prefer validate_segments instead.

xml_citation
boolean | null

Extract and hyperlink bibliography citations in the markdown output. Defaults to false.

Response

Upload URL created

Response from POST /v2/parse/upload.

credit_used
integer<int32>
required

Number of page credits deducted (initial reservation, reconciled on upload).

expires_at
string
required

RFC 3339 timestamp when upload_url expires.

job_id
string
required

Use this ID to poll GET /parse/{job_id} for status.

quota_remaining
integer<int32>
required

Remaining page credits after deduction.

upload_headers
object
required

Headers the client MUST include in the PUT request.

upload_method
string
required

Always "PUT".

upload_url
string
required

S3 presigned PUT URL. Valid until expires_at.