Parse Document (Presigned Upload)
Parse and segment PDFs, images, and Office files into meaningful sections using advanced AI models with flexible customization options.
Overview
The v2 presigned upload endpoint decouples document delivery from job creation. Instead of uploading your file through the API server, you:
- POST to /v2/parse/upload with your configuration: the API creates a parse job and returns a short-lived presigned URL.
- PUT your file directly to the presigned URL: the API server is never in the transfer path.
- Once the upload completes, the job is automatically enqueued for processing.
- Poll GET /parse/{job_id} to track progress and retrieve results (the same endpoint as v1).
A new job starts in AwaitingUpload. It transitions to Queued once the upload completes, then to Processing, and finally Succeeded or Failed. See Get Parse Job Status for the full response schema and polling examples.
Request
file_name: File name with extension (e.g. "report.pdf"). Required. Determines the MIME type for the presigned URL. Supported formats: PDF, images (PNG, JPG/JPEG, TIFF/TIF, WebP), office documents (DOC, DOCX, PPT, PPTX, XLS, XLSX), and HTML (HTML/HTM).
Defaults to false.
Layout analysis strategy: smart_layout_detection (default), page_by_page, or advanced_layout_detection.
- "smart_layout_detection" (default): Identifies document structure, headers, sections, and content relationships across the entire document using bounding boxes.
- "page_by_page": Analyzes each page independently as a single segment. Faster for simple documents.
- "advanced_layout_detection": Uses a vision-language model for exhaustive page segmentation. Detects 14 element types (Caption, Footnote, Formula, ListItem, PageFooter, PageHeader, Picture, SectionHeader, Table, Text, Title, KeyValuePair, Signature, Seal). Best for visually complex or unusual layouts.
OCR strategy: auto_detection (default) or force_ocr.
- "auto_detection" (default): Detects poor-quality PDFs, scanned documents, and images, then applies OCR only where needed.
- "force_ocr": Runs OCR on the entire document regardless of quality.
OCR engine: UnsiloedHawk (default, recommended), UnsiloedBeta, or UnsiloedStorm.
- "UnsiloedHawk" (recommended, default): Higher accuracy for complex layouts and mixed content.
- "UnsiloedBeta": Handles rotated/warped text and irregular bounding boxes.
- "UnsiloedStorm": Enterprise-grade accuracy optimized for 50+ languages.
Enable per-segment agentic OCR for higher accuracy. Pass "standard" or "advanced".
- "standard": Good balance of speed and accuracy.
- "advanced": Higher quality, best for complex layouts, rotated text, and mixed-language content.
Defaults to false.
Defaults to false.
Maximum number of tables per merge group. Groups larger than this are split. Defaults to 20; values below 2 are clamped to 2.
Defaults to false.
Defaults to false.
Severity threshold to block on when detect_pii is enabled: any (default), low, medium, or high. any blocks on any PII found; low adds quasi-identifiers (names, dates, locations); medium blocks on contact PII (email, phone) or higher; high blocks only on direct identifiers (SSN, passport, credit card). Ignored if detect_pii is false.
PII detector engine: standard (default) or advanced (higher precision, additional processing cost). Any other value falls back to standard. Ignored if detect_pii is false.
Defaults to false.
Defaults to false.
Defaults to false.
Page range to process. Formats: "1-5" (pages 1 to 5), "2,4,6" (specific pages), "[1,3,5]" (array format), "1-3,7,10-15" (combination).
Export format(s) to generate after processing. Supported: ["docx", "markdown", "json"]. The exported files are available via the exports field in the task response.
Segment type naming convention: "Unsiloed" (default) uses names like PageHeader, ListItem, Picture. "Other" uses alternative names like Header, List Item, Figure.
Segment filter: comma-separated segment types to keep, or "all" to include every type. Defaults to "all".
validate_segments: JSON array of segment types to validate with VLM. Example: ["Table", "Formula", "Picture"]. Defaults to ["Table", "Picture"]; an empty or unparseable value also falls back to that default, so Table and Picture validation runs even when this field is omitted.
Legacy: validate table segments using VLM. Prefer validate_segments instead.
JSON object for segment processing configuration, sent as segment_processing. The v1 alias segment_analysis is not accepted here and is silently ignored.
- html / markdown: "VLM" or "Auto"
- model_id (Table): "astra", "us_table_v1", "us_table_v2"
- model_id (Picture/Formula): "nova", "luna", "sol"
- use_table_ocr (Table only): Advanced OCR for bordered cells and complex table layouts.
- vlm: Custom prompt for the VLM model. Use this to give the model specific instructions for extracting or describing these segment types.
- translation: Optional per-segment translation, e.g. {"provider": "Auto", "target_language": "en"}. provider is "Auto" for fast machine translation or "VLM"/"LLM" for model-based translation; target_language is an ISO 639-1 code, or "auto" to auto-detect the source and translate to English.
JSON object controlling which output fields are included in the response. Set fields to false to exclude them and reduce response size. Available fields: html, markdown, ocr, image, content, bbox, confidence, embed, chart_data. All default to true.
Error handling strategy: "Continue" (default) skips failed pages and continues processing the rest. "Fail" aborts the entire job on the first error.
Response
job_id: Use this ID with GET /parse/{job_id} to poll status and retrieve results.
upload_url: S3 presigned PUT URL. No api-key header is needed for the upload itself. Valid until expires_at.
expires_at: RFC 3339 timestamp when upload_url expires. Upload must complete before this time. The presigned URL is valid for 15 minutes by default; when expires_in is set (and positive), the upload URL lifetime is set to that same value, which can shorten or extend the window.
Always "PUT". Use an HTTP PUT request when uploading to upload_url.
upload_headers: Headers the client MUST include in the PUT request. Includes Content-Type set to the MIME type inferred from file_name.
Step-by-Step Guide
Step 1: Create the parse job
POST to /v2/parse/upload with your file name and configuration. The API returns a job_id and a short-lived presigned URL.
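Step 1 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not an official client: the host in BASE_URL is a placeholder, the JSON body encoding and the api-key header name are assumptions (the Authorizations section describes a 'Bearer <your_api_key>' form, so use whichever applies to your account), and build_create_job_request is a hypothetical helper. The request is only built, not sent.

```python
import json
import urllib.request

# Assumption: placeholder host; this page does not state the API base URL.
BASE_URL = "https://api.example.com"


def build_create_job_request(api_key, file_name, **config):
    """Build (but do not send) the POST /v2/parse/upload request."""
    # Assumption: a JSON body. The reference only says the fields mirror the
    # /parse multipart form fields, so change the encoding if required.
    body = json.dumps({"file_name": file_name, **config}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v2/parse/upload",
        data=body,
        method="POST",
        headers={
            # Assumption: the key travels in an `api-key` header.
            "api-key": api_key,
            "Content-Type": "application/json",
        },
    )


request = build_create_job_request("YOUR_API_KEY", "report.pdf", detect_pii=False)
# To send it: response = json.load(urllib.request.urlopen(request))
```

The decoded response is expected to carry job_id, upload_url, upload_headers, and expires_at, which the next two steps use.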
Step 2: Upload the file
Use the upload_url and upload_headers from the response to PUT your file. No API key is needed for this request.
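A minimal sketch of the upload request, assuming upload_headers is a JSON object mapping header names to values. build_upload_request is a hypothetical helper, and the response values below are illustrative only. The request is built but not sent.

```python
import urllib.request


def build_upload_request(create_response, file_bytes):
    """Build the PUT request for the presigned URL from the Step 1 response."""
    return urllib.request.Request(
        create_response["upload_url"],
        data=file_bytes,
        method="PUT",
        # Send the returned headers as-is; no API key is added here.
        headers=dict(create_response["upload_headers"]),
    )


# Illustrative values only; real ones come from the Step 1 response.
example_response = {
    "upload_url": "https://bucket.example.com/uploads/report.pdf?signature=abc",
    "upload_headers": {"Content-Type": "application/pdf"},
}
upload = build_upload_request(example_response, b"%PDF-1.7 example bytes")
# To send it: urllib.request.urlopen(upload)  # raises HTTPError on a 403
```

Passing the returned headers through unchanged matters because a missing or wrong header is one of the listed causes of a 403 on upload.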
Step 3: Poll for results
Once the upload completes, the job transitions from AwaitingUpload → Queued → Processing → Succeeded (or Failed). Poll GET /parse/{job_id}, the same endpoint as v1, until the job reaches one of those final statuses.
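The polling loop might look like the sketch below. fetch_status is a hypothetical callable that performs GET /parse/{job_id} and returns the decoded JSON body. The "status" key name, the 2-second interval, and the 10-minute timeout are assumptions chosen for illustration.

```python
import time

TERMINAL_STATES = {"Succeeded", "Failed"}


def wait_for_job(fetch_status, job_id, interval=2.0, timeout=600.0, sleep=time.sleep):
    """Poll until the job reaches Succeeded or Failed, or the timeout elapses."""
    waited = 0.0
    while True:
        job = fetch_status(job_id)
        # Assumption: the job state is exposed under a "status" key.
        if job.get("status") in TERMINAL_STATES:
            return job
        if waited >= timeout:
            raise TimeoutError(f"job {job_id} not finished after {timeout} seconds")
        sleep(interval)
        waited += interval
```

Injecting fetch_status and sleep keeps the loop independent of any HTTP library and easy to exercise without network access. On a Failed result, the message field in the status response explains the cause.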
Why Use v2
| Feature | v1 (POST /parse) | v2 (POST /v2/parse/upload) |
|---|---|---|
| File delivery | Through API server | Direct via presigned URL |
| Max file size | Limited by server upload | Up to 5 GB via direct PUT |
| Upload speed | Bottlenecked by API server | Full client bandwidth |
| Concurrency | Shared server capacity | No server contention |
| Best for | Quick uploads, small files | Large files, high volume |
Error Handling
| Status | Cause | Action |
|---|---|---|
| 400 | Invalid or missing file_name, unsupported extension | Fix file_name and retry |
| 401 | Missing or invalid api-key | Check your API key |
| 402 | Insufficient quota or expired credits | Add page credits to your account or renew your plan |
| 403 | Access has been revoked | Contact support |
| 429 | Rate limit exceeded | Back off and retry after 1 second |
| 500 | Internal server error | Retry with exponential backoff |
| 503 | Job queue is at capacity | Back off and retry after the Retry-After header value |
| 403 on upload | Missing or wrong headers, expired URL | Check upload_headers, get a new URL |
| Job status Failed | Processing error | Check message field in the status response |
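The retry guidance in the table can be expressed as a small decision function. This is a sketch, not a prescribed policy: retry_delay is a hypothetical helper, the Retry-After value is assumed to be given in seconds, and the 1-second fallback, 1-second backoff base, and 30-second cap are illustrative choices.

```python
def retry_delay(status, attempt, retry_after=None):
    """Return seconds to wait before retrying, or None if retrying will not help.

    status: HTTP status returned by POST /v2/parse/upload
    attempt: zero-based count of retries already made
    retry_after: value of the Retry-After header, if present
    """
    if status == 429:
        return 1.0  # rate limited: back off and retry after 1 second
    if status == 503:
        # Queue at capacity: honor Retry-After (assumed to be in seconds).
        return float(retry_after) if retry_after is not None else 1.0
    if status == 500:
        # Exponential backoff: 1, 2, 4, ... seconds, capped at 30.
        return min(2.0 ** attempt, 30.0)
    return None  # 400, 401, 402, 403: fix the request, key, credits, or access
```

For example, retry_delay(500, 3) gives 8.0 seconds, while retry_delay(401, 0) returns None because a bad API key will not succeed on retry.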
A job can go from Queued straight to Failed without ever reaching Processing: after the upload, the document is page-counted and rejected if it exceeds the page limit (default 2000 pages; split large documents first) or if the remaining page quota cannot cover it.
Authorizations
API key for authentication. Use 'Bearer <your_api_key>'
Body
Request body for POST /v2/parse/upload. Configuration fields mirror the existing /parse multipart form fields.
File name with extension. Required. Determines content-type.
Enable per-segment agentic OCR for higher accuracy. Pass "standard" or "advanced".
JSON object for chunk processing configuration.
Run a PII pre-check before parsing. If PII is found at the configured severity,
the task is rejected without parsing. Defaults to false.
Fix the reading order of detected segments. Useful for multi-column layouts. Defaults to false.
Error handling strategy: Continue (default) or Fail.
Seconds until the task and its output are deleted.
Export format(s) to generate after processing. Supported: ["docx", "markdown", "json"].
File format for exporting parsed results. When specified in a parse request,
the pipeline generates the requested export file after processing completes.
The exported file is available via the exports field in the task response.
Supported values: docx, markdown, json.
Transfer text color from the PDF text layer to OCR results. Defaults to false.
Attach hyperlink URLs from PDF annotations to OCR results. Defaults to false.
Preserve strikethrough formatting in HTML/Markdown output. Defaults to false.
Layout analysis strategy: smart_layout_detection (default), page_by_page, or advanced_layout_detection (VLM-based exhaustive segmentation with 14 element types).
JSON object for LLM processing configuration.
Maximum number of tables per merge group. Groups larger than this are split. Defaults to 20.
Merge tables that span multiple pages into a single unified structure. Defaults to false.
OCR engine: UnsiloedHawk (default, recommended), UnsiloedBeta, or UnsiloedStorm.
OCR strategy: auto_detection (default) or force_ocr.
JSON object controlling which output fields are included in the response.
Page range to process. Formats: "1-5", "2,4,6", "[1,3,5]". Defaults to all pages.
Severity threshold to block on: any (default), low, medium, high.
PII detector engine: standard (default) or advanced (higher precision, additional processing cost). Any other value falls back to standard. Ignored if detect_pii is false.
Segment filter: comma-separated segment types to keep, or "all". Defaults to "all".
JSON object for segment processing/analysis configuration.
Segment type naming convention: Unsiloed (default) or Other.
Use high-resolution images for cropping and post-processing. Defaults to false.
JSON array of segment types to validate with VLM. Example: ["Table","Formula"].
Legacy: validate table segments using VLM. Prefer validate_segments instead.
Extract and hyperlink bibliography citations in the markdown output. Defaults to false.
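To make the page range formats listed above concrete ("1-5", "2,4,6", "[1,3,5]", and combinations such as "1-3,7,10-15"), the sketch below expands such a string on the client side, for example to validate input before submitting a job. parse_page_range is a hypothetical helper. How the API itself treats duplicates, ordering, or out-of-range pages is not documented here, so the 1-based check and the de-duplication are assumptions.

```python
def parse_page_range(spec):
    """Expand a page range string into a sorted list of page numbers."""
    text = spec.strip()
    if text.startswith("[") and text.endswith("]"):
        text = text[1:-1]  # array format, e.g. "[1,3,5]"
    pages = set()
    for part in text.split(","):
        part = part.strip()
        if "-" in part:
            start, end = (int(p) for p in part.split("-", 1))
            if start > end:
                raise ValueError(f"descending range: {part!r}")
            pages.update(range(start, end + 1))
        else:
            pages.add(int(part))
    # Assumption: pages are numbered from 1.
    if any(p < 1 for p in pages):
        raise ValueError("page numbers start at 1")
    return sorted(pages)
```

Malformed input such as an empty entry or a non-numeric token raises ValueError from int(), which is usually the desired behavior for a pre-submission check.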
Response
Upload URL created
Response from POST /v2/parse/upload.
Number of page credits deducted (initial reservation, reconciled on upload).
RFC 3339 timestamp when upload_url expires.
Use this ID to poll GET /parse/{job_id} for status.
Remaining page credits after deduction.
Headers the client MUST include in the PUT request.
Always "PUT".
S3 presigned PUT URL. Valid until expires_at.

