Parse and segment PDFs, images, and Office files into meaningful sections using advanced AI models with flexible customization options.
curl -X 'POST' \
'https://prod.visionapi.unsiloed.ai/parse' \
-H 'accept: application/json' \
-H 'api-key: your-api-key' \
-H 'Content-Type: multipart/form-data' \
-F 'file=@document.pdf;type=application/pdf' \
-F 'use_high_resolution=true' \
-F 'segmentation_method=smart_layout_detection' \
-F 'ocr_mode=auto_ocr' \
-F 'ocr_engine=UnsiloedHawk' \
-F 'validate_table_segments=false' \
-F 'merge_tables=true' \
-F 'keep_segment_types=all' \
-F 'validate_segments=["Table","Picture","Formula"]' \
-F 'segment_analysis={"Table":{"html":"LLM","markdown":"LLM","extended_context":true,"crop_image":"All","model_id":"us_table_v2"}}'
# Alternative: Use presigned URL instead of file upload
# Replace the file parameter with url parameter:
# -F 'url=https://your-bucket.s3.amazonaws.com/document.pdf?signature=...' \
{
"job_id": "e77a5c42-4dc1-44d0-a30e-ed191e8a8908",
"status": "Starting",
"file_name": "document.pdf",
"created_at": "2025-07-18T10:42:10.545832520Z",
"message": "Task created successfully. Use GET /parse/{job_id} to check status and retrieve results.",
"credit_used": 5,
"quota_remaining": 23695,
"merge_tables": false
}
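The multipart fields shown in the curl example can also be assembled programmatically. Below is a minimal sketch in Python; the helper name build_parse_form is illustrative, not part of the API. Note that JSON-valued parameters such as validate_segments and segment_analysis must be serialized to strings, since every multipart/form-data field travels as text:

```python
import json

def build_parse_form(use_high_resolution=True,
                     segmentation_method="smart_layout_detection",
                     ocr_mode="auto_ocr",
                     merge_tables=False,
                     validate_segments=None,
                     segment_analysis=None):
    """Assemble the multipart form fields for POST /parse.

    JSON-valued parameters are serialized to strings, since every
    multipart/form-data field travels as text.
    """
    form = {
        "use_high_resolution": "true" if use_high_resolution else "false",
        "segmentation_method": segmentation_method,
        "ocr_mode": ocr_mode,
        "merge_tables": "true" if merge_tables else "false",
    }
    if validate_segments is not None:
        form["validate_segments"] = json.dumps(validate_segments)
    if segment_analysis is not None:
        form["segment_analysis"] = json.dumps(segment_analysis)
    return form
```

Pass the result as `data=` to `requests.post`, alongside `files={"file": open("document.pdf", "rb")}` and the `api-key` header.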
The newer endpoint (POST /v2/parse/upload) supports faster uploads, larger file sizes, and higher throughput.

Key Parameters:

file: Document to process. Required if url is not provided. Provide either the file or url parameter; both cannot be provided simultaneously.
url: Presigned or public URL of the document. Required if file is not provided.
use_high_resolution: Use high-resolution images for cropping and post-processing. Defaults to true.
segmentation_method: "smart_layout_detection" (default) detects layout elements (tables, pictures, formulas, etc.) using bounding boxes; "page_by_page" treats each page as a single segment and is faster for simple documents.
ocr_mode: "auto_ocr" (default) applies OCR only where needed; "force_ocr" applies OCR to all content regardless of existing text layer.
ocr_engine: "UnsiloedBeta" (default) handles rotated/warped text and irregular bounding boxes; "UnsiloedHawk" offers higher accuracy for complex layouts and mixed content; "UnsiloedStorm" provides enterprise-grade accuracy optimized for 50+ languages.
validate_table_segments: Legacy; prefer validate_segments: ["Table"] instead. Defaults to false.
validate_segments: Segment types to validate with VLM, e.g. ["Table", "Formula", "Picture"]. Defaults to [].
keep_segment_types: Comma-separated filter, e.g. "table,picture". Use "all" to include everything. Defaults to "all". Available segment types: table, picture, formula, text, sectionheader, title, listitem, caption, footnote, pageheader, pagefooter.
merge_tables: Merge tables that span multiple pages. Defaults to false.
output_fields: e.g. {"html": false, "markdown": true, "ocr": false}. All fields default to true. Available fields: html, markdown, ocr, image, llm, content, bbox, confidence, embed.
segment_analysis: e.g. {"Table": {"html": "LLM", "markdown": "LLM", "extended_context": true, "crop_image": "All", "model_id": "us_table_v2"}}. An alias parameter (the Core Parser name for segment_analysis) also exists; if both are provided, the alias takes precedence.
Page range: Formats "1-5", "2,4,6", "[1,3,5]". Defaults to all pages.
Segment naming: "Unsiloed" (default), e.g. PageHeader, ListItem, Picture; "Other" uses alternative names, e.g. Header, List Item, Figure.
Several boolean options (checkbox detection, chart extraction, text color transfer, hyperlink attachment) default to false; see the full parameter reference below.
Error handling: "Continue" (default) proceeds despite errors (e.g., LLM refusals); "Fail" stops and fails the task on any error.

High-Accuracy Processing
{
"use_high_resolution": true,
"segmentation_method": "smart_layout_detection",
"ocr_mode": "force_ocr",
"merge_tables": true,
"validate_segments": ["Table", "Picture", "Formula"],
"segment_analysis": {
"Table": {
"html": "LLM",
"markdown": "LLM",
"extended_context": true,
"crop_image": "All",
"model_id": "us_table_v2"
}
}
}
Fast Processing
{
"use_high_resolution": false,
"segmentation_method": "page_by_page",
"ocr_mode": "auto_ocr",
"merge_tables": false
}
Financial Documents (Tables + Charts)
{
"merge_tables": true,
"keep_segment_types": "table,picture",
"validate_segments": ["Table", "Picture"],
"segmentation_method": "smart_layout_detection",
"ocr_mode": "auto_ocr",
"segment_analysis": {
"Table": {
"html": "LLM",
"markdown": "LLM",
"model_id": "us_table_v2"
}
}
}
Data Extraction Only (Tables)
{
"merge_tables": true,
"keep_segment_types": "table",
"validate_segments": ["Table"],
"output_fields": {
"html": true,
"markdown": true,
"ocr": false,
"image": false,
"content": true,
"bbox": false,
"confidence": false
}
}
Academic/Research Documents
{
"use_high_resolution": true,
"segmentation_method": "smart_layout_detection",
"ocr_mode": "auto_ocr",
"xml_citation": true
}
Scanned Documents
{
"use_high_resolution": true,
"ocr_mode": "force_ocr",
"segmentation_method": "smart_layout_detection"
}
Output Fields Optimization
{
"output_fields": {
"html": false,
"markdown": false,
"ocr": false,
"image": false,
"llm": false,
"content": true,
"bbox": false,
"confidence": false,
"embed": true
}
}
{
"output_fields": {
"html": false,
"markdown": true,
"ocr": false,
"image": false,
"llm": false,
"content": true,
"bbox": true,
"confidence": false,
"embed": true
}
}
Omit output_fields or set all fields to true to include all available data.

Input Methods:

File upload (file parameter): Upload the document file directly as multipart/form-data.
URL (url parameter): Provide a publicly accessible URL or presigned URL to the document.
Provide either file or url, but not both. When using url, the document is downloaded from the provided URL before processing.

The segmentation_method parameter controls how the document is analyzed and segmented:
"smart_layout_detection" (default): Analyzes pages for layout elements (e.g., Table, Picture, Formula, etc.) using bounding boxes. Provides fine-grained segmentation and better chunking for complex documents.
"page_by_page": Treats each page as a single segment. Faster processing, ideal for simple documents without complex layouts.
The ocr_mode parameter controls optical character recognition processing:
"auto_ocr" (default): Intelligently determines when OCR is needed based on the document content. Balances accuracy and performance.
"force_ocr": Applies OCR to all content regardless of existing text layer. Use this for scanned documents or when maximum text extraction is required.
The merge_tables parameter enables merging of tables that span across multiple pages:
How It Works:
{
"merge_tables": true
}
The xml_citation parameter enables automatic extraction and linking of citations from research papers, academic articles, and scientific documents.
How It Works:
{
"xml_citation": true
}
When enabled, the response includes a metadata field with structured citation data:
{
"metadata": {
"citations": [
{
"id": 1,
"title": "Deep Learning for NLP",
"authors": ["John Smith", "Jane Doe"],
"year": "2021",
"journal": "Nature",
"volume": "15",
"pages": "123-145",
"doi": "10.1000/example"
}
],
"document_metadata": {
"title": "Document Title",
"authors": ["Author Name"]
}
}
}
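The citations array can be consumed directly. Here is a sketch that renders each entry as a plain-text reference line; format_citations is an illustrative helper, not part of the API:

```python
def format_citations(metadata):
    """Render the citations array from the metadata field as text lines."""
    lines = []
    for cite in metadata.get("citations", []):
        authors = ", ".join(cite.get("authors", []))
        year = cite.get("year", "n.d.")
        # Build a compact "[id] authors (year). title. journal." line
        lines.append(f'[{cite["id"]}] {authors} ({year}). '
                     f'{cite.get("title", "")}. {cite.get("journal", "")}.')
    return lines
```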
Citations in the markdown output are converted into hyperlinks. Before: "As shown by Chen et al. (2021)...". After: "As shown by [Chen et al. (2021)](#ref-5)...".

The keep_segment_types parameter allows you to filter the output to include only specific segment types, reducing response size and focusing on relevant content:
How It Works:
"all" (default): Include all segment types.
"table": Only table segments.
"picture": Only image/graphic segments.
"table,picture": Tables and pictures only.
"table,formula": Tables and formulas only.

Available segment types: table, picture, formula, text, sectionheader, title, listitem, caption, footnote, pageheader, pagefooter.

Example:
{
"keep_segment_types": "table,picture"
}
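Because the parameter is a plain comma-separated string, it is easy to build incorrectly. Below is a small sketch that validates type names against the list above before joining them; keep_segment_types_value is an illustrative helper, not part of the API:

```python
# The segment type names accepted by keep_segment_types, per the docs above
VALID_SEGMENT_TYPES = {
    "table", "picture", "formula", "text", "sectionheader", "title",
    "listitem", "caption", "footnote", "pageheader", "pagefooter",
}

def keep_segment_types_value(types):
    """Build the keep_segment_types form value from a list of type names."""
    if not types:
        return "all"  # the documented default: include everything
    normalized = [t.lower() for t in types]
    unknown = set(normalized) - VALID_SEGMENT_TYPES
    if unknown:
        raise ValueError(f"Unknown segment types: {sorted(unknown)}")
    return ",".join(normalized)
```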
The output_fields parameter allows you to control which fields are included in the API response. This is useful for reducing response size, improving performance, and optimizing bandwidth usage when you don't need all available data.
Available Fields:
html (default: true): Include HTML representation of segments.
markdown (default: true): Include Markdown representation of segments.
ocr (default: true): Include OCR results with bounding boxes and confidence scores.
image (default: true): Include cropped segment images (base64 encoded).
llm (default: true): Include LLM-generated content and descriptions.
content (default: true): Include text content of segments.
bbox (default: true): Include bounding box coordinates.
confidence (default: true): Include confidence scores for segments.
embed (default: true): Include embed text in chunk responses.

Set fields to false to exclude them from the response. Fields not specified default to true for backward compatibility.
Example Configuration:
{
"html": false,
"markdown": true,
"ocr": false,
"image": false,
"llm": false,
"content": true,
"bbox": true,
"confidence": false,
"embed": true
}
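One way to avoid hand-writing these dictionaries is to derive them from the list of fields you actually need. A sketch follows; output_fields_for and ALL_OUTPUT_FIELDS are illustrative names, not part of the API:

```python
# All response fields the docs list for output_fields
ALL_OUTPUT_FIELDS = ["html", "markdown", "ocr", "image", "llm",
                     "content", "bbox", "confidence", "embed"]

def output_fields_for(wanted):
    """Build an output_fields dict that enables only the wanted fields."""
    wanted = set(wanted)
    unknown = wanted - set(ALL_OUTPUT_FIELDS)
    if unknown:
        raise ValueError(f"Unknown output fields: {sorted(unknown)}")
    # Explicitly disable everything else; unspecified fields default to true
    return {field: field in wanted for field in ALL_OUTPUT_FIELDS}
```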
Performance Tips:

Disabling image and html can significantly reduce payload size.
Set most fields to false when you only need basic content.
Disable image, ocr, and llm when processing text content.
Keep content and embed when generating embeddings.

The segment_analysis parameter allows you to customize how different segment types are processed, including HTML/Markdown generation strategies and which field should populate the content field.
Available Segment Types:
You can configure processing for any of the following segment types:
Table: Tabular data segments
Picture: Image and graphic segments
Formula: Mathematical equations
Title: Document titles
SectionHeader: Section headers
Text: Regular text content
ListItem: List items
Caption: Image captions
Footnote: Footnotes
PageHeader: Page headers
PageFooter: Page footers
Page: Full page segments

Configuration Options:

html: Generation strategy for HTML representation. "Auto" (default): Automatically determine the best method. "LLM": Use LLM to generate HTML.
markdown: Generation strategy for Markdown representation. "Auto" (default): Automatically determine the best method. "LLM": Use LLM to generate Markdown.
content_source: Defines which field should populate the content field in the response. "OCR" (default): Use OCR text for content. "HTML": Use HTML representation as content. "Markdown": Use Markdown representation as content.
model_id (Table segments only): Specifies which AI model to use for table processing. "us_table_v1": Standard table processing model. "us_table_v2": Enhanced table processing model with improved accuracy.

Example:
{
"Table": {
"html": "LLM",
"markdown": "LLM",
"content_source": "HTML",
"model_id": "us_table_v2"
},
"Picture": {
"html": "LLM",
"markdown": "LLM",
"content_source": "Markdown"
}
}
How content_source Works:
The content_source parameter determines which field’s value will be used to populate the content field in the segment response:
When content_source is set to "HTML", the content field will contain the HTML representation, and the separate html and markdown fields will be empty.
When content_source is set to "Markdown", the content field will contain the Markdown representation, and the separate html and markdown fields will be empty.
When content_source is set to "OCR" (default), the content field contains OCR text, and the html and markdown fields are populated separately.

Recommendations:

Use content_source: "HTML" for Table segments when you want HTML-formatted table data directly in the content field.
Use content_source: "Markdown" for Picture segments when you want Markdown-formatted descriptions in the content field.
Use "LLM" for both the html and markdown generation strategies to get AI-enhanced representations in those fields.

The response to a successful request includes a job_id to pass to GET /parse/{job_id} when polling for results, a status that is always "Starting" on creation, a file_name (set to "unknown" when a URL was provided), and the submitted merge_tables value.
curl -X 'GET' \
'https://prod.visionapi.unsiloed.ai/parse/{job_id}' \
-H 'accept: application/json' \
-H 'api-key: your-api-key'
import requests
import time

def get_parse_results(job_id, api_key):
    """Monitor a parse job and return the results when complete."""
    headers = {"api-key": api_key}
    status_url = f"https://prod.visionapi.unsiloed.ai/parse/{job_id}"

    # Poll for completion
    while True:
        response = requests.get(status_url, headers=headers, timeout=30)
        response.raise_for_status()  # surface 4xx/5xx errors instead of looping
        status_data = response.json()
        print(f"Job Status: {status_data['status']}")

        if status_data['status'] == 'Succeeded':
            return status_data  # results are included in the same response
        elif status_data['status'] == 'Failed':
            raise Exception(f"Job failed: {status_data.get('message', 'Unknown error')}")

        time.sleep(5)  # check every 5 seconds

# Usage
job_id = "e77a5c42-4dc1-44d0-a30e-ed191e8a8908"
results = get_parse_results(job_id, "your-api-key")
{
"job_id": "04a7a6d8-5ef7-465a-b22a-8a98e7104dd9",
"status": "Succeeded",
"created_at": "2025-10-22T06:51:16.870302Z",
"started_at": "2025-10-22T06:51:16.966136Z",
"finished_at": "2025-10-22T06:57:19.821541Z",
"total_chunks": 25,
"chunks": [
{
"segments": [
{
"segment_type": "Title",
"content": "Disinvestment of IFCI's entire stake in Assets Care & Reconstruction Enterprise Ltd (ACRE)",
"image": null,
"page_number": 1,
"segment_id": "cc5f8dff-31be-4ccf-885d-4f9062fcee17",
"confidence": 0.90187776,
"page_width": 1191.0,
"page_height": 1684.0,
"html": "<h1>Disinvestment of IFCI's entire stake in Assets Care & Reconstruction Enterprise Ltd (ACRE)</h1>",
"markdown": "# Disinvestment of IFCI's entire stake in Assets Care & Reconstruction Enterprise Ltd (ACRE)",
"bbox": {
"left": 72.92226,
"top": 62.030334,
"width": 230.36308,
"height": 55.395317
},
"ocr": [
{
"bbox": {
"left": 63.753525,
"top": 5.395447,
"width": 164.45312,
"height": 42.757812
},
"text": "Disinvestment",
"confidence": 0.9999992
}
]
},
{
"segment_type": "Text",
"content": "Background and context information about the disinvestment process...",
"image": null,
"page_number": 1,
"segment_id": "9d60e48b-77ba-4a23-a0ac-95ee13c615ec",
"confidence": 0.88558982,
"page_width": 1191.0,
"page_height": 1684.0,
"html": "<p>Background and context information about the disinvestment process...</p>",
"markdown": "Background and context information about the disinvestment process...",
"bbox": {
"left": 486.9685,
"top": 139.61847,
"width": 241.29932,
"height": 48.451706
},
"ocr": [
{
"bbox": {
"left": 50.9729,
"top": 3.4557495,
"width": 46.046875,
"height": 19.734375
},
"text": "Background",
"confidence": 0.99999654
}
]
}
]
}
]
}
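Results arrive as chunks, each containing a list of segments. The sketch below flattens them and filters by segment_type; collect_segments is an illustrative helper operating on the response shape shown above:

```python
def collect_segments(result, segment_type=None):
    """Flatten a parse result's chunks into a single list of segments.

    Optionally keep only segments of a given type, e.g. "Table" or "Title".
    """
    segments = []
    for chunk in result.get("chunks", []):
        for seg in chunk.get("segments", []):
            if segment_type is None or seg.get("segment_type") == segment_type:
                segments.append(seg)
    return segments
```

For example, collect_segments(results, "Table") gathers every table segment across all chunks, ready for downstream extraction.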
Select the table processing model via model_id in the segment_analysis parameter:
us_table_v1: Standard table processing model.
us_table_v2: Enhanced table processing model with improved accuracy.

Error Responses:

Neither the file nor the url parameter was provided.
Both file and url were provided simultaneously.
Insufficient credits (402): Not enough page credits remaining.
Usage limit (429): Billing usage cap reached. Returns plain text: Usage limit exceeded. No Retry-After header.
Rate limit (429): Org exceeded the 60 requests / 60 s sliding window. Returns plain text: Rate limit exceeded. A Retry-After header may be present depending on the infrastructure layer (Envoy/Istio), but is not set by the application.

API key for authentication. Use 'Bearer <your_api_key>'.
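Because the rate-limit response may or may not carry a Retry-After header, a client should fall back to its own backoff. A sketch of that policy follows; backoff_delay and retry_429 are illustrative helpers, and send stands for any callable that issues the request (e.g. a requests.post call wrapped in a lambda):

```python
import time

def backoff_delay(attempt, retry_after=None):
    """Seconds to wait before retrying a 429 response.

    Uses Retry-After when the infrastructure layer supplied one;
    otherwise falls back to exponential backoff (1, 2, 4, ... seconds).
    """
    if retry_after is not None:
        return float(retry_after)
    return float(2 ** attempt)

def retry_429(send, max_retries=5):
    """Call send() until it returns a response whose status is not 429."""
    for attempt in range(max_retries):
        response = send()
        if response.status_code != 429:
            return response
        time.sleep(backoff_delay(attempt, response.headers.get("Retry-After")))
    return response
```

Note that 402 (insufficient credits) is deliberately not retried here, since retrying cannot succeed until credits are added.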
Multipart form data. Provide either file (binary upload) or url (presigned/public URL), not both.
Request body for POST /parse (multipart/form-data).
Provide either file (binary upload) or url (presigned/public URL) — not both.
Document file to process. Required if url is not provided.
Supported formats: PDF, PNG, JPEG, TIFF, PPT, PPTX, DOC, DOCX, XLS, XLSX.
JSON object for chunk processing configuration.
Detect checkboxes in the document. Defaults to false.
Error handling strategy for non-critical processing errors.
Continue (default) — proceed despite errors (e.g., LLM refusals).
Fail — stop and fail the task on any error.
Seconds until the task and its output are deleted. Defaults to the plan expiration time.
Extract structured data from charts and graphs. Defaults to false.
Transfer text color from the PDF text layer to OCR results. Defaults to false.
Attach hyperlink URLs from PDF annotations to OCR results. Defaults to false.
Filter output to include only the specified segment types (comma-separated).
Example: "table,picture". Use "all" to include everything. Defaults to "all".
JSON object for LLM processing configuration.
Merge tables that span multiple pages into a single unified structure. Defaults to false.
OCR engine to use for text recognition.
UnsiloedBeta (default) — handles rotated/warped text and irregular bounding boxes.
UnsiloedHawk — higher accuracy, complex layouts.
UnsiloedStorm — enterprise-grade accuracy, optimized for 50+ languages.
OCR processing mode.
auto_ocr (default) — applies OCR only where needed.
force_ocr — applies OCR to all content regardless of existing text layer.
JSON object controlling which output fields are included in the response.
Example: {"html": false, "markdown": true, "ocr": false}.
All fields default to true.
Page range to process. Formats: "1-5", "2,4,6", "[1,3,5]". Defaults to all pages.
JSON object controlling HTML/Markdown generation strategy and AI model per segment type.
Example: {"Table": {"html": "LLM", "markdown": "LLM", "model_id": "us_table_v2"}}.
Alias for segment_analysis (Core Parser name). If both are provided, this takes precedence.
Segment type naming convention.
Unsiloed (default) — e.g., PageHeader, ListItem, Picture.
Other — alternative names e.g., Header, List Item, Figure.
Document segmentation strategy.
smart_layout_detection (default) — detects layout elements (tables, pictures,
formulas, etc.) using bounding boxes.
page_by_page — treats each page as a single segment; faster for simple documents.
Presigned or public URL of the document to fetch and process.
Required if file is not provided.
Use high-resolution images for cropping and post-processing.
Latency penalty: ~2–3 s per page. Defaults to true.
JSON array string of segment types to validate with VLM.
Example: ["Table", "Formula", "Picture"]. Defaults to [].
Legacy: validate table segment classifications using VLM.
Prefer validate_segments: ["Table"] instead. Defaults to false.
Extract and hyperlink bibliography citations in the markdown output. PDFs only.
Defaults to false.
Job created — poll with GET /parse/{job_id} to retrieve results.
Response body for a successful POST /parse call.
ISO 8601 timestamp when the job was created.
Number of pages deducted from your quota for this job.
Name of the uploaded file or "unknown" when a URL was provided.
Job identifier — pass this to GET /parse/{job_id} to poll for results.
Whether table merging is enabled for this job (reflects the submitted merge_tables value).
Human-readable status message with a polling hint.
Remaining page quota after this job was deducted.
Initial job status. Always "Starting" on creation.