Parse and segment PDFs, images, and Office files into meaningful sections using advanced AI models with flexible customization options.
curl -X 'POST' \
'https://prod.visionapi.unsiloed.ai/parse' \
-H 'accept: application/json' \
-H 'api-key: your-api-key' \
-H 'Content-Type: multipart/form-data' \
-F '[email protected];type=application/pdf' \
-F 'use_high_resolution=true' \
-F 'segmentation_method=smart_layout_detection' \
-F 'ocr_mode=auto_ocr' \
-F 'merge_tables=true' \
-F 'validate_table_segments=false' \
-F 'keep_segment_types=all' \
-F 'segment_analysis={"Table":{"html":"LLM","markdown":"LLM","extended_context":true,"crop_image":"All","model_id":"us_table_v2"}}'
# Alternative: Use presigned URL instead of file upload
# Replace the file parameter with url parameter:
# -F 'url=https://your-bucket.s3.amazonaws.com/document.pdf?signature=...' \
{
"job_id": "e77a5c42-4dc1-44d0-a30e-ed191e8a8908",
"status": "Starting",
"file_name": "document.pdf",
"created_at": "2025-07-18T10:42:10.545832520Z",
"message": "Job created successfully. Use GET /parse/{job_id} to check status and retrieve results.",
"quota_remaining": 23700,
"merge_tables": false,
}
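The same request can be issued from Python. This is a minimal sketch mirroring the curl example: the endpoint, headers, and form fields come from that example, and segment_analysis is sent as a JSON-encoded form field. Adjust the file path and API key for your environment.

```python
import json
import requests

API_URL = "https://prod.visionapi.unsiloed.ai/parse"

def submit_parse_job(file_path: str, api_key: str) -> dict:
    """Submit a document for parsing and return the job-creation response."""
    headers = {"api-key": api_key}
    data = {
        "use_high_resolution": "true",
        "segmentation_method": "smart_layout_detection",
        "ocr_mode": "auto_ocr",
        "merge_tables": "true",
        "validate_table_segments": "false",
        "keep_segment_types": "all",
        # Nested options are passed as a JSON-encoded string, as in the curl example
        "segment_analysis": json.dumps({
            "Table": {
                "html": "LLM",
                "markdown": "LLM",
                "extended_context": True,
                "crop_image": "All",
                "model_id": "us_table_v2",
            }
        }),
    }
    with open(file_path, "rb") as f:
        files = {"file": (file_path, f, "application/pdf")}
        response = requests.post(API_URL, headers=headers, data=data, files=files)
    response.raise_for_status()
    return response.json()  # contains job_id, status, quota_remaining, ...
```

The returned dictionary matches the job-creation response shown above; poll GET /parse/{job_id} with the job_id to retrieve results.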
Provide the document using either the file parameter or the url parameter. Both cannot be provided simultaneously: file is required when url is not provided, and url is required when file is not provided.
segmentation_method:
"smart_layout_detection" (default): Analyzes pages for layout elements using bounding boxes
"page_by_page": Treats each page as a single segment
ocr_mode:
"auto_ocr" (default): Automatically determines when to use OCR
"full_ocr": Processes all text elements with OCR
keep_segment_types: Comma-separated list such as "table", "picture", "table,picture", "table,formula". Default: "all". Available segment types: table (tabular data), picture (images and graphics), formula (mathematical equations), text (regular text content), sectionheader (section headers), title (document titles), listitem (list items), caption (image captions), footnote (footnotes), pageheader (page headers), pagefooter (page footers)
output_fields: Set fields to false to exclude them and reduce response size. Example: {"html": false, "markdown": true, "ocr": false}. Available fields: html, markdown, ocr, image, llm, content, bbox, confidence, embed
segment_analysis: Customize the content field for each segment type and configure the AI model for table processing. Example: {"Table": {"html": "LLM", "markdown": "LLM", "content_source": "HTML", "model_id": "us_table_v2"}}
High-Accuracy Processing
{
"use_high_resolution": true,
"segmentation_method": "smart_layout_detection",
"ocr_mode": "full_ocr",
"merge_tables": true,
"validate_table_segments": true,
"segment_analysis": {
"Table": {
"html": "LLM",
"markdown": "LLM",
"extended_context": true,
"crop_image": "All",
"model_id": "us_table_v2"
}
}
}
Fast Processing
{
"use_high_resolution": false,
"segmentation_method": "page_by_page",
"ocr_mode": "auto_ocr",
"merge_tables": false,
"validate_table_segments": false
}
Financial Documents (Tables + Charts)
{
"merge_tables": true,
"validate_table_segments": true,
"keep_segment_types": "table,picture",
"segmentation_method": "smart_layout_detection",
"ocr_mode": "auto_ocr",
"segment_analysis": {
"Table": {
"html": "LLM",
"markdown": "LLM",
"model_id": "us_table_v2"
}
}
}
Data Extraction Only (Tables)
{
"merge_tables": true,
"validate_table_segments": true,
"keep_segment_types": "table",
"output_fields": {
"html": true,
"markdown": true,
"ocr": false,
"image": false,
"content": true,
"bbox": false,
"confidence": false
}
}
Academic/Research Documents
{
"use_high_resolution": true,
"segmentation_method": "smart_layout_detection",
"ocr_mode": "auto_ocr",
"xml_citation": true
}
Scanned Documents
{
"use_high_resolution": true,
"ocr_mode": "full_ocr",
"segmentation_method": "smart_layout_detection"
}
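The preset objects above are shown as JSON, but the endpoint accepts multipart form fields. As a sketch, assuming scalar options become plain string fields and nested objects (segment_analysis, output_fields) are JSON-encoded as in the curl example, a helper can flatten a preset into form data:

```python
import json

def preset_to_form_data(preset: dict) -> dict:
    """Flatten a preset dict into multipart form fields.

    Assumption: scalars become string fields and nested objects are
    JSON-encoded, matching how segment_analysis appears in the curl example.
    """
    form = {}
    for key, value in preset.items():
        if isinstance(value, (dict, list)):
            form[key] = json.dumps(value)  # nested config -> JSON string
        elif isinstance(value, bool):
            form[key] = "true" if value else "false"  # curl-style booleans
        else:
            form[key] = str(value)
    return form

# The "Fast Processing" preset from above
fast = {
    "use_high_resolution": False,
    "segmentation_method": "page_by_page",
    "ocr_mode": "auto_ocr",
    "merge_tables": False,
    "validate_table_segments": False,
}
print(preset_to_form_data(fast)["use_high_resolution"])  # false
```

The resulting dict can be passed as the data argument of a multipart POST alongside the file.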
Output Fields Optimization
{
"output_fields": {
"html": false,
"markdown": false,
"ocr": false,
"image": false,
"llm": false,
"content": true,
"bbox": false,
"confidence": false,
"embed": true
}
}
{
"output_fields": {
"html": false,
"markdown": true,
"ocr": false,
"image": false,
"llm": false,
"content": true,
"bbox": true,
"confidence": false,
"embed": true
}
}
To include all available data, omit output_fields or set all fields to true.
Input Methods:
Direct file upload (file parameter): Upload the document file directly as multipart/form-data
URL-based input (url parameter): Provide a publicly accessible URL or presigned URL to the document
Provide either file or url, but not both. When using url, the document is downloaded from the provided URL before processing.
The segmentation_method parameter controls how the document is analyzed and segmented:
"smart_layout_detection" (default): Analyzes pages for layout elements (e.g., Table, Picture, Formula, etc.) using bounding boxes. Provides fine-grained segmentation and better chunking for complex documents.
"page_by_page": Treats each page as a single segment. Faster processing, ideal for simple documents without complex layouts.
The ocr_mode parameter controls optical character recognition (OCR) processing:
"auto_ocr" (default): Intelligently determines when OCR is needed based on the document content. Balances accuracy and performance.
"full_ocr": Applies OCR to all text elements in the document. Use this for scanned documents or when maximum text extraction is required.
The merge_tables parameter enables intelligent merging of tables that span multiple pages:
How It Works:
{
"merge_tables": true
}
The xml_citation parameter enables automatic extraction and linking of citations from research papers, academic articles, and scientific documents.
How It Works:
{
"xml_citation": true
}
When enabled, the response includes a metadata field with structured citation data:
{
"metadata": {
"citations": [
{
"id": 1,
"title": "Deep Learning for NLP",
"authors": ["John Smith", "Jane Doe"],
"year": "2021",
"journal": "Nature",
"volume": "15",
"pages": "123-145",
"doi": "10.1000/example"
}
],
"document_metadata": {
"title": "Document Title",
"authors": ["Author Name"]
}
}
}
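As a sketch of consuming this metadata, assuming the citations array has the shape shown above, each entry can be rendered as a plain-text bibliography line:

```python
def format_citation(c: dict) -> str:
    """Render one citation entry from the metadata.citations array."""
    authors = ", ".join(c.get("authors", []))
    parts = [f"[{c['id']}] {authors} ({c.get('year', 'n.d.')}). {c['title']}."]
    if c.get("journal"):
        # journal, volume, and pages are optional in practice; hedge with .get
        parts.append(f"{c['journal']} {c.get('volume', '')}: {c.get('pages', '')}.")
    if c.get("doi"):
        parts.append(f"doi:{c['doi']}")
    return " ".join(parts)

citation = {
    "id": 1,
    "title": "Deep Learning for NLP",
    "authors": ["John Smith", "Jane Doe"],
    "year": "2021",
    "journal": "Nature",
    "volume": "15",
    "pages": "123-145",
    "doi": "10.1000/example",
}
print(format_citation(citation))
# [1] John Smith, Jane Doe (2021). Deep Learning for NLP. Nature 15: 123-145. doi:10.1000/example
```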
"As shown by Chen et al. (2021)...""As shown by [Chen et al. (2021)](#ref-5)..."keep_segment_types parameter allows you to filter the output to include only specific segment types, reducing response size and focusing on relevant content:
How It Works:
"all" (default): Include all segment types"table": Only table segments"picture": Only image/graphic segments"table,picture": Tables and pictures only"table,formula": Tables and formulas onlytable, picture, formula, text, sectionheader, title, listitem, caption, footnote, pageheader, pagefooter{
"keep_segment_types": "table,picture"
}
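Server-side filtering keeps the response small, but if you need a different subset after the fact, segments can also be filtered client-side. A sketch, assuming the chunks/segments response shape shown later in this page (note that segment_type values in responses are PascalCase, e.g. "Table"):

```python
def filter_segments(result: dict, keep: set) -> list:
    """Collect segments whose segment_type is in `keep` (case-insensitive)."""
    wanted = {t.lower() for t in keep}
    return [
        seg
        for chunk in result.get("chunks", [])
        for seg in chunk.get("segments", [])
        if seg.get("segment_type", "").lower() in wanted
    ]

# Minimal stand-in for a parse result
result = {"chunks": [{"segments": [
    {"segment_type": "Title", "content": "A"},
    {"segment_type": "Table", "content": "B"},
]}]}
print([s["content"] for s in filter_segments(result, {"table"})])  # ['B']
```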
The output_fields parameter allows you to control which fields are included in the API response. This is useful for reducing response size, improving performance, and optimizing bandwidth usage when you don't need all available data.
Available Fields:
html (default: true): Include HTML representation of segments
markdown (default: true): Include Markdown representation of segments
ocr (default: true): Include OCR results with bounding boxes and confidence scores
image (default: true): Include cropped segment images (base64 encoded)
llm (default: true): Include LLM-generated content and descriptions
content (default: true): Include text content of segments
bbox (default: true): Include bounding box coordinates
confidence (default: true): Include confidence scores for segments
embed (default: true): Include embed text in chunk responses
Set fields to false to exclude them from the response. Fields not specified default to true for backward compatibility.
Example Configuration:
{
"html": false,
"markdown": true,
"ocr": false,
"image": false,
"llm": false,
"content": true,
"bbox": true,
"confidence": false,
"embed": true
}
Optimization Tips:
Excluding image and html can significantly reduce payload size
Set most fields to false when you only need basic content
Skip image, ocr, and llm when processing plain text content
Keep only content and embed when generating embeddings
The segment_analysis parameter allows you to customize how different segment types are processed, including HTML/Markdown generation strategies and which field should populate the content field.
Available Segment Types:
You can configure processing for any of the following segment types:
Table: Tabular data segments
Picture: Image and graphic segments
Formula: Mathematical equations
Title: Document titles
SectionHeader: Section headers
Text: Regular text content
ListItem: List items
Caption: Image captions
Footnote: Footnotes
PageHeader: Page headers
PageFooter: Page footers
Page: Full page segments
Configuration Options:
html: Generation strategy for HTML representation
"Auto" (default): Automatically determine the best method
"LLM": Use LLM to generate HTML
markdown: Generation strategy for Markdown representation
"Auto" (default): Automatically determine the best method
"LLM": Use LLM to generate Markdown
content_source: Defines which field should populate the content field in the response
"OCR" (default): Use OCR text for content
"HTML": Use HTML representation as content
"Markdown": Use Markdown representation as content
model_id (Table segments only): Specifies which AI model to use for table processing
"us_table_v1": Standard table processing model
"us_table_v2": Enhanced table processing model with improved accuracy
{
"Table": {
"html": "LLM",
"markdown": "LLM",
"content_source": "HTML",
"model_id": "us_table_v2"
},
"Picture": {
"html": "LLM",
"markdown": "LLM",
"content_source": "Markdown"
}
}
How content_source Works:
The content_source parameter determines which field’s value will be used to populate the content field in the segment response:
When content_source is set to "HTML", the content field contains the HTML representation, and the separate html and markdown fields are empty.
When content_source is set to "Markdown", the content field contains the Markdown representation, and the separate html and markdown fields are empty.
When content_source is set to "OCR" (default), the content field contains OCR text, and the html and markdown fields are populated separately.
When content_source is set to "LLM", the content field contains LLM-generated content.
Best Practices:
Use content_source: "HTML" for Table segments when you want HTML-formatted table data directly in the content field.
Use content_source: "Markdown" for Picture segments when you want Markdown-formatted descriptions in the content field.
Use "LLM" for both html and markdown generation strategies and set content_source: "LLM" to get AI-enhanced content in the content field.
curl -X 'GET' \
'https://prod.visionapi.unsiloed.ai/parse/{job_id}' \
-H 'accept: application/json' \
-H 'api-key: your-api-key'
import requests
import time

def get_parse_results(job_id, api_key):
    """Monitor job and retrieve results when complete"""
    headers = {"api-key": api_key}
    status_url = f"https://prod.visionapi.unsiloed.ai/parse/{job_id}"

    # Poll for completion
    while True:
        response = requests.get(status_url, headers=headers)
        if response.status_code == 200:
            status_data = response.json()
            print(f"Job Status: {status_data['status']}")
            if status_data['status'] == 'Succeeded':
                return status_data  # Results are included in the same response
            elif status_data['status'] == 'Failed':
                raise Exception(f"Job failed: {status_data.get('message', 'Unknown error')}")
        time.sleep(5)  # Check every 5 seconds

# Usage
job_id = "e77a5c42-4dc1-44d0-a30e-ed191e8a8908"
results = get_parse_results(job_id, "your-api-key")
{
"job_id": "04a7a6d8-5ef7-465a-b22a-8a98e7104dd9",
"status": "Succeeded",
"created_at": "2025-10-22T06:51:16.870302Z",
"started_at": "2025-10-22T06:51:16.966136Z",
"finished_at": "2025-10-22T06:57:19.821541Z",
"total_chunks": 25,
"chunks": [
{
"segments": [
{
"segment_type": "Title",
"content": "Disinvestment of IFCI's entire stake in Assets Care & Reconstruction Enterprise Ltd (ACRE)",
"image": null,
"page_number": 1,
"segment_id": "cc5f8dff-31be-4ccf-885d-4f9062fcee17",
"confidence": 0.90187776,
"page_width": 1191.0,
"page_height": 1684.0,
"html": "<h1>Disinvestment of IFCI's entire stake in Assets Care & Reconstruction Enterprise Ltd (ACRE)</h1>",
"markdown": "# Disinvestment of IFCI's entire stake in Assets Care & Reconstruction Enterprise Ltd (ACRE)",
"bbox": {
"left": 72.92226,
"top": 62.030334,
"width": 230.36308,
"height": 55.395317
},
"ocr": [
{
"bbox": {
"left": 63.753525,
"top": 5.395447,
"width": 164.45312,
"height": 42.757812
},
"text": "Disinvestment",
"confidence": 0.9999992
}
]
},
{
"segment_type": "Text",
"content": "Background and context information about the disinvestment process...",
"image": null,
"page_number": 1,
"segment_id": "9d60e48b-77ba-4a23-a0ac-95ee13c615ec",
"confidence": 0.88558982,
"page_width": 1191.0,
"page_height": 1684.0,
"html": "<p>Background and context information about the disinvestment process...</p>",
"markdown": "Background and context information about the disinvestment process...",
"bbox": {
"left": 486.9685,
"top": 139.61847,
"width": 241.29932,
"height": 48.451706
},
"ocr": [
{
"bbox": {
"left": 50.9729,
"top": 3.4557495,
"width": 46.046875,
"height": 19.734375
},
"text": "Background",
"confidence": 0.99999654
}
]
}
]
}
]
}
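Given the response shape above, a small sketch that walks the chunks and assembles a single markdown document from the segments (falling back to content when a segment has no markdown field):

```python
def to_markdown(result: dict) -> str:
    """Concatenate the markdown of every segment, in document order."""
    lines = []
    for chunk in result.get("chunks", []):
        for seg in chunk.get("segments", []):
            md = seg.get("markdown") or seg.get("content") or ""
            if md:
                lines.append(md)
    return "\n\n".join(lines)

# Minimal stand-in for a completed parse result
result = {"chunks": [{"segments": [
    {"segment_type": "Title", "markdown": "# ACRE Disinvestment"},
    {"segment_type": "Text", "markdown": "Background and context..."},
]}]}
print(to_markdown(result))
```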
Select the table processing model via model_id in the segment_analysis parameter:
us_table_v1: Standard table processing model
us_table_v2: Enhanced table processing model with improved accuracy
Error conditions:
Neither the file nor the url parameter was provided
Both file and url were provided simultaneously
Supported file types: PDFs, Images (PNG, JPEG, TIFF, BMP) and Office Documents (DOCX, XLSX, PPTX)
Whether to use high-resolution images for cropping and post-processing (default: false)
Document segmentation strategy: smart_layout_detection, page_by_page
OCR processing strategy: auto_ocr, full_ocr
Whether to merge adjacent table segments (default: false)
Filter output to include only specific segment types. Accepts comma-separated list (e.g., 'table', 'picture', 'table,picture') or 'all' for everything (default: 'all')
Enable citation extraction from PDF documents. Extracts structured bibliography and hyperlinks in-text citations in markdown output (default: false)
Whether to validate and correct table segment types using VLM (default: false)
Successful response
Unique identifier for the parsing job
Initial job status (typically 'Starting')
Name of the uploaded file
Timestamp when the job was created
Status message about the job creation
Remaining page quota for the API key
Whether table merging is enabled