POST
/
parse
curl -X 'POST' \
  'https://prod.visionapi.unsiloed.ai/parse' \
  -H 'accept: application/json' \
  -H 'api-key: your-api-key' \
  -H 'Content-Type: multipart/form-data' \
  -F 'file=@document.pdf;type=application/pdf' \
  -F 'high_resolution=true' \
  -F 'chunk_processing={"ignore_headers_and_footers": false, "target_length": 512, "tokenizer": {"Enum": "Word"}}' \
  -F 'segment_processing={"Table": {"html": "LLM", "markdown": "LLM", "extended_context": false}, "Picture": {"crop_image": "All", "html": "LLM", "markdown": "LLM", "embed_sources": ["Markdown"]}, "Formula": {"html": "LLM", "markdown": "LLM"}}' \
  -F 'error_handling=Continue' \
  -F 'segmentation_strategy=LayoutAnalysis' \
  -F 'ocr_strategy=All'
{
  "task_id": "e77a5c42-4dc1-44d0-a30e-ed191e8a8908",
  "status": "Starting",
  "file_name": "document.pdf",
  "created_at": "2025-07-18T10:42:10.545832520Z",
  "expires_at": "2025-07-18T11:42:10.545832520Z",
  "message": "Task created successfully. Use GET /parse/{task_id} to check status and retrieve results.",
  "task_url": "https://prod.visionapi.unsiloed.ai/parse/e77a5c42-4dc1-44d0-a30e-ed191e8a8908",
  "configuration": {
    "high_resolution": true,
    "chunk_processing": {
      "ignore_headers_and_footers": false,
      "target_length": 512,
      "tokenizer": {
        "Enum": "Word"
      }
    },
    "error_handling": "Continue",
    "segmentation_strategy": "LayoutAnalysis",
    "ocr_strategy": "All"
  }
}

Overview

The Parse Document endpoint processes PDF documents and breaks them into meaningful sections with detailed analysis including text extraction, image recognition, table parsing, and OCR data. This endpoint supports advanced customization options for fine-tuning the parsing behavior to match your specific use cases.
This endpoint returns a task ID for asynchronous processing. Use the GET /parse/ endpoint to check status and retrieve results when processing is complete.

Request

file
file
required
PDF file to process.
high_resolution
boolean
Whether to use high-resolution images for cropping and post-processing. (Latency penalty: ~7 seconds per page)
chunk_processing
object
Advanced chunking configuration options
segment_processing
object
Segment-specific processing configuration for different document elements
error_handling
string
Error handling strategy: “Continue”, “Fail” (default: “Continue”)
segmentation_strategy
string
Document segmentation strategy: “LayoutAnalysis” or “Page”
llm_processing
object
LLM processing configuration for enhanced content analysis
ocr_strategy
string
OCR processing strategy: “All”, “Auto”(default: “All”)

Advanced Configuration Options

Chunk Processing Configuration

The chunk_processing parameter allows fine-tuning of how document content is chunked:
{
  "chunk_processing": {
    "ignore_headers_and_footers": false,
    "target_length": 512,
    "tokenizer": {
      "Enum": "Word"
    }
  }
}
Parameters:
  • ignore_headers_and_footers (boolean, optional): Whether to exclude headers and footers from chunking.
  • target_length (number, optional): Target chunk length in tokens (default: 512)
  • tokenizer (object, optional): Tokenization strategy configuration with Enum values: “Word”, “Cl100kBase”, “XlmRobertaBase”, “BertBaseUncased”

Segment Processing Configuration

The segment_processing parameter allows customization of how different document segments are processed. Each segment type can be configured individually:
{
  "segment_processing": {
    "Table": {
      "html": "LLM",
      "markdown": "LLM",
      "extended_context": false
    },
    "Picture": {
      "crop_image": "All",
      "html": "LLM",
      "markdown": "LLM",
      "embed_sources": ["Markdown"]
    },
    "Formula": {
      "html": "LLM",
      "markdown": "LLM"
    },
    "Text": {
      "html": "LLM",
      "markdown": "LLM"
    },
    "SectionHeader": {
      "html": "LLM",
      "markdown": "LLM"
    },
    "Title": {
      "html": "LLM",
      "markdown": "LLM"
    },
    "Caption": null,
    "Footnote": null,
    "ListItem": null,
    "Page": null,
    "PageFooter": null,
    "PageHeader": null
  }
}
  • html (string, optional): HTML generation method: “LLM”, “Auto”
  • markdown (string, optional): Markdown generation method: “LLM”, “Auto”
  • extended_context (boolean, optional): Use the full page image as context for LLM generation
  • crop_image Controls whether to crop the file’s images to the segment’s bounding box. The cropped image will be stored in the segment’s image field. Use All to always crop, or Auto to only crop when needed for post-processing

Available Segment Types

  • Table: Tabular data processing
  • Picture: Image and graphic processing
  • Formula: Mathematical expression processing
  • Text: Regular text content processing
  • SectionHeader: Section heading processing
  • Title: Document title processing
  • Caption: Image/table caption processing
  • Footnote: Footnote processing
  • ListItem: List item processing
  • Page: Page-level processing
  • PageFooter: Footer processing
  • PageHeader: Header processing

LLM Processing Configuration

The llm_processing parameter configures language model processing:
{
  "llm_processing": {
    "fallback_strategy": "None",
    "max_completion_tokens": 1000,
    "model_id": "gpt-4",
    "temperature": 0.1
  }
}
Parameters:
  • fallback_strategy (string, optional): Fallback strategy when LLM processing fails: “None”,
  • max_completion_tokens (number, optional): Maximum tokens for LLM completion (default: 1000)
  • model_id (string, optional): LLM model identifier to use for processing
  • temperature (number, optional): Temperature setting for LLM processing (0.0 to 1.0)

Error Handling Strategies

error_handling (string, optional): How to handle processing errors:
  • “Continue”: Continue processing and report errors
  • “Fail”: Stop processing on first error

Segmentation Strategy Options

segmentation_strategy (string, optional): Document segmentation approach:
  • “LayoutAnalysis”: Analyzes pages for layout elements (e.g., Table, Picture, Formula, etc.) using bounding boxes. Provides fine-grained segmentation and better chunking.
  • “Page”: Treats each page as a single segment. Faster processing, but without layout element detection and only simple chunking.

OCR Strategy Options

ocr_strategy (string, optional): OCR processing strategy:
  • “All”: Process all text elements with OCR
  • “Auto”: Automatically determine when to use OCR

Response

task_id
string
Unique identifier for the processing task
status
string
Initial task status (typically “Starting”)
file_name
string
Name of the uploaded file
created_at
string
Timestamp when the task was created
message
string
Status message about the task creation
configuration
object
Complete configuration used for processing
expires_at
string
Timestamp when the task expires (if expiration is set)
task_url
string
URL for accessing task status and results

Document Analysis Features

The parsing endpoint provides comprehensive document analysis including:

Text Extraction

Extracts text content with high accuracy, preserving formatting and structure.

Image Recognition

Identifies and analyzes images within documents, providing descriptions and metadata.

Table Parsing

Extracts tabular data with proper structure and formatting.

OCR Processing

Performs optical character recognition on text elements with confidence scores.

Section Detection

Automatically identifies different document sections like headers, body text, and captions.

Bounding Box Information

Provides precise coordinates for all extracted elements.

Advanced Content Processing

  • LLM-Enhanced Analysis: Uses language models for better content understanding
  • Multi-Format Output: Generates HTML, Markdown, and plain text versions
  • Context-Aware Processing: Maintains document context across segments
  • Intelligent Chunking: Creates semantically meaningful document chunks
curl -X 'POST' \
  'https://prod.visionapi.unsiloed.ai/parse' \
  -H 'accept: application/json' \
  -H 'api-key: your-api-key' \
  -H 'Content-Type: multipart/form-data' \
  -F 'file=@document.pdf;type=application/pdf' \
  -F 'high_resolution=true' \
  -F 'chunk_processing={"ignore_headers_and_footers": false, "target_length": 512, "tokenizer": {"Enum": "Word"}}' \
  -F 'segment_processing={"Table": {"html": "LLM", "markdown": "LLM", "extended_context": false}, "Picture": {"crop_image": "All", "html": "LLM", "markdown": "LLM", "embed_sources": ["Markdown"]}, "Formula": {"html": "LLM", "markdown": "LLM"}}' \
  -F 'error_handling=Continue' \
  -F 'segmentation_strategy=LayoutAnalysis' \
  -F 'ocr_strategy=All'
{
  "task_id": "e77a5c42-4dc1-44d0-a30e-ed191e8a8908",
  "status": "Starting",
  "file_name": "document.pdf",
  "created_at": "2025-07-18T10:42:10.545832520Z",
  "expires_at": "2025-07-18T11:42:10.545832520Z",
  "message": "Task created successfully. Use GET /parse/{task_id} to check status and retrieve results.",
  "task_url": "https://prod.visionapi.unsiloed.ai/parse/e77a5c42-4dc1-44d0-a30e-ed191e8a8908",
  "configuration": {
    "high_resolution": true,
    "chunk_processing": {
      "ignore_headers_and_footers": false,
      "target_length": 512,
      "tokenizer": {
        "Enum": "Word"
      }
    },
    "error_handling": "Continue",
    "segmentation_strategy": "LayoutAnalysis",
    "ocr_strategy": "All"
  }
}

Retrieving Results

After the task is created, use the GET /parse/ endpoint to check status and retrieve results:
cURL
curl -X 'GET' \
  'https://prod.visionapi.unsiloed.ai/parse/{task_id}' \
  -H 'accept: application/json' \
  -H 'api-key: your-api-key'
Python
import requests
import time

def get_parse_results(task_id, api_key):
    """Monitor task and retrieve results when complete"""
    
    headers = {"api-key": api_key}
    status_url = f"https://prod.visionapi.unsiloed.ai/parse/{task_id}"
    
    # Poll for completion
    while True:
        response = requests.get(status_url, headers=headers)
        
        if response.status_code == 200:
            status_data = response.json()
            print(f"Task Status: {status_data['status']}")
            
            if status_data['status'] == 'Succeeded':
                return status_data  # Results are included in the same response
                    
            elif status_data['status'] == 'Failed':
                raise Exception(f"Task failed: {status_data.get('message', 'Unknown error')}")
                
        time.sleep(5)  # Check every 5 seconds

# Usage
task_id = "e77a5c42-4dc1-44d0-a30e-ed191e8a8908"
results = get_parse_results(task_id, "your-api-key")

Expected Results Structure

When the task completes successfully, the response contains comprehensive document analysis with enhanced processing:
{
  "task_id": "e77a5c42-4dc1-44d0-a30e-ed191e8a8908",
  "status": "Succeeded",
  "created_at": "2025-07-18T10:42:10.545832Z",
  "started_at": "2025-07-18T10:42:10.666328Z",
  "finished_at": "2025-07-18T10:42:23.560572Z",
  "message": "Task succeeded",
  "total_chunks": 2,
  "chunks": [
    {
      "segments": [
        {
          "segment_type": "Picture",
          "content": "eternal",
          "image": "https://s3.us-east-1.amazonaws.com/unsiloed-bucket/...",
          "page_number": 1,
          "segment_id": "1c60ecbd-b6da-493c-b3f6-6849337a981f",
          "confidence": 0.6062846,
          "page_width": 1191.0,
          "page_height": 1684.0,
          "html": "<p>The image displays a logo...</p>",
          "markdown": "Eternal Logo\n\nThe image displays a logo...",
          "bbox": {
            "left": 72.92226,
            "top": 62.030334,
            "width": 230.36308,
            "height": 55.395317
          },
          "ocr": [
            {
              "bbox": {
                "left": 63.753525,
                "top": 5.395447,
                "width": 164.45312,
                "height": 42.757812
              },
              "text": "eternal",
              "confidence": 0.9999992
            }
          ]
        },
        {
          "segment_type": "SectionHeader",
          "content": "Tax Invoice ORIGINAL For Recipient",
          "image": null,
          "page_number": 1,
          "segment_id": "9d60e48b-77ba-4a23-a0ac-95ee13c615ec",
          "confidence": 0.46558982,
          "page_width": 1191.0,
          "page_height": 1684.0,
          "html": "<h2>Tax Invoice ORIGINAL For Recipient</h2>",
          "markdown": "## Tax Invoice ORIGINAL For Recipient",
          "bbox": {
            "left": 486.9685,
            "top": 139.61847,
            "width": 241.29932,
            "height": 48.451706
          },
          "ocr": [
            {
              "bbox": {
                "left": 50.9729,
                "top": 3.4557495,
                "width": 46.046875,
                "height": 19.734375
              },
              "text": "Tax",
              "confidence": 0.99999654
            }
          ]
        }
      ]
    }
  ]
}

Segment Types

The parsing API identifies and processes different types of document segments with enhanced processing:

Picture

Images and graphics within the document, including logos, charts, and illustrations. Enhanced with LLM-based description generation.

SectionHeader

Document headers and titles that define section boundaries. Processed with semantic understanding.

Text

Regular text content including paragraphs, sentences, and individual text elements. Enhanced with context-aware processing.

Table

Tabular data with structured rows and columns. Enhanced with LLM-based formatting and extended context options.

Caption

Text captions associated with images or figures. Processed with relationship awareness.

Formula

Mathematical equations and expressions. Enhanced with specialized formula processing.

Title

Document titles and main headings. Processed with enhanced formatting.

Footnote

Document footnotes and references. Processed with context linking.

ListItem

Bulleted and numbered list items. Processed with structure preservation. Each segment includes detailed metadata such as confidence scores, bounding boxes, OCR data, and formatted output in both HTML and Markdown with LLM enhancement.

Configuration Best Practices

For High-Accuracy Processing

{
  "high_resolution": true,
  "chunk_processing": {
    "ignore_headers_and_footers": false,
    "target_length": 512
  },
  "segment_processing": {
    "Table": {"html": "LLM", "markdown": "LLM"},
    "Picture": {"crop_image": "All", "html": "LLM"},
    "Formula": {"html": "LLM", "markdown": "LLM"}
  },
  "segmentation_strategy": "LayoutAnalysis",
  "ocr_strategy": "All"
}

Error Handling

Common Error Scenarios

  1. Invalid API Key: Authentication failed
  2. File Too Large: File exceeds size limits
  3. Invalid Configuration: Malformed processing parameters
  4. Server Error: Internal processing error
  5. Processing Timeout: Task took too long to complete

Authorizations

api-key
string
header
required

Body

multipart/form-data

Response

200 - application/json

Successful response

The response is of type object.