Parsing

Overview

Document parsing in our Vision API is achieved through intelligent chunking strategies that analyze document structure using advanced AI and vision language models. The parsing functionality identifies different document elements like text blocks, tables, images, headers, and footers while maintaining proper reading order and extracting content with high accuracy.

Parsing capabilities are accessed through the /chunking endpoint, which combines structure detection with content extraction and intelligent segmentation.

Key Features

Element Detection

Identify and classify document elements using advanced AI models

Content Extraction

Extract text, tables, and images with appropriate processing methods

Reading Order

Maintain proper document flow and reading sequence

Multi-Modal Processing

Handle text, images, tables, and formulas with specialized extractors

Document Element Types

The parsing system can identify and process the following element types through chunking strategies:

Text Elements

Text: Regular paragraph text
Title: Document and section titles
Section-header: Section headings
Page-header: Header content
Page-footer: Footer content
Caption: Image and table captions
Footnote: Footnote references and content
List-item: Bulleted and numbered lists

Visual Elements

Table: Structured tabular data
Picture: Images, charts, and diagrams
Formula: Mathematical equations and expressions

Parsing Through Chunking Strategies

Semantic Parsing with Chunking

import requests

# Parse and chunk document semantically
url = "https://visionapi.unsiloed.ai/chunking"
headers = {
    "accept": "application/json",
    "X-API-Key": "your-api-key"
}

files = {
    "document_file": ("document.pdf", open("document.pdf", "rb"), "application/pdf")
}

data = {
    "strategy": "semantic",  # Uses AI for intelligent parsing
    "chunk_size": 1000,
    "overlap": 100
}

response = requests.post(url, headers=headers, files=files, data=data)
job_info = response.json()

print(f"Parsing job started: {job_info['job_id']}")

# Check job status
status_url = f"https://visionapi.unsiloed.ai/jobs/{job_info['job_id']}"
status_response = requests.get(status_url, headers=headers)
print(f"Status: {status_response.json()['status']}")

Parsing Strategies

Semantic Parsing (Recommended)

Uses advanced AI to intelligently identify and group document elements. Best for maintaining semantic context and understanding document structure.

Hybrid Parsing

Multi-modal processing that combines table extraction, image analysis, and enhanced metadata extraction. Most comprehensive parsing option.

Heading-Based Parsing

Analyzes document structure based on detected headings and section hierarchy. Ideal for structured documents with clear heading patterns.

Page-Based Parsing

Processes documents page by page, maintaining page boundaries. Useful for documents where page structure is important.

Response Format

Chunked Parsing Results

When using chunking strategies (/chunking), you get parsed content organized into meaningful chunks:

{
  "job_id": "b2094b38-e432-44b6-a5d0-67bed07d5de1",
  "status": "completed",
  "result": {
    "chunks": [
      {
        "content": "Annual Financial Report 2023\n\nExecutive Summary\n\nThis report presents our financial performance for the fiscal year 2023...",
        "metadata": {
          "chunk_index": 0,
          "page_numbers": [1],
          "element_types": ["Title", "Section-header", "Text"],
          "reading_order": [0, 1, 2],
          "confidence_scores": [0.95, 0.92, 0.88]
        }
      },
      {
        "content": "Financial Performance\n\n| Quarter | Revenue | Profit |\n|---------|---------|--------|\n| Q1 | $1.2M | $200K |\n| Q2 | $1.4M | $280K |",
        "metadata": {
          "chunk_index": 1,
          "page_numbers": [1, 2],
          "element_types": ["Section-header", "Table"],
          "reading_order": [3, 4],
          "confidence_scores": [0.91, 0.92]
        }
      }
    ],
    "total_chunks": 2,
    "processing_metadata": {
      "strategy_used": "semantic",
      "total_pages": 2,
      "total_elements_detected": 8,
      "processing_time": 4.2
    }
  }
}

Advanced Parsing Features

Multi-Format Support

The parsing system supports multiple document formats:

# Parse different document types
document_types = [
    ("document.pdf", "application/pdf"),
    ("presentation.pptx", "application/vnd.openxmlformats-officedocument.presentationml.presentation"),
    ("report.docx", "application/vnd.openxmlformats-officedocument.wordprocessingml.document")
]
    
for filename, content_type in document_types:
    files = {"document_file": (filename, open(filename, "rb"), content_type)}
    response = requests.post(chunking_url, headers=headers, files=files, data=data)
    print(f"Parsed {filename}: {response.json()['job_id']}")

Element-Specific Processing

Different element types are processed with specialized methods:

Text Elements: OCR with post-processing for clarity
Tables: Structured extraction maintaining column/row relationships
Images: Vision analysis for content description
Formulas: Mathematical expression recognition
Headers/Footers: Context-aware text extraction

Integration Examples

Document Analysis Workflow

import requests
import time

class DocumentParser:
    def __init__(self, api_key):
        self.api_key = api_key
        self.base_url = "https://visionapi.unsiloed.ai"
        self.headers = {
            "accept": "application/json",
            "X-API-Key": api_key
        }
    
    def parse_document(self, file_path, strategy="semantic"):
        """Parse document and return structured results"""
        
        # Start parsing job
        files = {"document_file": open(file_path, "rb")}
        data = {"strategy": strategy}
        
        response = requests.post(
            f"{self.base_url}/chunking",
            headers=self.headers,
            files=files,
            data=data
        )
        
        job_id = response.json()["job_id"]
        
        # Wait for completion
        while True:
            status_response = requests.get(
                f"{self.base_url}/jobs/{job_id}",
                headers=self.headers
            )
            status = status_response.json()["status"]
            
            if status == "completed":
                return status_response.json()["result"]
            elif status == "failed":
                raise Exception("Parsing failed")
            
            time.sleep(2)

# Usage
parser = DocumentParser("your-api-key")

# Parse into semantic chunks
chunks = parser.parse_document("document.pdf", strategy="semantic")
print(f"Created {len(chunks['chunks'])} semantic chunks")

Best Practices

Choosing the Right Strategy

Performance Optimization

Quality Assurance

Error Handling

def safe_parse_document(file_path, api_key):
    """Parse document with proper error handling"""
    
    try:
        parser = DocumentParser(api_key)
        result = parser.parse_document(file_path)
        return {"success": True, "data": result}
        
    except requests.exceptions.HTTPError as e:
        if e.response.status_code == 400:
            return {"success": False, "error": "Unsupported file type"}
        elif e.response.status_code == 401:
            return {"success": False, "error": "Invalid API key"}
        else:
            return {"success": False, "error": f"HTTP error: {e.response.status_code}"}
            
    except Exception as e:
        return {"success": False, "error": f"Processing error: {str(e)}"}

# Usage with error handling
result = safe_parse_document("document.pdf", "your-api-key")
if result["success"]:
    chunks = result["data"]["chunks"]
    print(f"Successfully parsed {len(chunks)} chunks")
else:
    print(f"Parsing failed: {result['error']}")

Chunking API Reference - Complete API documentation
Document Extraction - Structured data extraction
Document Classification - Document type classification
Document Splitting - Page-level document splitting

Core Endpoints

Job Management

Overview

Key Features

Element Detection

Content Extraction

Reading Order

Multi-Modal Processing

Document Element Types

Text Elements

Visual Elements

Parsing Through Chunking Strategies

Semantic Parsing with Chunking

Parsing Strategies

Semantic Parsing (Recommended)

Hybrid Parsing

Heading-Based Parsing

Page-Based Parsing

Response Format

Chunked Parsing Results

Advanced Parsing Features

Multi-Format Support

Element-Specific Processing

Integration Examples

Document Analysis Workflow

Best Practices

Error Handling

Core Endpoints

Job Management

​Overview

​Key Features

Element Detection

Content Extraction

Reading Order

Multi-Modal Processing

​Document Element Types

​Text Elements

​Visual Elements

​Parsing Through Chunking Strategies

​Semantic Parsing with Chunking

​Parsing Strategies

​Semantic Parsing (Recommended)

​Hybrid Parsing

​Heading-Based Parsing

​Page-Based Parsing

​Response Format

​Chunked Parsing Results

​Advanced Parsing Features

​Multi-Format Support

​Element-Specific Processing

​Integration Examples

​Document Analysis Workflow

​Best Practices

​Error Handling

​Related Endpoints

Overview

Key Features

Document Element Types

Text Elements

Visual Elements

Parsing Through Chunking Strategies

Semantic Parsing with Chunking

Parsing Strategies

Semantic Parsing (Recommended)

Hybrid Parsing

Heading-Based Parsing

Page-Based Parsing

Response Format

Chunked Parsing Results

Advanced Parsing Features

Multi-Format Support

Element-Specific Processing

Integration Examples

Document Analysis Workflow

Best Practices

Error Handling

Related Endpoints