POST /splitter

Overview

The Split Document endpoint analyzes PDF pages, classifies them into predefined categories, and creates separate PDF files for each category. This is ideal for processing mixed document batches like scanned files containing invoices, contracts, and reports.
The endpoint processes documents asynchronously via a job-based system. It returns a job_id immediately and processes the document in the background. Poll the status endpoint to retrieve results when complete.
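
The polling step described above can be sketched roughly as follows. This page does not document the status endpoint's URL or its exact payload, so the sketch takes a caller-supplied `fetch_status` function (hypothetical) and assumes that any payload whose status is not "processing" means the job has finished.

```python
import time
from typing import Any, Callable, Dict


def wait_for_job(
    fetch_status: Callable[[str], Dict[str, Any]],
    job_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Poll fetch_status(job_id) until the job no longer reports "processing".

    fetch_status is a hypothetical callable that performs the status request
    and returns the decoded JSON payload as a dict.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        payload = fetch_status(job_id)
        # Assumption: a finished job either reports a different status or
        # omits the "status" field altogether.
        if payload.get("status") != "processing":
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still processing after {timeout_s}s")
        sleep(interval_s)
```

The interval and timeout defaults are arbitrary. Choose values that fit your document sizes and quota.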

Request

file
file
required unless file_url is provided
The PDF file to split. Either file or file_url must be provided.
file_url
string
URL to a PDF file to split. Either file or file_url must be provided.
categories
string
required
JSON string containing an array of category objects, each with a name and an optional description (e.g., [{"name":"invoice","description":"Financial invoices"}])
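
Because categories is sent as a JSON string rather than as nested form fields, it can help to build it programmatically. A minimal sketch, in which `build_categories` is a hypothetical helper name rather than part of any SDK:

```python
import json
from typing import Mapping, Optional


def build_categories(categories: Mapping[str, Optional[str]]) -> str:
    """Serialize {name: description-or-None} into the categories JSON string."""
    items = []
    for name, description in categories.items():
        if not name:
            raise ValueError("each category needs a non-empty name")
        item = {"name": name}
        # description is optional, so it is only emitted when provided
        if description:
            item["description"] = description
        items.append(item)
    return json.dumps(items, separators=(",", ":"))


print(build_categories({"invoice": "Financial invoices", "report": None}))
```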

Response

job_id
string
Unique identifier for the splitting job
status
string
Current status of the job (typically "processing")
message
string
Human-readable status message
quota_remaining
number
Remaining API quota after this request
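
One way to read these fields from the initial response is sketched below. The `SplitJob` class and `parse_split_job` function are hypothetical names. The sketch treats job_id and status as mandatory and the other two fields as optional, which is an assumption rather than documented behavior.

```python
import json
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SplitJob:
    job_id: str
    status: str
    message: str
    quota_remaining: Optional[float]

    @property
    def is_processing(self) -> bool:
        return self.status == "processing"


def parse_split_job(body: str) -> SplitJob:
    """Decode the JSON body returned when a split job is accepted."""
    data = json.loads(body)
    missing = [key for key in ("job_id", "status") if key not in data]
    if missing:
        raise ValueError(f"response is missing fields: {missing}")
    return SplitJob(
        job_id=data["job_id"],
        status=data["status"],
        message=data.get("message", ""),
        quota_remaining=data.get("quota_remaining"),
    )
```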

Job Status Response

Once the job completes, the status endpoint returns a JSON response describing the split PDF documents, which have been uploaded to cloud storage.
success
boolean
Whether the splitting operation succeeded
message
string
Descriptive message about the splitting operation
files
array
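Array of split PDF files with their metadata. In the example response, each entry contains name, fileId, type, path, full_path (the storage URL), and confidence_score.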
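
A completed result could be consumed along these lines. The field names follow the example response on this page. The `confident_files` helper and its 0.5 default threshold are illustrative assumptions, not API behavior.

```python
from typing import Any, Dict, List


def confident_files(result: Dict[str, Any], min_confidence: float = 0.5) -> List[Dict[str, Any]]:
    """Return the split files whose confidence_score meets the threshold."""
    if not result.get("success"):
        raise RuntimeError(result.get("message", "split failed"))
    return [
        entry
        for entry in result.get("files", [])
        # Assumption: a missing confidence_score is treated as 0.0
        if entry.get("confidence_score", 0.0) >= min_confidence
    ]


result = {
    "success": True,
    "message": "Successfully split PDF into 2 files",
    "files": [
        {"name": "invoice.pdf", "confidence_score": 0.8},
        {"name": "contract.pdf", "confidence_score": 0.8},
    ],
}
for entry in confident_files(result):
    print(entry["name"], entry["confidence_score"])
```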

Request Examples

curl -X POST "https://prod.visionapi.unsiloed.ai/splitter" \
  -H "api-key: your-api-key" \
  -F "file=@mixed_documents.pdf" \
  -F 'categories=[{"name":"invoice","description":"Business invoices with itemized charges"},{"name":"contract","description":"Legal agreements and binding documents"}]'
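
For callers not using curl, the same request can be approximated with the Python standard library. This is a sketch under stated assumptions: `encode_multipart` and `build_split_request` are hypothetical helpers, the multipart body is assembled by hand, and the filename is assumed to be plain ASCII without quotes.

```python
import json
import urllib.request
import uuid
from typing import Dict, List, Tuple

SPLITTER_URL = "https://prod.visionapi.unsiloed.ai/splitter"


def encode_multipart(pdf_bytes: bytes, filename: str, categories_json: str) -> Tuple[bytes, str]:
    """Build a multipart/form-data body with the categories and file fields."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="categories"\r\n'
        "\r\n"
        f"{categories_json}\r\n"
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: application/pdf\r\n"
        "\r\n"
    ).encode("utf-8")
    tail = f"\r\n--{boundary}--\r\n".encode("utf-8")
    return head + pdf_bytes + tail, f"multipart/form-data; boundary={boundary}"


def build_split_request(
    api_key: str, pdf_bytes: bytes, filename: str, categories: List[Dict[str, str]]
) -> urllib.request.Request:
    """Prepare (but do not send) the POST /splitter request."""
    body, content_type = encode_multipart(pdf_bytes, filename, json.dumps(categories))
    return urllib.request.Request(
        SPLITTER_URL,
        data=body,
        method="POST",
        headers={"api-key": api_key, "Content-Type": content_type},
    )
```

Passing the prepared request to `urllib.request.urlopen` sends it. Keeping construction separate from sending makes the request easy to inspect before spending quota.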

Response Examples

{
  "success": true,
  "message": "Successfully split PDF into 2 files",
  "files": [
    {
      "name": "invoice.pdf",
      "fileId": "d079d09f-201c-4420-a50a-b25678a71ae9",
      "type": "file",
      "path": "invoice.pdf",
      "full_path": "https://lyltzyvtloozzovxrupp.supabase.co/storage/v1/object/public/files/ef3ec356-b407-4f9f-ac8f-0dfdef9034c0_invoice.pdf?",
      "confidence_score": 0.8
    },
    {
      "name": "contract.pdf",
      "fileId": "320616cc-8dfd-4b8a-8474-8e7a42d9e287",
      "type": "file",
      "path": "contract.pdf",
      "full_path": "https://lyltzyvtloozzovxrupp.supabase.co/storage/v1/object/public/files/dfaa5d30-6955-4a69-9c69-7e3c4efd8450_contract.pdf?",
      "confidence_score": 0.8
    }
  ]
}

Authorizations

api-key
string
header
required

Body

multipart/form-data
categories
string
required

JSON string containing an array of category objects, each with a name and an optional description. Example: [{"name":"invoice","description":"Business invoices with itemized charges"},{"name":"contract","description":"Legal agreements and binding documents"},{"name":"report"}]

file
file

PDF file to split. Either file or file_url must be provided.

file_url
string

URL to a PDF file to split. Either file or file_url must be provided. Example: https://example.com/mixed_documents.pdf
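
The "either file or file_url" rule can be checked client-side before a request is sent. In this sketch `source_field` is a hypothetical helper. Rejecting both inputs together and accepting only http(s) URLs are assumptions, since this page does not say how the API handles those cases.

```python
from typing import Optional, Tuple
from urllib.parse import urlparse


def source_field(file_path: Optional[str] = None, file_url: Optional[str] = None) -> Tuple[str, str]:
    """Return the (form field name, value) pair identifying the PDF source."""
    if file_path and file_url:
        # Assumption: sending both is treated as a client error here
        raise ValueError("pass only one of file_path or file_url")
    if file_path:
        return ("file", file_path)
    if file_url:
        # Assumption: only http and https URLs are accepted
        if urlparse(file_url).scheme not in ("http", "https"):
            raise ValueError("file_url must be an http(s) URL")
        return ("file_url", file_url)
    raise ValueError("either file or file_url must be provided")


print(source_field(file_url="https://example.com/mixed_documents.pdf"))
```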

Response

202 - application/json

Accepted - split job started

job_id
string

Unique identifier for the splitting job

status
string

Current job status (typically 'processing')

message
string

Status message about the job

quota_remaining
number

Remaining API quota after this request