Split Document - Unsiloed AI

POST /splitter

Overview

The Split Document endpoint analyzes PDF pages, classifies them into predefined categories, and creates separate PDF files for each category. This is ideal for processing mixed document batches like scanned files containing invoices, contracts, and reports.

The endpoint is asynchronous. It returns a job_id immediately and processes the document in the background. Poll GET /splitter/{job_id} to retrieve the results once the job completes.
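The polling step can be sketched as a small loop. This is a minimal sketch, not an official client: `wait_for_split` and the injected `fetch_status` callable are hypothetical names, and because this page only documents the "processing" status, the loop assumes any other status means the job has finished.

```python
import time
from typing import Callable


def wait_for_split(
    job_id: str,
    fetch_status: Callable[[str], dict],
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll a job until it leaves the "processing" state or the timeout passes.

    fetch_status is expected to perform GET /splitter/{job_id} and return
    the decoded JSON body as a dict.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        payload = fetch_status(job_id)
        # Assumption: any status other than "processing" is terminal.
        if payload.get("status") != "processing":
            return payload
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still processing after {timeout_s}s")
        sleep(interval_s)
```

Passing the HTTP call in as `fetch_status` keeps the loop independent of any particular HTTP library and makes it easy to exercise with a stub.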

Request

file

The PDF file to split. Either file or file_url must be provided; sending both returns a 400.

file_url

string

URL of a PDF file to split. Provide either file or file_url, but not both.
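Since sending both file and file_url is rejected, it can help to check the inputs before making the call. The helper below is a hypothetical sketch (`build_split_fields` is not part of any SDK). It assumes file_url is sent as an ordinary form field and that categories is sent as a JSON-encoded string, as in the curl example on this page.

```python
import json
from typing import Optional


def build_split_fields(
    categories: list,
    file_path: Optional[str] = None,
    file_url: Optional[str] = None,
) -> dict:
    """Return the text form fields for a split request.

    Exactly one of file_path or file_url must be given. The file itself
    is attached separately by whichever HTTP client sends the request.
    """
    if (file_path is None) == (file_url is None):
        raise ValueError("provide exactly one of file_path or file_url")
    # categories is a list of {"name": ..., "description": ...} objects.
    fields = {"categories": json.dumps(categories)}
    if file_url is not None:
        # Assumption: file_url travels as a plain multipart form field.
        fields["file_url"] = file_url
    return fields
```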

Response

The endpoint returns HTTP 200 with the job identifier:

job_id

string

Unique identifier for the splitting job

status

string

Current status of the job (“processing”)

quota_remaining

number

Remaining API quota after this request
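A caller might turn this initial response into a typed value before polling. The sketch below is illustrative only: `SplitJob` and `parse_split_job` are hypothetical names, and it assumes the three fields above appear at the top level of the JSON body.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SplitJob:
    job_id: str
    status: str
    quota_remaining: float


def parse_split_job(payload: dict) -> SplitJob:
    """Validate and convert the job-creation response."""
    required = ("job_id", "status", "quota_remaining")
    missing = [key for key in required if key not in payload]
    if missing:
        raise KeyError(f"response is missing fields: {missing}")
    return SplitJob(
        job_id=str(payload["job_id"]),
        status=str(payload["status"]),
        quota_remaining=float(payload["quota_remaining"]),
    )
```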

Split Result

When the job completes, GET /splitter/{job_id} returns the split files inside its result object. The fields below describe that result object:

success

boolean

Whether the splitting operation succeeded

message

string

Descriptive message about the splitting operation

files

array

Array of split PDF files with their metadata

Each entry in the files array has the following fields:

name

string

Category-derived filename of the split PDF (e.g., “invoice.pdf” for the “invoice” category)

fileId

string

Unique identifier for the file in storage

type

string

File type (always “file”)

path

string

Relative path to the file in storage

full_path

string

Presigned download URL for the split PDF file. Expires roughly an hour after the response is generated, so download the file straight away rather than storing the URL. Re-issue GET /splitter/{job_id} to get a fresh URL.

confidence_score

number

Classification confidence for this file (0-1), averaged across all pages assigned to the category
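One way to use these fields is to separate confidently classified files from those that may need manual review. This is a sketch under assumptions: `split_by_confidence` is a hypothetical helper, and the 0.75 threshold is an arbitrary illustrative value, not a recommendation from the API.

```python
def split_by_confidence(result: dict, threshold: float = 0.75):
    """Partition file names in a split result by confidence_score.

    Returns (accepted, needs_review), two lists of file names.
    """
    if not result.get("success"):
        raise RuntimeError(result.get("message", "split failed"))
    accepted, needs_review = [], []
    for entry in result.get("files", []):
        if entry["confidence_score"] >= threshold:
            accepted.append(entry["name"])
        else:
            needs_review.append(entry["name"])
    return accepted, needs_review
```

Because full_path is a short-lived presigned URL, a caller would typically download the accepted files immediately after this step rather than storing the URLs.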

Request Examples

curl -X POST "https://prod.visionapi.unsiloed.ai/splitter" \
  -H "api-key: your-api-key" \
  -F "file=@mixed_documents.pdf" \
  -F 'categories=[{"name":"invoice","description":"Business invoices with itemized charges"},{"name":"contract","description":"Legal agreements and binding documents"}]'
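The same request can be approximated in Python with only the standard library. This is a rough sketch rather than an official client: `build_split_request` is a hypothetical helper that hand-builds the multipart body the curl command above would send, and it does not escape unusual characters in the file name.

```python
import json
import urllib.request
import uuid

API_URL = "https://prod.visionapi.unsiloed.ai/splitter"


def build_split_request(
    api_key: str, pdf_name: str, pdf_bytes: bytes, categories: list
) -> urllib.request.Request:
    """Build (but do not send) a multipart/form-data POST to /splitter."""
    boundary = uuid.uuid4().hex
    # First part: the PDF file itself.
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{pdf_name}"\r\n'
        "Content-Type: application/pdf\r\n\r\n"
    ).encode("utf-8")
    # Second part: categories as a JSON string, then the closing boundary.
    tail = (
        f"\r\n--{boundary}\r\n"
        'Content-Disposition: form-data; name="categories"\r\n\r\n'
        f"{json.dumps(categories)}"
        f"\r\n--{boundary}--\r\n"
    ).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=head + pdf_bytes + tail,
        method="POST",
        headers={
            "api-key": api_key,
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )
```

The returned object can be passed to `urllib.request.urlopen` to send it, and the response body decoded with `json.load`.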

Response Examples

The example below shows the result object of a completed job, as described under Split Result.

{
  "success": true,
  "message": "Successfully split PDF into 2 files",
  "files": [
    {
      "name": "invoice.pdf",
      "fileId": "d079d09f-201c-4420-a50a-b25678a71ae9",
      "type": "file",
      "path": "invoice.pdf",
      "full_path": "https://example-bucket.s3.amazonaws.com/files/ef3ec356-b407-4f9f-ac8f-0dfdef9034c0_invoice.pdf?AWSAccessKeyId=...&Signature=...&Expires=...",
      "confidence_score": 0.8
    },
    {
      "name": "contract.pdf",
      "fileId": "320616cc-8dfd-4b8a-8474-8e7a42d9e287",
      "type": "file",
      "path": "contract.pdf",
      "full_path": "https://example-bucket.s3.amazonaws.com/files/dfaa5d30-6955-4a69-9c69-7e3c4efd8450_contract.pdf?AWSAccessKeyId=...&Signature=...&Expires=...",
      "confidence_score": 0.8
    }
  ]
}

Authorizations

api-key

string

header

required

Body

multipart/form-data

Response

200 - application/json

Split job created

job_id

string

Unique identifier for the splitting job

status

string

Current job status (typically 'processing')

quota_remaining

number

Remaining API quota after this request
