Split Document
Split PDF documents by classifying pages into different categories
Overview
The Split Document endpoint analyzes PDF pages, classifies them into predefined categories, and creates separate PDF files for each category. This is ideal for processing mixed document batches like scanned files containing invoices, contracts, and reports.
Request
JSON string containing an array of category objects (for example, [{"name":"invoice","description":"Financial invoices"}]). Descriptions help the classifier disambiguate similar categories. Categories that match no pages are skipped; no file is created for them.
Response
The endpoint returns HTTP 200 with the job identifier:
Split Result
When the job completes, GET /splitter/{job_id} returns the split files inside its result object. The fields below describe that result object:
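Retrieving the result usually means calling GET /splitter/{job_id} repeatedly until the job has completed. The sketch below shows one way to structure that loop. It is illustrative only: `wait_for_split_result` and `fetch_job` are hypothetical names, the HTTP call itself is left to the caller, and the code assumes the decoded response carries a `result` key that stays empty until the job completes. This page does not state how completion or failure is signaled, so adapt the check to the actual response.

```python
import time


def wait_for_split_result(fetch_job, job_id, interval=2.0, max_attempts=30,
                          sleep=time.sleep):
    """Poll a split job until its payload carries a result object.

    fetch_job is a caller-supplied function (hypothetical) that performs
    GET /splitter/{job_id} and returns the decoded JSON body as a dict.
    """
    for attempt in range(max_attempts):
        payload = fetch_job(job_id)
        # Assumption: the result object lives under a "result" key and is
        # missing or empty while the job is still running.
        result = payload.get("result")
        if result:
            return result
        # Wait between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(
        f"job {job_id} did not complete after {max_attempts} attempts"
    )
```

Passing `fetch_job` in as a parameter keeps the loop independent of any particular HTTP client and of the authorization scheme, neither of which is specified here.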
Request Examples
Response Examples
Authorizations
Body
JSON string containing an array of category objects, each with a name and an optional description. Example: [{"name":"invoice","description":"Business invoices with itemized charges"},{"name":"contract","description":"Legal agreements and binding documents"},{"name":"report"}]
PDF file to split. Either file or file_url must be provided.
URL to a PDF file to split. Either file or file_url must be provided. Example: https://example.com/mixed_documents.pdf
Reorder pages within each category after classification, using content and page numbers to infer the logical order. Only applied to categories that match more than one page.
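The body parameters above can be assembled in Python before the request is sent. The following is a minimal sketch rather than an official client: `build_split_fields` is a hypothetical helper, and the form field name `categories` is an assumption, since this page describes the parameter without naming it. Only `file` and `file_url` are names taken from the descriptions above. The helper serializes the category list to a JSON string and checks that a PDF source was supplied.

```python
import json


def build_split_fields(categories, file_url=None, has_file=False):
    """Return the text form fields for a split request (hypothetical helper).

    categories: list of dicts with a "name" and an optional "description".
    file_url:   URL of the PDF to split, if no file is uploaded.
    has_file:   True when the caller attaches the PDF as the "file" part.
    """
    cleaned = []
    for cat in categories:
        if not cat.get("name"):
            raise ValueError("every category needs a name")
        entry = {"name": cat["name"]}
        # The description is optional, so include it only when present.
        if cat.get("description"):
            entry["description"] = cat["description"]
        cleaned.append(entry)

    # Either file or file_url must be provided.
    if not has_file and not file_url:
        raise ValueError("provide either file or file_url")

    # Assumed field name: "categories" is not confirmed by this page.
    fields = {"categories": json.dumps(cleaned)}
    if file_url:
        fields["file_url"] = file_url
    return fields


fields = build_split_fields(
    [
        {"name": "invoice", "description": "Business invoices with itemized charges"},
        {"name": "report"},
    ],
    file_url="https://example.com/mixed_documents.pdf",
)
```

The returned dictionary can then be passed as form data to whichever HTTP client is in use. The helper does not decide what happens when both `file` and `file_url` are supplied, because this page does not say how the endpoint treats that case.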

