Splitting Overview - Unsiloed AI

Splitting takes a single PDF that contains several documents bundled together and breaks them apart. Real-world bundles arrive every day (a scanned batch of receipts, a stack of student exams, a patient’s referral packet) and downstream systems usually need each logical document on its own. Given a bundled PDF and a list of candidate categories (the same shape /classify uses), /splitter returns one downloadable PDF per matched category, each with a confidence score. All pages assigned to the same category are collected into that category’s file: a bundle with two invoices produces one Invoice.pdf containing both, not two separate files.

How It Works

After you submit a PDF and categories, the API:

1. Analyzes every page of the bundle.
2. Classifies each page against your candidate categories.
3. Groups adjacent pages of the same type so multi-page documents stay together.
4. Generates one PDF per matched category, named after the category.
5. Returns a download URL and confidence score for each split file.
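
To make the grouping behavior concrete, here is a minimal local sketch of the steps above. It is not the service's implementation and makes no API calls: the per-page labels are hard-coded stand-ins for the classification step, the function names `group_adjacent` and `collect_by_category` are hypothetical, and treating an unmatched page as `None` is an assumption. It only illustrates how adjacent pages form runs and how every run for one category ends up in a single file.

```python
from collections import defaultdict
from itertools import groupby


def group_adjacent(page_labels):
    """Collapse per-page labels into runs of adjacent pages sharing a label.

    Pages are numbered from 1. Returns a list of (label, [page numbers]).
    """
    runs = []
    page = 1
    for label, pages in groupby(page_labels):
        count = len(list(pages))
        runs.append((label, list(range(page, page + count))))
        page += count
    return runs


def collect_by_category(runs):
    """Merge all runs with the same label into one output file per category.

    Assumption: a label of None marks a page that matched no category,
    and such pages are left out of every output file.
    """
    files = defaultdict(list)
    for label, pages in runs:
        if label is not None:
            files[label].extend(pages)
    return {f"{label}.pdf": pages for label, pages in files.items()}


# Hypothetical classification result for a six-page bundle.
labels = ["Invoice", "Invoice", "Receipt", "Invoice", "Receipt", "Receipt"]

runs = group_adjacent(labels)
files = collect_by_category(runs)

for name, pages in files.items():
    print(name, pages)
```

In this sketch the two separate invoices (pages 1-2 and page 4) land in the same `Invoice.pdf`, which matches the one-file-per-category behavior described in the introduction.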

Common Categories

Categories are whatever you define. Common groupings include:

Business: invoices, receipts, purchase orders, contracts
Financial: bank statements, financial reports, tax forms
Legal: contracts, agreements, legal notices, compliance forms
Healthcare: medical records, insurance forms, lab reports
HR: resumes, employment forms, payroll documents
Academic: research papers, reports, transcripts

Dig Deeper

Getting Started With Splitting

Submit a bundled PDF, define categories, and read back the split files.

Response Format

Browse the canonical splitting response shape with a field-by-field reference.

For the full request and response specification, see the Split API reference.
