Splitting takes a single PDF that contains several documents bundled together and breaks them apart. Real-world bundles arrive every day (a scanned batch of receipts, a stack of student exams, a patient’s referral packet) and downstream systems usually need each logical document on its own.

Given a bundled PDF and a list of candidate categories (the same shape /classify uses), /splitter returns one downloadable PDF per matched category, each with a confidence score. All pages assigned to the same category are collected into that category’s file: a bundle with two invoices produces one Invoice.pdf containing both, not two separate files.

How It Works

After you submit a PDF and categories, the API:
  1. Analyzes every page of the bundle.
  2. Classifies each page against your candidate categories.
  3. Groups adjacent pages of the same type so multi-page documents stay together.
  4. Generates one PDF per matched category, named after the category (for example, Invoice.pdf) and containing every page assigned to it.
  5. Returns a download URL and confidence score for each split file.
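
The grouping and collection steps above can be sketched locally. This is a minimal illustration, not the service's implementation: the function name split_plan, the (category, confidence) input shape, the rule that unmatched pages produce no file, and the use of a mean for the per-file confidence are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import groupby


def split_plan(page_labels):
    """Turn per-page (category, confidence) labels into one entry per category.

    page_labels is in page order; category is None for an unmatched page.
    Hypothetical helper: the real API does this work server-side.
    """
    plan = defaultdict(lambda: {"pages": [], "scores": []})
    page_number = 1
    # Walk runs of adjacent pages that share a label (step 3).
    for category, run in groupby(page_labels, key=lambda label: label[0]):
        scores = [score for _, score in run]
        pages = range(page_number, page_number + len(scores))
        page_number += len(scores)
        if category is None:
            continue  # assumption: unmatched pages produce no file
        # Collect every run of a category into that category's file (step 4).
        plan[category]["pages"].extend(pages)
        plan[category]["scores"].extend(scores)
    # Assumption: file confidence is the mean of its page confidences.
    return {
        f"{category}.pdf": {
            "pages": entry["pages"],
            "confidence": sum(entry["scores"]) / len(entry["scores"]),
        }
        for category, entry in plan.items()
    }


labels = [("Invoice", 0.9), ("Invoice", 0.8), ("Receipt", 0.7), ("Invoice", 1.0)]
plan = split_plan(labels)
print(plan["Invoice.pdf"]["pages"])  # [1, 2, 4]
```

Both invoices end up in the single Invoice.pdf entry even though a receipt sits between them, which matches the one-file-per-category behavior described above.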

Common Categories

Categories are whatever you define. Common groupings include:
  • Business: invoices, receipts, purchase orders, contracts
  • Financial: bank statements, financial reports, tax forms
  • Legal: contracts, agreements, legal notices, compliance forms
  • Healthcare: medical records, insurance forms, lab reports
  • HR: resumes, employment forms, payroll documents
  • Academic: research papers, reports, transcripts
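
Because each output file is named after its category, it may help to check a category list for blank or duplicate names before submitting it. The sketch below does that and serializes the result. The field names "name" and "description" and the helper build_categories are hypothetical; the actual category shape is defined in the Split API reference.

```python
import json


def build_categories(names_to_descriptions):
    """Validate category names and return a list of category objects.

    Hypothetical shape: the "name" and "description" keys are assumptions,
    not the documented request format.
    """
    names = [name.strip() for name in names_to_descriptions]
    if any(not name for name in names):
        raise ValueError("category names must not be blank")
    # Compare case-insensitively so "Invoice" and "invoice" do not collide.
    lowered = [name.lower() for name in names]
    if len(set(lowered)) != len(lowered):
        raise ValueError("category names must be unique")
    return [
        {"name": name.strip(), "description": description}
        for name, description in names_to_descriptions.items()
    ]


categories = build_categories({
    "Invoice": "Bills issued by a vendor for goods or services",
    "Receipt": "Proof of payment for a completed purchase",
})
print(json.dumps(categories, indent=2))
```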

Dig Deeper

Getting Started With Splitting

Submit a bundled PDF, define categories, and read back the split files.

Response Format

Browse the canonical splitting response shape with a field-by-field reference.
For the full request and response specification, see the Split API reference.