# Getting Started With Splitting

> Submit a bundled PDF to /splitter and download one PDF per matched category.

<Note>
  Mail rooms, AP scans, and patient intake forms often arrive as one big PDF with several different documents stacked together. The `/splitter` endpoint takes that bundle and a list of categories, then returns one labeled PDF per matched category. For other endpoints, see the [Parse quickstart](/quickstart), [Extraction quickstart](/document-processing/extraction/quickstart), or [Classification quickstart](/document-processing/classification/quickstart).
</Note>

This walkthrough builds a script that ships a bundled PDF and a category list to `/splitter`, waits for the split to complete, and writes one labeled PDF per matched category into a local `split_files/` directory, ready to drop into per-category downstream pipelines. If you'd rather just copy the whole script, it's in the dropdown below.

<Accordion title="Show the Full Script">
  Set `UNSILOED_API_KEY` in your environment and save the bundled PDF as `bundle.pdf` in the same directory before running.

  <Tabs>
    <Tab title="Python">
      ```python split_bundle.py theme={null}
      import json
      import os
      import time
      import requests

      API_KEY = os.environ["UNSILOED_API_KEY"]
      BASE_URL = "https://prod.visionapi.unsiloed.ai"

      categories = [
          {"name": "Invoice", "description": "Vendor invoices with itemized charges and a total due"},
          {"name": "Receipt", "description": "Point-of-sale receipts with line items, tax, and payment method"},
          {"name": "Purchase Order", "description": "Buyer-issued purchase orders authorizing goods or services from a vendor"},
      ]

      with open("bundle.pdf", "rb") as f:
          response = requests.post(
              f"{BASE_URL}/splitter",
              headers={"api-key": API_KEY},
              files={"file": ("bundle.pdf", f, "application/pdf")},
              data={"categories": json.dumps(categories)},
          )
      response.raise_for_status()

      job_id = response.json()["job_id"]
      print(f"Job submitted: {job_id}")

      max_attempts = 60  # roughly 5 minutes at 5 seconds per poll
      attempts = 0
      while True:
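          poll = requests.get(
              f"{BASE_URL}/splitter/{job_id}",
              headers={"api-key": API_KEY},
          )
          poll.raise_for_status()  # surface HTTP errors (e.g. 404 for an unknown job_id) before reading 'status'
          result = poll.json()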
          print(f"Status: {result['status']}")
          if result["status"] == "completed":
              break
          if result["status"] == "failed":
              raise RuntimeError(result.get("error", "split job failed"))
          attempts += 1
          if attempts >= max_attempts:
              raise TimeoutError("Split job did not finish within 5 minutes")
          time.sleep(5)

      with open("result.json", "w") as f:
          json.dump(result, f, indent=2)

      os.makedirs("split_files", exist_ok=True)
      for file_info in result["result"]["files"]:
          download = requests.get(file_info["full_path"])
          download.raise_for_status()  # don't save an error body as a PDF
          pdf_bytes = download.content
          out_path = os.path.join("split_files", file_info["name"])
          with open(out_path, "wb") as out:
              out.write(pdf_bytes)
          print(f"Saved {out_path} (confidence={file_info['confidence_score']:.2%})")
      ```
    </Tab>

    <Tab title="JavaScript">
      Save this as `script.mjs` or set `"type": "module"` in your `package.json`. Requires Node.js 18 or newer for the global `fetch`, `FormData`, and `Blob`.

      ```javascript script.mjs theme={null}
      import fs from "node:fs";
      import path from "node:path";

      const API_KEY = process.env.UNSILOED_API_KEY;
      const BASE_URL = "https://prod.visionapi.unsiloed.ai";

      const categories = [
        { name: "Invoice", description: "Vendor invoices with itemized charges and a total due" },
        { name: "Receipt", description: "Point-of-sale receipts with line items, tax, and payment method" },
        { name: "Purchase Order", description: "Buyer-issued purchase orders authorizing goods or services from a vendor" },
      ];

      const form = new FormData();
      form.append("file", new Blob([fs.readFileSync("bundle.pdf")]), "bundle.pdf");
      form.append("categories", JSON.stringify(categories));

      const response = await fetch(`${BASE_URL}/splitter`, {
        method: "POST",
        headers: { "api-key": API_KEY },
        body: form,
      });
      if (!response.ok) throw new Error(`${response.status}: ${await response.text()}`);

      const { job_id } = await response.json();
      console.log(`Job submitted: ${job_id}`);

      const maxAttempts = 60; // roughly 5 minutes at 5 seconds per poll
      let attempts = 0;
      let result;
      while (true) {
        const res = await fetch(`${BASE_URL}/splitter/${job_id}`, {
          headers: { "api-key": API_KEY },
        });
        if (!res.ok) throw new Error(`${res.status}: ${await res.text()}`);
        result = await res.json();
        console.log(`Status: ${result.status}`);
        if (result.status === "completed") break;
        if (result.status === "failed") throw new Error(result.error || "split job failed");
        if (++attempts >= maxAttempts) throw new Error("Split job did not finish within 5 minutes");
        await new Promise((r) => setTimeout(r, 5000));
      }

      fs.writeFileSync("result.json", JSON.stringify(result, null, 2));

      fs.mkdirSync("split_files", { recursive: true });
      for (const file of result.result.files) {
        const res = await fetch(file.full_path);
        if (!res.ok) throw new Error(`Download failed for ${file.name}: ${res.status}`);
        const buf = Buffer.from(await res.arrayBuffer());
        const outPath = path.join("split_files", file.name);
        fs.writeFileSync(outPath, buf);
        console.log(`Saved ${outPath} (confidence=${(file.confidence_score * 100).toFixed(2)}%)`);
      }
      ```
    </Tab>

    <Tab title="cURL">
      ```bash theme={null}
      # Submit the bundle and capture the job_id from the response:
      resp=$(curl -sX POST "https://prod.visionapi.unsiloed.ai/splitter" \
        -H "api-key: $UNSILOED_API_KEY" \
        -F "file=@bundle.pdf" \
        -F 'categories=[{"name":"Invoice","description":"Vendor invoices with itemized charges and a total due"},{"name":"Receipt","description":"Point-of-sale receipts with line items, tax, and payment method"},{"name":"Purchase Order","description":"Buyer-issued purchase orders authorizing goods or services from a vendor"}]')
      JOB_ID=$(echo "$resp" | grep -o '"job_id":"[^"]*"' | cut -d'"' -f4)
      echo "Job submitted: $JOB_ID"

      # Poll until the job finishes, with a 5-minute timeout:
      attempts=0
      max_attempts=60
      while true; do
        resp=$(curl -sX GET "https://prod.visionapi.unsiloed.ai/splitter/$JOB_ID" \
          -H "api-key: $UNSILOED_API_KEY")
        status=$(echo "$resp" | grep -o '"status":"[^"]*"' | head -1 | cut -d'"' -f4)
        echo "Status: $status"
        [ "$status" = "completed" ] && break
        [ "$status" = "failed" ] && { echo "Job failed"; exit 1; }
        attempts=$((attempts + 1))
        [ "$attempts" -ge "$max_attempts" ] && { echo "Split job did not finish within 5 minutes"; exit 1; }
        sleep 5
      done

      # Save the full response and download each split file:
      echo "$resp" > result.json
      mkdir -p split_files
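      # Print the URL first: names like "Purchase Order.pdf" contain spaces,
      # so the name has to be the last field that `read` consumes.
      echo "$resp" \
        | python3 -c "import json,sys; [print(f['full_path'], f['name']) for f in json.load(sys.stdin)['result']['files']]" \
        | while read -r url name; do
            curl -s "$url" -o "split_files/$name"
            echo "Saved split_files/$name"
          done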
      ```
    </Tab>
  </Tabs>
</Accordion>

## Step 1: Set Up Your Environment

Before writing any code, we need three things: an API key, a bundled PDF, and the runtime for our chosen language.

### 1.1 Get an Unsiloed AI API Key

To get API access, [sign up on Unsiloed AI](https://cal.com/aman-mishra-p0ry57/15min). Export your key as an environment variable named `UNSILOED_API_KEY` so it stays out of source control:

```bash theme={null}
export UNSILOED_API_KEY="your-api-key"
```

### 1.2 Pick a Bundled PDF

The `/splitter` endpoint is designed for PDFs that contain more than one logical document. This walkthrough assumes a multi-document PDF saved as `bundle.pdf` in your working directory.

If you don't have one handy, download our [sample bundle](https://raw.githubusercontent.com/Unsiloed-AI/cookbook/c585446e46e4be2790c6c29fe2a7a3a1b346191d/sample-documents/sample-split.pdf) (a three-page accounts-payable batch scan: an invoice, a receipt, and a purchase order) and save it as `bundle.pdf`.

### 1.3 Install Dependencies

<Tabs>
  <Tab title="Python">
    You need Python 3.8 or newer. Install the `requests` package:

    ```bash theme={null}
    pip install requests
    ```
  </Tab>

  <Tab title="JavaScript">
    You need Node.js 18 or newer for the global `fetch`, `FormData`, and `Blob`. No external packages needed.
  </Tab>

  <Tab title="cURL">
    You need cURL, which is preinstalled on macOS and most Linux distributions. We also use Python 3 for one short JSON-parsing line in the download loop.
  </Tab>
</Tabs>

## Step 2: Submit the Bundle

The request carries two form fields: `file` for the bundled PDF and `categories` for a JSON-stringified array of the labels the splitter can choose from. The categories list is the splitter's entire vocabulary: pages that don't fit any category are still grouped under the closest match, so the list needs to cover everything that could plausibly appear in the bundle. The endpoint returns a `job_id` to poll. All requests go to `https://prod.visionapi.unsiloed.ai` with the API key in the `api-key` header.

### 2.1 Set Up the Script

<Tabs>
  <Tab title="Python">
    Create a file called `split_bundle.py` and start with the imports and configuration:

    ```python split_bundle.py theme={null}
    import json
    import os
    import time
    import requests

    API_KEY = os.environ["UNSILOED_API_KEY"]
    BASE_URL = "https://prod.visionapi.unsiloed.ai"
    ```

    `API_KEY` reads your key from the environment so it doesn't get hard-coded into the file, and `BASE_URL` points at the Unsiloed AI production endpoint. Both appear in every request below.
  </Tab>

  <Tab title="JavaScript">
    Create a file called `script.mjs` and start with the imports and configuration:

    ```javascript script.mjs theme={null}
    import fs from "node:fs";
    import path from "node:path";

    const API_KEY = process.env.UNSILOED_API_KEY;
    const BASE_URL = "https://prod.visionapi.unsiloed.ai";
    ```

    `API_KEY` reads your key from the environment so it doesn't get hard-coded into the file, and `BASE_URL` points at the Unsiloed AI production endpoint. Both appear in every request below.
  </Tab>

  <Tab title="cURL">
    cURL doesn't need a setup step. Each command below inlines the API key and base URL directly.
  </Tab>
</Tabs>

### 2.2 Define the Categories

Decide which document types the bundle might contain. Each category is an object with a `name` and an optional `description`; richer descriptions help the splitter pick the right label when categories are similar.

<Tabs>
  <Tab title="Python">
    Add the category list to the script:

    ```python split_bundle.py theme={null}
    categories = [
        {"name": "Invoice", "description": "Vendor invoices with itemized charges and a total due"},
        {"name": "Receipt", "description": "Point-of-sale receipts with line items, tax, and payment method"},
        {"name": "Purchase Order", "description": "Buyer-issued purchase orders authorizing goods or services from a vendor"},
    ]
    ```
  </Tab>

  <Tab title="JavaScript">
    Add the category list to the script:

    ```javascript script.mjs theme={null}
    const categories = [
      { name: "Invoice", description: "Vendor invoices with itemized charges and a total due" },
      { name: "Receipt", description: "Point-of-sale receipts with line items, tax, and payment method" },
      { name: "Purchase Order", description: "Buyer-issued purchase orders authorizing goods or services from a vendor" },
    ];
    ```
  </Tab>

  <Tab title="cURL">
    For cURL we pass the categories inline as a JSON string on the request itself, so there's nothing to set up here. The submission command in the next step shows the full form.
  </Tab>
</Tabs>

Include every document type the bundle might contain. Pages that don't match any category are still grouped under the closest match.
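
Because the list travels as a JSON string, a mistake in it only surfaces as a `400` from the server. The following is a minimal sketch of a local pre-flight check, not part of any Unsiloed SDK: `encode_categories` is a hypothetical helper name, and the rules it enforces (a non-empty list, a non-empty `name`, an optional string `description`, no duplicate names) are assumptions drawn from this page rather than the API's full validation.

```python
import json


def encode_categories(categories):
    """Validate a category list locally and return the JSON string for the form field.

    Hypothetical helper: the checks mirror what this page describes, not the
    server's complete validation rules.
    """
    if not categories:
        raise ValueError("at least one category is required")
    seen = set()
    for index, category in enumerate(categories):
        name = category.get("name")
        if not isinstance(name, str) or not name.strip():
            raise ValueError(f"category {index} needs a non-empty 'name'")
        if name in seen:
            # Assumption: duplicate names are treated as a mistake by this sketch.
            raise ValueError(f"duplicate category name: {name!r}")
        seen.add(name)
        description = category.get("description")
        if description is not None and not isinstance(description, str):
            raise ValueError(f"category {name!r}: 'description' must be a string")
    return json.dumps(categories)


payload = encode_categories([
    {"name": "Invoice", "description": "Vendor invoices with itemized charges and a total due"},
    {"name": "Receipt"},  # description is optional
])
```

In the Python script, passing `data={"categories": encode_categories(categories)}` would take the place of the direct `json.dumps` call.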

### 2.3 Upload the Bundle

Send the file and categories as a multipart upload to `/splitter`. The endpoint expects the document under the form field name `file` and the categories as a JSON-encoded string under `categories`.

<Tabs>
  <Tab title="Python">
    Continue the script by uploading the bundle:

    ```python split_bundle.py theme={null}
    with open("bundle.pdf", "rb") as f:
        response = requests.post(
            f"{BASE_URL}/splitter",
            headers={"api-key": API_KEY},
            files={"file": ("bundle.pdf", f, "application/pdf")},
            data={"categories": json.dumps(categories)},
        )
    response.raise_for_status()
    ```

    The `raise_for_status()` call raises an `HTTPError` on any 4xx or 5xx response, so we don't need to check `.status_code` ourselves.
  </Tab>

  <Tab title="JavaScript">
    Continue the script by uploading the bundle:

    ```javascript script.mjs theme={null}
    const form = new FormData();
    form.append("file", new Blob([fs.readFileSync("bundle.pdf")]), "bundle.pdf");
    form.append("categories", JSON.stringify(categories));

    const response = await fetch(`${BASE_URL}/splitter`, {
      method: "POST",
      headers: { "api-key": API_KEY },
      body: form,
    });
    if (!response.ok) throw new Error(`${response.status}: ${await response.text()}`);
    ```

    `fetch` resolves normally even on 4xx and 5xx responses, so we check `response.ok` and throw the error ourselves.
  </Tab>

  <Tab title="cURL">
    Run:

    ```bash theme={null}
    curl -X POST "https://prod.visionapi.unsiloed.ai/splitter" \
      -H "api-key: $UNSILOED_API_KEY" \
      -F "file=@bundle.pdf" \
      -F 'categories=[{"name":"Invoice","description":"Vendor invoices with itemized charges and a total due"},{"name":"Receipt","description":"Point-of-sale receipts with line items, tax, and payment method"},{"name":"Purchase Order","description":"Buyer-issued purchase orders authorizing goods or services from a vendor"}]'
    ```

    The response prints to stdout. We need the `job_id` field for the next step.
  </Tab>
</Tabs>

### 2.4 Capture the Job ID

<Tabs>
  <Tab title="Python">
    Read and print the `job_id`:

    ```python split_bundle.py theme={null}
    job_id = response.json()["job_id"]
    print(f"Job submitted: {job_id}")
    ```

    Run the script:

    ```bash theme={null}
    python split_bundle.py
    ```

    The output should be a single line like `Job submitted: 887f26e6-d089-47f6-8def-afe84de40ecd`.
  </Tab>

  <Tab title="JavaScript">
    Read and log the `job_id`:

    ```javascript script.mjs theme={null}
    const { job_id } = await response.json();
    console.log(`Job submitted: ${job_id}`);
    ```

    Run the script:

    ```bash theme={null}
    node script.mjs
    ```

    The output should be a single line like `Job submitted: 887f26e6-d089-47f6-8def-afe84de40ecd`.
  </Tab>

  <Tab title="cURL">
    The response body from the POST above looks like:

    ```json theme={null}
    {
      "job_id": "887f26e6-d089-47f6-8def-afe84de40ecd",
      "status": "processing",
      "quota_remaining": 7698
    }
    ```

    Copy the `job_id` value; you'll paste it into the polling command in the next step.
  </Tab>
</Tabs>

## Step 3: Poll and Download the Split Files

The job runs asynchronously. We GET `/splitter/{job_id}` repeatedly until the status is `completed`, then download each split PDF using the presigned `full_path` URL in the response.

The status values the polling loop handles:

* **`completed`:** the split files are ready to download
* **`failed`:** the job errored; check the `error` field for details
* **`queued`:** the job is waiting to be picked up
* **`processing`:** the job is still running

### 3.1 Write the Polling Loop

<Tabs>
  <Tab title="Python">
    Add a polling loop. The `max_attempts` cap stops the loop if the job hangs:

    ```python split_bundle.py theme={null}
    max_attempts = 60  # roughly 5 minutes at 5 seconds per poll
    attempts = 0
    while True:
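        poll = requests.get(
            f"{BASE_URL}/splitter/{job_id}",
            headers={"api-key": API_KEY},
        )
        poll.raise_for_status()  # surface HTTP errors (e.g. 404 for an unknown job_id) before reading 'status'
        result = poll.json()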
        print(f"Status: {result['status']}")
        if result["status"] == "completed":
            break
        if result["status"] == "failed":
            raise RuntimeError(result.get("error", "split job failed"))
        attempts += 1
        if attempts >= max_attempts:
            raise TimeoutError("Split job did not finish within 5 minutes")
        time.sleep(5)
    ```
  </Tab>

  <Tab title="JavaScript">
    Add a polling loop. The `maxAttempts` cap stops the loop if the job hangs:

    ```javascript script.mjs theme={null}
    const maxAttempts = 60; // roughly 5 minutes at 5 seconds per poll
    let attempts = 0;
    let result;
    while (true) {
      const res = await fetch(`${BASE_URL}/splitter/${job_id}`, {
        headers: { "api-key": API_KEY },
      });
      if (!res.ok) throw new Error(`${res.status}: ${await res.text()}`);
      result = await res.json();
      console.log(`Status: ${result.status}`);
      if (result.status === "completed") break;
      if (result.status === "failed") throw new Error(result.error || "split job failed");
      if (++attempts >= maxAttempts) throw new Error("Split job did not finish within 5 minutes");
      await new Promise((r) => setTimeout(r, 5000));
    }
    ```
  </Tab>

  <Tab title="cURL">
    Set `JOB_ID` below to the value you copied in Step 2.4, then run this loop. It polls every 5 seconds and gives up after 5 minutes if the job hasn't completed:

    ```bash theme={null}
    JOB_ID="paste-job-id-here"
    attempts=0
    max_attempts=60  # roughly 5 minutes at 5 seconds per poll

    while true; do
      resp=$(curl -sX GET "https://prod.visionapi.unsiloed.ai/splitter/$JOB_ID" \
        -H "api-key: $UNSILOED_API_KEY")
      status=$(echo "$resp" | grep -o '"status":"[^"]*"' | head -1 | cut -d'"' -f4)
      echo "Status: $status"
      [ "$status" = "completed" ] && break
      [ "$status" = "failed" ] && { echo "Job failed"; exit 1; }
      attempts=$((attempts + 1))
      [ "$attempts" -ge "$max_attempts" ] && { echo "Split job did not finish within 5 minutes"; exit 1; }
      sleep 5
    done
    ```

    The loop keeps the latest response body in `$resp` for the next step.
  </Tab>
</Tabs>
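
The loops above mix the HTTP call with the status handling. Separating the two makes the handling testable without a network. The sketch below assumes a hypothetical `fetch_status` callable that returns the parsed JSON body of `GET /splitter/{job_id}`. Treating an unrecognized status as an error is a design choice of this sketch, not documented API behavior.

```python
import time

IN_PROGRESS = {"queued", "processing"}


def wait_for_job(fetch_status, max_attempts=60, delay=5.0, sleep=time.sleep):
    """Poll until the job completes, fails, or the attempt budget runs out.

    fetch_status: hypothetical zero-argument callable returning the job JSON as a dict.
    sleep: injectable so tests can skip the real delay.
    """
    for attempt in range(max_attempts):
        job = fetch_status()
        status = job.get("status")
        if status == "completed":
            return job
        if status == "failed":
            raise RuntimeError(job.get("error") or "split job failed")
        if status not in IN_PROGRESS:
            # Assumption of this sketch: surface unexpected values instead of looping on them.
            raise RuntimeError(f"unexpected status: {status!r}")
        if attempt < max_attempts - 1:
            sleep(delay)
    raise TimeoutError(f"job still running after {max_attempts} polls")


# Offline demonstration with canned responses instead of real HTTP calls.
canned = iter([
    {"status": "queued"},
    {"status": "processing"},
    {"status": "completed", "result": {"files": []}},
])
finished = wait_for_job(lambda: next(canned), sleep=lambda seconds: None)
```

In the walkthrough script, `fetch_status` could be a small function wrapping the `requests.get(...)` call from the polling loop.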

### 3.2 Download the Split PDFs

Each entry in `result.result.files` has a presigned `full_path` URL that downloads the split PDF. The code below saves the metadata to `result.json` and writes each split file into a `split_files/` directory.

<Tabs>
  <Tab title="Python">
    Add the download step:

    ```python split_bundle.py theme={null}
    with open("result.json", "w") as f:
        json.dump(result, f, indent=2)

    os.makedirs("split_files", exist_ok=True)
    for file_info in result["result"]["files"]:
        download = requests.get(file_info["full_path"])
        download.raise_for_status()  # don't save an error body as a PDF
        pdf_bytes = download.content
        out_path = os.path.join("split_files", file_info["name"])
        with open(out_path, "wb") as out:
            out.write(pdf_bytes)
        print(f"Saved {out_path} (confidence={file_info['confidence_score']:.2%})")
    ```

    Run the script:

    ```bash theme={null}
    python split_bundle.py
    ```

    You should see a few `Status: processing` lines, then `Status: completed`, then one `Saved` line per matched category.
  </Tab>

  <Tab title="JavaScript">
    Add the download step:

    ```javascript script.mjs theme={null}
    fs.writeFileSync("result.json", JSON.stringify(result, null, 2));

    fs.mkdirSync("split_files", { recursive: true });
    for (const file of result.result.files) {
      const res = await fetch(file.full_path);
      if (!res.ok) throw new Error(`Download failed for ${file.name}: ${res.status}`);
      const buf = Buffer.from(await res.arrayBuffer());
      const outPath = path.join("split_files", file.name);
      fs.writeFileSync(outPath, buf);
      console.log(`Saved ${outPath} (confidence=${(file.confidence_score * 100).toFixed(2)}%)`);
    }
    ```

    Run the script:

    ```bash theme={null}
    node script.mjs
    ```

    You should see a few `Status: processing` lines, then `Status: completed`, then one `Saved` line per matched category.
  </Tab>

  <Tab title="cURL">
    The polling loop in Step 3.1 left the full response in `$resp`. Save the metadata, then download each split file using the presigned `full_path` URLs:

    ```bash theme={null}
    echo "$resp" > result.json
    mkdir -p split_files

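    # Print the URL first: names like "Purchase Order.pdf" contain spaces,
    # so the name has to be the last field that `read` consumes.
    echo "$resp" \
      | python3 -c "import json,sys; [print(f['full_path'], f['name']) for f in json.load(sys.stdin)['result']['files']]" \
      | while read -r url name; do
          curl -s "$url" -o "split_files/$name"
          echo "Saved split_files/$name"
        done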
    ```

    The `split_files/` directory now holds one PDF per matched category, each named after its category.
  </Tab>
</Tabs>
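
The download loops above write each file under the `name` the API returns. One possible refinement, sketched below, is to work out the local paths before any download starts, so the plan can be inspected or tested on its own. `plan_downloads` is a hypothetical helper. Keeping only the basename of each name is a precaution of this sketch, since nothing on this page suggests names ever contain path separators.

```python
import os


def plan_downloads(job, out_dir="split_files"):
    """Return (url, local_path) pairs for every split file in a completed job.

    Uses only the basename of each returned name so a name containing path
    separators cannot escape out_dir (a defensive assumption of this sketch).
    """
    plan = []
    for file_info in job["result"]["files"]:
        safe_name = os.path.basename(file_info["name"])
        if not safe_name:
            raise ValueError(f"unusable file name: {file_info['name']!r}")
        plan.append((file_info["full_path"], os.path.join(out_dir, safe_name)))
    return plan


# Trimmed copy of the sample response; the URLs are the page's elided placeholders.
sample_job = {
    "result": {
        "files": [
            {"name": "Invoice.pdf", "full_path": "https://example-bucket.s3.amazonaws.com/files/..."},
            {"name": "Purchase Order.pdf", "full_path": "https://example-bucket.s3.amazonaws.com/files/..."},
        ]
    }
}
downloads = plan_downloads(sample_job)
```

Each `(url, local_path)` pair can then be handed to the same `requests.get` and file-write steps the walkthrough uses.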

## Error Responses

Failures fall into two buckets: HTTP errors raised before the job is queued, and a `failed` status on a job that started but couldn't complete.

### HTTP Errors

The `/splitter` endpoint returns JSON error bodies with a `detail` field. The common cases:

* **`401 Unauthorized`:** body is `{"detail":"Invalid API key"}`. The `api-key` header is missing or wrong.
* **`422 Unprocessable Entity`:** body is `{"detail":[{"type":"missing","loc":["body","categories"],"msg":"Field required","input":null}]}`. A required form field (usually `file` or `categories`) is missing.
* **`400 Bad Request`:** body is `{"detail":"Invalid JSON format for categories: ..."}`. The `categories` form field isn't valid JSON.
* **`400 Bad Request`:** body is `{"detail":"At least one category is required"}`. The `categories` array is empty.
* **`400 Bad Request`:** body is `{"detail":"Failed to process file: Failed to get PDF page count: ..."}`. The upload isn't a readable PDF.
* **`404 Not Found`:** body is `{"detail":"Job not found"}`. The `job_id` you polled doesn't exist.
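
In the examples above, `detail` is a plain string for the `400`, `401`, and `404` cases but a list of objects for `422`, so code that logs errors has to handle both shapes. The following is a small sketch of one way to flatten them into a single line. `describe_error` is a hypothetical helper, and it assumes error bodies follow the shapes shown in this list.

```python
def describe_error(status_code, body):
    """Turn a parsed JSON error body into a one-line message.

    `detail` is a string in the 400/401/404 examples on this page and a list
    of objects in the 422 example; anything else is reported as unrecognized.
    """
    detail = body.get("detail") if isinstance(body, dict) else None
    if isinstance(detail, str):
        return f"{status_code}: {detail}"
    if isinstance(detail, list):
        parts = []
        for item in detail:
            location = ".".join(str(piece) for piece in item.get("loc", []))
            parts.append(f"{location}: {item.get('msg', 'invalid')}")
        return f"{status_code}: " + "; ".join(parts)
    return f"{status_code}: unrecognized error body"


message_401 = describe_error(401, {"detail": "Invalid API key"})
message_422 = describe_error(422, {"detail": [
    {"type": "missing", "loc": ["body", "categories"], "msg": "Field required", "input": None}
]})
```

With the bodies from the list, `message_401` is `401: Invalid API key` and `message_422` is `422: body.categories: Field required`.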

### Failed Jobs

A job that was accepted but couldn't be processed comes back with `status: "failed"` and a populated `error` field. The walkthrough's polling loop raises on this case so you see the message instead of waiting out the timeout:

```json theme={null}
{
  "job_id": "7b31a7d7-e810-4a0b-931e-fbed0879bab2",
  "status": "failed",
  "progress": "Splitting failed",
  "error": "Failed to classify pages",
  "file_url": "https://example-bucket.s3.amazonaws.com/user_uploads/...",
  "file_name": "bundle.pdf",
  "parameters": { "classes": ["Invoice", "Receipt", "Purchase Order"], "page_count": 3 },
  "result": null
}
```

## Response Shape

A completed response contains job metadata, an echo of the input `parameters`, and a `result.files[]` array with one entry per matched category. Each entry carries a presigned `full_path` URL we can download directly.

```json theme={null}
{
  "job_id": "887f26e6-d089-47f6-8def-afe84de40ecd",
  "status": "completed",
  "progress": "Starting document splitting...",
  "error": null,
  "file_url": "https://example-bucket.s3.amazonaws.com/user_uploads/...",
  "file_name": "bundle.pdf",
  "parameters": {
    "classes": ["Invoice", "Receipt", "Purchase Order"],
    "page_count": 3,
    "enable_reordering": false,
    "category_descriptions": {
      "Invoice": "Vendor invoices with itemized charges and a total due",
      "Receipt": "Point-of-sale receipts with line items, tax, and payment method",
      "Purchase Order": "Buyer-issued purchase orders authorizing goods or services from a vendor"
    }
  },
  "result": {
    "success": true,
    "message": "Successfully split PDF into 3 files",
    "files": [
      {
        "name": "Invoice.pdf",
        "path": "Invoice.pdf",
        "type": "file",
        "fileId": "359afb3c-2554-4acd-9cb3-be4044d7ec97",
        "full_path": "https://example-bucket.s3.amazonaws.com/files/...",
        "confidence_score": 0.9999976308610644
      },
      {
        "name": "Receipt.pdf",
        "path": "Receipt.pdf",
        "type": "file",
        "fileId": "eca3f2db-e113-4b5f-92db-aab89d417114",
        "full_path": "https://example-bucket.s3.amazonaws.com/files/...",
        "confidence_score": 0.9999976308610644
      },
      {
        "name": "Purchase Order.pdf",
        "path": "Purchase Order.pdf",
        "type": "file",
        "fileId": "1af2c343-9a86-46c5-b8f0-c48cc04d1b6f",
        "full_path": "https://example-bucket.s3.amazonaws.com/files/...",
        "confidence_score": 0.9999976308610644
      }
    ]
  }
}
```

The fields fall into three broad groups:

**For downloading the split PDFs:**

* **`result.files[].full_path`:** presigned S3 URL to download the split PDF; this is what the walkthrough fetches into `split_files/`
* **`result.files[].name`:** filename derived from the matched category, suitable for saving to disk
* **`result.files[].confidence_score`:** the splitter's confidence in the classification, on a 0-1 scale; use it to flag low-confidence splits for human review
* **`result.files[].fileId`:** unique identifier for the split file, useful for tracking or deduplicating downstream

**For echoing the request back:**

* **`parameters.classes`:** the category names you submitted
* **`parameters.category_descriptions`:** the descriptions you submitted, keyed by category name
* **`parameters.page_count`:** number of pages in the uploaded PDF
* **`parameters.enable_reordering`:** whether the splitter reordered pages within each category after classification; defaults to `false`
* **`file_url`:** signed URL to the original uploaded bundle
* **`file_name`:** name of the uploaded bundle

**For job and progress tracking:**

* **`status`:** `completed`, `failed`, or an in-progress value (`queued` or `processing`)
* **`progress`:** human-readable progress message
* **`error`:** error message if the job failed, otherwise `null`
* **`result.success`:** whether the split operation succeeded
* **`result.message`:** human-readable success or failure message
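
As an illustration of using `confidence_score` to route splits to human review, the sketch below filters a completed response by a threshold. `low_confidence_files` is a hypothetical helper, the `0.9` default is an arbitrary placeholder rather than a recommended value, and the second score in the sample data is made up to show a file being flagged.

```python
def low_confidence_files(job, threshold=0.9):
    """Return the names of split files whose confidence_score is below threshold.

    The 0.9 default is an arbitrary illustration, not a recommended value.
    """
    return [
        file_info["name"]
        for file_info in job["result"]["files"]
        if file_info["confidence_score"] < threshold
    ]


sample_job = {"result": {"files": [
    {"name": "Invoice.pdf", "confidence_score": 0.9999976308610644},
    {"name": "Receipt.pdf", "confidence_score": 0.62},  # made-up score for illustration
]}}
needs_review = low_confidence_files(sample_job)
```

Here `needs_review` contains only `Receipt.pdf`. A suitable threshold depends on how costly a misfiled page is in your pipeline.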

### Sample Output

Running the script against the [sample AP batch](https://raw.githubusercontent.com/Unsiloed-AI/cookbook/c585446e46e4be2790c6c29fe2a7a3a1b346191d/sample-documents/sample-split.pdf) produces a `split_files/` directory with one PDF per matched category:

```
split_files/
├── Invoice.pdf          # page 1 — the Greenfield Print & Bindery invoice
├── Receipt.pdf          # page 2 — the Cooper's Office Supply receipt
└── Purchase Order.pdf   # page 3 — the Lighthouse Studios LLC purchase order
```

All three files report a confidence score above 99%. Open them to confirm each page of the bundle landed in the matching category file.

## Next Steps

<Note>
  For more on splitting, including the underlying classification step and the full response shape, see the [Splitting overview](/document-processing/splitting/splitting) and the [Response Format](/document-processing/splitting/response-format) reference.
</Note>

<CardGroup cols={2}>
  <Card title="Splitting Overview" icon="scissors" href="/document-processing/splitting/splitting">
    Learn how the splitter groups pages and where to use it in a pipeline.
  </Card>

  <Card title="Classification" icon="tags" href="/document-processing/classification/classification">
    Classify a single document against candidate categories instead of splitting a bundle.
  </Card>

  <Card title="API Reference" icon="code" href="/api-reference/splitting/split-document">
    Browse the full request and response specs for the splitting endpoint.
  </Card>

  <Card title="FAQ" icon="circle-question" href="/faq/general">
    Check limits, supported formats, and answers to common questions.
  </Card>
</CardGroup>
