Parse Excel
Dedicated ingestion endpoint for Excel workbooks (.xls / .xlsx) with spreadsheet-specific configuration.
Overview
The Parse Excel endpoint processes Excel workbooks (.xls, .xlsx) using a dedicated spreadsheet pipeline. It shares auth, billing, quota, and rate-limit infrastructure with Parse Document, and is polled via the same GET /parse/{job_id}.
- POST to /parse/excel with your file (or url) and any spreadsheet-specific configuration.
- The job is automatically enqueued for processing.
- Poll GET /parse/{job_id} to track progress and retrieve results.
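The submit-then-poll flow above can be sketched with the standard library. This is a minimal sketch under stated assumptions, not an official client: `BASE_URL` is a placeholder, the `Authorization` header name and the placement of config fields at the top level of a JSON body are assumptions, and `poll_job` takes a caller-supplied `is_finished` predicate because only the initial "Starting" status is documented here.

```python
import json
import time
import urllib.request

BASE_URL = "https://api.example.com"  # assumption: placeholder, not the real host


def build_submit_request(api_key, workbook_url, **config):
    """Build (but do not send) a JSON POST to /parse/excel using the url field."""
    body = {"url": workbook_url, **config}  # assumption: config fields sit at top level
    return urllib.request.Request(
        f"{BASE_URL}/parse/excel",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",  # assumption: header name
            "Content-Type": "application/json",
        },
        method="POST",
    )


def build_poll_request(api_key, job_id):
    """Build (but do not send) the GET /parse/{job_id} polling request."""
    return urllib.request.Request(
        f"{BASE_URL}/parse/{job_id}",
        headers={"Authorization": f"Bearer {api_key}"},
        method="GET",
    )


def poll_job(fetch_status, job_id, is_finished,
             interval_s=2.0, max_attempts=30, sleep=time.sleep):
    """Call fetch_status(job_id) until is_finished(job) is true or attempts run out."""
    for _ in range(max_attempts):
        job = fetch_status(job_id)
        if is_finished(job):
            return job
        sleep(interval_s)
    raise TimeoutError(f"job {job_id} did not finish after {max_attempts} polls")
```

In real use, `fetch_status` would send the request from `build_poll_request` with `urllib.request.urlopen` and decode the JSON body; injecting it keeps the loop easy to test without network access.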
PDF parsing options (ocr_strategy, layout_analysis, segment_processing, etc.) do not apply here and are silently ignored. Non-Excel uploads are rejected with 400; submit those to POST /parse instead.
Request
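Provide either file (multipart binary upload) or url (presigned/public URL). The file field is multipart-only; JSON callers must use url.
- file: Excel workbook to process. Supported: .xls, .xlsx. Required if url is not provided.
- url: Presigned or public URL of the workbook to fetch. Required if file is not provided.
Cell metadata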
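- cell_metadata: Include cell color, formula, and dropdown metadata in the output. Defaults to false. The three sub-toggles below are only honored when this is true.
- Cell colors: only honored when cell_metadata is true. Defaults to true.
- Cell formulas: only honored when cell_metadata is true. Defaults to true.
- Data-validation dropdown options: only honored when cell_metadata is true. Defaults to true.
Hidden content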
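- Exclude hidden content: Drop hidden sheets, rows, columns, and styling from the output. Defaults to false. The five sub-toggles below are only honored when this is true.
- Drop hidden sheets: Defaults to true.
- Drop hidden rows: Defaults to true.
- Drop hidden columns: Defaults to true.
- Drop styling: Defaults to true.
- Drop embedded/pasted images: Defaults to false.
Table extraction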
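- split_large_tables: Split large tables into smaller segments. Defaults to true.
- Max rows per split segment: only honored when split_large_tables is true. Defaults to 50.
- Table clustering effort:
  - "accurate" (default): Best fidelity at the cost of latency.
  - "fast": Quicker clustering, may merge nearby tables.
  - "off": Treat each sheet as a single table.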
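To make the effect of split_large_tables and the 50-row cap concrete, here is an illustrative sketch that chunks a table's rows client-side. It is not the pipeline's actual algorithm: the function name, the `max_rows` parameter name, and the choice not to repeat header rows in each segment are all assumptions.

```python
def split_rows(rows, max_rows=50, split_large_tables=True):
    """Split a list of rows into segments of at most max_rows rows each."""
    if max_rows < 1:
        raise ValueError("max_rows must be at least 1")
    # With splitting disabled, or a table that already fits, keep one segment.
    if not split_large_tables or len(rows) <= max_rows:
        return [rows]
    return [rows[i:i + max_rows] for i in range(0, len(rows), max_rows)]
```

Under these assumptions, a 120-row table with the default cap yields three segments of 50, 50, and 20 rows.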
Lifecycle
POST /v2/parse/upload instead.
Response
The endpoint returns HTTP 200 with the same envelope as POST /parse:
- Job identifier: pass this to GET /parse/{job_id} to poll for results.
- Initial job status: always "Starting" on creation.
- File name: "unknown" when no usable segment exists.
- merge_tables: false for Excel jobs, because table merging is a PDF-only feature.
Retrieving Results
Use GET /parse/{job_id} (the shared polling endpoint) to check status and retrieve results. The result envelope is the same as for PDF jobs: chunks containing segments. Excel segments also include a cell_references field linking each segment back to its source sheet, address, and range.
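A completed result could be walked as in the sketch below. The nesting keys `chunks` and `segments` and the exact shape of `cell_references` are assumptions drawn from the description above, so the value is treated as opaque and passed through unchanged.

```python
def iter_cell_references(result):
    """Yield the cell_references value of every segment that has one."""
    for chunk in result.get("chunks", []):          # assumption: key name
        for segment in chunk.get("segments", []):   # assumption: key name
            refs = segment.get("cell_references")
            if refs is not None:
                yield refs
```

Segments without a `cell_references` field are skipped, so the same loop is harmless on results from PDF jobs.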
Error Handling
| Status | Cause | Action |
|---|---|---|
| 400 | Missing file/url, non-Excel file type, or malformed parameters | Check the file extension and required fields |
| 401 | Missing or invalid api-key | Check your API key |
| 402 | Insufficient quota | Add credits to your account or renew your plan |
| 403 | Access has been revoked | Contact support |
| 429 | Rate limit (default 10 req/s) or billing usage cap hit | Back off and retry after the Retry-After header value |
| 500 | Internal server error | Retry with exponential backoff |
| 503 | Job queue at capacity | Retry after the duration indicated in the Retry-After header |
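The retry guidance in the table can be expressed as a small decision function. This is a sketch, assuming Retry-After carries a number of seconds (the HTTP-date form is not handled and falls back to backoff), that the header lookup is case-sensitive, and that the base delay and cap are arbitrary illustrative values.

```python
def retry_delay(status, headers, attempt, base_s=1.0, cap_s=60.0):
    """Return seconds to wait before retrying, or None if the call should not be retried.

    attempt is the zero-based count of retries already made.
    """
    backoff = min(cap_s, base_s * 2 ** attempt)  # exponential backoff, capped
    if status in (429, 503):
        value = headers.get("Retry-After")
        if value is not None:
            try:
                return max(0.0, float(value))
            except ValueError:
                pass  # not a number of seconds; fall back to backoff
        return backoff
    if status == 500:
        return backoff
    # 400, 401, 402, 403 need a fix on the caller's side, not a retry.
    return None
```

A caller would sleep for the returned number of seconds and resend, giving up when the function returns None or after a fixed number of attempts.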
Authorizations
API key for authentication. Use 'Bearer <your_api_key>'
Body
Request body for POST /parse/excel (multipart/form-data).
Provide either file (binary .xls/.xlsx upload, multipart only) or url (presigned/public URL, both content types), not both. Excel parsing has its own configuration, so none of the PDF parsing options (OCR, layout analysis, segment processing) apply. Every config field is optional and defaults to the Excel pipeline's own default.
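The "either file or url, not both" rule can be checked client-side before sending anything, which avoids a round trip that would end in a 400. This is a sketch with a hypothetical helper name; the server's own validation may inspect more than the file extension.

```python
def choose_source(file_name=None, url=None):
    """Return "file" or "url" after checking that exactly one source is given."""
    if (file_name is None) == (url is None):
        raise ValueError("provide exactly one of file or url")
    if file_name is not None:
        # Only .xls and .xlsx workbooks are accepted by this endpoint.
        if not file_name.lower().endswith((".xls", ".xlsx")):
            raise ValueError("file must be an .xls or .xlsx workbook")
        return "file"
    return "url"
```

A result of "file" means the request must be sent as multipart/form-data, while "url" can be sent with either content type.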
Excel workbook to process. Required if url is not provided. Supported: XLS, XLSX.
Include cell colors (only when cell_metadata is true). Defaults to true.
Include cell color, formula, and dropdown metadata in the output. Defaults to false.
Include data-validation dropdown options (only when cell_metadata is true). Defaults to true.
Drop hidden sheets, rows, columns, and styling from the output. Defaults to false.
When excluding hidden content, also drop hidden columns. Defaults to true.
When excluding hidden content, also drop hidden rows. Defaults to true.
When excluding hidden content, also drop hidden sheets. Defaults to true.
When excluding hidden content, also drop embedded/pasted images. Defaults to false.
When excluding hidden content, also drop styling. Defaults to true.
Reserved field. Persisted in the task configuration but currently has no effect on retention: Excel tasks use the same Task::new_fast creation path as POST /parse, which does not set the task's expires_at column. See the same note on ParseCreateRequest.expires_in.
Include cell formulas (only when cell_metadata is true). Defaults to true.
Max rows per split segment (only when split_large_tables is true). Defaults to 50.
Split large tables into smaller segments. Defaults to true.
Table clustering effort: accurate (default), fast, or off.
Presigned or public URL of the workbook to fetch. Required if file is not provided.
Response
Job created — poll with GET /parse/{job_id} to retrieve results.
Response body for a successful POST /parse call.
ISO 8601 timestamp when the job was created.
Number of pages deducted from your quota for this job.
Name of the uploaded file or "unknown" when a URL was provided.
Job identifier — pass this to GET /parse/{job_id} to poll for results.
Whether table merging is enabled for this job (reflects the submitted merge_tables value).
Human-readable status message with a polling hint.
Remaining page quota after this job was deducted.
Initial job status. Always "Starting" on creation.

