Extract and structure table data from PDF documents with advanced table detection and parsing
The Extract Tables endpoint processes PDF documents and returns the tables they contain as structured data. It uses AI-based table detection and parsing to identify tabular content and preserve its row and column structure.
The endpoint returns a job ID for asynchronous processing. Use the job management endpoints to check status and retrieve results.
The PDF file to process for table extraction. Maximum file size: 100MB
API key for authentication
Unique identifier for the table extraction job
Initial job status (typically “queued”)
Descriptive message about the job creation
Number of API calls remaining in your quota
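Because the request is rejected when the file exceeds the 100MB limit or is not a valid PDF, it can help to check the file locally before uploading. The sketch below is a minimal pre-flight check, not part of the API. The function name preflight_check is hypothetical, and reading "100MB" as 100 * 10**6 bytes is an assumption chosen because it is the stricter interpretation.

```python
import os

# Assumption: "100MB" is read conservatively as 100 * 10**6 bytes.
MAX_PDF_BYTES = 100 * 10**6


def preflight_check(path, max_bytes=MAX_PDF_BYTES):
    """Return a list of problems found before upload (empty list means OK)."""
    problems = []
    size = os.path.getsize(path)
    if size == 0:
        problems.append("file is empty")
    if size > max_bytes:
        problems.append(f"file is {size} bytes, over the {max_bytes}-byte limit")
    with open(path, "rb") as fh:
        # PDF files normally begin with the marker "%PDF-".
        if fh.read(5) != b"%PDF-":
            problems.append("file does not start with the %PDF- header")
    return problems
```

An empty list means the file passed these local checks; the service may still reject it for other reasons.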
After creating a table extraction job, use the job management endpoints to monitor progress and retrieve results.
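A polling loop for this step might look like the sketch below. It takes the status lookup as a callable (fetch_status) so that it makes no assumption about the URL or HTTP client of the job management endpoints. Only the "queued" status appears on this page; the terminal status names "completed" and "failed", and the "status" key itself, are assumptions to adjust to the real responses.

```python
import time

# Assumption: terminal status names; only "queued" is documented on this page.
TERMINAL_STATUSES = {"completed", "failed"}


def wait_for_job(fetch_status, job_id, interval=2.0, max_interval=30.0,
                 timeout=600.0, sleep=time.sleep, clock=time.monotonic):
    """Poll fetch_status(job_id) until a terminal status or the timeout.

    fetch_status is a caller-supplied function that returns the job as a
    dict with a "status" key (hypothetical response shape).
    """
    deadline = clock() + timeout
    delay = interval
    while True:
        job = fetch_status(job_id)
        if job.get("status") in TERMINAL_STATUSES:
            return job
        if clock() + delay > deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout}s")
        sleep(delay)
        # Double the wait after each poll, up to max_interval.
        delay = min(delay * 2, max_interval)
```

Doubling the delay keeps short jobs responsive while avoiding a burst of status calls for long-running documents, which also helps preserve quota.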
When the job completes, the results contain structured table data.
The table extraction endpoint provides advanced features for accurate table detection and parsing.
Invalid request parameters or malformed file
Invalid or missing API key
File size exceeds 100MB limit
Invalid file format or processing error
Rate limit exceeded or quota exhausted
Server error during processing
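The error conditions above can be turned into a small lookup that tells a client whether a retry is worthwhile. In the sketch below the descriptions are taken from this page, but the numeric status codes are an assumption based on conventional HTTP usage, since the page as extracted does not show them. The names ERROR_TABLE, describe_error, and backoff_delay are hypothetical.

```python
# Assumption: conventional HTTP status codes; confirm against real responses.
# Each entry maps a code to (description from this page, worth retrying?).
ERROR_TABLE = {
    400: ("Invalid request parameters or malformed file", False),
    401: ("Invalid or missing API key", False),
    413: ("File size exceeds 100MB limit", False),
    422: ("Invalid file format or processing error", False),
    429: ("Rate limit exceeded or quota exhausted", True),
    500: ("Server error during processing", True),
}


def describe_error(status_code):
    """Return the description and retry hint for a status code."""
    message, retryable = ERROR_TABLE.get(status_code, ("Unexpected status", False))
    return {"status": status_code, "message": message, "retryable": retryable}


def backoff_delay(attempt, base=1.0, cap=60.0):
    """Exponential backoff in seconds: base * 2**attempt, capped at cap."""
    return min(cap, base * 2 ** attempt)
```

Retrying a rate-limit error only helps once the window has moved on; if the quota itself is exhausted, repeated retries will keep failing.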
Document Quality: Higher-quality PDFs with clear table borders produce better results. Scanned documents may have lower accuracy.
Table Structure: Tables with consistent formatting, clear headers, and regular structure are extracted more accurately.
Multi-page Tables: For tables spanning multiple pages, the system attempts to merge them automatically based on column structure.
Complex Layouts: Tables with irregular structures, merged cells, or nested tables may require manual review of results.
Processing Time: Large documents with many tables may take longer to process. Monitor job status regularly.
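When reviewing multi-page results, it can be useful to check or redo the merge on the client side. The sketch below illustrates one simple rule, joining tables on consecutive pages whose headers match exactly. It is not the service's own merging logic, and the table shape used here (dicts with "page", "headers", and "rows" keys) is a hypothetical stand-in for the real result format.

```python
def merge_continued_tables(tables):
    """Merge tables that continue onto the next page with identical headers.

    Each input table is a dict with "page" (int), "headers" (list of str)
    and "rows" (list of lists); this shape is assumed for illustration.
    """
    merged = []
    for table in sorted(tables, key=lambda t: t["page"]):
        prev = merged[-1] if merged else None
        if (prev is not None
                and table["headers"] == prev["headers"]
                and table["page"] == prev["last_page"] + 1):
            # Same columns on the following page: treat as a continuation.
            prev["rows"].extend(table["rows"])
            prev["last_page"] = table["page"]
        else:
            merged.append({
                "headers": list(table["headers"]),
                "rows": list(table["rows"]),
                "first_page": table["page"],
                "last_page": table["page"],
            })
    return merged
```

Exact header matching is deliberately strict; tables whose continuation pages omit the header row would need a looser rule, such as comparing column counts.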
Rate limits are enforced per API key and reset on a rolling-window basis. Monitor your quota usage through the quota_remaining field in responses.
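A client can stay under a rolling-window limit by tracking its own call timestamps. The sketch below is one way to do that; the actual limit and window length are not stated on this page, so max_calls and window_seconds are placeholders, and the class and function names are hypothetical. Only the quota_remaining field name comes from this page.

```python
from collections import deque


class RollingWindowLimiter:
    """Track call times and report how long to wait before the next call."""

    def __init__(self, max_calls, window_seconds):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self._stamps = deque()

    def wait_time(self, now):
        """Seconds to wait at time `now` before another call is allowed."""
        # Discard calls that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window_seconds:
            self._stamps.popleft()
        if len(self._stamps) < self.max_calls:
            return 0.0
        return self.window_seconds - (now - self._stamps[0])

    def record(self, now):
        """Record that a call was made at time `now`."""
        self._stamps.append(now)


def quota_is_low(body, threshold=10):
    """True if the response body reports quota_remaining at or below threshold."""
    remaining = body.get("quota_remaining")
    return remaining is not None and remaining <= threshold
```

In use, call wait_time(time.monotonic()) before each request, sleep for the returned duration, then call record with the current time once the request is sent.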