Overview
The Extract Tables endpoint processes PDF documents and extracts structured table data using advanced AI-powered table detection and parsing. This endpoint is specifically designed to identify, extract, and structure tabular content from documents with high accuracy.
The endpoint returns a job ID for asynchronous processing. Use the job management endpoints to check status and retrieve results.
Request
The PDF file to process for table extraction. Maximum file size: 100MB
Response
Unique identifier for the table extraction job
Initial job status (typically “queued”)
Descriptive message about the job creation
Number of credits remaining in your quota
curl -X POST "https://prod.visionapi.unsiloed.ai/tables" \
-H "accept: application/json" \
-H "api-key: your-api-key" \
-H "Content-Type: multipart/form-data" \
-F "pdf_file=@document.pdf;type=application/pdf"
Success Response
Error Response
{
"job_id" : "e4a73f3c-869e-4129-bdef-614a6a51b043" ,
"status" : "queued" ,
"message" : "PDF table extraction started" ,
"quota_remaining" : 48960
}
Job Management Integration
After creating a table extraction job, use the job management endpoints to monitor progress and retrieve results:
import time
import requests
def wait_for_table_extraction_completion ( job_id , api_key ):
"""Poll job status until completion"""
headers = { "api-key" : api_key}
status_url = f "https://prod.visionapi.unsiloed.ai/jobs/ { job_id } "
while True :
response = requests.get(status_url, headers = headers)
if response.status_code == 200 :
status_data = response.json()
print ( f "Job status: { status_data[ 'status' ] } " )
if status_data[ 'status' ] == 'completed' :
# Get results
results_url = f "https://prod.visionapi.unsiloed.ai/jobs/ { job_id } /results"
results_response = requests.get(results_url, headers = headers)
if results_response.status_code == 200 :
return results_response.json()
else :
raise Exception ( f "Failed to get results: { results_response.text } " )
elif status_data[ 'status' ] == 'failed' :
raise Exception ( f "Job failed: { status_data.get( 'error' , 'Unknown error' ) } " )
time.sleep( 5 ) # Wait 5 seconds before next check
# Usage
job_id = "e4a73f3c-869e-4129-bdef-614a6a51b043"
results = wait_for_table_extraction_completion(job_id, "your-api-key" )
print ( "Table extraction completed:" , results)
Expected Results Structure
When the job completes, the results will contain structured table data:
{
"job_id" : "e4a73f3c-869e-4129-bdef-614a6a51b043" ,
"status" : "completed" ,
"results" : {
"tables" : [
{
"table_id" : "table_1" ,
"page_number" : 1 ,
"bbox" : {
"x1" : 50 ,
"y1" : 100 ,
"x2" : 550 ,
"y2" : 300
},
"confidence" : 0.95 ,
"headers" : [
"Product" ,
"Quantity" ,
"Price" ,
"Total"
],
"rows" : [
[ "Widget A" , "10" , "$25.00" , "$250.00" ],
[ "Widget B" , "5" , "$40.00" , "$200.00" ],
[ "Widget C" , "8" , "$15.00" , "$120.00" ]
],
"structured_data" : {
"headers" : [ "Product" , "Quantity" , "Price" , "Total" ],
"data" : [
{
"Product" : "Widget A" ,
"Quantity" : "10" ,
"Price" : "$25.00" ,
"Total" : "$250.00"
},
{
"Product" : "Widget B" ,
"Quantity" : "5" ,
"Price" : "$40.00" ,
"Total" : "$200.00"
},
{
"Product" : "Widget C" ,
"Quantity" : "8" ,
"Price" : "$15.00" ,
"Total" : "$120.00"
}
]
},
"table_type" : "data_table" ,
"column_count" : 4 ,
"row_count" : 3
}
],
"summary" : {
"total_tables" : 1 ,
"pages_processed" : 1 ,
"extraction_method" : "ai_vision" ,
"processing_time" : 4.2
}
}
}
The PDF file to process for table extraction. Maximum file size: 100MB
Unique identifier for the table extraction job
Descriptive message about the job creation
Number of credits remaining in your quota