Overview
The Get Parse Job Status endpoint lets you check the current status of a parsing job and retrieve the full results once processing finishes. It is specific to the parsing API and returns comprehensive document analysis, including text extraction, image recognition, table parsing, and OCR data.
Parsing jobs are processed asynchronously. Use this endpoint to poll for completion and retrieve results when the job status is “Succeeded”.
Response
Unique identifier for the parsing job
Current job status: “Starting”, “Processing”, “Succeeded”, or “Failed”
Timestamp when the job was created
Timestamp when processing started (only present when status is not “Starting”)
Timestamp when processing completed (only present when status is “Succeeded” or “Failed”)
Number of chunks in the document (only present when status is “Succeeded”)
Array of document chunks with detailed analysis (only present when status is “Succeeded”)
Job Status Values
Starting
Job has been created and is waiting to be processed. This is the initial status when a parsing job is first created.
Processing
Job is currently being processed. This includes PDF parsing, text extraction, image analysis, table detection, and OCR processing.
Succeeded
Job has completed successfully. The response includes the complete analysis results with all extracted data, images, and metadata.
Failed
Job failed during processing. Check the message field for details about what went wrong.
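The four status values can be modelled as a small enumeration so that client code does not compare raw strings in many places. This is a minimal sketch: the names `JobStatus`, `TERMINAL_STATUSES`, and `is_terminal` are hypothetical helpers, not part of the API, and treating “Succeeded” and “Failed” as the only final states is an assumption drawn from the descriptions above.

```python
from enum import Enum


class JobStatus(str, Enum):
    """The four status strings documented for parsing jobs."""

    STARTING = "Starting"
    PROCESSING = "Processing"
    SUCCEEDED = "Succeeded"
    FAILED = "Failed"


# Assumption: these two statuses are final, so polling can stop on either.
TERMINAL_STATUSES = frozenset({JobStatus.SUCCEEDED, JobStatus.FAILED})


def is_terminal(status: str) -> bool:
    """Return True when the given status string is a final state.

    Raises ValueError for a string that is not one of the four
    documented values, so unexpected statuses are not silently ignored.
    """
    return JobStatus(status) in TERMINAL_STATUSES
```

Calling `is_terminal("Processing")` returns `False`, while `is_terminal("Failed")` returns `True`.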
Polling Strategy
For long-running parsing jobs, implement a polling strategy to check status periodically.
Segment Types
When a job succeeds, the response includes detailed analysis of different document segments:
Picture
Images and graphics within the document, including logos, charts, and illustrations.
SectionHeader
Document headers and titles that define section boundaries.
Text
Regular text content including paragraphs, sentences, and individual text elements.
Table
Tabular data with structured rows and columns.
Caption
Text captions associated with images or figures.
Each segment includes:
- segment_type: Type of content detected
- content: Extracted text content
- image: URL to extracted image (if applicable)
- page_number: Page where the segment appears
- confidence: Confidence score for the extraction
- bbox: Precise coordinates of the segment
- html: HTML-formatted content
- markdown: Markdown-formatted content
- ocr: Detailed OCR data with individual text elements
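The segment fields listed above could be loaded into a typed structure along the following lines. The field names come from the list; the Python types, the defaults, and the helpers `segment_from_dict` and `group_by_type` are assumptions made for illustration, since the exact JSON shapes of `bbox` and `ocr` are not described here.

```python
from collections import defaultdict
from dataclasses import dataclass, fields
from typing import Any, Dict, List, Optional


@dataclass
class Segment:
    """One document segment; types are guesses, not a published schema."""

    segment_type: str
    content: str = ""
    image: Optional[str] = None       # URL, when an image was extracted
    page_number: Optional[int] = None
    confidence: Optional[float] = None
    bbox: Any = None                  # coordinate format not specified here
    html: str = ""
    markdown: str = ""
    ocr: Any = None                   # OCR structure not specified here


def segment_from_dict(raw: Dict[str, Any]) -> Segment:
    """Build a Segment from a decoded JSON object, ignoring unknown keys."""
    known = {f.name for f in fields(Segment)}
    return Segment(**{k: v for k, v in raw.items() if k in known})


def group_by_type(segments: List[Segment]) -> Dict[str, List[Segment]]:
    """Group segments by their segment_type, preserving input order."""
    grouped: Dict[str, List[Segment]] = defaultdict(list)
    for segment in segments:
        grouped[segment.segment_type].append(segment)
    return dict(grouped)
```

Grouping by `segment_type` makes it straightforward to, for example, pull every `Table` segment out of a result and process its `html` separately from running text.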
Error Handling
Common Error Scenarios
- Job Not Found: Invalid or expired job ID
- Invalid API Key: Authentication failed
- Processing Timeout: Job took too long to complete
- Server Error: Internal processing error
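The HTTP status codes behind these scenarios are not listed here, so the sketch below covers only the job-level outcome: it raises when a finished job reports “Failed” and passes through a “Succeeded” job. The dictionary keys `status` and `message`, the exception `ParseJobFailed`, and the function `results_or_raise` are illustrative assumptions about how a decoded response might be handled, not documented names.

```python
from typing import Any, Dict


class ParseJobFailed(Exception):
    """Raised when a parsing job ends in the Failed state."""


def results_or_raise(job: Dict[str, Any]) -> Dict[str, Any]:
    """Return the job payload if it succeeded, otherwise raise.

    `job` is assumed to be the decoded JSON body of a status response.
    """
    status = job.get("status")
    if status == "Failed":
        # The failure reason is taken from the message field when present.
        raise ParseJobFailed(job.get("message", "no message provided"))
    if status != "Succeeded":
        raise ValueError(f"job is not finished yet (status: {status!r})")
    return job
```

Separating "not finished" from "failed" keeps retry logic simple: a `ValueError` means keep polling, while `ParseJobFailed` means the job itself needs to be resubmitted or investigated.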
Best Practices
- Polling Frequency: Check status every 5-10 seconds for long-running jobs
- Timeout Handling: Implement reasonable timeouts to prevent infinite polling
- Error Recovery: Handle failed jobs gracefully with retry logic
- API Key Security: Keep your API key secure and never expose it in client-side code
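The polling and timeout practices above can be sketched as a small loop. To keep the example independent of any particular HTTP client or URL, the status request is passed in as a callable; `poll_until_done`, `fetch_status`, and the `status` key in the returned dictionary are hypothetical names, and the 5-second interval and 10-minute timeout are only example defaults within the recommended range.

```python
import time
from typing import Any, Callable, Dict


def poll_until_done(
    fetch_status: Callable[[], Dict[str, Any]],
    interval_seconds: float = 5.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Dict[str, Any]:
    """Call fetch_status until the job reaches a final status.

    fetch_status is expected to perform one status request and return
    the decoded response. Raises TimeoutError if another wait would
    pass the deadline, which prevents polling forever.
    """
    deadline = clock() + timeout_seconds
    while True:
        job = fetch_status()
        if job.get("status") in ("Succeeded", "Failed"):
            return job
        if clock() + interval_seconds > deadline:
            raise TimeoutError(
                f"job did not finish within {timeout_seconds} seconds"
            )
        sleep(interval_seconds)
```

In real use, `fetch_status` would wrap the authenticated request to this endpoint and run server-side so the API key is never exposed to client-side code. The `sleep` and `clock` parameters exist only so the loop can be tested without waiting.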
Rate Limits
- Status Checks: Rate limits apply to prevent abuse
- Concurrent Jobs: Limited number of active parsing jobs per API key
- Request Frequency: Avoid excessive polling (recommended: 5-10 second intervals)
Authorizations
Path Parameters
The unique identifier of the parsing job to check
Response
Job status retrieved successfully
