GET /jobs/{job_id}
curl -X GET "https://visionapi.unsiloed.ai/jobs/b2094b38-e432-44b6-a5d0-67bed07d5de1" \
  -H "X-API-Key: your-api-key"
{
  "id": "b2094b38-e432-44b6-a5d0-67bed07d5de1",
  "status": "queued",
  "type": "extraction",
  "created_at": "2024-01-15T10:30:00.000Z",
  "updated_at": "2024-01-15T10:30:00.000Z",
  "pdf_name": "financial_report.pdf",
  "pdf_hash": "sha256:abc123...",
  "user_id": "user_123"
}

Overview

The Get Job Status endpoint allows you to check the current status of any asynchronous processing job. This is essential for monitoring long-running operations like document extraction, parsing, and batch processing.

Jobs are stored in Supabase and updated in real-time. Status checks are lightweight and can be polled frequently.
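
Since a status check is just an authenticated GET, the simplest client is a one-function wrapper. A minimal sketch in Python using the requests library:

import requests

API_BASE = "https://visionapi.unsiloed.ai"

def get_job_status(job_id, api_key):
    """Fetch the current status of a job (lightweight enough to poll)"""
    response = requests.get(
        f"{API_BASE}/jobs/{job_id}",
        headers={"X-API-Key": api_key},
    )
    response.raise_for_status()
    return response.json()

job = get_job_status("b2094b38-e432-44b6-a5d0-67bed07d5de1", "your-api-key")
print(job["status"])  # e.g. "queued"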

Request

job_id
string
required

The unique identifier of the job to check

X-API-Key
string

API key for authentication (optional for some endpoints)

Response

id
string

Unique identifier for the job

status
string

Current job status: "queued", "PROCESSING", "COMPLETED", or "FAILED"

type
string

The operation type for this job (e.g., "extraction", "chunking", "classification")

created_at
string

Timestamp when the job was created

updated_at
string

Timestamp when the job was last updated

pdf_name
string

Original filename of the processed document

pdf_hash
string

Hash of the PDF file for identification

error
string

Error message (if job failed)

Job Status Values

  • queued - the job has been accepted and is waiting to be processed
  • PROCESSING - the job is actively being worked on
  • COMPLETED - the job finished successfully and results are available
  • FAILED - the job did not complete; check the error field for details

Polling for Completion

For long-running jobs, implement polling to check status periodically:

import time
import requests

def wait_for_job_completion(job_id, api_key, poll_interval=5, max_wait=300):
    """Wait for job to complete with polling"""
    
    start_time = time.time()
    headers = {"X-API-Key": api_key}
    
    while time.time() - start_time < max_wait:
        response = requests.get(f"https://visionapi.unsiloed.ai/jobs/{job_id}", headers=headers)
        
        if response.status_code == 200:
            job = response.json()
            status = job['status']
            
            print(f"Job status: {status}")
            
            if status == 'COMPLETED':
                return job, True
            elif status == 'FAILED':
                return job, False
                
        time.sleep(poll_interval)
    
    raise TimeoutError(f"Job {job_id} did not complete within {max_wait} seconds")

# Usage
job_id = "your-job-id"
job, success = wait_for_job_completion(job_id, "your-api-key")

if success:
    print("Job completed successfully!")
else:
    print(f"Job failed: {job.get('error', 'Unknown error')}")

Best Practices

Polling Frequency: Poll every 5-10 seconds for most jobs. Avoid polling more frequently than every 2 seconds.

Timeout Handling: Set reasonable timeouts (5-15 minutes) depending on document size and complexity.

Job Persistence: Completed jobs and their results are stored for 7 days before cleanup.
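
These limits can be wired into the polling helper above; a small sketch (the helper name and constants are illustrative) that clamps the interval to the 2-second floor and keeps the timeout in the recommended band:

MIN_POLL_INTERVAL = 2   # seconds; the documented minimum between polls
DEFAULT_MAX_WAIT = 600  # 10 minutes, inside the recommended 5-15 minute range

def poll_job_safely(job_id, api_key, poll_interval=5, max_wait=DEFAULT_MAX_WAIT):
    """Delegate to wait_for_job_completion with the documented limits applied"""
    interval = max(poll_interval, MIN_POLL_INTERVAL)
    return wait_for_job_completion(job_id, api_key,
                                   poll_interval=interval, max_wait=max_wait)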

Batch Status Check

Check the status of all jobs in a batch.

import requests

headers = {"X-API-Key": "your-api-key"}

response = requests.get(
    "https://api.example.com/jobs/batch/batch_xyz789abc123",
    headers=headers,
    params={"include_progress": "true"}  # lowercase string, matching the documented default "false"
)

batch_status = response.json()
print(f"Batch Status: {batch_status['status']}")
print(f"Completed Jobs: {batch_status['completed_jobs']}/{batch_status['total_jobs']}")

for job in batch_status['jobs']:
    print(f"  {job['job_id']}: {job['status']}")
{
  "batch_id": "batch_xyz789abc123",
  "status": "processing",
  "created_at": "2024-01-15T10:30:00Z",
  "total_jobs": 5,
  "completed_jobs": 3,
  "failed_jobs": 0,
  "cancelled_jobs": 0,
  "overall_progress": 60,
  "jobs": [
    {
      "job_id": "job_1",
      "status": "completed",
      "operation": "extract",
      "progress": 100
    },
    {
      "job_id": "job_2",
      "status": "completed", 
      "operation": "parse",
      "progress": 100
    },
    {
      "job_id": "job_3",
      "status": "processing",
      "operation": "classify",
      "progress": 75
    }
  ]
}
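
To block until a whole batch finishes, the counters in this response can drive a polling loop; a sketch, assuming the batch endpoint and fields shown above:

import time
import requests

headers = {"X-API-Key": "your-api-key"}

def wait_for_batch(batch_id, poll_interval=10):
    """Poll the batch endpoint until every job reaches a terminal state"""
    while True:
        response = requests.get(
            f"https://api.example.com/jobs/batch/{batch_id}",
            headers=headers,
        )
        batch = response.json()
        done = batch["completed_jobs"] + batch["failed_jobs"] + batch["cancelled_jobs"]
        if done >= batch["total_jobs"]:
            return batch
        print(f"Batch progress: {batch['overall_progress']}%")
        time.sleep(poll_interval)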

Multiple Jobs Status

Check status of multiple jobs simultaneously.

Query Parameters

job_ids
string
required

Comma-separated list of job IDs

include_progress
boolean
default:"false"

Include progress details for all jobs

import requests

headers = {"X-API-Key": "your-api-key"}

job_ids = ["job_abc123", "job_def456", "job_ghi789"]
params = {
    "job_ids": ",".join(job_ids),
    "include_progress": "true"
}

response = requests.get(
    "https://api.example.com/jobs/status",
    headers=headers,
    params=params
)

jobs_status = response.json()
for job in jobs_status['jobs']:
    print(f"{job['job_id']}: {job['status']} ({job.get('progress', {}).get('percentage', 0)}%)")

Job Status Polling

Polling Best Practices

import time
import requests

headers = {"X-API-Key": "your-api-key"}

def wait_for_job_completion(job_id, max_wait_time=300, poll_interval=5):
    """
    Poll job status until completion or timeout
    """
    start_time = time.time()
    
    while time.time() - start_time < max_wait_time:
        response = requests.get(
            f"https://api.example.com/jobs/{job_id}",
            headers=headers
        )
        
        if response.status_code == 200:
            job = response.json()
            status = job['status']
            
            print(f"Job {job_id}: {status}")
            
            if status in ['completed', 'failed', 'cancelled']:
                return job
                
            # Show progress if available
            if 'progress' in job:
                print(f"  Progress: {job['progress']['percentage']}%")
        
        time.sleep(poll_interval)
    
    raise TimeoutError(f"Job {job_id} did not complete within {max_wait_time} seconds")

# Usage
try:
    completed_job = wait_for_job_completion("job_abc123def456")
    if completed_job['status'] == 'completed':
        print("Job completed successfully!")
    else:
        print(f"Job failed: {completed_job.get('error_details', {}).get('message')}")
except TimeoutError as e:
    print(f"Timeout: {e}")

Exponential Backoff Polling

import time
import requests

headers = {"X-API-Key": "your-api-key"}

def poll_with_backoff(job_id, initial_interval=2, max_interval=30, backoff_factor=1.5):
    """
    Poll with exponential backoff to reduce API calls
    """
    interval = initial_interval
    
    while True:
        response = requests.get(f"https://api.example.com/jobs/{job_id}", headers=headers)
        job = response.json()
        
        if job['status'] in ['completed', 'failed', 'cancelled']:
            return job
            
        print(f"Job {job_id}: {job['status']} - waiting {interval}s")
        time.sleep(interval)
        
        # Increase interval for next poll
        interval = min(interval * backoff_factor, max_interval)
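
When many clients poll the same way, they can fall into lockstep after an outage. Adding a small random jitter to each wait spreads the load; this is an optional client-side refinement, not an API feature:

import random
import time

def jittered_sleep(interval, jitter=0.25):
    """Sleep for the base interval plus up to 25% random extra delay"""
    time.sleep(interval * (1 + random.uniform(0, jitter)))

Replace time.sleep(interval) in poll_with_backoff with jittered_sleep(interval) to apply it.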

Real-time Status Updates

WebSocket Connection

// updateProgressBar, handleJobCompletion, and handleJobFailure are
// application-defined callbacks, not part of the API.
const ws = new WebSocket('wss://api.example.com/jobs/job_abc123def456/status');

ws.onopen = function() {
    console.log('Connected to job status stream');
};

ws.onmessage = function(event) {
    const statusUpdate = JSON.parse(event.data);
    
    console.log(`Job ${statusUpdate.job_id}: ${statusUpdate.status}`);
    
    if (statusUpdate.progress) {
        console.log(`Progress: ${statusUpdate.progress.percentage}%`);
        updateProgressBar(statusUpdate.progress.percentage);
    }
    
    if (statusUpdate.status === 'completed') {
        console.log('Job completed!');
        ws.close();
        handleJobCompletion(statusUpdate);
    } else if (statusUpdate.status === 'failed') {
        console.log('Job failed:', statusUpdate.error_details);
        ws.close();
        handleJobFailure(statusUpdate);
    }
};

ws.onerror = function(error) {
    console.error('WebSocket error:', error);
};
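
For Python clients, the same stream can be consumed with the third-party websockets package (this assumes the server sends plain JSON messages like those handled above):

import asyncio
import json
import websockets  # pip install websockets

async def stream_job_status(job_id):
    """Print each status update until the job reaches a terminal state"""
    uri = f"wss://api.example.com/jobs/{job_id}/status"
    async with websockets.connect(uri) as ws:
        async for message in ws:
            update = json.loads(message)
            print(f"Job {update['job_id']}: {update['status']}")
            if update['status'] in ('completed', 'failed'):
                return update

completed = asyncio.run(stream_job_status("job_abc123def456"))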

Job Status Filters

List jobs with status filtering.

Query Parameters

status
string

Filter by status: "queued", "processing", "completed", "failed", "cancelled"

operation
string

Filter by operation type: "extract", "parse", "split", "classify"

created_after
string

ISO timestamp - only jobs created after this time

created_before
string

ISO timestamp - only jobs created before this time

tags
string

Comma-separated list of tags to filter by

limit
number
default:"50"

Maximum number of jobs to return (1-100)

offset
number
default:"0"

Number of jobs to skip for pagination

# Get all failed extraction jobs from the last 24 hours
import requests
from datetime import datetime, timedelta

headers = {"X-API-Key": "your-api-key"}

yesterday = (datetime.utcnow() - timedelta(days=1)).isoformat() + 'Z'

params = {
    'status': 'failed',
    'operation': 'extract',
    'created_after': yesterday,
    'limit': 20
}

response = requests.get(
    "https://api.example.com/jobs",
    headers=headers,
    params=params
)

failed_jobs = response.json()
print(f"Found {len(failed_jobs['jobs'])} failed extraction jobs")

for job in failed_jobs['jobs']:
    print(f"  {job['job_id']}: {job['error_details']['message']}")
{
  "jobs": [
    {
      "job_id": "job_abc123def456",
      "status": "completed",
      "operation": "extract",
      "created_at": "2024-01-15T10:30:00Z",
      "completed_at": "2024-01-15T10:35:00Z",
      "file_count": 2,
      "tags": ["invoices", "batch-2024-01"]
    },
    {
      "job_id": "job_def456ghi789",
      "status": "failed",
      "operation": "parse", 
      "created_at": "2024-01-15T09:45:00Z",
      "error_details": {
        "error_type": "file_corrupted",
        "message": "Unable to read PDF file"
      }
    }
  ],
  "pagination": {
    "total": 156,
    "limit": 50,
    "offset": 0,
    "has_more": true
  }
}
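
The pagination object makes it straightforward to walk every page; a sketch that keeps advancing the offset until has_more is false:

import requests

headers = {"X-API-Key": "your-api-key"}

def list_all_jobs(base_params):
    """Collect every matching job across all pages"""
    jobs = []
    offset = 0
    while True:
        response = requests.get(
            "https://api.example.com/jobs",
            headers=headers,
            params={**base_params, "offset": offset},
        )
        page = response.json()
        jobs.extend(page["jobs"])
        if not page["pagination"]["has_more"]:
            return jobs
        offset += page["pagination"]["limit"]

all_failed = list_all_jobs({"status": "failed"})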

Status Change Notifications

Webhook Events

Jobs automatically send webhook notifications for status changes:

{
  "event": "job.status_changed",
  "job_id": "job_abc123def456",
  "previous_status": "processing",
  "current_status": "completed",
  "timestamp": "2024-01-15T10:35:42Z",
  "progress": {
    "percentage": 100,
    "files_completed": 3,
    "total_files": 3
  }
}
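
On the receiving side, a webhook endpoint only needs to parse this payload and branch on the event fields. A minimal sketch using Flask (the framework and route are illustrative choices, not part of the API):

from flask import Flask, request

app = Flask(__name__)

@app.route("/webhooks/jobs", methods=["POST"])
def handle_job_event():
    event = request.get_json()
    if event["event"] == "job.status_changed":
        if event["current_status"] == "completed":
            print(f"Job {event['job_id']} completed")
        elif event["current_status"] == "failed":
            print(f"Job {event['job_id']} failed")
    return "", 204  # acknowledge quickly; do heavy processing asynchronously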

Email Notifications

Configure email notifications for job completion:

# Enable email notifications when creating job
data = {
    'operation': 'extract',
    'parameters': extraction_params,
    'notification_settings': {
        'email': 'user@example.com',
        'events': ['completed', 'failed'],
        'include_summary': True
    }
}

Error Handling

Common Error Scenarios

404
Not Found

Job ID does not exist or has been deleted

403
Forbidden

Job belongs to different account or insufficient permissions

429
Too Many Requests

Status check rate limit exceeded

Error Response Format

{
  "error": "Job not found",
  "details": {
    "error_type": "not_found",
    "message": "Job with ID 'job_invalid123' does not exist",
    "job_id": "job_invalid123"
  },
  "suggestions": [
    "Check the job ID for typos",
    "Verify the job belongs to your account",
    "Job may have been automatically deleted after retention period"
  ]
}
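
A client can branch on these status codes before trusting the response body; a sketch (the Retry-After header on 429 responses is an assumption - verify it against your actual responses):

import time
import requests

headers = {"X-API-Key": "your-api-key"}

def get_job_or_raise(job_id, max_retries=3):
    """Fetch a job, translating the documented error responses into exceptions"""
    for attempt in range(max_retries + 1):
        response = requests.get(f"https://api.example.com/jobs/{job_id}", headers=headers)
        if response.status_code == 404:
            raise LookupError(f"Job {job_id} not found (it may be past the retention period)")
        if response.status_code == 403:
            raise PermissionError(f"No access to job {job_id}")
        if response.status_code == 429 and attempt < max_retries:
            # Retry-After is an assumption; fall back to a fixed delay if absent
            time.sleep(int(response.headers.get("Retry-After", "5")))
            continue
        response.raise_for_status()
        return response.json()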

Performance Optimization

Efficient Status Checking

# Instead of checking each job individually
for job_id in job_ids:
    response = requests.get(f"https://api.example.com/jobs/{job_id}", headers=headers)  # one API call per job

# Use batch status check
response = requests.get(
    "https://api.example.com/jobs/status",
    headers=headers,
    params={"job_ids": ",".join(job_ids)}  # single API call for all jobs
)

Caching Status Results

import time
import requests
from functools import lru_cache

@lru_cache(maxsize=100)
def _fetch_job_status(job_id, time_bucket):
    """time_bucket changes once per cache window, so old entries stop being reused"""
    response = requests.get(f"https://api.example.com/jobs/{job_id}", headers=headers)
    return response.json()

def get_job_status_cached(job_id, cache_duration=30):
    """Cache job status for cache_duration seconds to reduce API calls"""
    return _fetch_job_status(job_id, int(time.time() // cache_duration))

# Usage - reuses the cached result when called within the same 30-second window
status = get_job_status_cached("job_abc123def456")

Rate Limits

  • Individual Status Check: 1000 requests per minute
  • Batch Status Check: 100 requests per minute
  • Job Listing: 60 requests per minute
  • WebSocket Connections: 10 concurrent connections per account

Rate limits are enforced per API key and reset on a rolling window basis.
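
To stay safely under these limits on the client side, space requests by a minimum interval; a simple throttle sketch (0.1 s between calls keeps individual checks around 600 per minute, under the 1000 limit):

import time

class Throttle:
    """Enforce a minimum delay between successive API calls"""
    def __init__(self, min_interval=0.1):
        self.min_interval = min_interval
        self.last_call = 0.0

    def wait(self):
        elapsed = time.time() - self.last_call
        if elapsed < self.min_interval:
            time.sleep(self.min_interval - elapsed)
        self.last_call = time.time()

throttle = Throttle()
# Call throttle.wait() before each status check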