GET /extract/{job_id}
curl -X GET "https://prod.visionapi.unsiloed.ai/extract/{job_id}" \
  -H "api-key: your-api-key"
{
  "job_id": "36adb597-3c2c-43e8-a259-410553291f47",
  "status": "completed",
  "file_name": null,
  "file_url": null,
  "created_at": "2026-03-10T16:41:58.407237+00:00",
  "updated_at": "2026-03-10T16:42:26.232009+00:00",
  "metadata": {
    "order": ["EIN", "Address", "Officers", "Organisation", "telephone_number"],
    "schema": {
      "type": "object",
      "required": ["EIN", "Address", "Officers", "Organisation", "telephone_number"],
      "properties": {
        "EIN": {
          "type": "string",
          "description": "employee identification number"
        },
        "Address": {
          "type": "string",
          "description": "Full Address of organisation"
        },
        "Officers": {
          "type": "array",
          "items": {
            "type": "object",
            "required": ["Officers"],
            "properties": {
              "Officers": {
                "type": "string",
                "description": "List of officers"
              }
            },
            "additionalProperties": false
          },
          "description": "List of officers"
        },
        "Organisation": {
          "type": "string",
          "description": "Name of organisation"
        },
        "telephone_number": {
          "type": "string",
          "description": "telephone number"
        }
      },
      "additionalProperties": false
    },
    "page_count": 27
  },
  "result": {
    "EIN": {
      "value": "02-0624253",
      "score": {
        "grounding_score": 0.98,
        "extraction_score": 0.99
      },
      "citation": {
        "bbox": [441, 131, 520, 147],
        "page": 2,
        "page_width": 612,
        "page_height": 792
      }
    },
    "Address": {
      "value": "602 S OGDEN ST DENVER, CO 80209",
      "score": {
        "grounding_score": 0.97,
        "extraction_score": 0.99
      },
      "citation": {
        "bbox": [84, 201, 181, 216],
        "page": 2,
        "page_width": 612,
        "page_height": 792
      }
    },
    "Officers": {
      "value": [
        {
          "Officers": {
            "value": "KIMBERLY TROGGIO",
            "score": {
              "grounding_score": 0.98,
              "extraction_score": 0.99
            },
            "citation": {
              "bbox": [0, 446, 107, 465],
              "page": 7,
              "page_width": 612,
              "page_height": 792
            }
          }
        }
      ],
      "score": {
        "grounding_score": 0.98,
        "extraction_score": 0.99
      },
      "citation": null
    },
    "Organisation": {
      "value": "GLOBAL HUMANITARIAN EXPEDITIONS",
      "score": {
        "grounding_score": 0.99,
        "extraction_score": 0.99
      },
      "citation": {
        "bbox": [84, 120, 239, 135],
        "page": 2,
        "page_width": 612,
        "page_height": 792
      }
    },
    "telephone_number": {
      "value": "(303) 858-8857",
      "score": {
        "grounding_score": 0.98,
        "extraction_score": 0.99
      },
      "citation": {
        "bbox": [441, 185, 533, 200],
        "page": 2,
        "page_width": 612,
        "page_height": 792
      }
    }
  }
}

Overview

The Get Job Results endpoint retrieves the processed data from an extraction job. The result field is populated only once the job status is “completed” or “review”, so confirm the status before reading results; while the job is queued or processing, the response carries no result field.

Path Parameters

job_id (string, required): The unique identifier of the extraction job

Response

The response structure depends on the job type (extraction, parsing, classification, etc.).

Extraction Job Results

job_id (string): Unique identifier for the extraction job
status (string): Current status of the job, lowercase: “queued”, “processing”, “completed”, “review”, or “failed”. The result field is populated when status is “completed” or “review”.
file_name (string | null): Original filename of the uploaded document
file_url (string | null): Presigned download URL for the uploaded document. Expires roughly an hour after the response is generated; re-issue this request to get a fresh URL.
created_at (string): ISO 8601 timestamp when the job was created
updated_at (string): ISO 8601 timestamp when the job was last updated
metadata (object): Job metadata containing:
  • order: Array of extracted field names in order
  • schema: The JSON schema used for extraction
  • page_count: Number of pages in the document
result (object): The extracted data matching the provided JSON schema. Present when status is “completed” or “review”. Each field contains:
  • value: The extracted value, matching the schema type
  • score: Confidence object with:
    • grounding_score: Confidence (0-1) that the value was located in the document; 0.0 when citations are disabled
    • extraction_score: Confidence (0-1) in the extracted value itself, or null
  • citation: Where the value was found, or null when citations are disabled or the value could not be grounded:
    • bbox: [left, top, right, bottom] in PDF point space (origin: top-left)
    • page: Page number where the value was found (1-indexed)
    • page_width: Width of the source page in points
    • page_height: Height of the source page in points
For array fields, the value is an array of objects whose sub-fields each carry their own value, score, and citation; the array field itself also carries an aggregated score and citation: null. Arrays nested below the top level omit the citation key entirely.
The nested score/citation shape applies to citation-enabled jobs and to the default gamma model tier. Jobs run with enable_citations=false on the alpha, beta, or delta tiers return a legacy flat shape instead: each field is {"value": ..., "score": <number>} with no citation key.
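Because a field can arrive in either shape, client code may want to normalize it before use. The following is a minimal sketch, not part of the API: normalize_field is a hypothetical helper name, and mapping the legacy flat score onto extraction_score is an assumption made for illustration only.

```python
from typing import Any, Dict


def normalize_field(field: Dict[str, Any]) -> Dict[str, Any]:
    """Return a field in the nested shape, whichever shape it arrived in."""
    score = field.get("score")
    if isinstance(score, dict):
        # Nested shape: the score is already an object with two components.
        grounding = score.get("grounding_score")
        extraction = score.get("extraction_score")
    else:
        # Legacy flat shape: the score is a single number.
        # ASSUMPTION: treat it as the extraction score; grounding is unknown.
        grounding = None
        extraction = score
    return {
        "value": field.get("value"),
        "score": {"grounding_score": grounding, "extraction_score": extraction},
        # The legacy shape has no citation key, so this becomes None.
        "citation": field.get("citation"),
    }
```

After normalization, downstream code can read field["score"]["extraction_score"] without first checking which model tier or citation setting produced the job.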

Complete Workflow Example

Here’s a complete example of submitting a job, monitoring its progress, and retrieving results:
import requests
import time
import json

def process_document_with_results(file_path, schema, api_key, poll_interval=5, max_wait=600):
    """Complete workflow: submit job, wait for completion, return results"""

    headers = {"api-key": api_key}

    # Step 1: Submit extraction job (the with block closes the file afterwards)
    with open(file_path, "rb") as pdf_file:
        response = requests.post(
            "https://prod.visionapi.unsiloed.ai/v2/extract",
            files={"pdf_file": pdf_file},
            data={"schema_data": json.dumps(schema)},
            headers=headers,
            timeout=60,
        )

    if response.status_code != 200:
        raise Exception(f"Failed to submit job: {response.text}")

    job_id = response.json()["job_id"]
    print(f"Job submitted: {job_id}")

    # Step 2: Poll until the job finishes or max_wait seconds have elapsed
    # (max_wait and poll_interval defaults are arbitrary; tune them for your documents)
    deadline = time.monotonic() + max_wait
    while time.monotonic() < deadline:
        status_response = requests.get(
            f"https://prod.visionapi.unsiloed.ai/extract/{job_id}",
            headers=headers,
            timeout=30,
        )

        if status_response.status_code != 200:
            raise Exception(f"Failed to check status: {status_response.text}")

        job = status_response.json()
        status = job["status"]
        print(f"Job status: {status}")

        if status in ("completed", "review"):
            # Step 3: The same response already carries the result field,
            # so no second request is needed
            return job
        elif status == "failed":
            raise Exception(f"Job failed: {job.get('error', 'Unknown error')}")

        time.sleep(poll_interval)  # Wait before checking again

    raise TimeoutError(f"Job {job_id} did not finish within {max_wait} seconds")

# Usage example
schema = {
    "type": "object",
    "properties": {
        "company_name": {
            "type": "string",
            "description": "Name of the company"
        },
        "total_amount": {
            "type": "number",
            "description": "Total financial amount"
        }
    },
    "required": ["company_name"],
    "additionalProperties": False
}

try:
    results = process_document_with_results("document.pdf", schema, "your-api-key")
    print("Extraction completed!")
    print("Results:", results)
except Exception as e:
    print(f"Error: {e}")

Result Data Structure

Extracted Field Format

Each extracted field in the result object contains:
  • value: The extracted value, matching the schema type (string, number, boolean, or an array of objects for array fields)
  • score: A confidence object with grounding_score (0-1 confidence the value was located in the document; 0.0 when citations are disabled) and extraction_score (0-1 confidence in the value itself, or null)
  • citation: A citation object indicating where the value was found, or null when citations are disabled or the value could not be grounded
For array fields, the value is an array of objects whose sub-fields each carry their own value, score, and citation.
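When only the extracted values are needed, the score and citation wrappers can be stripped recursively. The sketch below assumes the nested shape described above; plain_value and plain_record are hypothetical helper names, and falling back to the result's own key order when metadata.order is missing is an assumption.

```python
from typing import Any, Dict


def plain_value(field: Any) -> Any:
    """Strip score and citation from one field, recursing into array fields."""
    if not (isinstance(field, dict) and "value" in field):
        return field  # not a wrapped field; pass it through unchanged
    value = field["value"]
    if isinstance(value, list):
        # Array field: each item is an object whose sub-fields are wrapped too.
        return [
            {name: plain_value(sub) for name, sub in item.items()}
            if isinstance(item, dict) else item
            for item in value
        ]
    return value


def plain_record(job: Dict[str, Any]) -> Dict[str, Any]:
    """Return {field_name: plain value}, ordered by metadata.order when present."""
    result = job.get("result") or {}
    order = (job.get("metadata") or {}).get("order") or list(result)
    return {name: plain_value(result[name]) for name in order if name in result}
```

For a job that is still queued or processing there is no result field, so plain_record returns an empty dict rather than raising.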

Citation Format

Each citation object contains:
  • bbox: [left, top, right, bottom] coordinates in PDF point space (origin: top-left)
  • page: The page number where the value was found (1-indexed)
  • page_width: Width of the source page in points
  • page_height: Height of the source page in points
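To draw a citation on a rendered page image, the bbox can be scaled from PDF points to pixels using page_width and page_height. This is a minimal sketch: bbox_to_pixels is a hypothetical helper, and it assumes the image covers the whole page and also uses a top-left origin.

```python
from typing import Any, Dict, Tuple


def bbox_to_pixels(
    citation: Dict[str, Any], image_width: int, image_height: int
) -> Tuple[int, int, int, int]:
    """Scale a citation bbox from PDF points to pixel coordinates."""
    left, top, right, bottom = citation["bbox"]
    # Pixels per point along each axis of the rendered page image.
    scale_x = image_width / citation["page_width"]
    scale_y = image_height / citation["page_height"]
    return (
        round(left * scale_x),
        round(top * scale_y),
        round(right * scale_x),
        round(bottom * scale_y),
    )
```

For example, the EIN citation [441, 131, 520, 147] on a 612 x 792 point page maps to (882, 262, 1040, 294) on a 1224 x 1584 pixel render, since both axes scale by exactly 2.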

Confidence Scores

  • 0.9-1.0: Very high confidence, extraction is very likely correct
  • 0.8-0.9: High confidence, extraction is likely correct
  • 0.7-0.8: Good confidence, may warrant review for critical applications
  • 0.6-0.7: Medium confidence, should be reviewed
  • Below 0.6: Low confidence, likely needs manual verification
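The bands above can be turned into a small routing helper. This sketch makes two assumptions that the list does not state: a score that sits exactly on a boundary (for example 0.9) is placed in the higher band, and a null score is reported as "unknown". The function name and labels are illustrative only.

```python
from typing import Optional


def confidence_band(score: Optional[float]) -> str:
    """Map a 0-1 confidence score to one of the bands listed above."""
    if score is None:
        return "unknown"  # e.g. extraction_score can be null
    # ASSUMPTION: lower bounds are inclusive, so 0.9 counts as "very high".
    if score >= 0.9:
        return "very high"
    if score >= 0.8:
        return "high"
    if score >= 0.7:
        return "good"
    if score >= 0.6:
        return "medium"
    return "low"
```

A caller might send fields in the "medium" and "low" bands to manual review and accept the rest automatically; where to draw that line depends on the application.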

Error Handling

The endpoint always returns 200 for an existing job. While the job is queued or processing, the response simply has no result field; keep polling until the status is “completed” or “review”.
A failed job also returns 200, with status “failed” and the failure reason in the error field.
If the job cannot be found, the job ID is invalid, belongs to another organization, or the job has been deleted. Verify you’re using the correct job ID.
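Since a 200 status code alone does not say whether results are ready, a client has to look at the status field as well. The sketch below shows one way to fold both checks into a single decision; classify_response and its return labels are hypothetical, and every non-200 response is lumped into one "error" category because the specific error codes are not listed here.

```python
from typing import Any, Dict, Optional


def classify_response(status_code: int, body: Optional[Dict[str, Any]]) -> str:
    """Reduce an HTTP status code and parsed JSON body to one outcome label."""
    if status_code != 200 or body is None:
        # Job not found, wrong organization, deleted, or another request error.
        return "error"
    status = body.get("status")
    if status in ("completed", "review"):
        return "ready"    # the result field is populated
    if status == "failed":
        return "failed"   # the failure reason is in the error field
    return "pending"      # queued or processing: keep polling
```

A polling loop can then continue on "pending", read body["result"] on "ready", and surface body.get("error") on "failed".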

Authorizations

api-key (string, header, required)

Response

200 (application/json): Extraction job status and results retrieved successfully