A successful /parse job returns the document organized into chunks. Each chunk has an embed Markdown string (concatenated content from its segments, ready for embedding) and an array of segments with bounding boxes and metadata. The example below is a real response from a single-page test document.
Top-Level Fields
The top-level fields fall into three groups: identification, status, and timing; parsed content; and job configuration and metering.
Identification, Status, and Timing
job_id: unique identifier for the parsing job
status: job state (Succeeded, Failed, or an in-progress value such as Starting or Processing)
message: human-readable status message ("Task succeeded" when the job completes)
file_name: name of the uploaded file
file_type: MIME type of the uploaded file (e.g., application/pdf)
created_at: ISO 8601 timestamp when the job was created
started_at: ISO 8601 timestamp when processing began
finished_at: ISO 8601 timestamp when processing completed
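The status and timing fields can be combined to decide whether a job has finished and how long processing took. The sketch below assumes the response has already been loaded into a Python dict. The helper names is_terminal and processing_seconds are illustrative and not part of the API, and the handling of a trailing Z on timestamps is an assumption about how the ISO 8601 values are formatted.

```python
from datetime import datetime
from typing import Optional

# Terminal states named in the status field description; any other
# value is treated as still in progress.
TERMINAL_STATES = {"Succeeded", "Failed"}


def is_terminal(job: dict) -> bool:
    """Return True once the job has reached Succeeded or Failed."""
    return job.get("status") in TERMINAL_STATES


def _parse_ts(value: str) -> datetime:
    # datetime.fromisoformat only accepts a trailing "Z" from Python 3.11 on,
    # so normalize it to an explicit UTC offset first (assumed format).
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def processing_seconds(job: dict) -> Optional[float]:
    """Seconds between started_at and finished_at, or None if either is missing."""
    started, finished = job.get("started_at"), job.get("finished_at")
    if not started or not finished:
        return None
    return (_parse_ts(finished) - _parse_ts(started)).total_seconds()
```

A polling loop would typically call is_terminal on each fetched response and only read chunks once status is Succeeded.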
Parsed Content
chunks: array of content chunks
total_chunks: total number of chunks
page_count: total number of pages in the document
pdf_url: temporary signed S3 URL to the processed PDF
file_url: temporary signed S3 URL to the original uploaded file
Job Configuration and Metering
configuration: the full configuration object used for this parse (OCR engine, layout strategy, segment processing settings, etc.); see the Parse API reference for every option
metadata: additional job metadata; usually an empty object
merge_tables: whether tables were merged across pages
credit_used: credits consumed by this job
Chunk Fields
chunk_id: unique identifier for the chunk
chunk_length: character length of the chunk’s embed content
embed: combined Markdown content from all segments in the chunk, ready for embedding into a vector store
segments: array of layout segments within the chunk
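Because embed is described as ready for a vector store, a common first step is to pull out one text per chunk, keyed by chunk_id. The following is a minimal sketch under the assumption that the response is a Python dict shaped as described here. The function name iter_embed_texts is hypothetical, and skipping chunks with empty text is a design choice of this sketch rather than documented behavior.

```python
from typing import Iterator, Tuple


def iter_embed_texts(response: dict) -> Iterator[Tuple[str, str]]:
    """Yield (chunk_id, embed) pairs, skipping chunks whose embed text is empty."""
    for chunk in response.get("chunks", []):
        text = chunk.get("embed") or ""
        if text.strip():
            yield chunk["chunk_id"], text
```

The resulting pairs can be passed to whichever embedding model and vector store you use; keeping chunk_id alongside the text lets a search hit be traced back to its segments.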
Segment Fields
segment_id: unique identifier for the segment
segment_type: element classification; see the Element Types reference for the full list
content: plain-text content of the segment (omitted for Signature segments)
markdown: Markdown-formatted content
html: HTML-formatted content
image: signed S3 URL to a cropped image of the segment (present for most types; omitted for Signature)
bbox: bounding box relative to the page, with left, top, width, height in points
page_number: page where the segment appears
page_width / page_height: page dimensions for coordinate reference
confidence: model confidence score (0–1) for element detection
ocr: array of word-level OCR results
references: references to related segments; typically null
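Since each segment carries both its bbox and the page dimensions, its position can be expressed as a fraction of the page, which is convenient when drawing overlays on a rendering at a different resolution. The sketch below assumes page_width and page_height use the same units as bbox. The helper names normalized_bbox and segments_on_page are illustrative only.

```python
def normalized_bbox(segment: dict) -> dict:
    """Scale a segment's bbox to the 0-1 range using its page dimensions."""
    bbox = segment["bbox"]
    # Assumption: page dimensions are in the same units as the bbox.
    page_w, page_h = segment["page_width"], segment["page_height"]
    return {
        "left": bbox["left"] / page_w,
        "top": bbox["top"] / page_h,
        "width": bbox["width"] / page_w,
        "height": bbox["height"] / page_h,
    }


def segments_on_page(response: dict, page_number: int, min_confidence: float = 0.0) -> list:
    """Collect segments from every chunk that sit on one page and meet a confidence floor."""
    return [
        seg
        for chunk in response.get("chunks", [])
        for seg in chunk.get("segments", [])
        if seg.get("page_number") == page_number
        # A missing or null confidence is treated as 0.0 in this sketch.
        and (seg.get("confidence") or 0.0) >= min_confidence
    ]
```

Multiplying the normalized values by the pixel size of a rendered page image gives overlay coordinates for that image.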
OCR Item Fields
Each item in a segment’s ocr array describes one word the OCR engine recognized within that segment. The bounding box is relative to the segment’s cropped image, not the full page.
text: the recognized word or token
bbox: bounding box relative to the segment’s image, with left, top, width, height
confidence: per-word model confidence (0–1), or null when not reported
color: optional r, g, b, and hex sub-fields, present only when extract_colors: true is set in the parse configuration
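Word-level OCR items can be aggregated per segment, for example to flag segments whose recognition quality looks weak. This is a rough sketch, assuming the ocr array lists words in reading order (not stated here) and that a null confidence arrives as None after JSON decoding. The function names ocr_text and mean_ocr_confidence are hypothetical.

```python
from typing import Optional


def ocr_text(segment: dict) -> str:
    """Join the recognized words of a segment in the order the array lists them."""
    return " ".join(item["text"] for item in segment.get("ocr") or [])


def mean_ocr_confidence(segment: dict) -> Optional[float]:
    """Average per-word confidence, ignoring words whose confidence is null."""
    scores = [
        item["confidence"]
        for item in segment.get("ocr") or []
        if item.get("confidence") is not None
    ]
    # None signals that no word in the segment reported a confidence.
    return sum(scores) / len(scores) if scores else None
```

Words with a null confidence are excluded from the average rather than counted as zero, so a segment with no reported scores yields None instead of a misleading 0.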

