# index

export const HeroCard = ({filename, title, description, href}) => {
  return <a className="group cursor-pointer pb-8" href={href}>
      <img src={`/images/${filename}.png`} className="block dark:hidden pointer-events-none group-hover:scale-105 transition-all duration-100" />
      <img src={`/images/${filename}-dark.png`} className="pointer-events-none group-hover:scale-105 transition-all duration-100 hidden dark:block" />
      <h3 className="mt-5 text-gray-900 dark:text-zinc-50 font-medium">
        {title}
      </h3>
      <span className="mt-1.5">{description}</span>
    </a>;
};

<div className="relative">
  <div className="relative z-10 px-4  lg:pb-24 max-w-3xl mx-auto">
    <h1 className="block text-4xl font-medium text-center text-gray-900 dark:text-zinc-50 tracking-tight">
      Welcome to Unsiloed AI
    </h1>

    <div className="max-w-xl mx-auto px-4 mt-4 text-lg text-center text-gray-500 dark:text-zinc-500">
      Agentic OCR for AI pipelines that need to trust the page. Turn PDFs, scans, and forms into Markdown and structured JSON, with confidence scores and bounding boxes on every value.
    </div>

    <div className="px-6 lg:px-0 mt-6 lg:mt-12 grid sm:grid-cols-2 gap-x-6 gap-y-4">
      <HeroCard filename="rocket" title="Quickstart" description="Submit your first document and read back parsed chunks" href="/quickstart" />

      <HeroCard filename="cli" title="API Reference" description="REST API documentation for parsing, extraction, classification, and more" href="/api-reference/parser/parse-document" />
    </div>
  </div>
</div>

<div className="max-w-5xl mx-auto px-4 sm:px-6 lg:px-8 py-12">
  <p>Unsiloed AI parses unstructured documents (PDFs, scans, slides, spreadsheets, and 20+ file formats) into Markdown and structured JSON that LLMs and agents can use directly. The API sits between your raw files and your retrieval, extraction, or automation pipeline.</p>

  <p>Generic OCR and text-only LLM parsers lose tables when columns wrap, mangle reading order in multi-column layouts, and produce brittle outputs on the real-world PDFs that show up in invoices, contracts, and forms. Unsiloed AI uses vision and layout models alongside OCR so the structure of the source survives the parse.</p>

  ## API Capabilities

  The API covers four document operations:

  <CardGroup cols={2}>
    <Card title="Parse Documents" icon="file-text" href="/document-processing/parsing/parsing">
      Convert PDFs, DOCX, PPTX, images, and more into hierarchical Markdown chunks. Tables, figures, formulas, and headers are preserved as first-class segments with bounding boxes.
    </Card>

    <Card title="Extract Structured Data" icon="database" href="/document-processing/extraction/extraction">
      Define a JSON schema and get back typed fields with word-level citations and per-field confidence scores. Useful for invoices, claims, KYC forms, and any pipeline that needs auditability.
    </Card>

    <Card title="Split Multi-Document Files" icon="scissors" href="/document-processing/splitting/splitting">
      Detect document boundaries inside merged or scanned batches and return each one separately. Works on layout, content, or custom rules.
    </Card>

    <Card title="Classify Documents" icon="tags" href="/document-processing/classification/classification">
      Route incoming files to the right downstream pipeline by classifying them against a list of categories you define.
    </Card>
  </CardGroup>
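
As a rough sketch of the schema-driven extraction described above, the Python below defines a JSON schema for an invoice and triages returned fields by confidence. The field names, the response shape (`value` and `confidence` keys), and the 0.9 threshold are assumptions made for illustration, not the documented Unsiloed AI response format; the extraction guide is the source of truth for the real contract.

```python
# Hypothetical JSON schema describing the fields to extract from an invoice.
INVOICE_SCHEMA = {
    "type": "object",
    "properties": {
        "invoice_number": {"type": "string"},
        "total_amount": {"type": "number"},
        "due_date": {"type": "string", "format": "date"},
    },
    "required": ["invoice_number", "total_amount"],
}


def triage_fields(fields, threshold=0.9):
    """Split extracted fields into accepted values and fields needing review.

    `fields` maps a field name to a dict with `value` and `confidence` keys.
    This shape is an assumption, not the documented response format.
    """
    accepted, needs_review = {}, []
    for name, field in fields.items():
        if field["confidence"] >= threshold:
            accepted[name] = field["value"]
        else:
            needs_review.append(name)
    return accepted, needs_review


# Made-up payload, used only to exercise the helper.
sample_fields = {
    "invoice_number": {"value": "INV-001", "confidence": 0.98},
    "total_amount": {"value": 1250.0, "confidence": 0.72},
}
accepted, needs_review = triage_fields(sample_fields)
print(accepted)      # {'invoice_number': 'INV-001'}
print(needs_review)  # ['total_amount']
```

Routing low-confidence fields to human review is one way to use per-field scores in an auditable pipeline; the right threshold depends on the document type and the cost of an error.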

  ## Use Unsiloed from Claude

  <CardGroup cols={2}>
    <Card title="MCP Server" icon="plug" href="/integrations/mcp-server">
      Add Unsiloed as a remote MCP connector in Claude.ai or Claude Desktop, then parse, classify, and extract from chat. Authentication is OAuth-based, so there is no API key to paste.
    </Card>

    <Card title="Claude Tool-Use Integration" icon="code" href="/integrations/claude-integration">
      Drop-in Anthropic tool-use schemas for calling Unsiloed directly from the Claude API.
    </Card>
  </CardGroup>
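
To give a sense of what a tool-use schema looks like, here is a minimal sketch of a tool definition in the Anthropic tool-use format (`name`, `description`, `input_schema`) with a small structural check. The tool name `unsiloed_classify_document` and its parameters are hypothetical and are not the schemas published on the integration page; use those for real integrations.

```python
import re

# Hypothetical tool definition in the Anthropic tool-use format.
# The name and parameters are illustrative, not Unsiloed AI's published schema.
CLASSIFY_TOOL = {
    "name": "unsiloed_classify_document",
    "description": "Classify a document against a list of caller-defined categories.",
    "input_schema": {
        "type": "object",
        "properties": {
            "document_url": {
                "type": "string",
                "description": "URL of the file to classify.",
            },
            "categories": {
                "type": "array",
                "items": {"type": "string"},
                "description": "Candidate category labels.",
            },
        },
        "required": ["document_url", "categories"],
    },
}


def validate_tool(tool):
    """Return a list of structural problems found in a tool definition."""
    problems = []
    if not re.fullmatch(r"[a-zA-Z0-9_-]{1,64}", tool.get("name", "")):
        problems.append("name must be 1-64 letters, digits, underscores, or hyphens")
    if not tool.get("description"):
        problems.append("description is missing")
    schema = tool.get("input_schema", {})
    if schema.get("type") != "object":
        problems.append("input_schema.type must be 'object'")
    missing = set(schema.get("required", [])) - set(schema.get("properties", {}))
    if missing:
        problems.append(f"required fields not defined: {sorted(missing)}")
    return problems


print(validate_tool(CLASSIFY_TOOL))  # []
```

A definition like this would be passed in the `tools` list of a Claude Messages API request; your code then executes the call Claude requests and returns the result as a tool result message.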

  ## Built for Production Pipelines

  The API is designed for the problems teams hit when they move document workflows from prototype to production.

  <div className="grid sm:grid-cols-2 gap-x-10 gap-y-6 mt-6">
    <div>
      <div className="flex items-center gap-2">
        <Icon icon="bolt" />

        <h3 className="!mt-0 !mb-2 !text-lg">Production Workloads</h3>
      </div>

      <ul className="!mt-0">
        <li>Asynchronous processing for large and multi-page documents</li>
        <li>Deterministic outputs with confidence scores and word-level bounding boxes</li>
        <li>Broad multi-format support across PDFs, DOCX, PPTX, images, and more</li>
        <li>Scalable infrastructure for high-throughput enterprise workloads</li>
      </ul>
    </div>

    <div>
      <div className="flex items-center gap-2">
        <Icon icon="laptop-code" />

        <h3 className="!mt-0 !mb-2 !text-lg">Developer Experience</h3>
      </div>

      <ul className="!mt-0">
        <li>Clean REST APIs with stable versioned contracts</li>
        <li>Schema-driven extraction with validation, confidence, and traceability</li>
        <li>Interactive playground for testing API requests, schemas, and outputs</li>
        <li>Predictable error handling for reliable production integrations</li>
      </ul>
    </div>
  </div>

  ## Common Use Cases

  * <Icon icon="chart-line" /> **Finance:** Parse financial statements, reports, and regulatory filings into structured, machine-readable data.
  * <Icon icon="scale-balanced" /> **Legal:** Extract clauses, entities, dates, and obligations from contracts and legal documents.
  * <Icon icon="heart-pulse" /> **Healthcare:** Structure clinical documents, forms, and records for downstream systems and workflows.
  * <Icon icon="robot" /> **RAG & Automation:** Parse, chunk, classify, and route documents to power reliable RAG pipelines and document-driven automations.

  ## Next Steps

  <Steps>
    <Step title="Get an API Key">
      [Sign up on Unsiloed AI](https://cal.com/aman-mishra-p0ry57/15min) to receive a key.
    </Step>

    <Step title="Parse Your First Document">
      Follow the [Quickstart](/quickstart) guide to submit a document and read back chunks.
    </Step>

    <Step title="Extract Structured Fields">
      Define a JSON schema and pull typed values out of a document. See the [extraction guide](/document-processing/extraction/extraction).
    </Step>

    <Step title="Explore the Rest of the API">
      Browse the [API reference](/api-reference/parser/parse-document) for parsing strategies, classification, splitting, and batch endpoints.
    </Step>
  </Steps>

  ## Need Help?

  <CardGroup cols={2}>
    <Card title="Documentation" icon="book" href="/document-processing/parsing/parsing">
      Guides for parsing, extraction, classification, and splitting, plus the full API reference.
    </Card>

    <Card title="Support" icon="envelope" href="mailto:support@unsiloed.ai">
      Email [support@unsiloed.ai](mailto:support@unsiloed.ai) to reach the team.
    </Card>
  </CardGroup>
</div>

