Unsiloed exposes a remote Model Context Protocol server at https://mcp.unsiloed.ai/mcp. Add it once as a custom connector in Claude.ai or Claude Desktop and Claude can parse PDFs, classify documents, and extract structured JSON on your behalf. No API key pasting required.

What is the Unsiloed MCP Server?

The Unsiloed MCP Server is a remote Model Context Protocol server that gives Claude (and any MCP-compatible client) direct, authenticated access to Unsiloed’s document-processing tools. You connect once via OAuth. Claude then has six tools at its disposal:

parse_document

Convert a PDF, DOCX, PPTX, XLSX, or image into clean, LLM-ready markdown.

classify_document

Label a document against caller-defined categories (e.g. invoice vs contract vs receipt).

extract_data

Pull structured JSON from a PDF using a caller-provided JSON Schema.

get_*_status

Poll long-running jobs (get_parse_status, get_classify_status, get_extract_status).

How to connect

1. Open Claude.ai or Claude Desktop

Navigate to Settings → Connectors → Add custom connector.

2. Paste the MCP server URL

Use the production endpoint:
https://mcp.unsiloed.ai/mcp
No API key is collected here. Authentication happens via OAuth in the next step.

3. Click Connect

Claude opens a popup to sign in via Unsiloed.

4. Sign in to Unsiloed

Use your normal Unsiloed account credentials. If you don’t have an account yet, create one at unsiloed.ai. The free tier is enough to try it out.

5. Approve the consent screen

You’ll see “Claude wants access to your Unsiloed organization, with scopes: parse, classify, extract.” Click Allow.

6. Verify the connection

Back in Claude, the connector card should show six tools split into read-only (the three status pollers) and read/write groups (parse, classify, extract).
Ask Claude in a new chat: “List the tools available from the Unsiloed connector.” You should see all six listed by name.

Authentication

The MCP server uses OAuth 2.1 with PKCE and Dynamic Client Registration (RFC 7591). No API key is ever pasted into Claude.
The first time you connect, Claude does the full dance behind the scenes:
  1. Claude POSTs /mcp; the server responds HTTP 401 with a WWW-Authenticate header pointing at Unsiloed’s authorization server.
  2. Claude discovers the authorization server via the standard .well-known/oauth-protected-resource metadata document.
  3. Claude registers itself dynamically (RFC 7591) and opens your browser to Unsiloed’s login + consent screens.
  4. After you click Allow, the authorization server issues a short-lived access token and a refresh token.
  5. Claude retries /mcp with Authorization: Bearer <token>. The server validates it and associates the call with your organization.
From your perspective: one login, one Allow button, done.
  • Access token: 15-minute lifetime. Sent on every Claude → MCP call.
  • Refresh token: ~30-day lifetime, rotated on every use per OAuth 2.1.
When the access token expires mid-conversation, Claude refreshes it transparently using the refresh token, and you see no interruption.
If you sign out of Unsiloed or an admin revokes the refresh token, Claude will surface a “reconnect required” message; click Reconnect to re-run the consent step.
Three scopes are requested during consent:
Scope | Allows
parse | parse_document + get_parse_status
classify | classify_document + get_classify_status
extract | extract_data + get_extract_status
You can approve all three or only some during consent. Tools requiring a denied scope return a clear “Missing required OAuth scope” message; reconnect to grant additional scopes.
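To make the scope-to-tool relationship concrete, the sketch below encodes the mapping from the table and answers two questions: which tools a given set of granted scopes unlocks, and which scope a blocked tool is missing. It is a client-side illustration only; the helper names are hypothetical and the server's actual enforcement logic is not shown in this document.

```python
# Scope -> tools, copied from the consent table above.
SCOPE_TOOLS = {
    "parse": ["parse_document", "get_parse_status"],
    "classify": ["classify_document", "get_classify_status"],
    "extract": ["extract_data", "get_extract_status"],
}


def tools_for_scopes(granted):
    """Return the tool names usable with the granted scopes, in table order."""
    tools = []
    for scope, names in SCOPE_TOOLS.items():
        if scope in granted:
            tools.extend(names)
    return tools


def missing_scope(tool, granted):
    """Return the scope a tool still needs, or None if it is already granted."""
    for scope, names in SCOPE_TOOLS.items():
        if tool in names:
            return None if scope in granted else scope
    raise ValueError(f"unknown tool: {tool}")


granted = {"parse"}
print(tools_for_scopes(granted))
print(missing_scope("extract_data", granted))
```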

The six tools

Parse

parse_document

Accepts PDF, DOCX, DOC, PPTX, PPT, XLSX, XLS, PNG, JPEG, and TIFF. Returns full markdown content plus per-chunk structure and job metadata.
Inputs:
  • file_url (string, optional): publicly fetchable HTTPS URL. Presigned S3 URLs work.
  • file_base64 (string, optional): base64-encoded file contents (provide either this OR file_url).
  • file_name (string, optional): defaults to document.pdf. Required for non-PDF formats when using file_base64.
  • mode (fast | accurate | agentic): see Processing modes.
  • page_range (string, optional): e.g. "1-5", "2,4,6", "1-3,7,10-12". Omit for the whole document.
Returns: the merged markdown inline + job metadata (page count, total chunks, credit used, timestamps). For documents over ~30 pages, the call may exceed the tool timeout and return a status: pending envelope with a job_id. Poll with get_parse_status.
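In normal use Claude assembles the tool call for you. If you are curious what the inputs look like on the wire, the sketch below builds a parse_document argument object from the parameters listed above and wraps it in the JSON-RPC tools/call request that MCP clients send. The helper names and the validation rules (for example, the %PDF magic-byte check) are assumptions made for illustration; the server's own validation may differ.

```python
import base64
import json
from typing import Optional

VALID_MODES = {"fast", "accurate", "agentic"}


def parse_document_args(
    file_url: Optional[str] = None,
    file_bytes: Optional[bytes] = None,
    file_name: Optional[str] = None,
    mode: Optional[str] = None,
    page_range: Optional[str] = None,
) -> dict:
    """Build the arguments object for parse_document (hypothetical helper)."""
    if (file_url is None) == (file_bytes is None):
        raise ValueError("provide exactly one of file_url or file_bytes")
    if mode is not None and mode not in VALID_MODES:
        raise ValueError(f"mode must be one of {sorted(VALID_MODES)}")
    args = {}
    if file_url is not None:
        args["file_url"] = file_url
    else:
        # file_name defaults to document.pdf, so non-PDF uploads need a name.
        if file_name is None and not file_bytes.startswith(b"%PDF"):
            raise ValueError("file_name is required for non-PDF base64 uploads")
        args["file_base64"] = base64.b64encode(file_bytes).decode("ascii")
    if file_name is not None:
        args["file_name"] = file_name
    if mode is not None:
        args["mode"] = mode
    if page_range is not None:
        args["page_range"] = page_range
    return args


def tools_call(name: str, arguments: dict, request_id: int = 1) -> dict:
    """Wrap tool arguments in an MCP tools/call JSON-RPC request."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = tools_call(
    "parse_document",
    parse_document_args(
        file_url="https://example.com/report.pdf",  # placeholder URL
        mode="accurate",
        page_range="1-3,7,10-12",
    ),
)
print(json.dumps(request, indent=2))
```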
get_parse_status

Use when parse_document returned status: pending.
Inputs:
  • job_id (UUID, required).
  • include_chunks (boolean, default true): set false for a cheap status-only poll.
Returns: full markdown + metadata once status: Succeeded. Otherwise just metadata.
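The polling pattern described here can be sketched as a small loop: poll cheaply with include_chunks set to false until the job succeeds, then fetch the full result once. The get_parse_status callable is a stand-in for however your client invokes the tool, and the "Failed" status and the "markdown" field name are assumptions, since this page only documents pending and Succeeded.

```python
import time
from typing import Callable

# Assumption: the document names only "pending" and "Succeeded".
# "Failed" is a hypothetical terminal status used for illustration.
FAILED_STATUSES = {"Failed"}


def wait_for_parse(
    get_parse_status: Callable[[str, bool], dict],
    job_id: str,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job succeeds, then return the full result."""
    deadline = time.monotonic() + timeout_s
    while True:
        # Cheap status-only poll (include_chunks=False).
        status = get_parse_status(job_id, False).get("status")
        if status == "Succeeded":
            # One final call with include_chunks=True for the full payload.
            return get_parse_status(job_id, True)
        if status in FAILED_STATUSES:
            raise RuntimeError(f"job {job_id} ended with status {status}")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} still {status} after {timeout_s}s")
        sleep(interval_s)
```

Passing sleep as a parameter keeps the loop easy to test with a fake status function and no real waiting.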

Processing modes

All modes run through Unsiloed’s parsing pipeline with smart layout detection. Pick the one that matches your document complexity and latency budget:
Mode | Best for
fast | Clean born-digital PDFs (system-generated invoices, receipts). Lowest latency and lowest cost.
accurate | Most real-world documents: scanned PDFs, multi-column layouts, tables spanning pages.
agentic | Highest fidelity. Use for legal contracts, 10-K/10-Q filings, equations, handwriting.

Classify

classify_document

Inputs:
  • file_url or file_base64 (one required).
  • categories (array, required): 1 to 20 objects shaped {name: string, description?: string}. Descriptions strongly improve accuracy by giving the classifier label hints.
Returns: predicted classification plus per-page confidence scores.
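As an illustration of the categories input, the sketch below turns a simple name-to-description mapping into the documented array of {name, description?} objects and enforces the 1 to 20 limit. The helper name and the example descriptions are made up for this sketch.

```python
def build_categories(labels: dict) -> list:
    """Convert {name: description or None} into the categories array."""
    if not 1 <= len(labels) <= 20:
        raise ValueError("categories must contain between 1 and 20 entries")
    categories = []
    for name, description in labels.items():
        if not name:
            raise ValueError("every category needs a non-empty name")
        item = {"name": name}
        if description:
            # Descriptions are optional but act as hints for the classifier.
            item["description"] = description
        categories.append(item)
    return categories


categories = build_categories({
    "invoice": "A bill requesting payment for goods or services.",
    "receipt": "Proof that a payment has already been made.",
    "purchase_order": None,  # no description supplied
})
print(categories)
```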
get_classify_status

Same shape as get_parse_status. Use when classify_document returned status: pending.

Extract

extract_data

Inputs:
  • file_url or file_base64 (one required).
  • json_schema (object, required): JSON Schema (draft-07 compatible) defining the desired output. Per-field description values are passed to the underlying model as extraction hints.
  • model (alpha | beta | gamma | delta): see Model tiers.
  • enable_citations (boolean, default false): when true, includes bbox coordinates for each extracted value.
Returns: the extracted object, typed against your schema, with per-field confidence scores.
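For a sense of what a json_schema input might look like, here is a sketch of a draft-07 schema for invoice line items, together with a hypothetical helper that assembles the extract_data arguments. The field types and description texts are assumptions chosen for the example; only the parameter names, the model values, and the defaults come from this page.

```python
import json

VALID_MODELS = {"alpha", "beta", "gamma", "delta"}

# Example schema; field types and descriptions are illustrative assumptions.
LINE_ITEMS_SCHEMA = {
    "$schema": "http://json-schema.org/draft-07/schema#",
    "type": "object",
    "properties": {
        "line_items": {
            "type": "array",
            "description": "One entry per billed row on the invoice.",
            "items": {
                "type": "object",
                "properties": {
                    "description": {"type": "string", "description": "Item or service name."},
                    "quantity": {"type": "number", "description": "Units billed."},
                    "unit_price": {"type": "number", "description": "Price per unit."},
                    "total": {"type": "number", "description": "Row total."},
                },
                "required": ["description", "total"],
            },
        }
    },
    "required": ["line_items"],
}


def extract_data_args(file_url, json_schema, model="gamma", enable_citations=False):
    """Build the arguments object for extract_data (hypothetical helper)."""
    if model not in VALID_MODELS:
        raise ValueError(f"model must be one of {sorted(VALID_MODELS)}")
    if json_schema.get("type") != "object":
        raise ValueError("this sketch expects a top-level object schema")
    return {
        "file_url": file_url,
        "json_schema": json_schema,
        "model": model,
        "enable_citations": enable_citations,
    }


args = extract_data_args("https://example.com/invoice.pdf", LINE_ITEMS_SCHEMA)  # placeholder URL
print(json.dumps(args, indent=2))
```

Writing a description on each field costs little and, per the input notes above, gives the underlying model an extraction hint.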
get_extract_status

Same shape as get_parse_status. Use when extract_data returned status: pending.

Model tiers

Tier | Pick when…
alpha | Simple key/value or shallow schemas; fastest and cheapest.
beta | Nested objects and arrays of structured items; mid-tier latency and cost.
gamma (default) | Strong balance of accuracy and latency. Recommended for production.
delta | Highest accuracy. Use for complex contracts, dense tables, and strict numerical extraction.

Example prompts

Once connected, just talk to Claude naturally. Some prompts to try:

Parse a PDF into markdown

“Use the Unsiloed connector to parse the PDF at https://www.w3.org/WAI/ER/tests/xhtml/testfiles/resources/pdf/dummy.pdf and show me the markdown.”

Extract contract terms

“Parse the MSA at https://... and extract {parties: string[], governing_law: string, effective_date: string, termination_clause: string} using the delta model.”

Classify a batch of documents

“Classify these PDFs as invoice, receipt, or purchase_order, then for each invoice extract line_items: [{description, quantity, unit_price, total}].”

Pull a 10-K's financials

“Parse this 10-K filing and extract the income-statement figures with the schema {revenue, cogs, gross_profit, operating_expenses, net_income}.”

Privacy and data handling

  • No persistent storage: the MCP server is stateless. Nothing about your request, file contents, or OAuth session is written to disk.
  • No access beyond the request: the server reads only the files you explicitly send (file_url or file_base64) plus the OAuth token Claude attaches. It does not browse arbitrary URLs, query chat history, or read other Unsiloed data outside your organization.
  • TLS end-to-end: every hop (Claude → MCP server → Unsiloed parser) is HTTPS.
  • OAuth scopes are honored: if you approve only parse, the classify_* and extract_* tools will refuse to act and tell Claude to request additional consent.

Troubleshooting

Browser pop-up blockers can intercept the OAuth window. Allow pop-ups for claude.ai and try Connect again. If still blocked, try Claude Desktop instead, which handles the redirect natively without relying on window.open.
The OAuth tokens expire on signout and on admin revocation. Disconnect the Unsiloed connector and re-add it to mint a fresh pair. No re-registration is needed; Claude reuses the stored client_id.
Your organization has been suspended by Unsiloed (usually for billing reasons). Contact support@unsiloed.ai or visit your dashboard to resolve. Once reactivated, the next tool call resumes within a minute, with no need to reconnect.
During consent you approved fewer scopes than the tool needs. Disconnect the connector and reconnect. When the consent screen appears, approve all three of parse, classify, extract.
Your organization has exhausted its monthly parse credits. Top up at your dashboard before retrying.

Support

General enquiries

See also

Claude Tool-Use Integration

Drop-in Anthropic tool-use schemas for direct API usage without the MCP layer.

REST API Reference

Call Unsiloed’s parser, classifier, and extractor directly via REST.