Document Intelligence API
Document image → structured fields, answers, and tables — one endpoint family.

Intelligent document processing powered by the Brainiall Document Intelligence engine. Send a document image and get back one of three things: structured fields tuned to the document type (receipt, invoice, ID, contract, form, or generic key-value), a grounded answer to a natural-language question about it, or every table reconstructed into headers and rows. Pricing is per page ($0.01 for field extraction and Q&A, $0.012 for table extraction) instead of a per-feature pricing matrix.

How we compare

The hyperscaler IDP services are powerful but priced per feature and per document type — OCR is one rate, forms another, tables another, expense/ID parsers another again, and you stitch the calls together yourself. Brainiall folds recognition, doc-type-aware field extraction, document Q&A and table extraction into one endpoint family at a single per-page price, self-serve from the first call.

| Provider | Surface | Pricing model | Approx. price | Onboarding |
| --- | --- | --- | --- | --- |
| Brainiall Document Intelligence | One family: /extract (6 doc types), /query, /tables | Per page, all features included | $0.01 / page ($0.012 / page for table extraction) | Self-serve, instant API key |
| AWS Textract | DetectDocumentText / AnalyzeDocument (Forms, Tables, Queries) / AnalyzeExpense / AnalyzeID | Per page, per feature | ~$0.0015 OCR · ~$0.05 forms · ~$0.015 tables/queries · ~$0.10 expense & ID | Self-serve (AWS account + IAM) |
| Azure AI Document Intelligence | Read (OCR) / prebuilt models (invoice, receipt, ID, …) / custom models | Per page, per model class | ~$0.0015 Read · ~$0.01 prebuilt · ~$0.05 custom | Self-serve (Azure resource) |
| Google Document AI | Document OCR / Form Parser / specialized & custom processors | Per page, per processor | ~$0.0015 OCR · ~$0.03 form parser · ~$0.065 specialized | Self-serve (GCP project + processor setup) |

Prices are list-price approximations for orientation, not quotes — hyperscaler IDP pricing is tiered and feature-specific. Always check each vendor's current pricing page.

Pricing

One per-page price covers OCR, field extraction and Q&A; table extraction is a small step up. The free tier is generous enough to wire up an end-to-end pipeline.

Free

$0/mo

50 calls/month · extract + query + tables · forever free

Starter

$29/mo

~3,000 pages/month · all 6 doc types · batch-friendly

Pro

$99/mo

~12,000 pages/month · priority queue · 99.5% SLA

Business

$399/mo

~60,000 pages/month · dedicated capacity · email + Slack

PAYG: $0.01 / page for /extract and /query, $0.012 / page for /tables (Brainiall Document Intelligence engine). One page = one document image. No per-feature surcharges, no minimum spend, no contract.
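
The PAYG arithmetic can be sketched in a few lines of Python. The two rates are the ones quoted above; the function name `payg_cost` is illustrative, and the sketch assumes that each endpoint call on a page image is billed as one page, which this page does not spell out.

```python
from decimal import Decimal

# Per-page PAYG rates quoted on this page, in USD.
RATE_EXTRACT_OR_QUERY = Decimal("0.01")
RATE_TABLES = Decimal("0.012")


def payg_cost(extract_pages: int = 0, query_pages: int = 0, tables_pages: int = 0) -> Decimal:
    """Estimate PAYG spend (ASSUMPTION: one endpoint call on one page = one billed page)."""
    if min(extract_pages, query_pages, tables_pages) < 0:
        raise ValueError("page counts must be non-negative")
    return (
        (extract_pages + query_pages) * RATE_EXTRACT_OR_QUERY
        + tables_pages * RATE_TABLES
    )


# Example: 1,000 receipts through /extract plus 200 report pages through /tables.
print(payg_cost(extract_pages=1000, tables_pages=200))
```

`Decimal` is used instead of floats so that fractions of a cent add up exactly.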

Three calls: extract, query, tables

# 1. Extract structured fields — doc_type picks the schema
POST https://api.brainiall.com/v1/document/extract
  {"image": "<base64 png/jpeg>", "doc_type": "receipt"}
  -> {"doc_type": "receipt",
      "fields": {"merchant_name": "Blue Bottle Cafe", "date": "2026-05-12",
                 "items": [{"name": "Cappuccino", "quantity": 2, "unit_price": 4.00, "total": 8.00}, ...],
                 "subtotal": 15.50, "tax": 1.24, "total": 16.74, "payment_method": null},
      "text": "<recognised plain text>",
      "extraction_engine": "Brainiall Document Intelligence engine"}
  # doc_type ∈ receipt | invoice | id | contract | form | generic

# 2. Ask a natural-language question about the document
POST https://api.brainiall.com/v1/document/query
  {"image": "<base64 png/jpeg>", "question": "What was the total amount paid?"}
  -> {"answer": "16.74", "found": true, "supporting_text": "TOTAL  16.74"}

# 3. Pull every table out as headers + rows
POST https://api.brainiall.com/v1/document/tables
  {"image": "<base64 png/jpeg>"}
  -> {"table_count": 1,
      "tables": [{"title": "Line items", "headers": ["Item", "Qty", "Price"],
                  "rows": [["Cappuccino", "2", "8.00"], ["Croissant", "1", "3.50"]],
                  "row_count": 2, "column_count": 3}]}
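
A minimal Python sketch of the same three calls, using only the standard library. The endpoint URLs and body fields come from the examples above; the bearer-token `Authorization` header and the helper names `build_request` and `call` are assumptions, so check the API reference for the real authentication scheme.

```python
import base64
import json
import urllib.request

API_BASE = "https://api.brainiall.com/v1/document"


def build_request(endpoint: str, image_bytes: bytes, api_key: str, **params) -> urllib.request.Request:
    """Build a POST request for "extract", "query" or "tables"."""
    body = {"image": base64.b64encode(image_bytes).decode("ascii"), **params}
    return urllib.request.Request(
        f"{API_BASE}/{endpoint}",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # ASSUMPTION: bearer-token auth; not confirmed by this page.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def call(endpoint: str, image_bytes: bytes, api_key: str, **params) -> dict:
    """Send the request and decode the JSON response body."""
    request = build_request(endpoint, image_bytes, api_key, **params)
    with urllib.request.urlopen(request, timeout=60) as response:
        return json.load(response)
```

With these helpers, `call("extract", image_bytes, key, doc_type="receipt")` corresponds to example 1 and `call("query", image_bytes, key, question="What was the total amount paid?")` to example 2.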

Pass doc_type: "generic" when you don't know the document kind — you get a short description, a best-guess document type, all labelled key-value pairs, and detected dates, amounts and entities. For multi-page documents, split into page images and call once per page. If a page has no readable text the API returns a 422 rather than guessing.
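
The per-page pattern and the 422 case can be sketched as follows. `post` is a hypothetical caller-supplied function that sends one page image to /extract and returns the decoded JSON; the sketch assumes it raises `urllib.error.HTTPError` on a non-2xx status, as `urllib.request.urlopen` does.

```python
import urllib.error
from typing import Callable, Iterable, Optional


def extract_pages(
    pages: Iterable[bytes],
    post: Callable[[bytes], dict],
) -> list[Optional[dict]]:
    """Call /extract once per page image; None marks a page with no readable text."""
    results: list[Optional[dict]] = []
    for page in pages:
        try:
            results.append(post(page))
        except urllib.error.HTTPError as err:
            if err.code != 422:
                raise  # anything other than "no readable text" is a real failure
            results.append(None)
    return results
```

Keeping a `None` placeholder for unreadable pages preserves the page order, so result `i` always belongs to page `i`.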

What it's for

  • Accounts-payable & expense automation: drop in a scanned invoice or receipt and get the vendor, dates, line items, tax and total as JSON — straight into your ledger, no template configuration.
  • Onboarding & KYC document capture: parse the name, document number, dates and MRZ off an ID document into structured fields your verification flow can check.
  • Contract & agreement review: pull parties, effective dates, term, governing law and key obligations out of a contract page, or ask a direct question ("what's the notice period?") and get the answer plus the supporting line.
  • Form & questionnaire intake: turn a filled-in form into a list of label/value pairs and checkbox states — useful for digitising paper intake at scale.
  • Table-heavy reports: lift every table out of a financial statement, price list or lab report into clean headers and rows for downstream analysis.
  • One bill, one key: this runs on the same Brainiall API key and usage-based billing as PDF → Markdown and the rest of the catalog — no separate IDP vendor to procure.
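
For the table and Q&A use cases above, the responses are easy to reshape for downstream code. The sketch below uses the response shapes shown in the examples on this page; the helper names are illustrative, and the shape of a "not found" /query response is an assumption (only `found` is relied on).

```python
from typing import Optional


def table_to_records(table: dict) -> list[dict]:
    """Turn one entry of the /tables response into a list of header -> cell dicts."""
    headers = table["headers"]
    # zip() stops at the shorter sequence, so a short row simply yields fewer keys.
    return [dict(zip(headers, row)) for row in table["rows"]]


def answer_or_none(query_response: dict) -> Optional[str]:
    """Return the /query answer only when the API reports that it was found."""
    return query_response["answer"] if query_response.get("found") else None


# Response shape copied from the /tables example on this page.
tables_response = {
    "table_count": 1,
    "tables": [{"title": "Line items", "headers": ["Item", "Qty", "Price"],
                "rows": [["Cappuccino", "2", "8.00"], ["Croissant", "1", "3.50"]],
                "row_count": 2, "column_count": 3}],
}
records = table_to_records(tables_response["tables"][0])
print(records[0]["Price"])  # prints 8.00
```

Cells arrive as strings in the example response, so convert amounts explicitly (for instance with `decimal.Decimal`) before doing arithmetic on them.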

Latency profile

Document Intelligence is a thin orchestration layer — it does not host a model itself. End-to-end timing is dominated by the underlying document reader and, where applicable, the OCR step. Plan around these envelopes.

  • Native-text input (the fast path): a document image with sharp printed text, or a PDF page with an embedded text layer, is recognised in ~1.3 s/page. /extract, /query and /tables all sit comfortably inside a synchronous HTTP call here.
  • Scanned or image-only input (the OCR path): camera scans, photos of paper, faxed documents — the document model has to run full OCR on CPU. Budget ~20–30 s/page. A two-page scanned contract through /extract or /v1/document/layout can run ~25 s end-to-end.
  • Translation flow: /v1/document/translate (available via the Document AI Expansion bundle) routes text through the translation model after extraction; on CPU this is around 12–15 s end-to-end, almost all of it the translation step.
  • Recommended pattern for long or scanned input: treat the request as an async job, the same way dubbing and speech-to-speech already do — submit, render a progress indicator, fetch the result when ready.
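
The submit / progress / fetch pattern can be approximated on the client side. This page does not document a server-side job endpoint, so the sketch below is an assumption-laden stand-in: it runs the blocking HTTP call in a background thread and polls it locally. `submit_job` and `wait_with_progress` are hypothetical helper names.

```python
import time
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable

_executor = ThreadPoolExecutor(max_workers=4)


def submit_job(call: Callable[[], dict]) -> Future:
    """Start one blocking API call in a background thread and return immediately."""
    return _executor.submit(call)


def wait_with_progress(
    job: Future,
    poll_seconds: float = 1.0,
    on_tick: Callable[[float], None] = lambda elapsed: None,
) -> dict:
    """Poll the job, reporting elapsed seconds until the result is ready."""
    started = time.monotonic()
    while not job.done():
        on_tick(time.monotonic() - started)  # e.g. update a progress indicator
        time.sleep(poll_seconds)
    return job.result()  # re-raises any exception raised by the call
```

In a UI, `on_tick` would drive the progress indicator; for scanned input the HTTP timeout inside `call` should sit comfortably above the ~20–30 s/page envelope.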

Press kit & resources

What reviewers, integrators and procurement teams typically ask for.

One-page datasheet

Pricing, doc types and a copy-pasteable curl snippet on one page — built for buyer review.

Download PDF

API reference

OpenAPI spec, request/response shapes, the six doc-type schemas, error codes and rate limits.

Read docs →

Try it now

Free API key in 30 seconds — 50 calls/month, no card.

Get a key →

Compare the catalog

How Brainiall's specialty APIs line up against AWS, Azure and Google, use case by use case.

See the comparison →
