Document Intelligence API
Document image → structured fields, answers, and tables — one endpoint family.
Intelligent document processing powered by the Brainiall Document Intelligence engine. Send a document image and get back one of three things: structured fields tuned to the document type (receipt, invoice, ID, contract, form, or generic key-value), a grounded answer to a natural-language question about it, or every table reconstructed into headers and rows. Pricing is per page: $0.01/page for extraction and Q&A and $0.012/page for table extraction, instead of a per-feature pricing matrix.
How we compare
The hyperscaler IDP services are powerful but priced per feature and per document type — OCR is one rate, forms another, tables another, expense/ID parsers another again, and you stitch the calls together yourself. Brainiall folds recognition, doc-type-aware field extraction, document Q&A and table extraction into one endpoint family at a single per-page price, self-serve from the first call.
| Provider | Surface | Pricing model | Approx. price | Onboarding |
|---|---|---|---|---|
| Brainiall Document Intelligence | One family: /extract (6 doc types), /query, /tables | Per page, all features included | $0.01 / page ($0.012 / page for table extraction) | Self-serve, instant API key |
| AWS Textract | DetectDocumentText / AnalyzeDocument (Forms, Tables, Queries) / AnalyzeExpense / AnalyzeID | Per page, per feature | ~$0.0015 OCR · ~$0.05 forms · ~$0.015 tables/queries · ~$0.10 expense & ID | Self-serve (AWS account + IAM) |
| Azure AI Document Intelligence | Read (OCR) / prebuilt models (invoice, receipt, ID, …) / custom models | Per page, per model class | ~$0.0015 Read · ~$0.01 prebuilt · ~$0.05 custom | Self-serve (Azure resource) |
| Google Document AI | Document OCR / Form Parser / specialized & custom processors | Per page, per processor | ~$0.0015 OCR · ~$0.03 form parser · ~$0.065 specialized | Self-serve (GCP project + processor setup) |
Prices are list-price approximations for orientation, not quotes — hyperscaler IDP pricing is tiered and feature-specific. Always check each vendor's current pricing page.
Pricing
One per-page price covers OCR, field extraction and Q&A; table extraction is a small step up. The free tier is generous enough to wire up an end-to-end pipeline.
Free
$0/mo
50 calls/month · extract + query + tables · forever free
Starter
$29/mo
~3,000 pages/month · all 6 doc types · batch-friendly
Pro
$99/mo
~12,000 pages/month · priority queue · 99.5% SLA
Business
$399/mo
~60,000 pages/month · dedicated capacity · email + Slack
PAYG: $0.01 / page for /extract and /query, $0.012 / page for /tables (Brainiall Document Intelligence engine). One page = one document image. No per-feature surcharges, no minimum spend, no contract.
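To make the per-page arithmetic concrete, here is a minimal sketch that estimates a monthly PAYG bill and the effective per-page rate of each plan. The rates and plan sizes are copied from the text above. The function names are illustrative, the plan page counts are approximate ("~"), and the sketch assumes every page counts equally against a plan allowance, which the page does not state.

```python
from decimal import Decimal

# PAYG rates per page, as listed above.
PAYG_RATE = {
    "extract": Decimal("0.01"),
    "query": Decimal("0.01"),
    "tables": Decimal("0.012"),
}

# (monthly price in dollars, approximate pages per month), as listed above.
PLANS = {
    "Starter": (Decimal("29"), 3000),
    "Pro": (Decimal("99"), 12000),
    "Business": (Decimal("399"), 60000),
}


def payg_cost(pages_by_endpoint):
    """Total PAYG cost in dollars for a {endpoint: page_count} mapping."""
    return sum(
        (PAYG_RATE[endpoint] * pages for endpoint, pages in pages_by_endpoint.items()),
        Decimal("0"),
    )


def plan_rate(plan):
    """Effective dollars per page if the plan's page allowance is fully used."""
    price, pages = PLANS[plan]
    return price / pages


if __name__ == "__main__":
    monthly = {"extract": 2000, "tables": 500}
    print("PAYG:", payg_cost(monthly))
    for name in PLANS:
        print(name, "effective rate:", plan_rate(name))
```

With these numbers, 2,000 extract pages plus 500 table pages come to $26 on PAYG, just under the Starter price. A fully used plan works out cheaper per page than PAYG.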
Three calls: extract, query, tables
# 1. Extract structured fields — doc_type picks the schema
POST https://api.brainiall.com/v1/document/extract
{"image": "<base64 png/jpeg>", "doc_type": "receipt"}
-> {"doc_type": "receipt",
"fields": {"merchant_name": "Blue Bottle Cafe", "date": "2026-05-12",
"items": [{"name": "Cappuccino", "quantity": 2, "unit_price": 4.00, "total": 8.00}, ...],
"subtotal": 15.50, "tax": 1.24, "total": 16.74, "payment_method": null},
"text": "<recognised plain text>",
"extraction_engine": "Brainiall Document Intelligence engine"}
# doc_type ∈ receipt | invoice | id | contract | form | generic
# 2. Ask a natural-language question about the document
POST https://api.brainiall.com/v1/document/query
{"image": "<base64 png/jpeg>", "question": "What was the total amount paid?"}
-> {"answer": "16.74", "found": true, "supporting_text": "TOTAL 16.74"}
# 3. Pull every table out as headers + rows
POST https://api.brainiall.com/v1/document/tables
{"image": "<base64 png/jpeg>"}
-> {"table_count": 1,
"tables": [{"title": "Line items", "headers": ["Item", "Qty", "Price"],
"rows": [["Cappuccino", "2", "8.00"], ["Croissant", "1", "3.50"]],
"row_count": 2, "column_count": 3}]}Pass doc_type: "generic" when you don't know the document kind — you get a short description, a best-guess document type, all labelled key-value pairs, and detected dates, amounts and entities. For multi-page documents, split into page images and call once per page. If a page has no readable text the API returns a 422 rather than guessing.
What it's for
- Accounts-payable & expense automation: drop in a scanned invoice or receipt and get the vendor, dates, line items, tax and total as JSON — straight into your ledger, no template configuration.
- Onboarding & KYC document capture: parse the name, document number, dates and MRZ off an ID document into structured fields your verification flow can check.
- Contract & agreement review: pull parties, effective dates, term, governing law and key obligations out of a contract page, or ask a direct question ("what's the notice period?") and get the answer plus the supporting line.
- Form & questionnaire intake: turn a filled-in form into a list of label/value pairs and checkbox states — useful for digitising paper intake at scale.
- Table-heavy reports: lift every table out of a financial statement, price list or lab report into clean headers and rows for downstream analysis.
- One bill, one key: this runs on the same Brainiall API key and usage-based billing as PDF → Markdown and the rest of the catalog — no separate IDP vendor to procure.
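For the table-heavy use case, the headers-and-rows shape returned by /tables maps directly onto records or CSV. A minimal sketch, assuming the response looks like the sample shown earlier on this page. The helper names are illustrative.

```python
import csv
import io


def table_to_records(table):
    """Turn one table into a list of {header: cell} dicts, one per row."""
    headers = table["headers"]
    return [dict(zip(headers, row)) for row in table["rows"]]


def table_to_csv(table):
    """Serialise one table as CSV text, header row first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(table["headers"])
    writer.writerows(table["rows"])
    return buffer.getvalue()


# Sample /tables response copied from the example earlier on this page.
response = {
    "table_count": 1,
    "tables": [
        {
            "title": "Line items",
            "headers": ["Item", "Qty", "Price"],
            "rows": [["Cappuccino", "2", "8.00"], ["Croissant", "1", "3.50"]],
            "row_count": 2,
            "column_count": 3,
        }
    ],
}

if __name__ == "__main__":
    for table in response["tables"]:
        print(table["title"])
        print(table_to_csv(table))
```

Cell values arrive as strings in the sample response, so numeric columns such as Qty and Price need explicit conversion before any arithmetic.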
Latency profile
Document Intelligence is a thin orchestration layer — it does not host a model itself. End-to-end timing is dominated by the underlying document reader and, where applicable, the OCR step. Plan around these envelopes.
- Native-text input (the fast path): a document image with sharp printed text or a PDF page with an embedded text layer is recognised in ~1.3 s/page. /extract, /query and /tables all sit comfortably inside a synchronous HTTP call here.
- Scanned or image-only input (the OCR path): for camera scans, photos of paper and faxed documents, the document model has to run full OCR on CPU. Budget ~20–30 s/page. A two-page scanned contract through /extract or /v1/document/layout can run ~25 s end-to-end.
- Translation flow: /v1/document/translate (available via the Document AI Expansion bundle) routes text through the translation model after extraction; on CPU this is around 12–15 s end-to-end, almost all of it the translation step.
- Recommended pattern for long or scanned input: treat the request as an async job, the same way dubbing and speech-to-speech already do: submit, render a progress indicator, fetch the result when ready.
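The submit / progress / fetch pattern recommended above can be approximated on the client side. The sketch below runs one blocking call per page in a thread pool and reports progress as pages finish. This page does not document a server-side job endpoint, so none is assumed. `run_as_job`, `on_progress` and `max_workers` are hypothetical names, and whether the API accepts this level of concurrency depends on rate limits not stated here.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def run_as_job(call, pages, on_progress=None, max_workers=4):
    """Run call(page) for every page concurrently.

    Calls on_progress(finished, total) each time a page completes, then
    returns the results in the original page order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(call, page) for page in pages]
        for finished, _ in enumerate(as_completed(futures), start=1):
            if on_progress is not None:
                on_progress(finished, len(futures))
        # result() re-raises any exception the call raised for that page.
        return [future.result() for future in futures]


if __name__ == "__main__":
    # Stand-in for a slow per-page API call.
    def fake_call(page):
        return {"page": page, "status": "done"}

    results = run_as_job(
        fake_call,
        ["page-1", "page-2"],
        on_progress=lambda finished, total: print(f"{finished}/{total} pages"),
    )
    print(results)
```

Because scanned pages can take ~20–30 s each, running them in parallel keeps wall-clock time close to that of the slowest page instead of the sum of all pages.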
Press kit & resources
What reviewers, integrators and procurement teams typically ask for.
One-page datasheet
Pricing, doc types and a copy-pasteable curl snippet on one page — built for buyer review.
Download PDF
API reference
OpenAPI spec, request/response shapes, the six doc-type schemas, error codes and rate limits.
Read docs →
Compare the catalog
How Brainiall's specialty APIs line up against AWS, Azure and Google, use case by use case.
See the comparison →
More document & text APIs
Same single API key, same usage-based pricing, different problem solved.



