Document AI Expansion
Five new prebuilt document types, a Markdown layout endpoint, multi-skill enrichment, and translation glossaries.
Catch up to AWS Textract Specialty and Azure AI Document Intelligence prebuilt models, for ~1/3 the price.
What you get
/v1/document/extract (doc_type: business_card | w2 | health_card | mortgage | pay_stub)
+5 Prebuilt Doc Types
Five new doc-type schemas now extract structured fields out of business cards, US W-2/1099 tax forms, health insurance cards, mortgage statements, and pay stubs. Drop-in extension of the existing extract endpoint.
Engine: Brainiall Doc Intelligence (extended schemas)
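As a rough illustration of the drop-in extension, the sketch below builds (but does not send) a request for one of the five new doc types. Only the path `/v1/document/extract` and the `doc_type` values come from this page; the base URL, the `image` field name, and the bearer-token header are assumptions made for the example.

```python
import json
import urllib.request

# Hypothetical base URL; substitute the host from your own account.
BASE_URL = "https://api.example.com"

# The five doc_type values added in this release. Baseline doc types
# are not listed on this page, so this sketch does not cover them.
NEW_DOC_TYPES = {"business_card", "w2", "health_card", "mortgage", "pay_stub"}


def build_extract_request(image_b64: str, doc_type: str, api_key: str) -> urllib.request.Request:
    """Build, without sending, a POST request to /v1/document/extract."""
    if doc_type not in NEW_DOC_TYPES:
        raise ValueError(f"not one of the new doc types: {doc_type!r}")
    # Assumed body shape: "image" is a guessed field name.
    body = json.dumps({"doc_type": doc_type, "image": image_b64}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/v1/document/extract",
        data=body,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # assumed auth scheme
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send it; the response schema for each doc type is not described here, so the sketch stops at the request.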
/v1/document/layout
Document Layout (Markdown)
Return the document as structured Markdown (headings, tables, lists, code blocks, math) while preserving page structure. The single API for converting documents to an LLM-friendly format.
Engine: Brainiall Doc Layout engine
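One way to use Markdown output downstream is to split it into heading-scoped sections before passing it to an LLM or an indexer. The sketch below works on any Markdown string; it assumes headings are written in ATX style (`#`, `##`, ...), which this page does not confirm, and the function name is illustrative.

```python
import re

HEADING = re.compile(r"^(#{1,6})\s+(.*)$")
FENCE = "`" * 3  # code-fence marker, built this way to keep the example readable


def split_by_heading(markdown: str) -> list[dict]:
    """Split a Markdown string into sections, one per ATX heading."""
    sections = []
    current = {"title": "", "level": 0, "body": []}
    in_fence = False
    for line in markdown.splitlines():
        # Toggle on fence lines so '#' inside code blocks is not a heading.
        if line.lstrip().startswith(FENCE):
            in_fence = not in_fence
        match = None if in_fence else HEADING.match(line)
        if match:
            if current["title"] or current["body"]:
                sections.append(current)
            current = {
                "title": match.group(2).strip(),
                "level": len(match.group(1)),
                "body": [],
            }
        else:
            current["body"].append(line)
    if current["title"] or current["body"]:
        sections.append(current)
    return [
        {"title": s["title"], "level": s["level"], "text": "\n".join(s["body"]).strip()}
        for s in sections
    ]
```

Each section keeps its tables and code blocks intact, so a chunk can be embedded or summarised without losing the structure the layout step recovered.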
/v1/skillsets/enrich
AI Skillsets (enrichment pipeline)
Run multiple enrichment skills over a document image or text in one call: OCR + entities + language + key phrases + sentiment. Returns a JSON object of skill outputs ready for indexing or RAG.
Engine: Brainiall Skillsets engine
/v1/translate (with glossary={src:tgt})
Custom Translation Glossary
Pin your brand names, product names, and technical jargon to their canonical translations. The glossary is per-call: no training, no setup, just pass a dict.
Engine: Brainiall Custom Glossary
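A per-call glossary is just a mapping from source term to pinned translation, so it is easy to build and to check on the client side. In the sketch below, `glossary` is the parameter named on this page; the `text` and `target_lang` field names are assumptions, and `missing_glossary_terms` is a hypothetical helper that only does a case-insensitive substring check.

```python
def build_translate_payload(text: str, target_lang: str, glossary: dict[str, str]) -> dict:
    """Assemble a request body for /v1/translate (field names partly assumed)."""
    return {
        "text": text,                # assumed field name
        "target_lang": target_lang,  # assumed field name
        "glossary": dict(glossary),  # {source term: pinned translation}
    }


def missing_glossary_terms(source: str, translated: str, glossary: dict[str, str]) -> list[str]:
    """Return source terms whose pinned translation is absent from the output."""
    missing = []
    for src, tgt in glossary.items():
        term_used = src.lower() in source.lower()
        pinned_present = tgt.lower() in translated.lower()
        if term_used and not pinned_present:
            missing.append(src)
    return missing


glossary = {"Acme Cloud": "Acme Cloud", "invoice": "fatura"}
payload = build_translate_payload("Your Acme Cloud invoice is ready.", "pt", glossary)
```

Running `missing_glossary_terms` over the returned translation gives a cheap regression check that brand and product names survived; an empty list means every pinned term was found.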
How we compare
| Provider | Equivalent surface |
|---|---|
| AWS Textract | Has Forms + Tables + Queries + Specialty (invoice/receipt/ID/lending); pay per page per feature ($0.05+/page) |
| Azure AI Document Intelligence | Has ~15 prebuilts + Layout + Custom Models; $10-50/1k pages |
| Brainiall | 11 prebuilt schemas now (6 baseline + 5 added), Layout via Brainiall Doc Layout engine, Skillsets pipeline; $0.01-0.03/page |
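To put the per-page figures in the table on a common footing, the short calculation below converts each listed range to a total for a fixed volume. The prices are the ones quoted in the table; the 10,000-page volume is an arbitrary example, and AWS is listed only as "$0.05+", so its upper bound is left open.

```python
PAGES = 10_000  # example volume, chosen for illustration

# (low, high) price per page in USD, taken from the table.
# Azure's $10-50 per 1k pages is converted to a per-page figure.
PRICES = {
    "AWS Textract": (0.05, None),  # "$0.05+/page": lower bound only
    "Azure AI Document Intelligence": (10 / 1000, 50 / 1000),
    "Brainiall": (0.01, 0.03),
}


def cost_range(pages: int, low: float, high):
    """Return (min_cost, max_cost) in USD; max_cost is None when unbounded."""
    max_cost = None if high is None else round(pages * high, 2)
    return round(pages * low, 2), max_cost


for provider, (low, high) in PRICES.items():
    lo_cost, hi_cost = cost_range(PAGES, low, high)
    upper = "and up" if hi_cost is None else f"to ${hi_cost:,.2f}"
    print(f"{provider}: ${lo_cost:,.2f} {upper}")
```

At 10,000 pages the listed ranges work out to at least $500 for AWS Textract, $100 to $500 for Azure, and $100 to $300 for Brainiall. Actual bills depend on which features and models are used per page.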
Latency profile
Two of the endpoints in this bundle run a CPU-bound step end-to-end, so plan accordingly:
- /v1/document/layout: latency tracks the underlying document reader. Native-text PDFs and clean printed pages return in about 1.3 s/page; scanned or image-only input goes through OCR and lands closer to 20-25 s/page on CPU.
- /v1/document/translate: the translation step dominates, at roughly 12-15 s end-to-end on CPU for a typical page, with ~97% of that spent in the translation model itself.
- Recommended pattern: for long documents or scanned input, treat the call as an async job. Submit, render a progress indicator, fetch the result when ready. This is the same pattern used by dubbing and speech-to-speech.
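The submit, show progress, fetch pattern can be sketched as a small polling loop. This page does not describe a job-status API, so `fetch` here is a hypothetical callable that you would wrap around whatever result lookup your integration uses; it returns `None` while the job is still running. The `sleep` and `clock` parameters exist only to make the loop testable.

```python
import time
from typing import Callable, Optional


def wait_for_result(
    fetch: Callable[[], Optional[dict]],
    on_progress: Callable[[float], None] = lambda elapsed: None,
    interval_s: float = 2.0,
    timeout_s: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> dict:
    """Poll `fetch` until it returns a result or the timeout elapses."""
    start = clock()
    while True:
        result = fetch()
        if result is not None:
            return result
        elapsed = clock() - start
        if elapsed >= timeout_s:
            raise TimeoutError(f"no result after {elapsed:.0f} s")
        on_progress(elapsed)  # e.g. update a spinner or progress bar
        sleep(interval_s)
```

When choosing `timeout_s`, the per-page figures above give a rough budget: a 40-page scanned document at 20-25 s/page would take on the order of 800 to 1000 s on CPU, so the 300 s default in this sketch would be too short for it.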
Pricing
Document AI Expansion endpoints share your existing Brainiall NLP, Document, and Speech AI usage; there is no separate bundle subscription. The Free tier covers ~100-1000 calls/month per endpoint.