PDF-to-Markdown API
Marker engine. $0.001/page. Built for RAG and LLM ingestion.
Drop a PDF → get clean Markdown with preserved structure: headings, tables, code blocks, math, footnotes. $0.001/page — 5× cheaper than Datalab, 50× cheaper than Adobe Extract.

How we compare
Marker is research SOTA on academic and technical PDFs (~95% structure preservation on multi-column papers). The v1.0 figures below are calibrated against competitors' published pricing and metrics; v1.1 will replace them with a 100-document head-to-head benchmark.
| Provider | Quality | Price/page | vs market avg | Position |
|---|---|---|---|---|
| Datalab (Marker on-demand) | 9.4/10 | $0.0050 | 33% | — |
| LlamaIndex Cloud (LlamaParse) | 9.0/10 | $0.0030 | 20% | — |
| Adobe PDF Services Extract | 9.2/10 | $0.050 | 329% | — |
| Microsoft Document Intelligence | 8.8/10 | $0.010 | 66% | — |
| Reducto AI | 9.3/10 | $0.0080 | 53% | — |
| Brainiall FAST | 9.4/10 | $0.0010 (93% cheaper) | 7% | Parity |
Pricing rule: 90% off when inferior · 80% off at parity · 50% off when superior. Position determined by objective benchmark, refreshed quarterly.
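The "vs market avg" column in the table above can be reproduced with a few lines of arithmetic. This is a minimal sketch that assumes the market average is the plain mean of the five competitor prices (the page does not state how it is computed); under that assumption the rounded percentages match the table, including the 7% and "93% cheaper" figures for Brainiall FAST.

```python
# Per-page prices in USD, copied from the comparison table.
competitor_prices = {
    "Datalab (Marker on-demand)": 0.0050,
    "LlamaIndex Cloud (LlamaParse)": 0.0030,
    "Adobe PDF Services Extract": 0.050,
    "Microsoft Document Intelligence": 0.010,
    "Reducto AI": 0.0080,
}
brainiall_price = 0.0010

# Assumption: "market avg" is the unweighted mean of the competitor prices,
# with Brainiall itself excluded.
market_avg = sum(competitor_prices.values()) / len(competitor_prices)


def pct_of_market(price: float) -> int:
    """Return a price as a whole-number percentage of the market average."""
    return round(100 * price / market_avg)


for name, price in competitor_prices.items():
    print(f"{name}: {pct_of_market(price)}% of market avg")

own_pct = pct_of_market(brainiall_price)
print(f"Brainiall FAST: {own_pct}% of market avg ({100 - own_pct}% cheaper)")
```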
Pricing
Discount derived from quality position vs the closest competitor. 90% off when inferior, 80% off at parity, 50% off when superior.
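As a worked example of the rule, the sketch below applies the stated discount to a competitor's per-page price. The function name and the choice of Datalab as the closest competitor are illustrative assumptions; Datalab is used because it has the same 9.4/10 score in the comparison table, and 80% off its $0.0050 gives the listed $0.001/page.

```python
# Discount tiers as stated in the pricing rule.
DISCOUNTS = {"inferior": 0.90, "parity": 0.80, "superior": 0.50}


def price_from_position(competitor_price: float, position: str) -> float:
    """Apply the stated discount to the closest competitor's per-page price."""
    discounted = competitor_price * (1 - DISCOUNTS[position])
    # Round to six decimals to hide binary floating-point noise.
    return round(discounted, 6)


# Assumption: Datalab ($0.0050/page, 9.4/10) is the closest competitor,
# so the position is "parity".
print(price_from_position(0.0050, "parity"))
```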
| Plan | Price | Includes |
|---|---|---|
| Free | $0/mo | 30 pages/month · fast tier · forever free |
| Starter | $19/mo | 1,500 pages/month · markdown + JSON output · all formats |
| Pro | $99/mo | 15,000 pages/month · priority queue · 99.5% SLA |
| Business | $299/mo | 75,000 pages/month · dedicated capacity · email + Slack |
PAYG: $0.001/page (Marker). HD tier (Marker + Surya OCR refinement for scanned/multilingual/handwritten) is on the v1.1 roadmap — not yet available.
One endpoint, structured output
# Convert PDF to clean Markdown
POST https://api.brainiall.com/v1/document/pdf-to-markdown/base64
{"pdf": "<base64 pdf>"}
# With page range (skip cover, ToC, etc.)
POST https://api.brainiall.com/v1/document/pdf-to-markdown/base64
{"pdf": "<base64>", "page_range": "3-50"}
# Markdown-only response (no JSON wrapper)
POST https://api.brainiall.com/v1/document/pdf-to-markdown/base64
{"pdf": "<base64>", "output_format": "markdown"}
# Response includes structure metadata
# { "markdown": "# Title\n\n...", "metadata": {"pages": 48, "char_count": 12048}, "tier": "fast" }

What Marker does well
- Multi-column layouts: academic papers, magazines, technical reports — preserves reading order.
- Tables: extracted as Markdown tables with header/cell preservation, not flattened text.
- Math + code: equations rendered as LaTeX inline; code blocks preserved with monospace fences.
- Headings + structure: H1/H2/H3 hierarchy detected from font sizes + position cues.
- Footnotes + references: kept and linked to the paragraph that cites them, not dropped.
- Multilingual: the HD tier (v1.1 roadmap, not yet available) will add Surya OCR for non-English PDFs and scanned documents.
Built for RAG pipelines
Most PDF parsers output JSON or HTML, so your RAG pipeline has to flatten it back into text-with-structure. Marker outputs Markdown directly, which embedders (OpenAI, Voyage, Cohere) accept as plain text. Cut the conversion step.
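One way this could look in an ingestion script is sketched below: send the PDF to the endpoint shown earlier, then split the returned Markdown into heading-scoped chunks ready for embedding. The `Authorization: Bearer` header is an assumption (this page does not show how the API key is sent), and `build_request`, `split_by_headings` and `pdf_to_chunks` are hypothetical helper names, not part of the API.

```python
import base64
import json
import re
import urllib.request

# Endpoint taken from the request examples on this page.
API_URL = "https://api.brainiall.com/v1/document/pdf-to-markdown/base64"


def build_request(pdf_bytes, api_key, page_range=None):
    """Build the POST request with a base64-encoded PDF in the JSON body."""
    body = {"pdf": base64.b64encode(pdf_bytes).decode("ascii")}
    if page_range:
        body["page_range"] = page_range
    return urllib.request.Request(
        API_URL,
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-token auth; check the API reference.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


def split_by_headings(markdown):
    """Split Markdown into (heading, text) chunks at H1-H3 lines."""
    chunks, heading, lines = [], "", []

    def flush():
        # Skip empty chunks, e.g. before the first heading.
        if heading or any(line.strip() for line in lines):
            chunks.append((heading, "\n".join(lines).strip()))

    for line in markdown.splitlines():
        if re.match(r"#{1,3} ", line):
            flush()
            heading, lines = line.lstrip("#").strip(), []
        else:
            lines.append(line)
    flush()
    return chunks


def pdf_to_chunks(pdf_bytes, api_key):
    """Convert a PDF and chunk the "markdown" field of the JSON response."""
    with urllib.request.urlopen(build_request(pdf_bytes, api_key)) as resp:
        payload = json.load(resp)
    return split_by_headings(payload["markdown"])
```

Each `(heading, text)` pair can then be passed to whichever embedder the pipeline uses. The splitter is deliberately naive: it treats any line starting with one to three `#` characters as a heading, so a `#` comment inside a fenced code block would also start a new chunk.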
Press kit & resources
Everything reviewers, integrators and procurement teams typically ask for.
One-page datasheet
Pricing, KPIs and a copy-pasteable curl snippet on one page. Ideal for buyer review.
Download PDF
Try it with our sample
Sample academic paper PDF — feed it through the API and compare the output against your own input.
Download sample
API reference
OpenAPI spec, request/response shapes, error codes, rate limits and quota model.
Read docs →
More specialty APIs
Same single API key, same usage-based pricing, different problem solved.



