Knowledge API
Ingest your docs. Ask questions. Get a grounded, cited answer — no vector database to run.
Managed retrieval-augmented generation as two REST calls. Ingest a document into a namespace — it's chunked, embedded and stored for you. Query the namespace with a question and get the most relevant passages back, plus (optionally) a concise answer grounded only in those passages, with the passages it cited. No vector database to provision, no embedding model to host, no retrieval code to write. Powered by the Brainiall Knowledge engine.
How we compare
"Managed RAG" usually means either a vector database you still operate, or an enterprise search platform you adopt. Brainiall ships the primitive: ingest and query, two calls, with retrieval and a grounded cited answer bundled into the query call — billed per call, with a self-serve key.
| Provider | Shape | Retrieval + answer? | Pricing model | Setup |
|---|---|---|---|---|
| Brainiall Knowledge API | Two REST calls: ingest, query (namespaced knowledge bases) | Yes — retrieval + optional grounded cited answer in one call | Per call ($0.001/1K chars ingest · $0.005/query) | Instant API key |
| Pinecone | Managed vector database | No — you bring embeddings, write retrieval, and call an LLM yourself | Per read/write unit + storage (serverless) or per pod | Account + index + your own embedding/LLM pipeline |
| Vectara | Managed RAG platform | Yes — retrieval + generation | Per query + stored-text tiers; enterprise plans | Account + corpus setup |
| Glean | Enterprise work-search / RAG over connected apps | Yes — enterprise search + answers | Per-seat annual contract | Sales-led, connector onboarding |
| Azure AI Search | Search service (vector + keyword), pair with Azure OpenAI for RAG | Partial — retrieval; you compose generation separately | Per search unit / hour + your generation calls | Provision a Search service |
Prices are list-price approximations for orientation, not quotes. Always check each vendor's current pricing page.
Pricing
Pay for what you do: a small per-1,000-characters charge to ingest, a flat per-query charge to ask. Storage is included. Free tier is generous enough to wire up an end-to-end RAG flow.
Free
$0/mo
50 ingest + query calls/month · namespaced knowledge bases · forever free
Starter
$19/mo
~5,000 queries/month · reranking · grounded cited answers
Pro
$99/mo
~30,000 queries/month · priority queue · 99.5% SLA
Business
$299/mo
~120,000 queries/month · dedicated capacity · email + Slack
PAYG: $0.001 / 1,000 characters ingested · $0.005 / query (retrieval; +nothing extra for the synthesized answer or reranking). Storage included. No minimums, no contracts.
Two calls: ingest, query
# 1. Ingest a document into a knowledge base (your namespace)
POST https://api.brainiall.com/v1/knowledge/my-kb/ingest
{"text": "<your document text>", "title": "Refund policy"}
-> {"doc_id": "doc_5a3eda5bc7344f90", "n_chunks": 14, "chars_in": 18230, ...}
# 2. Query it — passages + a grounded, cited answer
POST https://api.brainiall.com/v1/knowledge/my-kb/query
{"question": "How long do customers have to request a refund?", "top_k": 6}
-> {"answer": "Customers may request a refund within 30 days of purchase.",
"found": true,
"passages": [{"text": "...within 30 days...", "score": 0.81, "doc_id": "doc_5a3...", "cited": true}, ...],
"synthesized": true}
# Query, retrieval only (no synthesized answer)
POST https://api.brainiall.com/v1/knowledge/my-kb/query
{"question": "...", "synthesize": false}
# Re-order the retrieved passages by relevance before answering
POST https://api.brainiall.com/v1/knowledge/my-kb/query
{"question": "...", "rerank": true}
# What's in this knowledge base?
GET https://api.brainiall.com/v1/knowledge/my-kb/documentsAnswers are grounded: drawn only from the retrieved passages, with the passages they cite flagged (cited: true). If the passages don't contain the answer, the response says so (found: false) rather than guessing. Namespaces isolate knowledge bases — one per tenant, per project, per document set, whatever fits your model.
What it's for
- Chat over your docs: a help center, a product manual, a knowledge base — ingest once, then answer questions about it with citations users can verify.
- Support deflection: surface the exact policy paragraph that answers a ticket, before it reaches a human.
- Agent grounding: give an agent a retrieval tool that returns passages plus a short grounded answer — no hallucinated facts, every claim traceable to a source.
- Internal search with answers: ingest wikis, runbooks, contracts; query in natural language; get the relevant passages back.
- No infrastructure to run: no vector database to operate, no embedding model to host, no chunking or retrieval code to maintain. Two REST calls.
Press kit & resources
What reviewers, integrators and procurement teams typically ask for.
One-page datasheet
Pricing, KPIs and a copy-pasteable curl snippet on one page — built for buyer review.
Download PDFAPI reference
OpenAPI spec, request/response shapes, error codes, rate limits and the quota model.
Read docs →Compare the catalog
How Brainiall's specialty APIs line up against AWS, Azure, GCP and the specialists, use case by use case.
See the comparison →More specialty APIs
Same single API key, same usage-based pricing, different problem solved.


