Vision Labels API

Object detection at 236 ms p50, plus multi-task captioning from the Brainiall Vision Tagger engine. Two capabilities AWS Rekognition does not offer: zero-shot detection by custom text prompt and image captioning, both in one model.

Live demo

Try Florence-2 zero-shot, no signup

This demo calls the Standard tier at /v1/vision/labels/base64 live. It is rate-limited to 5 calls per 5 minutes per IP. For unlimited access, sign up for $10 in free credit.
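If you script against the demo endpoint, a client-side limiter helps you stay under the 5 calls per 5 minutes quota. The sketch below is illustrative only: the class name `SlidingWindowLimiter` is hypothetical, and it assumes a sliding window, since the page does not say how the server counts the 5-minute window.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` within any window of `window_s` seconds.

    ASSUMPTION: the server-side window behaviour is not documented here;
    a sliding window is the conservative choice on the client side.
    """

    def __init__(self, max_calls=5, window_s=300.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_s = window_s
        self.clock = clock
        self.calls = deque()  # timestamps of recent calls, oldest first

    def wait_time(self):
        """Seconds to wait before the next call is allowed (0.0 if allowed now)."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self.calls and now - self.calls[0] >= self.window_s:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            return 0.0
        return self.window_s - (now - self.calls[0])

    def record(self):
        """Register that a call was just made."""
        self.calls.append(self.clock())
```

Before each request, call `time.sleep(limiter.wait_time())`, then send the request and call `limiter.record()`.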

1. Choose a sample image

2. Pick a task

Florence-2 dense caption — describes the entire image in natural language. AWS Rekognition does NOT offer this.

AWS Rekognition DetectLabels would return only generic top-5 labels from its pre-trained catalog for this image. Florence-2 can caption, OCR, segment, and answer free-text queries; see the full capability matrix.

⚡ Performance KPIs (measured)

Metric	Brainiall	AWS Rekognition	Advantage
Fast tier p50 (Brainiall object detection module closed-set)	236 ms [source]	250–500 ms (DetectLabels)	★
Standard tier p50 (Brainiall Vision Tagger engine multi-task)	1624 ms	Not available (separate calls needed)	★
Throughput per CPU core	Fast 4 RPS · Standard 0.6 RPS	Cloud auto-scale	=
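The per-core throughput figures follow roughly from the p50 latencies: one core handling one request at a time serves about 1000 / latency_ms requests per second. The sketch below reproduces that arithmetic and turns it into a rough core count. It assumes strictly serial processing per core and treats p50 as the average latency, so it is a first estimate only. The function names are illustrative.

```python
import math

# p50 latencies in milliseconds, taken from the KPI table above.
P50_MS = {"fast": 236, "standard": 1624}


def rps_per_core(latency_ms):
    """Requests per second for one core handling one request at a time."""
    return 1000.0 / latency_ms


def cores_needed(target_rps, latency_ms):
    """Minimum cores to sustain target_rps (ASSUMPTION: no batching, no queueing)."""
    return math.ceil(target_rps * latency_ms / 1000.0)


if __name__ == "__main__":
    for tier, ms in P50_MS.items():
        print(tier, round(rps_per_core(ms), 1), "RPS per core,",
              cores_needed(20, ms), "cores for 20 RPS")
```

Under these assumptions, sustaining 20 RPS takes about 5 cores on the Fast tier and about 33 on Standard.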

🎯 Capability matrix

Capability	Brainiall	AWS Rekognition	Advantage
Closed-set object detection (~80 COCO)	✅ Brainiall object detection module	✅ DetectLabels (~2500 categories)	R
Zero-shot detection (custom text prompt)	✅ Brainiall Vision Tagger engine grounding + future OWL-v2	❌ Not supported (only pre-trained labels)	★
Caption generation (image → natural language)	✅ Brainiall Vision Tagger engine SOTA captioning	❌ Not supported	★
Multi-task (caption + detect + OCR + segment in 1 model)	✅ Brainiall Vision Tagger engine (single model, MIT)	❌ Requires separate API calls per task	★
OCR within image	✅ Brainiall Vision Tagger engine multi-task includes OCR	🟡 Use DetectText (separate)	★
Open weights you can audit	✅ Brainiall object detection module permissive license + Brainiall Vision Tagger engine MIT	❌ Proprietary closed	★
LGPD / GDPR-by-default	✅ EU/BR datacenter	🟡 us-east default	★

📊 Quality benchmarks

Metric	Brainiall	AWS Rekognition
COCO mAP	Brainiall object detection module-Base ~50 mAP	Not published
Multi-task SOTA	Brainiall Vision Tagger engine SOTA on caption + grounding + OCR + segment	Single-task per API

Pricing

Free

500 imgs/month

Get started.

Fast

$0.0015 / image

Brainiall object detection module closed-set object detection. p50 236 ms.

Standard

$0.008 / image

Brainiall Vision Tagger engine multi-task: caption + detect + OCR + segment in one call. p50 1.6 s.

Quickstart (Python)

Request (Standard tier, multi-task)

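import base64

import httpx

# Read the image and base64-encode it for the JSON body.
with open("photo.jpg", "rb") as f:
    img = base64.b64encode(f.read()).decode()

resp = httpx.post(
    "https://api.brainiall.com/v1/vision/labels/base64",
    headers={"Authorization": "Bearer brnl-..."},  # your API key
    json={
        "image": img,
        "tier": "standard",
        "task": "<CAPTION>",  # or <OCR>, <OD>, <CAPTION_TO_PHRASE_GROUNDING>
    },
    timeout=30.0,  # Standard tier p50 is ~1.6 s; leave headroom for slow calls
)
resp.raise_for_status()  # fail loudly on HTTP errors instead of printing an error body
print(resp.json())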

Example response

{
  "request_id": "req_4d8e2a…",
  "processing_ms": 1624,
  "tier": "standard",
  "task": "<CAPTION>",
  "caption": "Two cats laying on a pink couch with remote controls.",
  "labels": [
    {"label": "cat", "confidence": 0.97, "box": [120, 240, 380, 520]},
    {"label": "couch", "confidence": 0.94, "box": [0, 100, 800, 600]}
  ],
  "output": {
    "<CAPTION>": "Two cats laying on a pink couch…"
  },
  "warnings": []
}
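A short sketch of how a client might consume this response: print the caption and keep only detections at or above a confidence threshold. The field names come from the example response; the box layout `[x_min, y_min, x_max, y_max]` in pixels is an assumption, since the page does not document it, and `confident_labels` is an illustrative helper rather than part of any SDK.

```python
import json

# Trimmed copy of the example response above.
SAMPLE = json.loads("""
{
  "tier": "standard",
  "task": "<CAPTION>",
  "caption": "Two cats laying on a pink couch with remote controls.",
  "labels": [
    {"label": "cat", "confidence": 0.97, "box": [120, 240, 380, 520]},
    {"label": "couch", "confidence": 0.94, "box": [0, 100, 800, 600]}
  ],
  "warnings": []
}
""")


def confident_labels(payload, min_confidence=0.95):
    """Return (label, confidence, width, height) for detections at or above a threshold.

    ASSUMPTION: "box" is [x_min, y_min, x_max, y_max] in pixels.
    """
    results = []
    for item in payload.get("labels", []):
        if item["confidence"] < min_confidence:
            continue
        x_min, y_min, x_max, y_max = item["box"]
        results.append(
            (item["label"], item["confidence"], x_max - x_min, y_max - y_min)
        )
    return results


if __name__ == "__main__":
    print(SAMPLE["caption"])
    for row in confident_labels(SAMPLE):
        print(row)
```

With the default threshold of 0.95, only the "cat" detection is kept; lowering it to 0.9 keeps both.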

💰 Why "per-call parity" is the wrong comparison

For object detection alone, Brainiall Fast ($0.0015 per image) and Google Vision ($0.0015 per image) are at parity. But Google Vision charges per feature. To get caption + OCR + object detection on the same image with Google Vision Premium, you need 3 separate features × $0.0015 = $0.0045. With the Brainiall Vision Tagger engine on the Standard tier ($0.008), all three come in one call.

Use case	Brainiall	Google Vision	AWS Rekognition	Savings
Object detection only (Fast tier)	$0.0015 / img (Brainiall object detection module)	$0.0015 / img	$0.001 / img	parity
Caption + OCR + detection (one image)	$0.008 / img (1 call)	$0.0045 / img (3 features)	$0.0035 + Textract $0.0015 (2 APIs)	—
Same workload for 1M images / month	$8,000 / mo	$4,500 / mo	$5,000 / mo	—
+ Zero-shot (find specific object via text query)	✅ Same call	❌ Not supported	❌ Not supported	capability
+ Caption in 80+ languages (Brainiall Vision Tagger engine multilingual)	✅ Same call	🟡 Translation API extra	❌ English only	capability

Real-world TCO: at 1M images/mo, Google Vision is ~44% cheaper for the multi-feature use case, but you lose zero-shot prompting (you cannot ask "find a Brazilian flag in this image") and pay extra for caption translation. For PT-BR, ZH, AR, and JA catalogs, the Brainiall Vision Tagger engine ships native multilingual captioning at the same $0.008; Google would charge for the Translation API on top and still lack zero-shot.
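The figures in the table and the ~44% claim can be reproduced with a few lines of arithmetic. The sketch below uses only the per-image list prices quoted on this page; the dictionary keys and function names are illustrative.

```python
# Per-image prices for caption + OCR + detection, as quoted in the table above.
PRICE_PER_IMAGE = {
    "Brainiall Standard": 0.008,                      # 1 call
    "Google Vision (3 features)": 3 * 0.0015,         # 3 features
    "AWS Rekognition + Textract": 0.0035 + 0.0015,    # 2 APIs
}


def monthly_cost(price_per_image, images):
    """Total monthly cost in dollars at a flat per-image price."""
    return price_per_image * images


def percent_cheaper(cost, baseline):
    """How much cheaper `cost` is than `baseline`, in percent."""
    return 100.0 * (baseline - cost) / baseline


if __name__ == "__main__":
    images = 1_000_000
    baseline = monthly_cost(PRICE_PER_IMAGE["Brainiall Standard"], images)
    for name, price in PRICE_PER_IMAGE.items():
        cost = monthly_cost(price, images)
        print(f"{name}: ${cost:,.0f}/mo, "
              f"{percent_cheaper(cost, baseline):.2f}% cheaper than Brainiall Standard")
```

At 1M images this gives $8,000, $4,500, and $5,000 per month, so Google Vision comes out 43.75% cheaper and the AWS combination 37.5% cheaper than Brainiall Standard at list price.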

Volume tier: starting at 500K imgs/mo, Standard tier drops to $0.0056 / img (-30%). See /pricing for the full ladder.
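To see what the volume discount does to a monthly bill, here is a minimal sketch. It assumes the discounted rate applies to every image once the 500K threshold is reached; the page does not say whether the discount is applied to all images or only to those above the threshold, and the full ladder on /pricing may have more steps.

```python
STANDARD_PRICE = 0.008       # $ per image, list price
VOLUME_PRICE = 0.0056        # $ per image at 500K+ images/month (-30%)
VOLUME_THRESHOLD = 500_000   # images per month


def standard_monthly_cost(images):
    """Monthly Standard-tier cost in dollars.

    ASSUMPTION: the volume rate applies to all images once the threshold
    is reached, not just to the images above it.
    """
    rate = VOLUME_PRICE if images >= VOLUME_THRESHOLD else STANDARD_PRICE
    return images * rate


if __name__ == "__main__":
    for n in (100_000, 500_000, 1_000_000):
        print(f"{n:,} images: ${standard_monthly_cost(n):,.2f}")
```

Under that assumption, 1M images per month on Standard costs $5,600 rather than $8,000.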

Comparison methodology & disclaimer

Brainiall measurements: our production infrastructure, May 2026. Models: Brainiall object detection module-Base (Roboflow) + Brainiall Vision Tagger engine-base (MIT, Microsoft). Full report: Phase 1.5 Eval Report.

AWS data: Rekognition DetectLabels claims ~2500 categories (broader catalog than Brainiall object detection module's 80 COCO classes — that's why we mark capability ✅ for Rekognition there). However, Rekognition does NOT offer zero-shot prompting (e.g., "find a Brazilian flag" on demand) nor image captioning — both confirmed via AWS documentation as of May 2026.

Notes:

Brainiall S9 v1 covers Brainiall object detection module (closed-set ~80 COCO classes) + Brainiall Vision Tagger engine (multi-task: caption + detect + OCR + segment). For broader closed-set coverage similar to Rekognition's 2500 categories, use Brainiall Vision Tagger engine multi-task or wait for v1.1 (planned: OWL-v2 zero-shot for arbitrary prompts).
Brainiall Vision Tagger engine is permissively licensed and runs entirely on CPU, with no API egress and no cloud dependency.
Brainiall object detection module mAP from Roboflow benchmark; methodologies may differ vs AWS evaluation.
Trademarks: Amazon Web Services and Rekognition are trademarks of Amazon.com, Inc. This page is an informational comparison and is not endorsed by AWS.

Last reviewed: May 2026.