Vision Labels API
Try Florence-2 zero-shot, no signup required
This demo hits the Standard tier at /v1/vision/labels/base64 live. Rate-limited to 5 calls per 5 minutes per IP. For unlimited access, sign up for $10 in free credit.
Florence-2 dense caption: describes the entire image in natural language. AWS Rekognition does NOT offer this.

Click Run to send a request.
AWS Rekognition Labels would return generic top-5 COCO classes for this image. Florence-2 can caption, OCR, segment, and answer free-text queries; see the full capability matrix below.
⚡ Performance KPIs (measured)
| Metric | Brainiall | AWS Rekognition | Winner |
|---|---|---|---|
| Fast tier p50 (Brainiall object detection module, closed-set) | 236 ms [source] | 250–500 ms (DetectLabels) | Brainiall |
| Standard tier p50 (Brainiall Vision Tagger engine, multi-task) | 1624 ms | Not available (separate calls needed) | Brainiall |
| Throughput per CPU core | Fast 4 RPS · Standard 0.6 RPS | Cloud auto-scale | = |
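The per-core throughput figures above translate directly into capacity planning. A minimal sketch, assuming the RPS-per-core numbers from the table; the `cores_needed` helper and the 25% headroom factor are illustrative, not part of any SDK:

```python
import math

# Measured throughput per CPU core (from the KPI table above)
RPS_PER_CORE = {"fast": 4.0, "standard": 0.6}

def cores_needed(target_rps: float, tier: str, headroom: float = 1.25) -> int:
    """Cores required to sustain target_rps with a 25% safety margin (illustrative)."""
    return math.ceil(target_rps * headroom / RPS_PER_CORE[tier])

print(cores_needed(10, "fast"))      # 10 RPS of closed-set detection
print(cores_needed(10, "standard"))  # the same load on the multi-task tier
```

The roughly 7x gap between tiers mirrors the p50 latency gap: the Standard tier trades throughput for multi-task output.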
🎯 Capability matrix
| Capability | Brainiall | AWS Rekognition | Winner |
|---|---|---|---|
| Closed-set object detection (~80 COCO) | ✅ Brainiall object detection module | ✅ DetectLabels (~2500 categories) | AWS |
| Zero-shot detection (custom text prompt) | ✅ Brainiall Vision Tagger engine grounding + future OWL-v2 | ❌ Not supported (only pre-trained labels) | Brainiall |
| Caption generation (image → natural language) | ✅ Brainiall Vision Tagger engine SOTA captioning | ❌ Not supported | Brainiall |
| Multi-task (caption + detect + OCR + segment in 1 model) | ✅ Brainiall Vision Tagger engine (single model, MIT) | ❌ Requires separate API calls per task | Brainiall |
| OCR within image | ✅ Brainiall Vision Tagger engine multi-task includes OCR | 🟡 Use DetectText (separate) | Brainiall |
| Open weights you can audit | ✅ Brainiall object detection module (permissive license) + Brainiall Vision Tagger engine (MIT) | ❌ Proprietary, closed | Brainiall |
| LGPD / GDPR-by-default | ✅ EU/BR datacenter | 🟡 us-east default | Brainiall |
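The multi-task rows above all hit a single endpoint that switches behavior via a task token. A minimal sketch of building one request body per task, using the field names and task tokens from the quickstart below; `build_payload` is an illustrative helper, not part of any SDK:

```python
# Task tokens accepted by the Standard tier (from the quickstart)
TASKS = ["<CAPTION>", "<OCR>", "<OD>", "<CAPTION_TO_PHRASE_GROUNDING>"]

def build_payload(image_b64: str, task: str, tier: str = "standard") -> dict:
    """Illustrative helper: one request body per task, same endpoint."""
    if task not in TASKS:
        raise ValueError(f"unknown task token: {task}")
    return {"image": image_b64, "tier": tier, "task": task}

# One model, four tasks -- versus a separate AWS API per task
payloads = [build_payload("aGVsbG8=", t) for t in TASKS]
```

Each payload goes to the same /v1/vision/labels/base64 endpoint; only the task token changes.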
📊 Quality benchmarks
| Metric | Brainiall | AWS Rekognition |
|---|---|---|
| COCO mAP | Brainiall object detection module-Base ~50 mAP | Not published |
| Multi-task SOTA | Brainiall Vision Tagger engine SOTA on caption + grounding + OCR + segment | Single-task per API |
Pricing
Free
500 imgs/month
Get started.
Fast
$0.0015 / image
Brainiall object detection module closed-set object detection. p50 236ms.
Standard
$0.008 / image
Brainiall Vision Tagger engine multi-task: caption + detect + OCR + segment in one call. p50 1.6s.
Quickstart (Python)
Request (Standard tier, multi-task)
```python
import base64
import httpx

# Read and base64-encode the image
with open("photo.jpg", "rb") as f:
    img = base64.b64encode(f.read()).decode()

resp = httpx.post(
    "https://api.brainiall.com/v1/vision/labels/base64",
    headers={"Authorization": "Bearer brnl-..."},
    json={
        "image": img,
        "tier": "standard",
        "task": "<CAPTION>",  # or <OCR>, <OD>, <CAPTION_TO_PHRASE_GROUNDING>
    },
)
resp.raise_for_status()  # fail fast on 4xx/5xx
print(resp.json())
```

Example response

```json
{
  "request_id": "req_4d8e2a…",
  "processing_ms": 1624,
  "tier": "standard",
  "task": "<CAPTION>",
  "caption": "Two cats lying on a pink couch with remote controls.",
  "labels": [
    {"label": "cat", "confidence": 0.97, "box": [120, 240, 380, 520]},
    {"label": "couch", "confidence": 0.94, "box": [0, 100, 800, 600]}
  ],
  "output": {
    "<CAPTION>": "Two cats lying on a pink couch…"
  },
  "warnings": []
}
```

💰 Why "per-call parity" is the wrong comparison
Per image, Brainiall Standard ($0.008) looks pricier than Google Vision ($0.0015). But Google Vision charges per feature: getting caption + OCR + object detection on the same image with Google Vision Premium takes 3 separate features × $0.0015 = $0.0045. With Brainiall Vision Tagger engine Standard, all three come in one call.
| Use case | Brainiall | Google Vision | AWS Rekognition | Savings |
|---|---|---|---|---|
| Object detection only (Fast tier) | $0.0015 / img (Brainiall object detection module) | $0.0015 / img | $0.001 / img | parity |
| Caption + OCR + detection (one image) | $0.008 / img (1 call) | $0.0045 / img (3 features) | $0.0035 + Textract $0.0015 (2 APIs) | ❌ Google cheaper |
| Same workload for 1M images / month | $8,000 / mo | $4,500 / mo | $5,000 / mo | ❌ Google cheaper |
| + Zero-shot (find specific object via text query) | ✅ Same call | ❌ Not supported | ❌ Not supported | capability |
| + Caption in 80+ languages (Brainiall Vision Tagger engine multilingual) | ✅ Same call | 🟡 Translation API extra | ❌ English only | capability |
Real-world TCO: at 1M images/mo, Google Vision is ~44% cheaper for the multi-feature use case, but you lose zero-shot prompting (you cannot ask "find a Brazilian flag in this image") and pay extra for caption translation. For PT-BR, ZH, AR, and JP catalogs, Brainiall Vision Tagger engine ships native multilingual captioning in the same $0.008; Google would charge an additional Translation API fee and still lack zero-shot.
Volume tier: starting at 500K imgs/mo, Standard tier drops to $0.0056 / img (-30%). See /pricing for the full ladder.
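The list prices and the one published volume break are enough for a rough cost estimate. A sketch under the assumption that the $0.0056 rate applies to the whole month's volume once it crosses 500K images (the full ladder at /pricing may differ; `monthly_cost` is an illustrative name):

```python
def monthly_cost(images: int, tier: str = "standard", volume: bool = True) -> float:
    """Estimated monthly bill in USD, rounded to cents (illustrative sketch)."""
    rates = {"fast": 0.0015, "standard": 0.008}  # list prices per image
    rate = rates[tier]
    # Published volume break: Standard drops to $0.0056/img from 500K imgs/mo
    if volume and tier == "standard" and images >= 500_000:
        rate = 0.0056
    return round(images * rate, 2)

print(monthly_cost(1_000_000, volume=False))  # list price, as in the table above
print(monthly_cost(1_000_000))                # with the -30% volume rate
```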
Comparison methodology & disclaimer
Brainiall measurements: our production infrastructure, May 2026. Models: Brainiall object detection module-Base (Roboflow) + Brainiall Vision Tagger engine-base (MIT, Microsoft). Full report: Phase 1.5 Eval Report.
AWS data: Rekognition DetectLabels claims ~2500 categories, a broader catalog than Brainiall object detection module's 80 COCO classes, which is why that row of the capability matrix goes to Rekognition. However, Rekognition does NOT offer zero-shot prompting (e.g., "find a Brazilian flag" on demand) or image captioning; both confirmed via AWS documentation as of May 2026.
Notes:
- Brainiall S9 v1 covers Brainiall object detection module (closed-set ~80 COCO classes) + Brainiall Vision Tagger engine (multi-task: caption + detect + OCR + segment). For broader closed-set coverage similar to Rekognition's 2500 categories, use Brainiall Vision Tagger engine multi-task or wait for v1.1 (planned: OWL-v2 zero-shot for arbitrary prompts).
- Brainiall Vision Tagger engine is permissively licensed (MIT) and runs entirely on CPU: no API egress, no cloud dependency.
- Brainiall object detection module mAP from Roboflow benchmark; methodologies may differ vs AWS evaluation.
- Trademarks: Amazon Web Services and Rekognition are trademarks of Amazon.com, Inc. This page is an informational comparison and is not endorsed by AWS.
Last reviewed: May 2026.