Apache-2.0 Self-host or cloud

Similarity search
at the speed of thought.

Vector is the open-source vector database built for AI — sub-millisecond queries, billion-scale indexes, a Python SDK that feels native.

Star on GitHub ↗

Adopted in production

Powering retrieval at Series-B AI startups and Fortune-500 search teams.

14.2k★ GitHub 312kweekly pip downloads 280+contributors Apache-2.0license

query.py — vector 0.38ms · p99

from vector import Client

db = Client("vector://localhost:6333")
db.upsert("docs", embeddings)

hits = db.query(
    collection="docs",
    vector=q,
    limit=5,
)

p99 query latency · lower is better benchmark [1] ↗

vector

0.38ms

field avg

3.8ms

single node · 1B vectors · 1024-dim · cold-cache p99

Benchmark · not a claim

Numbers we'd put our name on.

view methodology

10×

faster than Pinecone

ANN benchmark · 1B vectors

0.38ms

p99 query latency

1024-dim embeddings

1B+

vectors per node

HNSW + product quantization

99.99%

recall@10

vs. exact brute-force

[1] Measured on c6i.4xlarge, 1B 1024-dim float32 vectors, HNSW (M=16, ef=128), 99th-percentile cold-cache reads. Full methodology and reproducible scripts in our public benchmark repo ↗.

Developer experience

Under a dozen lines. Zero config.

No config files. No schema ceremony. Connect, upsert, query — that's the whole API surface you need to ship.

from vector import Client

db = Client("vector://localhost:6333")

db.create_collection("docs", dim=1536)
db.upsert("docs", embeddings)

hits = db.query("docs", vector=q, limit=5)

import { Client } from "vector-db";

const db = new Client("vector://localhost:6333");

await db.createCollection("docs", { dim: 1536 });

await db.upsert("docs", embeddings);

const hits = await db.query("docs", { vector: q, limit: 5 });

# create collection
curl -X PUT localhost:6333/collections/docs \
  -H "Content-Type: application/json" \
  -d '{"vectors": {"size": 1536, "distance": "cosine"}}'

# query
curl -X POST localhost:6333/collections/docs/points/query \
  -H "Content-Type: application/json" \
  -d '{"vector": [0.1, ...], "limit": 5}'

How it works

01
connect
One URI, no driver soup. Same client works in-process or across the network.
02
model
Declare a collection with a dimension. Vector infers the rest.
03
load
Batch upsert streams straight into the index, no rebuild step.
04
query
Vector + metadata filter in one call. Recall doesn't flinch.

Read the quickstart

Built for production AI.

The features you'd reach for after the demo — already first-class.

No enterprise upsell required.

Hybrid Search

Dense and sparse vectors fused in one query. Keyword recall meets semantic intent.

sub-50ms @ 10M docs

Filter at Scale

Metadata filtering that respects your index. Predicate pushdown, no recompute.

zero recompute

Drop-in Scalable

Sharding and replication built in. Grow from laptop to cluster without a rewrite.

1B vectors · one command

Pricing

Simple pricing. Open core.

Self-host forever, free. Pay only when you want it managed.

Free

self-host

For builders & solo projects.

$0 /forever

Full feature set, Apache-2.0
Billion-scale indexes
Community Discord support
Run anywhere — your metal, your cloud
No telemetry, no lock-in

Start free

Pro

managed

For teams shipping to production.

$39 /mo · or $0.025/hr

Fully managed clusters
Auto-scaling & replication
99.99% uptime SLA
Priority engineering support
SOC 2 backups & PITR

Start 14-day trial

Enterprise

dedicated

For regulated & hyperscale.

Custom

Dedicated infra / VPC peering
On-prem & air-gapped deploy
HIPAA · FedRAMP · SOC 2 Type II
Named solutions architect
Custom SLAs & 24/7 incident

Talk to us

Self-host forever, free. Apache-2.0. No vendor lock-in. · See the fair comparison vs. managed alternatives ↗

Similarity search at the speed of thought.