Qdrant API on YDB
Qdrant‑compatible REST API service for storing and searching vectors in YDB with exact or approximate KNN search — no separate vector database cluster.
Public demo Qdrant base URL: http://ydb-qdrant.tech:8080
Ideal as a drop‑in Qdrant base URL for IDE agents (Roo Code, Cline) and RAG services on YDB.
Why YDB‑Qdrant
Persistent storage
All data is written to YDB's distributed storage and survives restarts. No in-memory-only mode — your vectors are safe.
Transactional & consistent
Built on YDB's distributed ACID transactions with Serializable isolation, so vectors live next to your source‑of‑truth data.
Single data platform
Store business rows, events, and vector embeddings together — queries use YQL, vectors stored directly in YDB.
Reliability & operations
Reuse YDB's multi‑AZ setups, backup/restore, and disaster recovery instead of operating a separate Qdrant cluster.
Self-healing collections
Collections auto-recreate if dropped. The service tracks last access time per collection for tenant lifecycle management.
Flexible integration
Use either the hosted HTTP endpoint or the ydb-qdrant Node.js package directly inside your backend services.
Comparison: YDB-Qdrant vs. Standalone Qdrant
| Feature | YDB-Qdrant | Standalone Qdrant |
|---|---|---|
| Storage Engine | YDB (Distributed SQL) | RocksDB / In-memory |
| Consistency | Strong (ACID Serializable) | Eventual / Tunable |
| Scalability | Horizontal (YDB native) | Sharding (manual/managed) |
| Query Language | Qdrant REST API (YQL internal) | Qdrant API (gRPC/REST) |
| Operational Complexity | Low (Reuse YDB ops) | Medium (Separate cluster) |
Where it fits best
- Prototyping and experiments with vector search on YDB.
- Datasets roughly up to 10K–50K vectors per collection.
- IDE agents (Roo Code, Cline) expecting a Qdrant API.
- Apps that already use YDB and need a quick vector API (HTTP or in‑process via the ydb-qdrant Node.js package).
Plans
- Stronger per-tenant authentication (IAM/OAuth binding, per‑collection ACLs) beyond the existing X‑Tenant‑Id header and forTenant() API.
- Quotas and rate limiting per tenant (collections, RPS, payload and batch sizes) plus richer audit logging.
- Support for larger collections (>100K vectors) via index tuning and YDB auto-partitioning optimizations.
- Better support for high‑throughput, multi‑million‑vector search with tighter latency through scaling patterns.
- Extending Qdrant API coverage on YDB (filters, facets, recommend/discover, batch search and other advanced modes).
- Hybrid search combining vector similarity with payload filtering for more precise retrieval.
How it works under the hood
Docs
Getting started
Configure in Roo Code/Kilo Code
Hosted demo endpoint for IDEs: http://ydb-qdrant.tech:8080
Paste it into your IDE/agent as the Qdrant base URL.
Configure self‑hosted (Node.js)
- Clone and install: npm install
- Set env: YDB_ENDPOINT, YDB_DATABASE
- Auth via env: YDB_SERVICE_ACCOUNT_KEY_FILE_CREDENTIALS | YDB_METADATA_CREDENTIALS | YDB_ACCESS_TOKEN_CREDENTIALS | YDB_ANONYMOUS_CREDENTIALS
- Run: npm run dev (dev) or npm start (prod)
- Point client to http://localhost:8080 (see the walkthrough below)
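Putting those steps together, a minimal session might look like the sketch below. The repository URL and all endpoint/database values are illustrative placeholders; the environment variable names are the ones listed above.

```bash
# Clone the service and install dependencies (URL inferred from the image name)
git clone https://github.com/astandrik/ydb-qdrant.git
cd ydb-qdrant
npm install

# Point the service at a YDB cluster (placeholder values)
export YDB_ENDPOINT="grpcs://ydb.example.net:2135"
export YDB_DATABASE="/my/database"

# Choose one auth mode, e.g. a service account key file
export YDB_SERVICE_ACCOUNT_KEY_FILE_CREDENTIALS="/path/to/sa-key.json"

# Start in development mode; the API serves http://localhost:8080
npm run dev
```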
Or run as a container (Docker or docker-compose) using the published image ghcr.io/astandrik/ydb-qdrant:latest.
Run via Docker / docker-compose
- Pull image: docker pull ghcr.io/astandrik/ydb-qdrant:latest
- Run with Docker: docker run -d --name ydb-qdrant -p 8080:8080 ghcr.io/astandrik/ydb-qdrant:latest
- Or with docker-compose: docker-compose up -d using the sample config from the README (see the environment sketch below)
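When the container must reach an external YDB cluster, the environment variables from the self-hosted section can be passed through Docker. A sketch, assuming the image reads the same YDB_* variables; the key-file path and mount are illustrative:

```bash
# Run against an external YDB cluster (placeholder endpoint/database)
docker run -d --name ydb-qdrant -p 8080:8080 \
  -e YDB_ENDPOINT="grpcs://ydb.example.net:2135" \
  -e YDB_DATABASE="/my/database" \
  -e YDB_SERVICE_ACCOUNT_KEY_FILE_CREDENTIALS="/secrets/sa-key.json" \
  -v "$PWD/sa-key.json:/secrets/sa-key.json:ro" \
  ghcr.io/astandrik/ydb-qdrant:latest
```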
Details and full examples: GitHub README
Use as Node.js library
- Install: npm install ydb-qdrant
- Usage:

```typescript
import { createYdbQdrantClient } from "ydb-qdrant";

const client = await createYdbQdrantClient({
  endpoint: "grpcs://ydb.serverless.yandexcloud.net:2135",
  database: "/ru-central1/...",
  defaultTenant: "myapp",
  // Auth via YDB_*_CREDENTIALS env vars
});
```
See npm package for more details.
All-in-One local YDB + ydb-qdrant (Docker)
- Run a single container with embedded local YDB:

```bash
docker run -d --name ydb-qdrant-local \
  -p 8080:8080 -p 8765:8765 \
  ghcr.io/astandrik/ydb-qdrant-local:latest
```

- Ports: 8080 — ydb-qdrant API, 8765 — YDB Embedded UI
- Point client to http://localhost:8080
Ideal for local development without an external YDB cluster. See GitHub README for details.
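To confirm a local instance is healthy, hit the documented health endpoint and create a throwaway collection (endpoints match the API examples below; the collection name and size are arbitrary):

```bash
# Health check; expect {"status":"ok"}
curl http://localhost:8080/health

# Create a small test collection (name and size are arbitrary)
curl -X PUT http://localhost:8080/collections/smoke-test \
  -H 'Content-Type: application/json' \
  -d '{"vectors":{"size":4,"distance":"Cosine","data_type":"float"}}'
```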
API at a glance
Purpose
Use as a Qdrant base URL for IDE agents or apps; vectors persist in YDB.
Key features
- Qdrant‑compatible endpoints (collections, points, search)
- Two search modes: exact (default) and approximate (bit‑quantized)
- Batch upserts and batch deletes for bulk operations
- Per‑tenant isolation via X‑Tenant‑Id header
- Collection last access tracking for tenant management
- Self‑host or use public demo endpoint
- Also available as Node.js library (createYdbQdrantClient)
Use as HTTP server
```bash
curl -X PUT http://ydb-qdrant.tech:8080/collections/mycol \
  -H 'Content-Type: application/json' \
  -d '{"vectors":{"size":384,"distance":"Cosine","data_type":"float"}}'

curl -X POST http://ydb-qdrant.tech:8080/collections/mycol/points/upsert \
  -H 'Content-Type: application/json' \
  -d '{"points":[{"id":"1","vector":[0.1,0.2,...384 vals...]}]}'

curl -X POST http://ydb-qdrant.tech:8080/collections/mycol/points/search \
  -H 'Content-Type: application/json' \
  -d '{"vector":[0.1,0.2,...],"limit":5,"with_payload":true}'
```
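Requests can be scoped to a tenant with the X‑Tenant‑Id header listed above. A sketch: the header name comes from this page, while the Qdrant-style params.exact flag for toggling exact vs. approximate search is an assumption about this service.

```bash
# Search within one tenant's data (X-Tenant-Id per this page)
curl -X POST http://ydb-qdrant.tech:8080/collections/mycol/points/search \
  -H 'Content-Type: application/json' \
  -H 'X-Tenant-Id: myapp' \
  -d '{"vector":[0.1,0.2,...],"limit":5,"with_payload":true}'

# Approximate (bit-quantized) search; "params":{"exact":false} follows the
# Qdrant API convention (whether this service honors it is an assumption)
curl -X POST http://ydb-qdrant.tech:8080/collections/mycol/points/search \
  -H 'Content-Type: application/json' \
  -d '{"vector":[0.1,0.2,...],"limit":5,"params":{"exact":false}}'
```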
Use as Node.js library
Health: GET /health → {"status":"ok"}

```typescript
// Install: npm install ydb-qdrant
import { createYdbQdrantClient } from "ydb-qdrant";

async function main() {
  // defaultTenant is optional; defaults to "default"
  const client = await createYdbQdrantClient({
    defaultTenant: "myapp",
    endpoint: "grpcs://lb.etn01g9tcilcon2mrt3h.ydb.mdb.yandexcloud.net:2135",
    database: "/ru-central1/b1ge4v9r1l3h1q4njclp/etn01g9tcilcon2mrt3h",
  });

  // Switch tenant dynamically (returns a new client instance)
  const otherClient = client.forTenant("other-tenant");

  await client.createCollection("documents", {
    vectors: { size: 1536, distance: "Cosine", data_type: "float" },
  });

  await client.upsertPoints("documents", {
    points: [
      { id: "doc-1", vector: [/* embedding */], payload: { title: "Doc 1" } },
    ],
  });

  const result = await client.searchPoints("documents", {
    vector: [/* query embedding */],
    top: 10,
    with_payload: true,
  });
  console.log(result.points);
}

main().catch(console.error);
```
Recommended Vector Dimensions
When creating a collection, you must specify the vector size matching your embedding model. Below are popular models with their dimensions and typical use cases:

Commercial API Models

OpenAI

| Model | Dimensions | Use Cases |
|---|---|---|
| text-embedding-3-small | 1536 (default, can reduce to 256-1536) | RAG, semantic search, general-purpose embeddings |
| text-embedding-3-large | 3072 (default, can reduce to 256, 512, 1024, 1536, 3072) | High-accuracy RAG, multilingual tasks |
| text-embedding-ada-002 | 1536 | Legacy model, widely adopted |

OpenAI (Legacy)

| Model | Dimensions | Use Cases |
|---|---|---|
| text-search-curie-doc-001 | 4096 | Legacy GPT-3 model, deprecated |
| text-search-davinci-doc-001 | 12288 | Legacy GPT-3 model, deprecated |

Cohere

| Model | Dimensions | Use Cases |
|---|---|---|
| embed-v4.0 | 256, 512, 1024, 1536 (default) | Multimodal (text + image), RAG, enterprise search |
| embed-english-v3.0 | 1024 | English text, semantic search, classification |
| embed-multilingual-v3.0 | 1024 | 100+ languages, long-document retrieval, clustering |
Google

| Model | Dimensions | Use Cases |
|---|---|---|
| gemini-embedding-001 | 3072 (configurable) | Multilingual, general-purpose, RAG |
| text-embedding-004 | 768 | General-purpose text embeddings |
| text-embedding-005 | 768 | Improved version of text-embedding-004 |
| text-multilingual-embedding-002 | 768 | Multilingual text embeddings |
Open-Source Models (HuggingFace)

| Model | Dimensions | Use Cases |
|---|---|---|
| sentence-transformers/all-MiniLM-L6-v2 | 384 | Fast semantic search, low-resource environments |
| BAAI/bge-base-en-v1.5 | 768 | RAG, retrieval, English text |
| BAAI/bge-large-en-v1.5 | 1024 | High-accuracy RAG, English text |
| BAAI/bge-m3 | 1024 | Multilingual, dense/sparse/multi-vector |
| intfloat/e5-base-v2 | 768 | General retrieval, English text |
| intfloat/e5-large-v2 | 1024 | High-accuracy retrieval, English text |
| intfloat/e5-mistral-7b-instruct | 4096 | High-dimensional embeddings, advanced RAG |
| nomic-ai/nomic-embed-text-v1 | 768 | General-purpose, open weights |
Choosing Dimensions
- Higher dimensions (1024-4096): Better semantic fidelity, higher storage/compute costs
- Lower dimensions (384-768): Faster queries, lower costs, suitable for many use cases
- Variable dimensions: Some models (OpenAI v3, Cohere v4) allow dimension reduction with minimal accuracy loss
- Legacy models: Older OpenAI GPT-3 models (Curie: 4096, Davinci: 12288) are deprecated but may still be in use
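As a concrete example, the collection size must equal the model's output dimension. A sketch using the create-collection endpoint shown above (collection names are arbitrary):

```bash
# Collection for sentence-transformers/all-MiniLM-L6-v2 (384-dim vectors)
curl -X PUT http://localhost:8080/collections/minilm-docs \
  -H 'Content-Type: application/json' \
  -d '{"vectors":{"size":384,"distance":"Cosine","data_type":"float"}}'

# Collection for OpenAI text-embedding-3-small reduced to 512 dims;
# request the same dimension from the embedding API so sizes match
curl -X PUT http://localhost:8080/collections/openai-docs \
  -H 'Content-Type: application/json' \
  -d '{"vectors":{"size":512,"distance":"Cosine","data_type":"float"}}'
```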