Qdrant API on YDB
Qdrant‑compatible REST API service and Node.js library for storing vectors in YDB with exact top‑k search over a global one-table layout — no separate vector database cluster.
Public demo Qdrant base URL: http://ydb-qdrant.tech:8080
Ideal as a drop‑in Qdrant base URL for IDE agents (Roo Code, Cline), RAG services, and million‑vector YDB‑backed collections.
Why YDB‑Qdrant
- **Persistent storage:** Data is written to YDB's distributed storage. The all-in-one local image can keep embedded YDB data across restarts when configured with a mounted volume.
- **Transactional & consistent:** Built on YDB's distributed ACID transactions with Serializable isolation, so vectors live next to your source‑of‑truth data.
- **Single data platform:** Store business rows, events, and vector embeddings together: queries use YQL, and vectors are stored directly in YDB.
- **Reliability & operations:** Reuse YDB's multi‑AZ setups, backup/restore, and disaster recovery instead of operating a separate Qdrant cluster.
- **One-table storage model:** Collection metadata lives in `qdr__collections`, points live in `qdrant_all_points`, and the service tracks last access time per collection.
- **Flexible integration:** Use either the hosted HTTP endpoint or the `ydb-qdrant` Node.js package directly inside your backend services.

Comparison: YDB-Qdrant vs. Standalone Qdrant
| Feature | YDB-Qdrant | Standalone Qdrant |
|---|---|---|
| Storage Engine | YDB (Distributed SQL) | RocksDB / In-memory |
| Consistency | Strong (ACID Serializable) | Eventual / Tunable |
| Scalability | Horizontal (YDB native) | Sharding (manual/managed) |
| Query Language | Qdrant REST API (YQL internal) | Qdrant API (gRPC/REST) |
| Operational Complexity | Low (Reuse YDB ops) | Medium (Separate cluster) |
Where it fits best
- Prototyping and experiments with vector search on YDB.
- Million‑vector collections backed by YDB, with capacity and latency tuned per deployment.
- IDE agents (Roo Code, Cline) expecting a Qdrant API.
- Apps that already use YDB and need a quick vector API (HTTP or in‑process via the ydb-qdrant Node.js package).
Plans
- Stronger per-tenant authentication (IAM/OAuth binding, per‑collection ACLs) beyond the current api-key/userUid namespace isolation.
- Quotas and rate limiting per tenant (collections, RPS, payload and batch sizes) plus richer audit logging.
- Operational presets for million‑vector deployments: capacity planning, load-test profiles, and SLO dashboards.
- Lower latency for high‑throughput workloads through query tuning, batching, and scaling patterns.
- Extending Qdrant API coverage on YDB (broader filters, facets, recommend/discover, batch search and other advanced modes).
- Hybrid search combining vector similarity with payload filtering for more precise retrieval.
Getting started
Configure in Roo Code/Kilo Code
Hosted demo endpoint for IDEs: http://ydb-qdrant.tech:8080 (paste into your IDE/agent as the Qdrant base URL). Use `api-key` for stable isolation.
Configure self‑hosted (Node.js)
- Clone and install: `npm install`
- Set env: `YDB_QDRANT_ENDPOINT`, `YDB_QDRANT_DATABASE`
- Auth via env: `YDB_SERVICE_ACCOUNT_KEY_FILE_CREDENTIALS` | `YDB_METADATA_CREDENTIALS` | `YDB_ACCESS_TOKEN_CREDENTIALS` | `YDB_ANONYMOUS_CREDENTIALS`
- Run: `npm run dev` (dev) or `npm start` (prod)
- Point client to `http://localhost:8080`
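Before starting the service it can help to fail fast when a required variable from the list above is missing. A minimal sketch, assuming nothing beyond the documented variable names; `requireEnv` is a hypothetical helper, not part of the ydb-qdrant package:

```typescript
// Illustrative fail-fast check for the env vars listed above; call it with
// process.env in a real service. `requireEnv` is a hypothetical helper,
// not part of the ydb-qdrant package.
function requireEnv(
  env: Record<string, string | undefined>,
  name: string,
): string {
  const value = env[name];
  if (!value) {
    throw new Error(`Missing required env var: ${name}`);
  }
  return value;
}

// Example: const endpoint = requireEnv(process.env, "YDB_QDRANT_ENDPOINT");
//          const database = requireEnv(process.env, "YDB_QDRANT_DATABASE");
```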
Or run as a container (Docker or docker-compose) using the published image ghcr.io/astandrik/ydb-qdrant:latest.
Run via Docker / docker-compose
- Pull image: `docker pull ghcr.io/astandrik/ydb-qdrant:latest`
- Run with Docker:

```shell
docker run -d --name ydb-qdrant \
  -p 8080:8080 \
  -e YDB_QDRANT_ENDPOINT=grpcs://ydb.serverless.yandexcloud.net:2135 \
  -e YDB_QDRANT_DATABASE=/ru-central1/<cloud>/<db> \
  -e YDB_SERVICE_ACCOUNT_KEY_FILE_CREDENTIALS=/sa-key.json \
  -v /abs/path/sa-key.json:/sa-key.json:ro \
  ghcr.io/astandrik/ydb-qdrant:latest
```

- Or with docker-compose: `docker-compose up -d`, using the sample config from the README.
Details and full examples: GitHub README
Use as Node.js library
- Install: `npm install ydb-qdrant`
- Usage:

```javascript
import { createYdbQdrantClient } from "ydb-qdrant";

const client = await createYdbQdrantClient({
  apiKey: "my-stable-namespace-key",
  endpoint: "grpcs://ydb.serverless.yandexcloud.net:2135",
  database: "/ru-central1/...",
  // Auth via YDB_*_CREDENTIALS env vars
});
```
See npm package for more details.
All-in-One local YDB + ydb-qdrant (Docker)
- Run a single container with embedded local YDB:

```shell
docker run -d --name ydb-qdrant-local \
  -p 8080:8080 -p 8765:8765 \
  ghcr.io/astandrik/ydb-qdrant-local:latest
```

- Ports: `8080` (ydb-qdrant API), `8765` (YDB Embedded UI)
- Point client to `http://localhost:8080`
Ideal for local development without an external YDB cluster. See GitHub README for details.
API at a glance
Purpose
Use as a Qdrant base URL for IDE agents or apps; vectors persist in YDB.
Key features
- Qdrant‑compatible endpoints (collections, points, search/query aliases)
- Exact top‑k search over serialized vectors in YDB
- Batch upserts and batch deletes for bulk operations
- Million‑vector collections on YDB with deployment‑level capacity tuning
- Namespace isolation via api-key with optional X‑Tenant‑Id suffix
- pathSegments.* filters for search, query, and delete paths
- Collection last access tracking for tenant management
- Self‑host or use public demo endpoint
- Also available as Node.js library (createYdbQdrantClient)
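To make the "exact top‑k" bullet concrete: every stored vector is scored against the query (no approximate index), and the best k results are kept. A minimal TypeScript sketch of that idea; the `Point` and `searchTopK` names are illustrative, not the service's internals:

```typescript
// Illustrative sketch of exact top-k cosine search, the strategy the service
// applies server-side over vectors stored in YDB. Names here are hypothetical.
interface Point {
  id: string;
  vector: number[];
}

function cosine(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Score every point (exact, no index), then keep the best `top` results.
function searchTopK(points: Point[], query: number[], top: number) {
  return points
    .map((p) => ({ id: p.id, score: cosine(p.vector, query) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, top);
}
```

With `score_threshold` (as in the HTTP API below), results under the cutoff would additionally be filtered out before returning.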
Use as HTTP server
```shell
curl -X PUT http://ydb-qdrant.tech:8080/collections/mycol \
  -H 'Content-Type: application/json' \
  -H 'api-key: demo-key' \
  -d '{"vectors":{"size":384,"distance":"Cosine","data_type":"float"}}'

curl -X POST http://ydb-qdrant.tech:8080/collections/mycol/points/upsert \
  -H 'Content-Type: application/json' \
  -H 'api-key: demo-key' \
  -d '{"points":[{"id":"1","vector":[0.1,0.2,...384 vals...]}]}'

curl -X POST http://ydb-qdrant.tech:8080/collections/mycol/points/search \
  -H 'Content-Type: application/json' \
  -H 'api-key: demo-key' \
  -d '{"vector":[0.1,0.2,...],"top":10,"with_payload":true,"score_threshold":0.4}'
```
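The same search call can be issued from TypeScript with `fetch`. This sketch only assembles the request; the URL, headers, and JSON body mirror the curl search example above, and `buildSearchRequest` is an illustrative helper, not part of the service:

```typescript
// Hypothetical helper that assembles the same search request as the curl
// example above. Sending it is a plain fetch(); only assembly is shown here.
function buildSearchRequest(
  baseUrl: string,
  collection: string,
  vector: number[],
  top: number,
  apiKey: string,
) {
  return {
    url: `${baseUrl}/collections/${collection}/points/search`,
    method: "POST" as const,
    headers: {
      "Content-Type": "application/json",
      "api-key": apiKey,
    },
    body: JSON.stringify({ vector, top, with_payload: true }),
  };
}

// const req = buildSearchRequest("http://localhost:8080", "mycol", qvec, 10, "demo-key");
// const res = await fetch(req.url, { method: req.method, headers: req.headers, body: req.body });
```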
Health check: `GET /health` → `{"status":"ok"}`
Use as Node.js library

```javascript
// Install: npm install ydb-qdrant
import { createYdbQdrantClient } from "ydb-qdrant";

async function main() {
  const client = await createYdbQdrantClient({
    apiKey: "my-stable-namespace-key",
    endpoint: "grpcs://lb.etn01g9tcilcon2mrt3h.ydb.mdb.yandexcloud.net:2135",
    database: "/ru-central1/b1ge4v9r1l3h1q4njclp/etn01g9tcilcon2mrt3h",
  });

  await client.createCollection("documents", {
    vectors: { size: 1536, distance: "Cosine", data_type: "float" },
  });

  await client.upsertPoints("documents", {
    points: [
      { id: "doc-1", vector: [/* embedding */], payload: { title: "Doc 1" } },
    ],
  });

  const result = await client.searchPoints("documents", {
    vector: [/* query embedding */],
    top: 10,
    with_payload: true,
  });

  console.log(result.points);
}
```
Recommended Vector Dimensions
When creating a collection, you must specify a vector size matching your embedding model. Below are popular models with their dimensions and typical use cases.
Commercial API Models
OpenAI

| Model | Dimensions | Use Cases |
|---|---|---|
| text-embedding-3-small | 1536 (default; can reduce to 256-1536) | RAG, semantic search, general-purpose embeddings |
| text-embedding-3-large | 3072 (default; can reduce to 256, 512, 1024, 1536, 3072) | High-accuracy RAG, multilingual tasks |
| text-embedding-ada-002 | 1536 | Legacy model, widely adopted |

OpenAI (Legacy)

| Model | Dimensions | Use Cases |
|---|---|---|
| text-search-curie-doc-001 | 4096 | Legacy GPT-3 model, deprecated |
| text-search-davinci-doc-001 | 12288 | Legacy GPT-3 model, deprecated |

Cohere

| Model | Dimensions | Use Cases |
|---|---|---|
| embed-v4.0 | 256, 512, 1024, 1536 (default) | Multimodal (text + image), RAG, enterprise search |
| embed-english-v3.0 | 1024 | English text, semantic search, classification |
| embed-multilingual-v3.0 | 1024 | 100+ languages, long-document retrieval, clustering |

Google

| Model | Dimensions | Use Cases |
|---|---|---|
| gemini-embedding-001 | 3072 (configurable) | Multilingual, general-purpose, RAG |
| text-embedding-004 | 768 | General-purpose text embeddings |
| text-embedding-005 | 768 | Improved version of text-embedding-004 |
| text-multilingual-embedding-002 | 768 | Multilingual text embeddings |
Open-Source Models (HuggingFace)
| Model | Dimensions | Use Cases |
|---|---|---|
| sentence-transformers/all-MiniLM-L6-v2 | 384 | Fast semantic search, low-resource environments |
| BAAI/bge-base-en-v1.5 | 768 | RAG, retrieval, English text |
| BAAI/bge-large-en-v1.5 | 1024 | High-accuracy RAG, English text |
| BAAI/bge-m3 | 1024 | Multilingual, dense/sparse/multi-vector |
| intfloat/e5-base-v2 | 768 | General retrieval, English text |
| intfloat/e5-large-v2 | 1024 | High-accuracy retrieval, English text |
| intfloat/e5-mistral-7b-instruct | 4096 | High-dimensional embeddings, advanced RAG |
| nomic-ai/nomic-embed-text-v1 | 768 | General-purpose, open weights |
Choosing Dimensions
- Higher dimensions (1024-4096): Better semantic fidelity, higher storage/compute costs
- Lower dimensions (384-768): Faster queries, lower costs, suitable for many use cases
- Variable dimensions: Some models (OpenAI v3, Cohere v4) allow dimension reduction with minimal accuracy loss
- Legacy models: Older OpenAI GPT-3 models (Curie: 4096, Davinci: 12288) are deprecated but may still be in use
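For models that support variable dimensions, the usual client-side step is to keep a prefix of the embedding and re-normalize it to unit length so cosine scores stay comparable. A minimal sketch of that step, assuming the model was trained for prefix truncation (as the reducible-dimension models above are); `truncateAndNormalize` is an illustrative helper, not a provider API:

```typescript
// Illustrative client-side step for models with reducible dimensions:
// keep the first `dim` components, then re-normalize to unit length so
// cosine similarity scores remain comparable after truncation.
function truncateAndNormalize(embedding: number[], dim: number): number[] {
  const head = embedding.slice(0, dim);
  const norm = Math.sqrt(head.reduce((sum, x) => sum + x * x, 0));
  return norm === 0 ? head : head.map((x) => x / norm);
}
```

The collection's configured vector size must then match the reduced dimension, not the model's default.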