Qdrant API on YDB
Qdrant‑compatible REST API service for storing and searching vectors in YDB with exact or approximate KNN search — no separate vector database cluster.
Public demo Qdrant base URL: http://ydb-qdrant.tech:8080
Ideal as a drop‑in Qdrant base URL for IDE agents (Roo Code, Cline) and RAG services on YDB.
Why YDB‑Qdrant
Persistent storage
All data is written to YDB's distributed storage and survives restarts. No in-memory-only mode — your vectors are safe.
Transactional & consistent
Built on YDB's distributed ACID transactions with Serializable isolation, so vectors live next to your source‑of‑truth data.
Single data platform
Store business rows, events, and vector embeddings together — queries use YQL, vectors stored directly in YDB.
Reliability & operations
Reuse YDB's multi‑AZ setups, backup/restore, and disaster recovery instead of operating a separate Qdrant cluster.
Self-healing collections
Collections auto-recreate if dropped. The service tracks last access time per collection for tenant lifecycle management.
Flexible integration
Use either the hosted HTTP endpoint or the ydb-qdrant Node.js package directly inside your backend services.
Comparison: YDB-Qdrant vs. Standalone Qdrant
| Feature | YDB-Qdrant | Standalone Qdrant |
|---|---|---|
| Storage Engine | YDB (Distributed SQL) | RocksDB / In-memory |
| Consistency | Strong (ACID Serializable) | Eventual / Tunable |
| Scalability | Horizontal (YDB native) | Sharding (manual/managed) |
| Query Language | Qdrant REST API (YQL internal) | Qdrant API (gRPC/REST) |
| Operational Complexity | Low (Reuse YDB ops) | Medium (Separate cluster) |
Where it fits best
- Prototyping and experiments with vector search on YDB.
- Datasets roughly up to 10K–50K vectors per collection.
- IDE agents (Roo Code, Cline) expecting a Qdrant API.
- Apps that already use YDB and need a quick vector API (HTTP or in‑process via the ydb-qdrant Node.js package).
Plans
- Stronger per-tenant authentication (IAM/OAuth binding, per‑collection ACLs) beyond the existing X‑Tenant‑Id header and forTenant() API.
- Quotas and rate limiting per tenant (collections, RPS, payload and batch sizes) plus richer audit logging.
- Support for larger collections (>100K vectors) via index tuning and YDB auto-partitioning optimizations.
- Better support for high‑throughput, multi‑million‑vector search with tighter latency through scaling patterns.
- Extending Qdrant API coverage on YDB (filters, facets, recommend/discover, batch search and other advanced modes).
- Hybrid search combining vector similarity with payload filtering for more precise retrieval.
How it works under the hood
Docs
Getting started
Configure in Roo Code/Kilo Code
Hosted demo endpoint for IDEs: http://ydb-qdrant.tech:8080
Paste it into your IDE/agent as the Qdrant base URL.
Configure self‑hosted (Node.js)
- Clone and install: npm install
- Set env: YDB_ENDPOINT, YDB_DATABASE
- Auth via env: YDB_SERVICE_ACCOUNT_KEY_FILE_CREDENTIALS | YDB_METADATA_CREDENTIALS | YDB_ACCESS_TOKEN_CREDENTIALS | YDB_ANONYMOUS_CREDENTIALS
- Run: npm run dev (dev) or npm start (prod)
- Point client to http://localhost:8080 (see the walkthrough below)
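Putting those steps together, a minimal session might look like the sketch below. The repository URL and all endpoint/database values are illustrative placeholders; the environment variable names are the ones listed above.

```bash
# Clone the service and install dependencies (URL inferred from the image name)
git clone https://github.com/astandrik/ydb-qdrant.git
cd ydb-qdrant
npm install

# Point the service at a YDB cluster (placeholder values)
export YDB_ENDPOINT="grpcs://ydb.example.net:2135"
export YDB_DATABASE="/my/database"

# Choose one auth mode, e.g. a service account key file
export YDB_SERVICE_ACCOUNT_KEY_FILE_CREDENTIALS="/path/to/sa-key.json"

# Start in development mode; the API serves http://localhost:8080
npm run dev
```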
Or run as a container (Docker or docker-compose) using the published image ghcr.io/astandrik/ydb-qdrant:latest.
Run via Docker / docker-compose
- Pull image: docker pull ghcr.io/astandrik/ydb-qdrant:latest
- Run with Docker: docker run -d --name ydb-qdrant -p 8080:8080 ghcr.io/astandrik/ydb-qdrant:latest
- Or with docker-compose: docker-compose up -d using the sample config from the README (see the environment sketch below)
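When the container must reach an external YDB cluster, the environment variables from the self-hosted section can be passed through Docker. A sketch, assuming the image reads the same YDB_* variables; the key-file path and mount are illustrative:

```bash
# Run against an external YDB cluster (placeholder endpoint/database)
docker run -d --name ydb-qdrant -p 8080:8080 \
  -e YDB_ENDPOINT="grpcs://ydb.example.net:2135" \
  -e YDB_DATABASE="/my/database" \
  -e YDB_SERVICE_ACCOUNT_KEY_FILE_CREDENTIALS="/secrets/sa-key.json" \
  -v "$PWD/sa-key.json:/secrets/sa-key.json:ro" \
  ghcr.io/astandrik/ydb-qdrant:latest
```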
Details and full examples: GitHub README
Use as Node.js library
- Install: npm install ydb-qdrant
- Usage:

```typescript
import { createYdbQdrantClient } from "ydb-qdrant";

const client = await createYdbQdrantClient({
  endpoint: "grpcs://ydb.serverless.yandexcloud.net:2135",
  database: "/ru-central1/...",
  defaultTenant: "myapp",
  // Auth via YDB_*_CREDENTIALS env vars
});
```
See npm package for more details.
All-in-One local YDB + ydb-qdrant (Docker)
- Run a single container with embedded local YDB:

```bash
docker run -d --name ydb-qdrant-local \
  -p 8080:8080 -p 8765:8765 \
  ghcr.io/astandrik/ydb-qdrant-local:latest
```

- Ports: 8080 — ydb-qdrant API, 8765 — YDB Embedded UI
- Point client to http://localhost:8080
Ideal for local development without an external YDB cluster. See GitHub README for details.
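To confirm a local instance is healthy, hit the documented health endpoint and create a throwaway collection (endpoints match the API examples below; the collection name and size are arbitrary):

```bash
# Health check; expect {"status":"ok"}
curl http://localhost:8080/health

# Create a small test collection (name and size are arbitrary)
curl -X PUT http://localhost:8080/collections/smoke-test \
  -H 'Content-Type: application/json' \
  -d '{"vectors":{"size":4,"distance":"Cosine","data_type":"float"}}'
```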
API at a glance
Purpose
Use as a Qdrant base URL for IDE agents or apps; vectors persist in YDB.
Key features
- Qdrant‑compatible endpoints (collections, points, search)
- Two search modes: exact (default) and approximate (bit‑quantized)
- Batch upserts and batch deletes for bulk operations
- Per‑tenant isolation via X‑Tenant‑Id header
- Collection last access tracking for tenant management
- Self‑host or use public demo endpoint
- Also available as Node.js library (createYdbQdrantClient)
Use as HTTP server
```bash
curl -X PUT http://ydb-qdrant.tech:8080/collections/mycol \
  -H 'Content-Type: application/json' \
  -d '{"vectors":{"size":384,"distance":"Cosine","data_type":"float"}}'

curl -X POST http://ydb-qdrant.tech:8080/collections/mycol/points/upsert \
  -H 'Content-Type: application/json' \
  -d '{"points":[{"id":"1","vector":[0.1,0.2,...384 vals...]}]}'

curl -X POST http://ydb-qdrant.tech:8080/collections/mycol/points/search \
  -H 'Content-Type: application/json' \
  -d '{"vector":[0.1,0.2,...],"limit":5,"with_payload":true}'
```
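Requests can be scoped to a tenant with the X‑Tenant‑Id header listed above. A sketch: the header name comes from this page, while the Qdrant-style params.exact flag for toggling exact vs. approximate search is an assumption about this service.

```bash
# Search within one tenant's data (X-Tenant-Id per this page)
curl -X POST http://ydb-qdrant.tech:8080/collections/mycol/points/search \
  -H 'Content-Type: application/json' \
  -H 'X-Tenant-Id: myapp' \
  -d '{"vector":[0.1,0.2,...],"limit":5,"with_payload":true}'

# Approximate (bit-quantized) search; "params":{"exact":false} follows the
# Qdrant API convention (whether this service honors it is an assumption)
curl -X POST http://ydb-qdrant.tech:8080/collections/mycol/points/search \
  -H 'Content-Type: application/json' \
  -d '{"vector":[0.1,0.2,...],"limit":5,"params":{"exact":false}}'
```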
Use as Node.js library
Health: GET /health → {"status":"ok"}

```typescript
// Install: npm install ydb-qdrant
import { createYdbQdrantClient } from "ydb-qdrant";

async function main() {
  // defaultTenant is optional; defaults to "default"
  const client = await createYdbQdrantClient({
    defaultTenant: "myapp",
    endpoint: "grpcs://lb.etn01g9tcilcon2mrt3h.ydb.mdb.yandexcloud.net:2135",
    database: "/ru-central1/b1ge4v9r1l3h1q4njclp/etn01g9tcilcon2mrt3h",
  });

  // Switch tenant dynamically (returns a new client instance)
  const otherClient = client.forTenant("other-tenant");

  await client.createCollection("documents", {
    vectors: { size: 1536, distance: "Cosine", data_type: "float" },
  });

  await client.upsertPoints("documents", {
    points: [
      { id: "doc-1", vector: [/* embedding */], payload: { title: "Doc 1" } },
    ],
  });

  const result = await client.searchPoints("documents", {
    vector: [/* query embedding */],
    top: 10,
    with_payload: true,
  });
  console.log(result.points);
}

main().catch(console.error);
```
Recommended Vector Dimensions
When creating a collection, you must specify the vector size matching your embedding model. Below are popular models with their dimensions and typical use cases:

Commercial API Models

OpenAI

| Model | Dimensions | Use Cases |
|---|---|---|
| text-embedding-3-small | 1536 (default, can reduce to 256-1536) | RAG, semantic search, general-purpose embeddings |
| text-embedding-3-large | 3072 (default, can reduce to 256, 512, 1024, 1536, 3072) | High-accuracy RAG, multilingual tasks |
| text-embedding-ada-002 | 1536 | Legacy model, widely adopted |

OpenAI (Legacy)

| Model | Dimensions | Use Cases |
|---|---|---|
| text-search-curie-doc-001 | 4096 | Legacy GPT-3 model, deprecated |
| text-search-davinci-doc-001 | 12288 | Legacy GPT-3 model, deprecated |

Cohere

| Model | Dimensions | Use Cases |
|---|---|---|
| embed-v4.0 | 256, 512, 1024, 1536 (default) | Multimodal (text + image), RAG, enterprise search |
| embed-english-v3.0 | 1024 | English text, semantic search, classification |
| embed-multilingual-v3.0 | 1024 | 100+ languages, long-document retrieval, clustering |
Google

| Model | Dimensions | Use Cases |
|---|---|---|
| gemini-embedding-001 | 3072 (configurable) | Multilingual, general-purpose, RAG |
| text-embedding-004 | 768 | General-purpose text embeddings |
| text-embedding-005 | 768 | Improved version of text-embedding-004 |
| text-multilingual-embedding-002 | 768 | Multilingual text embeddings |
Open-Source Models (HuggingFace)

| Model | Dimensions | Use Cases |
|---|---|---|
| sentence-transformers/all-MiniLM-L6-v2 | 384 | Fast semantic search, low-resource environments |
| BAAI/bge-base-en-v1.5 | 768 | RAG, retrieval, English text |
| BAAI/bge-large-en-v1.5 | 1024 | High-accuracy RAG, English text |
| BAAI/bge-m3 | 1024 | Multilingual, dense/sparse/multi-vector |
| intfloat/e5-base-v2 | 768 | General retrieval, English text |
| intfloat/e5-large-v2 | 1024 | High-accuracy retrieval, English text |
| intfloat/e5-mistral-7b-instruct | 4096 | High-dimensional embeddings, advanced RAG |
| nomic-ai/nomic-embed-text-v1 | 768 | General-purpose, open weights |
Choosing Dimensions
- Higher dimensions (1024-4096): Better semantic fidelity, higher storage/compute costs
- Lower dimensions (384-768): Faster queries, lower costs, suitable for many use cases
- Variable dimensions: Some models (OpenAI v3, Cohere v4) allow dimension reduction with minimal accuracy loss
- Legacy models: Older OpenAI GPT-3 models (Curie: 4096, Davinci: 12288) are deprecated but may still be in use
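As a concrete example, the collection size must equal the model's output dimension. A sketch using the create-collection endpoint shown above (collection names are arbitrary):

```bash
# Collection for sentence-transformers/all-MiniLM-L6-v2 (384-dim vectors)
curl -X PUT http://localhost:8080/collections/minilm-docs \
  -H 'Content-Type: application/json' \
  -d '{"vectors":{"size":384,"distance":"Cosine","data_type":"float"}}'

# Collection for OpenAI text-embedding-3-small reduced to 512 dims;
# request the same dimension from the embedding API so sizes match
curl -X PUT http://localhost:8080/collections/openai-docs \
  -H 'Content-Type: application/json' \
  -d '{"vectors":{"size":512,"distance":"Cosine","data_type":"float"}}'
```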