
Piotr Dulikowski 3ec4f67407 Merge 'vector_index: Implement rescoring' from Szymon Malewski

This series implements the rescoring algorithm.

The index options that enable this functionality were introduced in an earlier PR: https://github.com/scylladb/scylladb/pull/28165.

When a Vector Index has quantization enabled, Vector Store uses a reduced vector representation to save memory, but this may degrade the accuracy of ANN queries. For a quantized index we can enable the rescoring algorithm, which recalculates the similarity score from the full vector representation stored in Scylla and reorders the returned result set.
It also works with oversampling: we fetch more candidates from Vector Store, rescore them in Scylla and return only the requested number of results.

Example:

Creating a Vector Index with Rescoring

```sql
-- Create a table with a vector column
CREATE TABLE ks.products (
    id int PRIMARY KEY,
    embedding vector<float, 128>
);

-- Create a vector index with rescoring enabled
CREATE INDEX products_embedding_idx ON ks.products (embedding)
    USING 'vector_index'
    WITH OPTIONS = {
        'similarity_function': 'cosine',
        'quantization': 'i8',
        'oversampling': '2.0',
        'rescoring': 'true'
    };
```

1. **Quantization** (`i8`) compresses vectors in the index, reducing memory usage but introducing precision loss in distance calculations
2. **Oversampling** (`2.0`) retrieves 2× more candidates than requested from the vector store (e.g., `LIMIT 10` fetches 20 candidates)
3. **Rescoring** (`true`) recalculates similarity scores using full-precision (`f32`) vectors from the base table and re-ranks results

Query example:

```sql
-- Find 10 most similar products
SELECT id, similarity_cosine(embedding, [0.1, 0.2, ...]) AS score
FROM ks.products
ORDER BY embedding ANN OF [0.1, 0.2, ...]
LIMIT 10;
```

With rescoring enabled, the query:
1. Fetches 20 candidates from the quantized index (due to oversampling=2.0)
2. Reads full-precision embeddings from the base table
3. Recalculates similarity scores with full precision
4. Re-ranks and returns the top 10 results
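
The four steps above can be sketched in a few lines of Python. This is only an illustration of the idea, not the code from this series: the function names (`candidate_count`, `rescore`), the dictionary standing in for the base table, and the rounding of the oversampled limit with `ceil` are all assumptions made for the example.

```python
import math

def cosine_similarity(a, b):
    """Plain full-precision cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def candidate_count(limit, oversampling):
    """Number of candidates to request from the quantized index.

    Rounding up is an assumption of this sketch.
    """
    return math.ceil(limit * oversampling)

def rescore(candidate_ids, full_vectors, query, limit):
    """Re-rank candidates using full-precision vectors and keep the top `limit`.

    candidate_ids: ids returned by the (approximate) index, in any order.
    full_vectors:  hypothetical stand-in for the base table, id -> vector.
    """
    scored = [(cosine_similarity(full_vectors[i], query), i) for i in candidate_ids]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # best score first
    return [i for _, i in scored[:limit]]

# LIMIT 10 with oversampling 2.0 asks the index for 20 candidates.
n = candidate_count(10, 2.0)

# The index returned ids in an imprecise order; rescoring fixes the ranking.
base_table = {1: [1.0, 0.0], 2: [0.0, 1.0], 3: [1.0, 1.0]}
top = rescore([2, 3, 1], base_table, query=[1.0, 0.0], limit=2)
```

Here `top` holds the ids of the two vectors closest to the query by exact cosine similarity, regardless of the order in which the index returned them.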

In this implementation we use the CQL similarity function implementation to calculate the new score values and use them in post-query ordering. We add that column manually to the selection, but it has to be removed from the final response.
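
The hidden-column approach can be pictured with a small Python sketch. It assumes rows are plain dictionaries and uses a made-up column name (`__similarity_score`) and helper name (`select_with_hidden_score`); the real selection and ordering code in `cql3` is structured differently.

```python
HIDDEN_SCORE = "__similarity_score"  # hypothetical name for the extra column

def select_with_hidden_score(rows, requested_columns, score_fn, limit):
    """Order rows by a score column the client never asked for.

    1. Add the score as an extra (hidden) column to every row.
    2. Sort by that column, highest score first, and apply the limit.
    3. Project back to the requested columns so the hidden one is dropped.
    """
    augmented = [{**row, HIDDEN_SCORE: score_fn(row)} for row in rows]
    augmented.sort(key=lambda r: r[HIDDEN_SCORE], reverse=True)
    return [{c: r[c] for c in requested_columns} for r in augmented[:limit]]

rows = [
    {"id": 1, "embedding": [0.0, 1.0]},
    {"id": 2, "embedding": [1.0, 0.0]},
]
# Toy score for the example: the first component of the embedding.
result = select_with_hidden_score(rows, ["id"], lambda r: r["embedding"][0], limit=1)
```

The response contains only the `id` column the client selected; the score is used for ordering and then discarded.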

Follow-up https://github.com/scylladb/scylladb/pull/28165
Fixes https://scylladb.atlassian.net/browse/SCYLLADB-83

New feature - doesn't need backport.

Closes scylladb/scylladb#27769

* github.com:scylladb/scylladb:
  vector_index: rescoring: Fetch oversampled rows
  vector_index: rescoring: Sort by similarity column
  select_statement: Modify `needs_post_query_ordering` condition
  vector_index: rescoring: Add hidden similarity score column
  vector_index: Refactor extracting ANN query information

2026-01-23 15:20:10 +01:00

.github

.github/workflows/docs-validate-metrics.yml: add workflow permissions

2026-01-13 10:16:35 +02:00

abseil @ d7aaad83b4

…

alternator

alternator: don't require rf_rack flag for indexes, validate instead

2026-01-22 16:11:35 +01:00

api

db: snapshot_ctl: move skip_flush to struct snapshot_options

2026-01-22 09:12:56 +02:00

audit

audit: Stop using deprecated seastar UDP sending API

2026-01-20 10:51:23 +02:00

auth

auth: use paged internal queries during migration

2026-01-20 09:32:21 +01:00

bin

…

cdc

treewide: #include Seastar headers with angle brackets

2026-01-13 14:56:15 +02:00

cmake

build: drop -fexperimental-assignment-tracking clang option

2025-12-22 14:33:48 +02:00

compaction

treewide: fix some spelling errors

2025-12-29 13:53:56 +01:00

conf

Reapply "audit: enable some subset of auditing by default"

2025-12-12 09:18:54 +01:00

cql3

Merge 'vector_index: Implement rescoring' from Szymon Malewski

2026-01-23 15:20:10 +01:00

data_dictionary

data_dictionary: table: add get_truncation_time()

2025-12-02 14:21:25 +02:00

Merge 'alternator: don't require rf_rack flag for indexes, validate instead' from Michael Litvak

2026-01-23 11:49:02 +01:00

debug

…

dht

Add precompiled headers to CMakeLists.txt

2025-11-21 12:27:41 +02:00

dist

dist: scylla_coredump_setup: force unmount /var/lib/systemd/coredump before setup

2026-01-22 14:35:26 +02:00

docs

Merge 'strongly consistent tables: basic implementation' from Petr Gusev

2026-01-23 09:52:33 +01:00

ent

Merge 'The system_replicated_keys should be mark as a system keyspace' from Amnon Heiman

2026-01-19 09:37:41 +02:00

exceptions

exceptions.hh: fix message argument passing

2025-08-13 13:39:52 +02:00

gms

Merge 'raft_topology, tablets: Drain tablets in parallel with other topology operations' from Tomasz Grabiec

2026-01-22 13:06:53 +01:00

idl

strong_consistency: add state_machine and raft_command

2026-01-21 14:56:00 +01:00

index

vector_index: rescoring: Add hidden similarity score column

2026-01-22 15:38:40 +01:00

keys

api/storage_service: add GET 'natural_endpoints' v2 to support composite keys with ':'

2025-10-01 15:53:25 +02:00

lang

Add precompiled headers to CMakeLists.txt

2025-11-21 12:27:41 +02:00

licenses

utils: license: import crypt_sha512.c from musl to the project

2025-12-10 15:36:18 +01:00

locator

Merge 'alternator: don't require rf_rack flag for indexes, validate instead' from Michael Litvak

2026-01-23 11:49:02 +01:00

message

messaging: improve the error messages of closed_errors

2025-12-29 18:36:07 +02:00

mutation

streamed_mutation_freezer: use chunked_vector instead of std::deque for clustering rows

2026-01-21 10:13:44 +02:00

mutation_writer

Add precompiled headers to CMakeLists.txt

2025-11-21 12:27:41 +02:00

node_ops

node_ops: task_manager_module: Populate entity field also for active requests

2026-01-18 15:36:06 +01:00

pgo

Update pgo profiles - aarch64

2026-01-15 05:13:03 +02:00

query

code: Replace distributed<> with sharded<>

2025-09-19 12:22:51 +02:00

raft

raft.hh: make server::wait_for_leader() public

2026-01-21 14:56:01 +01:00

readers

reader_permit: remove check_abort()

2026-01-13 10:47:57 +02:00

reloc

…

repair

service: pass topology guard to RBNO

2026-01-20 10:06:34 +01:00

replica

Merge 'alternator: don't require rf_rack flag for indexes, validate instead' from Michael Litvak

2026-01-23 11:49:02 +01:00

rust

build: apply sccache to rust builds too

2025-12-22 15:36:15 +02:00

schema

Merge 'schema: Apply sstable_compression_user_table_options to CQL aux and Alternator tables' from Nikos Dragazis

2026-01-22 06:50:48 +02:00

scripts

scripts/pull_github_pr.sh: Update instructions for creating token

2026-01-09 17:45:00 +02:00

seastar @ f55dc7ebed

Update seastar submodule

2026-01-21 08:44:20 +02:00

service

Merge 'test_lwt_shutdown: fix flakiness by removing storage_proxy::stop injection' from Petr Gusev

2026-01-23 15:18:17 +01:00

sstables

Merge 'Extend snapshot manifest.json with tablet-aware metadata' from Benny Halevy

2026-01-22 15:19:11 +03:00

streaming

repair: Fix sstable_list_to_mark_as_repaired with multishard writer

2026-01-08 21:55:18 +02:00

swagger-ui @ 12f1da1082

…

tasks

tasks, topology: Make pending node operations abortable

2026-01-18 15:36:05 +01:00

test

Merge 'vector_index: Implement rescoring' from Szymon Malewski

2026-01-23 15:20:10 +01:00

tools

Merge 'schema: Apply sstable_compression_user_table_options to CQL aux and Alternator tables' from Nikos Dragazis

2026-01-22 06:50:48 +02:00

tracing

Add precompiled headers to CMakeLists.txt

2025-11-21 12:27:41 +02:00

transport

transport: unify lambda capture lifetime for control connections

2026-01-17 20:36:31 +02:00

types

fix rjson::value to bytes conversion with missing GetStringLength call

2025-12-09 19:27:22 +01:00

unified

…

utils

lsa: Export metrics for reclaim/evict/compact time

2026-01-19 12:08:16 +03:00

vector_search

vector_search: cache restrictions JSON at prepare time

2026-01-20 17:15:52 +01:00

.clang-format

…

.dockerignore

…

.gitattributes

…

.gitignore

.gitignore: add rust target

2025-08-19 13:09:18 +03:00

.gitmodules

…

.gitorderfile

…

.mailmap

…

absl-flat_hash_map.cc

…

absl-flat_hash_map.hh

…

amplify.yml

…

backlog_controller_fwd.hh

db/config: introduce new config parameter compaction_max_shares

2025-11-24 12:52:29 -03:00

backlog_controller.hh

db/config: introduce new config parameter compaction_max_shares

2025-11-24 12:52:29 -03:00

build_mode.hh

…

bytes_fwd.hh

…

bytes_ostream.hh

…

bytes.cc

…

bytes.hh

…

cartesian_product.hh

…

client_data.cc

…

client_data.hh

service/client_state and alternator/server: use cached values for driver_name and driver_version fields

2025-12-20 12:26:22 -05:00

clocks-impl.cc

treewide: Move mutation related files to a mutation directory

2025-09-24 13:23:38 +03:00

clocks-impl.hh

…

CMakeLists.txt

Add precompiled headers to CMakeLists.txt

2025-11-21 12:27:41 +02:00

configure.py

cql: add select_statement.cc

2026-01-21 14:56:01 +01:00

CONTRIBUTING.md

docs: fix typos and spelling errors

2025-09-30 13:16:49 +02:00

coverage_excludes.txt

…

coverage_sources.list

…

db_clock.hh

…

debug.cc

…

debug.hh

…

default.nix

…

Doxyfile

…

encoding_stats.hh

treewide: Move mutation related files to a mutation directory

2025-09-24 13:23:38 +03:00

enum_set.hh

auth: add possibilty to check for any permission in set

2025-10-03 16:55:57 +02:00

exported_templates.cc

Add precompiled headers to CMakeLists.txt

2025-11-21 12:27:41 +02:00

exported_templates.hh

Add precompiled headers to CMakeLists.txt

2025-11-21 12:27:41 +02:00

fix_system_distributed_tables.py

…

flake.lock

…

flake.nix

…

gc_clock.hh

…

gdbinit

…

gen_segmented_compress_params.py

compress: move compress.cc/hh to sstables/compressor

2025-07-31 13:10:41 +03:00

HACKING.md

docs: fix typos and spelling errors

2025-09-30 13:16:49 +02:00

hashing_partition_visitor.hh

…

idl-compiler.py

idl-compiler.py: raise TypeError instead of raw str

2026-01-13 08:33:17 +02:00

inet_address_vectors.hh

storage_proxy: handle node_local_only in mutate

2025-07-24 19:48:08 +02:00

init.cc

db: experimental consistent-tablets option

2025-10-15 11:27:10 +03:00

init.hh

Revert "Merge 'Unify configuration of object storage endpoints' from Pavel Emelyanov"

2026-01-05 08:53:41 +02:00

install-dependencies.sh

test.py: add pexpect to the dependencies

2026-01-14 10:17:37 +02:00

install.sh

scripts: fixes flagged by CodeQL/PyLens

2026-01-09 15:13:12 +02:00

LICENSE-ScyllaDB-Source-Available.md

…

main.cc

Merge 'strongly consistent tables: basic implementation' from Petr Gusev

2026-01-23 09:52:33 +01:00

marshal_exception.hh

…

mutation_query.cc

…

mutation_query.hh

treewide: Move query related files to a new query directory

2025-09-16 23:40:47 +03:00

NOTICE.txt

…

ORIGIN

…

partition_builder.hh

mutation: async_utils: add unfreeze_and_split_gently

2025-09-30 17:15:41 +03:00

partition_range_compat.hh

treewide: Move misc files to utils directory

2025-07-21 11:56:40 +03:00

partition_slice_builder.cc

…

partition_slice_builder.hh

treewide: Move query related files to a new query directory

2025-09-16 23:40:47 +03:00

partition_snapshot_reader.hh

replica: add abort polling to memtable and cache readers

2026-01-16 18:03:04 +01:00

query_ranges_to_vnodes.cc

…

query_ranges_to_vnodes.hh

…

reader_concurrency_semaphore_group.cc

…

reader_concurrency_semaphore_group.hh

…

reader_concurrency_semaphore.cc

reader_concurrency_semaphore: improve handling of base resources

2026-01-19 11:37:51 +03:00

reader_concurrency_semaphore.hh

…

reader_permit.hh

reader_permit: remove check_abort()

2026-01-13 10:47:57 +02:00

README.md

docs: fix typos and spelling errors

2025-09-30 13:16:49 +02:00

real_dirty_memory_accounter.hh

…

release.cc

release: adjust doc_link() for the post source-available world

2025-09-29 17:02:55 +03:00

release.hh

…

reversibly_mergeable.hh

…

schema_upgrader.hh

treewide: Move mutation related files to a mutation directory

2025-09-24 13:23:38 +03:00

scylla_post_install.sh

…

scylla-gdb.py

scylla-gdb.py: scylla small-objects: make freelist traversal more robust

2025-12-25 13:26:09 +03:00

SCYLLA-VERSION-GEN

Update ScyllaDB version to: 2026.1.0-dev

2025-09-30 18:54:09 +03:00

seastarx.hh

…

serialization_visitors.hh

…

serializer_impl.hh

…

serializer.cc

…

serializer.hh

treewide: include boost headers as "system" headers

2025-08-22 17:21:24 +03:00

service_permit.hh

…

shell.nix

…

sstable_dict_autotrainer.cc

storage_service: hold group0 gate in publish_new_sstable_dict

2025-07-28 12:42:37 +02:00

sstable_dict_autotrainer.hh

…

sstables_loader.cc

Merge 'streaming: tablet_sstable_streamer::stream refactoring' from Ernest Zaslavsky

2025-12-09 10:53:57 +03:00

sstables_loader.hh

streaming: refactor get_sstables_for_tablets to make it accessible

2025-12-08 12:30:23 +02:00

stdafx.cc

Add precompiled headers to CMakeLists.txt

2025-11-21 12:27:41 +02:00

stdafx.hh

code: Stop using seastar::compat::source_location

2025-11-27 19:10:11 +02:00

supervisor.hh

…

table_helper.cc

schema: Allow configuring consistency setting for a keyspace

2025-10-16 13:34:49 +03:00

table_helper.hh

…

test.py

test.py: pass correctly extra cmd line arguments

2026-01-20 15:52:40 +01:00

timeout_config.cc

…

timeout_config.hh

…

tombstone_gc_extension.hh

…

tombstone_gc_options.cc

…

tombstone_gc_options.hh

…

tombstone_gc-internals.hh

treewide: Add missing #pragma once

2025-09-01 14:58:21 +03:00

tombstone_gc.cc

tombstone_gc: don't use 'repair' mode for colocated tables

2025-11-25 09:15:46 +01:00

tombstone_gc.hh

tombstone_gc: don't use 'repair' mode for colocated tables

2025-11-25 09:15:46 +01:00

ubsan-suppressions.supp

…

unimplemented.cc

…

unimplemented.hh

…

validation.cc

treewide: Move keys related files to a new keys directory

2025-07-25 10:45:32 +03:00

validation.hh

…

version.hh

…

view_info.hh

treewide: Move query related files to a new query directory

2025-09-16 23:40:47 +03:00

vint-serialization.cc

…

vint-serialization.hh

…

README.md

Scylla

What is Scylla?

Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.

For more information, please see the ScyllaDB web site.

Build Prerequisites

Scylla is fairly fussy about its build environment, requiring a very recent C++23 compiler and recent versions of many libraries to build. The document HACKING.md includes detailed information on building and developing Scylla, but to get Scylla building quickly on (almost) any build machine, Scylla offers a frozen toolchain. This is a pre-configured Docker image which includes recent versions of all the required compilers, libraries and build tools. Using the frozen toolchain allows you to avoid changing anything in your build machine to meet Scylla's requirements: you just need to meet the frozen toolchain's prerequisites (mostly, Docker or Podman being available).

Building Scylla

Building Scylla with the frozen toolchain dbuild is as easy as:

```console
$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla
```

For further information, please see:

Developer documentation for more information on building Scylla.
Build documentation on how to build Scylla binaries, tests, and packages.
Docker image build documentation for information on how to build Docker images.

Running Scylla

To start the Scylla server, run:

```console
$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1
```

This will start a Scylla node with one CPU core allocated to it and data files stored in the tmp directory. The --developer-mode option is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (not relevant on development workstations). Please note that you need to run Scylla with dbuild if you built it with the frozen toolchain.

For more run options, run:

```console
$ ./tools/toolchain/dbuild ./build/release/scylla --help
```

Testing

See test.py manual.

Scylla APIs and compatibility

By default, Scylla is compatible with Apache Cassandra and its API - CQL. There is also support for the API of Amazon DynamoDB™, which needs to be enabled and configured in order to be used. For more information on how to enable the DynamoDB™ API in Scylla, and the current compatibility of this feature as well as Scylla-specific extensions, see Alternator and Getting started with Alternator.

Documentation

Documentation can be found here. Seastar documentation can be found here. User documentation can be found here.

Training

Training material and online courses can be found at Scylla University. The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling, administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions, multi-datacenters and how Scylla integrates with third-party applications.

Contributing to Scylla

If you want to report a bug or submit a pull request or a patch, please read the contribution guidelines.

If you are a developer working on Scylla, please read the developer guidelines.

Contact

The community forum and Slack channel are for users to discuss configuration, management, and operations of ScyllaDB.
The developers mailing list is for developers and people interested in following the development of ScyllaDB to discuss technical topics.

Languages

C++ 72.7%

Python 26.1%

CMake 0.3%

GAP 0.3%

Shell 0.3%