This series introduces workload prioritization: an extension of the service levels feature which allows specifying "shares" per service level. The number of shares determines the priority of the user that has this service level attached (if multiple service levels are attached, the one with the lowest shares wins).

Different service levels will be isolated in the following way:

- Each service level gets its own scheduling group with a number of shares equal to the service level's number of shares, which controls the priority of the CPU and I/O used for user operations running on that service level.
- Each service level gets two reader concurrency semaphores, one for user reads and the other for read-before-write done for view updates.
- Each service level gets its own TCP connections for RPC to prevent priority inversion issues.

Because of the mandatory use of scheduling groups, which are a globally limited resource, the number of service levels is now limited to 7 user-created service levels + 1 created by default that cannot be removed.

This feature was previously available only in ScyllaDB Enterprise but has been made available for the source-available ScyllaDB. The series was created by comparing the master branch with the source-available-workbranch / enterprise branch and taking the workload prioritization related parts from the diff, then molding the resulting diff into a proper series. Some very minor changes were made, such as fixing whitespace, removing unused or unnecessary code, and adding some boilerplate (in api/) which was missing, but otherwise no major changes have been made.

No backport is required.

Closes scylladb/scylladb#22031

* github.com:scylladb/scylladb:
  tracing: record scheduling group in trace event record
  qos: un-shared-from-this standard_service_level_distributed_data_accessor
  alternator: execute under scheduling group for service level
  test.py: support multiple commands in prepare_cql in suite.yml
  docs: add documentation for workload prioritization
  docs/dev: describe workload prioritization features in service_levels
  test/auth_cluster: test workload prioritization in service level tests
  cqlpy/test_service_levels: add workload prioritization tests
  api: introduce service levels specific API
  api/cql_server_test: add information about scheduling group
  db/virtual_tables: add scheduling group column to system.clients
  test/boost: update service_level_controller_test for workload prio
  qos: include number of shares in DESCRIBE
  cql3/statements: update SL statements for workload prioritization
  transport/server: use scheduling group assigned to current user
  messaging_service: use separate set of connections per service levels
  replica/database: add reader concurrency semaphore groups
  qos: manage and assign scheduling groups to service levels
  qos: use the shares field in service level reads/writes
  qos: add shares to service_level_options
  qos: explicitly specify columns when querying service level tables
  db/system_distributed_keyspace: add shares column and upgrade code
  db/system_keyspace: adjust SL schema for workload prioritization
  gms: introduce WORKLOAD_PRIORITIZATION cluster feature
  build: increase the max number of scheduling groups
  qos: return correct error code when SL does not exist
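
The rules stated in the series description (lowest shares wins when several service levels are attached, and at most 7 user-created service levels plus one default that cannot be removed) can be illustrated with a small toy model. This is only a sketch and is not taken from ScyllaDB's qos code: the class and method names, the default level's name and share count, the fallback to the default level when nothing is attached, and the proportional CPU split in `cpu_fraction` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Iterable

# Limit quoted in the series description: 7 user-created service levels
# (plus 1 default level that cannot be removed).
MAX_USER_SERVICE_LEVELS = 7


@dataclass(frozen=True)
class ServiceLevel:
    name: str
    shares: int


class ServiceLevelRegistry:
    """Toy model of service levels with shares (hypothetical, for illustration)."""

    def __init__(self, default_shares: int = 1000) -> None:
        # Assumption: the default level's name and share count are made up here.
        self._default = ServiceLevel("default", default_shares)
        self._user_levels: Dict[str, ServiceLevel] = {}

    def create(self, name: str, shares: int) -> ServiceLevel:
        if shares <= 0:
            raise ValueError("shares must be positive")
        if name == self._default.name or name in self._user_levels:
            raise ValueError(f"service level {name!r} already exists")
        if len(self._user_levels) >= MAX_USER_SERVICE_LEVELS:
            raise RuntimeError("limit of 7 user-created service levels reached")
        level = ServiceLevel(name, shares)
        self._user_levels[name] = level
        return level

    def drop(self, name: str) -> None:
        if name == self._default.name:
            raise ValueError("the default service level cannot be removed")
        del self._user_levels[name]

    def get(self, name: str) -> ServiceLevel:
        if name == self._default.name:
            return self._default
        return self._user_levels[name]

    def effective(self, attached: Iterable[str]) -> ServiceLevel:
        """Pick the level that applies to a user: the lowest shares wins."""
        levels = [self.get(name) for name in attached]
        if not levels:
            # Assumption: users with nothing attached fall back to the default.
            return self._default
        return min(levels, key=lambda level: level.shares)

    def cpu_fraction(self, name: str) -> float:
        """Assumption: under full contention, CPU is split in proportion to shares."""
        total = self._default.shares + sum(
            level.shares for level in self._user_levels.values()
        )
        return self.get(name).shares / total


if __name__ == "__main__":
    registry = ServiceLevelRegistry()
    registry.create("oltp", 800)
    registry.create("analytics", 200)
    print(registry.effective(["oltp", "analytics"]))
    print(registry.cpu_fraction("oltp"))
```

In this model a user with both `oltp` and `analytics` attached is treated as `analytics`, because that level has fewer shares, and creating an eighth user-defined level raises an error.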
Scylla
What is Scylla?
Scylla is the real-time big data database that is API-compatible with Apache Cassandra and Amazon DynamoDB. Scylla embraces a shared-nothing approach that increases throughput and storage capacity to realize order-of-magnitude performance improvements and reduce hardware costs.
For more information, please see the ScyllaDB web site.
Build Prerequisites
Scylla is fairly fussy about its build environment, requiring a very recent C++23 compiler and very recent versions of many libraries to build. The document HACKING.md includes detailed information on building and developing Scylla, but to get Scylla building quickly on (almost) any build machine, Scylla offers a frozen toolchain. This is a pre-configured Docker image that includes recent versions of all the required compilers, libraries and build tools. Using the frozen toolchain lets you avoid changing anything on your build machine to meet Scylla's requirements: you just need to meet the frozen toolchain's prerequisites (mostly, Docker or Podman being available).
Building Scylla
Building Scylla with the frozen toolchain's dbuild script is as easy as:
$ git submodule update --init --force --recursive
$ ./tools/toolchain/dbuild ./configure.py
$ ./tools/toolchain/dbuild ninja build/release/scylla
For further information, please see:
- Developer documentation for more information on building Scylla.
- Build documentation on how to build Scylla binaries, tests, and packages.
- Docker image build documentation for information on how to build Docker images.
Running Scylla
To start the Scylla server, run:
$ ./tools/toolchain/dbuild ./build/release/scylla --workdir tmp --smp 1 --developer-mode 1
This will start a Scylla node with one CPU core allocated to it and data files stored in the tmp directory.
The --developer-mode option is needed to disable the various checks Scylla performs at startup to ensure the machine is configured for maximum performance (these checks are not relevant on development workstations).
Please note that you need to run Scylla with dbuild if you built it with the frozen toolchain.
For more run options, run:
$ ./tools/toolchain/dbuild ./build/release/scylla --help
Testing
See test.py manual.
Scylla APIs and compatibility
By default, Scylla is compatible with Apache Cassandra and its API - CQL. There is also support for the API of Amazon DynamoDB™, which needs to be enabled and configured in order to be used. For more information on how to enable the DynamoDB™ API in Scylla, and the current compatibility of this feature as well as Scylla-specific extensions, see Alternator and Getting started with Alternator.
Documentation
Documentation can be found here. Seastar documentation can be found here. User documentation can be found here.
Training
Training material and online courses can be found at Scylla University. The courses are free, self-paced and include hands-on examples. They cover a variety of topics including Scylla data modeling, administration, architecture, basic NoSQL concepts, using drivers for application development, Scylla setup, failover, compactions, multi-datacenters and how Scylla integrates with third-party applications.
Contributing to Scylla
If you want to report a bug or submit a pull request or a patch, please read the contribution guidelines.
If you are a developer working on Scylla, please read the developer guidelines.
Contact
- The community forum and Slack channel are for users to discuss configuration, management, and operations of open-source ScyllaDB.
- The developers mailing list is for developers and people interested in following the development of ScyllaDB to discuss technical topics.