scylla

Author	SHA1	Message	Date
Benny Halevy	d6071945c8	compaction, table: ignore foreign sstables replay_position The sstables replay_position in stats_metadata is valid only on the originating node and shard. Therefore, validate the originating host and shard before using it in compaction or table truncate. Fixes #10080 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#16550	2024-01-16 18:45:59 +02:00
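The check described in d6071945c8 can be pictured with a minimal sketch. It assumes the sstable records the host and shard that wrote it. `Origin`, `SSTableStats` and `usable_replay_position` are hypothetical names for illustration, not Scylla's actual types.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Origin:
    # Hypothetical identity of the node and shard that wrote the sstable.
    host_id: str
    shard: int


@dataclass(frozen=True)
class SSTableStats:
    # Hypothetical stand-in for the stats metadata; origin may be unknown.
    origin: Optional[Origin]
    replay_position: int


def usable_replay_position(stats: SSTableStats, local: Origin) -> Optional[int]:
    """Return the replay position only if the sstable originated here.

    A foreign (or unknown-origin) sstable yields None, so callers such as
    compaction or truncate never act on a position that refers to another
    node's or shard's commit log.
    """
    if stats.origin is None or stats.origin != local:
        return None
    return stats.replay_position
```

Returning `None` rather than a sentinel position forces every caller to handle the foreign case explicitly.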
Benny Halevy	7a7a1db86b	sstables_loader: load_new_sstables: auto-enable load-and-stream for tablets And call on_internal_error if process_upload_dir is called for tablets-enabled keyspace as it isn't supported at the moment (maybe it could be in the future if we make sure that the sstables are confined to tablets boundaries). Refs #12775 Fixes #16743 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#16788	2024-01-16 18:43:52 +02:00
Kefu Chai	0092700ad1	memtable: add formatter for replica::{memtable,memtable_entry} before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define a formatter for replica::memtable and replica::memtable_entry, and remove their operator<<(). Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16793	2024-01-16 16:46:52 +02:00
Kefu Chai	2dbf044b91	cql3: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16791	2024-01-16 16:43:17 +02:00
Avi Kivity	a9844ed69a	Merge 'view: revert cleanup filter that doesn't work with tablets' from Nadav Har'El The goal of this PR is to fix Scylla so that the dtest test_mvs_populating_from_existing_data, which starts to fail when enabling tablets, will pass. The main fix (the second patch) is reverting code which doesn't work with tablets, and I explain why I think this code was not necessary in the first place. Fixes #16598 Closes scylladb/scylladb#16670 * github.com:scylladb/scylladb: view: revert cleanup filter that doesn't work with tablets mv: sleep a bit before view-update-generator restart	2024-01-16 16:42:20 +02:00
Benny Halevy	e277ec6aef	force_keyspace_cleanup: skip keyspaces that do not require or support cleanup Local keyspaces do not need cleanup, and keyspaces configured with tablets, where their replication strategy is per-table do not support cleanup. In both cases, just skip their cleanup via the api. Fixes #16738 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#16785	2024-01-16 15:01:49 +03:00
Tomasz Grabiec	49026dc319	Merge 'Turn on tablets on keyspace by default when the feature is enabled' from Pavel Emelyanov To enable tablets replication one needs to turn on the (experimental) feature and specify the `initial_tablets: N` option when creating a keyspace. We want tablets to become default in the future and allow users to explicitly opt it out if they want to. This PR solves this by changing the CREATE KEYSPACE syntax wrt tablets options. Now there's a new TABLETS options map and the usage is * `CREATE KEYSPACE ...` will turn tablets on or off based on cluster feature being enabled/disabled * `CREATE KEYSPACE ... WITH TABLETS = { 'enabled': false }` will turn tablets off regardless of what * `CREATE KEYSPACE ... WITH TABLETS = { 'enabled': true }` will try to enable tablets with default configuration * `CREATE KEYSPACE ... WITH TABLETS = { 'initial': <int> }` is now the replacement for `REPLICATION = { ... 'initial_tablets': <int> }` thing fixes: #16319 Closes scylladb/scylladb#16364 * github.com:scylladb/scylladb: code: Enable tablets if cluster feature is enabled test: Turn off tablets feature by default test: Move test_tablet_drain_failure_during_decommission to another suite test/tablets: Enable tables for real on test keyspace test/tablets: Make timestamp local cql3: Add feature service to as_ks_metadata_update() cql3: Add feature service to ks_prop_defs::as_ks_metadata() cql3: Add feature service to get_keyspace_metadata() cql: Add tablets on/off switch to CREATE KEYSPACE cql: Move initial_tablets from REPLICATION to TABLETS in DDL network_topology_strategy: Estimate initial_tablets if 0 is set	2024-01-16 00:15:10 +01:00
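The four CREATE KEYSPACE cases listed in 49026dc319 can be summarized as a small decision function. This is a sketch only: `resolve_tablets` is a hypothetical name, and raising an error when tablets are requested while the cluster feature is off is an assumption (the message only says Scylla "will try to enable" them). Returning 0 for "estimate later" follows the zero-means-auto convention described in 4c4a9679d8.

```python
from typing import Optional


def resolve_tablets(feature_enabled: bool,
                    tablets: Optional[dict] = None) -> Optional[int]:
    """Return None when tablets are off, else the initial tablet count.

    A result of 0 means "enabled, let the server estimate the count later".
    """
    if tablets is None:
        # No TABLETS map: follow the cluster feature.
        return 0 if feature_enabled else None
    if tablets.get("enabled") is False:
        # Explicit opt-out wins regardless of the feature state.
        return None
    if not feature_enabled:
        # ASSUMPTION: the source does not say what happens in this case.
        raise ValueError("tablets requested but the cluster feature is disabled")
    # 'initial' replaces the old REPLICATION 'initial_tablets' option.
    return int(tablets.get("initial", 0))
```

For example, `resolve_tablets(True, {"initial": 8})` yields 8, while `resolve_tablets(True, {"enabled": False})` yields `None`.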
Avi Kivity	5e70dd1dbe	database: don't allow keyspace objects to be copied keyspace objects are heavyweight and copies are immediately out-of-date, so copying them is bad. Fix by deleting the copy constructor and copy assignment operator. One call site is fixed. This call site is safe since it's only used for accessing a few attributes (introduced in `f70c4127c6`). Closes scylladb/scylladb#16782	2024-01-15 21:48:32 +01:00
Botond Dénes	204d3284fa	readers/multishard: evictable_reader::fast_forward_to(): close reader on exception When the reader is currently paused, it is resumed, fast-forwarded, then paused again. The fast forwarding part can throw and this will lead to destroying the reader without it being closed first. Add a try-catch surrounding this part in the code. Also mark `maybe_pause()` and `do_pause()` as noexcept, to make it clear why that part doesn't need to be in the try-catch. Fixes: #16606 Closes scylladb/scylladb#16630	2024-01-15 20:55:55 +01:00
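The shape of the fix in 204d3284fa (close the reader if fast-forwarding throws, keep the non-throwing pause step outside the guarded region) can be sketched generically. The callables `resume` and `pause` and the reader methods used here are hypothetical stand-ins, not the evictable_reader API.

```python
def resume_forward_pause(resume, pause, position_range):
    """Resume a paused reader, fast-forward it, then pause it again.

    `resume` returns a reader object; `pause` takes ownership of it and is
    assumed never to raise, which is why it sits outside the try block.
    """
    reader = resume()
    try:
        reader.fast_forward_to(position_range)
    except Exception:
        # Never drop a reader that was not closed first.
        reader.close()
        raise
    pause(reader)
```

The exception is re-raised after closing, so the caller still sees the original failure.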
Kefu Chai	e5300f3e21	topology_state_machine: add formatter for service::cleanup_status before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define a formatter for service::cleanup_status, and remove its operator<<(). Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16778	2024-01-15 21:31:42 +02:00
Anna Stuchlik	af1405e517	doc: remove support for CentOS 7 This commit removes support for CentOS 7 from the docs. The change applies to version 5.4, so it must be backported to branch-5.4. Refs https://github.com/scylladb/scylla-enterprise/issues/3502 In addition, this commit removes the information about Amazon Linux and Oracle Linux, unnecessarily added without request, and there's no clarity over which versions should be documented. Closes scylladb/scylladb#16279	2024-01-15 15:37:29 +02:00
Anna Stuchlik	bca39b2a93	doc: remove Serverless from the Drivers page This commit removes the information about ScyllaDB Cloud Serverless, which is no longer valid. Closes scylladb/scylladb#16700	2024-01-15 15:36:51 +02:00
Botond Dénes	66bef6e961	cql3: cluster_describe_statement: don't produce range ownership for tablet keyspaces Tablet keyspaces have per-table range ownership, which cannot currently be expressed in a DESC CLUSTER statement, which describes range ownership in the current keyspace (if set). Until we figure out how to represent range ownership (tablets) of all tables of a keyspace, we disable range ownership for tablet keyspaces. Fixes: #16483 Closes scylladb/scylladb#16713	2024-01-15 14:03:54 +01:00
Patryk Wrobel	aec0db1b96	cql_auth_query_test.cc: do not rely on templated operator<< This change is intended to remove the dependency on operator<<(std::ostream&, const std::unordered_set<seastar::sstring>&) from test/boost/cql_auth_query_test.cc. It prepares the test for removal of the templated helpers. Such removal is one of the goals of the referenced issue that is linked below. Refs: #13245 Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com> Closes scylladb/scylladb#16758	2024-01-15 13:30:05 +02:00
Kefu Chai	ece2bd2f6e	service: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16764	2024-01-15 13:29:33 +02:00
Kefu Chai	fc97d91f1a	auth: add fmt::format for auth::resource and friends before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we * define a formatter for `auth::resource` and friends, * update their callers of `operator<<` to use `fmt::print()`. * drop `operator<<`, as they are not used anymore. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16765	2024-01-15 13:26:39 +02:00
Kefu Chai	f344e13066	types: add formatter for data_value before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define a formatter for data_value, but its operator<<() is preserved as we are still using the generic homebrew formatter for formatting std::vector, which in turn uses operator<< of the element type. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16767	2024-01-15 13:18:23 +02:00
Kefu Chai	218334eaf5	test/nodetool: use build/$CMAKE_BUILD_TYPE when appropriate because the CMake-generated build.ninja is located under build/, and it puts the `scylla` executable at build/$CMAKE_BUILD_TYPE/scylla, instead of at build/$scylla_build_mode/scylla, so let's adapt to this change accordingly. we will promote this change to a shared place if we have similar needs in other tests as well. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16775	2024-01-15 12:52:35 +02:00
Pavel Emelyanov	dd892b0d8a	code: Enable tablets if cluster feature is enabled If the TABLETS map is missing in the CREATE KEYSPACE statement the tablets are anyway enabled if the respective cluster feature is enabled. To opt-out keyspaces one may use TABLETS = { 'enabled': false } syntax. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-01-15 13:12:12 +03:00
Pavel Emelyanov	4838eeb201	test: Turn off tablets feature by default Next patches will make per-keyspace initial_tablets option really optional and turn tablets ON when the feature is ON. This will break all other tests' assumptions, that they are testing vnodes replication. So turn the feature off by default, tests that do need tablets will need to explicitly enable this feature on their own. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-01-15 13:12:12 +03:00
Pavel Emelyanov	ae7da54f88	test: Move test_tablet_drain_failure_during_decommission to another suite In its current location it will be started with 3 pre-created scylla nodes with default features ON. Next patch will exclude `tablets` from the default list, so the test needs to create servers on its own Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-01-15 13:12:12 +03:00
Pavel Emelyanov	46b36d8c07	test/tablets: Enable tables for real on test keyspace When started cql_test_env creates a test keyspace. Some tablets test cases create a table in this keyspace, but misuse the whole feature. The thing is that while tablets feature is ON in those test cases, the keyspace itself does _not_ have the initial_tablets option and thus tablets are not enabled for the ks' table for real. Currently test cases work just because this table is only used as a transparent table ID placeholder. If turning on tablets for the keyspace, several test cases would get broken for two reasons. First, the tablets map will no longer be empty on test start. Second, applying changes to tablet metadata may not be visible, because test case uses "random" timestamp, that can be less than the initial metadata mutations' timestamp. This patch fixes all three places: 1. enables tablets for the test keyspace 2. removes assumption that the initial metadata is empty 3. uses large enough timestamp for subsequent mutations Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-01-15 13:12:12 +03:00
Pavel Emelyanov	2376b699e0	test/tablets: Make timestamp local Just to make next patching simpler Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-01-15 13:12:12 +03:00
Pavel Emelyanov	f3a69bfaca	cql3: Add feature service to as_ks_metadata_update() To call prepare_options() with tablets feature state later Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-01-15 13:12:12 +03:00
Pavel Emelyanov	4dede19e4f	cql3: Add feature service to ks_prop_defs::as_ks_metadata() To call prepare_options() with tablets feature state later Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-01-15 13:12:12 +03:00
Pavel Emelyanov	267770bf0f	cql3: Add feature service to get_keyspace_metadata() To be passed down to ks_prop_defs::as_ks_metadata() Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-01-15 13:12:12 +03:00
Pavel Emelyanov	6cb3055059	cql: Add tablets on/off switch to CREATE KEYSPACE Now the user can do CREATE KEYSPACE ... WITH TABLETS = { 'enabled': false } to turn tablets off. It will be useful in the future to opt-out keyspace from tablets when they will be turned on by default based on cluster features only. Also one can do just CREATE KEYSPACE ... WITH TABLETS = { 'enabled': true } and let Scylla select the initial tablets value on its own Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-01-15 13:12:11 +03:00
Pavel Emelyanov	941f6d8fca	cql: Move initial_tablets from REPLICATION to TABLETS in DDL This patch changes the syntax of enabling tablets from CREATE KEYSPACE ... WITH REPLICATION = { ..., 'initial_tablets': <int> } to be CREATE KEYSPACE ... WITH TABLETS = { 'initial': <int> } and updates all tests accordingly. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-01-15 13:04:48 +03:00
Pavel Emelyanov	4c4a9679d8	network_topology_strategy: Estimate initial_tablets if 0 is set If user configured zero initial tablets (spoiler: or this value was set automagically when enabling tablets behind the scenes) we still need some value to start with and this patch calculates one. The math is based on topology and RF so that all shards are covered: initial_tablets = max(nr_shards_in(dc) / RF_in(dc) for dc in datacenters) The estimation is done when a table is created, not when the keyspace is created. For that, the keyspace is configured with zero initial tablets, and at table-creation time zero is converted into auto-estimated value. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-01-15 13:04:48 +03:00
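The estimate in 4c4a9679d8 is easy to work through numerically. The sketch below takes the formula from the message. The function name is hypothetical, and both rounding the division up and skipping datacenters with RF 0 are assumptions, since the message does not state how those cases are handled.

```python
def estimate_initial_tablets(shards_in_dc: dict, rf_in_dc: dict) -> int:
    """initial_tablets = max(nr_shards_in(dc) / RF_in(dc) for dc in datacenters).

    ASSUMPTIONS: the division is rounded up, and datacenters with RF 0 are
    ignored. At least one datacenter must have a positive RF.
    """
    estimates = [
        -(-shards_in_dc[dc] // rf)  # ceiling division on integers
        for dc, rf in rf_in_dc.items()
        if rf > 0
    ]
    return max(estimates)
```

With 16 shards and RF 3 in one datacenter and 8 shards and RF 2 in another, the per-datacenter values are 6 and 4, so the estimate is 6.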
Kamil Braun	423234841e	Merge 'add automatic sstable cleanup to the topology coordinator' from Gleb For correctness sstable cleanup has to run between (some) topology changes. Sometimes even a failed topology change may require running the cleanup. The series introduces automatic sstable cleanup step to the topology change coordinator. Unlike other operations it is not represented as a global transition state, but done by each node independently which allows cleanup to run without locking the topology state machine so tablet code can run in parallel with the cleanup. It is done by having a cleanup state flag for each node in the topology. The flag is a tri state: "clean" - the node is clean, "needed" - cleanup is needed (but not running), "running" - cleanup is running. No topology operation can proceed if there is a node in "running" state, but some operation can proceed even if there are nodes in "needed" state. If the coordinator needs to perform a topology operation that cannot run while there are nodes that need cleanup the coordinator will start one automatically and continue only after cleanup completes. There is also a possibility to kick cleanup manually through the new RAFT API call. 
* 'cleanup-needed-v8' of https://github.com/gleb-cloudius/scylla: test: add test for automatic cleanup procedure test: add test for topology requests queue management storage_service: topology coordinator: add error injection point to be able to pause the topology coordinator storage_service: topology coordinator: add logging to removenode and decommission storage_service: topology_coordinator: introduce cleanup REST API integrated with the topology coordinator storage_service: topology coordinator: manage cluster cleanup as part of the topology management storage_service: topology coordinator: provide a version of get_excluded_nodes that does not need node_to_work_on as a parameter test: use servers_see_each_other when needed test: add servers_see_each_other helper storage_service: topology coordinator: make topology coordinator lifecycle subscriber system_keyspace: raft topology: load ignore nodes parameter together with removenode topology request storage_service: topology coordinator: introduce sstable cleanup fiber storage_proxy: allow to wait for all ongoing writes storage_service: topology coordinator: mark nodes as needing cleanup when required storage_service: add mark_nodes_as_cleanup_needed function vnode_effective_replication_map: add get_all_pending_nodes() function vnode_effective_replication_map: pre calculate dirty endpoints during topology change raft topology: add cleanup state to the topology state machine	2024-01-14 18:54:02 +01:00
Gleb Natapov	f8b90aeb14	test: add test for automatic cleanup procedure The test runs two bootstraps and checks that there is no cleanup in between. Then it runs a decommission and checks that cleanup runs automatically and then it runs one more decommission and checks that no cleanup runs again. Second part checks manual cleanup triggering. It adds a node, triggers cleanup through the REST API, checks that it runs, decommissions a node and checks that the cleanup did not run again.	2024-01-14 15:45:53 +02:00
Gleb Natapov	5882855669	test: add test for topology requests queue management This test creates a 5 node cluster with 2 down nodes (A and B). After that it creates a queue of 3 topology operations: bootstrap, removenode A and removenode B with ignore_nodes=A. Check that all operations manage to complete. Then it downs one node and creates a queue with two requests: bootstrap and decommission. Since none can proceed both should be canceled.	2024-01-14 15:45:53 +02:00
Gleb Natapov	ba7aa0d582	storage_service: topology coordinator: add error injection point to be able to pause the topology coordinator	2024-01-14 15:45:53 +02:00
Gleb Natapov	1afc891bd5	storage_service: topology coordinator: add logging to removenode and decommission Add some useful logging to removenode and decommission to be used by tests later.	2024-01-14 15:45:53 +02:00
Gleb Natapov	97ab3f6622	storage_service: topology_coordinator: introduce cleanup REST API integrated with the topology coordinator Introduce new REST API "/storage_service/cleanup_all" that, when triggered, instructs the topology coordinator to initiate cluster wide cleanup on all dirty nodes. It is done by introducing new global command "global_topology_request::cleanup".	2024-01-14 15:45:53 +02:00
Gleb Natapov	0adb3904d8	storage_service: topology coordinator: manage cluster cleanup as part of the topology management Sometimes it is unsafe to start a new topology operation before cleanup runs on dirty nodes. This patch detects the situation when the topology operation to be executed cannot be run safely until all dirty nodes do cleanup and initiates the cleanup automatically. It also waits for cleanup to complete before proceeding with the topology operation. There can be a situation where a node that needs cleanup dies and will never clear the flag. In this case, if a topology operation that wants to run next does not have this node in its ignore node list, it may get stuck forever. To fix this the patch also introduces the "liveness aware" request queue management: we do not simply choose _a_ request to run next, but go over the queue and find requests that can proceed considering the nodes liveness situation. If there are multiple requests eligible to run the patch introduces the order based on the operation type: replace, join, remove, leave, rebuild. The order is chosen so as not to trigger cleanup needlessly.	2024-01-14 15:45:50 +02:00
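The "liveness aware" selection in 0adb3904d8 can be sketched as a filter followed by a priority pick. Only the operation order (replace, join, remove, leave, rebuild) comes from the message. The eligibility rule used here, that a request may proceed when every down node is one it tolerates, is a simplifying assumption, and `Request`, `tolerates_down` and `pick_next` are hypothetical names.

```python
from dataclasses import dataclass

# Order taken from the commit message; lower index runs first.
PRIORITY = ["replace", "join", "remove", "leave", "rebuild"]


@dataclass(frozen=True)
class Request:
    kind: str
    # Hypothetical: nodes this request can run without (for example the node
    # being removed plus its ignore_nodes list).
    tolerates_down: frozenset = frozenset()


def can_proceed(req: Request, down: set) -> bool:
    # ASSUMPTION: eligible only if every down node is tolerated by the request.
    return down <= req.tolerates_down


def pick_next(queue: list, down: set):
    """Scan the whole queue and return the best eligible request, or None."""
    eligible = [r for r in queue if can_proceed(r, down)]
    if not eligible:
        return None
    # min() keeps queue order among requests of the same kind.
    return min(eligible, key=lambda r: PRIORITY.index(r.kind))
```

Under this model, the scenario from the queue-management test (A and B down; bootstrap, removenode A, removenode B with ignore_nodes=A) first selects removenode B, after which removenode A and then the bootstrap become eligible.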
Nadav Har'El	2d04070120	Update seastar submodule * seastar 0ffed835...8b9ae36b (4): > net/posix: Track ap-server ports conflict Fixes #16720 > include/seastar/core: do not include unused header > build: expose flag like -std=c++20 via seastar.pc > src: include used headers for C++ modules build Closes scylladb/scylladb#16769	2024-01-14 14:51:11 +02:00
Gleb Natapov	c9b7bd5a33	storage_service: topology coordinator: provide a version of get_excluded_nodes that does not need node_to_work_on as a parameter Needed by the next patch.	2024-01-14 14:44:07 +02:00
Gleb Natapov	0e68073b22	test: use servers_see_each_other when needed In the next patch we want to abort topology operations if there are not enough live nodes to perform them. This will break tests that do a topology operation right after restarting a node since a topology coordinator may still not see the restarted node as alive. Fix all those tests to wait between restart and a topology operation until UP state propagates.	2024-01-14 14:44:07 +02:00
Gleb Natapov	455ffaf5d8	test: add servers_see_each_other helper The helper makes sure that all nodes in the cluster see each other as alive.	2024-01-14 14:44:07 +02:00
Gleb Natapov	067267ff76	storage_service: topology coordinator: make topology coordinator lifecycle subscriber We want to change the coordinator to consider nodes liveness when processing the topology operation queue. If there are not enough live nodes to process any of the ops we want to cancel them. For that to work we need to be able to kick the coordinator if liveness situation changes.	2024-01-14 14:44:07 +02:00
Gleb Natapov	a4ac64a652	system_keyspace: raft topology: load ignore nodes parameter together with removenode topology request Next patch will need ignore nodes list while processing removenode request. Load it.	2024-01-14 14:44:07 +02:00
Gleb Natapov	f70c4127c6	storage_service: topology coordinator: introduce sstable cleanup fiber Introduce a fiber that waits on a topology event and when it sees that the node it runs on needs to perform sstable cleanup it initiates one for each non tablet, non local table and resets "cleanup" flag back to "clean" in the topology.	2024-01-14 14:44:07 +02:00
Gleb Natapov	5b246920ae	storage_proxy: allow to wait for all ongoing writes We want to be able to wait for all writes started through the storage proxy before a fence is advanced. Add phased_barrier that is entered on each local write operation before checking the fence to do so. A write will be either tracked by the phased_barrier or fenced. This will be needed to wait for all non fenced local writes to complete before starting a cleanup.	2024-01-14 14:44:07 +02:00
Gleb Natapov	b2ba77978c	storage_service: topology coordinator: mark nodes as needing cleanup when required A cleanup needs to run when a node loses ownership of a range (during bootstrap) or if a range movement to a normal node failed (removenode, decommission failure). Mark all dirty nodes as "cleanup needed" in those cases.	2024-01-14 14:43:59 +02:00
Gleb Natapov	dbededb1a6	storage_service: add mark_nodes_as_cleanup_needed function The function creates a mutation that sets cleanup to "needed" for each normal node that, according to the erm, has data it does not own after successful or unsuccessful topology operation.	2024-01-14 14:43:33 +02:00
Gleb Natapov	23a27ccc24	vnode_effective_replication_map: add get_all_pending_nodes() function Add a function that returns all nodes that have had vnodes moved to them during a topology change operation. Needed to know which nodes need to do cleanup in case of failed topology change operation.	2024-01-14 14:37:16 +02:00
Gleb Natapov	a8f11852da	vnode_effective_replication_map: pre calculate dirty endpoints during topology change Some topology change operations cause some nodes to lose ranges. This information is needed to know which nodes need to do cleanup after topology operation completes. Pre calculate it during erm creation.	2024-01-14 14:11:19 +02:00
Gleb Natapov	cc54796e23	raft topology: add cleanup state to the topology state machine The patch adds cleanup state to the persistent and in memory state and handles the loading. The state can be "clean" which means no cleanup needed, "needed" which means the node is dirty and needs to run cleanup at some point, "running" which means that the node is running cleanup right now and, once it completes, the state will be reset to "clean".	2024-01-14 13:30:54 +02:00
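The tri-state flag from cc54796e23, together with the gating rule stated in the merge description of 423234841e (nothing proceeds while a node is "running"; only some operations may proceed while nodes are "needed"), can be sketched as follows. The enum and function names are hypothetical, and which operations require clean nodes is left to the caller because the log does not enumerate them.

```python
from enum import Enum


class CleanupStatus(Enum):
    CLEAN = "clean"      # no cleanup needed
    NEEDED = "needed"    # node is dirty, cleanup not started yet
    RUNNING = "running"  # cleanup in progress, resets to CLEAN when done


def may_start_operation(statuses, requires_clean_nodes: bool) -> bool:
    """Decide whether a topology operation may start right now.

    `statuses` holds the per-node cleanup flags. `requires_clean_nodes` is
    supplied by the caller for operations that are unsafe on dirty nodes.
    """
    statuses = list(statuses)
    if any(s is CleanupStatus.RUNNING for s in statuses):
        return False
    if requires_clean_nodes and any(s is CleanupStatus.NEEDED for s in statuses):
        return False
    return True
```

When this returns False because of "needed" nodes, the coordinator described in the series would start cleanup and retry after it completes.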
Nadav Har'El	1bcaeb89c7	view: revert cleanup filter that doesn't work with tablets This patch reverts commit `10f8f13b90` from November 2022. That commit added to the "view update generator", the code which builds view updates for staging sstables, a filter that ignores ranges that do not belong to this node. However, 1. I believe this filter was never necessary, because the view update code already silently ignores base updates which do not belong to this replica (see get_view_natural_endpoint()). After all, the view update needs to know that this replica is the Nth owner of the base update to send its update to the Nth view replica, but if no such N exists, no view update is sent. 2. The code introduced for that filter used a per-keyspace replication map, which was ok for vnodes but no longer works for tablets, and causes the operation using it to fail. 3. The filter was used every time the "view update generator" was used, regardless of whether any cleanup is necessary or not, so every such operation would fail with tablets. So for example the dtest test_mvs_populating_from_existing_data fails with tablets: * This test has view building in parallel with automatic tablet movement. * Tablet movement is streaming. * When streaming happens before view building has finished, the streamed sstables get "view update generator" run on them. This causes the problematic code to be called. Before this patch, the dtest test_mvs_populating_from_existing_data fails when tablets are enabled. After this patch, it passes. Fixes #16598 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2024-01-14 13:24:44 +02:00


40643 Commits