Commit Graph

46613 Commits

Author SHA1 Message Date
Pavel Emelyanov
f650e75137 test/cql_env: Move stream manager start lower
This is to keep it in-sync with main code, where stream manager is
started after storage_proxy's and query_processor's remotes. This
doesn't change nothing for now, but next patches will move other
services around main/cql_test_env and early start of stream manager in
cql_test_env will be problematic.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-02-14 20:25:20 +03:00
Michał Chojnowski
294b839e34 test_rpc_compression.py: fix an overly-short timeout
The timeout of 10 seconds is too small for CI.
I didn't mean to make it so short, it was an accident.

Fix that by changing the timeout to 10 minutes.

Fixes scylladb/scylladb#22832

Closes scylladb/scylladb#22836
2025-02-13 17:49:39 +01:00
Gleb Natapov
d288d79d78 api: initialize token metadata API after starting the gossiper
Token metadata API now depend on gossiper to do ip to host id mappings,
so initialized it after the gossiper is initialized and de-initialized
it before gossiper is stopped.

Fixes: scylladb/scylladb#22743

Closes scylladb/scylladb#22760
2025-02-13 14:39:05 +01:00
Takuya ASADA
b5e306047f dist: fix upgrade error from 2024.1
We need to allow replacing nodetool from scylla-enterprise-tools < 2024.2,
just like we did for scylla-tools < 5.5.
This is required to make packages able to upgrade from 2024.1.

Fixes #22820

Closes scylladb/scylladb#22821
2025-02-13 12:36:24 +02:00
Botond Dénes
c57492bd73 Update tools/java submodule
* tools/java 807e991d...4f1353ba (1):
  > dist: support smooth upgrade from enterprise to source availalbe

Refs scylladb/scylladb#22820
2025-02-13 12:32:07 +02:00
Nadav Har'El
e6dcb605cb Merge 'Fix typos' from Dmitriy Rokhfeld (TripleChecker)
Hey, our tool caught a few typos in your repository.

Also, here is your site's error report: https://triplechecker.com/s/Dza11H/scylladb.com

Hope it's helpful!

Closes scylladb/scylladb#22787

* github.com:scylladb/scylladb:
  Fix typos
  Fix typos
2025-02-13 11:14:29 +02:00
TripleChecker
8d64be94e2 Fix typos 2025-02-13 01:54:08 +02:00
Wojciech Mitros
86838a147d test: skip test_complex_null_values in uf_typest_test
test_complex_null_values is currently flaky, causing many failures
in CI. The reason for the failures is unclear, and a fix might not
be simple, so because UDFs are experimental, for now let's skip
this test until the corresponding issue is fixed.

Refs scylladb/scylladb#22799

Closes scylladb/scylladb#22818
2025-02-12 21:37:34 +01:00
Andrei Chekun
54c165c94c test: Skip test_raft_voters because of existing issue
https://github.com/scylladb/scylladb/issues/18793

Closes scylladb/scylladb#22710
2025-02-12 16:41:17 +03:00
Anna Stuchlik
b860b2109f doc: add a warning for admins launching ScyllaDB on Azure
Fixes scylladb/scylladb#22686
Refs scylladb/scylladb#22505

Closes scylladb/scylladb#22687
2025-02-12 14:27:19 +01:00
Tomasz Grabiec
d8ea780244 Merge 'scylla-gdb.py: introduce scylla tablet-metadata command' from Botond Dénes
Dumps the content of the tablet metadata. Very useful for debugging tablet related problems.

Example output:
```
(gdb) scylla tablet-metadata --table usertable_no_lwt

This node: host_id: b90662a9-98b1-4452-bc45-44d460ecab62, shard: 0

table alternator_usertable_no_lwt.usertable_no_lwt: id: 68316fa0-78ec-11ef-af10-98d4ab71aac4, tablets: 32, resize decision: merge#1, transitions: 0
  tablet#0: last token: -8646911284551352321, replicas: [b5ddcd7e-45ed-4f20-8841-353bd82cc04c#0, 84d0cb45-1c6c-4870-b727-03db3130641f#0, b933959e-8134-4ba0-8c44-33dbd51170e9#0]
  tablet#1: last token: -8070450532247928833, replicas: [fb0167dc-7a7d-476d-b4a5-4a55a52dadff#0, 4b1e8a42-e8b3-432e-bf7c-b0f7a10eb3cd#0, ac2fdd20-2f54-4960-9856-27fd07ed38ef#0]
  tablet#2: last token: -7493989779944505345, replicas: [fb0167dc-7a7d-476d-b4a5-4a55a52dadff#1, 4b1e8a42-e8b3-432e-bf7c-b0f7a10eb3cd#1, b5ddcd7e-45ed-4f20-8841-353bd82cc04c#1]
  tablet#3: last token: -6917529027641081857, replicas: [ac2fdd20-2f54-4960-9856-27fd07ed38ef#1, b933959e-8134-4ba0-8c44-33dbd51170e9#1, 84d0cb45-1c6c-4870-b727-03db3130641f#1]
  tablet#4: last token: -6341068275337658369, replicas: [ac2fdd20-2f54-4960-9856-27fd07ed38ef#2, fb0167dc-7a7d-476d-b4a5-4a55a52dadff#2, b5ddcd7e-45ed-4f20-8841-353bd82cc04c#2]
  tablet#5: last token: -5764607523034234881, replicas: [4b1e8a42-e8b3-432e-bf7c-b0f7a10eb3cd#2, b933959e-8134-4ba0-8c44-33dbd51170e9#2, 84d0cb45-1c6c-4870-b727-03db3130641f#2]
  tablet#6: last token: -5188146770730811393, replicas: [84d0cb45-1c6c-4870-b727-03db3130641f#3, fb0167dc-7a7d-476d-b4a5-4a55a52dadff#3, ac2fdd20-2f54-4960-9856-27fd07ed38ef#3]
  tablet#7: last token: -4611686018427387905, replicas: [b5ddcd7e-45ed-4f20-8841-353bd82cc04c#3, b933959e-8134-4ba0-8c44-33dbd51170e9#3, 4b1e8a42-e8b3-432e-bf7c-b0f7a10eb3cd#3]
  tablet#8: last token: -4035225266123964417, replicas: [b933959e-8134-4ba0-8c44-33dbd51170e9#4, b5ddcd7e-45ed-4f20-8841-353bd82cc04c#4, ac2fdd20-2f54-4960-9856-27fd07ed38ef#4]
  tablet#9: last token: -3458764513820540929, replicas: [4b1e8a42-e8b3-432e-bf7c-b0f7a10eb3cd#4, fb0167dc-7a7d-476d-b4a5-4a55a52dadff#4, 84d0cb45-1c6c-4870-b727-03db3130641f#4]
  tablet#10: last token: -2882303761517117441, replicas: [84d0cb45-1c6c-4870-b727-03db3130641f#5, fb0167dc-7a7d-476d-b4a5-4a55a52dadff#5, 4b1e8a42-e8b3-432e-bf7c-b0f7a10eb3cd#5]
  tablet#11: last token: -2305843009213693953, replicas: [ac2fdd20-2f54-4960-9856-27fd07ed38ef#5, b5ddcd7e-45ed-4f20-8841-353bd82cc04c#5, b933959e-8134-4ba0-8c44-33dbd51170e9#5]
  tablet#12: last token: -1729382256910270465, replicas: [b933959e-8134-4ba0-8c44-33dbd51170e9#6, b5ddcd7e-45ed-4f20-8841-353bd82cc04c#6, 84d0cb45-1c6c-4870-b727-03db3130641f#6]
  tablet#13: last token: -1152921504606846977, replicas: [4b1e8a42-e8b3-432e-bf7c-b0f7a10eb3cd#6, ac2fdd20-2f54-4960-9856-27fd07ed38ef#6, fb0167dc-7a7d-476d-b4a5-4a55a52dadff#6]
  tablet#14: last token: -576460752303423489, replicas: [b5ddcd7e-45ed-4f20-8841-353bd82cc04c#7, 84d0cb45-1c6c-4870-b727-03db3130641f#7, 4b1e8a42-e8b3-432e-bf7c-b0f7a10eb3cd#7]
  tablet#15: last token: -1, replicas: [b933959e-8134-4ba0-8c44-33dbd51170e9#7, ac2fdd20-2f54-4960-9856-27fd07ed38ef#7, fb0167dc-7a7d-476d-b4a5-4a55a52dadff#7]
  tablet#16: last token: 576460752303423487, replicas: [ac2fdd20-2f54-4960-9856-27fd07ed38ef#8, fb0167dc-7a7d-476d-b4a5-4a55a52dadff#8, 84d0cb45-1c6c-4870-b727-03db3130641f#8]
  tablet#17: last token: 1152921504606846975, replicas: [b933959e-8134-4ba0-8c44-33dbd51170e9#8, 4b1e8a42-e8b3-432e-bf7c-b0f7a10eb3cd#8, b5ddcd7e-45ed-4f20-8841-353bd82cc04c#8]
  tablet#18: last token: 1729382256910270463, replicas: [b5ddcd7e-45ed-4f20-8841-353bd82cc04c#9, 4b1e8a42-e8b3-432e-bf7c-b0f7a10eb3cd#9, fb0167dc-7a7d-476d-b4a5-4a55a52dadff#9]
  tablet#19: last token: 2305843009213693951, replicas: [84d0cb45-1c6c-4870-b727-03db3130641f#9, ac2fdd20-2f54-4960-9856-27fd07ed38ef#9, b933959e-8134-4ba0-8c44-33dbd51170e9#9]
  tablet#20: last token: 2882303761517117439, replicas: [ac2fdd20-2f54-4960-9856-27fd07ed38ef#10, 4b1e8a42-e8b3-432e-bf7c-b0f7a10eb3cd#10, b933959e-8134-4ba0-8c44-33dbd51170e9#10]
  tablet#21: last token: 3458764513820540927, replicas: [84d0cb45-1c6c-4870-b727-03db3130641f#10, b5ddcd7e-45ed-4f20-8841-353bd82cc04c#10, fb0167dc-7a7d-476d-b4a5-4a55a52dadff#10]
  tablet#22: last token: 4035225266123964415, replicas: [4b1e8a42-e8b3-432e-bf7c-b0f7a10eb3cd#11, 84d0cb45-1c6c-4870-b727-03db3130641f#11, b933959e-8134-4ba0-8c44-33dbd51170e9#11]
  tablet#23: last token: 4611686018427387903, replicas: [b5ddcd7e-45ed-4f20-8841-353bd82cc04c#11, ac2fdd20-2f54-4960-9856-27fd07ed38ef#11, fb0167dc-7a7d-476d-b4a5-4a55a52dadff#11]
  tablet#24: last token: 5188146770730811391, replicas: [b5ddcd7e-45ed-4f20-8841-353bd82cc04c#12, 84d0cb45-1c6c-4870-b727-03db3130641f#12, fb0167dc-7a7d-476d-b4a5-4a55a52dadff#12]
  tablet#25: last token: 5764607523034234879, replicas: [ac2fdd20-2f54-4960-9856-27fd07ed38ef#12, 4b1e8a42-e8b3-432e-bf7c-b0f7a10eb3cd#12, b933959e-8134-4ba0-8c44-33dbd51170e9#12]
  tablet#26: last token: 6341068275337658367, replicas: [b5ddcd7e-45ed-4f20-8841-353bd82cc04c#13, b933959e-8134-4ba0-8c44-33dbd51170e9#13, 84d0cb45-1c6c-4870-b727-03db3130641f#13]
  tablet#27: last token: 6917529027641081855, replicas: [ac2fdd20-2f54-4960-9856-27fd07ed38ef#13, fb0167dc-7a7d-476d-b4a5-4a55a52dadff#13, 4b1e8a42-e8b3-432e-bf7c-b0f7a10eb3cd#13]
  tablet#28: last token: 7493989779944505343, replicas: [b5ddcd7e-45ed-4f20-8841-353bd82cc04c#0, b933959e-8134-4ba0-8c44-33dbd51170e9#0, ac2fdd20-2f54-4960-9856-27fd07ed38ef#0]
  tablet#29: last token: 8070450532247928831, replicas: [fb0167dc-7a7d-476d-b4a5-4a55a52dadff#0, 84d0cb45-1c6c-4870-b727-03db3130641f#0, 4b1e8a42-e8b3-432e-bf7c-b0f7a10eb3cd#0]
  tablet#30: last token: 8646911284551352319, replicas: [fb0167dc-7a7d-476d-b4a5-4a55a52dadff#1, ac2fdd20-2f54-4960-9856-27fd07ed38ef#1, b5ddcd7e-45ed-4f20-8841-353bd82cc04c#1]
  tablet#31: last token: 9223372036854775807, replicas: [b933959e-8134-4ba0-8c44-33dbd51170e9#1, 4b1e8a42-e8b3-432e-bf7c-b0f7a10eb3cd#1, 84d0cb45-1c6c-4870-b727-03db3130641f#1]
```

The PR includes two marginally related small fixes too.

Improvement, no backport needed.

Closes scylladb/scylladb#20940

* github.com:scylladb/scylladb:
  scylla-gdb.py: add scylla tablet-metadata command
  scylla-gdb.py: register the scylla table command
  scylla-gdb.py: unordered_map: improve flat_hash_map matching
2025-02-12 13:27:36 +01:00
Andrei Chekun
9540e056a4 test: Add the possibility to run raft tests with pytest
Closes scylladb/scylladb#22775
2025-02-12 14:10:19 +02:00
Botond Dénes
7150442f6a service/storage_proxy: schedule_repair(): materialize the range into a vector
Said method passes down its `diff` input to `mutate_internal()`, after
some std::ranges massaging. Said massaging is destructive -- it moves
items from the diff. If the output range is iterated-over multiple
times, only the first time will see the actual output, further
iterations will get an empty range.
When trace-level logging is enabled, this is exactly what happens:
`mutate_internal()` iterates over the range multiple times, first to log
its content, then to pass it down the stack. This ends up resulting in
a range with moved-from elements being pased down and consequently write
handlers being created with nullopt mutations.

Make the range re-entrant by materializing it into a vector before
passing it to `mutate_internal()`.

Fixes: scylladb/scylladb#21907
Fixes: scylladb/scylladb#21714

Closes scylladb/scylladb#21910
2025-02-12 12:38:47 +02:00
Kefu Chai
6e1fb2c74e build: limit ThinLTO link parallelism to prevent OOM in release builds
When building Scylla with ThinLTO enabled (default with Clang), the linker
spawns threads equal to the number of CPU cores during linking. This high
parallelism can cause out-of-memory (OOM) issues in CI environments,
potentially freezing the build host or triggering the OOM killer.

In this change:

1. Rename `LINK_MEM_PER_JOB` to `Scylla_RAM_PER_LINK_JOB` and make it
   user-configurable
2. Add `Scylla_PARALLEL_LINK_JOBS` option to directly control concurrent
   link jobs (useful for hosts with large RAM)
3. Increase the default value of `Scylla_PARALLEL_LINK_JOBS` to 16 GiB
   when LTO is enabled
4. Default to 2 parallel link jobs when LTO is enabled if the calculated
   number if less than 2 for faster build.

Notes:
- Host memory is shared across job pools, so pool separation alone doesn't help
- Ninja lacks per-job memory quota support
- Only affects link parallelism in LTO-enabled builds

See
https://clang.llvm.org/docs/ThinLTO.html#controlling-backend-parallelism

Fixes scylladb/scylladb#22275

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22383
2025-02-12 10:24:13 +02:00
Alexander Turetskiy
3ac533251a allow "UTC" and "GMT" in string format of timestamp
fix problem with statements like:
INSERT INTO tbl (pk, time) VALUES (1, '2016-09-27 16:10:00 UTC');

fixes #20501

Closes scylladb/scylladb#22426
2025-02-12 09:38:28 +02:00
Alexander Turetskiy
47011ab830 Materialized view name length should be limited
Oversized materialized view and index names are rejected;
Materialized view names with invalid symbols are rejected.

fixes: #20755

Closes scylladb/scylladb#21746
2025-02-11 22:16:09 +02:00
Avi Kivity
5c647408c7 systemd: map libraries close to the executable
The Intel Optimizaton Manual states that branches with relative offsets
greater than 2GB suffer a penalty. They cite a 6% improvement when this
is avoided. Our code doesn't rely heavily on dynamically linked
libraries, so I don't expect a similar win, but it's still better to do
it than not.

Eliminate long branches by asking the dynamic linker to restrict itself
to the lower 4GB of the address space. I saw that it maps libraries
at 1GB+ addresses, so this satisfies the limitation.

Fix is from the Intel Optimization Manual as well.

This change was ported from ScyllaDB Enterprise.

Closes scylladb/scylladb#22498
2025-02-11 22:16:09 +02:00
Avi Kivity
de3b2c827f service: topology coordinator: demote log message about refreshing stats
This repeats every minute and isn't very interesting. Demote to debug
to reduce log clutter.

Closes scylladb/scylladb#22784
2025-02-11 22:16:09 +02:00
Botond Dénes
f808f84a45 db/config: improve description of repair_multishard_reader_enable_read_ahead
The current description has a typo and in general not informative enough
on when this option should be used.

Closes scylladb/scylladb#21758
2025-02-11 22:16:09 +02:00
Botond Dénes
be5c28e149 scylla-gdb.py: add scylla tablet-metadata command
Dumps the content of the tablet-metadata. Very useful for debugging
tablet-replated problems.
2025-02-11 07:29:46 -05:00
Botond Dénes
23db82b957 scylla-gdb.py: register the scylla table command
This command exists but is not registered. There is a test for it, but
it happens to work only because scylla table is a prefix of scylla
tables (another command), so gdb invokes that other command instead.
2025-02-11 07:29:46 -05:00
Botond Dénes
3ec8ef90fe scylla-gdb.py: unordered_map: improve flat_hash_map matching
Strip typedefs from the type before matching.
2025-02-11 07:29:30 -05:00
Avi Kivity
5adaf0a605 Merge 'tree: migrate from boost::remove_if() to the standard library based alternatives' from Kefu Chai
Replace boost::remove_if() with the standard library's std::erase_if() or std::ranges::remove_if() to reduce external dependencies and simplify the codebase. This change eliminates the requirement for boost::range and makes the implementation more maintainable.

---

it's a cleanup, hence no need to backport.

Closes scylladb/scylladb#22788

* github.com:scylladb/scylladb:
  service: migrate from boost::range::remove_if() to std::ranges::remove_if
  sstable: migrate from boost::remove_if() to std::erase_if()
2025-02-11 14:07:48 +02:00
Kefu Chai
481397317d sstables, test: migrate from boost::copy() to std::ranges::copy()
Replace boost::copy() with the standard library's std::ranges::copy()
to reduce external dependencies and simplify the codebase. This change
eliminates the requirement for boost::range and makes the implementation
more maintainable.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22789
2025-02-11 14:55:25 +03:00
Asias He
fb318d0c81 repair: Add await_completion option for tablet_repair api
Set true to wait for the repair to complete. Set false to skip waiting
for the repair to complete. When the option is not provided, it defaults
to false.

It is useful for management tool that wants the api to be async.

Fixes #22418

Closes scylladb/scylladb#22436
2025-02-11 12:49:12 +02:00
Avi Kivity
770dc37f0f tools: toolchain: prepare: fix optimized_clang archive printout
prepare helpfully prints out the path where optimized clang is stored,
but a couple of typos mean it prints out an empty string. Fix that.

Closes scylladb/scylladb#22714
2025-02-11 11:50:01 +02:00
Nadav Har'El
1842d456a1 test/cqlpy: fix some false failures on Cassandra
Developers are expected to run new cqlpy tests against Cassandra - to
verify that the new test itself is correct. Usually there is no need
to run the entire cqlpy test suite against Cassandra, but when users do
this, it isn't confidence-inspiring to see hundreds of tests failing.
In this patch I fix many but not all of these failures.

Refs #11690 (which will remain open until we fix all the failures on
Cassandra)

* Fixed the "compact_storage" fixture recently introduced to enable the
  deprecated feature in Scylla for the tests. This fixture was broken on
  Cassandra and caused all compact-storage related tests to fail
  on Cassandra.

* Marked all tests in test_tombstone_limit.py as scylla_only - as they
  check the Scylla-only query_tombstone_page_limit configuration option.

* Marked all tests in test_service_level_api.py as scylla_only - as they
  check the Scylla-only service levels feature.

* Marked a test specific to the Scylla-only IncrementalCompactionStrategy
  as scylla_only. Some tests mix STCS and ICS testing in one test - this
  is a mistake and isn't fixed in this patch.

* Various tests in test_tablets.py forgot to use skip_without_tablets
  to skip them on Cassandra or older Scylla that doesn't have the
  tablets feature.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

x

Closes scylladb/scylladb#22561
2025-02-11 11:48:40 +02:00
Botond Dénes
4a7a75dfcb Merge 'tasks: use host_id in task manager' from Aleksandra Martyniuk
Use host_id in a children list of a task in task manager to indicate
a node on which the child was created.

Move TASKS_CHILDREN_REQUEST to IDL. Send it by host_id.

Fixes: https://github.com/scylladb/scylladb/issues/22284.

Ip to host_id transition; backport isn't needed.

Closes scylladb/scylladb#22487

* github.com:scylladb/scylladb:
  tasks: drop task_manager::config::broadcast_address as it's unused
  tasks: replace ip with host_id in task_identity
  api: task_manager: pass gossiper to api::set_task_manager
  tasks: keep host_id in task_manager
  tasks: move tasks_get_children to IDL
2025-02-11 11:32:27 +02:00
Patryk Jędrzejczak
7b8344faa8 Merge 'Fix a regression that sometimes causes an internal error and demote barrier_and_drain rpc error log to a warning ' from Gleb Natapov
The series fixes a regression and demotes a barrier_and_drain logging error to a warning since this particular condition may happen during normal operation.

We want to backport both since one is a bug fix and another is trivial and reduces CI flakiness.

Closes scylladb/scylladb#22650

* https://github.com/scylladb/scylladb:
  topology_coordinator: demote barrier_and_drain rpc failure to warning
  topology_coordinator: read peers table only once during topology state application
2025-02-11 10:25:35 +01:00
Pavel Emelyanov
529ff3efa5 Merge 'Alternator: implement UpdateTable operation to add or delete GSI' from Nadav Har'El
In this series we implement the UpdateTable operation to add a GSI to an existing table, or remove a GSI from a table. As the individual commit messages will explained, this required changing how Alternator stores materialized view keys - instead of insisting that these key must be real columns (that is **not** the case when adding a GSI to an existing table), the materialized view can now take as its key any Alternator attribute serialized inside the ":attrs" map holding all non-key attributes. Fixes #11567.

We also fix the IndexStatus and Backfilling attributes returned by DescribeTable - as DynamoDB API users use this API to discover when a newly added GSI completed its "backfilling" (what we call "view building") stage. Fixes #11471.

This series should not be backported lightly - it's a new feature and required fairly large and intrusive changes that can introduce bugs to use cases that don't even use Alternator or its UpdateTable operations - every user of CQL materialized views or secondary indexes, as well as Alternator GSI or LSI, will use modified code. **It should be backported to 2025.1**, though - this version was actually branched long after this PR was sent, and it provides a feature that was promised for 2025.1.

Closes scylladb/scylladb#21989

* github.com:scylladb/scylladb:
  alternator: fix view build on oversized GSI key attribute
  mv: clean up do_delete_old_entry
  test/alternator: unflake test for IndexStatus
  test/alternator: work around unrelated bug causing test flakiness
  docs/alternator: adding a GSI is no longer an unimplemented feature
  test/alternator: remove xfail from all tests for issue 11567
  alternator: overhaul implementation of GSIs and support UpdateTable
  mv: support regular_column_transformation key columns in view
  alternator: add new materialized-view computed column for item in map
  build: in cmake build, schema needs alternator
  build: build tests with Alternator
  alternator: add function serialized_value_if_type()
  mv: introduce regular_column_transformation, a new type of computed column
  alternator: add IndexStatus/Backfilling in DescribeTable
  alternator: add "LimitExceededException" error type
  docs/alternator: document two more unimplemented Alternator features
2025-02-11 10:02:01 +03:00
Kefu Chai
a18069fad7 service: migrate from boost::range::remove_if() to std::ranges::remove_if
Replace boost::range::remove_if() with the standard library's
std::ranges::remove_if() to reduce external dependencies and simplify
the codebase. This change eliminates the requirement for boost::range
and makes the implementation more maintainable.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2025-02-11 09:15:14 +08:00
Kefu Chai
ba724a26f4 sstable: migrate from boost::remove_if() to std::erase_if()
Replace boost::remove_if() with the standard library's std::erase_if()
to reduce external dependencies and simplify the codebase. This change
eliminates the requirement for boost::range and makes the implementation
more maintainable.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2025-02-11 09:15:14 +08:00
TripleChecker
e72e6fadeb Fix typos 2025-02-11 00:17:43 +02:00
Avi Kivity
6a1ee32cc3 Merge 'raft/group0_state_machine: load current RPC compression dict on startup' from Michał Chojnowski
We are supposed to be loading the most recent RPC compression dictionary
on startup, but we forgot to port the relevant piece of logic during
the source-available port. This causes a restarted node not to use the
dictionary for RPC compression until the next dictionary update.

Fix that.

Fixes scylladb/scylladb#22738

This is more of a bugfix than an improvement, so it should be backported to 2025.1.

Closes scylladb/scylladb#22739

* github.com:scylladb/scylladb:
  test_rpc_compression.py: test the dictionaries are loaded on startup
  raft/group0_state_machine: load current RPC compression dict on startup
2025-02-10 20:40:33 +02:00
Dawid Mędrek
cd50152522 service/mapreduce_service: Cancel query when stopping
Before these changes, shutting down a node could be prolonged because of
mapreduce_service. `mapreduce_service::stop()` uninitializes messaging
service, which includes waiting for all ongoing RPC handlers. We already
had a mechanism for cancelling local mapreduce tasks, but we were missing
one for cancelling external queries.

In this commit, we modify the signature of the request so it supports
cancelling via an abort source. We also provide a reproducer test
for the problem.

Fixes scylladb/scylladb#22337

Closes scylladb/scylladb#22651
2025-02-10 20:12:59 +02:00
Asias He
6f04de3efd streaming: Fail stream plan on stream_mutation_fragments handler in case of error
The following is observed in pytest:

1) node1, stream master, tried to pull data from node3

2) node3, stream follower, found node1 restarted

3) node3 killed the rpc stream

4) node1 did not get the stream session failure message from node3. This
failure message was supposed to kill the stream plan on node1. That's the
reason node1 failed the stream session much later at "2024-08-19 21:07:45,539".
Note, node3 failed the stream on its side, so it should have sent the stream
session failure message.

```
$ cat node1.log |grep f890bea0-5e68-11ef-99ae-e5bca04385fc
INFO  2024-08-19 20:24:01,162 [shard 0:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Executing streaming plan for Tablet migration-ks-index-0 with peers={127.0.34.3}, master
ERROR 2024-08-19 20:24:01,190 [shard 1:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Failed to handle STREAM_MUTATION_FRAGMENTS (receive and distribute phase) for ks=ks, cf=cf, peer=127.0.34.3: seastar::nested_exception: seastar::rpc::stream_closed (rpc stream was closed by peer) (while cleaning up after seastar::rpc::stream_closed (rpc stream was closed by peer))
WARN  2024-08-19 21:07:45,539 [shard 0:main] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Streaming plan for Tablet migration-ks-index-0 failed, peers={127.0.34.3}, tx=0 KiB, 0.00 KiB/s, rx=484 KiB, 0.18 KiB/s

$ cat node3.log |grep f890bea0-5e68-11ef-99ae-e5bca04385fc
INFO  2024-08-19 20:24:01,163 [shard 0:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Executing streaming plan for Tablet migration-ks-index-0 with peers=127.0.34.1, slave
INFO  2024-08-19 20:24:01,164 [shard 1:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Start sending ks=ks, cf=cf, estimated_partitions=2560, with new rpc streaming
WARN  2024-08-19 20:24:01,187 [shard 0: gms] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Streaming plan for Tablet migration-ks-index-0 failed, peers={127.0.34.1}, tx=633 KiB, 26506.81 KiB/s, rx=0 KiB, 0.00 KiB/s
WARN  2024-08-19 20:24:01,188 [shard 0:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] stream_transfer_task: Fail to send to 127.0.34.1:0: seastar::rpc::stream_closed (rpc stream was closed by peer)
WARN  2024-08-19 20:24:01,189 [shard 0:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Failed to send: seastar::rpc::stream_closed (rpc stream was closed by peer)
WARN  2024-08-19 20:24:01,189 [shard 0:strm] stream_session - [Stream #f890bea0-5e68-11ef-99ae-e5bca04385fc] Streaming error occurred, peer=127.0.34.1
```

To be safe in case the stream fail message is not received, node1 could fail
the stream plan as soon as the rpc stream is aborted in the
stream_mutation_fragments handler.

Fixes #20227

Closes scylladb/scylladb#21960
2025-02-10 16:32:12 +01:00
Avi Kivity
cf72c31617 treewide: improve bash error reporting
bash error handling and reporting is atrocious. Without -e it will
just ignore errors. With -e it will stop on errors, but not report
where the error happened (apart from exiting itself with an error code).

Improve that with the `trap ERR` command. Note that this won't be invoked
on intentional error exit with `exit 1`.

We apply this on every bash script that contains -e or that it appears
trivial to set it in. Non-trivial scripts without -e are left unmodified,
since they might intentionally invoke failing scripts.

Closes scylladb/scylladb#22747
2025-02-10 18:28:52 +03:00
Pavel Emelyanov
81f7a6d97d doc: Update system.sstables table schema description
The partition key had been renamed and its type changed some time ago,
but the doc wasn't updated. Fix it.

refs: #20998

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#22683
2025-02-10 16:09:49 +02:00
Botond Dénes
51a273401c Merge 'test: tablets_test: Create proper schema in load balancer tests' from Tomasz Grabiec
This PR converts boost load balancer tests in preparation for load balancer changes
which add per-table tablet hints. After those changes, load balancer consults with the replication
strategy in the database, so we need to create proper schema in the
database. To do that, we need proper topology for replication
strategies which use RF > 1, otherwise keyspace creation will fail.

Topology is created in tests via group0 commands, which is abstracted by
the new `topology_builder` class.

Tests cannot modify token_metadata only in memory now as it needs to be
consistent with the schema and on-disk metadata. That's why modifications to
tablet metadata are now made under group0 guard and save back metadata to disk.

Closes scylladb/scylladb#22648

* github.com:scylladb/scylladb:
  test: tablets: Drop keyspace after do_test_load_balancing_merge_colocation() scenario
  tests: tablets: Set initial tablets to 1 to exit growing mode
  test: tablets_test: Create proper schema in load balancer tests
  test: lib: Introduce topology_builder
  test: cql_test_env: Expose topology_state_machine
  topology_state_machine: Introduce lock transition
2025-02-10 16:08:41 +02:00
Avi Kivity
c212f5a296 db/config: forward-declare boost options_description_easy_init
Reduces large dependency pull from boost.

Closes scylladb/scylladb#22748
2025-02-10 15:08:11 +02:00
Nikita Kurashkin
025bb379a4 cql: remove expansion of "SELECT *" in DESC MATERIALIZED VIEW
This patch removes expansion of "SELECT *" in DESC MATERIALIZED VIEW.
Instead of explicitly printing each column, DESC command will now just
use SELECT *, if view was created with it. Also, adds a correspodning test.
Fixes #21154

Closes scylladb/scylladb#21962
2025-02-10 15:01:23 +02:00
Kefu Chai
c6bf9d8d11 sstables: switch from boost to std::ranges::all_of()
Replace boost::algorithm::all_of_equal() to std::ranges::all_of()

In order to reduce the header dependency to boost ranges library, let's
use the utility from the standard library when appropriate.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22730
2025-02-10 15:44:55 +03:00
Kefu Chai
09a090e410 ent/encryption: Replace manual string suffix checks with ends_with()
Replace manual string suffix comparison (length check + std::equal) with
std::string::ends_with() introduced in C++20 for better readability.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22764
2025-02-10 15:42:39 +03:00
Avi Kivity
d4c531307d replica: database.hh: drop dependency on boost ranges
Reduces dependency load.

Closes scylladb/scylladb#22749
2025-02-10 13:29:55 +01:00
Michael Litvak
c098e9a327 test/test_view_build_status: fix flaky asserts
In few test cases of test_view_build_status we create a view, wait for
it and then query the view_build_status table and expect it to have all
rows for each node and view.

But it may fail because it could happen that the wait_for_view query and
the following queries are done on different nodes, and some of the nodes
didn't apply all the table updates yet, so they have missing rows.

To fix it, we change the assert to work in the eventual consistency
sense, retrying until the number of rows is as expectd.

Fixes scylladb/scylladb#22644

Closes scylladb/scylladb#22654
2025-02-10 12:41:42 +01:00
Kefu Chai
ca832dc4fb .github: Make "make-pr-ready-for-review" workflow run in base repo
The "make-pr-ready-for-review" workflow was failing with an "Input
required and not supplied: token" error.  This was due to GitHub Actions
security restrictions preventing access to the token when the workflow
is triggered in a fork:
```
    Error: Input required and not supplied: token
```

This commit addresses the issue by:

- Running the workflow in the base repository instead of the fork. This
  grants the workflow access to the required token with write permissions.
- Simplifying the workflow by using a job-level `if` condition to
  controlexecution, as recommended in the GitHub Actions documentation
  (https://docs.github.com/en/actions/writing-workflows/choosing-when-your-workflow-runs/using-conditions-to-control-job-execution).
  This is cleaner than conditional steps.
- Removing the repository checkout step, as the source code is not required for this workflow.

This change resolves the token error and ensures the
"make-pr-ready-for-review" workflow functions correctly.

Fixes scylladb/scylladb#22765

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22766
2025-02-10 12:56:39 +02:00
Evgeniy Naydanov
06793978c1 test.py: new Python dependencies for dtest->test.py migration
3rd-party library which provide compatibility between sync and async code:

  universalasync

Few deps from scylla-dtest:

  deepdiff
  cryptography
  boto3-stubs[dynamodb]

[avi: regenerate frozen toolchain with optimized clang from

  https://devpkg.scylladb.com/clang/clang-19.1.7-Fedora-41-aarch64.tar.gz
  https://devpkg.scylladb.com/clang/clang-19.1.7-Fedora-41-x86_64.tar.gz
]

Closes scylladb/scylladb#22497
2025-02-10 10:52:27 +02:00
Kefu Chai
0185aa458b build: cmake: remove trailing comma in db/CMakeLists.txt source list
In c5668d99, a new source file row_cache.cc was added to the `db` target,
but with an extraneous trailing comma. In CMake's target_sources(),
source files should be space-separated - any comma is interpreted as
part of the filename, causing build failures like:
```
  CMake Error at db/CMakeLists.txt:2 (target_sources):
    Cannot find source file:
      row_cache.cc,
```
Fix the issue by removing the trailing comma.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#22754
2025-02-09 17:28:47 +02:00
Nadav Har'El
a492e239e3 Merge 'test.py: Add the possibility to run boost and unit tests with pytest ' from Andrei Chekun
Add the possibility to run boost and unit tests with pytest

test.py should follow the next paradigm - the ability to run all test cases sequentially by ONE pytest command.
With this paradigm, to have the better performance, we can split this 1 command into 2,3,4,5,100,200... whatever we want

It's a new functionality that does not touch test.py way of executing the boost and unit tests.
It supports the main features of test.py way of execution: automatic discovery of modes, repeats.
There is an additional requirement to execute tests in parallel: pytest-xdist. To install it, execute `pip install pytest-xdist`

To run test with pytest execute `pytest test/boost`. To execute only one file, provide the path filename `pytest test/boost/aggregate_fcts_test.cc` since it's a normal path, autocompletion will work on the terminal. To provide a specific mode, use the next parameter `--mode dev`, if parameter will not be provided pytest will try to use `ninja mode_list` to find out the compiled modes.
Parallel execution controlled by pyest-xdist and the parameter `-n 12`.
The useful command to discover the tests in the file or directory is `pytest --collect-only -q --mode dev test/boost/aggregate_fcts_test.cc`. That will return all test functions in the file. To execute only one function from the test, you can invoke the output from the previous command, but suffix for mode should be skipped, for example output will be `test/boost/aggregate_fcts_test.cc::test_aggregate_avg.dev`, so to execute this specific test function, please use the next command `pytest --mode dev test/boost/aggregate_fcts_test.cc::test_aggregate_avg`
There is a parameter `--repeat` that used to repeat the test case several times in the same way as test.py did.
It's not possible to run both boost and unit tests directories with one command, so we need to provide explicitly which directory should be executed. Like this `pytest --mode dev test/unit` or `pytest --mode dev test/boost`

Fixes: https://github.com/scylladb/qa-tasks/issues/1775

Closes scylladb/scylladb#21108

* github.com:scylladb/scylladb:
  test.py: Add possibility to run ldap tests from pytest
  test.py: Add the possibility to run unit tests from pytest
  test.py: Add the possibility to run boost test from pytest
  test.py: Add discovery for C++ tests for pytest
  test.py: Modify s3 server mock
  test.py: Add method to get environment variables from MinIO wrapper
  test.py: Move get configured modes to common lib
2025-02-09 11:56:24 +01:00
Yaron Kaikov
93f53f4eb8 dist: support smooth upgrade from enterprise to source availalbe
When upgrading for example from `2024.1` to `2025.1` the package name is
not identical casuing the upgrade command to fail:
```
Command: 'sudo DEBIAN_FRONTEND=noninteractive apt-get dist-upgrade scylla -y -o Dpkg::Options::="--force-confdef" -o Dpkg::Options::="--force-confold"'
Exit code: 100
Stdout:
Selecting previously unselected package scylla.
Preparing to unpack .../6-scylla_2025.1.0~dev-0.20250118.1ef2d9d07692-1_amd64.deb ...
Unpacking scylla (2025.1.0~dev-0.20250118.1ef2d9d07692-1) ...
Errors were encountered while processing:
/tmp/apt-dpkg-install-JbOMav/0-scylla-conf_2025.1.0~dev-0.20250118.1ef2d9d07692-1_amd64.deb
/tmp/apt-dpkg-install-JbOMav/1-scylla-python3_2025.1.0~dev-0.20250118.1ef2d9d07692-1_amd64.deb
/tmp/apt-dpkg-install-JbOMav/2-scylla-server_2025.1.0~dev-0.20250118.1ef2d9d07692-1_amd64.deb
/tmp/apt-dpkg-install-JbOMav/3-scylla-kernel-conf_2025.1.0~dev-0.20250118.1ef2d9d07692-1_amd64.deb
/tmp/apt-dpkg-install-JbOMav/4-scylla-node-exporter_2025.1.0~dev-0.20250118.1ef2d9d07692-1_amd64.deb
/tmp/apt-dpkg-install-JbOMav/5-scylla-cqlsh_2025.1.0~dev-0.20250118.1ef2d9d07692-1_amd64.deb
Stderr:
E: Sub-process /usr/bin/dpkg returned an error code (1)
```

Adding `Obsoletes` (for rpm) and `Replaces` (for deb)

Fixes: https://github.com/scylladb/scylladb/issues/22420

Closes scylladb/scylladb#22457
2025-02-08 21:56:09 +02:00