Commit Graph

40764 Commits

Author SHA1 Message Date
Gleb Natapov
b97ff54a41 storage service: topology coordinator: call notify_left() when a node leaves a cluster
When the topology coordinator is used for topology changes the gossiper
based code that calls notify_left() is not called. The coordinator needs
to call it itself.
2024-01-24 10:21:01 +02:00
Gleb Natapov
5459a8b9a5 storage_service: drop redundant check from notify_joined()
notify_joined() is called from handle_state_normal only, so there is no
point checking that the state is normal inside the function as well.
2024-01-24 10:17:12 +02:00
Botond Dénes
08cf5ccd23 Merge 'Fix test_tablet_missing_data_repair' from Asias He
This PR fixes test_tablet_missing_data_repair and enable the test again.

If a node is not UP yet, repair in the test will be a partial repair. The partial repair will not repair all the data which cause the check of rows after repair to fail.  Check nodes see each other as UP before repair.

Closes scylladb/scylladb#16930

* github.com:scylladb/scylladb:
  test: Enable test_tablet_missing_data_repair again
  test: Wait for nodes to be up when repair
  test: Check repair status in ScyllaRESTAPIClient
2024-01-23 10:38:13 +02:00
Anna Stuchlik
9076a944c5 doc: improve the ScyllaDB for Developers page
This commit improves the developer-oriented section
of the core documentation:

- Added links to the developer sections in the new
  Get Started guide (Develop with ScyllaDB and
  Tutorials and Example Projects) for ease of access.

- Replaced the outdated Learn to Use ScyllaDB page with
  a link to the up-to-date page in the Get Started guide.
  This involves removing the learn.rst file and adding
  an appropriate redirection.

- Removed the Apache Copyrights, as this page does not
  need it.

- Removed the Features panel box as there was only one
  feature listed, which looked weird. Also, we are in
  the process of removing the Features section.

Closes scylladb/scylladb#16800
2024-01-23 10:06:31 +02:00
Kefu Chai
ac473eca91 utils:: add formatter for enum_option
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for enum_option<>. since its
operator<<() is still used by the homebrew generic formatter for
formatting vector<>, operator<<() is preserved.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16917
2024-01-23 10:03:51 +02:00
Kefu Chai
91a93b125b utils:: add formatter for cql3::authorized_prepared_statements_cache_key
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for
cql3::authorized_prepared_statements_cache_key, and remove its
operator<<().

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16924
2024-01-23 09:13:14 +02:00
Kefu Chai
76b9e4f4f4 locator: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16914
2024-01-23 09:12:23 +02:00
Asias He
99e3d2ce72 test: Enable test_tablet_missing_data_repair again
Fixes #16859
2024-01-23 15:02:02 +08:00
Kefu Chai
db77587309 tracing: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16925
2024-01-23 08:57:11 +02:00
Kefu Chai
26004071b3 configure.py: reenable -Wnarrowing
it seems that the tree builds just fine with this warning enabled.
and narrowing is a potentially unsafe numeric conversion. so let's
enable this warning option.

this change also helps to reduce the difference between the rules
generated by configure.py and those generated by CMake.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16929
2024-01-23 08:49:25 +02:00
Kefu Chai
5005e0a156 configure.py: s/--std=/-std/
neither clang nor gcc supports the --std flag, they support -std=
though. see https://clang.llvm.org/cxx_status.html and
https://gcc.gnu.org/onlinedocs/gcc/C-Dialect-Options.html
so, let's use the -std=gnu++20 for the C++20 standard with GNU
extensions.

this change also helps to reduce the difference between the rules
generated by `configure.py` and those generated by CMake.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16928
2024-01-23 08:48:05 +02:00
Asias He
7c230f17cc test: Wait for nodes to be up when repair
If a node is not UP yet, repair in the test will be a partial repair.
Check nodes see each other as UP before repair.

Fixes #16859
2024-01-23 11:10:08 +08:00
Asias He
57a4e5594d test: Check repair status in ScyllaRESTAPIClient
Raise an exception in case the repair is not successful.
2024-01-23 11:10:08 +08:00
Kefu Chai
33794eca19 database: wait until commitlog are reclaimed in flush_all_tables()
this change addresses the possible data resurrection after
"nodetool compact" and "nodetool flush" commands. and prepare for
the fix of a similar data resurrection issue after "nodetool cleanup".

active commitlog segments are recycled in the background once they are
discarded.

and there is a chance that we could have data resurrection even after
"nodetool cleanup", because the mutations in commitlog's active segments
could change the tables which are supposed to be removed by
"nodetool cleanup", so as a solution to address this problem in the
pre-tablets era, we force new active segments of commitlog, and flush the
involved memtables. since the active segments are discarded in the
background, the completion of the "nodetool cleanup" does not guarantee
that these mutation won't be applied to memtable when server restarts,
if it is killed right away.

the same applies to "force_flush", "force_compaction" and
"force_keyspace_compaction" API calls which are used by nodetool as
well. quote from Benny's comment

> If major comapction doesn't wait for the commitlog deletion it is
> also exposed to data resurrection since theoretically it could purge
> tombstones based on the assumption that commitlog would not resurrect
> data that they might shadow, BUT on a crash/restart scenario commitlog
> replay would happen since the commitlog segments weren't deleted -
> breaking the contract with compaction.

so to ensure that the active segments are reclaimed upon completion of
"nodetool cleanup", "nodetool compact" and "nodetool flush" commands,
let's wait for pending deletes in `database::flush_all_tables()`, so the
caller wait until the reclamation of deleted active segments completes.

Refs #4734
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16915
2024-01-22 17:31:57 +02:00
David Garcia
f3eeba8cc6 docs: parse config.cc properties as rst text
This enhancement formats descriptions in config.cc using the standard markup language reStructuredText (RST).

By doing so, it improves the rendering of these descriptions in the documentation, allowing you to use various directives like admonitions, code blocks, ordered lists, and more.

Closes scylladb/scylladb#16311
2024-01-22 16:40:18 +02:00
Botond Dénes
a48881801a replica/tablets: drop keyspace_name from system.tablets partition-key
The name of the keyspace being part of the partition key is not useful,
the table_id already uniquely identifies the table. The keyspace name
being part of the key, means that code wanting to interact with this
table, often has to resolve the table id, just to be able to provide the
keyspace name. This is counter productive, so make the keyspace_name
just a static column instead, just like table_name already is.

Fixes: #16377

Closes scylladb/scylladb#16881
2024-01-22 13:12:02 +01:00
Petr Gusev
6a4176c84f Update seastar submodule
* seastar 8b9ae36b...85359b28 (4):
  > rpc: extend the use_gate until request processing is finished

Fixes scylladb/scylladb#16382

  > scripts: Remove build.sh
  > build: do not install FindProtobuf.cmake
  > net: add missing include

Closes scylladb/scylladb#16883
2024-01-22 11:29:50 +01:00
Kamil Braun
1007ac4956 Merge 'sync_raft_topology_nodes: force_remove_endpoint for left nodes only if an IP is not used by other nodes' from Petr Gusev
Before the patch we called `gossiper.remove_endpoint` for IP-s of the
left nodes. The problem is that in replace-with-same-ip scenario we
called `gossiper.remove_endpoint` for IP which is used by the new,
replacing node. The `gossiper.remove_endpoint` method puts the IP into
quarantine, which means gossiper will ignore all events about this IP
for `quarantine_delay` (one minute by default). If we immediately
replace just replaced node with the same IP again, the bootstrap will
fail since the gossiper events are blocked for this IP, and we won't be
able to resolve an IP for the new host_id.

Another problem was that we called gossiper.remove_endpoint method,
which doesn't remove an endpoint from `_endpoint_state_map`, only from
live and unreachable lists. This means the IP will keep circulating in
the gossiper message exchange between cluster nodes until full cluster
restart.

This patch fixes both of these problems. First, we rely on the fact that
when topology coordinator moves the `being_replaced` node to the left
state, the IP of the `replacing` node is known to all nodes. This means
before removing an IP from the gossiper we can check if this IP is
currently used by another node in the current raft topology. This is
done by constructing the `used_ips` map based on normal and transition
nodes. This map is cached to avoid quadratic behaviour.

Second, we call `gossiper.force_remove_endpoint`, not
`gossiper.remove_endpoint`. This function removes and IP from
`_endpoint_state_map`, as well as from live and unreachable lists.

Closes scylladb/scylladb#16820

* github.com:scylladb/scylladb:
  get_peer_info_for_update: update only required fields in raft topology mode
  get_peer_info_for_update: introduce set_field lambda
  storage_service::on_change: fix indent
  storage_service::on_change: skip handle_state functions in raft topology mode
  test_replace_different_ip: check old IP is removed from gossiper
  test_replace: check two replace with same IP one after another
  storage_service: sync_raft_topology_nodes: force_remove_endpoint for left nodes only if an IP is not used by other nodes
2024-01-22 11:25:55 +01:00
Botond Dénes
742bc1bd11 test/topology_experimental_raft: test_tablet.py: disable flaky test
Skip test_tablet_missing_data_repair, it is failing a lot breaking
promotion and CI. Can't revert because the PR introducing it was already
piled on. So disable while investigated.

Refs: #16859

Closes scylladb/scylladb#16879
2024-01-22 11:49:05 +02:00
Avi Kivity
9e8b65f587 chunked_vector: remove range constructor
Standard containers don't have constructors that take ranges;
instead people use boost::copy_range or C++23 std::ranges::to.

Make the API more uniform by removing this special constructor.

The only caller, in a test, is adjusted.

Closes scylladb/scylladb#16905
2024-01-22 10:26:15 +02:00
Lakshmi Narayanan Sreethar
a1867986e7 test.py: deduce correct path for unit tests when built with cmake
Fix the path deduction for unit test executables when the source code is
built with cmake.

Fixes #16906

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>

Closes scylladb/scylladb#16907
2024-01-22 10:03:44 +02:00
Nadav Har'El
0bef50ef0c cql-pytest: add "--vnodes" option to "run" script
Running test/cql-pytest/run now defaults to enabling the "tablets"
experimental feature when running Scylla - and tests detect this and
use this feature as appropriate. This is the correct default going
forward, but in the short term it would be nice to also have an
option to easily do a manual test run *without* tablets.

So this patch adds a "--vnodes" option to the test/cql-pytest/run
script. This option causes "run" to run Scylla without enabling the
"tablets" experimental feature.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#16896
2024-01-22 09:35:11 +02:00
Anna Stuchlik
a462b914cb doc: add 2024.1 to the OSS vs. Enterprise matrix
This commit adds the information that
ScyllaDB Enterprise 2024.1 is based
on ScyllaDB Open Source 5.4
to the OSS vs. Enterprise matrix.

Closes scylladb/scylladb#16880
2024-01-22 09:25:08 +02:00
Kefu Chai
9550f29d22 cql3: add formatter for cql3::prepared_cache_key_type
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for cql3::prepared_cache_key_type
and cql3::prepared_cache_key_type::cache_key_type, and remove
their operator<<().

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16901
2024-01-21 19:12:59 +02:00
Avi Kivity
3092e3a5dc Merge 'doc: improvements to the Create Cluster page' from Anna Stuchlik
This PR:
- Removes the redundant information about previous versions from the Create Cluster page.
- Fixes language mistakes on that page, and replaces "Scylla" with "ScyllaDB".

(nobackport)

Closes scylladb/scylladb#16885

* github.com:scylladb/scylladb:
  doc: fix the language on the Create Cluster page
  doc: remove reduntant info about old versions
2024-01-21 18:18:32 +02:00
Avi Kivity
5810396ba1 Merge 'Invalidate prepared statements for views when their schema changes.' from Eliran Sinvani
When a base table changes and altered, so does the views that might
refer to the added column (which includes "SELECT *" views and also
views that might need to use this column for rows lifetime (virtual
columns).
However the query processor implementation for views change notification
was an empty function.
Since views are tables, the query processor needs to at least treat them
as such (and maybe in the future, do also some MV specific stuff).
This commit adds a call to `on_update_column_family` from within
`on_update_view`.
The side effect true to this date is that prepared statements for views
which changed due to a base table change will be invalidated.

Fixes https://github.com/scylladb/scylladb/issues/16392

This series also adds a test which fails without this fix and passes when the fix is applied.

Closes scylladb/scylladb#16897

* github.com:scylladb/scylladb:
  Add test for mv prepared statements invalidation on base alter
  query processor: treat view changes at least as table changes
2024-01-21 17:43:49 +02:00
Kefu Chai
d1dd71fbd7 mutation: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16889
2024-01-21 16:58:26 +02:00
Kefu Chai
1ce58595aa dht: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16891
2024-01-21 16:56:16 +02:00
Kefu Chai
45c4f2039b cql3: add formatter for cql3::ut_name
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define a formatter for cql3::ut_name, and remove
their operator<<().

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16890
2024-01-21 16:53:05 +02:00
Kefu Chai
f916286b25 index: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16892
2024-01-21 16:52:25 +02:00
Kefu Chai
ce076b5ae3 gossiping_property_file_snitch: drop unused using namespace
we don't use any symbol in this namespace, in this function, so drop it.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16893
2024-01-21 16:48:37 +02:00
Eliran Sinvani
0e5a8cad62 Add test for mv prepared statements invalidation on base alter
Issue #16392 describes a bug where when a base table is altered, it's
materialized views prepared statements are not invalidated which in turn
causes them to return missing data.
This test reproduces this bug and serves as a regression test for this
problem.

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
2024-01-21 15:44:06 +02:00
Eliran Sinvani
5e33d9346b query processor: treat view changes at least as table changes
When a base table changes and altered, so does the views that might
refer to the added column (which includes "SELECT *" views and also
views that might need to use this column for rows lifetime (virtual
columns).
However the query processor implementation for views change notification
was an empty function.
Since views are tables, the query processor needs to at least treat them
as such (and maybe in the future, do also some MV specific stuff).
This commit adds a call to `on_update_column_family` from within
`on_update_view`.
The side effect true to this date is that prepared statements for views
which changed due to a base table change will be invalidated.

Fixes #16392

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
2024-01-21 15:40:54 +02:00
Petr Gusev
5de970e430 get_peer_info_for_update: update only required fields in raft topology mode
Some fields of system.peers table are updated
through raft, we don't need to peek them from gossiper.

The goal of the patch is to declare explicitly
which code is responsible for which fields.
In particular, in raft topology mode we don't
need to update raft-managed fields since
it's done in topology_state_load and
raft_ip_address_updater.
2024-01-19 20:37:12 +04:00
Petr Gusev
f51f843b67 get_peer_info_for_update: introduce set_field lambda
This is a refactoring commit. In the next commit
we'll add a parameter to this unified lambda and
this is easy to do if we have only one lambda and
not three.
2024-01-19 20:37:12 +04:00
Petr Gusev
37063e2432 storage_service::on_change: fix indent 2024-01-19 20:37:12 +04:00
Petr Gusev
8e6b569de5 storage_service::on_change: skip handle_state functions in raft topology mode
We don't need them in raft topology mode since the token_metadata
update happens in topology_state_load function. We lift the
_raft_topology_change_enabled checks from those functions to on_change.
2024-01-19 20:37:12 +04:00
Petr Gusev
1e00889842 test_replace_different_ip: check old IP is removed from gossiper
In this commit we modify the existing
test_replace_different_ip. We add the check that the old
IP is not contained in alive or down lists, which
means it's completely wiped from gossiper. This test is failing
without the force_remove_endpoint fix from
a previous commit. We also check that the state of
local system.peers table is correct.
2024-01-19 20:36:52 +04:00
Anna Stuchlik
d345a893d6 doc: fix the language on the Create Cluster page
This commit fixes language mistakes on
the Create Cluster page, and replaces
"Scylla" with "ScyllaDB".
2024-01-19 17:21:12 +01:00
Anna Stuchlik
af669dd7ae doc: remove reduntant info about old versions
This commit removes the information about
old versions, which is reduntant in the next
upcoming version.
2024-01-19 17:06:34 +01:00
Anna Stuchlik
b1ba904c49 doc: remove upgrade for unsupported versions
This commit removes the upgrade guides
from ScyllaDB Open Source to Enterprise
for versions we no longer support.

In addition, it removes a link to
one of the removed pages from
the Troubleshooting section (the link is
redundant).

Closes scylladb/scylladb#16249
2024-01-19 15:59:35 +02:00
Mikołaj Grzebieluch
c589793a9e test.py: test_maintenance_socket: remove pytest.xfail
Issue https://github.com/scylladb/python-driver/issues/278 was fixed in
https://github.com/scylladb/python-driver/pull/279.

Closes scylladb/scylladb#16873
2024-01-19 14:54:15 +01:00
Botond Dénes
b50d9bb802 Merge 'Add code coverage support' from Eliran Sinvani
This mini-set includes code coverage support for ScyllaDB, it provides:
1. Support for building ScyllaDB with coverage support.
2. Utilities for processing coverage profiling data
3. test.py support for generation and processing of coverage profiling into an lcov trace files which can later be used to produce HTML or textual coverage reports.

Refs #16323

Closes scylladb/scylladb#16784

* github.com:scylladb/scylladb:
  Add code coverage documentation
  test.py: support code coverage
  code coverage: Add libraries for coverage handling
  test.py: support --coverage and --coverage-mode
  configure.py support coverage profiles on standrad build modes
2024-01-19 15:27:44 +02:00
Pavel Emelyanov
e62114214f Merge 'More logging for Raft-based topology' from Kamil Braun
Currently if topology coordinator gets stuck in a CI test run it's hard to debug this (e.g. scylladb/scylladb#16708). We can add a lot of logging inside topology coordinator code to aid debugging, without spamming the logs -- these are relatively rare control plane events.

Closes scylladb/scylladb#16749

* github.com:scylladb/scylladb:
  test/pylib: scylla_cluster: enable raft_topology=debug level by default
  raft topology: increase level of some TRACE messages
  raft topology: log when entering transition states
  raft topology: don't include null ID in exclude_nodes
  raft topology: INFO log when executing global commands and updating topology state
  storage_service: separate logger for raft topology
2024-01-19 16:19:44 +03:00
Nadav Har'El
debf6753c7 Merge 'test/cql-pytest: run tests with tablets' from Botond Dénes
Add `--experimental-features=tablets` to both `test/cql-pytest/suite.yaml` and `test/cql-pytest/run.py`, so tablets are enabled. Detect tablet support in `contest.py` and add an xfail and skip marker to mark tests that fail/crash with tablets. These are expected to be fixed soon.

Some tests checking things around alter-keyspace, had to force-disable tablets on the created keyspace, because tablets interfere with the test (a keyspace with tablets cannot have simple strategy for example).
Tablets were also interfering with `test_keyspace.py:test_storage_options_local`, because it is expecting `system_schema.scylla_keyspaces` to not have any entries for local storage keyspace, but they have it if tablets are enabled. Adjust the test to account for this.

Closes scylladb/scylladb#16840

* github.com:scylladb/scylladb:
  test/cql-pytest: run.py,suite.yaml: enable tablets by default
  test/cql-pytest: sprinkle xfail_tablets and skip_with_tablets as needed
  test/cql-pytest: disable tablets for some keyspace-altering tests
  test/cql-pytest: test_keyspace.py: test_storage_options_local(): fix for tablets
  test/cql-pytest: fix test_tablets.py to set initial_tablets correctly
  test/cql-pytest: add tablet detection logic and fixtures
  test/cql-pytest: extract is_scylla check into util.py
2024-01-19 13:38:56 +02:00
Kamil Braun
cc039498c6 Update tools/cqlsh submodule
* tools/cqlsh 426fa0ea...b8d86b76 (8):
  > Make cqlsh work with unix domain sockets

Fixes scylladb/scylladb#16489

  > Bump python-driver version
  > dist/debian: add trailer line
  > dist/debian: wrap long line
  > Draft: explicit build-time packge dependencies
  > stop retruning status_code=2 on schema disagreement
  > Fix minor typos in the code
  > Dockerfile: apt-get update and apt-get upgrade to get latest OS packages
2024-01-19 11:23:22 +01:00
Botond Dénes
04881b3915 test/cql-pytest: run.py,suite.yaml: enable tablets by default
All the preparations are done, the tests can now run with tablets.
2024-01-19 03:46:38 -05:00
Botond Dénes
075be5a04a test/cql-pytest: sprinkle xfail_tablets and skip_with_tablets as needed
For tests that cover functionality, which doesn't yet work with tablets.
These tests and the respective functionality they test, are expected to
be fixed soon, and then these fixtures will be removed.
2024-01-19 03:46:38 -05:00
Botond Dénes
6e6bee4368 test/cql-pytest: disable tablets for some keyspace-altering tests
When tablets are enabled on a keyspace, they cannot be altered to simple
replication strategy anymore.
These keyspaces are testing exactly that, so disable tablets on the
initial keyspace create statements.
2024-01-19 03:46:38 -05:00
Botond Dénes
5f11aa940d test/cql-pytest: test_keyspace.py: test_storage_options_local(): fix for tablets
This test expects a keyspace with local storage option, to not have a
row in system_schema.scylla_keyspace. With tablets enabled by default,
this won't be the case. Adjust the test to check for the specific
storage-related columns instead.
2024-01-19 03:46:38 -05:00