Refactor the existing stats tracking and updating
code into struct latency_stats_tracker and while at it,
count lock_acquisitions only on success.
Decrement operations_currently_waiting_for_lock in the destructor
so it's always balanced with the uncoditional increment
in the ctor.
As for updating estimated_waiting_for_lock, it is always
updated in the dtor, both on success and failure since
the wait for the lock happened, whether waiting
timed out or not.
Fixes#12190
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes#12225
The schedule_repair() receives a bunch of endpoint:mutations pairs and tries to create handlers for those. When creating the handlers it re-obtains topology from schema->ks->effective_replication_map chain, but this new topology can be outdated as compared to the list of endpoints at hand.
The fix is to carry the e.r.m. pointer used by read executor reconciliation all the way down to repair handlers creation. This requires some manipulations with mutate_internal() and mutate_prepare() argument lists.
fixes: #12050 (it was the same problem)
Closes#12256
* github.com:scylladb/scylladb:
proxy: Carry replication map with repair mutation(s)
proxy: Wrap read repair entries into read_repair_mutation
proxy: Turn ref to forwardable ref in mutations iterator
A complicated function (in continuation style) that benefits
from this simplification.
Closes#12289
* github.com:scylladb/scylladb:
sstables: update_info_for_opened_data: reindent
sstables: update_info_for_opened_data: coroutinize
Sometimes a single modification to a base partition requires updates to
a large number of view rows. A common example is deletion of a base
partition containing many rows. A large BATCH is also possible.
To avoid large allocations, we split the large amount of work into
batch of 100 (max_rows_for_view_updates) rows each. The existing code
assumed an empty result from one of these batches meant that we are
done. But this assumption was incorrect: There are several cases when
a base-table update may not need a view update to be generated (see
can_skip_view_updates()) so if all 100 rows in a batch were skipped,
the view update stopped prematurely. This patch includes two tests
showing when this bug can happen - one test using a partition deletion
with a USING TIMESTAMP causing the deletion to not affect the first
100 rows, and a second test using a specially-crafed large BATCH.
These use cases are fairly esoteric, but in fact hit a user in the
wild, which led to the discovery of this bug.
The fix is fairly simple: To detect when build_some() is done it is no
longer enough to check if it returned zero view-update rows; Rather,
it explicitly returns whether or not it is done as an std::optional.
The patch includes several tests for this bug, which pass on Cassandra,
failed on Scylla before this patch, and pass with this patch.
Fixes#12297.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes#12305
The abseil and tools/java submodules were accidentally updated in
71bc12eecc
(merged to master in 51f867339e)
This series reverts those changes.
Closes#12311
* github.com:scylladb/scylladb:
Revert accidental update of tools/java submodule
Revert accidental update of abseil submodule
The create_write_response_handler() for read repair needs the e.r.m.
from the caller, because it effectively accepts list of endpoints from
it.
So this patch equips all read_repair_mutation-s with the e.r.m. pointer
so that the handler creation can use it. It's the same for all
mutations, so it's a waste of space, but it's not bad -- there's
typically few mutations in this range and the entry passed there is
temporary, so even lots of them won't occupy lots of memory for long.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The schedule_repair() operates on a map of endpoint:mutations pairs.
Next patch will need to extend this entry and it's going to be easier if
the entry is wrapped in a helper structure in advance.
This is where the forwardable reference cursor from the previous patch
gets its user. The schedule_repair() produces a range of rvalue
wrappers, but the create_write_response_handler accepting it is OK, it
copies mutations anyway.
The printing operator is added to facilitate mutations logging from
mutate_internal() method.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The mutate_prepare() is iterating over range of mutation with 'auto&'
cursor thus accepting only lvalues. This is very restrictive, the caller
of mutate_prepare() may as well provide rvalues if the target
create_write_response_handler() or lambda accepts it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This PR implements two things:
* Getting the value of a conjunction of elements separated by `AND` using `expr::evaluate`
* Preparing conjunctions using `prepare_expression`
---
`NULL` is treated as an "unkown value" - maybe `true` maybe `false`.
`TRUE AND NULL` evaluates to `NULL` because it might be `true` but also might be `false`.
`FALSE AND NULL` evaluates to `FALSE` because no matter what value `NULL` acts as, the result will still be `FALSE`.
Unset and empty values are not allowed.
Usually in CQL the rule is that when `NULL` occurs in an operation the whole expression becomes `NULL`, but here we decided to deviate from this behavior.
Treating `NULL` as an "unkown value" is the standard SQL way of handing `NULLs` in conjunctions.
It works this way in MySQL and Postgres so we do it this way as well.
The evaluation short-circuits. Once `FALSE` is encountered the function returns `FALSE` immediately without evaluating any further elements.
It works this way in Postgres as well, for example:
`SELECT true AND NULL AND 1/0 = 0` will throw a division by zero error,
but `SELECT false AND 1/0 = 0` will successfully evaluate to `FALSE`.
Closes#12300
* github.com:scylladb/scylladb:
expr_test: add unit tests for prepare_expression(conjunction)
cql3: expr: make it possible to prepare conjunctions
expr_test: add tests for evaluate(conjunction)
cql3: expr: make it possible to evaluate conjunctions
Fixes https://github.com/scylladb/scylladb/issues/11712
Updates added with this PR:
- Added a new section with the description of AzureSnitch (similar to others + examples and language improvements).
- Fixed the headings so that they render properly.
- Replaced "Scylla" with "ScyllaDB".
Closes#12254
* github.com:scylladb/scylladb:
docs: replace Scylla with ScyllaDB on the Snitches page
docs: fix the headings on the Snitches page
doc: add the description of AzureSnitch to the documentation
prepare_expression used to throw an error
when encountering a conjunction.
Now it's possible to use prepare_expression
to prepare an expression that contains
conjunctions.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Previously it was impossible to use expr::evaluate()
to get the value of a conjunction of elements
separated by ANDs.
Now it has been implemented.
NULL is treated as an "unkown value" - maybe true maybe false.
`TRUE AND NULL` evaluates to NULL because it might be true but also might be false.
`FALSE AND NULL` evaluates to FALSE because no matter what value NULL acts as, the result will still be FALSE.
Unset and empty values are not allowed.
Usually in CQL the rule is that when NULL occurs in an operation the whole expression
becomes NULL, but here we decided to deviate from this behavior.
Treating NULL as an "unkown value" is the standard SQL way of handing NULLs in conjunctions.
It works this way in MySQL and Postgres so we do it this way as well.
The evaluation short-circuits. Once FALSE is encountered the function returns FALSE
immediately without evaluating any further elements.
It works this way in Postgres as well, for example:
`SELECT true AND NULL AND 1/0 = 0` will throw a division by zero error
but `SELECT false AND 1/0 = 0` will successfully evaluate to FALSE.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
The infinetely high time_point of `db_clock::time_point::max()`
used in ba42852b0e
is too high for some clients that can't represent
that as a date_time string.
Instead, limit it to 9999-12-31T00:00:00+0000,
that is practically sufficient to ensure truncation of
all sstables and should be within the clients' limits.
Fixes#12239
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes#12273
Add instructions on how to backport a feature to on older version of Scylla.
It contains a detailed step-by-step instruction so that people unfamiliar with intricacies of Scylla's repository organization can easily get the hang of it.
This is the guide I wish I had when I had to do my first backport.
I put it in backport.md because that looks like the file responsible for this sort of information.
For a moment I thought about `CONTRIBUTING.md`, but this is a really short file with general information, so it doesn't really fit there. Maybe in the future there will be some sort of unification (see #12126)
Closes#12138
* github.com:scylladb/scylladb:
dev/docs: add additional git pull to backport docs
docs/dev: add a note about cherry-picking individual commits
docs/dev: use 'is merged into' instead of 'becomes'
docs/dev: mention that new backport instructions are for the contributor
docs/dev: Add backport instructions for contributors
Several snitch drivers make http requests to get
region/dc/zone/rack/whatever from the cloud provider. They blindly rely
on the response being successfull and read response body to parse the
data they need from.
That's not nice, add checks for requests finish with http OK statuses.
refs: #12185
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes#12287
update_normal_tokens checks that that the endpoint is in topology. Currently we call update_topology on this path only if it's not a normal_token_owner, but there are paths when the endpoint could be a normal token owner but still
be pending in topology so always update it, just in case.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes#12080
* github.com:scylladb/scylladb:
storage_service: handle_state_normal: always update_topology before update_normal_tokens
storage_service: handle_state_normal: delete outdated comment regarding update pending ranges race
Thanks to #12250, Host IDs uniquely identify nodes. We can use them as Raft IDs which simplifies the code and makes reasoning about it easier, because Host IDs are always guaranteed to be present (while Raft IDs may be missing during upgrade).
Fixes: https://github.com/scylladb/scylladb/issues/12204Closes#12275
* github.com:scylladb/scylladb:
service/raft: raft_group0: take `raft::server_id` parameter in `remove_from_group0`
gms, service: stop gossiping and storing RAFT_SERVER_ID
Revert "gms/gossiper: fetch RAFT_SERVER_ID during shadow round"
service: use HOST_ID instead of RAFT_SERVER_ID during replace
service/raft: use gossiped HOST_ID instead of RAFT_SERVER_ID to update Raft address map
main: use Host ID as Raft ID
This series improves the add-node-to-cluster document, in particular around the documentation for the associated cleanup procedure, and the prerequisite steps.
It also removes information about outdated releases.
Closes#12210
* github.com:scylladb/scylladb:
docs: operating-scylla: add-node-to-cluster: deleted instructions for unsupported releases
docs: operating-scylla: add-node-to-cluster: cleanup: move tips to a note
docs: operating-scylla: add-node-to-cluster: improve wording of cleanup instructions
docs: operating-scylla: prerequisites: system_auth is a keyspace, not a table
docs: operating-scylla: prerequisites: no Authetication status is gathered
docs: operating-scylla: prerequisites: simplify grep commands
docs: operating-scylla: add-node-to-cluster: prerequisites: number sub-sections
docs: operating-scylla: add-node-to-cluster: describe other nodes in plural
Revert version change made by PR #11106, which increased it to `4.0.0`
to enable server-side describe on latest cqlsh.
Turns out that our tooling some way depends on it (eg. `sstableloader`)
and it breaks dtests.
Reverting only the version allows to leave the describe code unchanged
and it fixes the dtests.
cqlsh 6.0.0 will return a warning when running `DESC ...` commands.
Closes#12272
We no longer need to translate from IP to Raft ID using the address map,
because Raft ID is now equal to the Host ID - which is always available
at the call site of `remove_from_group0`.
It is equal to (if present) HOST_ID and no longer used for anything.
The application state was only gossiped if `experimental-features`
contained `raft`, so we can free this slot.
Similarly, `raft_server_id`s were only persisted in `system.peers` if
the `SUPPORTS_RAFT` cluster feature was enabled, which happened only
when `experimental-features` contained `raft`. The `raft_server_id`
field in the schema was also introduced recently in `master` and didn't
get to be in a release yet. Given either of these reasons, we can remove
this field safely.
Fixes /scylladb/scylla-enterprise/issues#1262
Changes the somewhat ambiguous "none" into "not set" to clarify that "none" is not an
option to be written out, but an absense of a choice (in which case you also have made
a choice).
Closes#12270
The Host ID now uniquely identifies a node (we no longer steal it during
node replace) and Raft is still experimental. We can reuse the Host ID
of a node as its Raft ID. This will allow us to remove and simplify a
lot of code.
With this we can already remove some dead code in this commit.
Script for "one-click" opening of coredumps.
It extracts the build-id from the coredump, retrieves metadata for that
build, downloads the binary package, the source code and finally
launches the dbuild container, with everything ready to load the
coredump.
The script is idempotent: running it after the prepartory steps will
re-use what is already donwloaded.
The script is not trying to provide a debugging environment that caters
to all the different ways and preferences of debugging. Instead, it just
sets up a minimalistic environment for debugging, while providing
opportunities for the user to customization according to their
preferred.
I'm not entirely sure, coredumps from master branch will work, but we
can address this later when we confirm they don't.
Example:
$ ~/ScyllaDB/scylla/worktree0/scripts/open-coredump.sh ./core.scylla.113.bac3650b616f4f09a4d1ab160574b6a5.4349.1669185225000000000000
Build id: 5009658b834aaf68970135bfc84f964b66ea4dee
Matching build is scylla-5.0.5 0.20221009.5a97a1060 release-x86_64
Downloading relocatable package from http://downloads.scylladb.com/downloads/scylla/relocatable/scylladb-5.0/scylla-x86_64-package-5.0.5.0.20221009.5a97a1060.tar.gz
Extracting package scylla-x86_64-package-5.0.5.0.20221009.5a97a1060.tar.gz
Cloning scylla.git
Downloading scylla-gdb.py
Copying scylla-gdb.py from /home/bdenes/ScyllaDB/storage/11961/open-coredump.sh.dir/scylla.repo
Launching dbuild container.
To examine the coredump with gdb:
$ gdb -x scylla-gdb.py -ex 'set directories /src/scylla' --core ./core.scylla.113.bac3650b616f4f09a4d1ab160574b6a5.4349.1669185225000000000000 /opt/scylladb/libexec/scylla
See https://github.com/scylladb/scylladb/blob/master/docs/dev/debugging.md for more information on how to debug scylla.
Good luck!
[root@fedora workdir]#
Closes#12223
We want to always be able to distinguish between
the replacing node and the replacee by using different,
unique, host identifiers.
This will allow us to use the host_id authoritatively
to identify the node (rather then its endpoint ip address)
for token mapping and node operations.
Also, it will be used in the following patch to never allow the
replaced node to rejoin the cluster, as its host_id should never
be reused.
This change does not affect #5523, the replaced node may still steal back its tokens if restarted.
Refs #9839
Refs #12040Closes#12250
* github.com:scylladb/scylladb:
docs: replace-dead-node: update host_id of replacing node
docs: replace-dead-node: fix alignment
db: system_keyspace: change set_local_host_id to private set_local_random_host_id
storage_service: do not inherit the host_id of a replaced a node
The generic task holds and destroyes a task::impl
but we want the derived class's destructor to be called
when the task is destroyed otherwise, for example,
member like abort_source subscription will not be destroyed
(and auto-unlinked).
Fixes#12183
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes#12266
It is moved into the async thread so the encapsulating
function should be defined mutable to move the func
rather thna copying it.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closes#12267
This patch includes a translation of two more test files from
Cassandra's CQL unit test directory cql3/validation/operations.
All tests included here pass on Cassandra. Several test fail on Scylla
and are marked "xfail". These failures discovered two previously-unknown
bugs:
#12243: Setting USING TTL of "null" should be allowed
#12247: Better error reporting for oversized keys during INSERT
And also added reproducers for two previously-known bugs:
#3882: Support "ALTER TABLE DROP COMPACT STORAGE"
#6447: TTL unexpected behavior when setting to 0 on a table with
default_time_to_live
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes#12248
We recently (commit 6a5d9ff261) started
to use std::source_location instead of std::experimental::source_location.
However, this does not work on clang 14, because libc++ 12's
<source_location> only works if __builtin_source_location, and that is
not available on clang 14.
clang 15 is just three months old, and several relatively-recent
distributions still carry clang 14 so it would be nice to support it
as well.
So this patch adds a trivial compatibility header file, which, when
included and compiled with clang 14, it aliases the functional
std::experimental::source_location to std::source_location.
It turns out it's enough to include the new header file from three
headers that included <source_location> - I guess all other uses
of source_location depend on those header files directly or indirectly.
We may later need to include the compatibility header file in additional
places, bug for now we don't.
Refs #12259
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes#12265
This PR adds server-side `DESCRIBE` statement, which is required in latest cqlsh version.
The only change from the user perspective is the `DESC ...` statement can be used with cqlsh version >= 6.0. Previously the statement was executed from client side, but starting with Cassandra 4.0 and cqlsh 6.0, execution of describe was moved to server side, so the user was unable to do `DESC ...` with Scylla and cqlsh 6.0.
Implemented describe statements:
- `DESC CLUSTER`
- `DESC [FULL] SCHEMA`
- `DESC [ONLY] KEYSPACE`
- `DESC KEYSPACES/TYPES/FUNCTIONS/AGGREGATES/TABLES`
- `DESC TYPE/FUNCTION/AGGREGATE/MATERIALIZED VIEW/INDEX/TABLE`
- `DESC`
[Cassandra's implementation for reference](https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/cql3/statements/DescribeStatement.java)
Changes in this patch:
- cql3::util: added `single_quite()` function
- added `data_dictionary::keyspace_element` interface
- implemented `data_dictionary::keyspace_element` for:
- keyspace_metadata,
- UDT, UDF, UDA
- schema
- cql3::functions: added `get_user_functions()` and `get_user_aggregates()` to get all UDFs/UDAs in specified keyspace
- data_dictionary::user_types_metadata: added `has_type()` function
- extracted `describe_ring()` from storage_service to standalone helper function in `locator/util.hh`
- storage_proxy: added `describe_ring()` (implemented using helper function mentioned above)
- extended CQL grammar to handle describe statement
- increased version in `version.hh` to 4.0.0, so cqlsh will use server-side describe statement
Referring: https://github.com/scylladb/scylla/issues/9571, https://github.com/scylladb/scylladb/issues/11475Closes#11106
* github.com:scylladb/scylladb:
version: Increasing version
cql-pytest: Add tests for server-side describe statement
cql-pytest: creating random elements for describe's tests
cql3: Extend CQL grammar with server-side describe statement
cql3:statements: server-side describe statement
data_dictonary: add `get_all_keyspaces()` and `get_user_keyspaces()`
storage_proxy: add `describe_ring()` method
storage_service, locator: extract describe_ring()
data_dictionary:user_types_metadata: add has_type() function
cql3:functions: `get_user_functions()` and `get_user_aggregates()`
implement `keyspace_element` interface
data_dictionary: add `keyspace_element` interface
cql3: single_quote() util function
view: row_lock: lock_ck: reindent
test/topology: enable replace tests
service/raft: report an error when Raft ID can't be found in `raft_group0::remove_from_group0`
service: handle replace correctly with Raft enabled
gms/gossiper: fetch RAFT_SERVER_ID during shadow round
service: storage_service: sleep 2*ring_delay instead of BROADCAST_INTERVAL before replace
The `current()` version in version.hh has to be increased to at
least 4.0.0, so server-side describe will be used. Otherwise,
cqlsh returns warning that client-side describe is not supported.