Commit Graph

40965 Commits

Author SHA1 Message Date
Kamil Braun
994a2ea5c3 Merge 'Call left/joined notifiers when topology coordinator is enabled' from Gleb
The gossiper topology change code calls left/joined notifiers when a
node leave or joins the cluster. This code it not executed in topology coordinator
mode, so the coordinator needs to call those notifiers by itself. The
series add the calls.

Fixes scylladb/scylladb#15841

* 'gleb/raft-topo-notifications-v1' of github.com:scylladb/scylla-dev:
  storage service: topology coordinator: call notify_joined() when a node joins a cluster
  storage service: topology coordinator: call notify_left() when a node leaves a cluster
  storage_service: drop redundant check from notify_joined()
2024-01-25 12:12:53 +01:00
Kefu Chai
1d33a68dd7 tests: utils: error injection: print time duration instead of count
instead of casting / comparing the count of duration unit, let's just
compare the durations, so that boost.test is able to print the duration
in a more informative and user friendly way (line wrapped)

test/boost/error_injection_test.cc(167): fatal error:
    in "test_inject_future_disabled":
      critical check wait_time > sleep_msec has failed [23839ns <= 10ms]

Refs #15902
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-01-25 19:10:24 +08:00
Kefu Chai
8a5689e7a7 error_injection: do not cast to milliseconds when injecting timeout
before this change, we always cast the wait duration to millisecond,
even if it could be using a higher resolution. actually
`std::chrono::steady_clock` is using `nanosecond` for its duration,
so if we inject a deadline using `steady_clock`, we could be awaken
earlier due to the narrowing of the duration type caused by the
duration_cast.

in this change, we just use the duration as it is. this should allow
the caller to use the resolution provided by Seastar without losing
the precision.

Fixes #15902

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-01-25 19:10:24 +08:00
Gleb Natapov
adf70aae15 storage service: topology coordinator: call notify_joined() when a node joins a cluster
When the topology coordinator is used for topology changes the gossiper
based code that calls notify_joined() is not called. The coordinator needs
to call it itself. But it needs to call it only once when node becomes
normal. For that the patch changes state loading code to remember the
old set of nodes in normal state to check if a node that is normal after
new state is loaded was not in the normal state before.
2024-01-25 12:28:08 +02:00
Botond Dénes
c9f247f3e8 Merge 'sstables: writer: don't block topology changes while writing sstables' from Avi Kivity
The sstable writer held the effective_replication_map_ptr while writing
sstables, which is both a layering violation and slows down tablet load
balancing. It was needed in order to ensure the sharder was stable. But
it turns out that sharding metadata is unnecessary for tablets, so just
skip the whole thing when writing an sstable for tablets.

Closes scylladb/scylladb#16953

* github.com:scylladb/scylladb:
  sstables: writer: don't require effective_replication_map for sharding metadata
  schema: provide method to get sharder, iff it is static
2024-01-25 12:12:01 +02:00
Botond Dénes
8e82df6fb6 Merge 'coverage libraries: bug fixes' from Eliran Sinvani
This mini-series contains two bug fixes that were found as part of testing coverage reporting in CI:
ref: https://github.com/scylladb/scylladb/pull/16895

1. The html-fixup which is triggered when using:`test/pylib/coverage_utils.py lcov-tools genhtml...` rendered incorrect links for multiple links in the same line.
2. For files that contined `,` in their name the output was simply wrong and resulted in lcov not being able to find such files for the purpose of filtering or generating reports.

The aforementioned draft PR served as a testing bed for finding and fixing those bugs.

Closes scylladb/scylladb#16977

* github.com:scylladb/scylladb:
  lcov_utils.py: support sourcefiles that contains commas in their name
  coreage_utils.py: make regular expression lazy in  html-fixup
2024-01-25 11:46:15 +02:00
Kefu Chai
0fbfc96619 db: add formatter for schema_tables::table_kind
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for db::schema_tables::table_kind,
and its operator<<() is still used by the homebrew generic formatter
for std::map<>, so it is preserved.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16972
2024-01-25 11:33:13 +03:00
Kefu Chai
ffb5ad494f api: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16973
2024-01-25 11:28:02 +03:00
Patryk Wrobel
cdfe0c1c35 api/api.hh: use const reference when looping over container
When reference is not used in the range-for loop, then
each element of a container is copied. Such copying
is not a problem for scalar types. However, the in case
of non-trivial types it may cause unneeded overhead.

This change replaces copying with const references
to avoid copying of types like seastar::sstring etc.

Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
2024-01-25 09:20:35 +01:00
Patryk Wrobel
1ca71f2532 api/api.hh: use std::vector::reserve() when the total size is known
When growing via push_back(), std::vector may need to reallocate
its internal block of memory due to not enough space. It is advised
to allocate the required space before appending elements if the
size is known beforehand.

This change introduces usage of std::vector::reserve() in api.hh
to ensure that push_back() does not cause reallocations.

Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>
2024-01-25 08:50:19 +01:00
Eliran Sinvani
d27283918f lcov_utils.py: support sourcefiles that contains commas in their name
As part of the parsing, every line of an lcov file was modeled as
INFO_TYPE:field[,field]...
However specifically for info type "SF" which represents the source file
there can only be one field.
This caused files that are using ',' in their names to be cut down up to
the first ',' and as a results not handled  correctly by lcov_utils.py
especially when rewriting a file.
This patch adds a special handling for the "SF" INFO_TYPE.
ref : `man geninfo`

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
2024-01-25 09:30:52 +02:00
Eliran Sinvani
11eb9f5bb2 coreage_utils.py: make regular expression lazy in html-fixup
The html-fixup procedure was created because of a bug in genhtml (`man
genhtml` for details about what genhtml is). The bug is that genhtml
doesn't account for file names that contains illegal  url characters (ref:
https://stackoverflow.com/a/1547940/2669716). html-fixup converts those
characters to the %<octet> notation (i.e space character becomes %20
etc..). However, the regular expression used to detect links was eager,
which didn't account for multiple links in the same line. This was
discovered during browsing one of the report and noticing that the links
that are meant to alternate between code view and function view of a
source got scrambled and unusable after html-fixup.
This change makes the regex that is used to detect links lazy so it can
handle multiple links in the same line in an html file correctly.

Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>
2024-01-25 09:30:42 +02:00
Nadav Har'El
69a68e35dd Merge 'scylla-sstable: add support for loading schema of views and indexes' from Botond Dénes
Loading schemas of views and indexes was not supported, with either `--schema-file`, or when loading schema from schema sstables.
This PR addresses both:
* When loading schema from CQL (file), `CREATE MATERIALIZED VIEW` and `CREATE INDEX` statements are now also processed correctly.
* When loading schema from schema tables, `system_schema.views` is also processed, when the table has no corresponding entry in `system_schema.tables`.

Tests are also added.

Fixes: #16492

Closes scylladb/scylladb#16517

* github.com:scylladb/scylladb:
  test/cql-pytest: test_tools.py: add schema-loading tests for MV/SI
  test/cql-pytest: test_tools.py: extract some fixture logic to functions
  test/cql-pytest: test_tools.py: extract common schema-loading facilities into base-class
  tools/schema_loader: load_schema_from_schema_tables(): add support for MV/SI schemas
  tools/schema_loader: load_one_schema_from_file(): add support for view/index schemas
  test/boost/schema_loader_test: add test for mvs and indexes
  tools/schema_loader: load_schemas(): implement parsing views/indexes from CQL
  replica/database: extract existing_index_names and get_available_index_name
  tools/schema_loader: make real_db.tables the only source of truth on existing tables
  tools/schema_loader: table(): store const keyspace&
  tools/schema_loader: make database,keyspace,table non-movable
  cql3/statements/create_index_statement: build_index_schema(): include index metadata in returned value
  cql3/statements/create_index_statement: make build_index_schema() public
  cql3/statements/create_index_statement: relax some method's dependence on qp
  cql3/statements/create_view_statement: make prepare_view() public
2024-01-24 23:36:54 +02:00
Nadav Har'El
df6c9828ef Merge 'Add protobuf and Native histogram support' from Amnon Heiman
Native histograms (also known as sparse histograms) are an experimental Prometheus feature.
They use protobuf as the reporting layer.
Native histograms hold the benefits of high resolution at a lower resource cost.

This series allows sending histograms in a native histogram format over protobuf.
By default, protobuf support is disabled. To use protobuf with native histograms, the command line flag prometheus_allow_protobuf should be set to true, and the Prometheus server should send the accept header with protobuf.

Fixes #12931

Closes scylladb/scylladb#16737

* github.com:scylladb/scylladb:
  main.cc: Add prometheus_allow_protobuf command line
  histogram_metrics_helper: support native histogram
  config: Add prometheus_allow_protobuf flag
2024-01-24 21:24:50 +02:00
Michał Chojnowski
f0eadc734e test: test_tablets: add a test for cleanup after migration
Reproduces the problems fixed by earlier commits in the series.
2024-01-24 19:36:29 +01:00
Botond Dénes
7bb3ed7f23 docs/operating-scylla: scylla-sstable.rst: fix checksum list
Add empty line before list of different checksums in
validate-checksums's description. Otherwise the list is not rendered.

Closes scylladb/scylladb#16401
2024-01-24 16:34:13 +01:00
Kefu Chai
a9851cf834 test.py: replace "$foo is False" with "not $foo"
for better readability.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16960
2024-01-24 15:21:53 +02:00
Kefu Chai
add74ec8ee mutation_writer: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16958
2024-01-24 15:20:02 +02:00
Kefu Chai
c978d1b3f8 config: s/re-use/reuse/
this misspelling is identified by codespell.
per m-w, reuse is a word per-se, and we don't need the hyphen for
addressing the ambiguity in the use cases, like, recover and re-cover.
see also https://www.merriam-webster.com/dictionary/reuse

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16962
2024-01-24 15:19:03 +02:00
Kefu Chai
8c39aba820 tools/scylla-sstable: use canonical path for sst_path
we deduce the paths to other SSTable components from the one
specified from the command line, for instance, if
/a/b/c/me-really-big-Data.db is fed to `scylla sstable`, the tool
would try to read /a/b/c/me-really-big-TOC.txt for the list of
other components. this works fine if the full path is specified
in the command line.

but if a relative path is specified, like, "me-really-big-Data.db",
this does not work anymore. before this change, the tool
would be reading "/me-really-big-TOC.txt", which does not exist
under most circumstances. while $PWD/me-really-big-TOC.txt should
exist if the SSTable is sane.

after this change, we always convert the specified path to
its canonical representation, no matter it is relative or absolutate.
this enables us to get the correct parent path path when trying
to read, for instance, the TOC component.

Fixes #16955
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16964
2024-01-24 13:28:40 +02:00
Michał Chojnowski
b88a0eb9ab test: pylib: add ScyllaCluster.wipe_sstables
Add a method which wipes sstables files for a particular table on a particular
stopped node.
2024-01-24 11:52:49 +01:00
Michał Chojnowski
94cdfcaa94 test: boost: add commitlog_cleanup_test
Adds a test for the commitlog cleanup functionality added
earlier in the series.
2024-01-24 10:37:39 +01:00
Michał Chojnowski
a246bb39ef db: commitlog_replayer: ignore mutations affected by (tablet) cleanups
To avoid data resurrection, mutations deleted by cleanup operations
have to be skipped during commitlog replay.

This patch implements this, based on the metadata recorded on cleanup
operations into system.commitlog_cleanups.
2024-01-24 10:37:39 +01:00
Michał Chojnowski
f458a1bf3e replica: table: garbage-collect irrelevant system.commitlog_cleanups records
Currently, rows in system.commitlog_cleanups are only dropped on node restart,
so the table can accumulate an unbounded number of records.

This probably isn't a problem in practice, because tablet cleanups aren't that
frequent, but this patch adds a countermeasure anyway.

This patch makes the choice to delete the unneeded records right when new records
are added. This isn't ideal -- it would be more natural if the unneeded records
were deleted as soon as they become unneeded -- but it does the job with a
minimal amount of code.
2024-01-24 10:37:38 +01:00
Michał Chojnowski
05ff32ebf9 db: commitlog: add min_position()
Add a helper function which returns the minimum replay position
across all existing or future commitlog segments.
Only positions greater or equal to it can be replayed on the next reboot.

We will use this helper in a future patch to garbage collect some cleanup
metadata which refers to replay positions.
2024-01-24 10:37:38 +01:00
Michał Chojnowski
a10650959c replica: table: populate system.commitlog_cleanups on tablet cleanup
To avoid data resurrection after cleanup, we have to filter out the
cleaned mutations during commitlog replay.

In this patch, we get tablet cleanup to record the affected set of mutations
to system.commitlog_cleanups. In a later patch, we will use these records
for filtering during commitlog replay.
2024-01-24 10:37:38 +01:00
Michał Chojnowski
7c5a8894be db: system_keyspace: add system.commitlog_cleanups
Add a system table which will hold records of cleanup operations,
for the purpose of filtering commitlog replays to avoid data
resurrections.
2024-01-24 10:37:38 +01:00
Michał Chojnowski
8bfd078c54 replica: table: refresh compound sstable set after tablet cleanup
If the compound set isn't refreshed, readers will keep seeing
the dataset as it was before the cleanup, which is a bug.
2024-01-24 10:37:38 +01:00
Kefu Chai
207fe93b90 utils: add formatter for rjson::value
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for rjson::value, and drop its
operator<<().

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16956
2024-01-24 10:30:52 +02:00
Gleb Natapov
b97ff54a41 storage service: topology coordinator: call notify_left() when a node leaves a cluster
When the topology coordinator is used for topology changes the gossiper
based code that calls notify_left() is not called. The coordinator needs
to call it itself.
2024-01-24 10:21:01 +02:00
Gleb Natapov
5459a8b9a5 storage_service: drop redundant check from notify_joined()
notify_joined() is called from handle_state_normal only, so there is no
point checking that the state is normal inside the function as well.
2024-01-24 10:17:12 +02:00
Avi Kivity
8ee75ae8f4 sstables: writer: don't require effective_replication_map for sharding metadata
Currently, we pass an effective_replication_map_ptr to sstable_writer,
so that we can get a stable dht::sharder for writing the sharding metadata.
This is needed because with tablets, the sharder can change dynamically.

However, this is both bad and unnecessary:
 - bad: holding on to an effective_replication_map_ptr is a barrier
   for topology operations, preventing tablet migrations (etc) while
   an sstable is being written
 - unnecessary: tablets don't require sharding metadata at all, since
   two tablets cannot overlap (unlike two sstables from different shards in
   the same node). So the first/last key is sufficient to determine the
   shard/tablet ownership.

Given that, just pass the sharder for vnode sstables, and don't generate
sharding metadata for tablet sstables.
2024-01-23 22:23:08 +02:00
Avi Kivity
b88f422a53 schema: provide method to get sharder, iff it is static
The current get_sharder() method only allows getting a static sharder
(since a dynamic sharder needs additional protection). However, it
chooses to abort if someone attempt to get a dynamic sharder.

In one case, it's useful to get a sharder only if it's static, so
provide a method to do that. This is for providing sstable sharding
metadata, which isn't useful with tablets.
2024-01-23 22:20:59 +02:00
Kamil Braun
05643208a8 Merge 'raft topology: move the topology coordinator to a dedicated file' from Piotr Dulikowski
The `topology_coordinator` is a large class (>1000 loc) which resides in
an even larger source file (storage_service.cc, ~7800 loc). This PR
moves the topology_coordinator class out of the storage_service.cc file
in order to improve modularity and recompilation times during
development.

As a first step, the `topology_mutation_builder` and
`topology_node_mutation_builder` classes are also moved from
storage_service.cc to their own, new header/source files as they are an
important abstraction used both by the topology coordinator code and
some other code in storage_service.cc that won't be moved.

Then, the `topology_coordinator` is moved out. The
`topology_coordinator` class is completely hidden in the new
topology_coordinator.cc file and can only be started and waited on to
finish via the new `run_topology_coordinator` function.

Fixes: scylladb/scylladb#16605

Closes scylladb/scylladb#16609

* github.com:scylladb/scylladb:
  service: move topology coordinator to a separate file
  storage_service: introduce run_topology_coordinator function
  service: move topology mutation builder out of storage_service
  storage_service: detemplate topology_node_mutation_builder::set
2024-01-23 20:02:06 +01:00
Kefu Chai
f86a5ae87a streaming: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16947
2024-01-23 19:38:30 +02:00
Kefu Chai
d493f949ca cql3: add formatter for cql3::statements::statement_type
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.

in this change, we define formatters for
cql3::statements::statement_type. and its operator<<() is dropped.

Refs #13245

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16948
2024-01-23 19:36:24 +02:00
Piotr Dulikowski
c3c3f5c1c8 service: move topology coordinator to a separate file
The topology coordinator is a large class that sits in an even larger
storage_service.cc file. For the sake of code modularization and
reducing recompilation time, move the topology coordinator outside
storage_service.cc.

The topology_coordinator class is moved to the new
topology_coordinator.cc unchanged. Along with it, the following items
are moved:

- wait_for_ip function - it's used both by storage_service and
  topology_coordinator, so in order for the new topology_coordinator.cc
  not to depend on storage service, it is moved to the new file,
- raft_topology logger - for the same reason as wait_for_ip,
- run_topology_coordinator - serves as the main interface for the
  topology coordinator. The topology coordinator class is not exposed at
  all, it's only possible to start the coordinator and wait until it
  shuts down itself via that function.
2024-01-23 17:51:10 +01:00
Avi Kivity
4a57b67634 docs: add a rough diagram of module interaction
It is incomplete and maybe inaccurate, but it is a start.

Closes scylladb/scylladb#16903
2024-01-23 18:08:48 +02:00
Kamil Braun
1824c12975 raft: remove empty() from fsm_output
Nobody remembered to keep this function up to date when adding stuff to
`fsm_output`.

Turns out that it's not being used by any Raft logic but only in some
tests. That use case can now be replaced with `fsm::has_output()` which
is also being used by `raft::server` code.
2024-01-23 16:48:28 +01:00
Kamil Braun
bf6d5309ca test: add test for manual triggering of Raft snapshots 2024-01-23 16:48:28 +01:00
Kamil Braun
617e09137d api: add HTTP endpoint to trigger Raft snapshots
This uses the `trigger_snapshot()` API added in previous commit on a
server running for the given Raft group.

It can be used for example in tests or in the context of disaster
recovery (ref scylladb/scylladb#16683).
2024-01-23 16:48:28 +01:00
Kamil Braun
0eda7a2619 raft: server: add trigger_snapshot API
This allows the user of `raft::server` to ask it to create a snapshot
and truncate the Raft log. In a later commit we'll add a REST endpoint
to Scylla to trigger group 0 snapshots.

One use case for this API is to create group 0 snapshots in Scylla
deployments which upgraded to Raft in version 5.2 and started with an
empty Raft log with no snapshot at the beginning. This causes problems,
e.g. when a new node bootstraps to the cluster, it will not receive a
snapshot that would contain both schema and group 0 history, which would
then lead to inconsistent schema state and trigger assertion failures as
observed in scylladb/scylladb#16683.

In 5.4 the logic of initial group 0 setup was changed to start the Raft
log with a snapshot at index 1 (ff386e7a44)
but a problem remains with these existing deployments coming from 5.2,
we need a way to trigger a snapshot in them (other than performing 1000
arbitrary schema changes).

Another potential use case in the future would be to trigger snapshots
based on external memory pressure in tablet Raft groups (for strongly
consistent tables).
2024-01-23 16:48:28 +01:00
David Garcia
77822fc51d chore: add azure and gcp images extensions
Closes scylladb/scylladb#16942
2024-01-23 16:06:40 +02:00
Botond Dénes
e79ea91990 Merge 'Extend query tracing information' from Michał Jadwiszczak
This little patch adds:
- authenticated user to "Processing a statement" tracing log
- name of a semaphore to reader concurrency semaphore logs

The purpose of this patch is to be able to verify parts of query execution to track down issues with service levels.

```
cassandra@cqlsh> select * from ks1.t where a = 1;

 a | b
---+---

(0 rows)

Tracing session: ea7e5ce0-b9f5-11ee-b123-b0816809f2c0

 activity                                                                                                                                     | timestamp                  | source    | source_elapsed | client
----------------------------------------------------------------------------------------------------------------------------------------------+----------------------------+-----------+----------------+-----------
                                                                                                                           Execute CQL3 query | 2024-01-23 14:47:14.734000 | 127.0.0.1 |              0 | 127.0.0.1
                                                                                                         Parsing a statement [shard 1/sl:sl1] | 2024-01-23 14:47:14.734126 | 127.0.0.1 |              3 | 127.0.0.1
                                                                    Processing a statement for authenticated user: cassandra [shard 1/sl:sl1] | 2024-01-23 14:47:14.734279 | 127.0.0.1 |            156 | 127.0.0.1
      Creating read executor for token -4069959284402364209 with all: {127.0.0.2} targets: {127.0.0.2} repair decision: NONE [shard 1/sl:sl1] | 2024-01-23 14:47:14.737348 | 127.0.0.1 |           3225 | 127.0.0.1
   Creating never_speculating_read_executor - speculative retry is disabled or there are no extra replicas to speculate with [shard 1/sl:sl1] | 2024-01-23 14:47:14.737351 | 127.0.0.1 |           3228 | 127.0.0.1
                                                                                  read_data: sending a message to /127.0.0.2 [shard 1/sl:sl1] | 2024-01-23 14:47:14.737358 | 127.0.0.1 |           3236 | 127.0.0.1
                                                                                 read_data: message received from /127.0.0.1 [shard 1/sl:sl1] | 2024-01-23 14:47:14.737593 | 127.0.0.2 |             16 | 127.0.0.1
                                                        Start querying singular range {{-4069959284402364209, 000400000001}} [shard 0/sl:sl1] | 2024-01-23 14:47:14.737676 | 127.0.0.2 |             24 | 127.0.0.1
                                                                  [reader concurrency semaphore sl:sl1] admitted immediately [shard 0/sl:sl1] | 2024-01-23 14:47:14.737684 | 127.0.0.2 |             31 | 127.0.0.1
                                                                        [reader concurrency semaphore sl:sl1] executing read [shard 0/sl:sl1] | 2024-01-23 14:47:14.737688 | 127.0.0.2 |             35 | 127.0.0.1
                                    Querying cache for range {{-4069959284402364209, 000400000001}} and slice {(-inf, +inf)} [shard 0/sl:sl1] | 2024-01-23 14:47:14.737715 | 127.0.0.2 |             63 | 127.0.0.1
 Page stats: 0 partition(s), 0 static row(s) (0 live, 0 dead), 0 clustering row(s) (0 live, 0 dead) and 0 range tombstone(s) [shard 0/sl:sl1] | 2024-01-23 14:47:14.737724 | 127.0.0.2 |             72 | 127.0.0.1
                                                                                                            Querying is done [shard 0/sl:sl1] | 2024-01-23 14:47:14.737731 | 127.0.0.2 |             79 | 127.0.0.1
                                                                read_data handling is done, sending a response to /127.0.0.1 [shard 1/sl:sl1] | 2024-01-23 14:47:14.738321 | 127.0.0.2 |            743 | 127.0.0.1
                                                                                     read_data: got response from /127.0.0.2 [shard 1/sl:sl1] | 2024-01-23 14:47:14.739148 | 127.0.0.1 |           5026 | 127.0.0.1
                                                                                        Done processing - preparing a result [shard 1/sl:sl1] | 2024-01-23 14:47:14.739196 | 127.0.0.1 |           5074 | 127.0.0.1
                                                                                                                             Request complete | 2024-01-23 14:47:14.739087 | 127.0.0.1 |           5087 | 127.0.0.1

```

Closes scylladb/scylladb#16920

* github.com:scylladb/scylladb:
  reader_concurrency_semaphore: add name of semaphore in tracing messages
  cql3:query_processor: add logged user to query tracing info
2024-01-23 16:06:16 +02:00
Piotr Dulikowski
4ad6b6563b storage_service: introduce run_topology_coordinator function
Extracts a part of the logic of the raft_state_monitor_fiber method into
a separate function. It will be moved to a separate file in the next
commit along with the topology coordinator, and will serve as the only
way of interaction with the topology coordinator while the class itself
will remain hidden.

The topology_coordinator class is now directly constructed on the stack
(or rather in the coroutine frame), the indirection via shared_ptr is no
longer needed.
2024-01-23 14:09:12 +01:00
Patryk Wrobel
f15880dc48 compaction_group::stop(): always call compaction_manager.remove()
Before introduction of PR#15524 the removal had always been invoked
via finally() continuation. In spite of making flush() noexcept, the
mentioned PR modified the logic. If flush() returns exceptional future,
then the removal is not performed.

This change restores the old behavior - removal operation is always called.
Since now, the logic of compaction_group::stop() is as follows:
 - firstly, it waits for completion of flush() via
   seastar::coroutine::as_future() to avoid premature exception
 - then it executes compaction_manager.remove()
 - in the end it inspects the future returned from flush()
   to re-throw the exception if the operation failed

Fixed: scylladb#16751

Signed-off-by: Patryk Wrobel <patryk.wrobel@scylladb.com>

Closes scylladb/scylladb#16940
2024-01-23 14:56:27 +02:00
Botond Dénes
78ec96f5f3 Merge 'alternator: allow empty tag value' from Nadav Har'El
Alternator incorrectly refuses an empty tag value for TagResource, but DynamoDB does allow this case and it's useful (note that an empty tag key is rightly forbidden). So this short series fixes this case, and adds additional tests for TagResource which covers this case and other cases we forgot to cover in tests.

Fixes #16904.

Closes scylladb/scylladb#16910

* github.com:scylladb/scylladb:
  test/alternator: add more tests for TagResource
  alternator: allow empty tag value
2024-01-23 13:53:30 +02:00
Botond Dénes
26d814d8be Merge 'Configure initial tablets count scaling' from Pavel Emelyanov
There are currently two options how to "request" the number of initial tables for a table

1. specify it explicitly when creating a keyspace
2. let scylla calculate it on its own

Both are not very nice. The former doesn't take cluster layout into consideration. The latter does, but starts with one tablet per shard, which can be too low if the amount of data grows rapidly.

Here's a (maybe temporary) proposal to facilitate at least perf tests -- the --tablets-initial-scale-factor option that enhances the option number two above by multiplying the calculated number of tablets by the configured number. This is what we currently do to run perf tests by patching scylla, with the option it going to be more convenient.

Closes scylladb/scylladb#16919

* github.com:scylladb/scylladb:
  config: Add --tablets-initial-scale-factor
  tablet_allocator: Add initial tablets scale to config
  tablet_allocator: Add config
2024-01-23 13:25:12 +02:00
Amnon Heiman
50b3078916 main.cc: Add prometheus_allow_protobuf command line
This patch add the prometheus_allow_protobuf command line support.

When set to true, Prometheus will accept protobuf requests and will
reply with protobuf protocol.
This will also enable the experimental Prometheus Native Histograms.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2024-01-23 13:12:34 +02:00
Amnon Heiman
95d1146fea histogram_metrics_helper: support native histogram
approx_exponential_histogram uses similar logic to Prometheus native
histogram, to allow Prometheus sending its data in a native histogram
format it needs to report schema and min id (id of the first bucket).

This patch update to_metrics_histogram to set those optional parameters,
leaving it to the Prometheus to decide in what format the histogram will
be reported.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2024-01-23 13:12:34 +02:00