For getting rid of fb_utilities.
In the future, that could be used to instantiate
multiple scylla node instances.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Pass the broadcast_address from main to get_seeds_from_db_config
rather than getting it from fb_utilities.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
and set broadcast_address / broadcast_rpc_address in main
to remove this dependency of snitch on fb_utilities.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Now that ec2_snitch::load_config is a coroutine
there's no need for a seastar thread here either.
Refs #16241
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
UUID v1 uses an epoch derived frmo Gregorian calendar. but
base36-uuid.py interprets the timestamp with the UNIX epoch time.
that's why it prints a UUID like
```console
$ ./scripts/base36-uuid.py -d 3gbi_0mhs_4sjf42oac6rxqdsnyx
date = 2411-02-16 16:05:52
decimicro_seconds = 0x7ad550
lsb = 0xafe141a195fe0d59
```
even this UUID is generated on nov 30, 2023. so in this change,
we shift the time with the timestamp of UNIX epoch derived from
the Gregorian calendar's day 0. so, after this change, we have:
```console
$ ./scripts/base36-uuid.py -d 3gbi_0mhs_4sjf42oac6rxqdsnyx
date = 2023-11-30 16:05:52
decimicro_seconds = 0x7ad550
lsb = 0xafe141a195fe0d59
```
see https://datatracker.ietf.org/doc/html/rfc4122#section-4.1.4
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16235
neither <iomanip> nor "utils/to_string.hh" is used in
`gms/inet_address.cc`, so let's remove their "#include"s.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16281
This commit fixes:
1.The error message will be specific about what type of keys
exceeds the limit (e.g clustering keys or partition keys).
2.Error message will be more general about what causes it, cartesian product
or simple list.
3.Error message will advise to use --max-partition-key-restrictions-per-query
or --max-clustering-key-restrictions-per-query configuration options to
override current (100) limit.
Fixes#15627Closesscylladb/scylladb#16226
in 7a1fbb38, a new test is added to an existing test for
comparing the UUIDs with different time stamps, but we should tighten
the test a little bit to reflect the intention of the test:
the timestamp of "2023-11-24 23:41:56" should be less than
"2023-11-24 23:41:57".
in this change, we replace LE with LT to correct it.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16245
This commit fixes the rollback procedure in
the 4.6-to-5.0 upgrade guide:
- The "Restore system tables" step is removed.
- The "Restore the configuration file" command
is fixed.
- The "Gracefully shutdown ScyllaDB" command
is fixed.
In addition, there are the following updates
to be in sync with the tests:
- The "Backup the configuration file" step is
extended to include a command to backup
the packages.
- The Rollback procedure is extended to restore
the backup packages.
- The Reinstallation section is fixed for RHEL.
Refs https://github.com/scylladb/scylladb/issues/11907
This commit must be backported to branch-5.4, branch-5.2, and branch-5.1
Closesscylladb/scylladb#16155
This commit fixes the rollback procedure in
the 5.1-to-5.2 upgrade guide:
- The "Restore system tables" step is removed.
- The "Restore the configuration file" command
is fixed.
- The "Gracefully shutdown ScyllaDB" command
is fixed.
In addition, there are the following updates
to be in sync with the tests:
- The "Backup the configuration file" step is
extended to include a command to backup
the packages.
- The Rollback procedure is extended to restore
the backup packages.
- The Reinstallation section is fixed for RHEL.
Also, I've the section removed the rollback
section for images, as it's not correct or
relevant.
Refs https://github.com/scylladb/scylladb/issues/11907
This commit must be backported to branch-5.4 and branch-5.2.
Closesscylladb/scylladb#16152
This commit fixes the rollback procedure in
the 5.0-to-5.1 upgrade guide:
- The "Restore system tables" step is removed.
- The "Restore the configuration file" command
is fixed.
- The "Gracefully shutdown ScyllaDB" command
is fixed.
In addition, there are the following updates
to be in sync with the tests:
- The "Backup the configuration file" step is
extended to include a command to backup
the packages.
- The Rollback procedure is extended to restore
the backup packages.
- The Reinstallation section is fixed for RHEL.
Also, I've the section removed the rollback
section for images, as it's not correct or
relevant.
Refs https://github.com/scylladb/scylladb/issues/11907
This commit must be backported to branch-5.4, branch-5.2, and branch-5.1
Closesscylladb/scylladb#16154
Using consistent cluster management and not using schema commitlog
ends with a bad configuration throw during bootstrap. Soon, we
will make consistent cluster management mandatory. This forces us
to also make schema commitlog mandatory, which we do in this patch.
A booting node decides to use schema commitlog if at least one of
the two statements below is true:
- the node has `force_schema_commitlog=true` config,
- the node knows that the cluster supports the `SCHEMA_COMMITLOG`
cluster feature.
The `SCHEMA_COMMITLOG` cluster feature has been added in version
5.1. This patch is supposed to be a part of version 6.0. We don't
support a direct upgrade from 5.1 to 6.0 because it skips two
versions - 5.2 and 5.4. So, in a supported upgrade we can assume
that the version which we upgrade from has schema commitlog. This
means that we don't need to check the `SCHEMA_COMMITLOG` feature
during an upgrade.
The reasoning above also applies to Scylla Enterprise. Version
2024.2 will be based on 6.0. Probably, we will only support
an upgrade to 2024.2 from 2024.1, which is based on 5.4. But even
if we support an upgrade from 2023.x, this patch won't break
anything because 2023.1 is based on 5.2, which has schema
commitlog. Upgrades from 2022.x definitely won't be supported.
When we populate a new cluster, we can use the
`force_schema_commitlog=true` config to use schema commitlog
unconditionally. Then, the cluster feature check is irrelevant.
This check could fail because we initiate schema commitlog before
we learn about the features. The `force_schema_commitlog=true`
config is especially useful when we want to use consistent cluster
management. Failing feature checks would lead to crashes during
initial bootstraps. Moreover, there is no point in creating a new
cluster with `consistent_cluster_management=true` and
`force_schema_commitlog=false`. It would just cause some initial
bootstraps to fail, and after successful restarts, the result would
be the same as if we used `force_schema_commitlog=true` from the
start.
In conclusion, we can unconditionally use schema commitlog without
any checks in 6.0 because we can always safely upgrade a cluster
and start a new cluster.
Apart from making schema commitlog mandatory, this patch adds two
changes that are its consequences:
- making the unneeded `force_schema_commitlog` config unused,
- deprecating the `SCHEMA_COMMITLOG` feature, which is always
assumed to be true.
Closesscylladb/scylladb#16254
Fixes#16277
When the PR for 'tagged pages' was submitted for RFC, it was assumed that PR #12849
(compression) would be committed first. The latter introduced v3 format, and the
format in #12849 (tagged pages) was assumed to have to be bumped to 4.
This ended up not the case, and I missed that the code went in with file format
tag numeric value being '4' (and constant named v3).
While not detrimental, it is confusing, and should be changed asap (before anything
depends on files with the tag applied).
Closesscylladb/scylladb#16278
Refs #15269
Unit test to check that trying to skip past EOF in a borked segment
will not crash the process. file_data_input_impl asserts iff caller
tries this.
On /usr/lib/sysctl.d/99-scylla-sched.conf, we have some sysctl settings to
tune the scheduler for lower latency.
This is mostly to prevent softirq threads processing tcp and reactor threads
from injecting latency into each other.
However, these parameters are moved to debugfs from linux-5.13+, so we lost
scheduler tuneing on recent kernels.
To support tuning recent kernel, let's add a new service which support
to configure both sysctl and debugfs.
The service named scylla-tune-sched.service
The service will unconditionally enables when installed, on older kernel
it will tune via sysctl, on recent kernel it will tune via debugfs.
Fixes#16077Closesscylladb/scylladb#16122
in 4ea6e06c, we specialized fmt::formatter<gms::inet_address> using
the formatter of bytes if the underlying address is an IPv6 address.
this breaks the tests with JMX which expected the shortened form of
the text representation of the IPv6 address.
in this change, instead of reinventing the wheel, let's reuse the
existing formatter of net::inet_address, which is able to handle
both IPv4 and IPv6 addresses, also it follows
https://datatracker.ietf.org/doc/html/rfc5952 by compressing the
consecutive zeros.
since this new formatter is a thin wrapper of seastar::net::inet_addresss,
the corresponding unit test will be added to Seastar.
Refs #16039
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16267
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we define a formatter for
row_level_diff_detect_algorithm. but its operator<<() is preserved,
as we are still using our homebrew the generic formatter for
std::vector, and this formatter is still using operator<< for formatting
the elements in the vector.
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16248
as part of the efforts to migrate to the CMake-based building system,
this change enables us to `configure.py` to optionally create
`build.ninja` with CMake.
in this change, we add a new option named `--use-cmake` to
`configure.py` so we can create `build.ninja`. please note,
instead of using the "Ninja" generator used by Seastar's
`configure.py` script, we use "Ninja Multi-Config" generator
along with `CMAKE_CROSS_CONFIGS` setting in this project.
so that we can generate a `build.ninja` which is capable of
building the same artifacts with multiple configuration.
Refs #15379
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#15916
* github.com:scylladb/scylladb:
build: cmake: add compatibility target of dev-headers
build: add an option to use CMake as the build build system
before this change, we rely on the default-generated fmt::formatter
created from operator<<, but fmt v10 dropped the default-generated
formatter.
in this change, we
* define a formatter for logical_clock::time_point, as fmt does not
provide formatter for this time_point, as it is not a part of the
standard library
* remove operator<<() for logical_clock::time_point, as its soly
purpose is to generate the corresponding fmt::formatter when
FMT_DEPRECATED_OSTREAM is defined.
* remove operator<<() for logical_clock::duration, as fmt provides
a default implementation for formatting
std::chrono::nanoseconds already, which uses `int64_t` as its rep
template parameter as well.
* include "fmt/chrono.h" so that the source files including this
header can have access the formatter without including it by
themselves, this preserve the existing behavior which we have
before removal of "operator<<()".
Refs #13245
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16263
In the view update code, the function get_view_natural_endpoint()
determines which view replica this base replica should send an update
to. It currently gets the *view* table's replication map (i.e., the map
from view tokens to lists of replicas holding the token), but assumes
that this is also the *base* table's replication map.
This assumption was true with vnodes, but is no longer true with
tablets - the base table's replication map can be completely different
from the view table's. By looking at the wrong mapping,
get_view_natural_endpoint() can believe that this node isn't really
a base-replica and drop the view update. Alternatively, it can think
it is a base replica - but use the wrong base-view pairing and create
base-view inconsistencies.
This patch solves this bug - get_view_natural_endpoint() now gets two
separate replication maps - the base's and the view's. The callers
need to remember what the base table was (in some cases they didn't
care at the point of the call), and pass it to the function call.
This patch also includes a simple test that reproduces the bug, and
confirms it is fixed: The test has a 6-node cluster using tablets
and a base table with RF=1, and writes one row to it. Before this
patch, the code usually gets confused, thinking the base replica
isn't a replica and loses the view update. With this patch, the
view update works.
Fixes#16227.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#16228
Prototype implementation of format suggested/requested by @avikivity:
Divides segments into disk-write-alignment sized pages, each tagged with segment ID + CRC of data content.
When read, we both verify sector integrity (CRC) to detect corruption, as well as matching ID read with expected one.
If the latter mismatches we have a prematurely terminated segment (read truncation), which, depending on whether the CL is
written in batch or periodic mode, as well as explicit sync, can mean data loss.
Note: all-zero pages are treated as kosher, both to align with newly allocated segments, as well as fully terminated (zero-page) ones.
Note: This is a preview/RFC - the rest of the file format is not modified. At least parts of entry CRC could probably be removed, but I have not done so yet (needs some thinking).
Note: Some slight abstraction breaks in impl. and probably less than maximal efficiency.
v2:
* Removed entry CRC:s in file format.
* Added docs on format v3
* Added one more test for recycling-truncation
v3:
* Fixed typos in size calc and docs
* Changed sect metadata order
* Explicit iter type
Closesscylladb/scylladb#15494
* github.com:scylladb/scylladb:
commitlog_test: Add test for replaying large-ish mutation
commitlog_test: Add additional test for segmnent truncation
docs: Add docs on commitlog format 3
commitlog: Remove entry CRC from file format
commitlog: Implement new format using CRC:ed sectors
commitlog: Add iterator adaptor for doing buffer splitting into sub-page ranges
fragmented_temporary_buffer: Add const iterator access to underlying buffers
commitlog_replayer: differentiate between truncated file and corrupt entries
The helper in question complicates the logic of sstable_directory::process() by making garbage collection differently for sstables deleted "atomically" and deleted "one-by-one". Also, the code that deletes sstables one-by-one and uses remove_by_toc_name() renders excessive TOC file reading, because there's sstable object at hand and it had all_components() ready for use.
Surprisingly, there was no test for the deletion-log functionality. This PR adds one. The test passes before the g.c. and regular unlink fix, and (of course) continues passing after it.
Closesscylladb/scylladb#16240
* github.com:scylladb/scylladb:
sstables: Drop remove_by_name()
sstables/fs_storage: Wipe by recognized+unrecognized components
sstable_directory: Enlight deletion log replay
sstables: Split remove_by_toc_name()
test: Add test case to validate deletion log work
sstable_directory: Close dir on exception
sstable_directory: Fix indentation after previous patch
sstable_directory: Coroutinize delete_with_pending_deletion_log()
test: Sstable on_delete() is not necessarily in a thread
sstable_directory: Split delete_with_pending_deletion_log()
Currently, the max size of commitlog is obtained either from the
config parameter commitlog_total_space_in_mb or, when the parameter
is -1, from the total memory allocated for Scylla.
To facilitate testing of the behavior of commitlog hard limit,
expose the value of commitlog max_disk_size in a dedicated API.
Closesscylladb/scylladb#16020
rpmlint complains about "mixed-use-of-spaces-and-tabs". and it
does not good in the editor. so let's replace tab with spaces.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#16246
Fixes some typos as found by codespell run on the code. In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc. Follow-up commits will take care of them.
Refs: https://github.com/scylladb/scylladb/issues/16255Closesscylladb/scylladb#16257
* github.com:scylladb/scylladb:
Update service/topology_state_machine.hh
Update raft/tracker.hh
Update db/view/view.cc
Typos: fix typos in comments
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.
Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
This PR is a necessary step to fix#15854 -- making consistent
cluster management mandatory on master.
Before making consistent cluster management mandatory, we have
to get rid of all tests that depend on the
`consistent_cluster_management=false` config. These are the tests
in the `topology_raft_disabled` suite.
There's the internal Raft upgrade procedure, which is the bulk of the
upgrade logic. Then, there are two thin "layers" around it that
invoke it underneath: recovery procedure and
enable-raft-in-the-cluster procedure. We're getting rid of the
second one by making Raft always enabled, so we naturally have to
get rid of tests that depend on it. The idea is to replace every
necessary enable-raft-in-the-cluster procedure in these tests with
the recovery procedure. Then, we will still be testing the internal
Raft upgrade procedure in the in-tree tests. The
enable-raft-in-the-cluster procedure is already tested by QA tests,
so we don't need to worry about these changes.
Unfortunately, we cannot adapt `test_raft_upgrade_no_schema`.
After making consistent cluster management mandatory on master,
schema commitlog will also become mandatory because
`consistent_cluster_management: True`,
`force_schema_commit_log: False`
is considered a bad configuration. These changes will make
`test_raft_upgrade_no_schema` unimplementable in the Scylla repo.
Therefore, we remove this test. If we want to keep it, we must
rewrite it as an upgrade dtest.
After making all tests in `topology_raft_disabled` use consistent
cluster management, there is no point in keeping this suite.
Therefore, we delete it and move all the tests to `topology_custom`.
Closesscylladb/scylladb#16192
* github.com:scylladb/scylladb:
test: delete topology_raft_disabled suite
test: topology_raft_disabled: move tests to topology_custom suite
test: topology_raft_disabled: move utils to topology suite
test: topology_raft_disabled: use consistent cluster management
test: topology_raft_disabled: add new util functions
test: topology_raft_disabled: delete test_raft_upgrade_no_schema
Currently wiping fs-backed sstable happens via reading and parsing its
TOC file back. Then the three-step process goes:
- move TOC -> TOC.tmp
- remove components (obtained from TOC.tmp)
- remove TOC.tmp
However, wiping sstable happens in one of two cases -- the sstable was
loaded from the TOC file _or_ sstable had evaluated the needed
components and generated TOC file. With that, the 2nd step can be made
without reading the TOC file, just by looking at all components sitting
on the sstable
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Garbage collection of sstables is scattered between two strages -- g.c.
per-se and the regular processing.
The former stage collects deletion logs and for each log found goes
ahead and deletes the full sstable with the standard sequence:
- move TOC -> TOC.tmp
- remove components
- remove TOC.tmp
The latter stage picks up partially unlinked sstables that didn't go via
atomic deletion with the log. This comes as
- collect all components
- keep TOC's and TOC.tmp's in separate lists
- attach other components to TOC/TOC.tmp by generation value
- for all TOC.tmp's get all attached components and remove them
- continue loading TOC's with attached components
Said that, replaying deletion log can be as light as just the first step
out of the above sequence -- just move TOC to TOC.tmp. After that the
regular processing would pick the remaining components and clean them
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The helper consists of three phases:
- move TOC -> TOC.tmp
- remove components listed in TOC
- remove TOC.tmp
The first step is needed separately by the next patch
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The test sequence is
- create several sstables
- create deletion log for a sub-set of them
- partially unlink smaller sub-sub-set
- make sstable directory do the processing with g.c.
- check that the sstables loaded do NOT include the deleted ones
The .throw_on_missing_toc bit set additionally validates that the
directory doesn't contain garbage not attached to any other TOCs
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
When committing the deletion log creation its containing directory is
sync-ed via opened file. This place is not exception safe and directory
can be left unclosed
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
One of the test cases injects an observer into sstable->unlink() method
via its _on_delete() callback. The test's callback assumes that it runs
in an async context, but it's a happy coincidence, because deletion via
the deletion log runs so. Next patch is changing it and the test case
will no longer work. But since it's a test case it can just directly
call a libc function for its needs
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>