scylla

Author	SHA1	Message	Date
Piotr Sarna	697e2fc66d	db,view: drop the artificial limit on view update mutation size The series which split the view update process into smaller parts accidentally put an artificial 10MB limit on the generated mutation size, which is wrong - this limit is configurable for users, and, what's more important, this data was already validated when it was inserted into the base table. Thus, the limit is lifted. Tests: unit(release), dtest(wide_rows_test)	2021-07-15 14:09:37 +02:00
Nadav Har'El	8bdff97d8d	Merge 'Fix propagating view update generation failures' from Piotr Sarna When the generate-and-propagate-view-updates routine was rewritten to allow partial results, one important validation got lost: previously, an error which occured during update generation was propagated to the user - as an example, the indexed column value must be smaller than 64kB, otherwise it cannot act as primary key part in the underlying view. Errors on view update propagation are however ignored in this layer, because it becomes a background process. During the rewrite these two got mixed up and so it was possible to ignore an error that should have been propagated. This behavior is now fixed. Fixes #9013 Closes #9021 * github.com:scylladb/scylla: cql-pytest: add a case for too large value in SI table: stop ignoring view generation errors on write path	2021-07-14 15:49:48 +02:00
Piotr Sarna	91b4e24db5	Merge 'Tests for Alternator's TTL feature' from Nadav Har'El This series includes a comprehensive test suite for the DynamoDB API's TTL (item expiration) feature described in issue #5060. Because we have not yet implemented the TTL feature in Alternator, all of the tests still xfail, but they all pass on DynamoDB and demonstrate exactly how the TTL feature works and how it interacts with other features such as LSI, GSI and Streams. The patch which introduces these tests is heavily commented to explain exactly what it tests, and why. Because DynamoDB only expires items some 10-30 minutes after their expiration time (the documentation even suggests it can be delayed by 24 hours!), some of these tests are extremely long (up to 30 minutes!), so we also introduce in this series a new marker for "verylong" tests. verylong tests are skipped by default, unless the "--runverylong" option is given. In the future, when we implement the TTL feature in Alternator and start testing it, we may be able to configure it with a much shorter expiration timeout and then we might be able to run these tests in a reasonable time and make them run by default. Closes #8564 * github.com:scylladb/scylla: test/alternator: add tests for the Alternator TTL feature test/alternator: add marker for "veryslow" tests test/alternator: add new_test_table() utility function	2021-07-14 15:49:48 +02:00
Nadav Har'El	2acfee8118	test/alternator: add tests for the Alternator TTL feature This patch adds a comprehensive test suite for the DynamoDB API's TTL (item expiration) feature. The tests check the two new API commands added by this feature (UpdateTimeToLive and DescribeTimeToLive), and also how items are expired in practice, and how item expiration interacts with other features such as GSI, LSI and DynamoDB Streams. Because DynamoDB has extremely long delays until items are expired, or until expiration configuration may be changed, several of these tests take up to 30 minutes to complete. We mark these tests with the "verylong" marker, so they are skipped in ordinary test runs - use the "--runverylong" option to run them. All these tests currently pass on DynamoDB, but xfail on Alternator because the two commands UpdateTimeToLive and DescribeTimeToLive are currently rejected by Alternator. Refs #5060 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2021-07-14 14:08:55 +03:00
Avi Kivity	6df3139455	install-dependencies.sh: add gdb gdb is used for testing scylla-gdb.py (since `3c2e852dd`), so it needs to be listed as a dependency. Add it there. It was listed as a courtesy dependency in the frozen toolchain (which is why it still worked), so it's removed from there. Closes #9034	2021-07-14 10:15:54 +03:00
Avi Kivity	64ad31c26f	build: enable -Wc++1z-extensions It was disabled for the move to clang, but now apparently no longer needed. So re-enable that warning. Closes #9026	2021-07-14 08:28:26 +03:00
Nadav Har'El	9d07ce3cb6	test/alternator: add marker for "veryslow" tests Until now, Alternator test have all been very fast, taking milliseconds or at worst seconds each - or a bit longer on DynamoDB. However, sometimes we need to write tests which take a huge amount of time - for example, tests for the TTL feature may take 10 minutes because the item expiration is delayed by that much. Because a 10 minute test is ridiculous (all 500 Alternator tests together take just one minute today!), we would normally run such test once, and then mark it "skip" so will never run again. One annoying thing about skipped tests is that there is no way to temporarily "unskip" them when we want to run such a test anyway. So in this patch, we introduce a better option for these very slow tests instead of the simple "skip": The patch introduces a marker "@pytest.mark.veryslow". By default, a test with this marker is skipped. However, an command-line option "--runveryslow" is introduced which causes tests with the veryslow mark to be run anyway, and not skipped. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2021-07-14 00:26:21 +03:00
Nadav Har'El	2fb379bb94	test/alternator: add new_test_table() utility function This patch adds a convenient function new_test_table() that Alternator tests can use to safely create a temporary table, and be sure it is deleted in any case. This function is used in a "with", as follows: with new_test_table(dynamodb, ...) as table: do_something(table) # at this point table has already been deleted. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2021-07-14 00:26:21 +03:00
Piotr Sarna	11a054c1fc	cql-pytest: add a case for too large value in SI The test case tries to insert a too-large value into an indexed column and expects to see a write failure. Refs #9013	2021-07-13 17:26:18 +02:00
Piotr Sarna	73f7702a69	table: stop ignoring view generation errors on write path When the generate-and-propagate-view-updates routine was rewritten to allow partial results, one important validation got lost: previously, an error which occured during update generation was propagated to the user - as an example, the indexed column value must be smaller than 64kB, otherwise it cannot act as primary key part in the underlying view. Errors on view update propagation are however ignored in this layer, because it becomes a background process. During the rewrite these two got mixed up and so it was possible to ignore an error that should have been propagated. This behavior is now fixed. Fixes #9013	2021-07-13 17:20:38 +02:00
Benny Halevy	c8e7bd9a26	storage_proxy: abstract_read_resolver: catch semaphore_timed_out before timed_out_error Prepare for making semaphore_timed_out derived from timed_out_error in seastar. When this happens in seastar, we would need to catch the derived, more-specific exception first to avoid the following warning: ``` service/storage_proxy.cc:2818:18: error: exception of type 'seastar::semaphore_timed_out &' will be caught by earlier handler [-Werror,-Wexceptions] } catch (semaphore_timed_out&) { ^ service/storage_proxy.cc:2815:18: note: for type 'seastar::timed_out_error &' } catch (timed_out_error&) { ^ ``` Later on, after the seastar change is applied to the scylla repo, we can eliminate the duplication and catch only timed_out_error. Test: unit(dev) (w/ the seastar changes to semaphore_timed_out and rpc::timeout_error to inherit from timed_out_error). Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210713132858.294504-1-bhalevy@scylladb.com>	2021-07-13 16:39:17 +03:00
Nadav Har'El	c174eeae06	alternator: do not allow LSI on base table with no sort key The purpose of an LSI (local secondary index) in Alternator is to allow a different sort key for the existing partitions, keeping the same division into partititions. So it doesn't make sense to create an LSI on a table that did not originally have a sort key (i.e., single-item partitions). DynamoDB indeed doesn't allow this case, and Alternator forgot to forbid it - so this patch adds the missing check to the CreateTable operation. This patch also adds a test case for this, test_lsi_wrong_no_sort_key, which failed before the patch and passes after it (and also passes on DynamoDB). Also, the existing test_lsi_wrong tests for bad LSI creation attempts by mistake used a base table without a sort key - so while they encountered an error as expected, it was not the right error! So we fix that test (and split it into two tests), adding the missing sort key and exposing the actual errors that the tests were meant to expose. That test passed before this patch and also afterwards - but at least after the patch it is actually testing what it was meant to be testing. Fixes #9018. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210713123747.1012954-1-nyh@scylladb.com>	2021-07-13 15:12:01 +02:00
Nadav Har'El	1ff1c3735b	Merge 'Remove the mutation-based restriction checks' from Piotr Sarna This series unifies the interface for checking if CQL restrictions are satisfied. Previously, an additional mutation-based approach was added in the materialized views layer, but the decision was reached that it's better to have a single API based on partition slices. With that, the regular selection path gets simplified at the cost of more complicated view generation path, which is a good tradeoff. Note that in order to unify the interface, the view layer performs ugly transformations in order to adjust the input for `is_satisfied_by`. Reviewers, please take a close look at this code (`matches_view_filter`, `clustering_prefix_matches`, `partition_key_matches`), because it looks error-prone and relies on dirty internals of our serialization layer. If somebody has a better suggestion on how to do the transformation, I'm all ears. Tests: unit(release), manual(playing with materialized views with custom filters) Fixes #7215 Closes #8979 * github.com:scylladb/scylla: db,view,table: drop unneeded time point parameter cql3,expr: unify get_value cql3,expr: purge mutation-based is_satisfied_by db,view: migrate key checks from the deprecated is_satisfied_by db,view: migrate checking view filter to new is_satisfied_by db,view: add a helper result builder class db,view: move make_partition_slice helper function up	2021-07-13 12:42:13 +03:00
Piotr Sarna	a1813c9b34	db,view,table: drop unneeded time point parameter Now that restriction checking is translated to the partition-slice-style interface, checking the partition/clustering key restrictions for views can be performed without the time point parameter. The parameter is dropped from all relevant call sites.	2021-07-13 10:40:08 +02:00
Piotr Sarna	1e0880e345	cql3,expr: unify get_value Now that there's only one helper function for getting values, the call can be inlined instead.	2021-07-13 10:40:08 +02:00
Piotr Sarna	95002bb8d4	cql3,expr: purge mutation-based is_satisfied_by The interface is now unified, and all callers use the original CQL3-backed API.	2021-07-13 10:40:08 +02:00
Piotr Sarna	37fc3f4b5b	db,view: migrate key checks from the deprecated is_satisfied_by Last two users of the mutation-based is_satisfied_by function were in the partition/clustering key checks. These functions are now translated to use the original API.	2021-07-13 10:40:07 +02:00
Piotr Sarna	d6b0a8338a	db,view: migrate checking view filter to new is_satisfied_by In order to unify the interfaces, the is_satisfied_by flavor for mutations is getting deprecated. In order to be able to remove it, one of its biggest users, the matches_view_filter() function, is translated to the other variant.	2021-07-13 10:04:03 +02:00
Piotr Sarna	786db7e9a8	db,view: add a helper result builder class In order to migrate from mutation-based restriction checks, code in view.cc needs to have a way of translating results to partition-slice-based representation. A slightly simplified builder from multishard_mutation_query.cc is injected into the view code.	2021-07-13 10:04:03 +02:00
Piotr Sarna	32d87837b1	db,view: move make_partition_slice helper function up No functional changes, it will be needed for a future patch.	2021-07-13 10:04:02 +02:00
Botond Dénes	2bbfb76cc5	compaction/leveled_compaction_strategy.cc: remove unused <ranges> include Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210713063506.419658-1-bdenes@scylladb.com>	2021-07-13 10:34:22 +03:00
Benny Halevy	1db0612a06	cql3: query_processor: delete service_level_controller param The query_processor internal_state doesn't use the service_level_controller as it only needs service::client_state::for_internal_calls() Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210713055703.131099-1-bhalevy@scylladb.com>	2021-07-13 10:34:05 +03:00
Benny Halevy	90e5181192	main: defer-stop sl_controller after start and drain before storage_proxy drain_on_shutdown As per `32bcbe59ad`, the sl_controller is stopped after set_distributed_data_accessor is called. However if scylla shuts down before that happens, the sl_controller still needs to be stopped. We need to drain the service level controller before storage_proxy::drain_on_shutdown is called to prevent queries by the update loop from starting after the storage_proxy has been drained - leading to issues similar to #9009. Fixes #9014 Test: unit(dev) Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210713055606.130901-1-bhalevy@scylladb.com>	2021-07-13 10:33:29 +03:00
Avi Kivity	f0e2f31839	Merge "Implement validation compaction" from Botond " Currently, when sstables are suspected to be corrupt, one has a few bad choices on how to verify that they are indeed correct: * Obtain suspect sstable files and manually inspect them. This is problematic because it requires a scylla engineer to have direct access to data, which is not always simple or even possible due to privacy protection rules. * Run sstable scrub in abort mode. This is enough to confirm whether there is any corruption or not, but only in a binary manner. It is not possible to explore the full scope of the corruption, as the scrub will abort on the first corruption. * Run sstable scrub in non-abort mode. Although this allows for exploring the full scope of the corruption and it even gets rid of it, it is a very intrusive and potentially destructive process that some users might not be willing to even risk. This patchset offers an alternative: validation compaction. This is a completely non-intrusive compaction that reads all sstables in turn and validates their contents, logging any discrepancies it can find. It does not mutate their content, it doesn't even re-writes them. It is akin to a dry-run mode for sstable scrub. The reason it was not implemented as such is that the current compaction infrastructure assumes that input sstables are replaced by output sstables as part of the compaction process. Lifting this assumption seemed error-prone and risky, so instead I snatched the unused "Validation" compaction type for this purpose. This compaction type completely bypasses the regular compaction infrastructure but only at the low-level. It still integrates fully into compaction-manager. Fixes: #7736 Refs: https://github.com/scylladb/scylla-tools-java/issues/263 Tests: unit(dev) " * 'validation-compaction/v5' of https://github.com/denesb/scylla: test/boost/sstable_datafile_test: add test for validation compaction test/boost/sstable_datafile_test: scrub tests: extract corrupt sst writer code into function api: storage_service: expose validation compaction sstables/compaction_manager: add perform_sstable_validation() sstables/compaction_manager: rewrite_sstables(): resolve maintenance group FIXME sstables/compaction_manager: add maintenance scheduling group sstables/compaction_manager: drop _scheduling_group field sstables/compaction_manager: run_custom_job(): replace parameter name with compaction type sstables/compaction_manager: run_custom_job(): keep job function alive sstables/compaction_descriptor: compaction_options: add validation compaction type sstables/compaction: compaction_options::type(): add static assert for size of index_to_type sstables/compaction: implement validation compaction type sstables/compaction: extract compaction info creation into static method sstables/compaction: extract sstable list formatting to a class sstables/compaction: scrub_compaction: extract reporting code into static methods position_in_paritition{_view}: add has_key() mutation_fragment_stream_validator: add schema() accessor	2021-07-13 10:29:40 +03:00
Tomasz Grabiec	e947fac74c	database: Fix cache metrics not being registered Introduced in `6a6403d`. The default constructor with dummy_app_stats is also used by production code. Fixes #9012 Message-Id: <20210712221447.71902-1-tgrabiec@scylladb.com>	2021-07-13 07:50:44 +03:00
Avi Kivity	058afbcee8	build: re-enable -Wmisleading-indentation This can catch mismatches between visual indication about control flow and what the compiler actually does. Looks like boost cleaned up its indentation since it was disabled in `7f38634080` ("dist/debian: Switch to g++-7/boost-1.63 on Ubuntu 14.04/16.04"). It's unlikely to pop back since modern compilers enable it by default. Closes #9015	2021-07-12 22:29:19 +03:00
Avi Kivity	8fb4fe2f24	Merge "reader_concurrency_semaphore: relax on destroy stop checks" from Botond " Currently we `assert(_stopped)` in the destructor, but this is too harsh, especially on freshly created semaphore instances that weren't even used yet. This basically mandates semaphores to be initialized at the end of the constructor body, which is very cumbersome. Further to that, this series relaxes the checks on destroying an unstopped previously (but not currently) used semaphore. As destroying such a semaphore without stop is risky an error is still logged. Tests: unit(dev) " * 'reader-concurrency-semaphore-relax-stop-check/v1' of https://github.com/denesb/scylla: reader_concurrency_semaphore: relax _stopped check when destroying a used semaphore reader_concurrency_semaphore: allow destroying without stop() when not used yet reader_concurrency_semaphore: add permit-stats	2021-07-12 20:07:01 +02:00
Nadav Har'El	f540a69a82	Update tools/java submodule * tools/java 5013321823...79a441972d (2): > Add Zstd compressor > Settings Schema: fix typo in settings printing Adding the Zstd compressor fixes #8887.	2021-07-12 20:07:00 +02:00
Avi Kivity	4d48e1e9e1	build: avoid sanitize/coverage builds in multi-mode targets The default target (i.e. what gets executed under "ninja") excludes sanitize and coverage modes (since they're useful in special cases only), but the other multi-mode targets such as "ninja build" do not. This means that two extra modes are built. Make things consistent by always using default_modes (which default to release,dev,debug). This can be overriden using the --mode switch to configure.py. Closes #8775	2021-07-12 20:07:00 +02:00
Botond Dénes	f8004c652b	reader_concurrency_semaphore: relax _stopped check when destroying a used semaphore Further relax the conditions under which we abort on destroying a unstopped semaphore. We already allow destroying completely unused semaphores, this patch further relaxes this to allow destroying formerly used but presently not used semaphores without stopping. We still call `on_internal_error_noexcept()` even if destroying the semaphore is safe, because without calling `stop()`, destroying the semaphore depends on luck, which we shouldn't rely on.	2021-07-12 15:53:00 +03:00
Botond Dénes	750b20fd85	reader_concurrency_semaphore: allow destroying without stop() when not used yet To make it easier to construct objects with semaphore members. When the construction of such object fails, they can now just destroy their semaphore member as usual, without having to employ tricks to make sure its is stopped before.	2021-07-12 15:53:00 +03:00
Botond Dénes	03959a332b	reader_concurrency_semaphore: add permit-stats Which stores permit related stats. For now only total number of permits is maintained which is useful to determine whether the semaphore was used already or not.	2021-07-12 15:53:00 +03:00
Nadav Har'El	3fda13e20e	cql-pytest: fix sporadic failure in over-zealous TTL test We have been seeing rare failures of the cql-pytest (translated from Cassandra's unit tests) for testing TTL in secondary indexes: cassandra_tests/validation/entities/secondary_index_test.py::testIndexOnRegularColumnInsertExpiringColumn The problem is that the test writes an item with 1 second TTL, and then sleeps exactly 1.0 seconds, and expects to see the item disappear by that time. Which doesn't always happen: The problem with that assumption stems from Scylla's TTL clock ("gc_clock") being based on Seastar's lowres clock. lowres_clock only has a 10ms "granularity": The time Scylla sees when deciding whether an item expires may be up to 10ms in the past - the arbitrary point when the lowres timer happened to last run. In rare overload cases, the inaccuracy may be even grater than 10ms (if the timer got delayed by other things running). So when Scylla is asked to expire an item in 1 second - we cannot be sure it will be expired in exactly 1 second or less - the expiration can be also around 10ms later. So in this patch we change the test to sleep with more than enough margin - 1.1 seconds (i.e., 100ms more than 1 second). By that time we're sure the item must have expired. Before this patch, I saw the test failing once every few hundred runs, after this patch I ran if 2,000 times without a single failure. Fixes #9008 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210712100655.953969-1-nyh@scylladb.com>	2021-07-12 13:48:21 +03:00
Yaron Kaikov	aa7c40ba50	dist: build docker based on ubuntu 20.04 OS Today our docker image is based on Centos7 ,Since centos will be EOL in 2024 and no longer has stable release stream. let's move our docker image to be based on ubuntu 20.04 Based on the work done in https://github.com/scylladb/scylla/pull/8730, let's build our docker image based on local packages using buildah Closes #8849	2021-07-12 13:32:03 +03:00
Piotr Jastrzebski	c010cefc4d	cdc: Handle compact storage tables correctly When a table with compact storage has no regular column (only primary key columns), an artificial column of type empty is added. Such column type can't be returned via CQL so CDC Log shouldn't contain a column that reflects this artificial column. This patch does two things: 1. Make sure that CDC Log schema does not contain columns that reflect the artificial column from a base table. 2. When composing mutation to CDC Log, ommit the artificial column. Fixes #8410 Test: unit(dev) Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> Closes #8988	2021-07-12 12:17:35 +03:00
Nadav Har'El	2cc8c40c07	Merge 'Fix some issues found by gcc 11' from Avi Kivity This series fixes some issues that gcc 11 complains about. I believe all are correct errors from the standard's view. Clang accepts the changed code. Note that this is not enough to build with gcc 11, but it's a start. Closes #9007 * github.com:scylladb/scylla: utils: compact-radix-tree: detemplate array_of<> utils: compact-radix-tree: don't redefine type as member raft: avoid changing meaning of a symbol inside a class cql3: lists: catch polymorphic exceptions by reference	2021-07-12 11:17:57 +03:00
Avi Kivity	4433178ccb	utils: exceptions: convert sprint() to format() sprint() is type-unsafe, and we are converging on format(). Convert exceptions.hh to format(). Closes #9006	2021-07-12 11:17:57 +03:00
Botond Dénes	5c39a2921e	test/boost/sstable_datafile_test: add test for validation compaction Add a two new unit tests, one which cover the whole stack, and one which stresses the validation part.	2021-07-12 10:25:15 +03:00
Botond Dénes	8296759cce	test/boost/sstable_datafile_test: scrub tests: extract corrupt sst writer code into function So the two tests having this almost identical code can shared it, and so that it can be used by new tests.	2021-07-12 10:25:15 +03:00
Botond Dénes	b0ef57c833	api: storage_service: expose validation compaction	2021-07-12 10:25:15 +03:00
Botond Dénes	47283ed151	sstables/compaction_manager: add perform_sstable_validation() Exposing validation compaction on the compaction manager level. To keep things simple, validation compaction uses the custom job infrastructure.	2021-07-12 10:25:15 +03:00
Botond Dénes	4c05e5f966	sstables/compaction_manager: rewrite_sstables(): resolve maintenance group FIXME Run this compaction in the maintenance group which is now available, resolving the FIXME asking for this.	2021-07-12 10:25:15 +03:00
Botond Dénes	c8f8e9232c	sstables/compaction_manager: add maintenance scheduling group rewrite_sstables() wants to be run in the maintenance group and soon we will add another compaction type which also wants to be run in the said group. To enable this propagate the maintenance scheduling group (both CPU and IO) to the compaction manager.	2021-07-12 10:25:15 +03:00
Botond Dénes	12b8b650b7	sstables/compaction_manager: drop _scheduling_group field Use the equivalent _compaction_controller.sg() instead.	2021-07-12 10:25:15 +03:00
Botond Dénes	75bad71f0e	sstables/compaction_manager: run_custom_job(): replace parameter name with compaction type All callers use it to do operations that are closely associated with one of the standard compaction types, so no reason to pass in a custom string instead of the compaction type enum.	2021-07-12 10:25:15 +03:00
Botond Dénes	ddf2700b2e	sstables/compaction_manager: run_custom_job(): keep job function alive For the duration of the job, allowing coroutine lambdas to be used as well.	2021-07-12 10:25:15 +03:00
Botond Dénes	891921377d	sstables/compaction_descriptor: compaction_options: add validation compaction type This enables starting validation compaction via `compact_sstables()`.	2021-07-12 10:25:15 +03:00
Botond Dénes	349a3ed4e8	sstables/compaction: compaction_options::type(): add static assert for size of index_to_type To remind those adding a member to the variant, to also add the corresponding entry here.	2021-07-12 10:25:15 +03:00
Botond Dénes	a57caf5229	sstables/compaction: implement validation compaction type Validation just reads all the passed-in sstables and runs the mutation stream through a mutation fragment stream validator, logging all errors found, and finally also logging whether all the sstables are valid or not. Validation is not really a compaction as it doesn't write any output. As such it bypasses most of the usual compaction machinery, so the latter doesn't have to be adapted to this outlier. This patch only adds the implementation, but it still cannot be started via `compact_sstables()`, that will be implemented by the next patches.	2021-07-12 10:25:15 +03:00
Botond Dénes	cae8624edb	sstables/compaction: extract compaction info creation into static method To make this snippet reusable by the soon-to-be-added validation compaction as well.	2021-07-12 07:53:11 +03:00

1 2 3 4 5 ...

27313 Commits