scylla

Author	SHA1	Message	Date
Benny Halevy	5d7c80c148	view_update_generator::start: fix indentation Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-12-17 12:20:20 +02:00
Benny Halevy	02784f46b9	view_update_generator: handle errors when processing sstable Consumer may throw, in this case, break from the loop and retry. move_sstable_from_staging_in_thread may theoretically throw too, ignore the error in this case since the sstable was already processed, individual move failures are already ignored and moving from staging will be retried upon restart. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-12-17 12:20:20 +02:00
Benny Halevy	0d2a7111b2	view_update_generator: sstable_with_table: std::move constructor args Just a small optimization. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2019-12-17 12:19:55 +02:00
Piotr Sarna	9c5a5a5ac2	treewide: add names to semaphores By default, semaphore exceptions bring along very little context: either that a semaphore was broken or that it timed out. In order to make debugging easier without introducing significant runtime costs, a notion of named semaphore is added. A named semaphore is simply a semaphore with statically defined name, which is present in its errors, bringing valuable context. A semaphore defined as: auto sem = semaphore(0); will present the following message when it breaks: "Semaphore broken" However, a named semaphore: auto named_sem = named_semaphore(0, named_semaphore_exception_factory{"io_concurrency_sem"}); will present a message with at least some debugging context: "Semaphore broken: io_concurrency_sem" It's not much, but it would really help in pinpointing bugs without having to inspect core dumps. At the same time, it does not incur any costs for normal semaphore operations (except for its creation), but instead only uses more CPU in case an error is actually thrown, which is considered rare and not to be on the hot path. Refs #4999 Tests: unit(dev), manual: hardcoding a failure in view building code	2019-11-26 15:14:21 +02:00
Kamil Braun	2ada219f2c	view: generalize create_virtual_column and maybe_make_virtual to UDTs.	2019-10-25 12:04:44 +02:00
Kamil Braun	bbdb438d89	collection_mutation: easier (de)serialization of collection_mutation(s). `collection_type_impl::serialize_mutation_form` became `collection_mutation(_view)_description::serialize`. Previously callers had to cast their data_type down to collection_type to use serialize_mutation_form. Now it's done inside `serialize`. In the future `serialize` will be generalized to handle UDTs. `collection_type_impl::deserialize_mutation_form` became a free-standing function `deserialize_collection_mutation` with similar benefits. Actually, no one needs to call this function manually because of the next paragraph. A common pattern consisting of linearizing data inside a `collection_mutation_view` followed by calling `deserialize_mutation_form` has been abstracted out as a `with_deserialized` method inside collection_mutation_view. serialize_mutation_form_only_live was removed, because it hadn't been used anywhere.	2019-10-25 10:42:58 +02:00
Kamil Braun	b1d16c1601	types: move collection_type_impl::mutation(_view) out of collection_type_impl. collection_type_impl::mutation became collection_mutation_description. collection_type_impl::mutation_view became collection_mutation_view_description. These classes now reside inside collection_mutation.hh. Additional documentation has been written for these classes. Related function implementations were moved to collection_mutation.cc. This makes it easier to generalize these classes to non-frozen UDTs in future commits. The new names (together with documentation) better describe their purpose.	2019-10-25 10:19:45 +02:00
Piotr Sarna	9e98b51aaa	view: fix view_info select statement for local indexes Calculating the select statement for given view_info structure used to work fine, but once local indexes were introduced, a subtle bug appeared: the legacy token column does not exist in local indexes and a valid clustering key column was omitted instead. That results in potentially incorrect partition slices being used later in read-before-write. There's a long term plan for removing select_statement from view info altogether, but nonetheless the bug needs to be fixed first.	2019-10-14 17:14:19 +02:00
Kamil Braun	ef9d5750c8	view: fix bug in virtual columns. When creating a virtual column of non-frozen map type, the wrong type was used for the map's keys. Fixes #5165.	2019-10-11 20:47:06 +03:00
Piotr Sarna	feec3825aa	view: degrade shutdown bookkeeping update failures log to warn Currently, if updating bookkeeping operations for view building fails, we log the error message and continue. However, during shutdown, some errors are more likely to happen due to existing issues like #4384. To differentiate actual errors from semi-expected errors during shutdown, the latter are now logged with a warning level instead of error. Fixes #4954	2019-09-16 10:13:06 +03:00
Piotr Sarna	23c891923e	main: make sure view_builder doesn't propagate semaphore errors Stopping services which occurs in a destructor of deferred_action should not throw, or it will end the program with terminate(). View builder breaks a semaphore during its shutdown, which results in propagating a broken_semaphore exception, which in turn results in throwing an exception during stop().get(). In order to fix that issue, semaphore exceptions are explicitly ignored, since they're expected to appear during shutdown. Fixes #4875	2019-09-01 11:59:57 +03:00
Botond Dénes	136fc856c5	treewide: silence discarded future warnings for questionable discards This patch silences the remaining discarded future warnings, those where it cannot be determined with reasonable confidence that this was indeed the actual intent of the author, or that the discarding of the future could lead to problems. For all those places a FIXME is added, with the intent that these will be soon followed-up with an actual fix. I deliberately haven't fixed any of these, even if the fix seems trivial. It is too easy to overlook a bad fix mixed in with so many mechanical changes.	2019-08-26 19:28:43 +03:00
Botond Dénes	fddd9a88dd	treewide: silence discarded future warnings for legit discards This patch silences those future discard warnings where it is clear that discarding the future was actually the intent of the original author, and they took the necessary precautions (handling errors). The patch also adds some trivial error handling (logging the error) in some places that were lacking it but otherwise look ok. No functional changes.	2019-08-26 18:54:44 +03:00
Piotr Sarna	3cc5a04301	db,view: wrap view update generation in stream scheduling group Generating view updates is used by streaming, so the service itself should also run under the matching scheduling group.	2019-08-20 00:24:50 +02:00
Piotr Sarna	3c5dd94306	view: remove unused token_for function The function was only used once in code removed in this series.	2019-07-19 11:58:42 +02:00
Piotr Sarna	6a6871aa0e	view: check for computed columns in view Currently, having a 'computed' column in view update generation indicates that token value needs to be generated and assigned to it.	2019-07-19 11:58:42 +02:00
Piotr Sarna	85a3a4b458	view: ignore duplicated key entries in progress virtual reader Build progress virtual reader uses Scylla-specific scylla_views_builds_in_progress table in order to represent legacy views_builds_in_progress rows. The Scylla-specific table contains additional cpu_id clustering key part, which is trimmed before returning it to the user. That may cause duplicated clustering row fragments to be emitted by the reader, which may cause undefined behaviour in consumers. The solution is to keep track of previous clustering keys for each partition and drop fragments that would cause duplication. That way if any shard is still building a view, its progress will be returned, and if many shards are still building, the returned value will indicate the progress of a single arbitrary shard. Fixes #4524 Tests: unit(dev) + custom monotonicity checks from <tgrabiec@scylladb.com>	2019-06-11 13:01:31 +02:00
Piotr Sarna	cf8d2a5141	Revert "view: cache is_index for view pointer" This reverts commit `dbe8491655`. Caching the value was not done in a correct manner, which resulted in longevity tests failures. Fixes #4478 Branches: 3.1 Message-Id: <762ca9db618ca2ed7702372fbafe8ecd193dcf4d.1557129652.git.sarna@scylladb.com>	2019-05-06 11:45:46 +03:00
Duarte Nunes	ded9221187	db/view: Apply tracked tombstones for new updates When generating view updates for base mutations when no pre-existing data exists, we were forgetting to apply the tracked tombstones. Fixes #4321 Tests: unit(dev) Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2019-03-27 12:01:39 +00:00
Piotr Sarna	a7602bd2f1	database: add global view update stats Currently view update metrics are only per-table, but per-table metrics are not always enabled. In order to be able to see the number of generated view updates in all cases, global stats are added. Fixes #4221 Message-Id: <e94c27c530b2d7d262f76d03937e7874d674870a.1552552016.git.sarna@scylladb.com>	2019-03-14 12:04:18 +00:00
Piotr Sarna	5f85a7a821	db,view: fix virtual columns liveness checks When looking for optimization paths, columns selected in a view are checked against multiple conditions - unfortunately virtual columns were erroneously skipped from that check, which resulted in ignoring their TTLs. That can lead to overoptimizing and not including vital liveness info into view rows, which can then result in row disappearing too early.	2019-02-28 10:47:19 +01:00
Piotr Sarna	bd52e05ae2	view: minimize generated view updates for unselected columns In some cases generating view updates for columns that were not selected in CREATE VIEW statement is redundant - it is the case when the update will not influence row liveness in any way. Currently, these cases are optimized out: - row marker is live and only unselected columns were updated; - row marker is not live and only unselected columns were updated, and in the process nothing was created or deleted and there was no TTL involved.	2019-02-20 14:05:27 +01:00
Piotr Sarna	dbe8491655	view: cache is_index for view pointer It's detrimental to keep querying index manager whether a view is backing a secondary index every time, so this value is cached at construct time. At the same time, this value is not simply passed to view_info when being created in secondary index manager, in order to decouple materialized view logic from secondary indexes as much as possible (the sole existence of is_index() is bad enough).	2019-02-20 12:52:32 +01:00
Nadav Har'El	05db7d8957	Materialized views: name the "batch_memory_max" constant Give the constant 1024*1024 introduced in an earlier commit a name, "batch_memory_max", and move it from view.cc to view_builder.hh. It now resides next to the pre-existing constant that controlled how many rows were read in each build step, "batch_size". Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20190217100222.15673-1-nyh@scylladb.com>	2019-02-17 13:28:16 +00:00
Nadav Har'El	fec562ec8f	Materialized views: limit size of row batching during bulk view building The bulk materialized-view building process (when adding a materialized view to a table with existing data) currently reads the base table in batches of 128 (view_builder::batch_size) rows. This is clearly better than reading entire partitions (which may be huge), but still, 128 rows may grow pretty large when we have rows with large strings or blobs, and there is no real reason to buffer 128 rows when they are large. Instead, when the rows we read so far exceed some size threshold (in this patch, 1MB), we can operate on them immediately instead of waiting for 128. As a side-effect, this patch also solves another bug: In the worst case, all the base rows of one batch may be written into one output view partition, in one mutation. But there is a hard limit on the size of one mutation (commitlog_segment_size_in_mb, by default 32MB), so we cannot allow the batch size to exceed this limit. By not batching further after 1MB, we avoid reaching this limit when individual rows do not reach it but 128 of them do. Fixes #4213. This patch also includes a unit test reproducing #4213, and demonstrating that it is now solved. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20190214093424.7172-1-nyh@scylladb.com>	2019-02-14 12:04:40 +02:00
Piotr Sarna	9a6261ca27	db,view: add updating view_building_paused statistics Each time view building is paused because of a connection failure, the view_building_paused metric is bumped.	2019-01-28 09:38:42 +01:00
Piotr Sarna	e30cf22956	db,view: add allow_hints parameter to mutate_MV Mutating MV function can now accept a parameter whether hints should be allowed during sending mutations to endpoints.	2019-01-28 09:38:42 +01:00
Piotr Sarna	e0fe9ce2c0	storage_proxy: add allow_hints parameter to send_to_endpoint With hints allowed, send_to_endpoint will leverage consistency level ANY to send data. Otherwise, it will use the default - cl::ONE.	2019-01-28 09:38:41 +01:00
Piotr Sarna	02d88de082	db,view: add consuming units in staging table registration View update generator service can accept sstables even before it starts, but it should still acknowledge the number of waiters in the semaphore. Reported-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <fcaa0f2884ebb4d34d1716e9e1cfed0642b4b85d.1547661048.git.sarna@scylladb.com>	2019-01-16 18:05:17 +00:00
Duarte Nunes	04a14b27e4	Merge 'Add handling staging sstables to /upload dir' from Piotr " This series adds generating view updates from sstables added through /upload directory if their tables have accompanying materialized views. Said sstables are left in /upload directory until updates are generated from them and are treated just like staging sstables from /staging dir. If there are no views for a given table, sstables are simply moved from /upload dir to datadir without any changes. Tests: unit (release) " * 'add_handling_staging_sstables_to_upload_dir_5' of https://github.com/psarna/scylla: all: rename view_update_from_staging_generator distributed_loader: fix indentation service: add generating view updates from uploaded sstables init: pass view update generator to storage service sstables: treat sstables in upload dir as needing view build sstables,table: rename is_staging to requires_view_building distributed_loader: use proper directory for opening SSTable db,view: make throttling optional for view_update_generator	2019-01-15 18:19:27 +00:00
Piotr Sarna	0eb703dc80	all: rename view_update_from_staging_generator The new name, view_update_generator, is both more concise and correct, since we now generate from directories other than "/staging".	2019-01-15 17:31:47 +01:00
Piotr Sarna	beb4836726	db,view: make throttling optional for view_update_generator Currently registering new view updates is throttled by a semaphore, which makes sense during stream sessions in order to avoid overloading the queue. Still, registration also occurs during initialization, where it makes little sense to wait on a semaphore, since view update generator might not have started at all yet.	2019-01-15 16:47:01 +01:00
Piotr Sarna	b9203ec4f8	view: wait for stream sessions to finish before view building During streaming, there's a race between streamed sstables and view creation, which might result in some tables not being used to generate view updates, even though they should. That happens when the decision about view update path for a table is done before view creation, but after already receiving some sstables via streaming. These will not be used in view building even though they should. Hence, a phaser is used to make the view builder wait for all ongoing stream sessions for a table to finish before proceeding with build steps. Refs #4032	2019-01-15 09:36:55 +01:00
Duarte Nunes	fa2b0384d2	Replace std::experimental types with C++17 std version. Replace stdx::optional and stdx::string_view with the C++ std counterparts. Some instances of boost::variant were also replaced with std::variant, namely those that called seastar::visit. Scylla now requires GCC 8 to compile. Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20190108111141.5369-1-duarte@scylladb.com>	2019-01-08 13:16:36 +02:00
Piotr Sarna	9d46715613	streaming,view: move view update checks to separate file Checking if view update path should be used for sstables is going to be reused in row level repair code, so relevant functions are moved to a separate header.	2019-01-03 08:31:40 +01:00
Duarte Nunes	f41d13f38c	db/view/view_update_from_staging_generator: Break semaphore on stop() This avoids having fibers waiting on _registration_sem without ever being notified. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-12-29 12:55:04 +00:00
Duarte Nunes	4974addc5c	db/view/view_update_from_staging_generator: Restore formatting Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-12-29 12:55:02 +00:00
Duarte Nunes	201196130d	db/view/view_update_from_staging_generator: Avoid creating more than one fiber If view_update_from_staging_generator::maybe_generate_view_updates() is called before view_update_from_staging_generator::start(), as can happen in main.cc, then we can potentially create more than one fiber, which leads to corrupted state and conflicting operations. To avoid this, use just one fiber and be explicit about notifying it that more work is needed, by leveraging a condition-variable. Fixes #4021 Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-12-29 12:52:51 +00:00
Avi Kivity	0c0cc66ee7	system_keyspace, view: reduce interdependencies system_keyspace is an implementation detail for most of its users, not part of the interface, as it's only used to store internal data. Therefore, including it in a header file causes unneeded dependencies. This patch removes a dependency between views and system_keyspace.hh by moving view_name and view_build_progress into a separate header file, and using forward declarations where possible. This allows us to remove an inclusion of system_keyspace.hh from a header file (the last one), so that further changes to system_keyspace.hh will cause fewer recompilations. Message-Id: <20181228215736.11493-1-avi@scylladb.com>	2018-12-29 12:12:15 +00:00
Duarte Nunes	2bd76f8fc5	db/view: Introduce node_update_backlog class This class is an atomic view update backlog representation, safe to update from multiple shards. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-12-19 22:38:30 +00:00
Duarte Nunes	12ce517242	db/view: Add view_update_backlog The view update backlog represents the pending view data that a base replica maintains. It is the maximum of the memory backlog - how much memory pending view updates are consuming - and the disk backlog - how much view hints are consuming. The size of a backlog is relative to its maximum size. We will use this class to represent a base replica's view update backlog at the coordinator. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-12-19 22:38:30 +00:00
Duarte Nunes	a3d30ea99a	db/view: Propagate acquired semaphore units to mutate_MV() Propagate acquired semaphore units to mutate_MV() to allow the semaphore to be incrementally signalled as view updates are processed by view replicas. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-12-19 22:38:29 +00:00
Duarte Nunes	2753cfee88	db/view: Generate view updates as frozen_mutations Working in terms of frozen_mutations allows us to account more precisely for the memory that pending view updates consume at the storage_proxy layer. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-12-19 22:38:29 +00:00
Duarte Nunes	715da6fd6b	db/view: Reserve vector space in mutate_MV() Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-12-19 22:38:29 +00:00
Duarte Nunes	5d011eb61f	db/view: Cleanup mutate_MV() In particular, extract out the logic updating the stats in case of a failed update. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2018-12-19 22:38:29 +00:00
Botond Dénes	1865e5da41	treewide: remove include database.hh from headers where possible Many headers don't really need to include database.hh, the include can be replaced by forward declarations and/or including the actually needed headers directly. Some headers don't need this include at all. Each header was verified to be compilable on its own after the change, by including it into an empty `.cc` file and compiling it. `.cc` files that used to get `database.hh` through headers that no longer include it were changed to include it themselves.	2018-12-14 08:03:57 +02:00
Paweł Dziepak	9024187222	partition_slice: use small_vector for column_ids	2018-12-06 14:21:04 +00:00
Duarte Nunes	6fbf792777	db/view/view_builder: Don't timeout waiting for view to be built Remove the timeout argument to db::view::view_builder::wait_until_built(), a test-only function to wait until a given materialized view has finished building. This change is motivated by the fact that some tests running on slow environments will time out. Instead of incrementally increasing the timeout, remove it completely since tests are already run under an exterior timeout. Fixes #3920 Tests: unit release(view_build_test, view_schema_test) Signed-off-by: Duarte Nunes <duarte@scylladb.com> Message-Id: <20181115173902.19048-1-duarte@scylladb.com>	2018-11-15 19:41:43 +02:00
Piotr Sarna	fc7267c797	db/view: add view_update_from_staging_generator service A shardable service for generating mv updates after restarts is added.	2018-11-13 15:01:52 +01:00
Piotr Sarna	ed05d91adc	db/view: add view updating consumer This consumer is used to generate and push view replica updates from read mutations.	2018-11-13 14:54:39 +01:00


488 Commits