scylla

Author	SHA1	Message	Date
Botond Dénes	d66b07823b	db/view/view_updating_consumer: account for the size of mutations All partitions will have a corresponding mutation object in the buffer. These objects have non-negligible sizes, yet the consumer did not bump the _buffer_size when a new partition was consumed. This resulted in empty partitions not moving the _buffer_size at all, and thus they could accumulate without bounds in the buffer, never triggering a flush just by themselves. We have recently seen this causing OOM. This patch fixes that by bumping the _buffer_size with the size of the freshly created mutation object.	2023-07-26 03:07:25 -04:00
Michał Chojnowski	ac29b6f198	view_updating_consumer: make buffer limit a variable The limit doesn't change at runtime, but this patch makes it a variable for unit testing purposes.	2023-07-05 17:33:47 +02:00
Michał Chojnowski	5ad0846bff	view: fix range tombstone handling on flushes in view_updating_consumer View update routines accept `mutation` objects. But what comes out of staging sstable readers is a stream of mutation_fragment_v2 objects. To build view updates after a repair/streaming, we have to convert the fragment stream into `mutation`s. This is done by piping the stream to mutation_rebuilder_v2. To keep memory usage limited, the stream for a single partition might have to be split into multiple partial `mutation` objects. view_update_consumer does that, but in an improper way -- when the split/flush happens inside an active range tombstone, the range tombstone isn't closed properly. This is illegal, and triggers an internal error. This patch fixes the problem by closing the active range tombstone (and reopening in the same position in the next `mutation` object). The tombstone is closed just after the last seen clustered position. This is not necessary for correctness -- for example we could delay all processing of the range tombstone until we see its end bound -- but it seems like the most natural semantic. Fixes #14503	2023-07-04 20:33:21 +02:00
Pavel Emelyanov	2652dffd89	view: Capture v.u.generator on view_updating_consumer lambda The consumer is in fact pushing the updates and _that_'s the component that would really need the view_update_generator at hand. The consumer is created from the generator itself so no troubles getting the pointer. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-03-29 14:10:55 +03:00
Avi Kivity	69a385fd9d	Introduce schema/ module Schema related files are moved there. This excludes schema files that also interact with mutations, because the mutation module depends on the schema. Those files will have to go into a separate module. Closes #12858	2023-02-15 11:01:50 +02:00
Avi Kivity	c5e4bf51bd	Introduce mutation/ module Move mutation-related files to a new mutation/ directory. The names are kept in the global namespace to reduce churn; the names are unambiguous in any case. mutation_reader remains in the readers/ module. mutation_partition_v2.cc was missing from CMakeLists.txt; it's added in this patch. This is a step forward towards librarization or modularization of the source base. Closes #12788	2023-02-14 11:19:03 +02:00
Mikołaj Sielużycki	6f1b6da68a	compile: Fix headers so that *-headers targets compile cleanly. Closes #10273	2022-03-25 16:19:26 +02:00
Mikołaj Sielużycki	1d84a254c0	flat_mutation_reader: Split readers by file and remove unnecessary includes. The flat_mutation_reader files were conflated and contained multiple readers, which were not strictly necessary. Splitting optimizes both iterative compilation times, as touching rarely used readers doesn't recompile large chunks of codebase. Total compilation times are also improved, as the sizes of flat_mutation_reader.hh and flat_mutation_reader_v2.hh have been reduced and those files are included by many files in the codebase. With changes real 29m14.051s user 168m39.071s sys 5m13.443s Without changes real 30m36.203s user 175m43.354s sys 5m26.376s Closes #10194	2022-03-14 13:20:25 +02:00
Botond Dénes	05c48ee0cc	db/view/view_updating_consumer: migrate to v2 Not a completely mechanical transition. The consumer has to generate its mutation via a mutation_rebuilder_v2 as mutation fragment v2 cannot be applied to mutations directly yet.	2022-02-21 12:29:24 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Avi Kivity	bbad8f4677	replica: move ::database, ::keyspace, and ::table to replica namespace Move replica-oriented classes to the replica namespace. The main classes moved are ::database, ::keyspace, and ::table, but a few ancillary classes are also moved. There are certainly classes that should be moved but aren't (like distributed_loader) but we have to start somewhere. References are adjusted treewide. In many cases, it is obvious that a call site should not access the replica (but the data_dictionary instead), but that is left for separate work. scylla-gdb.py is adjusted to look for both the new and old names.	2022-01-07 12:04:38 +02:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Nadav Har'El	64a4e5e059	cross-tree: reduce dependency on db/config.hh and database.hh Every time db/config.hh is modified (e.g., to add a new configuration option), 110 source files need to be recompiled. Many of those 110 didn't really care about configuration options, and just got the dependency accidentally by including some other header file. In this patch, I remove the include of "db/config.hh" from all header files. It is only needed in source files - and header files only need forward declarations. In some cases, source files were missing certain includes which they got incidentally from db/config.hh, so I had to add these includes explicitly. After this patch, the number of source files that get recompiled after a change to db/config.hh goes down from 110 to 45. It also means that 65 source files now compile faster because they don't include db/config.hh and whatever it included. Additionally, this patch also eliminates a few unnecessary inclusions of database.hh in other header files, which can use a forward declaration or database_fwd.hh. Some of the source files including one of those header files relied on one of the many header files brought in by database.hh, so we need to include those explicitly. In view_update_generator.hh something interesting happened - it needs database.hh because of code in the header file, but only included database_fwd.hh, and the only reason this worked was that the files including view_update_generator.hh already happened to unnecessarily include database.hh. So we fix that too. Refs #1 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210505121830.964529-1-nyh@scylladb.com>	2021-05-05 15:24:25 +03:00
Nadav Har'El	58e275e362	cross-tree: reduce dependency on db/config.hh and database.hh Every time db/config.hh is modified (e.g., to add a new configuration option), 110 source files need to be recompiled. Many of those 110 didn't really care about configuration options, and just got the dependency accidentally by including some other header file. In this patch, I remove the include of "db/config.hh" from all header files. It is only needed in source files - and header files only need forward declarations. In some cases, source files were missing certain includes which they got incidentally from db/config.hh, so I had to add these includes explicitly. After this patch, the number of source files that get recompiled after a change to db/config.hh goes down from 110 to 45. It also means that 65 source files now compile faster because they don't include db/config.hh and whatever it included. Additionally, this patch also eliminates a few unnecessary inclusions of database.hh in other header files, which can use a forward declaration or database_fwd.hh. Some of the source files including one of those header files relied on one of the many header files brought in by database.hh, so we need to include those explicitly. In view_update_generator.hh something interesting happened - it needs database.hh because of code in the header file, but only included database_fwd.hh, and the only reason this worked was that the files including view_update_generator.hh already happened to unnecessarily include database.hh. So we fix that too. Refs #1 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210505102111.955470-1-nyh@scylladb.com>	2021-05-05 13:23:00 +03:00
Botond Dénes	6ca0464af5	mutation_fragment: add schema and permit We want to start tracking the memory consumption of mutation fragments. For this we need schema and permit during construction, and on each modification, so the memory consumption can be recalculated and passed to the permit. In this patch we just add the new parameters and go through the insane churn of updating all call sites. They will be used in the next patch.	2020-09-28 11:27:23 +03:00
Botond Dénes	566e31a5ac	db/view: view_updating_consumer: allow passing custom update pusher So that tests can test the `view_update_consumer` in isolation, without having to set up the whole database machinery. In addition to less infrastructure setup, this allows more direct checking of mutations pushed for view generation.	2020-07-20 11:23:39 +03:00
Botond Dénes	0166f97096	db/view: view_update_generator: make staging reader evictable The view update generation process creates two readers. One is used to read the staging sstables, the data which needs view updates to be generated for, and another reader for each processed mutation, which reads the current value (pre-image) of each row in said mutation. The staging reader is created first and is kept alive until all staging data is processed. The pre-image reader is created separately for each processed mutation. The staging reader is not restricted, meaning it does not wait for admission on the relevant reader concurrency semaphore, but it does register its resource usage on it. The pre-image reader however is restricted. This creates a situation, where the staging reader possibly consumes all resources from the semaphore, leaving none for the later created pre-image reader, which will not be able to start reading. This will block the view building process meaning that the staging reader will not be destroyed, causing a deadlock. This patch solves this by making the staging reader restricted and making it evictable. To prevent thrashing -- evicting the staging reader after reading only a really small partition -- we only make the staging reader evictable after we have read at least 1MB worth of data from it.	2020-07-20 11:23:39 +03:00
Glauber Costa	1f9c37fb5e	view_updating_consumer: move reference to a pointer It is currently not possible to wrap the view_updating_consumer in an std::optional. I intend to do it to allow for compactions to optionally generate view updates. The reason for that is that view_updating_consumer has a reference as a member, which makes the move assignment operator not be implicitly generated. This patch fixes it by keeping a pointer instead of a reference. Signed-off-by: Glauber Costa <glauber@scylladb.com> Message-Id: <20200421123648.8328-1-glauber@scylladb.com>	2020-04-22 10:05:35 +03:00
Glauber Costa	4e6400293e	staging: potentially read many SSTables at the same time There is no reason to read a single SSTable at a time from the staging directory. Moving SSTables from staging directory essentially involves scanning input SSTables and creating new SSTables (albeit in a different directory). We have a mechanism that does that: compactions. In a follow up patch, I will introduce a new specialization of compaction that moves SSTables from staging (potentially compacting them if there are plenty). In preparation for that, some signatures have to be changed and the view_updating_consumer has to be more compaction friendly. Meaning: - Operating with an sstable vector - taking a table reference, not a database Because this code is a bit fragile and the reviewer set is fundamentally different from anything compaction related, I am sending this separately Signed-off-by: Glauber Costa <glauber@scylladb.com>	2020-04-15 11:26:44 -04:00
Pavel Emelyanov	4fa12f2fb8	header: De-bloat schema.hh The header sits in many other headers, but there's a handy schema_fwd.hh that's tiny and contains needed declarations for other headers. So replace schema.hh with schema_fwd.hh in most of the headers (and remove completely from some). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20200303102050.18462-1-xemul@scylladb.com>	2020-03-03 11:34:00 +01:00
Pavel Emelyanov	e2ec5eecf6	view_update: Do not need storage_proxy The view_update_generator accepts (and keeps) database and storage_proxy, the latter is only needed to initialize the view_updating_consumer which, in turn, only needs it to get database from (to find column family). This can be relaxed by providing the database from _generator to _consumer directly, without using the storage_proxy in between. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20200207112427.18419-1-xemul@scylladb.com>	2020-02-07 13:30:01 +02:00
Botond Dénes	1865e5da41	treewide: remove include database.hh from headers where possible Many headers don't really need to include database.hh, the include can be replaced by forward declarations and/or including the actually needed headers directly. Some headers don't need this include at all. Each header was verified to be compilable on its own after the change, by including it into an empty `.cc` file and compiling it. `.cc` files that used to get `database.hh` through headers that no longer include it were changed to include it themselves.	2018-12-14 08:03:57 +02:00
Piotr Sarna	ed05d91adc	db/view: add view updating consumer This consumer is used to generate and push view replica updates from read mutations.	2018-11-13 14:54:39 +01:00

23 Commits
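
The buffer accounting problem described in commit d66b07823b can be illustrated with a small toy model. This is only a sketch under stated assumptions, not the actual view_updating_consumer code: `toy_mutation`, `toy_buffering_consumer`, and `buffered_after_empty_partitions` are hypothetical names made up for this example, and `sizeof(toy_mutation)` stands in for whatever size the real patch adds for a freshly created mutation object. The model flushes once `_buffer_size` reaches the limit, and shows that empty partitions never trigger a flush unless the per-partition object size is counted.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a per-partition mutation object (not Scylla's `mutation`).
struct toy_mutation {
    std::string key;
    std::vector<std::string> rows;
};

// Toy buffering consumer. Member names echo the commit message
// (_buffer_size, buffer limit); everything else is illustrative.
class toy_buffering_consumer {
    std::vector<toy_mutation> _buffer;
    std::size_t _buffer_size = 0;
    std::size_t _buffer_size_limit;
    bool _account_for_mutation_objects;  // false models the pre-fix behaviour

public:
    toy_buffering_consumer(std::size_t limit, bool account_for_mutation_objects)
        : _buffer_size_limit(limit)
        , _account_for_mutation_objects(account_for_mutation_objects) {}

    void consume_new_partition(std::string key) {
        _buffer.push_back(toy_mutation{std::move(key), {}});
        if (_account_for_mutation_objects) {
            // The fix: the new object occupies memory even if it stays empty.
            _buffer_size += sizeof(toy_mutation);
        }
    }

    void consume_row(std::string row) {
        assert(!_buffer.empty());
        _buffer_size += row.size();
        _buffer.back().rows.push_back(std::move(row));
    }

    void consume_end_of_partition() {
        // Flush once the accounted size reaches the limit.
        if (_buffer_size >= _buffer_size_limit) {
            _buffer.clear();
            _buffer_size = 0;
        }
    }

    std::size_t buffered_partitions() const { return _buffer.size(); }
};

// Feeds `n` empty partitions and reports how many are still sitting in the buffer.
std::size_t buffered_after_empty_partitions(std::size_t limit, bool account, std::size_t n) {
    toy_buffering_consumer c(limit, account);
    for (std::size_t i = 0; i < n; ++i) {
        c.consume_new_partition("pk" + std::to_string(i));
        c.consume_end_of_partition();
    }
    return c.buffered_partitions();
}
```

With accounting disabled, every empty partition stays buffered no matter how many arrive, which is the unbounded growth the commit message describes. With accounting enabled and a limit of ten objects' worth of bytes, the buffer is flushed after every tenth partition.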