db/view/view_builder: don't drop partition and range tombstones when resuming
The view builder builds the views from a given base table in view_builder::batch_size batches of rows. After processing this many rows, it suspends so the view builder can switch to building views for other base tables in the name of fairness. When resuming the build step for a given base table, it reuses the reader used previously (also serving the role of a snapshot, pinning sstables read from). The compactor however is created anew. As the reader can be in the middle of a partition, the view builder injects a partition start into the compactor to prime it for continuing the partition. This however only included the partition-key, crucially missing any active tombstones: partition tombstone or -- since the v2 transition -- active range tombstone. This can result in base rows covered by either of this to be resurrected and the view builder to generate view updates for them. This patch solves this by using the detach-state mechanism of the compactor which was explicitly developed for situations like this (in the range scan code) -- resuming a read with the readers kept but the compactor recreated. Also included are two test cases reproducing the problem, one with a range tombstone, the other with a partition tombstone. Fixes: #11668 Closes #11671
This commit is contained in:
committed by
Nadav Har'El
parent
2c744628ae
commit
5621cdd7f9
@@ -2254,17 +2254,20 @@ public:
|
||||
// Called in the context of a seastar::thread.
|
||||
void view_builder::execute(build_step& step, exponential_backoff_retry r) {
|
||||
gc_clock::time_point now = gc_clock::now();
|
||||
auto consumer = compact_for_query_v2<view_builder::consumer>(
|
||||
auto compaction_state = make_lw_shared<compact_for_query_state_v2>(
|
||||
*step.reader.schema(),
|
||||
now,
|
||||
step.pslice,
|
||||
batch_size,
|
||||
query::max_partitions,
|
||||
view_builder::consumer{*this, step, now});
|
||||
if (auto mfp = step.reader.peek().get(); mfp && !mfp->is_partition_start()) {
|
||||
consumer.consume_new_partition(step.current_key); // Initialize the state in case we're resuming a partition
|
||||
}
|
||||
query::max_partitions);
|
||||
auto consumer = compact_for_query_v2<view_builder::consumer>(compaction_state, view_builder::consumer{*this, step, now});
|
||||
auto built = step.reader.consume_in_thread(std::move(consumer));
|
||||
if (auto ds = std::move(*compaction_state).detach_state()) {
|
||||
if (ds->current_tombstone) {
|
||||
step.reader.unpop_mutation_fragment(mutation_fragment_v2(*step.reader.schema(), step.reader.permit(), std::move(*ds->current_tombstone)));
|
||||
}
|
||||
step.reader.unpop_mutation_fragment(mutation_fragment_v2(*step.reader.schema(), step.reader.permit(), std::move(ds->partition_start)));
|
||||
}
|
||||
|
||||
_as.check();
|
||||
|
||||
|
||||
Reference in New Issue
Block a user