sstables: Fix race when loading checksum component
`read_checksum()` loads the checksum component from disk and stores a non-owning reference in the shareable components. To avoid loading the same component twice, the function has an early return statement. However, this does not guarantee atomicity - two fibers or threads may load the component and update the shareable components concurrently. This can lead to use-after-free situations when accessing the component through the shareable components, since the reference stored there is non-owning. This can happen when multiple compaction tasks run on the same SSTable (e.g., regular compaction and scrub-validate). Fix this by not updating the reference in shareable components, if a reference is already in place. Instead, create an owning reference to the existing component for the current fiber. This is less efficient than using a mutex, since the component may be loaded multiple times from disk before noticing the race, but no locks are used for any other SSTable component either. Also, this affects uncompressed SSTables, which are not that common. Fixes #23728. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com> Closes scylladb/scylladb#23872
This commit is contained in:
committed by
Botond Dénes
parent
2739eb49fd
commit
eaa2ce1bb5
@@ -2600,7 +2600,7 @@ future<lw_shared_ptr<checksum>> sstable::read_checksum() {
|
||||
co_return nullptr;
|
||||
}
|
||||
auto checksum = make_lw_shared<sstables::checksum>();
|
||||
co_await do_read_simple(component_type::CRC, [checksum, this] (version_types v, file crc_file) -> future<> {
|
||||
co_await do_read_simple(component_type::CRC, [&checksum, this] (version_types v, file crc_file) -> future<> {
|
||||
file_input_stream_options options;
|
||||
options.buffer_size = 4096;
|
||||
|
||||
@@ -2627,7 +2627,14 @@ future<lw_shared_ptr<checksum>> sstable::read_checksum() {
|
||||
|
||||
co_await crc_stream.close();
|
||||
maybe_rethrow_exception(std::move(ex));
|
||||
_components->checksum = checksum->weak_from_this();
|
||||
if (!_components->checksum) {
|
||||
_components->checksum = checksum->weak_from_this();
|
||||
} else {
|
||||
// Race condition: Another fiber/thread has called `read_checksum()`
|
||||
// while we were loading the component from disk. Discard our local
|
||||
// copy and use theirs.
|
||||
checksum = _components->checksum->shared_from_this();
|
||||
}
|
||||
});
|
||||
|
||||
co_return std::move(checksum);
|
||||
|
||||
Reference in New Issue
Block a user