From 1b577e0414547e775ef06370d1ed44c73492d517 Mon Sep 17 00:00:00 2001
From: Asias He
Date: Thu, 13 Jul 2023 12:29:38 +0800
Subject: [PATCH] repair: Release permit earlier when the repair_reader is done

Consider

- 10 repair instances take all the 10 _streaming_concurrency_sem
- repair readers are done but the permits are not released since they
  are waiting for view update _registration_sem
- view updates trying to take the _streaming_concurrency_sem to make
  progress of view update so it could release _registration_sem, but it
  could not take _streaming_concurrency_sem since the 10 repair
  instances have taken them
- deadlock happens

Note, when the readers are done, i.e., reaching EOS, the repair reader
replaces the underlying (evictable) reader with an empty reader. The
empty reader is not evictable, so the resources cannot be forcibly
released.

To fix, release the permits manually as soon as the repair readers are
done even if the repair job is waiting for _registration_sem.

Fixes #14676

Closes #14677
---
 repair/row_level.cc | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/repair/row_level.cc b/repair/row_level.cc
index 665a772949..20fe2f7f8b 100644
--- a/repair/row_level.cc
+++ b/repair/row_level.cc
@@ -361,6 +361,7 @@ repair_reader::read_mutation_fragment() {
 
 future<> repair_reader::on_end_of_stream() noexcept {
     return _reader.close().then([this] {
+        _permit.release_base_resources();
         _reader = mutation_fragment_v1_stream(make_empty_flat_reader_v2(_schema, _permit));
         _reader_handle.reset();
     });
@@ -368,6 +369,7 @@ future<> repair_reader::on_end_of_stream() noexcept {
 
 future<> repair_reader::close() noexcept {
     return _reader.close().then([this] {
+        _permit.release_base_resources();
         _reader_handle.reset();
     });
 }