repair: Release permit earlier when the repair_reader is done

Consider the following scenario:

- 10 repair instances take all 10 units of _streaming_concurrency_sem

- The repair readers are done, but their permits are not released
  because the repair instances are waiting for the view update
  _registration_sem

- View updates try to take _streaming_concurrency_sem so they can make
  progress and release _registration_sem, but they cannot, because the
  10 repair instances hold all of its units

- A deadlock happens

Note that when the readers are done, i.e., have reached EOS, the repair
reader replaces the underlying (evictable) reader with an empty reader.
The empty reader is not evictable, so its resources cannot be forcibly
released.

To fix, release the permits manually as soon as the repair readers are
done even if the repair job is waiting for _registration_sem.
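
The scenario and the fix can be sketched with a toy, single-threaded
model. This is only an illustration under stated assumptions:
toy_semaphore, toy_repair_reader and view_update_can_progress are
hypothetical names, not ScyllaDB or Seastar APIs, and a failed
try_acquire() stands in for a waiter that would block forever.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a counting semaphore (not Seastar's semaphore).
class toy_semaphore {
    std::size_t _capacity;
    std::size_t _available;
public:
    explicit toy_semaphore(std::size_t units) : _capacity(units), _available(units) {}

    // Take n units if they are available; never blocks.
    bool try_acquire(std::size_t n = 1) {
        if (_available < n) {
            return false;
        }
        _available -= n;
        return true;
    }

    // Return n units; releasing more than was taken is a bug.
    void release(std::size_t n = 1) {
        assert(_available + n <= _capacity);
        _available += n;
    }
};

// Hypothetical model of a repair reader holding one streaming unit.
struct toy_repair_reader {
    toy_semaphore& streaming_sem;
    bool holds_unit = false;

    explicit toy_repair_reader(toy_semaphore& sem) : streaming_sem(sem) {
        holds_unit = streaming_sem.try_acquire();
    }

    // End of stream: with release_early, give the unit back immediately
    // instead of keeping it while the repair job waits elsewhere.
    void on_end_of_stream(bool release_early) {
        if (release_early && holds_unit) {
            streaming_sem.release();
            holds_unit = false;
        }
    }
};

// Returns true if a view update could take a streaming unit after all
// 10 readers reached EOS but the repair jobs are still waiting.
bool view_update_can_progress(bool release_early) {
    toy_semaphore streaming_sem(10);
    std::vector<toy_repair_reader> readers;
    readers.reserve(10);
    for (int i = 0; i < 10; ++i) {
        readers.emplace_back(streaming_sem);
    }
    for (auto& r : readers) {
        r.on_end_of_stream(release_early);
    }
    return streaming_sem.try_acquire();
}
```

In this model view_update_can_progress(false) returns false (the view
update is starved of units) and view_update_can_progress(true) returns
true.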

Fixes #14676

Closes #14677
Asias He
2023-07-13 12:29:38 +08:00
committed by Botond Dénes
parent 6a7d980a5d
commit 1b577e0414


@@ -361,6 +361,7 @@ repair_reader::read_mutation_fragment() {
 future<> repair_reader::on_end_of_stream() noexcept {
     return _reader.close().then([this] {
+        _permit.release_base_resources();
         _reader = mutation_fragment_v1_stream(make_empty_flat_reader_v2(_schema, _permit));
         _reader_handle.reset();
     });
@@ -368,6 +369,7 @@ future<> repair_reader::on_end_of_stream() noexcept {
 future<> repair_reader::close() noexcept {
     return _reader.close().then([this] {
+        _permit.release_base_resources();
         _reader_handle.reset();
     });
 }