topology_coordinator: allow cleanup_target transition without barrier after failure in allow_write_both_read_old
When a tablet is in `allow_write_both_read_old`, progressing normally
requires a barrier. If this first barrier fails, the tablet is supposed
to transition to `cleanup_target` on the next iteration:
```
case locator::tablet_transition_stage::allow_write_both_read_old:
if (action_failed(tablet_state.barriers[trinfo.stage])) {
if (check_excluded_replicas()) {
transition_to_with_barrier(locator::tablet_transition_stage::cleanup_target);
break;
}
}
if (do_barrier()) {
...
}
break;
```
That transition itself requires a barrier, which is executed asynchronously.
Because the barrier runs in the background, the cleanup logic is skipped in
that iteration.
On the following iteration, `action_failed(barriers[stage])` no longer
returns true, since the node that caused the original barrier failure
has been excluded. The barrier is therefore observed as successful,
and the tablet incorrectly proceeds to the next stage instead of entering
`cleanup_target`.
Since `cleanup_target` does not modify read/write selectors, the transition
can be done safely without a barrier, simplifying the state machine and
ensuring cleanup is not skipped.
Without it, the tablet would still eventually reach `cleanup_target` via
`write_both_read_old` and `streaming`, but that path is unnecessary.
This commit is contained in:
@@ -1539,7 +1539,7 @@ class topology_coordinator : public endpoint_lifecycle_subscriber
|
||||
case locator::tablet_transition_stage::allow_write_both_read_old:
|
||||
if (action_failed(tablet_state.barriers[trinfo.stage])) {
|
||||
if (check_excluded_replicas()) {
|
||||
transition_to_with_barrier(locator::tablet_transition_stage::cleanup_target);
|
||||
transition_to(locator::tablet_transition_stage::cleanup_target);
|
||||
break;
|
||||
}
|
||||
}
|
||||
|
||||
Reference in New Issue
Block a user