message: messaging_service: fix topology_ignored for pending endpoints in get_rpc_client

`get_rpc_client` calculates a `topology_ignored` field when creating a
client which says whether the client's endpoint had topology information
when topology was created. This is later used to check if that client
needs to be dropped and replaced with a new client which uses the
correct topology information.

The `topology_ignored` field was incorrectly calculated as `true` for
pending endpoints even though we had topology information for them. This
would lead to unnecessary drops of RPC clients later. Fix this.

Remove the default parameter for `with_pending` from
`topology::has_endpoint` to avoid similar bugs in the future.

Apparently this fixes #11780. The verbs used by decommission operation
use RPC client index 1 (see `do_get_rpc_client_idx` in
message/messaging_service.cc). From local testing with additional
logging I found that by the time this client is created (i.e. the first
verb in this group is used), we already know the topology. The node is
pending at that point - hence the bug would cause us to assume we don't
know the topology, leading us to dropping the RPC client later, possibly
in the middle of a decommission operation.

Fixes: #11780
This commit is contained in:
Kamil Braun
2022-11-09 13:47:27 +01:00
parent 840be34b5f
commit 1bd2471c19
3 changed files with 7 additions and 4 deletions

View File

@@ -562,7 +562,7 @@ const std::unordered_map<inet_address, host_id>& token_metadata_impl::get_endpoi
}
bool token_metadata_impl::is_member(inet_address endpoint) const {
return _topology.has_endpoint(endpoint);
return _topology.has_endpoint(endpoint, topology::pending::no);
}
void token_metadata_impl::add_bootstrap_token(token t, inet_address endpoint) {

View File

@@ -67,9 +67,10 @@ public:
void remove_endpoint(inet_address ep);
/**
* Returns true iff contains given endpoint
* Returns true iff contains given endpoint.
* Excludes pending endpoints if `with_pending == pending::no`.
*/
bool has_endpoint(inet_address, pending with_pending = pending::no) const;
bool has_endpoint(inet_address, pending with_pending) const;
const std::unordered_map<sstring,
std::unordered_set<inet_address>>&

View File

@@ -731,7 +731,9 @@ shared_ptr<messaging_service::rpc_protocol_client_wrapper> messaging_service::ge
std::optional<bool> topology_status;
auto has_topology = [&] {
if (!topology_status.has_value()) {
topology_status = _token_metadata ? _token_metadata->get()->get_topology().has_endpoint(id.addr) : false;
topology_status = _token_metadata
? _token_metadata->get()->get_topology().has_endpoint(id.addr, locator::topology::pending::yes)
: false;
}
return *topology_status;
};