messaging_service: Define metrics domain for client connections
A recent seastar update included RPC metrics (scylladb/seastar#1753). The reported metrics group sockets together based on their "metrics_domain" configuration option. This patch makes use of this domain to make scylla metrics sane.

The domain as this patch defines it includes two strings:

First, the datacenter the server lives in. Grouping metrics for connections to different datacenters makes little sense for several reasons. For example, packet delays _will_ differ for local-DC vs cross-DC traffic, and mixing those latencies together is pointless. Another example: the amount of traffic may also differ for local- vs cross-DC connections, e.g. because of different usage of encryption and/or compression.

Second, each verb-idx gets its own domain. This makes it possible to tell e.g. query-related traffic apart from gossiper traffic. For that, the existing isolation cookie is taken as is.

Note that the metrics are _not_ per server node. So e.g. two gossiper connections to two different nodes (in one DC) will belong to the same domain, and thus their stats will be summed when reported.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#15785
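To illustrate the grouping described in the message, here is a minimal sketch of how counters from several connections would collapse into per-domain totals. It is not Seastar or Scylla code: `connection_sample`, `metrics_domain`, `bytes_sent_per_domain`, and the `bytes_sent` counter are hypothetical names introduced for this example only. The key format follows the `cookie:datacenter` concatenation visible in the diff.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical per-connection record; not a Seastar or Scylla type.
struct connection_sample {
    std::string isolation_cookie; // stands in for the verb-idx cookie
    std::string datacenter;       // datacenter of the remote node
    std::string node;             // remote node address, deliberately NOT part of the domain
    uint64_t bytes_sent;          // hypothetical counter
};

// Domain key: the isolation cookie plus the datacenter name.
std::string metrics_domain(const connection_sample& c) {
    assert(!c.isolation_cookie.empty());
    return c.isolation_cookie + ":" + c.datacenter;
}

// Sum the counter over all connections that share a domain.
std::map<std::string, uint64_t> bytes_sent_per_domain(const std::vector<connection_sample>& conns) {
    std::map<std::string, uint64_t> totals;
    for (const auto& c : conns) {
        totals[metrics_domain(c)] += c.bytes_sent;
    }
    return totals;
}
```

With this sketch, two connections carrying the same cookie to two different nodes in one datacenter land under a single key, while the same cookie toward another datacenter gets a key of its own.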
Committed by: Kamil Braun
Parent: efd65aebb2
Commit: 492b842929
@@ -294,6 +294,27 @@ bool messaging_service::is_same_rack(inet_address addr) const {
     return topo.get_rack(addr) == topo.get_rack();
 }
 
+// The socket metrics domain defines the way RPC metrics are grouped
+// for different sockets. Thus, the domain includes:
+//
+// - Target datacenter name, because it's pointless to merge networking
+//   statis for connections that are in advance known to have different
+//   timings and rates
+// - The verb-idx to tell different RPC channels from each other. For
+//   that the isolation cookie suits very well, because these cookies
+//   are different for different indices and are more informative than
+//   plain numbers
+sstring messaging_service::client_metrics_domain(unsigned idx, inet_address addr) const {
+    sstring ret = _scheduling_info_for_connection_index[idx].isolation_cookie;
+    if (_token_metadata) {
+        const auto& topo = _token_metadata->get()->get_topology();
+        if (topo.has_endpoint(addr)) {
+            ret += ":" + topo.get_datacenter(addr);
+        }
+    }
+    return ret;
+}
+
 future<> messaging_service::ban_host(locator::host_id id) {
     return container().invoke_on_all([id] (messaging_service& ms) {
         if (ms._banned_hosts.contains(id) || ms.is_shutting_down()) {
@@ -884,6 +905,7 @@ shared_ptr<messaging_service::rpc_protocol_client_wrapper> messaging_service::ge
     opts.tcp_nodelay = must_tcp_nodelay;
     opts.reuseaddr = true;
     opts.isolation_cookie = _scheduling_info_for_connection_index[idx].isolation_cookie;
+    opts.metrics_domain = client_metrics_domain(idx, id.addr); // not just `addr` as the latter may be internal IP
 
     assert(!must_encrypt || _credentials);
 
@@ -528,6 +528,8 @@ private:
 
     bool is_host_banned(locator::host_id);
 
+    sstring client_metrics_domain(unsigned idx, inet_address addr) const;
+
 public:
     // Return rpc::protocol::client for a shard which is a ip + cpuid pair.
     shared_ptr<rpc_protocol_client_wrapper> get_rpc_client(messaging_verb verb, msg_addr id);
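The control flow of `client_metrics_domain` can be exercised outside Scylla with a standalone sketch like the one below. It relies on assumptions: `topology_map` and `client_metrics_domain_sketch` are hypothetical names, a plain address-to-datacenter map stands in for the real topology object, and an empty `std::optional` stands in for token metadata being unavailable. The point is only to show the two fallbacks, where the domain is the bare cookie if there is no topology or if the address is unknown to it.

```cpp
#include <cassert>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for the topology lookup: address -> datacenter name.
using topology_map = std::map<std::string, std::string>;

std::string client_metrics_domain_sketch(
        const std::vector<std::string>& cookies,  // isolation cookie per connection index
        const std::optional<topology_map>& topo,  // empty when no topology is available
        unsigned idx,
        const std::string& addr) {
    assert(idx < cookies.size());
    // Start from the cookie that identifies the connection index.
    std::string ret = cookies[idx];
    if (topo) {
        auto it = topo->find(addr);
        // Append the datacenter only for endpoints the topology knows about.
        if (it != topo->end()) {
            ret += ":" + it->second;
        }
    }
    return ret;
}
```

Keeping the cookie first means that domains for one connection index share a common prefix, whichever datacenter suffix follows.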