scylla

Author	SHA1	Message	Date
Pavel Emelyanov	47958a4b37	storage_service: Re-gossiping snitch data in reconfiguration callback Nowadays it's done inside snitch, and snitch needs to carry gossiper reference for that. There's an ongoing effort in de-globalizing snitch and fixing its dependencies. This patch cuts this snitch->gossiper link to facilitate the mentioned effort. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-09-23 14:31:55 +03:00
Pavel Emelyanov	9e7407ff91	replication_strategy: Construct temp tokens in place Otherwise, the token_metadata object is default-initialized, then it's move-assigned from another object. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-09-22 19:19:32 +03:00
Pavel Emelyanov	d540af2cb0	topology: Define copy-constructor with init-lists Otherwise the topology is default-constructed, then its fields are copy-assigned with the data from the copy-from reference. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-09-22 19:18:58 +03:00
Pavel Emelyanov	5edeecf39b	token_metadata: Provide dc/rack for bootstrapping nodes The token_metadata::calculate_pending_ranges_for_bootstrap() makes a clone of itself and adds bootstrapping nodes to the clone to calculate ranges. Currently added nodes lack the dc/rack which affects the calculations the bad way. Unfortunately, the dc/rack for those nodes is not available on topology (yet) and needs pretty heavy patching to have. Fortunately, the only caller of this method has gossiper at hand to provide the dc/rack from. fixes: #11531 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #11596	2022-09-22 06:55:52 +03:00
Michał Chojnowski	47844689d8	token_metadata: make local_dc_filter a lambda, not a std::function This std::function causes allocations, both on construction and in other operations. This costs ~2200 instructions for a DC-local query. Fix that. Closes #11494	2022-09-09 18:05:46 +02:00
Pavel Emelyanov	42c9f35374	topology: Mark compare_endpoints() arguments as const Continuation to `debfcc0e` (snitch: Move sort_by_proximity() to topology). The passed addresses are not modified by the helper. They are not yet const because the method was copy-n-pasted from snitch where it wasn't such. tests: unit(dev) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20220906074708.29574-1-xemul@scylladb.com>	2022-09-06 11:03:13 +03:00
Avi Kivity	ae4b2ee583	locator: token_metadata: drop unused and dangerous accessors The mutable get_datacenter_endpoints() and get_datacenter_racks() are dangerous since they expose internal members without enforcing class invariants. Fortunately they are unused, so delete them. Closes #11454	2022-09-06 06:08:02 +03:00
Pavel Emelyanov	debfcc0eff	snitch: Move sort_by_proximity() to topology Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-09-05 15:17:04 +03:00
Pavel Emelyanov	41973c5bf7	topology: Add "enable proximity sorting" bit There's one corner case in nodes sorting by snitch. The simple snitch code overloads the call and doesn't sort anything. The same behavior should be preserved by (future) topology implementation, but it doesn't know the snitch name. To address that the patch adds a boolean switch on topology that's turned off by main code when it sees the snitch is "simple" one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-09-05 15:15:07 +03:00
Pavel Emelyanov	b6fdea9a79	code: Call sort_endpoints_by_proximity() via topology The method is about to be moved from snitch to topology, this patch prepares the rest of the code to use the latter to call it. The topology's method just calls snitch, but it's going to change in the next patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-09-05 15:14:01 +03:00
Pavel Emelyanov	4184091f1c	snitch, code: Remove get_sorted_list_by_proximity() There are two sorting methods in snitch -- one sorts the list of addresses in place, the other one creates a sorted copy of the passed const list (in fact -- the passed reference is not const, but it's not modified by the method). However, both callers of the latter anyway create their own temporary list of addresses, so they don't really benefit from snitch generating another copy. So this patch leaves just one sorting method -- the in-place one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-09-05 15:11:37 +03:00
Pavel Emelyanov	642e50f3e3	snitch: Move is_worth_merging_for_range_query to proxy Proxy is the only place that calls this method. Also the method name suggests it's not something "generic", but rather an internal logic of proxy's query processing. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-09-05 15:10:46 +03:00
Avi Kivity	61769d3b21	Merge "Make messaging service use topology for DC/RACK" from Pavel E " Messaging needs to know DC/RACK for nodes to decide whether it needs to do encryption or compression depending on the options. As all the other services did it still uses snitch to get it, but simple switch to use topology needs extra care. The thing is that messaging can use internal IP instead of endpoints. Currently it's snitch who tries har^w somehow to resolve this, in particular -- if the DC/RACK is not found for the given argument it assumes that it might be internal IP and calls back messaging to convert it to the endpoint. However, messaging does know when it uses which address and can do this conversion itself. So this set eliminates a few more global snitch usages and drops the knot tying snitch, gossiper and messaging with each other. " * 'br-messaging-use-topology-1.2' of https://github.com/xemul/scylla: messaging: Get DC/RACK from topology messaging, topology: Keep shared_token_metadata* on messaging messaging: Add is_same_{dc\|rack} helpers snitch, messaging: Dont relookup dc/rack on internal IP	2022-09-04 13:54:34 +03:00
Pavel Emelyanov	6dedc69608	topology: Do not add bootstrapping nodes to topology Recent change in topology (commit `4cbe6ee9` titled "topology: Require entry in the map for update_normal_tokens()") made token_metadata::update_normal_tokens() require the entry presence in the embedded topology object. Respectively, the commit in question equipped most callers of update_normal_tokens() with preceding topology update call to satisfy the requirement. However, tokens are put into token_metadata not only for normal state, but also for bootstrapping, and one place that added bootstrapping tokens erroneously got topology update. This is wrong -- node must not be present in the topology until switching into normal state. As the result several tests with bootstrapping nodes started to fail. The fix removes topology update for bootstrapping nodes, but this change reveals few other places that piggy-backed this mistaken update, so now _they_ need to update topology themselves. tests: https://jenkins.scylladb.com/job/releng/job/Scylla-CI/2040/ update_cluster_layout_tests.py::test_simple_add_new_node_while_schema_changes_with_repair update_cluster_layout_tests.py::test_simple_kill_new_node_while_bootstrapping_with_parallel_writes_in_multidc repair_based_node_operations_test.py::test_lcs_reshape_efficiency Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20220902082753.17827-1-xemul@scylladb.com>	2022-09-04 13:53:38 +03:00
Pavel Emelyanov	c08c370c2c	snitch, messaging: Dont relookup dc/rack on internal IP When getting dc/rack snitch may perform two lookups -- first time it does it using the provided IP, if nothing is found snitch assumes that the IP is internal one, gets the corresponding public one and searches again. The thing is that the only code that may come to snitch with internal IP is the messaging service. It does so in two places: when it tries to connect to the given endpoint and when it accepts a connection. In the former case messaging performs public->internal IP conversion itself and goes to snitch with the internal IP value. This place can get simpler by just feeding the public IP to snitch, and converting it to the internal only to initiate the connection. In the latter case the accepted IP can be either, but messaging service has the public<->private map onboard and can do the conversion itself. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-09-01 11:32:34 +03:00
Pavel Emelyanov	6405aba748	topology: Use the provided dc/rack info Previous patches made all the callers of topology.update_endpoint() (via token_metadata.update_topology()) provide correct dc/rack info for the endpoint. It's now possible to stop using global snitch by topology and just rely on the dc/rack argument. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-08-26 10:02:00 +03:00
Pavel Emelyanov	c043f6fa96	topology: Some renames after previous patch The topology::update_endpoint() is now a plain wrapper over private ::add_endpoint() method of the same class. It's simpler to merge them Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-08-26 09:46:26 +03:00
Pavel Emelyanov	4cbe6ee9f4	topology: Require entry in the map for update_normal_tokens() The method in question tries to be on the safest side and adds the endpoint for which it updates the tokens into the topology. From now on it's up to the caller to put the endpoint into topology in advance. So most of what this patch does is places topology.update_endpoint() into the relevant places of the code. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-08-26 09:44:08 +03:00
Pavel Emelyanov	5fc9854eae	topology: Make update_endpoint() accept dc-rack info The method in question populates topology's internal maps with endpoint vs dc/rack relations. As of today the dc/rack values are taken from the global snitch object (which, in turn, goes to gossiper, system keyspace and its internal non-updateable cache for that). This patch prepares the ground for providing the dc/rack externally via argument. By now it's just an argument with empty strings, but next patches will populate it with real values (spoiler: in 99% it's storage service that calls this method and each call will know where to get it from for sure) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-08-26 09:41:09 +03:00
Pavel Emelyanov	7305061674	replication_strategy: Accept dc-rack as get_pending_address_ranges argument The method creates a copy of token metadata and pushes an endpoint (with some tokens) into it. Next patches will require providing dc/rack info together with the endpoint, this patch prepares for that. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-08-26 09:39:44 +03:00
Avi Kivity	df87949241	Merge "Remove batch tokens update helper" from Pavel E " On token_metadata there are two update_normal_tokens() overloads -- one updates tokens for a single endpoint, another one -- for a set (well -- std::map) of them. Other than updating the tokens both methods also may add an endpoint to the t.m.'s topology object. There's an ongoing effort in moving the dc/rack information from snitch to topology, and one of the changes made in it is -- when adding an entry to topology, the dc/rack info should be provided by the caller (which in 99% of the cases is the storage service). The batched tokens update is extremely unfriendly to the latter change. Fortunately, this helper is only used by tests, the core code always uses fine-grained tokens updating. " * 'br-tokens-update-relax' of https://github.com/xemul/scylla: token_metadata: Indentation fix after previous patch token_metadata: Remove excessive empty tokens check token_metadata: Remove batch tokens updating method tests: Use one-by-one tokens updating method	2022-08-25 12:01:58 +02:00
Pavel Emelyanov	d8c5044eee	token_metadata: Indentation fix after previous patch Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-08-24 08:24:21 +03:00
Pavel Emelyanov	8238c38e9f	token_metadata: Remove excessive empty tokens check After the previous patch empty passed tokens make the helper co_return early, so this `if` is dead code Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-08-24 08:24:21 +03:00
Pavel Emelyanov	056d21c050	token_metadata: Remove batch tokens updating method No users left. The endpoint_tokens.empty() check is removed, only tests could trigger it, but they didn't and are patched out. Indentation is left broken Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-08-24 08:24:21 +03:00
Pavel Emelyanov	18fa5038b1	replication_strategy: Remove unused method The get_pending_address_ranges() accepting a single token is not in use, its peer that accepts a set of tokens is Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #11358	2022-08-23 20:23:50 +02:00
Benny Halevy	d295d8e280	everywhere: define locator::host_id as a strong tagged_uuid type So it can be distinguished from other uuid-based identifiers in the system. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #11276	2022-08-12 06:01:44 +03:00
Benny Halevy	9167b857e9	abstract_replication_strategy: calculate_effective_replication_map: optimize for static replication strategies For replication strategies like "everywhere" and "local" that return the same set of endpoints for all tokens, we can call rs->calculate_natural_endpoints once and reuse the result for all tokens. Note that ideally the replication_map could contain only a single token range for this case, but that doesn't seem to work yet. Add maybe_yield() calls to the tight loop to prevent reactor stalls on large clusters when copying a long vector returned by everywhere_replication_strategy to potentially 1000's of tokens in the map. Nicholas Peshek wrote in https://github.com/scylladb/scylladb/issues/10337#issuecomment-1211152370 about similar patch by Geoffrey Beausire: `994c6ecf3c` > Yep. That dropped our startup from 3000+ seconds to about 40. Fixes #10337 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-11 10:35:29 +03:00
Benny Halevy	eb678e723b	abstract_replication_strategy: add has_uniform_natural_endpoints So that using calculate_natural_endpoints can be optimized for strategies that return the same endpoints for all tokens, namely everywhere_replication_strategy and local_strategy. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-11 10:34:14 +03:00
Benny Halevy	91ab8ee1c3	effective_replication_map: make get_range_addresses asynchronous So it may yield, preventing reactor stalls as seen in #11005. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 17:31:01 +03:00
Benny Halevy	e541009f65	effective_replication_map: add get_replication_strategy And use it in storage_service::get_changed_ranges_for_leaving. A following patch will pass the e_r_m to storage_service::get_changed_ranges_for_leaving, rather than getting it there. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 17:31:00 +03:00
Benny Halevy	6794e15163	effective_replication_map: get_range_addresses: use the precalculated replication_map There is no need to call get_natural_endpoints for every token in sorted_tokens order, since we can just get the precalculated per-token endpoints already in the _replication_map member. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 17:31:00 +03:00
Benny Halevy	1d4aea4441	abstract_replication_strategy: get_pending_address_ranges: prevent extra vector copies Reduce large allocations and reactor stalls seen in #11005 by open coding `get_address_ranges` and using std::vector::insert to efficiently append the ranges returned by `get_primary_ranges_for` onto the returned token_range_vector in contrast to building an unordered_multimap<inet_address, dht::token_range> first in `get_address_ranges` and traversing it and adding one token_range at a time. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 17:31:00 +03:00
Benny Halevy	7811b0d0aa	abstract_replication_strategy: reindent	2022-08-08 17:31:00 +03:00
Benny Halevy	ebe1edc091	utils: sequenced_set: expose set and `contains` method And use that in sites using the endpoint set returned by abstract_replication_strategy::calculate_natural_endpoints. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 17:31:00 +03:00
Benny Halevy	7017ad6822	abstract_replication_strategy: calculate_natural_endpoints: return endpoint_set So it could be used also for easily searching for an endpoint. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 17:31:00 +03:00
Botond Dénes	fbbe2529c1	Merge "Remove global snitch usage from consistency_level.cc" from Pavel Emelyanov " There are several helpers in this .cc file that need to get datacenter for endpoints. For it they use global snitch, because there's no other place out there to get that data from. The whole dc/rack info is now moving to topology, so this set patches the consistency_level.cc to get the topology. This is done two ways. First, the helpers that have keyspace at hand may get the topology via ks's effective_replication_map. Two difficult cases are db::is_local() and db.count_local_endpoints() because both have just inet_address at hand. Those are patched to be methods of topology itself and all their callers already mess with token metadata and can get topology from it. " * 'br-consistency-level-over-topology' of https://github.com/xemul/scylla: consistency_level: Remove is_local() and count_local_endpoints() storage_proxy: Use topology::local_endpoints_count() storage_proxy: Use proxy's topology for DC checks storage_proxy: Keep shared_ptr<proxy> on digest_read_resolver storage_proxy: Use topology local_dc_filter in its methods storage_proxy: Mark some digest_read_resolver methods private forwarding_service: Use topology local_dc_filter storage_service: Use topology local_dc_filter consistency_level: Use topology local_dc_filter consistency-level: Call count_local_endpoints from topology consistency_level: Get datacenter from topology replication_strategy: Remove hold snitch reference effective_replication_map: Get datacenter from topology topology: Add local-dc detection sugar	2022-08-05 13:31:55 +03:00
Pavel Emelyanov	f84ee8f0fb	consistency_level: Get datacenter from topology In some of db/consistency_level.cc helpers the topology can be obtained from keyspace's effective replication map Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-08-05 12:19:47 +03:00
Pavel Emelyanov	00f166809e	replication_strategy: Remove hold snitch reference When the strategy is constructed there's no place to get snitch from so the global instance is used. However, after previous patch the replication strategy no longer needs snitch, so this dependency can be dropped Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-08-05 12:19:43 +03:00
Pavel Emelyanov	298213f27f	effective_replication_map: Get datacenter from topology Now it gets it from snitch, but the dc/rack info is being relocated onto topology. The topology is in turn already there Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-08-05 12:19:31 +03:00
Botond Dénes	df203a48af	Merge "Remove reconnectable_snitch_helper" from Pavel Emelyanov " The helper is in charge of receiving INTERNAL_IP app state from gossiper join/change notifications, updating system.peers with it and kicking messaging service to update its preferred ip cache along with initiating clients reconnection. Effectively this helper duplicates the topology tracking code in storage-service notifiers. Removing it makes less code and drops a bunch of unwanted cross-components dependencies, in particular: - one qctx call is gone - snitch (almost) no longer needs to get messaging from gossiper - public:private IP cache becomes local to messaging and can be moved to topology at low cost Some nice minor side effect -- this helper was left unsubscribed from gossiper on stop and snitch rename. Now it's all gone. " * 'br-remove-reconnectible-snitch-helper-2' of https://github.com/xemul/scylla: snitch: Remove reconnectable snitch helper snitch, storage_service: Move reconnect to internal_ip kick snitch, storage_service: Move system.peers preferred_ip update snitch: Export prefer-local	2022-08-04 13:06:05 +03:00
Benny Halevy	0dfd92d0b3	token_metadata: allow update_normal_token_owners to yield Given #11146, we see a 10ms stall when calculate_natural_endpoints calls get_all_endpoints that up until this patch performed a similar loop on the `_token_to_endpoint_map`, so to prevent such a stall with large number of tokens, turn update_normal_token_owners async, and allow yielding in the per-token tight loop. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-02 10:49:32 +03:00
Benny Halevy	4f8ccef2c1	token_metadata: get_all_endpoints: return const unordered_set<inet_address>& There's no need to transform it into a vector. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-02 10:49:08 +03:00
Benny Halevy	a980f94d85	token_metadata: impl: keep the set of normal token owners as a member We don't need to recalculate the unique set of normal token owners every time we change `_token_to_endpoint_map`. Similarly, this doesn't have to be done in `get_all_endpoints`. Instead we can maintain it inexpensively in `remove_endpoint`, and let `count_normal_token_owners` just return its size and `get_all_endpoints` just return the saved set. Note that currently topology is not updated accurately in update_normal_token() and it may contain endpoints that no longer own any tokens. If we did update topology accurately there, we could use its locations map instead as its keys are equivalent to the unordered_set<inet_address> we implement here. Closes #11128 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-02 10:49:07 +03:00
Pavel Emelyanov	ee0828b506	topology: Add local-dc detection sugar It's often needed to check if an endpoint sits in the same DC as the current node. It can be done by topo.get_datacenter() == topo.get_datacenter(endpoint) but in some cases a RAII filter function can be helpful. Also there's a db::count_local_endpoints() that is surprisingly in use, so add it to topology as well. Next patches will make use of both. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-07-30 17:58:45 +03:00
Benny Halevy	cf47db2bdb	token_metadata: document that update_normal_tokens is unsafe Currently, if token_metadata_impl::update_normal_tokens throws an exception before it's done, it leaves the token_metadata_impl members partially updated and we have no way of recovering from that. The existing use cases take that into account and always call it on a cloned, temporary copy of the token metadata, so if it throws, the temporary copy is tossed away without being applied back. So just cement this, by adding cautions in the token_metadata class declaration. Closes #11127 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220728144821.130518-1-bhalevy@scylladb.com>	2022-07-29 05:38:56 +03:00
Asias He	6152f5b858	locator: Speed up abstract_replication_strategy::get_address_ranges To get the list of tokens for a given node, we loop through all the tokens and calculate the nodes that are responsible for the token. In case of the everywhere_topology, we know any node that is part of the ring will be responsible for all tokens. This patch adds a fast path for everywhere_topology to avoid calculating natural endpoints. Refs #10337 Refs #10817 Refs #10836 Refs #10837	2022-07-26 18:53:09 +08:00
Asias He	9a8a80527b	locator: Speed up simple_strategy::calculate_natural_endpoint If the number of nodes in the cluster is smaller than the desired replication factor, we should exit the loop once endpoints already contains all the nodes in the cluster, because no more nodes can be added to the endpoints list Refs #10337 Refs #10817 Refs #10836 Refs #10837	2022-07-26 18:53:09 +08:00
Asias He	4c714dfe3b	token_metadata: Speed up count_normal_token_owners Currently, a set of nodes is built from _token_to_endpoint_map to get the number of nodes in _token_to_endpoint_map. To make it faster so we can call it on a fast path in the following patch, a _nr_normal_token_owners member is introduced to track the number. Refs #10337 Refs #10817 Refs #10836 Refs #10837	2022-07-26 18:53:09 +08:00
Pavel Emelyanov	40d6ea973c	snitch: Remove reconnectable snitch helper It's now no-op Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-07-26 13:51:05 +03:00
Pavel Emelyanov	b91f7e9ec4	snitch, storage_service: Move reconnect to internal_ip kick The same thing as in previous patch -- when gossiper issues on_join/_change notification, storage service can kick messaging service to update its internal_ip cache and reconnect to the peer. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-07-26 13:48:46 +03:00


673 Commits