scylla

Author	SHA1	Message	Date
Benny Halevy	d295d8e280	everywhere: define locator::host_id as a strong tagged_uuid type So it can be distinguished from other uuid-based identifiers in the system. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #11276	2022-08-12 06:01:44 +03:00
Benny Halevy	9167b857e9	abstract_replication_strategy: calculate_effective_replication_map: optimize for static replication strategies For replication strategies like "everywhere" and "local" that return the same set of endpoints for all tokens, we can call rs->calculate_natural_endpoints only once and reuse the result for all tokens. Note that ideally the replication_map could contain only a single token range for this case, but that doesn't seem to work yet. Add maybe_yield() calls to the tight loop to prevent reactor stalls on large clusters when copying a long vector returned by everywhere_replication_strategy to potentially 1000's of tokens in the map. Nicholas Peshek wrote in https://github.com/scylladb/scylladb/issues/10337#issuecomment-1211152370 about similar patch by Geoffrey Beausire: `994c6ecf3c` > Yep. That dropped our startup from 3000+ seconds to about 40. Fixes #10337 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-11 10:35:29 +03:00
Benny Halevy	eb678e723b	abstract_replication_strategy: add has_uniform_natural_endpoints So that using calculate_natural_endpoints can be optimized for strategies that return the same endpoints for all tokens, namely everywhere_replication_strategy and local_strategy. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-11 10:34:14 +03:00
Benny Halevy	91ab8ee1c3	effective_replication_map: make get_range_addresses asynchronous So it may yield, preventing reactor stalls as seen in #11005. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 17:31:01 +03:00
Benny Halevy	e541009f65	effective_replication_map: add get_replication_strategy And use it in storage_service::get_changed_ranges_for_leaving. A following patch will pass the e_r_m to storage_service::get_changed_ranges_for_leaving, rather than getting it there. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 17:31:00 +03:00
Benny Halevy	6794e15163	effective_replication_map: get_range_addresses: use the precalculated replication_map There is no need to call get_natural_endpoints for every token in sorted_tokens order, since we can just get the precalculated per-token endpoints already in the _replication_map member. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 17:31:00 +03:00
Benny Halevy	1d4aea4441	abstract_replication_strategy: get_pending_address_ranges: prevent extra vector copies Reduce large allocations and reactor stalls seen in #11005 by open coding `get_address_ranges` and using std::vector::insert to efficiently append the ranges returned by `get_primary_ranges_for` onto the returned token_range_vector, in contrast to building an unordered_multimap<inet_address, dht::token_range> first in `get_address_ranges`, then traversing it and adding one token_range at a time. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 17:31:00 +03:00
Benny Halevy	7811b0d0aa	abstract_replication_strategy: reindent	2022-08-08 17:31:00 +03:00
Benny Halevy	ebe1edc091	utils: sequenced_set: expose set and `contains` method And use that in sites using the endpoint set returned by abstract_replication_strategy::calculate_natural_endpoints. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 17:31:00 +03:00
Benny Halevy	7017ad6822	abstract_replication_strategy: calculate_natural_endpoints: return endpoint_set So it could be used also for easily searching for an endpoint. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-08 17:31:00 +03:00
Botond Dénes	fbbe2529c1	Merge "Remove global snitch usage from consistency_level.cc" from Pavel Emelyanov " There are several helpers in this .cc file that need to get datacenter for endpoints. For it they use global snitch, because there's no other place out there to get that data from. The whole dc/rack info is now moving to topology, so this set patches the consistency_level.cc to get the topology. This is done two ways. First, the helpers that have keyspace at hand may get the topology via ks's effective_replication_map. Two difficult cases are db::is_local() and db.count_local_endpoints() because both have just inet_address at hand. Those are patched to be methods of topology itself and all their callers already mess with token metadata and can get topology from it. " * 'br-consistency-level-over-topology' of https://github.com/xemul/scylla: consistency_level: Remove is_local() and count_local_endpoints() storage_proxy: Use topology::local_endpoints_count() storage_proxy: Use proxy's topology for DC checks storage_proxy: Keep shared_ptr<proxy> on digest_read_resolver storage_proxy: Use topology local_dc_filter in its methods storage_proxy: Mark some digest_read_resolver methods private forwarding_service: Use topology local_dc_filter storage_service: Use topology local_dc_filter consistency_level: Use topology local_dc_filter consitency-level: Call count_local_endpoints from topology consistency_level: Get datacenter from topology replication_strategy: Remove hold snitch reference effective_replication_map: Get datacenter from topology topology: Add local-dc detection shugar	2022-08-05 13:31:55 +03:00
Pavel Emelyanov	f84ee8f0fb	consistency_level: Get datacenter from topology In some of db/consistency_level.cc helpers the topology can be obtained from keyspace's effective replication map Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-08-05 12:19:47 +03:00
Pavel Emelyanov	00f166809e	replication_strategy: Remove hold snitch reference When the strategy is constructed there's no place to get snitch from so the global instance is used. However, after previous patch the replication strategy no longer needs snitch, so this dependency can be dropped Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-08-05 12:19:43 +03:00
Pavel Emelyanov	298213f27f	effective_replication_map: Get datacenter from topology Now it gets it from snitch, but the dc/rack info is being relocated onto topology. The topology is in turn already there Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-08-05 12:19:31 +03:00
Botond Dénes	df203a48af	Merge "Remove reconnectable_snitch_helper" from Pavel Emelyanov " The helper is in charge of receiving INTERNAL_IP app state from gossiper join/change notifications, updating system.peers with it and kicking messaging service to update its preferred ip cache along with initiating clients reconnection. Effectively this helper duplicates the topology tracking code in storage-service notifiers. Removing it makes less code and drops a bunch of unwanted cross-components dependencies, in particular: - one qctx call is gone - snitch (almost) no longer needs to get messaging from gossiper - public:private IP cache becomes local to messaging and can be moved to topology at low cost Some nice minor side effect -- this helper was left unsubscribed from gossiper on stop and snitch rename. Now it's all gone. " * 'br-remove-reconnectible-snitch-helper-2' of https://github.com/xemul/scylla: snitch: Remove reconnectable snitch helper snitch, storage_service: Move reconnect to internal_ip kick snitch, storage_service: Move system.peers preferred_ip update snitch: Export prefer-local	2022-08-04 13:06:05 +03:00
Benny Halevy	0dfd92d0b3	token_metadata: allow update_normal_token_owners to yield Given #11146, we see a 10ms stall when calculate_natural_endpoints calls get_all_endpoints that up until this patch performed a similar loop on the `_token_to_endpoint_map`, so to prevent such a stall with large number of tokens, turn update_normal_token_owners async, and allow yielding in the per-token tight loop. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-02 10:49:32 +03:00
Benny Halevy	4f8ccef2c1	token_metadata: get_all_endpoints: return const unordered_set<inet_address>& There's no need to transform it into a vector. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-02 10:49:08 +03:00
Benny Halevy	a980f94d85	token_metadata: impl: keep the set of normal token owners as a member We don't need to recalculate the unique set of normal token owners every time we change `_token_to_endpoint_map`. Similarly, this doesn't have to be done in `get_all_endpoints`. Instead we can maintain it inexpensively in `remove_endpoint`, and let `count_normal_token_owners` just return its size and `get_all_endpoints` just return the saved set. Note that currently topology is not updated accurately in update_normal_token() and it may contain endpoints that no longer own any tokens. If we did update topology accurately there, we could use its locations map instead as its keys are equivalent to the unordered_set<inet_address> we implement here. Closes #11128 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-08-02 10:49:07 +03:00
Pavel Emelyanov	ee0828b506	topology: Add local-dc detection shugar It's often needed to check if an endpoint sits in the same DC as the current node. It can be done by topo.get_datacenter() == topo.get_datacenter(endpoint) but in some cases a RAII filter function can be helpful. Also there's a db::count_local_endpoints() that is surprisingly in use, so add it to topology as well. Next patches will make use of both. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-07-30 17:58:45 +03:00
Benny Halevy	cf47db2bdb	token_metadata: document that update_normal_tokens is unsafe Currently, if token_metadata_impl::update_normal_tokens throws an exception before it's done, it leaves the token_metadata_impl members partially updated and we have no way of recovering from that. The existing use cases take that into account and always call it on a cloned, temporary copy of the token metadata, so if it throws, the temporary copy is tossed away without being applied back. So just cement this, by adding cautions in the token_metadata class declaration. Closes #11127 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220728144821.130518-1-bhalevy@scylladb.com>	2022-07-29 05:38:56 +03:00
Asias He	6152f5b858	locator: Speed up abstract_replication_strategy::get_address_ranges To get the list of tokens for a given node, we loop through all the tokens and calculate the nodes that are responsible for the token. In case of the everywhere_topology, we know any node that is part of the ring will be responsible for all tokens. This patch adds a fast path for everywhere_topology to avoid calculating natural endpoints. Refs #10337 Refs #10817 Refs #10836 Refs #10837	2022-07-26 18:53:09 +08:00
Asias He	9a8a80527b	locator: Speed up simple_strategy::calculate_natural_endpoint If the number of nodes in the cluster is smaller than the desired replication factor we should return from the loop when endpoints already contains all the nodes in the cluster because no more nodes could be added to the endpoints list Refs #10337 Refs #10817 Refs #10836 Refs #10837	2022-07-26 18:53:09 +08:00
Asias He	4c714dfe3b	token_metadata: Speed up count_normal_token_owners Currently, a set of nodes is built from _token_to_endpoint_map to get the number of nodes in _token_to_endpoint_map. To make it faster so we can call it on a fast path in the following patch, a _nr_normal_token_owners member is introduced to track the number. Refs #10337 Refs #10817 Refs #10836 Refs #10837	2022-07-26 18:53:09 +08:00
Pavel Emelyanov	40d6ea973c	snitch: Remove reconnectable snitch helper It's now no-op Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-07-26 13:51:05 +03:00
Pavel Emelyanov	b91f7e9ec4	snitch, storage_service: Move reconnect to internal_ip kick The same thing as in previous patch -- when gossiper issues on_join/_change notification, storage service can kick messaging service to update its internal_ip cache and reconnect to the peer. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-07-26 13:48:46 +03:00
Pavel Emelyanov	1bf8b0dd92	snitch, storage_service: Move system.peers preferred_ip update Currently the INTERNAL_IP state is updated using reconnectable helper by subscribing on on_join/on_change events from gossiper. The same subscription exists in storage service (it's a bit more elaborated by checking if the node is the part of the ring which is OK). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-07-26 13:48:46 +03:00
Pavel Emelyanov	0abd2c1e52	snitch: Export prefer-local The boolean bit says whether "the system" should prefer connecting to the address gossiped around via INTERNAL_IP. Currently only gossiping property file snitch allows to tune it and ec2-multiregion snitch prefers internal IP unconditionally. So exporting consists of 2 pieces: - add prefer_local() snitch method that's false by default or returns the (existing) _prefer_local bit for production snitch base - set the _prefer_local to true by ec2-multiregion snitch While at it the _prefer_local is moved to production_snitch_base for uniformity with the new prefer_local() call Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-07-26 13:48:04 +03:00
Pavel Emelyanov	b6f7c8da8b	topology: Add get_rack/_datacenter methods For now they just forward the request to snitch. Once topology is properly updated boot-time dc/rack info and knows internal IP it will be able to serve request on its own. For convenience overloads without arguments return dc/rack for current node. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-06-22 11:47:26 +03:00
Asias He	72797bf516	token_metadata: Shortcut zero leaving nodes case in calculate_pending_ranges_for_leaving If there are zero leaving nodes, no need to calculate anything. This saves time for calculating pending ranges in large clusters significantly to avoid unnecessary calculation. Refs #10337 Closes #10822	2022-06-20 13:19:58 +03:00
Pavel Emelyanov	1199c6e5da	snitch: Use invoke_on_others() to replicate The replication happens on all shards but current one. There's a special helper in seastar for such cases Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-20 18:16:22 +03:00
Pavel Emelyanov	5ec87285f8	snitch: Merge set_my_dc and set_my_rack into one These two are always used in pair. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-20 18:16:19 +03:00
Pavel Emelyanov	c6d0bc87d0	azure_snitch: Do nothing on non-io-cpu All snitch drivers are supposed to snitch info on some shard and replicate the dc/rack info across others. All, but azure really do so. The azure one gets dc/rack on all shards, which's excessive but not terrible, but when all shards start to replicate their data to all the others, this may lead to use-after-frees. fixes: #10494 Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-20 18:15:57 +03:00
Avi Kivity	5937b1fa23	treewide: remove empty comments in top-of-files After `fcb8d040` ("treewide: use Software Package Data Exchange (SPDX) license identifiers"), many dual-licensed files were left with empty comments on top. Remove them to avoid visual noise. Closes #10562	2022-05-13 07:11:58 +02:00
Asias He	77b1db475c	locator: Do not enforce public ip address for broadcast_rpc_address Reported by Felipe Cardeneti: - Create a 2-node Scylla cluster w/ Ec2MultiRegionSnitch - Check system.peers table Scylla (uses public address) ``` cqlsh> select peer,data_center,host_id,preferred_ip,rack,rpc_address,schema_version from system.peers; peer \| data_center \| host_id \| preferred_ip \| rack \| rpc_address \| schema_version ---------------+-------------+--------------------------------------+---------------+------+---------------+-------------------------------------- 18.216.98.219 \| us-east-2 \| d9443741-a12e-4bbb-91ce-9931cece589c \| 172.31.43.122 \| 2c \| 18.216.98.219 \| 95c3fca5-c463-3aba-98c6-1c0b3fac5b58 (1 rows) ``` Cassandra (uses local address): ``` cqlsh> SELECT peer,data_center,host_id,preferred_ip,rack,rpc_address,schema_version from system.peers; peer \| data_center \| host_id \| preferred_ip \| rack \| rpc_address \| schema_version ---------------+-------------+--------------------------------------+---------------+------------+---------------+-------------------------------------- 52.15.104.255 \| us-east-2 \| 42c0b717-775f-4998-a420-0388fe8b4e70 \| 172.31.42.126 \| us-east-2c \| 172.31.42.126 \| 2207c2a9-f598-3971-986b-2926e09e239d (1 rows) ``` Config diff: ``` cassandra.yaml:rpc_address: 0.0.0.0 cassandra.yaml:broadcast_rpc_address: 172.31.42.126 /etc/scylla/scylla.yaml:broadcast_rpc_address: 172.31.42.126 /etc/scylla/scylla.yaml:rpc_address: 0.0.0.0 ``` After this patch, if broadcast_rpc_address is unset, Ec2MultiRegionSnitch will use the public ip address to set broadcast_rpc_address. If broadcast_rpc_address is set, Ec2MultiRegionSnitch will not modify it. Fixes #10236 Closes #10519	2022-05-11 14:46:30 +02:00
Pavel Emelyanov	e502047c74	snitch: Use local gossiper in drivers Each driver has a pointer to this shard snitch_ptr which, in turn, has the reference on gossiper. This lets drivers stop using the global gossiper instance. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-03 10:57:40 +03:00
Pavel Emelyanov	38c77d0d85	snitch: Keep gossiper reference The reference is put on the snitch_ptr because this is the sharded<> thing and because gossiper reference is the same for different snitch drivers. Also, getting gossiper from snitch_ptr by driver will look simpler than getting it from any base class. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-03 10:57:40 +03:00
Pavel Emelyanov	f85e12ffa5	snitch: Move snitch_base::get_endpoint_info() This method is only needed by production_snitch_base inheritants Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-03 10:34:52 +03:00
Avi Kivity	1e1c0226a6	treewide: abort() after switch in formatters It is typical in switch statements to select on an enum type and rely on the compiler to complain if an enum value was missed. But gcc isn't satisfied since the enum could have a value outside the declared list. Call abort() in this impossible situation to pacify it.	2022-04-18 12:27:18 +03:00
Pavel Emelyanov	828a951886	snitch: Remove create_snitch/stop_snitch After previous patches both create_snitch() and stop_snitch() now look like the classical sharded service start/stop sequence. Finally both helpers can be removed and the rest of the user can just call start/stop on locally obtained sharded references. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-11 14:43:25 +03:00
Pavel Emelyanov	20e623f16d	snitch: Simplify stop (and pause_io) Both first stop/pause snitch driver on io-ing shard, then proceed with the rest. This sequence is pretty pointless and here's why. The only non-trivial stop()/pause_io() method out there is in the property-file snitch driver. In it, both methods check if the current shard is the io-ing one, if no -- return back the resolved future, if yes -- go ahead and stop/pause some IO. With this, for all shards but io-ing one there's no point in starting after io-ing one is stopped, they all can start (and finish) in parallel. So what this patch does is just removes the pre-stop/pause kicking of the io-ing shard. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-11 14:43:23 +03:00
Pavel Emelyanov	2e42578dc8	snitch: Move io_is_stopped to property-file driver This whole engine is only used by that driver, there's no point in it sitting on the base class Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-11 14:43:20 +03:00
Pavel Emelyanov	28ecdc66ad	snitch: Remove init_snitch_obj() Now it's just a wrapper around sharded<snitch_ptr>::start() Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-11 14:43:16 +03:00
Pavel Emelyanov	b3eaae629e	snitch: Move instance creation into snitch_ptr constructor Current API to create snitch is not like other services -- there's a dedicated helper that does sharded<>.start() + invoke_on_all(&start) calls. These helpers complicate de-globalization of snitch and rework of services start-stop sequence, things get simpler if snitch uses the same start-stop API as all the others. The first step towards this change is moving the non-waiting parts of snitch initialization code from init_snitch_obj() into snitch_ptr constructor. A note on this change: after patch #2 the snitch_ptr<->driver linkage connects local objects with each other, not container() of any. This is important, because connecting container() would be impossible inside constructor, as the container pointer is initialized by seastar _after_ the service constructor itself. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-11 14:38:35 +03:00
Pavel Emelyanov	633746b87d	snitch: Make config-based construction of all drivers Currently snitch drivers register themselves in class-registry with all sorts of construction options possible. All those different constructors are in fact "config options". When later snitch will declare its dependencies (gossiper and system keyspace), it will require patching all these registrations, which's very inconvenient. This patch introduces the snitch_config struct and replaces all the snitch constructors with the snitch_driver(snitch_config cfg) one. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-11 14:38:34 +03:00
Pavel Emelyanov	fa59ccb89d	snitch: Declare snitch_ptr peering and rework container() method This patch makes the snitch base class reference local snitch_ptr, not its sharded<> container and, respectively, makes the base container() method return _backreference->container() instead. The motivation of this change is, again, in the next patch, which will move snitch_ptr<->driver_object linkage into snitch_ptr constructor. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-11 14:38:32 +03:00
Pavel Emelyanov	552a08ecd0	snitch: Introduce container() method Some snitch drivers want the peering_sharded_service::container() functionality, but they can't directly use it, because the driver class is in fact the pimplification behind the sharded<snitch_ptr> service. To overcome this there's a _my_distributed pointer on the driver base class that points back to sharded<snitch_ptr> object. This patch replaces the direct _my_distributed usage with the container() method that does it and also asserts that the pointer in question is initialized (some drivers already do it, some don't). Other than making the code more peering_sharded_service-like, this patch allows changing _my_distributed into _backreference that points to this shard's snitch_ptr, see next patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-11 14:38:27 +03:00
Pavel Emelyanov	05a32328fc	snitch: Remove gossiper_starting() No longer used Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-01 13:16:09 +03:00
Pavel Emelyanov	41332e183a	snitch: Remove gossip_snitch_info() No longer in use Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-01 13:16:09 +03:00
Pavel Emelyanov	38b0ee9822	property-file snitch: Re-gossip states with the help of .get_app_states() This is the last place that still uses gossip_snitch_info(). It can be reworked to use the get_app_states(), then the former helper can be removed. Another motivation for this is to stop using the _gossiper_started boolean from the base class. This, in turn, will allow to remove the whole gossiper_starting() notification altogether. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-01 13:16:09 +03:00
Pavel Emelyanov	6f71baa472	property-file snitch: Reload state in .start() In its .start() helper the property-file driver does everything but registers the reconnectable helper (like the ec2 m.r. one from the previous patch did). Similarly to ec2 m.r. snitch this one can also register its helper in .start(), before gossiper_starting() is called. One thing to care about in this driver is that some tests start this snitch without starting gossiper, thus an extra protection against not initialized gossiper is needed. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-04-01 13:16:09 +03:00
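
Commits 9167b857e9 and eb678e723b above describe computing the natural endpoints once for strategies that return the same endpoints for every token. The following is a minimal, self-contained sketch of that idea only. It is not the Scylla implementation: the `token` and `endpoint_set` aliases, the `build_replication_map` function, the `everywhere_strategy` struct, and the `calls` counter are hypothetical stand-ins, and the real code yields to the reactor with maybe_yield() inside the loop, which is only marked by a comment here.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-ins for the real types.
using token = std::int64_t;
using endpoint_set = std::set<std::string>;

struct replication_strategy {
    virtual ~replication_strategy() = default;
    // True when every token maps to the same set of endpoints.
    virtual bool has_uniform_natural_endpoints() const { return false; }
    virtual endpoint_set calculate_natural_endpoints(token t) const = 0;
    // Illustration only: counts how often the calculation ran.
    mutable std::size_t calls = 0;
};

struct everywhere_strategy : replication_strategy {
    endpoint_set all_nodes;
    explicit everywhere_strategy(endpoint_set nodes) : all_nodes(std::move(nodes)) {}
    bool has_uniform_natural_endpoints() const override { return true; }
    endpoint_set calculate_natural_endpoints(token) const override {
        ++calls;
        return all_nodes;
    }
};

// Builds a per-token endpoint map, calling the strategy only once
// when it reports uniform endpoints.
std::map<token, endpoint_set>
build_replication_map(const replication_strategy& rs,
                      const std::vector<token>& sorted_tokens) {
    std::map<token, endpoint_set> result;
    if (sorted_tokens.empty()) {
        return result;
    }
    if (rs.has_uniform_natural_endpoints()) {
        const endpoint_set shared = rs.calculate_natural_endpoints(sorted_tokens.front());
        for (token t : sorted_tokens) {
            result.emplace(t, shared);
            // The real code would yield here to avoid reactor stalls.
        }
    } else {
        for (token t : sorted_tokens) {
            result.emplace(t, rs.calculate_natural_endpoints(t));
        }
    }
    return result;
}
```

With an `everywhere_strategy` and four tokens, the map gets four entries while the strategy is called a single time. The per-token copy remains, which is why the commit message still adds yields to the loop.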

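Commit d295d8e280 above makes locator::host_id a strong tagged_uuid type so that it cannot be confused with other uuid-based identifiers. The sketch below shows the general tagged-type technique under simplifying assumptions: the two-field `uuid` struct, the tag structs, and the `table_id` alias are hypothetical illustrations, not the definitions used in the Scylla tree.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical simplified UUID: two 64-bit halves.
struct uuid {
    std::uint64_t msb = 0;
    std::uint64_t lsb = 0;
    friend bool operator==(const uuid& a, const uuid& b) {
        return a.msb == b.msb && a.lsb == b.lsb;
    }
};

// A thin wrapper whose Tag parameter makes each instantiation a distinct type.
template <typename Tag>
struct tagged_uuid {
    uuid id;
    // explicit: a raw uuid never converts silently into a tagged one.
    explicit tagged_uuid(uuid u) : id(u) {}
    friend bool operator==(const tagged_uuid& a, const tagged_uuid& b) {
        return a.id == b.id;
    }
    friend bool operator!=(const tagged_uuid& a, const tagged_uuid& b) {
        return !(a == b);
    }
};

struct host_id_tag {};
struct table_id_tag {};  // hypothetical second identifier kind
using host_id = tagged_uuid<host_id_tag>;
using table_id = tagged_uuid<table_id_tag>;

// The compiler now rejects mixing the two identifier kinds.
static_assert(!std::is_convertible<table_id, host_id>::value,
              "different tags must not convert into each other");
static_assert(!std::is_convertible<uuid, host_id>::value,
              "raw uuid must not convert implicitly");
static_assert(std::is_constructible<host_id, uuid>::value,
              "explicit construction from uuid is allowed");

// Accepts only host ids; passing a table_id fails to compile.
bool same_host(const host_id& a, const host_id& b) {
    return a == b;
}
```

The wrapper adds no runtime state beyond the uuid itself. The benefit is purely at compile time, where a mismatched identifier becomes a type error.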

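Commit 9a8a80527b above stops the ring walk in simple_strategy once the endpoint list already holds every node in the cluster, which matters when the replication factor exceeds the node count. A rough sketch of that early exit follows. The `natural_endpoints` function, the ring layout as a sorted vector of (token, owner) pairs, and the optional `steps` counter are assumptions made for illustration and do not mirror the real code.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

using ring_entry = std::pair<long, std::string>;  // (token, owning node)

// Walks the ring clockwise from search_token and collects distinct owners.
// ring must be sorted by token. steps, if given, counts visited entries.
std::vector<std::string> natural_endpoints(const std::vector<ring_entry>& ring,
                                           long search_token,
                                           std::size_t replication_factor,
                                           std::size_t node_count,
                                           std::size_t* steps = nullptr) {
    std::vector<std::string> endpoints;
    if (ring.empty()) {
        return endpoints;
    }
    // No more than node_count distinct owners can ever be found.
    const std::size_t wanted = std::min(replication_factor, node_count);
    auto it = std::lower_bound(ring.begin(), ring.end(), search_token,
        [](const ring_entry& e, long t) { return e.first < t; });
    // Wrap around to the first entry when the token is past the last one.
    const std::size_t start =
        (it == ring.end()) ? 0 : static_cast<std::size_t>(it - ring.begin());
    for (std::size_t i = 0; i < ring.size(); ++i) {
        if (endpoints.size() == wanted) {
            break;  // early exit: nothing more can be added
        }
        if (steps) {
            ++*steps;
        }
        const std::string& owner = ring[(start + i) % ring.size()].second;
        if (std::find(endpoints.begin(), endpoints.end(), owner) == endpoints.end()) {
            endpoints.push_back(owner);
        }
    }
    return endpoints;
}
```

For a two-node ring with a replication factor of 3, the walk ends after both owners are found. Without the cap on `wanted`, the loop would visit every token on the ring before giving up.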
648 Commits