storage_service: Set NORMAL status after token_metadata is replicated

Commit 2d5fb9d109 (gms/gossiper: Replicate changes incrementally to
other shards) changed the way we replicate _token_metadata and
endpoint_state_map. Before that commit they were replicated at the same
time; after it they no longer are. As a result, a shard in NORMAL status
can still have an empty _token_metadata.

We saw errors:

   [shard 12] token_metadata - sorted_tokens is empty in first_token_index!

during CorruptThenRepairNemesis.

Fix by setting the gossip status to NORMAL only after _token_metadata
has been replicated, so that once a node is in NORMAL status it is safe
to run repair. Commit 69c81bcc87 (repair: Do not allow repair until node
is in NORMAL status) prevents an early repair operation by checking
whether a node is in NORMAL status.

Fixes #3121

Message-Id: <af6a223733d2e11351f1fa35f59eacfa7d65dd30.1516065564.git.asias@scylladb.com>
This commit is contained in:
Asias He
2018-01-16 09:19:42 +08:00
committed by Avi Kivity
parent 2b0b703615
commit 3c8ed255ac

@@ -1161,10 +1161,10 @@ void storage_service::set_tokens(std::unordered_set<token> tokens) {
     slogger.debug("Setting tokens to {}", tokens);
     db::system_keyspace::update_tokens(tokens).get();
     auto local_tokens = get_local_tokens().get0();
-    set_gossip_tokens(local_tokens);
     _token_metadata.update_normal_tokens(tokens, get_broadcast_address());
-    set_mode(mode::NORMAL, "node is now in normal status", true);
     replicate_to_all_cores().get();
+    set_gossip_tokens(local_tokens);
+    set_mode(mode::NORMAL, "node is now in normal status", true);
 }
 void storage_service::set_gossip_tokens(const std::unordered_set<dht::token>& local_tokens) {