storage_proxy: fix race and crash in case of MV and other node shutdown

Recently, in merge commit 2718c90448, we added the ability to cancel pending view-update requests when we detect that the target node went down. This is important for view updates because these have a very long timeout (5 minutes), and we wanted to make this timeout even longer.

However, the implementation caused a race: between *creating* the update's request handler (create_write_response_handler()) and actually starting the request with this handler (mutate_begin()), there is a preemption point, and we may end up deleting the request handler before starting the request. So mutate_begin() must gracefully handle the case of a missing request handler, and not crash with a segmentation fault as it did before this patch.

Eventually the lifetime management of request handlers could be refactored to avoid this delicate fix (which requires more comments to explain than code). Even better, it would be more correct to cancel individual writes when a node goes down, rather than drop the entire handler (see issue #4523). However, for now, let's not make such invasive changes and just fix the bug that we set out to fix.

Fixes #4386.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20190620123949.22123-1-nyh@scylladb.com>
(cherry picked from commit 6e87bca65d)

@@ -1447,6 +1447,22 @@ future<> storage_proxy::mutate_begin(std::vector<unique_response_handler> ids, d
                                      stdx::optional<clock_type::time_point> timeout_opt) {
     return parallel_for_each(ids, [this, cl, timeout_opt] (unique_response_handler& protected_response) {
         auto response_id = protected_response.id;
+        // This function, mutate_begin(), is called after a preemption point
+        // so it's possible that other code besides our caller just ran. In
+        // particular, Scylla may have noticed that a remote node went down,
+        // called storage_proxy::on_down(), and removed some of the ongoing
+        // handlers, including this id. If this happens, we need to ignore
+        // this id - not try to look it up or start a send.
+        if (_response_handlers.find(response_id) == _response_handlers.end()) {
+            protected_response.release(); // Don't try to remove this id again
+            // Requests that time-out normally below after response_wait()
+            // result in an exception (see ~abstract_write_response_handler())
+            // However, here we no longer have the handler or its information
+            // to put in the exception. The exception is not needed for
+            // correctness (e.g., hints are written by timeout_cb(), not
+            // because of an exception here).
+            return make_exception_future<>(std::runtime_error("unstarted write cancelled"));
+        }
         // it is better to send first and hint afterwards to reduce latency
         // but request may complete before hint_to_dead_endpoints() is called and
         // response_id handler will be removed, so we will have to do hint with separate
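
The guard that this patch adds to mutate_begin() can be sketched in isolation, without Seastar. The sketch below is illustrative only and is not Scylla code: proxy_sketch, handler, create_handler(), begin(), has_handler() and started() are hypothetical names, and it throws synchronously where the real code returns an exceptional future. It assumes the same shape as the patch: handlers live in a map keyed by id, a node-down callback may erase them between creation and start, and the start path must look the id up before using it.

```cpp
#include <cstdint>
#include <stdexcept>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical stand-in for a write response handler.
struct handler {
    std::string target;   // node this write is addressed to
    bool started = false; // set once the request has been sent
};

// Hypothetical stand-in for the handler registry kept by storage_proxy.
class proxy_sketch {
    std::unordered_map<std::uint64_t, handler> _handlers;
    std::uint64_t _next_id = 0;
public:
    // Models create_write_response_handler(): register a handler, return its id.
    std::uint64_t create_handler(std::string target) {
        auto id = _next_id++;
        _handlers.emplace(id, handler{std::move(target)});
        return id;
    }

    // Models on_down(): drop every pending handler aimed at the dead node.
    // In the real system this may run at a preemption point between
    // create_handler() and begin().
    void on_down(const std::string& node) {
        for (auto it = _handlers.begin(); it != _handlers.end();) {
            if (it->second.target == node) {
                it = _handlers.erase(it);
            } else {
                ++it;
            }
        }
    }

    // Models the guarded start path: never dereference a handler that
    // may have been removed since it was created.
    void begin(std::uint64_t id) {
        auto it = _handlers.find(id);
        if (it == _handlers.end()) {
            throw std::runtime_error("unstarted write cancelled");
        }
        it->second.started = true;
    }

    bool has_handler(std::uint64_t id) const {
        return _handlers.find(id) != _handlers.end();
    }

    bool started(std::uint64_t id) const {
        auto it = _handlers.find(id);
        return it != _handlers.end() && it->second.started;
    }
};
```

In this sketch, calling on_down("node1") between create_handler("node1") and begin() makes begin() report a cancelled write instead of touching a missing map entry, while handlers for other nodes start normally.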