Merge "fix slow truncation under flush pressure" from Glauber

Truncating a table is very slow if the system is under pressure. Because
in that case we mostly just want to get rid of the existing data, it
shouldn't take this long. The problem happens because truncate has to
wait for memtable flushes to end, twice. This is regardless of whether
or not the table being truncated has any data.

1. The first time is when we call truncate itself:

if auto_snapshot is enabled, we flush the contents of this table first,
and that is expected to be slow. However, even if auto_snapshot is
disabled we still flush if the table is marked as durable, which is a
bug. We should simply not flush in this case.

2. The second time is when we call cf->stop(). Stopping a table will
wait for a flush to finish. At this point, regardless of which path
(durable or non-durable) we took in the previous step, we will have no
more data in the table. However, calling `flush()` still needs to acquire
a flush_permit, which means we will wait for whichever memtable is
flushing at that very moment to end.
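
The flush decision from the first point can be sketched as a pair of
predicates. This is only an illustration: the helper names
should_flush_old and should_flush_new are hypothetical and do not exist
in the tree, where the decision is a single local variable inside
database::truncate().

```cpp
#include <cassert>

// Hypothetical helper, not part of the code base: the behaviour described
// above, where a durable keyspace forces a flush even without a snapshot.
constexpr bool should_flush_old(bool durable, bool auto_snapshot) {
    return durable || auto_snapshot;
}

// Hypothetical helper: the intended behaviour, where only a pending
// snapshot justifies flushing before the data is dropped.
constexpr bool should_flush_new(bool /*durable*/, bool auto_snapshot) {
    return auto_snapshot;
}

// The two predicates disagree only for durable keyspaces with
// auto_snapshot disabled, which is exactly the slow case.
static_assert(should_flush_old(true, false) && !should_flush_new(true, false),
              "durable without auto_snapshot is the only differing case");
static_assert(should_flush_old(true, true) == should_flush_new(true, true),
              "auto_snapshot always flushes");
static_assert(should_flush_old(false, true) == should_flush_new(false, true),
              "auto_snapshot always flushes");
static_assert(should_flush_old(false, false) == should_flush_new(false, false),
              "neither flag set never flushes");
```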

If the system is under pressure and a memtable flush takes many
seconds, so will truncate. Even if auto_snapshot is enabled, we
shouldn't have to flush twice. The first flush should already put us in
a state in which the next one is immediate (maybe by holding on to the
permit, maybe by destroying the memtable_list already at that point,
since no other memtables should be created).

If auto_snapshot is not enabled, the whole thing should just be
instantaneous.

This patchset fixes that by removing the need to flush when
!auto_snapshot, and by special-casing the flush of an empty table.
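
A minimal sketch of the empty-table special case, under loud
assumptions: toy_memtable and toy_memtable_list are hypothetical
stand-ins, and request_flush() returns a bool here instead of a future.
It only shows that a list whose memtables hold no data answers a flush
request without ever queueing for a permit.

```cpp
#include <cassert>
#include <memory>
#include <vector>

// Hypothetical stand-in for a memtable: it only tracks whether it holds data.
struct toy_memtable {
    bool has_data = false;
    bool empty() const { return !has_data; }
};

// Hypothetical stand-in for memtable_list.
struct toy_memtable_list {
    std::vector<std::shared_ptr<toy_memtable>> memtables;
    bool flushable = true;      // plays the role of may_flush()
    int permits_requested = 0;  // counts how often we would wait for a permit

    bool may_flush() const { return flushable; }

    // The list is empty when none of its memtables holds data.
    bool empty() const {
        for (auto& m : memtables) {
            if (!m->empty()) {
                return false;
            }
        }
        return true;
    }

    // Returns true when the request completes immediately (the toy
    // equivalent of a ready future), false when it would have to wait
    // for a flush permit.
    bool request_flush() {
        if (empty() || !may_flush()) {
            return true;
        }
        ++permits_requested;
        return false;
    }
};
```

With this shape, a flush request against a just-truncated table never
queues behind whichever memtable happens to be flushing at that moment.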

Fixes #4294

* git@github.com:glommer/scylla.git slowtruncate-v2:
  database: immediately flush tables with no memtables.
  truncate: do not flush memtables if auto_snapshot is false.
Tomasz Grabiec
2019-03-06 13:54:58 +01:00
2 changed files with 10 additions and 3 deletions

@@ -1283,7 +1283,7 @@ future<> dirty_memory_manager::shutdown() {
 }
 future<> memtable_list::request_flush() {
-    if (!may_flush()) {
+    if (empty() || !may_flush()) {
         return make_ready_future<>();
     } else if (!_flush_coalescing) {
         _flush_coalescing = shared_promise<>();
@@ -1684,9 +1684,8 @@ future<> database::truncate(sstring ksname, sstring cfname, timestamp_func tsf)
 future<> database::truncate(const keyspace& ks, column_family& cf, timestamp_func tsf, bool with_snapshot) {
     return cf.run_async([this, &ks, &cf, tsf = std::move(tsf), with_snapshot] {
-        const auto durable = ks.metadata()->durable_writes();
         const auto auto_snapshot = with_snapshot && get_config().auto_snapshot();
-        const auto should_flush = durable || auto_snapshot;
+        const auto should_flush = auto_snapshot;
         // Force mutations coming in to re-acquire higher rp:s
         // This creates a "soft" ordering, in that we will guarantee that

@@ -201,6 +201,14 @@ public:
         return bool(_seal_immediate_fn);
     }
+    bool empty() const {
+        for (auto& m : _memtables) {
+            if (!m->empty()) {
+                return false;
+            }
+        }
+        return true;
+    }
     shared_memtable back() {
         return _memtables.back();
     }