compaction: compact_all_sstables demo function

This is an example of how to use the low-level compact_sstables() function
to compact all the sstables of one column family into one. It is not a
full-fledged "compaction strategy", but real strategies can be based on
this example.

One thing this code does not do yet is delete the old sstables. In the
future, this should happen automatically in the sstable destructor once
all references to the sstable are gone.
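
The destructor-based deletion described above is not part of this commit. A minimal sketch of the idea, assuming a hypothetical tracked_file class (a stand-in for an sstable, not the real sstables::sstable API) and std::shared_ptr in place of lw_shared_ptr, could look like this:

```cpp
#include <cassert>
#include <filesystem>
#include <fstream>
#include <memory>
#include <system_error>
#include <utility>

namespace fs = std::filesystem;

// Hypothetical stand-in for an sstable: owns a single on-disk file.
class tracked_file {
    fs::path _path;
    bool _marked_for_deletion = false;
public:
    explicit tracked_file(fs::path p) : _path(std::move(p)) {
        std::ofstream out(_path);
        out << "data";
        // The demo relies on the file actually existing on disk.
        assert(out.good());
    }
    tracked_file(const tracked_file&) = delete;
    tracked_file& operator=(const tracked_file&) = delete;

    // Called once the file's contents have been compacted elsewhere.
    void mark_for_deletion() { _marked_for_deletion = true; }

    // Runs only when the last shared_ptr reference is dropped, so
    // readers that still hold a reference keep a valid file.
    ~tracked_file() {
        if (_marked_for_deletion) {
            std::error_code ec;
            fs::remove(_path, ec);  // non-throwing overload: safe in a destructor
        }
    }
};
```

With this shape, compaction only has to mark its inputs and drop its own references. The file disappears when the last reader lets go of it.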

Signed-off-by: Nadav Har'El <nyh@cloudius-systems.com>
Nadav Har'El
2015-06-25 10:11:12 +03:00
committed by Avi Kivity
parent b54d35dcbb
commit 3f5114e415
2 changed files with 56 additions and 0 deletions

@@ -19,6 +19,7 @@
#include <boost/algorithm/string/classification.hpp>
#include <boost/algorithm/string/split.hpp>
#include "sstables/sstables.hh"
#include "sstables/compaction.hh"
#include <boost/range/adaptor/transformed.hpp>
#include <boost/range/adaptor/map.hpp>
#include "locator/simple_snitch.hh"
@@ -496,6 +497,56 @@ column_family::seal_active_memtable(database* db) {
// FIXME: provide back-pressure to upper layers
}
// FIXME: this is just an example, should be changed to something more general
// Note: We assume that the column_family does not get destroyed during compaction.
future<>
column_family::compact_all_sstables() {
    auto sstables_to_compact =
        make_lw_shared<std::vector<sstables::shared_sstable>>();
    for (auto&& entry : *_sstables) {
        sstables_to_compact->push_back(entry.second);
    }
    auto new_tables = make_lw_shared<std::vector<
            std::pair<unsigned, sstables::shared_sstable>>>();
    auto create_sstable = [this, new_tables] {
        // FIXME: this generation calculation should be in a function.
        auto gen = _sstable_generation++ * smp::count + engine().cpu_id();
        // FIXME: use "tmp" marker in names of incomplete sstable
        auto sst = make_lw_shared<sstables::sstable>(_config.datadir, gen,
                sstables::sstable::version_types::la,
                sstables::sstable::format_types::big);
        new_tables->emplace_back(gen, sst);
        return sst;
    };
    return sstables::compact_sstables(*sstables_to_compact, _schema,
            create_sstable).then([this, new_tables, sstables_to_compact] {
        // Build a new list of _sstables: We remove from the existing list the
        // tables we compacted (by now, there might be more sstables flushed
        // later), and we add the new tables generated by the compaction.
        // We create a new list rather than modifying it in-place, so that
        // on-going reads can continue to use the old list.
        auto current_sstables = _sstables;
        _sstables = make_lw_shared<sstable_list>();
        std::unordered_set<sstables::shared_sstable> s(
                sstables_to_compact->begin(), sstables_to_compact->end());
        for (const auto& oldtab : *current_sstables) {
            if (!s.count(oldtab.second)) {
                _sstables->emplace(oldtab.first, oldtab.second);
            }
        }
        // FIXME: We need to make sure that destructing an sstable object
        // deletes it from disk - otherwise old sstables will not be deleted
        // after compaction.
        for (const auto& newtab : *new_tables) {
            // FIXME: rename the new sstable(s). Verify a rename doesn't cause
            // problems for the sstable object.
            _sstables->emplace(newtab.first, newtab.second);
        }
    });
}
future<> column_family::populate(sstring sstdir) {
return lister::scan_dir(sstdir, directory_entry_type::regular, [this, sstdir] (directory_entry de) {
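
The list rebuild in the continuation of compact_all_sstables() can be tried outside Seastar. The sketch below assumes std::shared_ptr and std::map in place of lw_shared_ptr and sstable_list. The names table, table_list and rebuild_list are hypothetical and are not part of this commit:

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical stand-ins for sstables::sstable and sstable_list.
struct table { unsigned generation; };
using shared_table = std::shared_ptr<table>;
using table_list = std::map<unsigned, shared_table>;

// Returns a fresh list holding every entry of `current` that was not
// compacted, plus the compaction outputs. `current` itself is never
// modified, so readers holding the old list are unaffected.
std::shared_ptr<table_list> rebuild_list(
        const table_list& current,
        const std::vector<shared_table>& compacted,
        const std::vector<std::pair<unsigned, shared_table>>& new_tables) {
    auto result = std::make_shared<table_list>();
    std::unordered_set<shared_table> gone(compacted.begin(), compacted.end());
    for (const auto& old : current) {
        // Tables flushed after the compaction started are not in `gone`
        // and are therefore carried over.
        if (!gone.count(old.second)) {
            result->emplace(old.first, old.second);
        }
    }
    for (const auto& fresh : new_tables) {
        assert(fresh.second);  // a compaction output must be a real table
        result->emplace(fresh.first, fresh.second);
    }
    return result;
}
```

A caller would keep a shared pointer to the live list and replace it with the result of rebuild_list(). Any reader that copied the old pointer earlier still sees the old, complete set of tables.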

@@ -146,6 +146,11 @@ public:
seal_active_memtable(db);
return _in_flight_seals.close();
}
// FIXME: this is just an example, should be changed to something more
// general. compact_all_sstables() starts a compaction of all sstables.
// It doesn't flush the current memtable first. It's just an ad-hoc method,
// not a real compaction policy.
future<> compact_all_sstables();
private:
seastar::gate _in_flight_seals;
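
One of the FIXMEs in compact_all_sstables() asks for the generation calculation to be moved into a function. A possible shape, assuming a hypothetical free function make_generation (not part of this commit) that takes the shard id and shard count as plain arguments instead of reading engine().cpu_id() and smp::count:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: combines a per-shard counter with the shard id.
// Since generation % shard_count == shard, two different shards can
// never produce the same generation number.
inline std::uint64_t make_generation(std::uint64_t local_counter,
                                     unsigned shard,
                                     unsigned shard_count) {
    assert(shard_count > 0 && shard < shard_count);
    return local_counter * shard_count + shard;
}
```

With four shards, shard 1 would produce generations 1, 5, 9 and so on, while shard 0 would produce 0, 4, 8.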