reader_permit: give better names to active* states
The names of these states have been a source of confusion ever since they were introduced. Give them names which better reflect their true meaning and leave less room for misinterpretation. The changes are:
* active/unused -> active
* active/used -> active/need_cpu
* active/blocked -> active/await

Hopefully the new names do a better job at conveying what these states really mean:
* active - a regular admitted permit, which is active (as opposed to an inactive permit).
* active/need_cpu - an active permit which was marked as needing CPU for the read to make progress. This permit prevents admission of new permits while it is in this state.
* active/await - a former active/need_cpu permit, which has to wait on I/O or a remote shard. While in this state, it doesn't block the admission of new permits (subject to other criteria such as resource availability).
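The relationship between the three renamed states can be illustrated with a minimal sketch. This is a toy model, not the code touched by this commit: `toy_permit`, `permit_state`, `mark_not_need_cpu()`, `mark_not_await()` and `blocks_admission()` are hypothetical names introduced only for illustration, and the sketch leaves out resource accounting and the waiting/inactive/evicted states entirely.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// Simplified stand-in for reader_permit::state; only the active* states are modeled.
enum class permit_state { active, active_need_cpu, active_await };

// Printable names, matching the new names introduced by this commit.
constexpr std::string_view to_string(permit_state s) {
    switch (s) {
    case permit_state::active: return "active";
    case permit_state::active_need_cpu: return "active/need_cpu";
    case permit_state::active_await: return "active/await";
    }
    return "unknown";
}

// Hypothetical toy permit: nested need_cpu/await markers are tracked with counters.
class toy_permit {
    permit_state _state = permit_state::active;
    std::uint64_t _need_cpu_branches = 0;
    std::uint64_t _await_branches = 0;
public:
    permit_state state() const { return _state; }

    // The read needs CPU to make progress.
    void mark_need_cpu() {
        if (++_need_cpu_branches == 1) {
            _state = _await_branches ? permit_state::active_await
                                     : permit_state::active_need_cpu;
        }
    }

    // The last need_cpu marker going away returns the permit to plain active.
    void mark_not_need_cpu() {
        assert(_need_cpu_branches);
        if (--_need_cpu_branches == 0) {
            _state = permit_state::active;
        }
    }

    // The read has to wait on I/O or a remote shard.
    void mark_await() {
        ++_await_branches;
        if (_await_branches == 1 && _state == permit_state::active_need_cpu) {
            _state = permit_state::active_await;
        }
    }

    // The wait is over; a permit that was awaiting needs CPU again.
    void mark_not_await() {
        assert(_await_branches);
        if (--_await_branches == 0 && _state == permit_state::active_await) {
            _state = permit_state::active_need_cpu;
        }
    }

    // Only a permit that currently needs CPU holds up admission of new permits.
    bool blocks_admission() const {
        return _state == permit_state::active_need_cpu;
    }
};
```

In this model a permit moves active -> active/need_cpu -> active/await and back, and only the middle state holds up admission, which is the distinction the new names are meant to make obvious.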
@@ -17,7 +17,7 @@ There are 3 main ways to create permits:
 
 A permit is admitted if the following conditions are met:
 * There are enough resources to admit the read. Currently, each permit takes 1 count resource and 128K memory resource on admission.
-* There are no reads which currently only need CPU to make further progress. Permits can opt-in to participate in this criteria (block other permits from being admitted, while they need more CPU) by being marked as "used".
+* There are no reads which currently only need CPU to make further progress. Permits can opt-in to participate in this criteria (block other permits from being admitted, while they need more CPU) by being marked as "need_cpu".
 
 Reader concurrency semaphore diagnostic dumps
 =============================================
@@ -27,8 +27,8 @@ Example diagnostics dump:
 
 [shard 1] reader_concurrency_semaphore - Semaphore _read_concurrency_sem with 35/100 count and 14858525/209715200 memory resources: timed out, dumping permit diagnostics:
 permits count memory table/description/state
-34 34 14M ks1.table1_mv_0/data-query/active/blocked
-1 1 16K ks1.table1_mv_0/data-query/active/used
+34 34 14M ks1.table1_mv_0/data-query/active/await
+1 1 16K ks1.table1_mv_0/data-query/active/need_cpu
 7 0 0B ks1.table1/data-query/waiting
 1251 0 0B ks1.table1_mv_0/data-query/waiting
 
@@ -47,19 +47,19 @@ The dump contains the following information:
 * Dump of the permit states;
 
 Permits are grouped by table, description, and state, while groups are sorted by memory consumption.
-The first group in this example contains 34 permits, all for reads against table `ks1.table1_mv_0`, all data-query reads and in state `active/blocked`.
+The first group in this example contains 34 permits, all for reads against table `ks1.table1_mv_0`, all data-query reads and in state `active/await`.
 
 Permits have the following states:
 * waiting - the permit is waiting for admission;
-* active/unused - the permit was admitted but doesn't participate in CPU based admission;
-* active/used - the permit was admitted and it participates in CPU based admission;
-* active/blocked - a previously active/used permit, which needs something other than CPU to proceed, it is waiting on I/O or a remote shards;
+* active - the permit was admitted;
+* active/need_cpu - the permit was admitted and it participates in CPU based admission;
+* active/await - a previously active/need_cpu permit, which needs something other than CPU to proceed, it is waiting on I/O or a remote shards;
 * inactive - the read was marked inactive, it can be evicted to make room for admitting more permits if needed;
 * evicted - the read was inactive and then evicted;
 
 The dump can reveal what the bottleneck holding up the reads is:
-* CPU - there will be one active/used permit (there might be active/blocked and active/unused permits too), both count and memory resources are available (not maxed out);
-* Disk - count resource is maxed out by active/blocked permits using up all count resources;
+* CPU - there will be one active/need_cpu permit (there might be active/await and active permits too), both count and memory resources are available (not maxed out);
+* Disk - count resource is maxed out by active/await permits using up all count resources;
 * Memory - memory resource is maxed out (usually even above the limit);
 
 There might be inactive reads if CPU is a bottleneck; otherwise, there shouldn't be any (they should be evicted to free up resources).
@@ -138,7 +138,7 @@ private:
     reader_resources _base_resources;
     bool _base_resources_consumed = false;
     reader_resources _resources;
-    reader_permit::state _state = reader_permit::state::active_unused;
+    reader_permit::state _state = reader_permit::state::active;
     uint64_t _used_branches = 0;
     bool _marked_as_used = false;
     uint64_t _blocked_branches = 0;
@@ -174,14 +174,14 @@ private:
     }
     void on_permit_active() {
         if (_used_branches) {
-            _state = reader_permit::state::active_used;
+            _state = reader_permit::state::active_need_cpu;
             on_permit_used();
             if (_blocked_branches) {
-                _state = reader_permit::state::active_blocked;
+                _state = reader_permit::state::active_await;
                 on_permit_blocked();
             }
         } else {
-            _state = reader_permit::state::active_unused;
+            _state = reader_permit::state::active;
         }
     }
 
@@ -308,7 +308,7 @@ public:
     }
 
     void on_admission() {
-        assert(_state != reader_permit::state::active_blocked);
+        assert(_state != reader_permit::state::active_await);
         on_permit_active();
         consume(_base_resources);
         _base_resources_consumed = true;
@@ -326,7 +326,7 @@ public:
     }
 
     void on_register_as_inactive() {
-        assert(_state == reader_permit::state::active_unused || _state == reader_permit::state::active_used || _state == reader_permit::state::waiting_for_memory);
+        assert(_state == reader_permit::state::active || _state == reader_permit::state::active_need_cpu || _state == reader_permit::state::waiting_for_memory);
         on_permit_inactive(reader_permit::state::inactive);
     }
 
@@ -386,11 +386,11 @@ public:
 
     void mark_used() noexcept {
         ++_used_branches;
-        if (!_marked_as_used && _state == reader_permit::state::active_unused) {
-            _state = reader_permit::state::active_used;
+        if (!_marked_as_used && _state == reader_permit::state::active) {
+            _state = reader_permit::state::active_need_cpu;
             on_permit_used();
             if (_blocked_branches && !_marked_as_blocked) {
-                _state = reader_permit::state::active_blocked;
+                _state = reader_permit::state::active_await;
                 on_permit_blocked();
             }
         }
@@ -406,15 +406,15 @@ public:
             if (_marked_as_blocked) {
                 on_permit_unblocked();
             }
-            _state = reader_permit::state::active_unused;
+            _state = reader_permit::state::active;
             on_permit_unused();
         }
     }
 
     void mark_blocked() noexcept {
         ++_blocked_branches;
-        if (_blocked_branches == 1 && _state == reader_permit::state::active_used) {
-            _state = reader_permit::state::active_blocked;
+        if (_blocked_branches == 1 && _state == reader_permit::state::active_need_cpu) {
+            _state = reader_permit::state::active_await;
             on_permit_blocked();
         }
     }
@@ -423,7 +423,7 @@ public:
         assert(_blocked_branches);
         --_blocked_branches;
         if (_marked_as_blocked && !_blocked_branches) {
-            _state = reader_permit::state::active_used;
+            _state = reader_permit::state::active_need_cpu;
             on_permit_unblocked();
         }
     }
@@ -639,14 +639,14 @@ std::ostream& operator<<(std::ostream& os, reader_permit::state s) {
         case reader_permit::state::waiting_for_execution:
             os << "waiting_for_execution";
             break;
-        case reader_permit::state::active_unused:
-            os << "active/unused";
+        case reader_permit::state::active:
+            os << "active";
             break;
-        case reader_permit::state::active_used:
-            os << "active/used";
+        case reader_permit::state::active_need_cpu:
+            os << "active/need_cpu";
             break;
-        case reader_permit::state::active_blocked:
-            os << "active/blocked";
+        case reader_permit::state::active_await:
+            os << "active/await";
             break;
         case reader_permit::state::inactive:
             os << "inactive";
@@ -1387,9 +1387,9 @@ void reader_concurrency_semaphore::dequeue_permit(reader_permit::impl& permit) {
         case reader_permit::state::evicted:
             --_stats.inactive_reads;
             break;
-        case reader_permit::state::active_unused:
-        case reader_permit::state::active_used:
-        case reader_permit::state::active_blocked:
+        case reader_permit::state::active:
+        case reader_permit::state::active_need_cpu:
+        case reader_permit::state::active_await:
             on_internal_error_noexcept(rcslog, format("reader_concurrency_semaphore::dequeue_permit(): unrecognized queued state: {}", permit.get_state()));
     }
     permit.unlink();
@@ -83,9 +83,9 @@ public:
         waiting_for_admission,
         waiting_for_memory,
         waiting_for_execution,
-        active_unused,
-        active_used,
-        active_blocked,
+        active,
+        active_need_cpu,
+        active_await,
         inactive,
         evicted,
     };