Store schema_ptr in reader permit instead of storing a const pointer to
schema to ensure that the schema doesn't get changed elsewhere when the
permit is holding on to it. Also update the constructors and all the
relevant callers to pass down schema_ptr instead of a raw pointer.
Fixes #16180
Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
Closes scylladb/scylladb#16658
State changes are processed as a batch and
there is no reason to maintain them as an ordered map.
Instead, use a std::unordered_map that is more efficient.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
A storage group is the storage of a tablet. This new concept is helpful
for tablet splitting, where the storage of a tablet will be split
into multiple compaction groups, each of which can be compacted
independently.
The reason for not going with the arena concept is that it added
complexity, and it felt much more elegant to keep the compaction
group unchanged, since at the end of the day it abstracts the concept
of a set of sstables that can be compacted and operated on
independently.
When splitting, the storage group for a tablet may therefore own
multiple compaction groups, left, right, and main, where main
keeps the data that needs splitting. When splitting completes,
only left and right compaction groups will be populated.
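A minimal sketch of the main/left/right layout described above, in Python for brevity. The class names, the token-list stand-in for sstables, and the midpoint split are all assumptions made for illustration; they are not the actual C++ types.

```python
from dataclasses import dataclass, field


@dataclass
class CompactionGroup:
    """A set of sstables that can be compacted independently.

    Each "sstable" is modelled as a plain list of tokens (an assumption).
    """
    sstables: list = field(default_factory=list)


@dataclass
class StorageGroup:
    """Hypothetical model of a tablet's storage: main plus two split targets."""
    first_token: int
    last_token: int
    main: CompactionGroup = field(default_factory=CompactionGroup)
    left: CompactionGroup = field(default_factory=CompactionGroup)
    right: CompactionGroup = field(default_factory=CompactionGroup)

    def split(self):
        """Move everything out of main into left or right, by range midpoint."""
        mid = (self.first_token + self.last_token) // 2
        for sst in self.main.sstables:
            lo = [t for t in sst if t <= mid]
            hi = [t for t in sst if t > mid]
            if lo:
                self.left.sstables.append(lo)
            if hi:
                self.right.sstables.append(hi)
        # Once splitting completes, only left and right are populated.
        self.main.sstables.clear()


sg = StorageGroup(first_token=0, last_token=99)
sg.main.sstables.append([5, 40, 60, 95])
sg.main.sstables.append([10, 20])
sg.split()
```

After split() returns, main is empty and each of left and right can be compacted on its own.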
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Reduce code duplication by defining each metric just once, instead of three times, by having the semaphore register metrics by itself. This also makes the lifecycle of metrics contained in that of the semaphore. This is important on enterprise where semaphores are added and removed, together with service levels.
We don't want all semaphores to export metrics, so a new parameter is introduced and all call-sites make a call whether they opt-in or not.
Fixes: https://github.com/scylladb/scylladb/issues/16402
Closes scylladb/scylladb#16383
* github.com:scylladb/scylladb:
database, reader_concurrency_semaphore: deduplicate reader_concurrency_semaphore metrics
reader_concurrency_semaphore: add register_metrics constructor parameter
sstables: name sstables_manager
The reader_concurrency_semaphore metrics are triplicated: each metric is
registered separately for the streaming, user, and system classes.
To fix this, move the metrics registration from database to
reader_concurrency_semaphore, so that each reader_concurrency_semaphore
instance registers its own metrics (if its creator asked for it).
Adjust the names given to each reader_concurrency_semaphore so the
metric labels don't change.
scylla-gdb is adjusted to support the new names.
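The idea of a semaphore that owns its metrics can be sketched as follows. This is a Python illustration only: the registry dict, the metric names, and the close() method are hypothetical stand-ins, not the real C++ API.

```python
# Hypothetical stand-in for a metrics registry: (metric, label) -> getter.
METRICS_REGISTRY = {}


class ReaderConcurrencySemaphore:
    def __init__(self, name, count, register_metrics=False):
        self.name = name
        self.count = count
        self.queued = 0
        self._metric_keys = []
        # Call sites decide explicitly whether this semaphore exports metrics.
        if register_metrics:
            self._register_metrics()

    def _register_metrics(self):
        # Each metric is defined once here and labelled with the semaphore name.
        metrics = (
            ("available_count", lambda: self.count),
            ("queued", lambda: self.queued),
        )
        for metric, getter in metrics:
            key = (metric, self.name)
            METRICS_REGISTRY[key] = getter
            self._metric_keys.append(key)

    def close(self):
        # The metrics' lifecycle is contained in that of the semaphore.
        for key in self._metric_keys:
            METRICS_REGISTRY.pop(key, None)
        self._metric_keys.clear()


user = ReaderConcurrencySemaphore("user", 100, register_metrics=True)
internal = ReaderConcurrencySemaphore("internal", 10)  # opts out
```

Removing a semaphore (for example together with its service level) then removes its metrics as well, without the owner having to track them.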
* seastar bab1625c...17183ed4 (73):
> thread_pool: Reference reactor, not point to
> sstring: inherit publicly from string_view formatter
> circleci: use conditional steps
> weak_ptr: include used header
> build: disable the -Wunused-* warnings for checkheaders
> resource: move variable into smaller lexical scope
> resource: use structured binding when appropriate
> httpd: Added server and client addresses to request structure
> io_queue: do not dereference moved-away shared pointer
> treewide: explicitly define ctor and assignment operator
> memory: use `err` for the error string
> doc: Add document describing all the math behind IO scheduler
> io_queue: Add flow-rate based self slowdown backlink
> io_queue: Make main throttler uncapped
> io_queue: Add queue-wide metrics
> io_queue: Introduce "flow monitor"
> io_queue: Count total number of dispatched and completed requests so far
> io_queue: Introduce io_group::io_latency_goal()
> tests: test the vector overload for when_all_succeed
> core: add a vector overload to when_all_succeed
> loop: Fix iterator_range_estimate_vector_capacity for random iters
> loop: Add test for iterator_range_estimate_vector_capacity
> core/posix return old behaviour using non-portable pthread_attr_setaffinity_np when present
> memory: s/throw()/noexcept/
> build: enable -Wdeprecated compiler option
> reactor: mark kernel_completion's dtor protected
> tests: always wait for promise
> http, json, net: define-generated copy ctor for polymorphic types
> treewide: do not define constexpr static out-of-line
> reactor: do not define dtor of kernel_completion
> http/exception: stop using dynamic exception specification
> metrics: replace vector with deque
> metrics: change metadata vector to deque
> utils/backtrace.hh: make simple_backtrace formattable
> reactor: Unfriend disk_config_params
> reactor: Move add_to_flush_poller() to internal namespace
> reactor: Unfriend a bunch of sched group template calls
> rpc_test: Test rpc send glitches
> net: Implement batch flush support for existing sockets
> iostream: Configure batch flushes if sink can do it
> net: Added remote address accessors
> circleci: update the image to CircleCI "standard" image
> build: do not add header check target if no headers to check
> build: pass target name to seastar_check_self_contained
> build: detect glibc features using CMake
> build: extract bits checking libc into CheckLibc.cmake
> http/exception: add formatter for httpd::base_exception
> http/client: Mark write_body() const
> http/client: Introduce request::_bytes_written
> http/client: Mark maybe_wait_for_continue() const
> http/client: Mark send_request_head() const
> http/client: Detach setup_request()
> http/api_docs: copy in api_docs's copy constructor
> script: do not inherit from object
> scripts: addr2line: change StdinBacktraceIterator to a function
> scripts: addr2line: use yield instead defining a class
> tests: skip tests that require backtrace if execinfo.h is not found
> backtrace: check for existence of execinfo.h
> core: use ino_t and off_t as glibc sets these to 64bit if 64bit api is used
> core: add sleep_abortable instantiation for manual_clock
> tls: Return EPIPE exception when writing to shutdown socket
> http/client: Don't cache connection if server advertises it
> http/client: Mark connection as "keep in cache"
> core: fix strerror_r usage from glibc extension
> reactor: access sigevent.sigev_notify_thread_id with a macro
> posix: use pthread_setaffinity_np instead of pthread_attr_setaffinity_np
> reactor: replace __mode_t with mode_t
> reactor: change sys/poll.h to posix poll.h
> rpc: Add unit test for per-domain metrics
> rpc: Report client connections metrics
> rpc: Count dead client stats
> rpc: Add seastar::rpc::metrics
> rpc: Make public queues length getters
io-scheduler fixes
refs: #15312
refs: #11805
http client fixes
refs: #13736
refs: #15509
rpc fixes
refs: #15462
Closes scylladb/scylladb#15774
This commit changes the interface to return an endpoint_state_ptr, defined as
using endpoint_state_ptr = lw_shared_ptr<const endpoint_state>
so that users get a snapshot of the endpoint_state
that they must not modify in place.
Internally, the gossiper still has the legacy helpers
to manage the endpoint_state.
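The snapshot semantics can be illustrated with a small Python sketch. The Gossiper class, its update() method, and the field names below are assumptions made for the example; a frozen dataclass plus a read-only mapping stands in for lw_shared_ptr<const endpoint_state>.

```python
from dataclasses import dataclass
from types import MappingProxyType


@dataclass(frozen=True)
class EndpointState:
    generation: int
    app_states: MappingProxyType  # read-only view of the application states


class Gossiper:
    def __init__(self):
        self._states = {}

    def get_endpoint_state_ptr(self, endpoint):
        # Callers get a reference to an immutable snapshot (or None).
        return self._states.get(endpoint)

    def update(self, endpoint, generation, **app_states):
        # Updates never touch an existing snapshot; they publish a new one.
        old = self._states.get(endpoint)
        merged = dict(old.app_states) if old else {}
        merged.update(app_states)
        self._states[endpoint] = EndpointState(generation, MappingProxyType(merged))


g = Gossiper()
g.update("10.0.0.1", 1, status="NORMAL")
snapshot = g.get_endpoint_state_ptr("10.0.0.1")
g.update("10.0.0.1", 2, status="LEFT")
# snapshot still shows generation 1 / NORMAL; a fresh lookup shows the new state.
```

A holder of the snapshot sees a consistent view even while the gossiper keeps applying state changes.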
Fixes scylladb/scylladb#14799
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
This field is about to be removed in newer seastar, so it
shouldn't be checked in scylla-gdb
(see also ae6fdf1599)
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes #15203
scylla-gdb.py has two methods for iterating over all tables:
* all_tables()
* for_each_table()
Despite this, many places in the code iterate over the column family map
directly. This patch leaves just a single method (for_each_table()) and
migrates all the codebase to use it, instead of iterating over the raw
map. While at it, access to the map is made backward compatible with
pre-52afd9d42d code: that commit wrapped database::_column_families in a
tables_metadata object, which broke scylla-gdb.py for older versions.
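A rough sketch of the backward-compatible access, using plain dicts in place of gdb.Value objects so it runs outside gdb. The member name `_tables_metadata` and the use of KeyError are assumptions; inside gdb a missing member would surface as a gdb error instead.

```python
def column_families(db):
    """Return the column family map for both the new and the old layout."""
    try:
        # Newer layout: the map lives inside the tables_metadata wrapper.
        return db["_tables_metadata"]["_column_families"]
    except KeyError:
        # Pre-52afd9d42d layout: the map is a direct member of database.
        return db["_column_families"]


def for_each_table(db):
    """The single iteration helper that all commands are meant to use."""
    for table_id, table in column_families(db).items():
        yield table_id, table


old_db = {"_column_families": {1: "ks.a"}}
new_db = {"_tables_metadata": {"_column_families": {2: "ks.b"}}}
tables = list(for_each_table(old_db)) + list(for_each_table(new_db))
```

Keeping the layout probe in one helper means a future change to the container only needs to be handled in one place.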
Closes #15121
We aim for a large number of tablets, therefore let's switch
to chunked_vector to avoid large contiguous allocs.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
As a preparation for ensuring access safety for column families
related maps, add tables_metadata, access to members of which
would be protected by rwlock.
The tables command has a -k|--keyspace argument that is supposed to
filter tables belonging to a specific keyspace, but it doesn't work. Fix it.
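For illustration, the intended filtering behaviour could look like the sketch below. It uses argparse on a plain list of (keyspace, table) pairs; the function name and the data shape are assumptions, and the real command works on gdb values.

```python
import argparse


def filter_tables(tables, argv):
    """Return the (keyspace, table) pairs selected by the command arguments."""
    parser = argparse.ArgumentParser(prog="scylla tables")
    parser.add_argument("-k", "--keyspace", help="only list tables of this keyspace")
    args = parser.parse_args(argv)
    return [
        (ks, cf)
        for ks, cf in tables
        if args.keyspace is None or ks == args.keyspace
    ]


all_tables = [("system", "local"), ("ks", "cf1"), ("ks", "cf2")]
only_ks = filter_tables(all_tables, ["-k", "ks"])
```

Without the option, every table is listed, which matches the command's previous output.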
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes #14634
The scylla netw command prints clients from index [0] only, but there
are more of them on the messaging service. Print them all.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes #14633
The command is to print interesting and/or hard-to-get-by-hand info about individual tables
Closes #14635
* github.com:scylladb/scylladb:
test: Add 'scylla table' cmd test
scylla-gdb: Print table phased barriers
scylla-gdb: Add 'table' command
to inspect the sstable generation after uuid-based generation
change. in this change:
* a pretty printer for sstable::generation_type is added
* now that the pretty printer for the generation_type is registered,
we can just leverage it when printing the sstable name, so
instead of checking if `_generation` member variable contains
`_value`, we delegate it to `str()`, which is used by
`str.format()`. as the behavior of `str()` is similar to that of
the gdb `print` command, and calls `value.format_string()`, which
in turn calls into `to_string()` if the "value" in question has
a pretty printer.
after this change, the printer is able to print both the generations
before the uuid change and the ones after the change.
a typical gdb session looks like:
```
(gdb) p generation._value
$5 = f0770b40-1c7c-11ee-b136-bf28f8d18b88
(gdb) p generation
$10 = 3g7g_0bu7_0jpvk2p0mmtlsb8lu0
(gdb) p/x generation._value.least_sig_bits
$7 = 0xb136bf28f8d18b88
(gdb) p/x generation._value.most_sig_bits
$8 = 0xf0770b401c7c11ee
```
if we use `scripts/base36-uuid.py` to encode
the msb and lsb, we'd need to:
```console
scripts/base36-uuid.py -e 0xf0770b401c7c11ee 0xb136bf28f8d18b88
3g7g_0bu7_0jpvk2p0mmtlsb8lu0
```
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #14561
These barriers show if there's any operation in progress (read, write,
flush or stream). These are crucial to know if stopping fails, e.g. see
issue #13100.
These barriers are summarized in the 'scylla memory' command, but they are
also good to know on a per-table basis.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There's the 'scylla tables' command that lists tables on the given/current
shard, but the list cannot show much information. It prints the
table address so the table can be explored by hand, but some data is handier
to have parsed and printed by the script.
The syntax is
$ scylla table ks.cf
For now just print the schema version. To be extended in the future.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
In that level no io_priority_class-es exist. Instead, all the IO happens
in the context of current sched-group. File API no longer accepts prio
class argument (and makes io_intent arg mandatory to impls).
So the change consists of
- removing all usage of io_priority_class
- patching file_impl's descendants to the updated API
- priority manager goes away altogether
- IO bandwidth update is performed on respective sched group
- tune-up scylla-gdb.py io_queues command
The first change is huge and was made semi-automatically by:
- grep io_priority_class | default_priority_class
- remove all calls, found methods' args and class' fields
Patching file_impl-s is smaller, but also mechanical:
- replace io_priority_class& argument with io_intent* one
- pass intent to lower file (if applicable)
Dropping the priority manager is:
- git-rm .cc and .hh
- sed out all the #include-s
- fix configure.py and cmakefile
The scylla-gdb.py update is a bit hairy: it needs to use the task queues
list for IO class names and shares, but to detect whether it should, it
checks whether the "commitlog" group is present.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes #13963
The command prints the segment_manager address, because it's the manager
that's of interest, not the db::commitlog itself. It also prints out all
found segments for convenience: segments are in a vector of
shared pointers and it's handy to have object addresses instantly.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes #14088
All users of global proxy are gone (*), proxy can be made fully main/cql_test_env local.
(*) one test case still needs it, but can get it via cql_test_env
Closes #13616
* github.com:scylladb/scylladb:
code: Remove global proxy
schema_change_test: Use proxy from cql_test_env
test: Carry proxy reference on cql_test_env
Adjust scylla-gdb.get_gms_version_value
to get the versioned_value version as version_type
(utils::tagged_integer).
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Prepare for next patch that makes gms::versioned_value
members private, and provides methods by the same name
as the current members.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
No code needs global proxy anymore. Keep on-stack values in main and
cql_test_env and keep the pointer on debug:: namespace.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
To avoid confusion with task manager tasks, compaction::task is renamed
to compaction::compaction_task_executor. All inheriting classes are
modified similarly.
compaction_manager::task needs to be accessed from task manager compaction
tasks. Thus, compaction_manager::task and all inheriting classes are moved
from compaction manager to compaction namespace.
A read that requested memory and has to wait for it can be registered as inactive. This can happen for example if the memory request originated from a background I/O operation (a read-ahead maybe).
Handling this case is currently very difficult. What we want to do is evict such a read on-the-spot: the fact that there is a read waiting on memory means memory is in demand and so inactive reads should be evicted. To evict this reader, we'd first have to remove it from the memory wait list, which is almost impossible currently, because `expiring_fifo<>`, the type used for the wait list, doesn't allow for that. So in this PR we set out to make this possible first, by transforming all current queues to be intrusive lists of permits. Permits are already linked into an intrusive list, to allow for enumerating all existing permits. We use these existing hooks to link the permits into the appropriate queue, and back to `_permit_list` when they are not in any special queue. To make this possible we first have to make all lists store naked permits, moving all auxiliary data fields currently stored in wrappers like `entry` into the permit itself. With this, all queues and lists in the semaphore are intrusive lists, storing permits directly, which has the following implications:
* queues no longer take extra memory, as all of them are intrusive
* permits are completely self-sufficient w.r.t. queuing: code can queue or dequeue permits just with a reference to a permit at hand, no other wrapper, iterator, pointer, etc. is necessary.
* queues don't keep permits alive anymore; destroying a permit will automatically unlink it from the respective queue, although this might lead to use-after-free. This is not a problem in practice: only one code-path (`reader_concurrency_semaphore::with_permit()`) had to be adjusted.
After all these extensive preparations, we can now handle the case of evicting a reader which is queued on memory.
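To make the intrusive-queue idea concrete, here is a small Python sketch. It is an illustration under assumptions, not the real implementation (which uses C++ intrusive list hooks); the class names are invented for the example.

```python
class Hook:
    """Link fields embedded in the element itself (the "intrusive" part)."""

    def __init__(self):
        self.prev = None
        self.next = None

    def is_linked(self):
        return self.prev is not None

    def unlink(self):
        # O(1) removal that needs nothing but the element itself.
        if self.is_linked():
            self.prev.next = self.next
            self.next.prev = self.prev
            self.prev = self.next = None


class IntrusiveList:
    def __init__(self):
        self._head = Hook()  # sentinel node, never yielded
        self._head.prev = self._head.next = self._head

    def push_back(self, item):
        item.unlink()  # an element is in at most one list at a time
        last = self._head.prev
        item.prev, item.next = last, self._head
        last.next = item
        self._head.prev = item

    def __iter__(self):
        node = self._head.next
        while node is not self._head:
            yield node
            node = node.next


class Permit(Hook):
    def __init__(self, name):
        super().__init__()
        self.name = name


permit_list, wait_list = IntrusiveList(), IntrusiveList()
a, b, c = Permit("a"), Permit("b"), Permit("c")
for p in (a, b, c):
    permit_list.push_back(p)
wait_list.push_back(b)    # b moves to the wait queue using the same hook
permit_list.push_back(b)  # dequeuing b only requires a reference to b
```

Because the links live in the permit, moving it between queues allocates nothing and needs no iterator or wrapper object.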
Fixes: #12700
Closes #12777
* github.com:scylladb/scylladb:
reader_concurrency_semaphore: handle reader blocked on memory becoming inactive
reader_concurrency_semaphore: move _permit_list next to the other lists
reader_permit: evict inactive read on timeout
reader_concurrency_semaphore: move inactive_read to .cc
reader_concurrency_semaphore: store permits in _inactive_reads
reader_concurrency_semaphore: inactive_read: de-inline more methods
reader_concurrency_semaphore: make _ready_list intrusive
reader_permit: add wait_for_execution state
reader_concurrency_semaphore: make wait lists intrusive
reader_concurrency_semaphore: move most wait_queue methods out-of-line
reader_concurrency_semaphore: store permits directly in queues
reader_permit: introduce (private) operator * and ->
reader_concurrency_semaphore: remove redundant waiters() member
reader_concurrency_semaphore: add waiters counter
reader_permit: use check_abort() for timeout
reader_concurrency_semaphore: maybe_dump_permit_diagnostics(): remove permit list param
reader_concurrency_semaphore: make foreach_permit() const
reader_permit: add get_schema() and get_op_name() accessors
reader_concurrency_semaphore: mark maybe_dump_permit_diagnostics as noexcept
I've no idea why the quotes are there at all, it works even without
them. However, with quotes gdb-13 fails to find the _all_threads static
thread-local variable _unless_ it's printed with gdb "p" command
beforehand.
fixes: #13125
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closes #13132
Use it to keep track of all permits that are currently waiting on
something: admission, memory or execution.
Currently we keep track of size by adding up the result of size() of
the various queues. In future patches we are going to change the queues
such that they will not have constant-time size anymore, so move to an
explicit counter in preparation for that.
Another change this commit makes is to also include ready list entries
in this counter. Permits in the ready list are also waiters, they wait
to be executed. Soon we will have a separate wait state for this too.
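The explicit counter can be sketched like this. The queue names and the enqueue/dequeue helpers are assumptions chosen for the example; the point is only that the count is maintained on every transition instead of being derived from queue sizes.

```python
from collections import deque


class Semaphore:
    def __init__(self):
        self.admission_queue = deque()
        self.memory_queue = deque()
        self.ready_list = deque()  # permits waiting to be executed
        self.waiters = 0           # explicit counter across all three

    def enqueue(self, queue, permit):
        queue.append(permit)
        self.waiters += 1

    def dequeue(self, queue):
        permit = queue.popleft()
        self.waiters -= 1
        return permit


sem = Semaphore()
sem.enqueue(sem.admission_queue, "p1")
sem.enqueue(sem.memory_queue, "p2")
sem.enqueue(sem.ready_list, "p3")  # ready-list entries count as waiters too
sem.dequeue(sem.memory_queue)
```

The counter stays O(1) to read even if the queues later stop offering a constant-time size().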
Since 5c0f9a8180 ("mutation_partition: Switch cache of
rows onto B-tree") it's no longer in use, except in some
performance test, so remove it.
Although scylla-gdb.py is sometimes used with older releases,
it's so outdated we can remove it from there too.
Closes #12868
This phrase is inaccurate and unnecessary. We know all lines in the
printout are for reads and they are semaphores: no need to repeat this
information on each line.
Example:
Read Concurrency Semaphores:
read: 0/100, 0/ 41901096, queued: 0
streaming: 0/ 10, 0/ 41901096, queued: 0
system: 0/ 10, 0/ 41901096, queued: 0
Closes #12633
Sets the current schema to be used by schema-aware commands.
Setting the schema allows some commands and printers to interpret
schema-dependent objects and present them in a more friendly form.
Some commands require schema to work, for example to sort keys, and
will fail otherwise.
A possibly blocking request for more memory. If the collective memory
consumption of all reads goes above
$serialize_limit_multiplier * $memory_limit, this request will block for
all but one reader (the first requester). Until this situation is
resolved, that is, for as long as memory consumption stays above the limit
explained above, only this one reader is allowed to make progress. This
should help rein in the memory consumption of reads in situations where
their memory consumption used to balloon without constraints.
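A simplified model of this serialization rule, in Python. The class and method names are hypothetical, and a blocked request is modelled as a False return value that the caller would retry later, rather than as a real wait.

```python
class MemoryTracker:
    def __init__(self, memory_limit, serialize_limit_multiplier):
        self.serialize_limit = serialize_limit_multiplier * memory_limit
        self.consumed = 0
        self.blessed = None  # the one reader allowed to proceed over the limit

    def request_memory(self, reader, amount):
        """Return True if granted now, False if the reader has to wait."""
        if self.consumed + amount <= self.serialize_limit:
            self.consumed += amount
            return True
        # Over the limit: the first requester becomes the only one to progress.
        if self.blessed is None:
            self.blessed = reader
        if self.blessed is reader:
            self.consumed += amount
            return True
        return False

    def release(self, amount):
        self.consumed -= amount
        # The situation is resolved once consumption is back under the limit.
        if self.consumed <= self.serialize_limit:
            self.blessed = None


tracker = MemoryTracker(memory_limit=100, serialize_limit_multiplier=2)
tracker.request_memory("a", 150)  # granted, under the limit of 200
tracker.request_memory("b", 100)  # crosses the limit, "b" may proceed
tracker.request_memory("c", 10)   # must wait until consumption drops
```

In this model, everything except the first over-limit requester stalls until enough memory is released.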
Retrieves the configuration item with the given name and prints its
value as well as its metadata.
Example:
(gdb) scylla get-config-value compaction_static_shares
value: 100, type: "float", source: SettingsFile, status: Used, live: MustRestart
Closes #12362
* github.com:scylladb/scylladb:
scylla-gdb.py: add scylla get-config-value gdb command
scylla-gdb.py: extract $downcast_vptr logic to standalone method
test: scylla-gdb/run: improve diagnostics for failed tests
When a class inherits from multiple virtual base classes, a pointer to
an instance of this class obtained via one of its base classes might point
somewhere into the object, not at its beginning. Therefore, the simple
method employed currently by $downcast_vptr() of casting the provided
pointer to the type extracted from the vtable name fails. Instead when
this situation is detected (detectable by observing that the symbol name
of the partial vtable is not to an offset of +16, but larger),
$downcast_vptr() will iterate over the base classes, adjusting the
pointer with their offsets, hoping to find the true start of the object.
In the one instance I tested this with, this method worked well.
At the very least, the method will now yield a null pointer when it
fails, instead of a badly cast object with corrupt content (which the
developer might or might not attribute to the bad cast).
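The offset search can be simulated outside gdb as below. Everything here is an assumption made for illustration: the function name, the dict standing in for process memory, the offsets, and in particular the predicate used to recognise the object start, which the description above does not specify.

```python
def find_object_start(ptr, base_offsets, is_object_start):
    """Try ptr - offset for each base-class offset.

    Returns the first candidate accepted by is_object_start, or None
    (the "null pointer") when no candidate fits.
    """
    for offset in base_offsets:
        candidate = ptr - offset
        if is_object_start(candidate):
            return candidate
    return None


# Simulated memory: address -> symbol name of the vtable pointer stored there.
memory = {
    0x1000: "vtable for derived+16",  # primary vtable, true start of object
    0x1010: "vtable for derived+56",  # partial vtable of a second base class
}


def starts_object(addr):
    return memory.get(addr) == "vtable for derived+16"


start = find_object_start(0x1010, [0, 16, 32], starts_object)
missing = find_object_start(0x2000, [0, 16, 32], starts_object)
```

Returning None on failure mirrors the behaviour described above: a visible failure rather than a silently wrong cast.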
Closes #11892
Currently, to_string() recursively calls itself for engaged optionals.
Eliminate it. Also, use the std_optional wrapper instead of accessing
std::optional internals directly.
Scylla fiber uses a crude method of scanning inbound and outbound
references to/from other task objects of recognized type. This method
cannot detect user instantiated promise<> objects. Add a note about this
to the printout, so users are aware of this.