Commit Graph

53 Commits

Author SHA1 Message Date
Yaniv Kaul
c658bdb150 Typos: fix typos in comments
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.

Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
2023-12-02 22:37:22 +02:00
Pavel Emelyanov
f1515c610e code: Remove query-context.hh
The whole thing is unused now, so the header is no longer needed

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2023-08-08 11:11:07 +03:00
Avi Kivity
69a385fd9d Introduce schema/ module
Schema related files are moved there. This excludes schema files that
also interact with mutations, because the mutation module depends on
the schema. Those files will have to go into a separate module.

Closes #12858
2023-02-15 11:01:50 +02:00
Piotr Dulikowski
983b440a81 secondary_index_manager: don't add clustering key columns to index table of static column index
The implementation of secondary indexes on static columns relies on the
fact that the index table only includes partition key columns of the
base table, but not clustering key columns. A static column's value
determines a set of full partitions, so including the clustering key
would only be redundant. It would also generate more work as a single
static column update would require a large portion of the index to be
updated.

This commit makes sure that clustering columns are not included in the
index table for indexes based on a static column.
2022-12-06 11:21:16 +01:00
Nadav Har'El
67990d2170 cql, index: don't use IS NOT NULL on collection column
When the secondary-index code builds a materialized view on column x, it
adds "x IS NOT NULL" to the where-clause of the view, as required.

However, when we index a collection column, we index individual pieces
of the collection (keys, values), the the entire collection, so checking
if the entire collection is null does not make sense. Moreover, for a
collection column x, "x IS NOT NULL" currently doesn't work and throws
errors when evaluating that expression when data is written to the table.

The solution used in this patch is to simply avoid adding the "x IS NOT
NULL" when creating the materialized view for a collection index.
Everything works just fine without it.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-08-14 10:29:52 +03:00
Michał Radwański
10e241988e cql/expr/expression, index/secondary_index_manager: needs_filtering and
index_supports_expression rewrite to accomodate for indexes over
collections
2022-08-14 10:29:52 +03:00
Karol Baryła
ac97086855 cql3, index: Use entries() indexes on collections for queries
Previous commit added the ability to use GSI over non-frozen collections in queries,
but only the keys() and values() indexes. This commit adds support for the missing
index type - entries() index.

Signed-off-by: Karol Baryła <karol.baryla@scylladb.com>
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-08-14 10:29:52 +03:00
Karol Baryła
7966841d37 cql3, index: Use keys() and values() indexes on collections for queries.
Previous commits added the possibility of creating GSI on non-frozen collections.
This (and next) commit allow those indexes to actually be used by queries.
This commit enables both keys() and values() indexes, as they are pretty similar.
2022-08-14 10:29:52 +03:00
Michał Radwański
f1a9def2e1 schema, index/secondary_index_manager: make schema for index-induced mv
Indexes over collections use materialized views. Supposing that we're
dealing with global indexes, and that pk, ck were the partition and
clustering keys of the base table, the schema of the materialized view,
apart from having idx_token (which is used to preserve the order on the
entries in the view), has a computed column coll_value (the name is not
guaranteed to be exactly) and potentially also
coll_keys_for_values_index, if the index was over collection values.
This is needed, since values in a specific collection need not be
unique.

To summarize, the primary key is as follows:

coll_value, idx_token, pk, ck, coll_keys_for_values_index?

where coll_value is the computed value from the collection, be it a key
from the collection, a value from the collection, or the tuple containing
both.
2022-08-14 10:29:14 +03:00
Michał Radwański
60d50f6016 index/secondary_index_manager: extract keys, values, entries types from collection
These functions are relevant for indexes over collections (creating
schema for a materialized view related to the index).

Signed-off-by: Michał Radwański <michal.radwanski@scylladb.com>
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-08-14 10:29:14 +03:00
Michał Radwański
166afd46b5 Cql.g, treewide: support cql syntax INDEX ON table(VALUES(collection))
Brings support of cql syntax `INDEX ON table(VALUES(collection))`, even
though there is still no support for indexes over collections.
Previously, index_target::target_type::values was refering to values of
a regular (non-collection) column. Rename it to `regular_values`.

Fixes #8745.
2022-08-14 10:29:13 +03:00
Avi Kivity
5937b1fa23 treewide: remove empty comments in top-of-files
After fcb8d040 ("treewide: use Software Package Data Exchange
(SPDX) license identifiers"), many dual-licensed files were
left with empty comments on top. Remove them to avoid visual
noise.

Closes #10562
2022-05-13 07:11:58 +02:00
Botond Dénes
9a44c26d7e index/secondary_index_manager: switch to using data dictionary
Instead of directly using replica::table.
2022-03-25 11:44:31 +02:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Avi Kivity
bbad8f4677 replica: move ::database, ::keyspace, and ::table to replica namespace
Move replica-oriented classes to the replica namespace. The main
classes moved are ::database, ::keyspace, and ::table, but a few
ancillary classes are also moved. There are certainly classes that
should be moved but aren't (like distributed_loader) but we have
to start somewhere.

References are adjusted treewide. In many cases, it is obvious that
a call site should not access the replica (but the data_dictionary
instead), but that is left for separate work.

scylla-gdb.py is adjusted to look for both the new and old names.
2022-01-07 12:04:38 +02:00
Avi Kivity
ae3a360725 database: Move database, keyspace, table classes to replica/ directory
The database, keyspace, and table classes represent the replica-only
part of the objects after which they are named. Reading from a table
doesn't give you the full data, just the replica's view, and it is not
consistent since reconciliation is applied on the coordinator.

As a first step in acknowledging this, move the related files to
a replica/ subdirectory.
2022-01-06 17:07:30 +02:00
Avi Kivity
6221b90b89 secondary_index_manager: stop including expression.hh
Use a forward declaration of cql3::expr::oper_t to reduce the
number of translation units depending on expression.hh.

Before:

    $ find build/dev -name '*.d' | xargs cat | grep -c expression.hh
    272

After:

    $ find build/dev -name '*.d' | xargs cat | grep -c expression.hh
    154

Some translation units adjust their includes to restore access
to required headers.

Closes #9229
2021-08-22 21:21:46 +03:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Pavel Emelyanov
6dd10e771d index-manager: Move feature evaluation one level up
The create_view_for_index needs to know the state of the
correct-idx-token-in-secondary-index feature. To get one
it takes quite a long route through global storage service
instance.

Since there's only one caller of the method in question,
and the method is called in a loop, it's a bit faster to
get the feature value in caller and pass it in argument.

This will also help to get rid of the call for global
storage service.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-12-11 21:14:12 +03:00
Piotr Grabowski
2342b386f4 secondary_index: use new token_column_computation
Switches token column computation to (new) token_column_computation,
which fixes #7443, because new token column will be compared using
signed comparisons, not the previous unsigned comparison of CQL bytes
type.

This column computation type is only set if cluster supports
correct_idx_token_in_secondary_index feature to make sure that all nodes
will be able to compute (new) token_column_computation. Also old
indexes will need to be rebuilt to take advantage of this fix, as new
token column computation type is only set for new indexes.
2020-11-04 12:02:42 +01:00
Piotr Grabowski
b1350af951 token_column_computation: rename as legacy
Raname token_column_computation to legacy_token_column_computation, as
it will be replaced with new column_computation. The reason is that this
computation returns bytes, but all tokens in Scylla can now be
represented by int64_t. Moreover, returning bytes causes invalid token
ordering as bytes comparison is done in unsigned way (not signed as
int64_t). See issue:

https://github.com/scylladb/scylla/issues/7443
2020-11-04 12:00:18 +01:00
Avi Kivity
ecb2bdad54 Merge 'Replace operator_type with an enum' from Dejan
"
operator_type is awkward because it's not copyable or assignable. Replace it with a new enum class.

Tests: unit(dev)
"

* dekimir-operator-type:
  cql3: Drop operator_type entirely
  cql3: Drop operator_type from the parser
  cql3/expr: Replace operator_type with an enum
2020-08-18 13:45:20 +03:00
Dejan Mircevski
71c921111d cql3/expr: Replace operator_type with an enum
operator_type is awkward because it's not copyable or assignable.
Replace it in expression representation with a new enum class, oper_t.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2020-08-18 12:27:00 +02:00
Piotr Jastrzebski
c001374636 codebase wide: replace count with contains
C++20 introduced `contains` member functions for maps and sets for
checking whether an element is present in the collection. Previously
`count` function was often used in various ways.

`contains` does not only express the intend of the code better but also
does it in more unified way.

This commit replaces all the occurences of the `count` with the
`contains`.

Tests: unit(dev)

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <b4ef3b4bc24f49abe04a2aba0ddd946009c9fcb2.1597314640.git.piotr@scylladb.com>
2020-08-15 20:26:02 +03:00
Nadav Har'El
7922b9eb8f materialized views: reduce recompilation when db/view/view.hh changes.
Before this patch, when db/view/view.hh was modified, 89 source files had to
be recompiled. After this patch, this number is down to 5.

Most of the irrelevant source files got view.hh by including database.hh,
which included view.hh just for the definition of statistics. So in this
patch we split the view statistics to a separate header file, view_stats.hh,
and database.hh only includes that. A few source files which included
only database.hh and also needed view.hh (for materialized-view related
functions) now need to include view.hh explicitly.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200319121031.540-1-nyh@scylladb.com>
2020-03-19 15:46:14 +02:00
Amnon Heiman
6f58d51c83 secondary_index_manager: add the index_name_from_table_name function
index_name_from_table_name is a reverse of index_table_name,
it gets a table name that was generated for an index and return the name
of the index that generated that table.

Relates to #4192

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2020-01-15 15:06:00 +02:00
Piotr Sarna
2ee8c6f595 index: add is_global_index() utility
The helper function is useful for determining if given schema
represents a global index.
2019-10-14 17:13:32 +02:00
Piotr Sarna
a8f7d64a08 index: mark token column as 'computed' when creating mv
Secondary indexes use a computed token column to preserve proper
query ordering. This column is now marked as 'computed'.
2019-07-19 11:58:42 +02:00
Piotr Sarna
074ed2c8a5 index: use proper local index target when adding index
With global indexes, target column name is always the same as the string
kept in 'options[target]' field. It's not the case for local indexes,
and so a proper extracting function is used to get the value.
2019-03-20 10:20:24 +01:00
Piotr Sarna
5672edc149 index: enable parsing multi-key targets
Parsing index targets that consist of partition key columns
followed by clustering key columns is enabled.
2019-03-20 10:20:24 +01:00
Piotr Sarna
9c984f9da9 index: fix indentation 2019-03-20 09:51:46 +01:00
Piotr Sarna
3b908b7b5d index: add base partition keys to local index schema
When the index is local, its partition key in underlying materialized
view is the the same as base's, and the indexed column is a first
clustering key. This implementation ensures that view and base rows
will reside on the same partition, while querying the indexed column
will be possible by putting it as a first clustering key part.
2019-03-20 09:51:46 +01:00
Piotr Sarna
cb20fc2e4f index: make non-pointer overload of is_index function
Previous interface enforced passing a shared pointer, which
might result in calling unneeded shared_from_this().
2019-02-20 12:52:32 +01:00
Piotr Sarna
94db098d39 index: avoid copying when checking for is_index
Previously is_index implementation used list_indexes() helper function,
which copies data.
2019-02-20 12:52:32 +01:00
Duarte Nunes
aa476cd6c9 index/secondary_index_manager: Add virtual columns to MV
Virtual columns are MV-specific columns that contribute to the
liveness of view rows. However, we were not adding those columns when
creating an index's underlying MV, causing indexes to miss base rows.

Fixes #4144
Branches: master, branch-3.0

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2019-01-27 22:30:12 +00:00
Avi Kivity
7ae23d8f9b index: convert sprint() to format()
sprint() recently became more strict, throwing on sprint("%s", 5). Replace
with the more modern format().

Mechanically converted with https://github.com/avikivity/unsprint.
2018-11-01 13:16:17 +00:00
Avi Kivity
6f23403137 Merge "Virtualize IndexInfo system table" from Duarte
"
The IndexInfo table tracks the secondary indexes that have already
been populated. Since our secondary index implementation is backed by
materialized views, we can virtualize that table so queries are
actually answered by built_views.

Fixes #3483
"

* 'built-indexes-virtual-reader/v2' of github.com:duarten/scylla:
  tests/virtual_reader_test: Add test for built indexes virtual reader
  db/system_keysace: Add virtual reader for IndexInfo table
  db/system_keyspace: Explain that table_name is the keyspace in IndexInfo
  index/secondary_index_manager: Expose index_table_name()
  db/legacy_schema_migrator: Don't migrate indexes
2018-06-06 17:35:51 +03:00
Piotr Sarna
4a9bf7ed5b index, tests: add token column to secondary index schema
Additional token column is now present in every view schema
that backs a secondary index. This column is always a first part
of the clustering key, so it forces token order on queries.
Column's name is ideally idx_token, but can be postfixed
with a number to ensure its uniqueness.

It also updates tests to make them acknowledge the new token order.

Fixes #3423
2018-06-06 09:02:33 +02:00
Duarte Nunes
bc4db67524 index/secondary_index_manager: Expose index_table_name()
Expose secondary_index::index_table_name() so knowledge on how to
built an index name can remain centralized.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-06-04 11:14:17 +01:00
Nadav Har'El
79c6bb642f secondary index: fix indexing of partition-key column
Indexing an only partition key component is not allowed (because it would
be redundant), but it should be allowed to index one of several partition
key components. We had a bug in that case: the underlying materialized view
we created had the same column as both a partition key and a clustering
key, which resulted in an assertion failure. This patch fixes that.

Fixes #3404.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20180501121544.22869-1-nyh@scylladb.com>
2018-05-02 08:06:38 +03:00
Nadav Har'El
46d4f6f352 secondary index: fix yet another case sensitivity bug
When the secondary index code builds a "%s IS NOT NULL" clause for a
CQL statement, it needs to quote the column name if it needs to be
(not only lowercase, digits and _).

Fixes #3401.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20180429221857.6248-7-nyh@scylladb.com>
2018-04-30 00:27:23 +02:00
Duarte Nunes
9146de3118 service/migration_manager: Don't drop index-backing MV
Unless dropped by the index itself, forbid dropping an index-backing
MV using `drop materialized view`.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180424140745.7144-1-duarte@scylladb.com>
2018-04-24 17:01:59 +01:00
Pekka Enberg
3150962cb7 index: Fix index view schema when primary key component is indexed
This fixes index view schema to exclude indexed column when a primary
key component like clustering key is indexed. This fixes a server crash
when CREATE INDEX statement is executed on a clustering key column.
2017-11-03 10:12:58 +02:00
Pekka Enberg
678a6f6e2f index: Implement index::supports_expression() for EQ operator 2017-11-03 09:10:43 +02:00
Pekka Enberg
1ae9343f68 index: Fix index::supports_expression() operator parameter type
The cql3::operator_type is supposed to be passed around as const
reference, not by value; otherwise equality won't work.
2017-11-03 09:10:43 +02:00
Pekka Enberg
ed4c96c025 index: Add secondary_index_manager::create_view_for_index()
This patch adds a create_view_for_index() function, which creates a
view_ptr for index_metadata.
2017-10-05 10:07:44 +03:00
Pekka Enberg
50943ce592 index: Add secondary_index_manager::get_dependent_indices() 2017-10-05 10:07:44 +03:00
Pekka Enberg
9ebd8be82b index: Add secondary_index_manager::reload()
This patch adds a reload() function, which updates the secondary index
manager index map to match underlying column family indices.
2017-09-18 14:31:35 +03:00
Pekka Enberg
2ae6b141e5 index: Add secondary_index_manager::list_indexes() 2017-09-18 14:27:35 +03:00
Pekka Enberg
870de26e35 index: Add index class
Add a simple index class, which represents an instantiated index.
2017-08-24 14:00:02 +03:00