scylla/test/topology_experimental_raft/test_mv_tablets.py
Nadav Har'El 4505a86f46 tablets, mv: fix base-view pairing to consider base replication map
In the view update code, the function get_view_natural_endpoint()
determines which view replica this base replica should send an update
to. It currently gets the *view* table's replication map (i.e., the map
from view tokens to lists of replicas holding the token), but assumes
that this is also the *base* table's replication map.

This assumption was true with vnodes, but is no longer true with
tablets - the base table's replication map can be completely different
from the view table's. By looking at the wrong mapping,
get_view_natural_endpoint() can believe that this node isn't really
a base-replica and drop the view update. Alternatively, it can think
it is a base replica - but use the wrong base-view pairing and create
base-view inconsistencies.

This patch solves this bug - get_view_natural_endpoint() now gets two
separate replication maps - the base's and the view's. The callers
need to remember what the base table was (in some cases they didn't
care at the point of the call), and pass it to the function call.
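
The difference between the old and new lookup can be sketched in Python. This is a simplified illustration under stated assumptions, not ScyllaDB's C++ code: the function name `view_natural_endpoint`, the dict-based replication maps, the node names, and the "i-th base replica pairs with i-th view replica" rule are all hypothetical stand-ins.

```python
from typing import Optional

# Hypothetical replication map: token -> ordered list of replica node names.
ReplicationMap = dict[int, list[str]]


def view_natural_endpoint(
    me: str,
    base_token: int,
    view_token: int,
    base_map: ReplicationMap,
    view_map: ReplicationMap,
) -> Optional[str]:
    """Return the view replica paired with base replica `me`, or None.

    Assumed pairing rule (illustration only): the i-th base replica is
    paired with the i-th view replica.
    """
    base_replicas = base_map[base_token]
    if me not in base_replicas:
        # This node does not hold the base row, so it sends no view update.
        return None
    view_replicas = view_map[view_token]
    index = base_replicas.index(me)
    return view_replicas[index] if index < len(view_replicas) else None


# With tablets, the two tables can map the same token to different nodes.
base_map = {10: ["n1"], 20: ["n4"]}
view_map = {10: ["n5"], 20: ["n2"]}

# Fixed behavior: consult the base table's own map for the base token.
fixed = view_natural_endpoint("n1", 10, 20, base_map, view_map)

# Buggy behavior: the view's map is mistaken for the base's map, so node
# "n1" concludes it is not a base replica and the update is dropped.
buggy = view_natural_endpoint("n1", 10, 20, view_map, view_map)
```

In this toy setup `fixed` is `"n2"` while `buggy` is `None`, which corresponds to the dropped view update described above.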

This patch also includes a simple test that reproduces the bug, and
confirms it is fixed: The test has a 6-node cluster using tablets
and a base table with RF=1, and writes one row to it. Before this
patch, the code usually gets confused, thinking the base replica
isn't a replica and loses the view update. With this patch, the
view update works.
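
A rough estimate shows why the 6-node test "usually" fails without the fix. The model below is an assumption made for illustration, not a description of the real tablet allocator: with RF=1, the base tablet and the view tablet consulted for the same token are taken to be placed independently and uniformly at random across the nodes. The helper name `miss_probability` is hypothetical.

```python
from fractions import Fraction


def miss_probability(nodes: int) -> Fraction:
    """Chance that the wrong-map lookup names a different node than the
    real base replica, assuming RF=1 and independent uniform placement."""
    if nodes < 1:
        raise ValueError("cluster must have at least one node")
    # The two placements coincide with probability 1/nodes.
    return 1 - Fraction(1, nodes)


one_node = miss_probability(1)   # a single node can never disagree with itself
six_nodes = miss_probability(6)  # the cluster size used by the test
```

Under this assumed model the bug shows up in 5 out of 6 runs on six nodes and never on one node, which is consistent with the one-node tests being unable to reproduce it.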

Fixes #16227.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#16228
2023-12-04 16:38:54 +02:00


#
# Copyright (C) 2023-present ScyllaDB
#
# SPDX-License-Identifier: AGPL-3.0-or-later
#
# Tests for interaction of materialized views with *tablets*
from test.pylib.manager_client import ManagerClient
import pytest
import asyncio
import logging
logger = logging.getLogger(__name__)


@pytest.mark.asyncio
async def test_tablet_mv_create(manager: ManagerClient):
    """A basic test for creating a materialized view on a table stored
       with tablets on a one-node cluster. We just create the view and
       delete it - that's it, we don't read or write the table.
       Reproduces issue #16194.
    """
    servers = await manager.servers_add(1)
    cql = manager.get_cql()
    await cql.run_async("CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': 1, 'initial_tablets': 100}")
    await cql.run_async("CREATE TABLE test.test (pk int PRIMARY KEY, c int)")
    await cql.run_async("CREATE MATERIALIZED VIEW test.tv AS SELECT * FROM test.test WHERE c IS NOT NULL AND pk IS NOT NULL PRIMARY KEY (c, pk)")
    await cql.run_async("DROP KEYSPACE test")


@pytest.mark.asyncio
async def test_tablet_mv_simple(manager: ManagerClient):
    """A simple test for reading and writing a materialized view on a table
       stored with tablets on a one-node cluster. Because it's a one-node
       cluster, we don't need any sophisticated mappings or pairings
       to work correctly for this test to pass - everything is on this single
       node anyway.
       Reproduces issue #16209.
    """
    servers = await manager.servers_add(1)
    cql = manager.get_cql()
    await cql.run_async("CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': 1, 'initial_tablets': 100}")
    await cql.run_async("CREATE TABLE test.test (pk int PRIMARY KEY, c int)")
    await cql.run_async("CREATE MATERIALIZED VIEW test.tv AS SELECT * FROM test.test WHERE c IS NOT NULL AND pk IS NOT NULL PRIMARY KEY (c, pk) WITH SYNCHRONOUS_UPDATES = TRUE")
    await cql.run_async("INSERT INTO test.test (pk, c) VALUES (2, 3)")
    # We used SYNCHRONOUS_UPDATES=TRUE, so the view should be updated:
    assert [(3,2)] == list(await cql.run_async("SELECT * FROM test.tv WHERE c=3"))
    await cql.run_async("DROP KEYSPACE test")


@pytest.mark.asyncio
async def test_tablet_mv_simple_6node(manager: ManagerClient):
    """A simple reproducer for a bug of forgetting that the view table has a
       different tablet mapping from the base: Using the wrong tablet mapping
       for the base table or view table can cause us to send a view update
       to the wrong view replica - or not send a view update at all. A row
       that we write on the base table will then not be readable in the view.
       We start a large-enough cluster (6 nodes) to increase the probability
       that the mapping is different for the one row we write, so that the
       test will fail if the bug exists.
       Reproduces #16227.
    """
    servers = await manager.servers_add(6)
    cql = manager.get_cql()
    await cql.run_async("CREATE KEYSPACE test WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': 1, 'initial_tablets': 100}")
    await cql.run_async("CREATE TABLE test.test (pk int PRIMARY KEY, c int)")
    await cql.run_async("CREATE MATERIALIZED VIEW test.tv AS SELECT * FROM test.test WHERE c IS NOT NULL AND pk IS NOT NULL PRIMARY KEY (c, pk) WITH SYNCHRONOUS_UPDATES = TRUE")
    await cql.run_async("INSERT INTO test.test (pk, c) VALUES (2, 3)")
    # We used SYNCHRONOUS_UPDATES=TRUE, so the view should be updated:
    assert [(3,2)] == list(await cql.run_async("SELECT * FROM test.tv WHERE c=3"))
    await cql.run_async("DROP KEYSPACE test")