test: test_group0_schema_versioning: wait for schema sync in system.local

`test_schema_versioning_with_recovery` is currently flaky. It performs
a write with CL=ALL and then checks if the schema version is the same on
all nodes by calling `verify_table_versions_synced`. All nodes are expected
to sync their schema before handling the replica write. The node in
RECOVERY mode should do it through a schema pull, and other nodes should do
it through a group 0 read barrier.

The problem is in `verify_local_schema_versions_synced`, which compares the
schema versions in `system.local`. The node in RECOVERY mode updates the
schema version in `system.local` only after it acknowledges the replica
write as completed. Hence, the check can run before that update and fail.

We fix the problem by making the function retry once per second, for up to
5 seconds, until the schema versions match.
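
The retry pattern can be illustrated with a minimal, self-contained sketch.
This is not the actual `wait_for` from `test.pylib.util`: the `wait_until`
helper, the node names and the version strings below are hypothetical and
only simulate a node that updates its version late.

```python
import asyncio
import time


async def wait_until(check, deadline: float, period: float = 1.0):
    """Hypothetical polling helper: call `check` until it returns something
    other than None, or raise once the deadline has passed."""
    while True:
        result = await check()
        if result is not None:
            return result
        if time.time() >= deadline:
            raise TimeoutError("condition not met before the deadline")
        await asyncio.sleep(period)


async def demo() -> bool:
    # Simulated per-node schema versions. Node "c" lags behind, like the
    # node in RECOVERY mode that updates system.local late.
    versions = {"a": "v2", "b": "v2", "c": "v1"}

    async def late_update():
        await asyncio.sleep(0.05)
        versions["c"] = "v2"

    async def check():
        # None means "not synced yet, retry"; True ends the wait.
        return True if len(set(versions.values())) == 1 else None

    updater = asyncio.create_task(late_update())
    synced = await wait_until(check, deadline=time.time() + 2.0, period=0.01)
    await updater
    return synced


result = asyncio.run(demo())
```

A one-shot comparison at the start of `demo` would see the lagging node and
fail, while the polling version succeeds once the late update lands.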

Note that RECOVERY mode is about to be retired together with the whole
gossip-based topology in 2026.2. So, this test is about to be deleted.
However, we still want to fix it, so that it doesn't bother us in older
branches.

Fixes #23803

Closes scylladb/scylladb#28114

(cherry picked from commit 6b5923c64e)

Closes scylladb/scylladb#28178
Patryk Jędrzejczak
2026-01-12 14:49:55 +01:00
parent 4e4bfee41e
commit 8b15975cb8

@@ -15,7 +15,7 @@ from cassandra.query import SimpleStatement # type: ignore
 from cassandra.pool import Host # type: ignore
 from test.pylib.manager_client import ManagerClient, ServerInfo
-from test.pylib.util import wait_for_cql_and_get_hosts
+from test.pylib.util import wait_for, wait_for_cql_and_get_hosts
 from test.pylib.log_browsing import ScyllaLogFile
 from test.cluster.util import reconnect_driver, wait_until_upgrade_finishes, \
         enter_recovery_state, delete_raft_data_and_upgrade_state, new_test_keyspace
@@ -53,12 +53,16 @@ async def get_scylla_tables_version(cql: Session, h: Host, keyspace_name: str, t
 async def verify_local_schema_versions_synced(cql: Session, hs: list[Host]) -> None:
-    versions = {h: await get_local_schema_version(cql, h) for h in hs}
-    logger.info(f"system.local schema_versions: {versions}")
-    h1, v1 = next(iter(versions.items()))
-    for h, v in versions.items():
-        if v != v1:
-            pytest.fail(f"{h1}'s system.local schema_version {v1} is different than {h}'s version {v}")
+    async def check():
+        versions = {h: await get_local_schema_version(cql, h) for h in hs}
+        logger.info(f"system.local schema_versions: {versions}")
+        h1, v1 = next(iter(versions.items()))
+        for h, v in versions.items():
+            if v != v1:
+                logger.info(f"{h1}'s system.local schema_version {v1} is different than {h}'s version {v}; retrying")
+                return None
+        return True
+    await wait_for(check, deadline=time.time() + 5.0, period=1.0)
 async def verify_group0_schema_versions_synced(cql: Session, hs: list[Host]) -> None: