test/cql-pytest: add option to run cql-pytest tests against specific release

This patch adds the option "--release <version>" to test/cql-pytest/run,
which downloads the pre-compiled Scylla release with the given version
number and runs the tests against that version. For example, it can be used
to demonstrate that #15559 was indeed a regression between 2022.1 and 2022.2,
by running a recently-added test against these two old versions:

test/cql-pytest/run --release 2022.1 --runxfail \
        test_prepare.py::test_duplicate_named_bind_marker_prepared

test/cql-pytest/run --release 2022.2 --runxfail \
        test_prepare.py::test_duplicate_named_bind_marker_prepared

The first run passes, the second fails - showing the regression.

The Scylla releases are downloaded from ScyllaDB's S3 bucket
(downloads.scylladb.com). They are saved in the build/ directory
(e.g., build/2022.2.9), and if that directory is not removed, when
"run --release" requests the same version again, the previous download
is reused.
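
A minimal sketch of that reuse check, assuming the layout described above.
The helper name `cached_wrapper` is hypothetical; the `scylla_wrapper` file
name is the one fetch_scylla.py writes into each release directory. Only a
fully specified three-component release names its cache directory directly,
so only that case can be answered without contacting S3:

```python
import os


def cached_wrapper(build_dir, release):
    """Return the cached wrapper path for an exact release, or None.

    Hypothetical helper, not part of the patch. A two-component release
    such as "5.4" always returns None here, because the concrete version
    (e.g. 5.4.7) is only known after listing the bucket.
    """
    components = release.split('~')[0].split('.')
    wrapper = os.path.join(build_dir, release, 'scylla_wrapper')
    if len(components) == 3 and os.path.exists(wrapper):
        return wrapper
    return None
```

With `build/5.4.7/scylla_wrapper` present, `cached_wrapper('build', '5.4.7')`
returns that path, while `cached_wrapper('build', '5.4')` returns None.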

Release numbers can look like:

    * 5.4.7
    * 5.4 (will get the latest in the 5.4 branch, e.g., 5.4.7)
    * 5.4.0~rc2 (a prerelease)
    * 2021.1.9 (Enterprise release)
    * 2023.1 (latest in this branch, Enterprise release)
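
As an illustration only, the forms above could be told apart as follows.
The helper `describe_release` is hypothetical; the two rules it applies
(a first component above 2000 means Enterprise, three components mean an
exact version) are the ones fetch_scylla.py uses:

```python
def describe_release(release):
    """Classify a release specifier (hypothetical helper for illustration)."""
    base, _, suffix = release.partition('~')
    parts = base.split('.')
    return {
        # Enterprise releases are numbered by year, e.g. 2021.1.9.
        'enterprise': int(parts[0]) > 2000,
        # Three components name one exact release; two name a branch.
        'exact': len(parts) == 3,
        # A "~rcN" suffix marks a prerelease.
        'prerelease': suffix.startswith('rc'),
        'branch': f'{parts[0]}.{parts[1]}',
    }


for spec in ['5.4.7', '5.4', '5.4.0~rc2', '2021.1.9', '2023.1']:
    print(spec, describe_release(spec))
```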

Fixes #13189

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#19228
Nadav Har'El
2024-06-11 12:18:32 +03:00
committed by Avi Kivity
parent f3dee5b636
commit 6712fcc316
4 changed files with 222 additions and 9 deletions


@@ -47,6 +47,28 @@ Additional useful pytest options, especially useful for debugging tests:
* -v: show the names of each individual test running instead of just dots.
* -s: show the full output of running tests (by default, pytest captures the test's output and only displays it if a test fails)
The "run" script can also run tests against a specific old release of
Scylla, downloaded (pre-compiled) from ScyllaDB's official release
collection. For example:
```
test/cql-pytest/run --release 2022.1 --runxfail \
        test_prepare.py::test_duplicate_named_bind_marker_prepared
test/cql-pytest/run --release 2022.2 --runxfail \
        test_prepare.py::test_duplicate_named_bind_marker_prepared
```
can demonstrate a regression of a test between ScyllaDB Enterprise releases
2022.1 and 2022.2. The `--release` option (which must be the first option
to "run") downloads the requested official release and caches it in the
`build/` directory (e.g., `build/2021.1.9`), and then runs the requested
tests against that version.
The `--release` option supports various version specifiers, such as 5.4.7
(a specific version), 5.4 (asking for the latest version in the 5.4 branch),
5.4.0~rc2 (a pre-release), or Enterprise releases such as 2021.1.9 or 2023.1
(the latest in that branch).
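
The following sketch shows one way the "latest in the branch" choice can be
made from the package names found for a branch. It mirrors the sort key in
`fetch_scylla.py`, but the name `patch_sort_key` is illustrative, and every
suffix below except the first is made up for the example:

```python
def patch_sort_key(suffix):
    # Drop the leading '.' or '-', then keep the text up to the next delimiter.
    patch = suffix[1:].replace('.', '-').split('-')[0]
    if patch.startswith('rc'):
        # Old Enterprise naming: "rc7" instead of "0~rc7".
        return '00~' + patch
    if patch.startswith('0~rc'):
        return '0' + patch
    # Zero-pad so that patch release 10 sorts above patch release 2.
    return '%02d' % int(patch)


suffixes = [
    '.2-0.20240117.6c625e8cd3c6.x86_64.tar.gz',  # format quoted in fetch_scylla.py
    '.10-0.x86_64.tar.gz',                       # made-up example
    '.0~rc2-0.x86_64.tar.gz',                    # made-up example
]
latest = max(suffixes, key=patch_sort_key)
print(latest)
```

Padding the patch number to two digits is what keeps the comparison a plain
string comparison; it assumes patch numbers never exceed two digits.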
# Developing new cql-pytest tests
The cql-pytest test framework is designed to encourage Scylla developers

test/cql-pytest/fetch_scylla.py Executable file

@@ -0,0 +1,128 @@
#!/usr/bin/python
# Copyright (C) 2024-present ScyllaDB
# SPDX-License-Identifier: AGPL-3.0-or-later
#
# Fetch from ScyllaDB's S3 bucket (downloads.scylladb.com) a pre-compiled
# version of Scylla for a desired release, ready to be run by tests (run.py).
# What we need to fetch is a "relocatable" package containing just the Scylla
# executable and the shared libraries it needs - we don't need a full OS and
# not even stuff like JMX or Python that cql-pytest tests don't need.
#
# Can fetch released versions with names like:
# * 5.4.7
# * 5.4 (latest in the 5.4 branch, e.g., 5.4.7)
# * 5.4.0~rc2 (prerelease)
# * 2021.1.9 (Enterprise release)
# * 2023.1 (latest in this branch, Enterprise release)
import boto3
import sys
import os
import subprocess
# Parse version "5.6.2" into the list ['5', '6', '2']. The components stay
# strings; callers convert them with int(). Ignore rc numbers.
def parse_version(version):
    return version.split('~')[0].split('.')
def s3_download(s3, bucket, object_key, out):
    print(f'Downloading {object_key}... ', end='')
    if not sys.stdout.isatty():
        bucket.download_file(object_key, out)
    else:
        # When interactive, show download progress:
        print('')
        objlen = s3.Object(bucket.name, object_key).content_length
        downloaded = 0
        def progress(chunk):
            nonlocal downloaded
            downloaded += chunk
            print('\33[2K\r%02d%% %d MB' % (int(downloaded*100/objlen), int(downloaded/1024/1024)), end='')
        bucket.download_file(object_key, out, Callback=progress)
        print('')
    print('done.')
# Download Scylla release "release" (e.g., 5.4, 5.4.7, 2023.1, 2023.1.7)
# into a subdirectory of directory "dir" named by the specific release
# downloaded (e.g., 5.4.7). Returns the ready-to-use scylla executable
# (actually, a shell-script wrapper) in that subdirectory - or raise exception
# if this release cannot be found.
# If the same download was already done in the past, discover this as quickly
# as possible and don't download again. A 3-component release like 5.4.7
# doesn't even require a network lookup, a 2-component release like 5.4 does
# require a lookup to see if a newer 5.4 release appeared - but may not
# require a full download if the same release remains the latest.
def download_scylla(release, dir):
    scylla_arch = 'x86_64'
    v = parse_version(release)
    # If release has 3 components (e.g., 5.4.7) we know exactly which version
    # we are looking for, and can check if we already downloaded it without
    # even fetching the list of versions from s3:
    if len(v) == 3 and os.path.exists(dir + '/' + release + '/scylla_wrapper'):
        print(f'Already downloaded, in {dir}/{release}')
        return dir + '/' + release + '/scylla_wrapper'
    major = f'{v[0]}.{v[1]}'
    bucket = f'downloads.scylladb.com'
    scylla_arch_string = '.' + scylla_arch + '.'
    if int(v[0]) >= 2023: # Enterprise release (new organization)
        prefix = f'downloads/scylla-enterprise/relocatable/scylladb-{major}/scylla-enterprise-{release}'
    elif int(v[0]) > 2000: # Enterprise release (old organization)
        prefix = f'downloads/scylla-enterprise/relocatable/scylladb-{major}/scylla-enterprise-{scylla_arch}-package-{release}'
        scylla_arch_string = '' # arch already restricted by prefix
    elif [int(v[0]),int(v[1])] <= [4,5]: # Open source (very old organization)
        prefix = f'downloads/scylla/relocatable/scylladb-{major}/scylla-package-{release}'
        scylla_arch_string = '' # no arch (only x86)
    elif [int(v[0]),int(v[1])] <= [5,1]: # Open source (old organization)
        prefix = f'downloads/scylla/relocatable/scylladb-{major}/scylla-{scylla_arch}-package-{release}'
        scylla_arch_string = '' # arch already restricted by prefix
    else: # Open source (new organization)
        prefix = f'downloads/scylla/relocatable/scylladb-{major}/scylla-{release}'
    # This prefix has many different packages belonging to all releases in
    # the same major version. We need to look only for those matching the
    # minor version, and take the highest minor number.
    s3 = boto3.resource('s3')
    bucket = s3.Bucket(bucket)
    matches = bucket.objects.filter(Prefix=prefix)
    candidates = [o.key.removeprefix(prefix) for o in matches if scylla_arch_string in o.key]
    # candidates for release 5.4 look like ".2-0.20240117.6c625e8cd3c6.x86_64.tar.gz"
    # referring to 5.4.2. Remove the first character (can be . or -) and take
    # the one with highest number (assume it is up to two digits).
    # Some old Enterprise releases used the convention of rc7 instead of
    # 0~rc7, so the rcs come first in this sorting instead of last, so we
    # need to fix that too.
    def comparison_key(x):
        x = x[1:].replace('.', '-').split('-')[0]
        if x.startswith('rc'):
            return '00~'+x
        elif x.startswith('0~rc'):
            return '0'+x
        else:
            return '%02d' % (int(x))
    candidates = sorted(candidates, key=comparison_key, reverse=True)
    if not candidates:
        raise Exception(f"Can't find release {release}")
    chosen = prefix + candidates[0]
    # different versions used different delimiters
    candidates[0] = candidates[0].replace('.', '-')
    minor = candidates[0].split('-')[1]
    chosen_release = release if len(v)==3 else major + '.' + minor
    print(f'Chosen download for ScyllaDB {release}: {chosen_release} ({chosen})')
    out_dir = dir + '/' + chosen_release
    out_wrapper = out_dir + '/scylla_wrapper'
    if os.path.exists(out_wrapper):
        # We already downloaded this version, nothing left to do.
        print(f'Already downloaded, in {out_dir}')
        return out_wrapper
    if not os.path.exists(out_dir):
        os.mkdir(out_dir)
    out_tar = out_dir + '/download.tgz'
    s3_download(s3, bucket, chosen, out_tar)
    subprocess.run(['tar', '--extract', '--file', out_tar,
                    '--strip-components=1', '-C', out_dir])
    os.unlink(out_tar)
    with open(out_wrapper, 'w') as f:
        f.write(f'#!/bin/sh\nLD_LIBRARY_PATH={out_dir}/libreloc {out_dir}/libreloc/ld.so {out_dir}/libexec/scylla "$@"')
    os.chmod(out_wrapper, 0o755)
    print(f'Downloaded to {out_dir}')
    return out_wrapper


@@ -14,6 +14,20 @@ else:
cmd = run.run_scylla_cmd
check_cql = run.check_cql
# If the first option (TODO: clean up this command-line processing)
# is "--release", download that release (see fetch_scylla.py for supported
# release numbers), and use that.
# The downloaded Scylla will be cached in the directory build/<release>,
# where <release> is the specific release downloaded (e.g., if the user
# asks "--release 2022.1" and the downloaded release is 2022.1.9, it
# will be stored in build/2022.1.9).
if sys.argv[1] == '--release':
    release = sys.argv[2]
    exe = run.download_precompiled_scylla(release)
    cmd = lambda pid, dir: run.run_precompiled_scylla_cmd(exe, pid, dir)
    check_cql = run.check_cql
    sys.argv = sys.argv[0:1] + sys.argv[3:]
# If the "--vnodes" option is given, drop the "tablets" experimental
# feature (turned on in run.py) so that all tests will be run with the
# old vnode-based replication instead of tablets. This option only has


@@ -275,11 +275,11 @@ def run_scylla_cmd(pid, dir):
'--smp', '2',
'-m', '1G',
'--overprovisioned',
-'--max-networking-io-control-blocks', '1000',
+'--max-networking-io-control-blocks=1000',
'--unsafe-bypass-fsync', '1',
-'--kernel-page-cache', '1',
+'--kernel-page-cache=1',
'--commitlog-use-o-dsync', '0',
-'--flush-schema-tables-after-modification', 'false',
+'--flush-schema-tables-after-modification=false',
'--api-address', ip,
'--rpc-address', ip,
'--listen-address', ip,
@@ -292,7 +292,7 @@ def run_scylla_cmd(pid, dir):
'--logger-log-level', 'migration_manager=warn',
# Use lower settings for some parameters to allow faster testing
'--num-tokens', '16',
-'--query-tombstone-page-limit', '1000',
+'--query-tombstone-page-limit=1000',
# Significantly increase default timeouts to allow running tests
# on a very slow setup (but without network losses). Note that these
# are server-side timeouts: The client should also avoid timing out
@@ -318,15 +318,15 @@ def run_scylla_cmd(pid, dir):
# and other modules dependent on it: e.g. service levels
'--authenticator', 'PasswordAuthenticator',
'--authorizer', 'CassandraAuthorizer',
-'--strict-allow-filtering', 'true',
-'--strict-is-not-null-in-views', 'true',
+'--strict-allow-filtering=true',
+'--strict-is-not-null-in-views=true',
'--permissions-update-interval-in-ms', '100',
'--permissions-validity-in-ms', '5',
'--shutdown-announce-in-ms', '0',
-'--maintenance-socket', 'workdir',
-'--service-levels-interval-ms', '500',
+'--maintenance-socket=workdir',
+'--service-levels-interval-ms=500',
# Avoid unhelpful "guardrails" warnings
-'--minimum-replication-factor-warn-threshold', '-1',
+'--minimum-replication-factor-warn-threshold=-1',
], env)
# Same as run_scylla_cmd, just use SSL encryption for the CQL port (same
@@ -340,6 +340,55 @@ def run_scylla_ssl_cql_cmd(pid, dir):
]
return (cmd, env)
# Download the requested precompiled Scylla release using fetch_scylla.py,
# and cache it in a subdirectory of <source_path>/build whose name is
# the specific release actually downloaded.
# This function returns a path of the executable to run Scylla (actually,
# a shell-script wrapper that sets the LD_LIBRARY_PATH appropriately).
def download_precompiled_scylla(release):
    import fetch_scylla
    return fetch_scylla.download_scylla(release, os.path.join(source_path, 'build'))
# Instead of run_scylla_cmd, which runs Scylla executable compiled in this
# build directory, run_precompiled_scylla_cmd runs a release of Scylla
# downloaded by download_precompiled_scylla:
def run_precompiled_scylla_cmd(exe, pid, dir):
    (cmd, env) = run_scylla_cmd(pid, dir)
    # run_scylla_cmd linked "test_scylla" to the built Scylla, we want to
    # link it to the wrapper script we just downloaded:
    scylla_link = os.path.join(dir, 'test_scylla')
    os.unlink(scylla_link)
    os.symlink(exe, scylla_link)
    # Unfortunately, earlier Scylla versions required different command line
    # options to run, so for some old versions we need to drop some of the
    # command line options added above in run_scylla_cmd, or add more
    # options. We do this hard-coded for particular versions, which is
    # kind of ugly and high-maintenance :-( Maybe in the future we could
    # detect this automatically by using "--help" or something.
    version = os.path.basename(os.path.dirname(exe)).split('~')[0].split('.')
    major = [int(version[0]), int(version[1])]
    enterprise = major[0] > 2000
    if major[0] < 6 or (enterprise and major <= [2024,1]):
        cmd.remove('--enable-tablets=true')
        cmd.remove('--maintenance-socket=workdir')
        cmd.remove('--service-levels-interval-ms=500')
    if major < [5,4] or (enterprise and major <= [2023,1]):
        cmd.remove('--strict-is-not-null-in-views=true')
        cmd.remove('--minimum-replication-factor-warn-threshold=-1')
    if major <= [5,1] or (enterprise and major <= [2022,2]):
        cmd.remove('--query-tombstone-page-limit=1000')
    if major <= [5,0] or (enterprise and major <= [2022,1]):
        cmd.remove('--experimental-features=keyspace-storage-options')
    if major <= [4,5] or (enterprise and major <= [2021,1]):
        cmd.remove('--kernel-page-cache=1')
        cmd.remove('--flush-schema-tables-after-modification=false')
        cmd.remove('--strict-allow-filtering=true')
    if major <= [4,5]:
        cmd.remove('--max-networking-io-control-blocks=1000')
    if major == [5,4] or major == [2024,1]:
        cmd.append('--force-schema-commit-log=true')
    return (cmd, env)
# Get a Cluster object to connect to CQL at the given IP address (and with
# the appropriate username and password). It's important to shutdown() this
# Cluster object when done with it, otherwise we can get errors at the end