Test suites
A test suite is a declarative list of CLI invocations and the results each one should produce. Click Extra runs the suite against any command or binary as separate subprocesses, checking exit codes and output. It is the black-box, subprocess-level complement to CliRunner, which drives a CLI in-process: a test suite never imports the target, so it works just as well against a compiled binary, a shell command, or a CLI written in another language.
Important
A suite file’s format is taken from its extension. TOML (.toml) and JSON (.json) work out of the box; YAML (.yaml, .yml), JSON5, JSONC, and Hjson each need their parser installed. See extra dependencies for the optional installs, and the --config formats for the full list of formats and extensions.
The engine itself (building CLITestCase objects and running them) needs none of these: only parsing a serialized suite does.
Writing a suite
A suite is a list of cases. Each entry is one case: the parameters to append to the command, plus the expectations to check. A case with no expectation only asserts that the command ran.
The same suite is shown below in every supported format. TOML and JSON come first, since they need no extra dependency. TOML has no bare top-level list, so its cases sit under a [[cases]] array of tables; every other format is a bare list of case mappings.
TOML:
[[cases]]
cli_parameters = "--version"
exit_code = 0

[[cases]]
cli_parameters = "forecast --city paris"
stdout_contains = "Sunny"
timeout = 5

[[cases]]
cli_parameters = "--help"
stdout_regex_matches = ["Usage:.+"]
skip_platforms = ["windows"]
JSON:
[
  {
    "cli_parameters": "--version",
    "exit_code": 0
  },
  {
    "cli_parameters": "forecast --city paris",
    "stdout_contains": "Sunny",
    "timeout": 5
  },
  {
    "cli_parameters": "--help",
    "stdout_regex_matches": ["Usage:.+"],
    "skip_platforms": ["windows"]
  }
]
YAML:
- cli_parameters: --version
  exit_code: 0
- cli_parameters: forecast --city paris
  stdout_contains: Sunny
  timeout: 5
- cli_parameters: --help
  stdout_regex_matches:
    - Usage:.+
  skip_platforms:
    - windows
JSON5:
[
  // Print the version and check it exits cleanly.
  {
    cli_parameters: '--version',
    exit_code: 0,
  },
  {
    cli_parameters: 'forecast --city paris',
    stdout_contains: 'Sunny',
    timeout: 5,
  },
  {
    cli_parameters: '--help',
    stdout_regex_matches: ['Usage:.+'],
    skip_platforms: ['windows'],
  },
]
JSONC:
[
  // Print the version and check it exits cleanly.
  {
    "cli_parameters": "--version",
    "exit_code": 0,
  },
  {
    "cli_parameters": "forecast --city paris",
    "stdout_contains": "Sunny",
    "timeout": 5,
  },
  {
    "cli_parameters": "--help",
    "stdout_regex_matches": ["Usage:.+"],
    "skip_platforms": ["windows"],
  },
]
Hjson:
[
  # Print the version and check it exits cleanly.
  {
    cli_parameters: --version
    exit_code: 0
  }
  {
    cli_parameters: forecast --city paris
    stdout_contains: Sunny
    timeout: 5
  }
  {
    cli_parameters: --help
    stdout_regex_matches: [
      Usage:.+
    ]
    skip_platforms: [
      windows
    ]
  }
]
The directives map one-to-one onto CLITestCase fields:
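| Directive | Meaning |
|---|---|
| cli_parameters | Arguments appended to the command (a string is split, a list is used as-is). |
| exit_code | The expected process exit code. |
| stdout_contains, stderr_contains | Substrings that must appear. |
| stdout_regex_matches, stderr_regex_matches | Regexes that must each match somewhere. |
| stdout_regex_fullmatch, stderr_regex_fullmatch | A regex that must fully match, line by line. |
| output_contains, output_regex_matches, output_regex_fullmatch | The same three checks, but against the combined output (stdout and stderr interleaved in the order the command wrote them, like a terminal). |
| strip_ansi | Strip ANSI escapes before matching. |
| timeout | Seconds before the case fails as a timeout. |
| skip_platforms, only_platforms | Platforms (or platform-group IDs) on which to skip the case, or to which to restrict it. |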
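To make the first few rows concrete, here is a minimal standard-library sketch of argument splitting, substring checks, and regex searches. It is an illustration, not Click Extra's code: the function names are hypothetical, `shlex` splitting follows what the API reference describes for non-Windows platforms, and `re.DOTALL` follows the note on the regex fields there. The fullmatch check is left out because its line-by-line semantics are not spelled out on this page.

```python
import re
import shlex


def split_parameters(cli_parameters):
    """A string is split into arguments; a list or tuple is used as-is."""
    if isinstance(cli_parameters, str):
        return shlex.split(cli_parameters)
    return list(cli_parameters)


def contains_all(output, substrings):
    """True when every substring appears somewhere in the output."""
    return all(s in output for s in substrings)


def regexes_all_match(output, patterns):
    """True when every regex matches somewhere in the output (searched)."""
    return all(re.search(p, output, re.DOTALL) is not None for p in patterns)


# Hypothetical captured stdout of a `--help` invocation.
stdout = "Usage: weather [OPTIONS] COMMAND\n\nOptions:\n  --version\n"

args_from_string = split_parameters("forecast --city paris")
args_from_list = split_parameters(["forecast", "--city", "new york"])
has_usage = contains_all(stdout, ["Usage:", "--version"])
has_sunny = contains_all(stdout, ["Sunny"])
# With re.DOTALL, ".*" may span the newline between the two lines.
spans_lines = regexes_all_match(stdout, ["Usage:.+", r"Options:.*--version"])
```

Passing a list keeps an argument such as "new york" intact, which is the main reason to prefer it over a string when a parameter contains spaces.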
The output_* directives are mutually exclusive with the stdout_* / stderr_* ones in a single case, since one subprocess run captures either the merged stream or the separate ones. For order-sensitive checks make the command write unbuffered (like python -u): a child that block-buffers stdout will have it surface after stderr.
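The reason the two directive families cannot be mixed shows up with the standard `subprocess` module. This is a sketch of the two capture modes, not necessarily how Click Extra captures output; the child program is a hypothetical one-liner.

```python
import subprocess
import sys

# A child that writes one line to stdout, then one line to stderr.
CHILD = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"
# -u keeps the child unbuffered, so its write order is preserved.
command = [sys.executable, "-u", "-c", CHILD]

# Separate streams: what stdout_* and stderr_* checks would inspect.
separate = subprocess.run(command, capture_output=True, text=True)

# Merged stream: what output_* checks would inspect. stderr is redirected
# into the stdout pipe, so no separate stderr is left to check.
merged = subprocess.run(
    command, stdout=subprocess.PIPE, stderr=subprocess.STDOUT, text=True
)

print(repr(separate.stdout), repr(separate.stderr))
print(repr(merged.stdout), repr(merged.stderr))
```

In the merged run `stderr` comes back as `None`, so a single execution can serve one family of checks or the other, not both.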
Running from the command line
The click-extra test-suite subcommand runs a suite against a target. Point it at a command on the PATH, a command line, or a path to a binary:
$ click-extra test-suite --command weather --suite-file suite.yaml
Running 3 test cases across 7 workers (os.cpu_count()=8).
Test suite results - Total: 3, Skipped: 0, Failed: 0
Cases run in parallel by default, using one fewer worker than the available logical CPUs (see --jobs). Pass --jobs max to use every core, or --jobs 1 for sequential execution, which lets --exit-on-error stop on the first failure. On an interactive terminal a spinner reports progress; it is silent in pipes and CI logs, and --no-progress turns it off.
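The interaction between the worker count and stopping on the first failure can be sketched with the standard library. This is an illustration of the behavior described above, not Click Extra's implementation: `default_jobs`, `run_all`, and `fake_case` are hypothetical names, and the floor of one worker is an assumption.

```python
import os
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def default_jobs():
    """One fewer than the logical CPUs (floor of one is an assumption)."""
    return max(1, (os.cpu_count() or 1) - 1)


def run_all(cases, run_case, jobs, exit_on_error=False):
    """Tally results; run_case(case) returns True on success."""
    counter = Counter(total=len(cases), failed=0)
    if jobs == 1:
        # Sequential and lazy: a failure can stop before later cases start.
        for case in cases:
            if not run_case(case):
                counter["failed"] += 1
                if exit_on_error:
                    break
    else:
        # Parallel: every case is submitted, so all of them run to completion.
        with ThreadPoolExecutor(max_workers=jobs) as pool:
            for passed in pool.map(run_case, cases):
                counter["failed"] += 0 if passed else 1
    return counter


started = []


def fake_case(case):
    """Stand-in for a real subprocess run; records which cases started."""
    started.append(case)
    return case != "bad"


cases = ["ok", "bad", "never reached"]
sequential = run_all(cases, fake_case, jobs=1, exit_on_error=True)
started_sequentially = list(started)

started.clear()
parallel = run_all(cases, fake_case, jobs=2)
started_in_parallel = sorted(started)
```

In the sequential run the third case never starts, while the thread pool runs all three even though one fails.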
Configuring the suite
Rather than passing --suite-file every time, a project can declare its suite once under [tool.click-extra.test-suite], and click-extra test-suite picks it up when no suite is given on the command line:
[tool.click-extra.test-suite]
file = "tests/cli-test-suite.toml" # default; format taken from the extension
# inline = "- cli_parameters: --version" # or embed a YAML suite directly
# timeout = 30 # default per-case timeout in seconds
Or write the cases natively in the config file itself, under a cases array of tables — no separate suite file needed:
[[tool.click-extra.test-suite.cases]]
cli_parameters = "--version"
exit_code = 0
[[tool.click-extra.test-suite.cases]]
cli_parameters = "forecast --city paris"
stdout_contains = "Sunny"
The resolution precedence is: --suite-file/--suite-envvar, then [tool.click-extra.test-suite] cases, then inline, then file, then a built-in default suite that exercises --version and --help. The config maps onto the TestSuiteConfig schema (wrapped by ClickExtraConfig).
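The precedence order reads naturally as a chain of fallbacks. The sketch below only mirrors the order stated above: `resolve_suite_source` is a hypothetical helper, `cli_suite` stands for a --suite-file or --suite-envvar value, `config` for the parsed [tool.click-extra.test-suite] table, and whether the configured file exists on disk is ignored.

```python
def resolve_suite_source(cli_suite=None, config=None):
    """Return a (source, value) pair following the documented precedence."""
    config = config or {}
    # 1. Anything given on the command line wins.
    if cli_suite is not None:
        return ("command line", cli_suite)
    # 2. Then the config keys, in this order: cases, inline, file.
    for key in ("cases", "inline", "file"):
        if config.get(key):
            return (key, config[key])
    # 3. Otherwise the built-in suite exercising --version and --help.
    return ("built-in default", None)


from_cli = resolve_suite_source("suite.yaml", {"file": "tests/cli-test-suite.toml"})
from_inline = resolve_suite_source(
    config={"inline": "- cli_parameters: --version", "file": "tests/cli-test-suite.toml"}
)
from_nothing = resolve_suite_source()
```

Here `from_cli` resolves to the command-line file even though the config names another one, and `from_inline` prefers the inline suite over the configured file.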
Running from Python
load_test_suite() reads a suite file, picking the format from its extension (use parse_test_suite() for a suite already held as a string), and run_test_suite() runs the cases, returning a Counter of total, skipped, and failed:
from pathlib import Path

from click_extra import load_test_suite, run_test_suite

cases = list(load_test_suite(Path("suite.toml")))
counter = run_test_suite("weather", cases, jobs=4)
if counter["failed"]:
    raise SystemExit(1)
Build cases directly when a suite is computed rather than read from a file (this path needs no parser at all):
from click_extra import CLITestCase, run_test_suite

cases = [
    CLITestCase(cli_parameters="--version", exit_code=0),
    CLITestCase(cli_parameters="forecast --city lyon", stdout_contains="Cloudy"),
]
run_test_suite("weather", cases)
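A computed suite can also be written out as a JSON file and handed to the command line, since JSON needs no extra parser. The sketch below uses only the standard library; the `EXPECTED` mapping of cities to forecasts is hypothetical data, and the file is written to a temporary directory for the sake of the example.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical data: the forecast the target is expected to print per city.
EXPECTED = {"paris": "Sunny", "lyon": "Cloudy"}

# One case mapping per city, keyed by directive names.
cases = [{"cli_parameters": "--version", "exit_code": 0}]
cases += [
    {"cli_parameters": ["forecast", "--city", city], "stdout_contains": forecast}
    for city, forecast in EXPECTED.items()
]

# A JSON suite is a bare list of case mappings.
suite_file = Path(tempfile.mkdtemp()) / "suite.json"
suite_file.write_text(json.dumps(cases, indent=2), encoding="utf-8")
print(suite_file)
```

The resulting file can then be passed with --suite-file, as in the command-line section above.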
click_extra.test_suite API
Declarative, black-box CLI test suites.
A test suite is a list of CLITestCase invocations: each runs a target
command (a name, a command line, or a path to a binary) once with extra
parameters, then checks its exit code and stdout/stderr against literal,
substring, or regex expectations. Cases carry their own platform skip/only
rules, so one suite runs across operating systems unchanged.
Suites are written in any list-capable configuration format and loaded with
load_test_suite() (which picks the format from the file extension) or
parse_test_suite() (which parses a serialized string). TOML and JSON are
built in; YAML and the other SUITE_FORMATS need their matching
click-extra[…] extra. run_test_suite() drives a list of cases against
a target, parallelized per the resolved --jobs count (see
click_extra.execution.run_jobs()) and reporting live progress through a
click_extra.spinner.Spinner.
This is the black-box, subprocess-level complement to
click_extra.testing.CliRunner, which drives a CLI in-process.
- click_extra.test_suite.SUITE_FORMATS: tuple[ConfigFormat, ...] = (ConfigFormat.TOML, ConfigFormat.JSON, ConfigFormat.YAML, ConfigFormat.JSON5, ConfigFormat.JSONC, ConfigFormat.HJSON)
Configuration formats a test suite can be serialized in, built-in ones first.
These are the formats able to represent a top-level list of case mappings, matched against a file’s extension by load_test_suite(). TOML and JSON parse with no extra dependency; the others each need their matching click-extra[…] extra. TOML has no bare top-level array, so a TOML suite lists its cases under a [[cases]] array of tables (see parse_test_suite()); the others use a bare list. INI (no nesting) and XML (no natural list representation) are excluded.
Per-format availability is resolved by ConfigFormat, so a format whose parser is not installed raises an ImportError pointing at its extra at parse time.
- exception click_extra.test_suite.SkippedTest[source]
Bases: Exception
Raised when a test case should be skipped.
- class click_extra.test_suite.CLITestCase(cli_parameters=<factory>, skip_platforms=<factory>, only_platforms=<factory>, timeout=None, exit_code=None, strip_ansi=False, output_contains=<factory>, stdout_contains=<factory>, stderr_contains=<factory>, output_regex_matches=<factory>, stdout_regex_matches=<factory>, stderr_regex_matches=<factory>, output_regex_fullmatch=None, stdout_regex_fullmatch=None, stderr_regex_fullmatch=None, execution_trace=None)[source]
Bases: object
A single CLI test case: how to invoke the command and what to expect.
Each case runs the command-under-test once with cli_parameters appended, then checks the captured result against the expectation directives below. A case with no expectation only asserts the command ran (plus exit_code, if set).
- cli_parameters: tuple[str, ...] | str
Arguments and options appended to the command-under-test.
A plain string is split into arguments (on spaces on Windows, with shlex elsewhere); a list or tuple is used as-is.
- skip_platforms: Trait | Group | str | None | Iterable[Trait | Group | str | None | Iterable[_TNestedReferences]]
Platforms (or platform-group IDs) on which to skip this case.
Accepts extra_platforms identifiers such as linux, macos, windows, in any case, mixed freely with group IDs.
- only_platforms: Trait | Group | str | None | Iterable[Trait | Group | str | None | Iterable[_TNestedReferences]]
Restrict this case to these platforms; skip it everywhere else.
The mirror image of skip_platforms, using the same identifiers.
- timeout: float | str | None = None
Seconds before the command is killed and the case fails as a timeout.
Falls back to the command’s --timeout default, then to no limit.
- strip_ansi: bool = False
Strip ANSI escape sequences from the captured output before matching.
- output_contains: tuple[str, ...] | str
Substrings that must all be present in the combined output.
The combined output interleaves stdout and stderr in the order the command wrote them, matching what a user sees in a terminal. The output_* directives are mutually exclusive with the stdout_* / stderr_* ones: a single subprocess run captures either the merged stream or the separate ones, not both.
- output_regex_matches: tuple[Pattern | str, ...] | str
Regexes that must each match somewhere in the combined output (searched, re.DOTALL). See output_contains for the merged-stream semantics.
- stdout_regex_matches: tuple[Pattern | str, ...] | str
Regexes that must each match somewhere in stdout (searched, re.DOTALL).
- stderr_regex_matches: tuple[Pattern | str, ...] | str
Regexes that must each match somewhere in stderr (searched, re.DOTALL).
- output_regex_fullmatch: Pattern | str | None = None
Regex that must fully match the combined output, line by line. See output_contains for the merged-stream semantics.
- stdout_regex_fullmatch: Pattern | str | None = None
Regex that must fully match stdout, line by line.
- stderr_regex_fullmatch: Pattern | str | None = None
Regex that must fully match stderr, line by line.
- execution_trace: str | None = None
Rendering of the command execution and its output.
Populated after the case runs, for inspection on failure; not a directive you set in a test suite.
- property has_merged_output_directives: bool
Whether any output_* directive (merged stream) is set.
- property has_separate_stream_directives: bool
Whether any stdout_* or stderr_* directive (separate streams) is set.
- run_cli_test(command, additional_skip_platforms, default_timeout)[source]
Run a CLI command and check its output against the test case.
The provided command can be one of:
- a path to a binary or script to execute;
- a command name to be searched in the PATH;
- a command line with arguments to be parsed and executed by the shell.
Todo: Add support for environment variables.
- Return type:
- click_extra.test_suite.cases_from_data(data)[source]
Build CLITestCase instances from already-parsed suite data.
The in-memory counterpart to parse_test_suite() (which parses a string) and load_test_suite() (which reads a file): feed it a suite that is already a Python object, such as the native cases mappings declared in a [tool.<cli>.test-suite] config section.
A suite is a list of case mappings, each keyed by CLITestCase directive names. Formats with no bare top-level array (TOML) carry that list under a top-level cases key, so a mapping is unwrapped here.
- Raises:
ValueError – the suite is empty, a mapping suite omits cases, or a case uses unknown directives.
TypeError – the suite is not a list, or a case is not a mapping.
- Return type:
- click_extra.test_suite.parse_test_suite(suite_string, fmt=ConfigFormat.YAML)[source]
Parse a serialized test suite string into CLITestCase instances.
fmt selects the serialization format, one of SUITE_FORMATS; it defaults to YAML for string sources with no extension to key on (an environment variable, an inline config value). load_test_suite() is the file-based counterpart.
- Raises:
ValueError – the suite is empty, fmt cannot express a suite, a mapping suite omits cases, or a case uses unknown directives.
TypeError – the suite is not a list, or a case is not a mapping.
ImportError – the format’s optional parser is not installed.
- Return type:
- click_extra.test_suite.load_test_suite(path)[source]
Read a test suite file and parse it by the format of its extension.
The format is resolved from path’s name over the list-capable SUITE_FORMATS (so suite.toml parses as TOML, suite.yaml as YAML). Reading and format detection are delegated to click_extra.config.formats.read_file().
- Raises:
ValueError – the file extension matches no suite format.
ImportError – the matched format’s optional parser is not installed.
- Return type:
- click_extra.test_suite.run_test_suite(command, cases, *, jobs=1, select_test=None, skip_platform=None, timeout=None, exit_on_error=False, show_trace_on_error=True, stats=True, show_progress=True)[source]
Run a list of test cases against a target command and tally the results.
Cases are parallelized per jobs (see click_extra.execution.run_jobs()): at one worker they run sequentially and lazily, so exit_on_error can stop before the rest start; otherwise they run in a thread pool and every case runs to completion. Either way outcomes are tallied in submission order. On an interactive terminal a click_extra.spinner.Spinner reports progress unless show_progress is false.
- Parameters:
command (Path | str) – The target to test: a command name, a command line, or a path to a binary or script.
cases (Sequence[CLITestCase]) – The test cases to run.
jobs (int) – Number of parallel workers; 1 runs sequentially.
select_test (Sequence[int] | None) – 1-based case numbers to run; others are skipped.
skip_platform (Trait | Group | str | None | Iterable[Trait | Group | str | None | Iterable[Trait | Group | str | None | Iterable[Trait | Group | str | None | Iterable[_TNestedReferences]]]]) – Extra platforms (or group IDs) to skip every case on.
timeout (float | None) – Default per-case timeout in seconds when a case sets none.
exit_on_error (bool) – Stop at the first failure (sequential runs only).
show_trace_on_error (bool) – Echo the execution trace of each failed case.
stats (bool) – Echo a one-line worker summary up front and a result tally.
show_progress (bool) – Allow the progress spinner on an interactive terminal.
- Return type:
- Returns:
A collections.Counter with total, skipped, and failed keys. A non-zero failed count signals the caller to exit with an error.
Future directions
The current design is a declarative list of directives. Two points of comparison suggest where it could go next.
Click Extra’s click:run and click:source Sphinx directives apply the same run-and-check idea from the documentation side: they execute a CLI in-process while the docs build and assert on its output, so every example doubles as a test. A test suite does it at the subprocess level instead, against any binary. Letting a documented example and a test case share one source is an open avenue.
scrut is a standalone toolkit aimed at the same black-box CLI testing problem, with a different authoring model: expectations are written inline beneath each command in a Markdown or Cram file, and scrut update regenerates them. I came across it after building this feature for my own needs, so the resemblance is convergence, not lineage. Its snapshot-style workflow (generate and refresh expectations instead of hand-writing them), per-case environment and working-directory controls, and glob expectations are the directions worth weighing for a later revision.