Test suites

A test suite is a declarative list of CLI invocations and the results each one should produce. Click Extra runs the suite against any command or binary as separate subprocesses, checking exit codes and output. It is the black-box, subprocess-level complement to CliRunner, which drives a CLI in-process: a test suite never imports the target, so it works just as well against a compiled binary, a shell command, or a CLI written in another language.

Important

A suite file’s format is taken from its extension. TOML (.toml) and JSON (.json) work out of the box; YAML (.yaml, .yml), JSON5, JSONC, and Hjson each need their parser installed. See extra dependencies for the optional installs, and the --config formats for the full list of formats and extensions.

The engine itself (building CLITestCase objects and running them) needs none of these: only parsing a serialized suite does.

Writing a suite

A suite is a list of cases. Each entry is one case: the parameters to append to the command, plus the expectations to check. A case with no expectation only asserts that the command ran.

The same suite is shown below in every supported format. TOML and JSON come first, since they need no extra dependency. TOML has no bare top-level list, so its cases sit under a [[cases]] array of tables; every other format is a bare list of case mappings.

TOML:

[[cases]]
cli_parameters = "--version"
exit_code = 0

[[cases]]
cli_parameters = "forecast --city paris"
stdout_contains = "Sunny"
timeout = 5

[[cases]]
cli_parameters = "--help"
stdout_regex_matches = ["Usage:.+"]
skip_platforms = ["windows"]

JSON:

[
  {
    "cli_parameters": "--version",
    "exit_code": 0
  },
  {
    "cli_parameters": "forecast --city paris",
    "stdout_contains": "Sunny",
    "timeout": 5
  },
  {
    "cli_parameters": "--help",
    "stdout_regex_matches": ["Usage:.+"],
    "skip_platforms": ["windows"]
  }
]

YAML:

- cli_parameters: --version
  exit_code: 0

- cli_parameters: forecast --city paris
  stdout_contains: Sunny
  timeout: 5

- cli_parameters: --help
  stdout_regex_matches:
    - Usage:.+
  skip_platforms:
    - windows

JSON5:

[
  // Print the version and check it exits cleanly.
  {
    cli_parameters: '--version',
    exit_code: 0,
  },
  {
    cli_parameters: 'forecast --city paris',
    stdout_contains: 'Sunny',
    timeout: 5,
  },
  {
    cli_parameters: '--help',
    stdout_regex_matches: ['Usage:.+'],
    skip_platforms: ['windows'],
  },
]

JSONC:

[
  // Print the version and check it exits cleanly.
  {
    "cli_parameters": "--version",
    "exit_code": 0,
  },
  {
    "cli_parameters": "forecast --city paris",
    "stdout_contains": "Sunny",
    "timeout": 5,
  },
  {
    "cli_parameters": "--help",
    "stdout_regex_matches": ["Usage:.+"],
    "skip_platforms": ["windows"],
  },
]

Hjson:

[
  # Print the version and check it exits cleanly.
  {
    cli_parameters: --version
    exit_code: 0
  }
  {
    cli_parameters: forecast --city paris
    stdout_contains: Sunny
    timeout: 5
  }
  {
    cli_parameters: --help
    stdout_regex_matches: [
      Usage:.+
    ]
    skip_platforms: [
      windows
    ]
  }
]

The directives map one-to-one onto CLITestCase fields:

| Directive | Meaning |
| --- | --- |
| cli_parameters | Arguments appended to the command (a string is split, a list is used as-is). |
| exit_code | The expected process exit code. |
| stdout_contains / stderr_contains | Substrings that must appear. |
| stdout_regex_matches / stderr_regex_matches | Regexes that must each match somewhere. |
| stdout_regex_fullmatch / stderr_regex_fullmatch | A regex that must fully match, line by line. |
| output_contains / output_regex_matches / output_regex_fullmatch | The same three checks, but against the combined output (stdout and stderr interleaved in the order the command wrote them, like a terminal). |
| strip_ansi | Strip ANSI escapes before matching. |
| timeout | Seconds before the case fails as a timeout. |
| skip_platforms / only_platforms | extra_platforms identifiers (linux, macos, windows, group IDs) controlling where the case runs. |

The output_* directives are mutually exclusive with the stdout_* / stderr_* ones in a single case, since one subprocess run captures either the merged stream or the separate ones. For order-sensitive checks, make the command write unbuffered output (for example, python -u): a child that block-buffers stdout will have its stdout surface after its stderr.
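
The difference between the two capture modes can be sketched with the standard subprocess module. This is only an illustration of merged versus separate pipes, not Click Extra's own runner; run_merged and run_separate are hypothetical names.

```python
import subprocess
import sys

# Child process writing one line to stdout, then one line to stderr.
CHILD = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"


def run_merged(code):
    """Capture stdout and stderr interleaved through a single pipe."""
    return subprocess.run(
        [sys.executable, "-u", "-c", code],  # -u: unbuffered child output.
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # Redirect stderr into the stdout pipe.
        text=True,
        check=False,
    )


def run_separate(code):
    """Capture stdout and stderr through two independent pipes."""
    return subprocess.run(
        [sys.executable, "-u", "-c", code],
        capture_output=True,
        text=True,
        check=False,
    )


merged = run_merged(CHILD)
separate = run_separate(CHILD)
```

In the merged run, stderr is not available on its own (merged.stderr is None), which is why a single run cannot serve both families of directives.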

Running from the command line

The click-extra test-suite subcommand runs a suite against a target. Point it at a command on the PATH, a command line, or a path to a binary:

$ click-extra test-suite --command weather --suite-file suite.yaml
Running 3 test cases across 7 workers (os.cpu_count()=8).
Test suite results - Total: 3, Skipped: 0, Failed: 0

Cases run in parallel by default, with one fewer worker than the available logical CPUs (see --jobs). Pass --jobs max to use every core, or --jobs 1 for sequential execution, which lets --exit-on-error stop on the first failure. On an interactive terminal a spinner reports progress; it is silent in pipes and CI logs, and --no-progress turns it off.
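
The worker count described above could be resolved along these lines. This is a minimal sketch under stated assumptions: resolve_jobs is a hypothetical function, and the real --jobs option may accept other values or clamp differently.

```python
import os


def resolve_jobs(jobs=None):
    """Hypothetical resolver for a --jobs style value.

    None  -> one fewer than the logical CPU count (at least 1).
    "max" -> every logical CPU.
    int   -> that many workers (at least 1).
    """
    cpus = os.cpu_count() or 1  # os.cpu_count() may return None.
    if jobs is None:
        return max(1, cpus - 1)
    if jobs == "max":
        return cpus
    return max(1, int(jobs))
```

On a machine where os.cpu_count() is 8, the default resolves to 7 workers, consistent with the sample output shown earlier.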

Configuring the suite

Rather than passing --suite-file every time, a project can declare its suite once under [tool.click-extra.test-suite], and click-extra test-suite picks it up when no suite is given on the command line:

[tool.click-extra.test-suite]
file = "tests/cli-test-suite.toml"  # default; format taken from the extension
# inline = "- cli_parameters: --version"  # or embed a YAML suite directly
# timeout = 30  # default per-case timeout in seconds

Or write the cases natively in the config file itself, under a cases array of tables, with no separate suite file needed:

[[tool.click-extra.test-suite.cases]]
cli_parameters = "--version"
exit_code = 0

[[tool.click-extra.test-suite.cases]]
cli_parameters = "forecast --city paris"
stdout_contains = "Sunny"

The resolution precedence is: --suite-file/--suite-envvar, then [tool.click-extra.test-suite] cases, then inline, then file, then a built-in default suite that exercises --version and --help. The config maps onto the TestSuiteConfig schema (wrapped by ClickExtraConfig).
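
The precedence chain reads as a first-match lookup. The sketch below models it with a hypothetical pick_suite_source function; the argument names are illustrative and do not mirror the TestSuiteConfig fields exactly.

```python
def pick_suite_source(cli_suite=None, config_cases=None, inline=None, file=None):
    """Return a (label, value) pair for the first source that is set.

    Hypothetical sketch of the precedence described above.
    """
    candidates = [
        ("command line", cli_suite),  # --suite-file / --suite-envvar
        ("config cases", config_cases),  # [[tool.click-extra.test-suite.cases]]
        ("config inline", inline),
        ("config file", file),
    ]
    for label, value in candidates:
        if value:
            return label, value
    # Nothing configured: fall back to the built-in default suite.
    return "built-in default", None
```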

Running from Python

load_test_suite() reads a suite file, picking the format from its extension (use parse_test_suite() for a suite already held as a string), and run_test_suite() runs the cases, returning a Counter of total, skipped, and failed:

from pathlib import Path

from click_extra import load_test_suite, run_test_suite

cases = list(load_test_suite(Path("suite.toml")))
counter = run_test_suite("weather", cases, jobs=4)
if counter["failed"]:
    raise SystemExit(1)

Build cases directly when a suite is computed rather than read from a file (this path needs no parser at all):

from click_extra import CLITestCase, run_test_suite

cases = [
    CLITestCase(cli_parameters="--version", exit_code=0),
    CLITestCase(cli_parameters="forecast --city lyon", stdout_contains="Cloudy"),
]
run_test_suite("weather", cases)

click_extra.test_suite API

Class diagram: SkippedTest inherits from Exception.

Declarative, black-box CLI test suites.

A test suite is a list of CLITestCase invocations: each runs a target command (a name, a command line, or a path to a binary) once with extra parameters, then checks its exit code and stdout/stderr against literal, substring, or regex expectations. Cases carry their own platform skip/only rules, so one suite runs across operating systems unchanged.

Suites are written in any list-capable configuration format and loaded with load_test_suite() (which picks the format from the file extension) or parse_test_suite() (which parses a serialized string). TOML and JSON are built in; YAML and the other SUITE_FORMATS need their matching click-extra[…] extra. run_test_suite() drives a list of cases against a target, parallelized per the resolved --jobs count (see click_extra.execution.run_jobs()) and reporting live progress through a click_extra.spinner.Spinner.

This is the black-box, subprocess-level complement to click_extra.testing.CliRunner, which drives a CLI in-process.

click_extra.test_suite.SUITE_FORMATS: tuple[ConfigFormat, ...] = (ConfigFormat.TOML, ConfigFormat.JSON, ConfigFormat.YAML, ConfigFormat.JSON5, ConfigFormat.JSONC, ConfigFormat.HJSON)

Configuration formats a test suite can be serialized in, built-in ones first.

These are the formats able to represent a top-level list of case mappings, matched against a file’s extension by load_test_suite(). TOML and JSON parse with no extra dependency; the others each need their matching click-extra[…] extra. TOML has no bare top-level array, so a TOML suite lists its cases under a [[cases]] array of tables (see parse_test_suite()); the others use a bare list. INI (no nesting) and XML (no natural list representation) are excluded.

Per-format availability is resolved by ConfigFormat, so a format whose parser is not installed raises an ImportError pointing at its extra at parse time.

exception click_extra.test_suite.SkippedTest

Bases: Exception

Raised when a test case should be skipped.

class click_extra.test_suite.CLITestCase(cli_parameters=<factory>, skip_platforms=<factory>, only_platforms=<factory>, timeout=None, exit_code=None, strip_ansi=False, output_contains=<factory>, stdout_contains=<factory>, stderr_contains=<factory>, output_regex_matches=<factory>, stdout_regex_matches=<factory>, stderr_regex_matches=<factory>, output_regex_fullmatch=None, stdout_regex_fullmatch=None, stderr_regex_fullmatch=None, execution_trace=None)

Bases: object

A single CLI test case: how to invoke the command and what to expect.

Each case runs the command-under-test once with cli_parameters appended, then checks the captured result against the expectation directives below. A case with no expectation only asserts the command ran (plus exit_code, if set).

cli_parameters: tuple[str, ...] | str

Arguments and options appended to the command-under-test.

A plain string is split into arguments (on spaces on Windows, with shlex elsewhere); a list or tuple is used as-is.
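
A minimal sketch of that normalization, assuming a plain whitespace split on Windows and shlex.split() elsewhere. The split_parameters helper and its windows flag are hypothetical, and the library's exact splitting rules may differ.

```python
import shlex
import sys


def split_parameters(cli_parameters, windows=sys.platform == "win32"):
    """Normalize cli_parameters into a tuple of arguments (illustrative)."""
    if isinstance(cli_parameters, str):
        if windows:
            # Plain whitespace split: no POSIX quoting rules applied.
            return tuple(cli_parameters.split())
        # POSIX-style splitting honors quotes and escapes.
        return tuple(shlex.split(cli_parameters))
    # Lists and tuples are taken as-is.
    return tuple(cli_parameters)
```

When an argument contains spaces, passing a list avoids any dependence on platform-specific splitting.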

skip_platforms: Trait | Group | str | None | Iterable[Trait | Group | str | None | Iterable[_TNestedReferences]]

Platforms (or platform-group IDs) on which to skip this case.

Accepts extra_platforms identifiers such as linux, macos, windows, in any case, mixed freely with group IDs.

only_platforms: Trait | Group | str | None | Iterable[Trait | Group | str | None | Iterable[_TNestedReferences]]

Restrict this case to these platforms; skip it everywhere else.

The mirror image of skip_platforms, using the same identifiers.
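
The interaction of the two platform fields can be sketched as below. This simplified version works on plain lowercase strings and does not use extra_platforms, so group IDs are not handled; current_platform and should_skip are hypothetical helpers.

```python
import sys


def current_platform():
    """Map sys.platform to a coarse identifier (simplified, illustrative)."""
    if sys.platform.startswith("linux"):
        return "linux"
    if sys.platform == "darwin":
        return "macos"
    if sys.platform in ("win32", "cygwin"):
        return "windows"
    return sys.platform


def should_skip(platform, skip_platforms=(), only_platforms=()):
    """Return True when a case must not run on the given platform."""
    # Identifiers are compared case-insensitively.
    skip = {p.lower() for p in skip_platforms}
    only = {p.lower() for p in only_platforms}
    if platform in skip:
        return True
    # A non-empty only-list excludes every platform not named in it.
    return bool(only) and platform not in only
```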

timeout: float | str | None = None

Seconds before the command is killed and the case fails as a timeout.

Falls back to the command’s --timeout default, then to no limit.
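
The fallback order, and what "fails as a timeout" means at the subprocess level, can be sketched with the standard library. Both functions below are hypothetical and only approximate the behavior described here.

```python
import subprocess
import sys


def effective_timeout(case_timeout=None, default_timeout=None):
    """Case value first, then the command-level default, then no limit."""
    if case_timeout is not None:
        return float(case_timeout)  # Accepts numbers or numeric strings.
    if default_timeout is not None:
        return float(default_timeout)
    return None  # subprocess treats None as "no time limit".


def timed_out(args, timeout):
    """Run args and report whether the process exceeded the timeout."""
    try:
        subprocess.run(args, capture_output=True, timeout=timeout, check=False)
    except subprocess.TimeoutExpired:
        # subprocess.run() kills the child before re-raising.
        return True
    return False
```

For example, timed_out([sys.executable, "-c", "import time; time.sleep(10)"], 0.5) reports a timeout after about half a second instead of waiting for the child to finish.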

exit_code: int | str | None = None

Expected process exit code; the case fails on any other code.

strip_ansi: bool = False

Strip ANSI escape sequences from the captured output before matching.
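
To show why stripping matters for substring checks, here is a minimal sketch that removes CSI escape sequences (colors, cursor movement) with a regular expression. It is an assumption about the general technique, not the library's implementation, and it deliberately ignores other escape families such as OSC.

```python
import re

# CSI sequences: ESC [ then parameter bytes, intermediate bytes, a final byte.
ANSI_CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def strip_ansi(text):
    """Remove CSI escape sequences from text (illustrative, not exhaustive)."""
    return ANSI_CSI.sub("", text)


colored = "\x1b[1;32mSunny\x1b[0m in \x1b[4mParis\x1b[0m"
plain = strip_ansi(colored)
```

Without stripping, a check for the substring "Sunny in Paris" would fail against the colored string, because the escape codes sit between the words.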

output_contains: tuple[str, ...] | str

Substrings that must all be present in the combined output.

The combined output interleaves stdout and stderr in the order the command wrote them, matching what a user sees in a terminal. The output_* directives are mutually exclusive with the stdout_* / stderr_* ones: a single subprocess run captures either the merged stream or the separate ones, not both.

stdout_contains: tuple[str, ...] | str

Substrings that must all be present in stdout.

stderr_contains: tuple[str, ...] | str

Substrings that must all be present in stderr.

output_regex_matches: tuple[Pattern | str, ...] | str

Regexes that must each match somewhere in the combined output (searched, re.DOTALL). See output_contains for the merged-stream semantics.

stdout_regex_matches: tuple[Pattern | str, ...] | str

Regexes that must each match somewhere in stdout (searched, re.DOTALL).

stderr_regex_matches: tuple[Pattern | str, ...] | str

Regexes that must each match somewhere in stderr (searched, re.DOTALL).
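
The "searched, re.DOTALL" semantics of the three *_regex_matches fields can be sketched as follows. The all_match_somewhere helper is hypothetical; the point is that re.DOTALL lets . cross line boundaries, so one pattern can span several lines of output.

```python
import re

STDOUT = "Usage: weather [OPTIONS]\n\nOptions:\n  --help  Show this message.\n"


def all_match_somewhere(patterns, text):
    """True when every pattern is found somewhere in text."""
    compiled = [
        # Strings are compiled with DOTALL; precompiled patterns keep their flags.
        re.compile(p, re.DOTALL) if isinstance(p, str) else p
        for p in patterns
    ]
    return all(rx.search(text) for rx in compiled)
```

Here all_match_somewhere([r"Usage:.+--help"], STDOUT) succeeds, while the same pattern searched without re.DOTALL finds nothing because --help sits on a later line.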

output_regex_fullmatch: Pattern | str | None = None

Regex that must fully match the combined output, line by line. See output_contains for the merged-stream semantics.

stdout_regex_fullmatch: Pattern | str | None = None

Regex that must fully match stdout, line by line.

stderr_regex_fullmatch: Pattern | str | None = None

Regex that must fully match stderr, line by line.

execution_trace: str | None = None

Rendering of the command execution and its output.

Populated after the case runs, for inspection on failure; not a directive you set in a test suite.

property has_merged_output_directives: bool

Whether any output_* directive (merged stream) is set.

property has_separate_stream_directives: bool

Whether any stdout_* or stderr_* directive (separate streams) is set.

run_cli_test(command, additional_skip_platforms, default_timeout)

Run a CLI command and check its output against the test case.

The provided command can be either:

  • a path to a binary or script to execute;

  • a command name to be searched in the PATH;

  • a command line with arguments to be parsed and executed by the shell.

Todo: Add support for environment variables.

Return type:

None

click_extra.test_suite.cases_from_data(data)

Build CLITestCase instances from already-parsed suite data.

The in-memory counterpart to parse_test_suite() (which parses a string) and load_test_suite() (which reads a file): feed it a suite that is already a Python object, such as the native cases mappings declared in a [tool.<cli>.test-suite] config section.

A suite is a list of case mappings, each keyed by CLITestCase directive names. Formats with no bare top-level array (TOML) carry that list under a top-level cases key, so a mapping is unwrapped here.

Raises:
  • ValueError – the suite is empty, a mapping suite omits cases, or a case uses unknown directives.

  • TypeError – the suite is not a list, or a case is not a mapping.

Return type:

Generator[CLITestCase, None, None]
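
The unwrapping and validation rules listed above can be sketched with a cut-down dataclass. MiniCase and mini_cases_from_data are hypothetical stand-ins with only three directives; the order of checks and the error messages are assumptions.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class MiniCase:
    """Hypothetical stand-in for CLITestCase with only three directives."""

    cli_parameters: str = ""
    exit_code: Optional[int] = None
    stdout_contains: str = ""


def mini_cases_from_data(data):
    """Yield MiniCase objects from already-parsed suite data."""
    if isinstance(data, dict):
        # Mapping suites (such as TOML) carry the list under "cases".
        if "cases" not in data:
            raise ValueError("Mapping suite has no 'cases' key.")
        data = data["cases"]
    if not isinstance(data, list):
        raise TypeError(f"Suite must be a list, got {type(data).__name__}.")
    if not data:
        raise ValueError("Suite is empty.")
    known = {f.name for f in fields(MiniCase)}
    for case in data:
        if not isinstance(case, dict):
            raise TypeError(f"Case must be a mapping, got {type(case).__name__}.")
        unknown = set(case) - known
        if unknown:
            raise ValueError(f"Unknown directives: {sorted(unknown)}")
        yield MiniCase(**case)
```

Because the function is a generator, nothing is validated until it is iterated, for instance by wrapping the call in list().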

click_extra.test_suite.parse_test_suite(suite_string, fmt=ConfigFormat.YAML)

Parse a serialized test suite string into CLITestCase instances.

fmt selects the serialization format, one of SUITE_FORMATS; it defaults to YAML for string sources with no extension to key on (an environment variable, an inline config value). load_test_suite() is the file-based counterpart.

Raises:
  • ValueError – the suite is empty, fmt cannot express a suite, a mapping suite omits cases, or a case uses unknown directives.

  • TypeError – the suite is not a list, or a case is not a mapping.

  • ImportError – the format’s optional parser is not installed.

Return type:

Generator[CLITestCase, None, None]

click_extra.test_suite.load_test_suite(path)

Read a test suite file and parse it by the format of its extension.

The format is resolved from path’s name over the list-capable SUITE_FORMATS (so suite.toml parses as TOML, suite.yaml as YAML). Reading and format detection are delegated to click_extra.config.formats.read_file().

Raises:
  • ValueError – the file extension matches no suite format.

  • ImportError – the matched format’s optional parser is not installed.

Return type:

Generator[CLITestCase, None, None]

click_extra.test_suite.run_test_suite(command, cases, *, jobs=1, select_test=None, skip_platform=None, timeout=None, exit_on_error=False, show_trace_on_error=True, stats=True, show_progress=True)

Run a list of test cases against a target command and tally the results.

Cases are parallelized per jobs (see click_extra.execution.run_jobs()): at one worker they run sequentially and lazily, so exit_on_error can stop before the rest start; otherwise they run in a thread pool and every case runs to completion. Either way outcomes are tallied in submission order. On an interactive terminal a click_extra.spinner.Spinner reports progress unless show_progress is false.

Parameters:
  • command (Path | str) – The target to test: a command name, a command line, or a path to a binary or script.

  • cases (Sequence[CLITestCase]) – The test cases to run.

  • jobs (int) – Number of parallel workers; 1 runs sequentially.

  • select_test (Sequence[int] | None) – 1-based case numbers to run; others are skipped.

  • skip_platform (Trait | Group | str | None | Iterable[Trait | Group | str | None | Iterable[_TNestedReferences]]) – Extra platforms (or group IDs) to skip every case on.

  • timeout (float | None) – Default per-case timeout in seconds when a case sets none.

  • exit_on_error (bool) – Stop at the first failure (sequential runs only).

  • show_trace_on_error (bool) – Echo the execution trace of each failed case.

  • stats (bool) – Echo a one-line worker summary up front and a result tally.

  • show_progress (bool) – Allow the progress spinner on an interactive terminal.

Return type:

Counter

Returns:

A collections.Counter with total, skipped, and failed keys. A non-zero failed count signals the caller to exit with an error.
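
The run-and-tally shape described above can be sketched with a thread pool and a Counter. This is not the library's implementation: run_all and the run_one callable (returning "passed", "skipped", or "failed") are hypothetical, and it assumes total counts every case, including skipped ones.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def run_all(cases, run_one, jobs=1):
    """Run cases with run_one and tally outcomes in submission order."""
    if jobs <= 1:
        # Lazy and sequential: each case starts only when the previous is tallied.
        outcomes = map(run_one, cases)
    else:
        with ThreadPoolExecutor(max_workers=jobs) as pool:
            # Executor.map() yields results in submission order.
            outcomes = list(pool.map(run_one, cases))
    counter = Counter(total=0, skipped=0, failed=0)
    for outcome in outcomes:
        counter["total"] += 1
        if outcome in ("skipped", "failed"):
            counter[outcome] += 1
    return counter
```

A caller can then exit with an error when counter["failed"] is non-zero, as in the Running from Python example.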

Future directions

The current design is a declarative list of directives. Two points of comparison suggest where it could go next.

Click Extra’s click:run and click:source Sphinx directives apply the same run-and-check idea from the documentation side: they execute a CLI in-process while the docs build and assert on its output, so every example doubles as a test. A test suite does it at the subprocess level instead, against any binary. Letting a documented example and a test case share one source is an open avenue.

scrut is a standalone toolkit aimed at the same black-box CLI testing problem, with a different authoring model: expectations are written inline beneath each command in a Markdown or Cram file, and scrut update regenerates them. I came across it after building this feature for my own needs, so the resemblance is convergence, not lineage. Its snapshot-style workflow (generate and refresh expectations instead of hand-writing them), per-case environment and working-directory controls, and glob expectations are the directions worth weighing for a later revision.