Running Specs
The behave command runs Behave specs.
Default behavior
With no arguments, behave looks for a specs/ directory under the current working directory and runs every file matching spec.raku (recursively).
Selecting files
Pass one or more spec file paths to run a subset.
Local development
When you're working on Behave itself (or your project's lib/ is not yet installed), tell Raku where to find the modules, for example by passing -I lib to raku or by setting the RAKULIB environment variable.
Options
| Option | Effect |
|---|---|
| --help | Display usage. |
| --verbose | Print each spec file as it is loaded. |
| --tag NAME | Run only examples tagged NAME (repeatable; OR semantics). See Tags. |
| --exclude-tag NAME | Skip examples tagged NAME (repeatable). |
| --example PATTERN | Run only examples whose full nested description matches PATTERN (substring, or /regex/). Repeatable; OR semantics. |
| -e PATTERN | Alias for --example. |
| --aggregate-failures / --aggregate-failures=LABEL | Wrap every example in aggregate-failures semantics; converts uncaught example exceptions into recorded failures. With =LABEL, the label tags each failure. Per-example/group :aggregate-failures metadata overrides this. See Aggregate failures. |
| --order ORDER | Example execution order: random (default) or defined. Random order shuffles the children of every group and the suite, surfacing hidden order dependencies. See Order and seed. |
| --seed N | Seed the random-order RNG for reproducible runs. Ignored when --order=defined. Auto-generated when omitted and --order=random. See Order and seed. |
| --fail-fast | Stop after the first failed example. Equivalent to --fail-fast=1. See Fail-fast. |
| --fail-fast=N | Stop after N failed examples (N must be a positive integer). See Fail-fast. |
| --retry N | Retry failing examples up to N additional times (a total of N+1 attempts). Per-example :retry(M) metadata overrides this default. See Retry and Only-Failures. |
| --only-failures | Run only examples that failed in the previous run (read from .behave-failures). See Retry and Only-Failures. |
| --failures-path=PATH | Override the path used to persist (and read, with --only-failures) the list of failing examples. Defaults to ./.behave-failures. See Retry and Only-Failures. |
| --only-example LOC | Run only examples whose file:line matches LOC (repeatable; OR semantics). LOC is FILE:LINE, where FILE may be absolute, relative, or a basename. See Bisect. |
| --bisect | Find the minimal set of examples that, run in declared order before each failing example, reproduce the failure. See Bisect. |
| --bisect-data | Machine-readable output for use by --bisect. Suppresses normal output and emits behave-executed: / behave-failed: lines. See Bisect. |
| --profile / --profile=N | Print the top N slowest examples after the run (default N=10). Across multiple spec files, and across --parallel workers, the profile is a single global section after the Overall: counts. See Timing. |
| --slow-threshold=SECONDS | Print an inline SLOW line under any example whose body takes at least SECONDS seconds. SECONDS may be fractional. See Timing. |
| --memory-profile / --memory-profile=N | Track per-example RSS deltas and print the top N memory-heaviest examples after the run (default N=10). Aggregated across --parallel workers. See Memory profiling. |
| --memory-threshold=KB | Print an inline MEMORY line under any example whose RSS delta meets or exceeds KB kilobytes. Enables measurement on its own. See Memory profiling. |
| --format NAME | Select the output formatter for the run. NAME is the name of a registered formatter (default is built in). See Formatters. |
| --config PATH | Load Raku-based config from PATH. Skips the default ~/.behave and ./.behave lookups. See Configuration. |
| --no-config / --no-user-config / --no-project-config | Skip all / user / project config files for this run. BEHAVE_DISABLE_CONFIG=1 is equivalent to --no-config. See Configuration. |
| --parallel N | Run specs across N worker subprocesses with group-affinity LPT distribution. Mutually exclusive with --bisect / --bisect-data. Aggregates coverage across workers when combined with --coverage (see Coverage under --parallel). Ignored under --doc. See Parallel Execution. |
| --seed-mode MODE | How --seed combines with --parallel N. xor (default) derives per-worker seeds as seed XOR worker-index and uses LPT distribution. stable keeps the global execution order identical regardless of N via hash-based bucket assignment. See Seed mode. |
| --parallel-mode MODE | Bucket distribution strategy under --parallel N. lpt (default) uses static longest-processing-time-first assignment by example count. queue uses dynamic work-stealing: workers pull buckets from a parent-managed queue, useful when bucket runtimes are wildly uneven. Forces --order=defined per worker. See Parallel mode. |
| --parallel-retry N | When a worker subprocess crashes (exits with code > 1: signal, OOM, or an uncaught exception in the runner itself, as opposed to test failures), re-spawn it with the same manifest up to N additional times. Default 0. Composes with --retry (per-example flake retry). LPT mode only; queue-mode crashes remain fatal. See Per-shard retry. |
| --progress-total | Append a running (N/TOTAL) counter after each example character emitted by the progress formatter under --parallel. No-op without --parallel. See Live progress totals. |
| --watch | Watch source and spec files; re-run affected specs whenever a file changes. Reads r/a/f/q commands from stdin. Mutually exclusive with --bisect / --bisect-data / --coverage / --doc / --parallel. See Watch Mode. |
| --watch-path PATH | Add PATH to the watched roots (repeatable). Defaults to ./lib and ./specs when omitted. See Watch Mode. |
| --dry-run | Load specs but skip execution; print the hierarchical example list and a count. Honors --tag, --exclude-tag, --example, --only-example, focus mode, and skipped examples. Combine with --verbose to show each example's file:line and effective tags. See Dry run and listing. |
| --list-examples | Emit the metadata query result one line per example, as FILE:LINE\t<full description> by default. Designed for editor integrations (run-this-test, jump-to-failure). See Dry run and listing. |
| --list-examples-format=FORMAT | Output format for --list-examples: text (default) or json. The JSON document is {version, count, examples: [...]} with every example's tags, metadata, and status. See Dry run and listing. |
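The --parallel, --parallel-mode lpt, and --seed-mode xor rows describe two small computations that are easier to follow in code. The following is a minimal Python sketch, not Behave's implementation: the function names, the bucket names, and the 0-based worker index are assumptions made for illustration. It shows longest-processing-time-first assignment by example count and per-worker seeds derived as seed XOR worker-index.

```python
import heapq


def lpt_assign(bucket_sizes, workers):
    """Assign buckets to workers, largest first, each to the least-loaded worker.

    bucket_sizes maps a bucket name to its example count (hypothetical input shape).
    """
    # Min-heap of (current_load, worker_index) so the lightest worker pops first.
    heap = [(0, w) for w in range(workers)]
    heapq.heapify(heap)
    assignment = {w: [] for w in range(workers)}
    # Largest buckets first; ties broken by name so the result is deterministic.
    for name, size in sorted(bucket_sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        load, w = heapq.heappop(heap)
        assignment[w].append(name)
        heapq.heappush(heap, (load + size, w))
    return assignment


def worker_seeds(seed, workers):
    """Per-worker seeds as seed XOR worker-index (0-based index is an assumption)."""
    return [seed ^ w for w in range(workers)]
```

Because the assignment is static and driven only by example counts, a bucket with few but slow examples can still unbalance the workers, which is the case the queue mode is described as addressing.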
Order and seed
Behave runs examples in random order by default. This shuffles the children of every describe / context group (and the top-level suite) before execution. Random ordering catches accidental order dependencies — examples that pass only because a sibling ran first — and is the recommended default.
When a run finishes, Behave prints the seed it used, so the order is reproducible.
Pass --seed N to reproduce a specific order.
If random order surfaces a failure, the seed in the summary is all you need to re-run the same permutation.
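To illustrate why a seed is enough to replay a permutation, here is a minimal Python sketch of seeded shuffling over a group tree. It is an illustration only: the dict-based tree shape and the function names are hypothetical, and Python's RNG is not Raku's, so a given seed will not produce the same order that Behave does.

```python
import random


def shuffle_tree(group, rng):
    """Return a copy of `group` with the children of every group shuffled."""
    children = [
        # Nested groups are dicts; examples are plain strings (hypothetical shape).
        shuffle_tree(child, rng) if isinstance(child, dict) else child
        for child in group["children"]
    ]
    rng.shuffle(children)
    return {"name": group["name"], "children": children}


def ordered_suite(suite, seed):
    """Shuffle the whole suite with an RNG seeded once, so the result is repeatable."""
    return shuffle_tree(suite, random.Random(seed))
```

Calling ordered_suite twice with the same seed yields the same tree, because every shuffle draws from one RNG that starts in the same state.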
--order defined
For tests that intentionally depend on declaration order across sibling examples (cross-example accumulation, side-effect testing, hook-cascade verification), pass --order defined.
No seed is auto-generated and no seed line is printed under defined order.
Per-group order override
A single describe / context block can opt out of random order with :order<defined> metadata.
:order<defined> inherits through nested groups, so an outer :order<defined> covers every descendant unless an inner group explicitly sets :order<random>.
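The inheritance rule can be read as "the innermost group that sets an order wins". A minimal Python sketch of that resolution, assuming a hypothetical list of per-group order values from outermost to innermost (None meaning the group sets nothing):

```python
def effective_order(ancestry, default="random"):
    """Resolve the order that applies to a group's children.

    ancestry lists each enclosing group's declared order, outermost first;
    None means the group does not set :order (hypothetical representation).
    """
    order = default
    for declared in ancestry:
        if declared is not None:
            # A nearer group overrides whatever an outer group declared.
            order = declared
    return order
```

For example, effective_order([None, "defined", None]) resolves to "defined", while an inner "random" after an outer "defined" resolves back to "random".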
Programmatic use
BDD::Behave::Runner::Runner.new defaults to :order<defined> (deterministic) for programmatic / library use. bin/behave is what flips the user-facing default to random. Construct a Runner explicitly when you need a specific order, for example Runner.new(:order<random>).
Runner.new(:order<sideways>) (or any value other than 'random' / 'defined') dies at construction time.
Fail-fast
By default, Behave runs every example in the suite even after a failure, so the run produces a complete picture of what is broken. When iterating on a single problem, or when you want a faster signal in CI, pass --fail-fast to stop as soon as the first failure occurs.
After the threshold is hit, Behave prints the normal failure list and counts, followed by an abort banner.
Pass --fail-fast=N to keep running until N failures have accumulated.
N must be a positive integer; --fail-fast=0 and non-numeric values exit with a non-zero status and a helpful error on stderr.
When multiple spec files are passed on the command line, the threshold is shared across them — once it is reached, the remaining suites are not loaded. Skipped and pending examples do not count toward the threshold.
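The shared threshold can be sketched as a single failure counter that spans all files. This Python sketch is illustrative only: pre-recorded (name, status) pairs stand in for real example execution, and the function name and return shape are hypothetical.

```python
def run_with_fail_fast(files, threshold):
    """Walk spec files in order and stop once `threshold` failures accumulate.

    files is a list of files, each a list of (name, status) pairs.
    threshold 0 means unbounded. Returns (failures, files_loaded, aborted).
    """
    failures = []
    files_loaded = 0
    aborted = False
    for examples in files:
        if aborted:
            break  # remaining files are never loaded
        files_loaded += 1
        for name, status in examples:
            # Only real failures count; skipped and pending examples do not.
            if status == "failed":
                failures.append(name)
                if threshold and len(failures) >= threshold:
                    aborted = True
                    break
    return failures, files_loaded, aborted
```

The counter lives outside the per-file loop, which is what makes the threshold apply to the whole run rather than to each file separately.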
Programmatic use
BDD::Behave::Runner::Runner.new accepts :fail-fast(N) (default 0, meaning unbounded). The runner exposes .aborted (a Bool) after .run returns, so callers can distinguish a clean finish from an early abort.
Runner.new(:fail-fast(-1)) (or any negative integer) dies at construction time.
Retry and only-failures
Flaky examples can be retried automatically via --retry N (or per-example :retry(N) metadata). After every non-bisect run, the list of failing examples is persisted to ./.behave-failures so the next run can be scoped to just those failures with --only-failures. See Retry and Only-Failures for the full reference.
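The "N additional times" wording means an example gets at most N+1 attempts. A minimal Python sketch of that loop, assuming a hypothetical callable that returns True on a pass:

```python
def run_with_retry(example, retries):
    """Run `example` until it passes or the attempts are exhausted.

    retries is the number of additional attempts, so the total is retries + 1.
    Returns (passed, attempts_used).
    """
    attempts = 0
    for _ in range(retries + 1):
        attempts += 1
        if example():
            return True, attempts
    return False, attempts
```

An example that passes on its third try succeeds with retries=2 but is still recorded as a failure with retries=1.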
Bisect
When a failure shows up only when a specific other example ran first — classic order-dependent test pollution — --bisect finds the minimal set of preceding examples needed to reproduce the failure.
What it does
- Initial pass in declared order (--order defined); records which examples ran and which failed.
- For each failing example, replays subsets of the prior examples in a fresh subprocess and shrinks the prior set until further pruning loses the failure.
- Prints the minimal prior set and a ready-to-run reproduction command.
Each iteration spawns bin/behave --bisect-data --order defined --only-example … in a fresh subprocess, so user-code state (module-level vars, file handles, registries) cannot leak across iterations.
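The shrinking step can be sketched as binary halving followed by one-at-a-time removal. The Python below is a minimal illustration under stated assumptions, not Behave's code: minimize_prior and reproduces are hypothetical names, and reproduces stands in for "spawn a fresh subprocess with this prior set and report whether the target example still fails".

```python
def minimize_prior(prior, reproduces):
    """Shrink `prior` to a small list for which reproduces(list) is still True."""
    current = list(prior)

    # Phase 1: binary halving, for as long as one half alone still reproduces.
    while len(current) > 1:
        mid = len(current) // 2
        first, second = current[:mid], current[mid:]
        if reproduces(first):
            current = first
        elif reproduces(second):
            current = second
        else:
            break  # halving stalled: the culprits span both halves

    # Phase 2: try dropping each remaining example, one at a time.
    i = 0
    while i < len(current):
        candidate = current[:i] + current[i + 1:]
        if reproduces(candidate):
            current = candidate  # example i was not needed
        else:
            i += 1  # example i is required; keep it and move on
    return current
```

An empty result corresponds to a failure that reproduces in isolation. Each call to reproduces is one sub-run, so the cost is dominated by how many examples survive the halving phase.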
Output
If the failing example reproduces alone (no prior needed), Bisect reports Failure reproduces in isolation — not order-dependent. If the initial pass has no failures, Bisect exits 0 with no failing examples.
--only-example FILE:LINE
--only-example is the targeting primitive Bisect uses to replay subsets. It is also useful on its own.
FILE matches if any of these hold: exact-string equality with the example's stored path, absolute-path equality, path/to/file.raku suffix match, or basename equality. LINE must equal the line of the it block. Repeating --only-example is OR semantics; the runner runs every example matching any pattern.
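The four FILE rules can be sketched as a chain of checks. This Python sketch is an interpretation, not Behave's matcher: the function names are hypothetical, the suffix rule is assumed to apply only on a directory boundary, and "basename equality" is assumed to mean that FILE equals the basename of the stored path.

```python
import os


def file_matches(pattern, stored):
    """True if `pattern` identifies the example's stored path by any of four rules."""
    if pattern == stored:
        return True  # exact-string equality
    if os.path.abspath(pattern) == os.path.abspath(stored):
        return True  # absolute-path equality
    if stored.endswith("/" + pattern):
        return True  # path suffix, on a directory boundary (assumption)
    return os.path.basename(stored) == pattern  # basename equality (assumption)


def example_matches(loc, stored_file, stored_line):
    """Match a FILE:LINE string against an example's stored file and `it` line."""
    file_part, _, line_part = loc.rpartition(":")
    return int(line_part) == stored_line and file_matches(file_part, stored_file)
```

Repeating --only-example then amounts to running every example for which any of the given locations matches.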
Positional FILE:LINE shorthand
A positional argument of the form FILE:LINE is shorthand for "load FILE, then run only the example at LINE" — equivalent to passing FILE plus --only-example FILE:LINE. The shorthand only triggers when FILE exists on disk; an arg matching the :N pattern but pointing at a non-existent file is left alone (and will surface as a normal "could not load" error).
The shorthand and explicit --only-example compose freely; both append to the same internal list, so all matching examples run.
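A minimal Python sketch of the shorthand expansion, assuming a hypothetical expand_positional helper that turns positional arguments into a file list plus an --only-example list. The regular expression and the return shape are illustrative choices, not Behave's argument parser.

```python
import os
import re

# FILE:LINE, where LINE is one or more digits at the very end of the argument.
LOC_RE = re.compile(r"^(?P<file>.+):(?P<line>\d+)$")


def expand_positional(args):
    """Split positional args into files to load and FILE:LINE selections."""
    files, only_examples = [], []
    for arg in args:
        m = LOC_RE.match(arg)
        if m and os.path.isfile(m.group("file")):
            # Shorthand: load FILE and select the example at LINE.
            files.append(m.group("file"))
            only_examples.append(f"{m.group('file')}:{m.group('line')}")
        else:
            # Not a shorthand, or FILE does not exist: leave the argument alone.
            files.append(arg)
    return files, only_examples
```

Appending to the same only_examples list that explicit --only-example values would use is what lets the two forms compose.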
Line snapping (editor integration)
LINE does not have to land exactly on the it / describe / context keyword. Both the FILE:LINE shorthand and --only-example FILE:LINE apply a text-based snap: if LINE does not point at one of those keywords, Behave scans FILE for the nearest preceding line whose first non-whitespace token is describe, context, fdescribe, fcontext, xdescribe, xcontext, it, fit, xit, or pending, and uses that line instead.
Consider a fixture in which describe sits on line 1, the it for alpha on line 2 with its body running through line 4, and a context holding beta on line 7:
| You pass | Snaps to | Behavior |
|---|---|---|
| :2 | :2 | Runs alpha (exact it line). |
| :4 | :2 | Runs alpha (cursor inside its body). |
| :5 | :2 | Runs alpha (between body close and next). |
| :7 | :7 | Runs beta only (exact context line; descends into the inner group). |
| :1 | :1 | Runs every example (exact describe line). |
When the snapped line is a describe or context, every example whose ancestry includes that group runs. This is what makes editor integrations work: bind your "run example at cursor" key to behave $FILE:$LINENO and it does the right thing whether the cursor is on the it line, inside the body, or inside an enclosing describe.
The snap is purely text-based and only looks at the start of each line, so it will not be confused by it appearing in a comment or string inside an it body. It will not snap into a closing brace; a line that has no preceding keyword in the file (e.g. :1 when the file starts with use BDD::Behave;) is left unchanged and matches nothing.
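The snap rule can be sketched in a few lines of Python. This is an illustration under stated assumptions, not Behave's implementation: snap_line is a hypothetical name, a token is taken to be the first whitespace-delimited word on the line, and the fixture's block syntax is a placeholder chosen only so that each line starts with the right keyword.

```python
KEYWORDS = {
    "describe", "context", "fdescribe", "fcontext", "xdescribe",
    "xcontext", "it", "fit", "xit", "pending",
}


def first_token(line):
    """First whitespace-delimited token of a line, or '' for a blank line."""
    parts = line.split()
    return parts[0] if parts else ""


def snap_line(lines, line_no):
    """Snap a 1-based line number to the nearest keyword line at or before it."""
    for n in range(min(line_no, len(lines)), 0, -1):
        if first_token(lines[n - 1]) in KEYWORDS:
            return n
    return line_no  # no keyword at or before this line: left unchanged


# Hypothetical fixture; the block syntax is a placeholder, not Behave's API.
FIXTURE = [
    "describe 'outer', {",       # line 1
    "    it 'alpha', {",         # line 2
    "        say 'alpha';",      # line 3
    "        say 'still alpha';",  # line 4
    "    }",                     # line 5
    "",                          # line 6
    "    context 'inner', {",    # line 7
    "        it 'beta', {",      # line 8
    "        }",                 # line 9
    "    }",                     # line 10
]
```

Because only the first token of each line is inspected, a word such as it inside a string or trailing comment never counts as a keyword line.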
--bisect-data
Used by --bisect for inter-process communication and exposed for editor/tool integrations that want a parseable listing of executed and failed examples, emitted as behave-executed: and behave-failed: lines.
--bisect-data suppresses normal output. It is mutually exclusive with --bisect.
Limits
- Bisect uses --order defined for sub-runs. Failures that only reproduce under a specific random --seed need to be diagnosed differently: re-run with the failing seed and --order defined after locking in the order.
- Sub-runs use the same --tag, --exclude-tag, --example, and --aggregate-failures you passed to bin/behave --bisect.
- The shrink uses binary halving first, then one-at-a-time minimization when halving stalls. Iteration count grows roughly with log(prior) + minimal-prior-count.
Filtering by description
--example PATTERN (alias -e) runs only examples whose full nested description matches PATTERN (substring or /regex/). See Example Filter for the full reference, including how it composes with --tag.
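A minimal Python sketch of the substring-or-regex rule with OR semantics across repeated patterns. Several details are assumptions for illustration: the function names, treating a pattern wrapped in slashes as a regex, and the use of Python's re module, whose syntax differs from Raku regexes.

```python
import re


def make_matcher(pattern):
    """Build a predicate for one PATTERN: /regex/ if slash-wrapped, else substring."""
    if len(pattern) >= 2 and pattern.startswith("/") and pattern.endswith("/"):
        rx = re.compile(pattern[1:-1])
        return lambda description: rx.search(description) is not None
    return lambda description: pattern in description


def selected(full_description, patterns):
    """True if the example should run: no patterns means no filtering."""
    if not patterns:
        return True
    # OR semantics: any single matching pattern selects the example.
    return any(make_matcher(p)(full_description) for p in patterns)
```

The match is made against the full nested description, so a pattern can target a group name as well as the text of an individual example.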
Output
Behave prints each describe/context with a ⮑ marker, indenting nested groups, and reports SUCCESS / FAILURE / PENDING / SKIPPED per example. See Focus and Skip for xit / fit / xdescribe / fdescribe. After all specs run, it prints a summary of the results.
Exit code
behave exits 0 if every example passed, 1 if any example failed.