Introducing vitest-command-line
Testing command-line tools is deceptively hard - hanging processes, lost stderr, temp file juggling, and unreadable failures. I built vitest-command-line to make CLI testing in Vitest simple and robust.
Ben Houston • June 10, 2026 • 8 min read
I write a lot of command-line tools, and I test all of them. Over time I kept rewriting the same fragile test scaffolding: spawn a subprocess, wire up stdout and stderr capture, add a timeout so a hung process does not stall CI forever, create a temp directory for fixtures, clean it all up afterward, and then write assertions that produce useless failure messages when something goes wrong.
That scaffolding is deceptively hard to get right. So I extracted it into a small, typed library: vitest-command-line. I use it to test the loa CLI for Land of Assets, which sponsored its development, and it has made those tests both shorter and far more reliable.
Why Testing CLIs Is Hard
A naive CLI test using child_process directly has a surprising number of failure modes:
- Hanging processes. If your CLI waits on stdin, spins on a network call, or spawns children of its own, a raw spawn test can hang your entire test run. You need timeouts, kill signals, escalation to SIGKILL, and sometimes whole-process-tree cleanup.
- Lost output. stdout and stderr arrive as interleaved chunks on separate streams. Most ad hoc test helpers capture one, drop the other, or lose the interleaving, and that ordering is exactly what you need when debugging a failure.
- Temp file juggling. CLI tests almost always need scratch files and directories - inputs to read, outputs to assert on - and they need to be created and removed reliably per test.
- Unreadable failures. When expect(stdout).toContain('done') fails, you get a string diff with no context. What was the exit code? What was on stderr? Did it time out? You end up re-running the test with console.log sprinkled in.
vitest-command-line packages solutions to all four into one small API.
The Core API
You define a command target once, then run it with per-call arguments and options. Every run returns a single CommandResult containing everything that happened:
import { commandLine, extendMatchers } from 'vitest-command-line';

extendMatchers();

const cli = commandLine({
  command: ['node', './dist/cli.js'],
  name: 'my-cli',
  env: { FORCE_COLOR: '0' },
  timeout: 10_000,
});

const result = await cli.run(['build', '--verbose']);

result.exitCode;   // number | null
result.signal;     // NodeJS.Signals | null
result.timedOut;   // boolean
result.stdout;     // captured stdout
result.stderr;     // captured stderr
result.output;     // merged, interleaved output
result.chunks;     // timestamped stream chunks
result.durationMs; // how long it took
result.success;    // exitCode === 0, no signal, no timeout, no error
Options like cwd, env, and timeout can be set as defaults on the instance, overridden per run(), or baked into a derived instance with withOptions() - handy when most tests share a working directory but a few need their own.
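The layering described above can be pictured as a simple "later layer wins" merge. The following is only a sketch of that idea, not the library's implementation: the RunOptions shape and the mergeOptions helper are hypothetical names, and merging env key by key (rather than replacing it wholesale) is an assumption.

```typescript
// Hypothetical option shape, limited to the three options named in the text.
interface RunOptions {
  cwd?: string;
  env?: Record<string, string>;
  timeout?: number;
}

// Later layers win. Assumption: env maps are merged key by key, not replaced.
function mergeOptions(...layers: RunOptions[]): RunOptions {
  const merged: RunOptions = {};
  for (const layer of layers) {
    if (layer.cwd !== undefined) merged.cwd = layer.cwd;
    if (layer.timeout !== undefined) merged.timeout = layer.timeout;
    if (layer.env !== undefined) merged.env = { ...merged.env, ...layer.env };
  }
  return merged;
}

// Instance defaults, then what a derived instance might bake in, then a per-run override.
const instanceDefaults: RunOptions = { env: { FORCE_COLOR: '0' }, timeout: 10_000 };
const derivedOptions: RunOptions = { cwd: '/tmp/project' };
const perRunOptions: RunOptions = { timeout: 2_000 };

const effective = mergeOptions(instanceDefaults, derivedOptions, perRunOptions);
console.log(effective);
```

Under these assumptions the per-run timeout replaces the instance default, while the derived working directory and the default environment carry through untouched.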
Readable Assertions with Custom Matchers
The package ships Vitest matchers that understand CommandResult. Calling extendMatchers() once installs them on expect:
expect(result).toSucceed();
expect(result).toExitWith(2);
expect(result).toHaveStdout(/build complete/i);
expect(result).toHaveStderr('warning: deprecated flag');
expect(result).toHaveOutput('done');              // merged stdout + stderr
expect(result).toHaveTimedOut();
expect(result).toHaveJsonStdout({ status: 'ok' }); // parse stdout as JSON, deep-compare
expect(result).toCompleteWithin(2_000);           // duration budget in ms
Since --format json flags are everywhere in modern CLIs, results also expose result.json<T>() to parse stdout directly, and a stripAnsi: true run option removes color codes and other escape sequences from the captured output before you assert on it.
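To make the idea concrete, here is a rough sketch of what stripping escape codes before parsing JSON can look like. It is an illustration only and says nothing about how the library does it: stripCsi and parseJsonStdout are hypothetical helpers, and the pattern covers only CSI sequences (colors, cursor movement), not every escape sequence a terminal understands.

```typescript
// Matches CSI sequences such as "\x1b[32m" (color) or "\x1b[2K" (erase line):
// ESC [, then parameter bytes, intermediate bytes, and one final byte.
const CSI_PATTERN = /\x1b\[[0-?]*[ -/]*[@-~]/g;

// Hypothetical helper: remove CSI escape sequences from captured text.
function stripCsi(text: string): string {
  return text.replace(CSI_PATTERN, '');
}

// Hypothetical helper: parse captured stdout as JSON after stripping color codes.
function parseJsonStdout<T>(stdout: string): T {
  return JSON.parse(stripCsi(stdout)) as T;
}

// A CLI that colors its JSON output green would otherwise break JSON.parse.
const coloredStdout = '\x1b[32m{"status":"ok"}\x1b[0m\n';
const parsed = parseJsonStdout<{ status: string }>(coloredStdout);
console.log(parsed.status);
```

The type parameter is only a cast here, so a real test would still assert on the parsed shape rather than trust it.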
The real win is the failure messages. When toSucceed() fails, you do not get expected false to be true. You get the command, arguments, working directory, exit code, signal, timeout state, stdout, and stderr - everything you need to diagnose the failure without re-running the test:
Expected command to succeed.
command: my-cli
args: ["build","--verbose"]
cwd: /tmp/vitest-command-line-3f2a.../
exitCode: 1
signal: null
timedOut: false
stdout: ""
stderr: "Error: config file not found\n"
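A message like the one above is straightforward to build once everything lives on a single result object. The sketch below shows one way to format it; ResultSummary and formatFailure are hypothetical names chosen for illustration, and the library's real message builder may differ.

```typescript
// Hypothetical subset of a command result, limited to the fields shown in the message above.
interface ResultSummary {
  command: string;
  args: string[];
  cwd: string;
  exitCode: number | null;
  signal: string | null;
  timedOut: boolean;
  stdout: string;
  stderr: string;
}

// Build a multi-line failure message: a headline followed by one "key: value" line per field.
// JSON.stringify keeps newlines and quotes in captured output visible as escapes.
function formatFailure(headline: string, r: ResultSummary): string {
  const fields: Array<[string, string]> = [
    ['command', r.command],
    ['args', JSON.stringify(r.args)],
    ['cwd', r.cwd],
    ['exitCode', String(r.exitCode)],
    ['signal', String(r.signal)],
    ['timedOut', String(r.timedOut)],
    ['stdout', JSON.stringify(r.stdout)],
    ['stderr', JSON.stringify(r.stderr)],
  ];
  return [headline, ...fields.map(([key, value]) => `${key}: ${value}`)].join('\n');
}

const failureMessage = formatFailure('Expected command to succeed.', {
  command: 'my-cli',
  args: ['build', '--verbose'],
  cwd: '/tmp/example',
  exitCode: 1,
  signal: null,
  timedOut: false,
  stdout: '',
  stderr: 'Error: config file not found\n',
});
console.log(failureMessage);
```

In a Vitest custom matcher, a string like this would typically be returned from the matcher's message function, so the runner prints it whenever the assertion fails.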
Hanging Processes, Handled
This is the part that motivated the library in the first place. A CLI test framework that cannot reliably kill a stuck process is a CI liability. Every run() accepts:
const result = await cli.run(['serve'], {
  timeout: 5_000,                     // give up after 5 seconds
  killSignal: 'SIGTERM',              // ask politely first
  forceKillAfterMs: 2_000,            // then escalate to SIGKILL
  subprocessCleanup: 'process-tree',  // kill children and grandchildren too
});

expect(result).toHaveTimedOut();
When the timeout fires, the process gets killSignal (default SIGTERM). If it has not exited after forceKillAfterMs, it gets SIGKILL. And subprocessCleanup: 'process-tree' spawns the command detached in its own process group so that the entire tree - including any children your CLI spawned - is terminated together. No more orphaned processes accumulating on your CI runners.
Importantly, a timeout is not an exception. It is just a result state, so you can assert on it like anything else, and your afterEach cleanup still runs normally.
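For readers who want to see the escalation pattern in isolation, here is a minimal sketch built directly on Node's child_process. It is not the library's code: runWithTimeout and TimedRun are hypothetical names, the sketch signals only the direct child (no process-group cleanup), and it assumes a POSIX system where SIGTERM and SIGKILL behave as described.

```typescript
import { spawn } from 'node:child_process';

// Hypothetical result shape: a timeout is reported as data, not thrown.
interface TimedRun {
  exitCode: number | null;
  signal: NodeJS.Signals | null;
  timedOut: boolean;
}

function runWithTimeout(
  command: string,
  args: string[],
  timeoutMs: number,
  forceKillAfterMs: number,
): Promise<TimedRun> {
  return new Promise((resolve, reject) => {
    const child = spawn(command, args, { stdio: 'ignore' });
    let timedOut = false;
    let forceTimer: ReturnType<typeof setTimeout> | undefined;

    // First deadline: ask politely with SIGTERM, then arm the SIGKILL fallback.
    const timer = setTimeout(() => {
      timedOut = true;
      child.kill('SIGTERM');
      forceTimer = setTimeout(() => child.kill('SIGKILL'), forceKillAfterMs);
    }, timeoutMs);

    const clearTimers = (): void => {
      clearTimeout(timer);
      if (forceTimer !== undefined) clearTimeout(forceTimer);
    };

    // Spawn failures (for example, command not found) are real errors.
    child.once('error', (error) => {
      clearTimers();
      reject(error);
    });

    // Normal exit and killed-by-signal both resolve with a plain result.
    child.once('close', (exitCode, signal) => {
      clearTimers();
      resolve({ exitCode, signal, timedOut });
    });
  });
}

// Usage: a child that never exits on its own is stopped after 200 ms.
const pendingRun = runWithTimeout(process.execPath, ['-e', 'setInterval(() => {}, 1000)'], 200, 500);
pendingRun.then((run) => console.log(run));
```

Because the promise resolves on timeout instead of rejecting, the caller can assert on timedOut like any other field, which mirrors the "timeout is a result state" design described above.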
Scratch Directories and Files
CLI tests constantly need temporary files. scratchDirectory() gives you a disposable directory under the OS temp dir with a typed API for creating files and nested directories, plus matchers for asserting on what your CLI wrote:
import { scratchDirectory } from 'vitest-command-line';

await using directory = scratchDirectory(); // removed automatically on scope exit
await directory.create();

// Create an input fixture
const configFile = await directory.file({
  filename: 'config.json',
  content: JSON.stringify({ output: 'report.json' }),
});

// Reserve an output path without creating the file
const reportFile = await directory.file('report.json');

const result = await cli.run(['build', '--config', configFile.path], {
  cwd: directory.path,
});

expect(result).toSucceed();
expect(reportFile).toExist();
expect(reportFile).toHaveFileContents(); // exists, is a file, size > 0
expect(JSON.parse(await reportFile.text())).toMatchObject({ status: 'ok' });
There is also toMatchFileContents() for byte-for-byte comparison against a golden file, files([...]) for creating several fixtures at once, dir() for nested directories, and copyFrom(path) for seeding the scratch directory from a fixture/template tree. The path helpers refuse absolute paths and .. segments, so a typo cannot escape the scratch root. And because ScratchDirectory implements Symbol.asyncDispose, the await using declaration above means you never write the cleanup code at all.
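The path guard mentioned above is worth a small illustration. This is a sketch of the general technique under stated assumptions, not the library's implementation: resolveInside is a hypothetical helper, and it rejects any ".." segment outright rather than normalizing the path first.

```typescript
import * as path from 'node:path';

// Hypothetical guard: join a caller-supplied relative path onto a scratch root,
// refusing anything that could point outside of it.
function resolveInside(root: string, relative: string): string {
  if (path.isAbsolute(relative)) {
    throw new Error(`Absolute paths are not allowed: ${relative}`);
  }
  // Split on both separators so "..\\x" is caught on any platform.
  if (relative.split(/[\\/]/).includes('..')) {
    throw new Error(`".." segments are not allowed: ${relative}`);
  }
  return path.join(root, relative);
}

const scratchRoot = path.join('/tmp', 'scratch-example');
const safePath = resolveInside(scratchRoot, 'nested/config.json');
console.log(safePath);
```

Rejecting ".." outright is stricter than resolving the path and checking that it still starts with the root, but it is easier to reason about and turns a typo into an immediate, readable error.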
In-Process Testing with the Wrapper Runner
Spawning a real subprocess is the most faithful test, but it is also the slowest, and it cuts you off from in-process test infrastructure like in-memory databases or seams for dependency injection. For that, commandLine() accepts an optional run hook that replaces subprocess execution entirely while keeping the exact same CommandResult shape and matchers:
const cli = commandLine<{ deps: TestDeps }>({
  command: ['my-cli'],
  context: { deps: createTestDeps() },
  run: async ({ command, context, io, cwd, env }) => {
    // command is the full expanded vector, e.g. ['my-cli', 'build', '--json']
    await runMyCliInProcess(command.slice(1), {
      deps: context!.deps,
      stdout: io.stdout,
      stderr: io.stderr,
    });
    return 0; // becomes result.exitCode
  },
});
The hook receives the working directory, environment, an abort signal (wired to the same timeout machinery as subprocesses), a typed context you can use for dependency injection, and an io object whose stdout and stderr writers feed the same capture pipeline. Your tests do not change at all - the same run() calls and the same matchers work against both execution modes.
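One way to picture an io object that feeds a shared capture pipeline is a pair of writers appending to a single timestamped chunk list. The sketch below is an assumption about the general shape, not the library's internals: createCapture, Chunk, and snapshot are hypothetical names.

```typescript
type StreamName = 'stdout' | 'stderr';

// Hypothetical chunk record: which stream wrote it, what was written, and when.
interface Chunk {
  stream: StreamName;
  text: string;
  timestampMs: number;
}

// Both writers append to one ordered list, so interleaving is preserved.
function createCapture(now: () => number = Date.now) {
  const chunks: Chunk[] = [];

  const writer = (stream: StreamName) => ({
    write(text: string): void {
      chunks.push({ stream, text, timestampMs: now() });
    },
  });

  // Join all chunks, or only those from one stream, in arrival order.
  const join = (stream?: StreamName): string =>
    chunks
      .filter((chunk) => stream === undefined || chunk.stream === stream)
      .map((chunk) => chunk.text)
      .join('');

  return {
    io: { stdout: writer('stdout'), stderr: writer('stderr') },
    snapshot: () => ({
      stdout: join('stdout'),
      stderr: join('stderr'),
      output: join(),
      chunks: [...chunks],
    }),
  };
}

// Usage: an in-process CLI writes to io instead of process.stdout / process.stderr.
const capture = createCapture();
capture.io.stdout.write('building\n');
capture.io.stderr.write('warning: slow\n');
capture.io.stdout.write('done\n');
const captured = capture.snapshot();
console.log(captured.output);
```

Keeping one ordered list and deriving stdout, stderr, and the merged output from it means the three views can never disagree about what was written or in what order.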
The yargs Adapter
Wiring a yargs-based CLI into the wrapper runner used to require boilerplate that everyone got subtly wrong: you had to call .exitProcess(false) so a parse failure could not kill the test runner, install a .fail() handler that re-throws instead of printing and exiting, and capture the console output that yargs uses for help text and errors.
So the package ships a dedicated adapter at vitest-command-line/yargs that handles all of it:
import yargs from 'yargs';
import { yargsCommandLine } from 'vitest-command-line/yargs';

const cli = yargsCommandLine<{ deps: CliDeps }>({
  name: 'loa',
  context: { deps: testDeps },
  build: ({ argv, context, io }) =>
    yargs(argv)
      .scriptName('loa')
      .command(buildCommands(context!.deps, io))
      .strict()
      .help(),
});

// Now hundreds of tests read like this:
const result = await cli.run(['config', 'set', '--org', 'my-org']);
expect(result).toSucceed();

// Parse failures are assertable results, not unhandled rejections:
const bad = await cli.run(['no-such-command']);
expect(bad).toFail();
expect(bad).toHaveStderr(/unknown/i);

// Even help output is captured instead of escaping to the terminal:
const help = await cli.run(['--help']);
expect(help).toHaveOutput(/config set/);
Your build function receives the per-invocation argv, the typed context for dependency injection, and the io capture streams. The adapter applies exitProcess(false) and the re-throwing fail() handler for you, and temporarily routes console.log/console.error (which yargs uses for help and error output) into the captured result. If a command handler throws an error with an integer exitCode property, that becomes the reported exit code, so CLIs with meaningful exit codes stay fully testable.
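The exit-code rule in the last sentence can be sketched as a small mapping function. This is an illustration of the idea only: exitCodeFromError is a hypothetical helper, and falling back to 1 for errors without a usable exitCode is an assumption, not documented behavior.

```typescript
// Hypothetical mapping from a thrown value to a process-style exit code.
// Assumption: anything without an integer exitCode property maps to 1.
function exitCodeFromError(error: unknown): number {
  if (typeof error === 'object' && error !== null && 'exitCode' in error) {
    const code = (error as { exitCode: unknown }).exitCode;
    if (typeof code === 'number' && Number.isInteger(code)) {
      return code;
    }
  }
  return 1;
}

// A command handler can attach a meaningful exit code to the error it throws.
const notFound = Object.assign(new Error('asset not found'), { exitCode: 3 });

console.log(exitCodeFromError(notFound));
console.log(exitCodeFromError(new Error('unexpected failure')));
```

Checking for an integer rather than any number keeps values such as NaN or 1.5 from leaking into a field that is supposed to look like a real exit status.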
This is how I test the yargs-based loa CLI at Land of Assets: the full CLI runs in-process against an in-memory API server, with fresh dependencies and a fresh scratch directory per test. The tests are fast, fully isolated, and exercise the same yargs command definitions that ship to users.
Conclusion
Testing CLI tools should not require every project to rebuild subprocess management, output capture, timeout escalation, and temp file handling from scratch. vitest-command-line packages those into a small, typed API with matchers that produce genuinely useful failure messages.
If you maintain a CLI and use Vitest, give it a try:
pnpm add -D vitest vitest-command-line
The source is on GitHub under the MIT license. Issues and pull requests are welcome.