Sona exposes offline transcription commands through the main desktop executable. Packaged installs do not add sona to your shell PATH, so run the installed app binary with CLI subcommands. Source builds can run the same commands with Cargo.
The CLI is intentionally narrow: single-file and directory offline transcription, preset model listing/download/deletion, and headless HTTP API server startup. It does not include live recording, LLM polish, or LLM translation.
Run It
- Windows: run Sona.exe transcribe ... from the installation directory
- macOS: run /Applications/Sona.app/Contents/MacOS/Sona transcribe ...
- Linux packages: run the packaged Sona binary with CLI subcommands from the install location
- AppImage: run the mounted AppImage executable with CLI subcommands
- Source: cargo run --manifest-path src-tauri/Cargo.toml -- transcribe ./sample.mp4 --config ./sona-cli.toml
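Because packaged installs do not put sona on PATH, a small shell wrapper can make the examples in this guide runnable as written. The following is a minimal sketch, not part of Sona itself: the SONA_BIN variable name is an assumption made for this example, and its default is the macOS path listed above. On other platforms, set SONA_BIN to wherever your binary is installed.

```shell
# Assumption: SONA_BIN is a variable name invented for this sketch.
# Its default is the macOS location listed above; export SONA_BIN with
# the path to your installed binary on other platforms.
SONA_BIN="${SONA_BIN:-/Applications/Sona.app/Contents/MacOS/Sona}"

# Forward all arguments unchanged to the app binary.
sona() {
  "$SONA_BIN" "$@"
}
```

With this function defined in your shell profile, sona transcribe ./sample.mp4 --config ./sona-cli.toml runs the app binary with the same arguments.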
Common Commands
Transcribe a file
sona transcribe ./sample.mp4 \
--config ./sona-cli.toml \
--output ./sample.srt
Without --output, transcription writes JSON to stdout. With --output, the format is inferred from the file extension unless --format is provided. Existing output files are protected by default; pass --force only when you intend to overwrite them.
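The extension-based inference described above can be used to keep wrapper scripts short. This sketch defines a hypothetical helper, transcribe_one (not a Sona command), that derives the output path by swapping the input extension. It assumes sona resolves to the app binary and that the input file name has an extension.

```shell
# transcribe_one <input> <ext> <config>
# Hypothetical helper: writes the transcript next to the input, for example
# ./sample.mp4 with ext "srt" becomes ./sample.srt.
# Assumes the input file name contains an extension.
transcribe_one() {
  input=$1
  ext=$2
  config=$3
  output="${input%.*}.${ext}"
  # --format is omitted: the format is inferred from the output extension.
  # --force is omitted: an existing output file is an error, not an overwrite.
  sona transcribe "$input" --config "$config" --output "$output"
}
```

For example, transcribe_one ./sample.mp4 srt ./sona-cli.toml runs the same command as the sample above.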
Transcribe a directory
sona transcribe \
--input-dir ./media \
--output-dir ./transcripts \
--format srt \
--recursive \
--jobs 1 \
--config ./sona-cli.toml
Directory mode writes one transcript per supported media file into --output-dir. By default it scans only direct children; add --recursive to include subdirectories. Transcript content goes to files, while a JSON success/failure summary is written to stdout.
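Since transcripts go to files and only the summary reaches stdout, redirecting stdout is enough to keep the summary for later inspection. The sketch below wraps that in a hypothetical helper, batch_transcribe. The summary file path is your choice, and the summary's JSON fields are not documented in this guide, so the example stores it without parsing it.

```shell
# batch_transcribe <input-dir> <output-dir> <summary-file> <config>
# Hypothetical helper for directory mode.
batch_transcribe() {
  in_dir=$1
  out_dir=$2
  summary=$3
  config=$4
  # Transcripts are written under out_dir; stdout carries only the JSON
  # summary, so the redirect captures the summary alone.
  sona transcribe \
    --input-dir "$in_dir" \
    --output-dir "$out_dir" \
    --format srt \
    --recursive \
    --config "$config" > "$summary"
}
```

For example, batch_transcribe ./media ./transcripts ./summary.json ./sona-cli.toml.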
You can also pass multiple input files or glob patterns. These use the same batch output planning as directory mode and require --output-dir:
sona transcribe ./media/*.wav ./media/interview.mp4 --output-dir ./transcripts --format srt
List, download, or delete models
sona models list --mode offline --type whisper
sona models list --language zh --installed
sona models list --json
sona models download sherpa-onnx-whisper-turbo
sona models delete sherpa-onnx-whisper-turbo
models list prints a readable table by default. Use --json when scripts need the full machine-readable shape, including install_path.
models download automatically downloads required companion models, such as silero-vad or the default punctuation model, when the selected preset needs them.
models delete removes only the specified model. It does not delete companion models automatically.
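Because deletion is per model, cleaning up a main model together with its companions takes one call per id. A minimal sketch, assuming sona resolves to the app binary: delete_models is a hypothetical helper, and it passes --yes (documented in the models delete parameters) so it can run without the interactive confirmation prompt.

```shell
# delete_models <model_id>...
# Hypothetical helper: deletes each listed preset in order. Companion
# models must be listed explicitly because models delete never removes
# them on its own.
delete_models() {
  for id in "$@"; do
    # --yes skips the interactive confirmation prompt.
    sona models delete "$id" --yes || return 1
  done
}
```

For example, delete_models sherpa-onnx-whisper-turbo silero-vad removes both presets. Check that no other installed model still needs a companion before deleting it.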
Start the API server
sona serve --host 127.0.0.1 --port 14200 --api-key your_secure_key
For HTTP API endpoints and request examples, continue to the HTTP API Guide.
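An empty API key leaves requests unprotected by Bearer auth, so a launcher script may want to refuse to start without one. The following sketch is one way to do that. SONA_API_KEY is an environment variable name chosen for this example, not one that Sona reads itself, and start_server is a hypothetical helper.

```shell
# start_server
# Hypothetical helper: starts a local-only API server, reading the key
# from SONA_API_KEY (a variable name invented for this sketch).
start_server() {
  if [ -z "${SONA_API_KEY:-}" ]; then
    echo "SONA_API_KEY is empty; refusing to start without Bearer auth" >&2
    return 1
  fi
  sona serve --host 127.0.0.1 --port 14200 --api-key "$SONA_API_KEY"
}
```

Run it as SONA_API_KEY=your_secure_key start_server. The key can also be placed in the config file as api_key instead of being passed on the command line.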
Config File
Pass a TOML file with --config. Command-line flags override config file values.
Minimal transcribe example:
models_dir = "C:/Users/you/AppData/Local/com.asoda.sona/models"
model_id = "sherpa-onnx-whisper-turbo"
vad_model_id = "silero-vad"
language = "auto"
threads = 4
enable_itn = false
vad_buffer_size = 5.0
gpu_acceleration = "auto"
hotwords = "Sona,offline ASR"
format = "srt"
quiet = false
jobs = 1
transcribe config keys
| Parameter / config key | Required | Range | Default | Notes |
|---|---|---|---|---|
| models_dir | Optional | Filesystem path | Desktop app models directory, when inferable | Pass explicitly if the CLI cannot find desktop models. |
| model_id | Required unless --model-id is passed | Offline preset model id | None | Use sona models list --mode offline to find ids. |
| vad_model_id | Optional | Preset model id | silero-vad when required | Used when the selected model requires VAD; overrides the default. |
| punctuation_model_id | Optional | Preset model id | sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12-int8 when required | Used when the selected model requires punctuation; overrides the default. |
| language | Optional | auto or a model language code, such as zh, en, ja | auto | Overrides automatic language detection. |
| threads | Optional | Integer greater than 0 | 4 | Recognizer thread count. |
| enable_itn | Optional | true or false | false | Enables inverse text normalization. |
| hotwords | Optional | Comma-separated words | None | Custom ASR hotwords; currently supported by Transducer and Qwen3 models. |
| quiet | Optional | true or false | false | Hides transcription progress when set. CLI --quiet also enables this. |
| jobs | Optional | Integer greater than 0 | 1 | Maximum concurrent file jobs for directory, multiple-input, or glob mode. CLI --jobs overrides this. |
| vad_buffer_size | Optional | Number greater than 0 | 5.0 | VAD buffer size in seconds. |
| gpu_acceleration | Optional | auto, cpu, cuda, coreml, directml | auto | Use cpu to disable GPU acceleration. |
| format | Optional | json, txt, srt, vtt, md | json on stdout or in directory mode, otherwise inferred from --output | Overrides output extension inference. |
serve config keys
| Parameter / config key | Required | Range | Default | Notes |
|---|---|---|---|---|
| host | Optional | Bind address | 0.0.0.0 | Use 127.0.0.1 for local-only access. |
| port | Optional | TCP port 0 to 65535 | 14200 | API server port. |
| api_key | Optional | String | Empty | Empty means requests are not protected by Bearer auth. |
| models_dir | Optional | Filesystem path | Desktop app models directory, when inferable | Used to resolve installed models. |
| ip_whitelist | Optional | Comma-separated rules | localhost | Supports localhost, exact IPs, CIDR, *, and IPv4 wildcards like 192.168.*. |
| max_streaming | Optional | Non-negative integer | 2 | Maximum concurrent streaming WebSocket connections. |
| max_concurrent | Optional | Non-negative integer | 2 | Maximum concurrent batch jobs. |
| max_queue_size | Optional | Non-negative integer | 100 | 0 means the queue is effectively unlimited. |
| max_upload_size_mb | Optional | Non-negative integer | 50 | 0 disables the upload size limit. |
| job_ttl_minutes | Optional | Non-negative integer | 60 | 0 disables completed/failed job cleanup. |
| gpu_acceleration | Optional | auto, cpu, cuda, coreml, directml | auto | Server-level default for local batch and streaming jobs. |
| vad_model_id | Optional | Preset model id | silero-vad | Default companion model for API server jobs. |
| punctuation_model_id | Optional | Preset model id | sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12-int8 | Default punctuation companion for API server jobs. |
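The serve keys above can be collected into a config file in the same way as the transcribe example. This sketch writes one from the shell with a hypothetical helper, write_serve_config. All values are the documented defaults except host and api_key. Writing ip_whitelist as a single comma-separated string is an assumption modeled on the hotwords key in the transcribe example.

```shell
# write_serve_config <path>
# Hypothetical helper: writes a local-only serve config. Every value is the
# documented default except host and api_key.
# Assumption: ip_whitelist is a comma-separated string, like hotwords.
write_serve_config() {
  cat > "$1" <<'EOF'
host = "127.0.0.1"
port = 14200
api_key = "your_secure_key"
ip_whitelist = "localhost"
max_streaming = 2
max_concurrent = 2
max_queue_size = 100
max_upload_size_mb = 50
job_ttl_minutes = 60
gpu_acceleration = "auto"
EOF
}
```

After write_serve_config ./sona-serve.toml, start the server with sona serve --config ./sona-serve.toml. Command-line flags still override the file's values.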
Parameters
Global
sona
-V, --version
-v, --verbose
-h, --help
help
Use -V or --version to print the Sona version. Use -v or --verbose before a subcommand to enable detailed diagnostic logs. Use -h, --help, or help to print command help:
sona --version
sona -V
sona -v models list
sona --verbose transcribe ./sample.mp4 --config ./sona-cli.toml
sona transcribe --help
Verbose diagnostics are written to stderr. Command output, including table or JSON output from models list and transcribe without --output, remains on stdout so it can still be piped to other tools.
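The stdout/stderr split means a script can keep verbose logs without corrupting the data it captures. A minimal sketch with a hypothetical helper, list_models_json, assuming sona resolves to the app binary; both file paths are your choice.

```shell
# list_models_json <json-file> <log-file>
# Hypothetical helper: saves the JSON listing and the verbose diagnostics
# to separate files.
list_models_json() {
  # stdout carries the JSON listing; stderr carries the diagnostics.
  sona -v models list --json > "$1" 2> "$2"
}
```

For example, list_models_json ./models.json ./sona.log leaves only JSON in models.json even with -v enabled.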
Advanced wrappers and tests can set SONA_FORCE_CLI=1 to force CLI mode even when the executable is launched without a recognized CLI subcommand.
Generate shell completion scripts with sona completions <shell>. Supported shells are bash, zsh, fish, powershell, and elvish; the script is printed to stdout.
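Since the completion script is printed to stdout, installing it is a matter of redirecting it to a file that your shell loads. The sketch below uses a hypothetical helper, write_completions, that checks the shell name against the supported list first. The destination path depends on your shell setup and is left to the caller.

```shell
# write_completions <shell> <dest-file>
# Hypothetical helper: validates the shell name, then saves the script.
write_completions() {
  case "$1" in
    bash|zsh|fish|powershell|elvish) ;;
    *) echo "unsupported shell: $1" >&2; return 1 ;;
  esac
  # The completion script arrives on stdout, so a redirect saves it.
  sona completions "$1" > "$2"
}
```

For an unsupported shell name, the helper returns before the redirect, so the destination file is left untouched.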
transcribe
| Parameter / config key | Required | Range | Default | Notes |
|---|---|---|---|---|
| <input>... | Required unless --input-dir is passed | Local audio/video file paths or glob patterns | None | One input keeps single-file mode. Multiple inputs or glob patterns use batch mode and require --output-dir. |
| --input-dir <dir> | Required for directory mode | Directory path | None | Transcribes supported media files in the directory. |
| --config <path> | Optional | TOML file path | None | Loads defaults from config. |
| --output <path> | Optional | Filesystem path | stdout | Output file path for single-file mode only. Errors if the file exists unless --force is passed. |
| --output-dir <dir> | Required with --input-dir, multiple inputs, or glob patterns | Directory path | None | Writes one transcript per input file. Existing planned outputs error unless --force is passed. |
| --recursive | Optional | Flag | Off | Scans subdirectories and preserves relative output paths. |
| --jobs <n> | Optional | Integer greater than 0 | jobs config or 1 | Maximum concurrent file jobs in batch mode. |
| --format <format> | Optional | json, txt, srt, vtt, md | json on stdout or in directory mode, otherwise inferred from --output | Overrides config and output extension inference. |
| --language <code> | Optional | auto or a model language code | auto | Overrides config. |
| --model-id <id> | Required unless model_id is configured | Offline preset model id | None | Main transcription model. |
| --models-dir <path> | Optional | Filesystem path | Desktop app models directory, when inferable | Overrides config. |
| --vad-model-id <id> | Optional | Preset model id | silero-vad when required | Overrides the default VAD companion. |
| --punctuation-model-id <id> | Optional | Preset model id | sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12-int8 when required | Overrides the default punctuation companion. |
| --threads <n> | Optional | Integer greater than 0 | 4 | Overrides config. |
| --enable-itn | Optional | Flag | false | Conflicts with --disable-itn. |
| --disable-itn | Optional | Flag | false | Overrides enable_itn = true; conflicts with --enable-itn. |
| --hotwords <words> | Optional | Comma-separated words | None | Overrides hotwords; currently supported by Transducer and Qwen3 models. |
| --gpu-acceleration <provider> | Optional | auto, cpu, cuda, coreml, directml | auto | Overrides config. |
| --vad-buffer <seconds> | Optional | Number greater than 0 | 5.0 | CLI name for vad_buffer_size. |
| --save-wav <path> | Optional | Filesystem path | None | CLI-only; saves the intermediate resampled WAV. Not supported with --input-dir. |
| --quiet | Optional | Flag | Off | Hides transcription progress and overrides quiet = false. |
| --force | Optional | Flag | Off | Allows overwriting existing output files. Duplicate planned batch outputs still fail. |
models list
| Parameter / config key | Required | Range | Default | Notes |
|---|---|---|---|---|
| --models-dir <path> | Optional | Filesystem path | Desktop app models directory, when inferable | Used to detect installed presets. |
| --mode <mode> | Optional | streaming, offline | All modes | Filters by supported mode. |
| --type <type> | Optional | Preset model type, such as whisper, vad, punctuation | All types | Filters by model type. |
| --language <code> | Optional | Language token, such as zh, en, ja, yue | All languages | Filters by supported language token. |
| --installed | Optional | Flag | Off | Shows only models present in models_dir. |
| --json | Optional | Flag | Off | Prints machine-readable JSON instead of the default table. |
| Output | Always | Table or JSON | Table | Printed to stdout. |
models download
| Parameter / config key | Required | Range | Default | Notes |
|---|---|---|---|---|
| <model_id> | Required | Known preset model id | None | Main model to download. |
| --models-dir <path> | Optional | Filesystem path | Desktop app models directory, when inferable | Target models directory. |
| --quiet | Optional | Flag | Off | Hides per-download progress. |
| Companion downloads | Automatic | Required VAD and punctuation presets | Automatic | Downloading a main model also downloads required companions. |
models delete
| Parameter / config key | Required | Range | Default | Notes |
|---|---|---|---|---|
| <model_id> | Required | Known preset model id | None | Model to delete. |
| --models-dir <path> | Optional | Filesystem path | Desktop app models directory, when inferable | Target models directory. |
| --yes | Optional | Flag | Off | Skips the interactive confirmation prompt. |
| Missing install path | No | Known but not installed preset | Successful no-op | Prints a notice to stderr and exits with status 0. |
| Companion deletion | No | Required VAD and punctuation presets | Not deleted | Delete companion models explicitly if you no longer need them. |
serve
| Parameter / config key | Required | Range | Default | Notes |
|---|---|---|---|---|
| --config <path> | Optional | TOML file path | None | Loads defaults from config. |
| --host <ip> | Optional | Bind address | 0.0.0.0 | Overrides config. |
| --port <port> | Optional | TCP port 0 to 65535 | 14200 | Overrides config. |
| --api-key <key> | Optional | String | Empty | Empty means no Bearer auth. |
| --models-dir <path> | Optional | Filesystem path | Desktop app models directory, when inferable | Overrides config. |
| --ip-whitelist <rules> | Optional | Comma-separated rules | localhost | Supports localhost, exact IPs, CIDR, *, and IPv4 wildcards like 192.168.*. |
| --max-streaming <n> | Optional | Non-negative integer | 2 | Maximum concurrent streaming connections. |
| --max-concurrent <n> | Optional | Non-negative integer | 2 | Maximum concurrent batch jobs. |
| --max-queue-size <n> | Optional | Non-negative integer | 100 | 0 means the queue is effectively unlimited. |
| --max-upload-size-mb <n> | Optional | Non-negative integer | 50 | 0 disables the upload size limit. |
| --job-ttl-minutes <n> | Optional | Non-negative integer | 60 | 0 disables completed/failed job cleanup. |
| --gpu-acceleration <provider> | Optional | auto, cpu, cuda, coreml, directml | auto | HTTP API requests do not accept a per-request GPU override. |
| --vad-model-id <id> | Optional | Preset model id | silero-vad | Default VAD companion for API server jobs. |
| --punctuation-model-id <id> | Optional | Preset model id | sherpa-onnx-punct-ct-transformer-zh-en-vocab272727-2024-04-12-int8 | Default punctuation companion for API server jobs. |
Run sona <command> --help for the full clap-generated help text.