HTTP API Guide

Sona exposes a local, privacy-first HTTP API server for headless integration. Automated tools, batch-processing scripts, and other applications can drive speech-to-text workflows on the same machine through REST endpoints.

Configuration and server activation

The API server can be started in two ways.

GUI client settings

Navigate to Settings > API Server and configure:

Enable API Server: toggle activation.
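Host: bind IP address. Use 127.0.0.1 to restrict access to the local machine, or 0.0.0.0 to listen on all network interfaces. Binding to 0.0.0.0 makes the server reachable from other machines, so set an API key in that case.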
Port: TCP port for the server. The default is 14200.
API Key: optional Bearer token that protects the endpoints. Use Generate to create a secure key and Copy to copy it to the clipboard.

Headless CLI mode

You can also launch Sona in headless CLI server mode:

sona serve --host 127.0.0.1 --port 14200 --api-key your_secure_key --ip-whitelist localhost --max-streaming 2 --gpu-acceleration auto

GPU acceleration is configured as a server-level default through GUI model settings or sona serve --gpu-acceleration. Batch and streaming API requests do not accept a per-request GPU override.

For the full table of serve options, see the CLI Guide.

Authentication

When an API Key is configured, every HTTP request must include it in the Authorization header as a Bearer token:

Authorization: Bearer your_secure_key

If no API key is set, the server permits unauthenticated requests.
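A client that has to work both with and without a key can build the header conditionally. The following is a minimal sketch; `buildAuthHeaders` is a hypothetical helper name used for illustration and is not part of Sona.

```javascript
// Hypothetical helper: returns the headers to send with every request.
// When no API key is configured on the server, no Authorization header is needed.
function buildAuthHeaders(apiKey) {
  const headers = {};
  if (typeof apiKey === 'string' && apiKey.length > 0) {
    headers.Authorization = `Bearer ${apiKey}`;
  }
  return headers;
}

console.log(buildAuthHeaders('your_secure_key'));
```

The returned object can be passed as the `headers` option of `fetch` or any other HTTP client.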

Server info and capabilities

Retrieve server platform information, hardware status, installed models, and available online ASR providers.

URL: /v1/info
Method: GET

Response (`200 OK`)

{
  "platform": "win32",
  "gpuAvailable": true,
  "models": ["sensevoice", "sherpa-onnx-whisper-turbo"],
  "vadInstalled": true,
  "punctuationInstalled": true,
  "onlineAsrProviders": [
    {
      "id": "volcengine-doubao",
      "configured": true,
      "supportsBatch": true,
      "supportsStreaming": true
    }
  ]
}
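A client might use this response to decide whether a given model_id is usable before submitting a job. The sketch below works on an already parsed response. `canRunBatch` is a hypothetical helper, and the rule that an online provider must be both configured and batch-capable is an assumption, not something this guide states.

```javascript
// Hypothetical helper: checks a parsed /v1/info response for a usable model.
// Assumption: an online provider qualifies only if it is configured and
// reports supportsBatch.
function canRunBatch(info, modelId) {
  if (Array.isArray(info.models) && info.models.includes(modelId)) {
    return true; // installed local model
  }
  const providers = Array.isArray(info.onlineAsrProviders)
    ? info.onlineAsrProviders
    : [];
  return providers.some(
    (p) => p.id === modelId && p.configured === true && p.supportsBatch === true
  );
}

// Sample data copied from the response shown above.
const sampleInfo = {
  models: ['sensevoice', 'sherpa-onnx-whisper-turbo'],
  onlineAsrProviders: [
    { id: 'volcengine-doubao', configured: true, supportsBatch: true, supportsStreaming: true },
  ],
};
console.log(canRunBatch(sampleInfo, 'sensevoice'));
```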

Server health and stats

Retrieve server uptime, active/pending job counts, and temporary storage usage.

URL: /health
Method: GET

Response (`200 OK`)

{
  "status": "ok",
  "uptime": 3600,
  "activeJobs": 1,
  "pendingJobs": 0,
  "cacheSpaceBytes": 10485760
}

List all jobs

Query the current status of all transcription jobs in the manager.

URL: /v1/transcriptions/jobs
Method: GET

Response (`200 OK`)

Returns a map of job_id to the current job status.

{
  "c86e0c65-2746-4e56-9141-866d51bbca43": "Pending",
  "a1b2c3d4-e5f6-4a7b-8c9d-0e1f2a3b4c5d": "Processing"
}
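To get a quick overview, a script could group the returned job IDs by state. The sketch below is illustrative; `groupJobsByState` is a hypothetical helper. It assumes a status is either a bare string such as "Pending" or, as in the single-job endpoint, an object keyed by the state name.

```javascript
// Hypothetical helper: groups job IDs by state name.
// Assumption: a status is a string ("Pending") or an object whose first
// key is the state name ({ "Failed": "..." }).
function groupJobsByState(jobs) {
  const groups = {};
  for (const [jobId, status] of Object.entries(jobs)) {
    let state = 'Unknown';
    if (typeof status === 'string') {
      state = status;
    } else if (status && typeof status === 'object') {
      state = Object.keys(status)[0] || 'Unknown';
    }
    if (!groups[state]) groups[state] = [];
    groups[state].push(jobId);
  }
  return groups;
}

console.log(
  groupJobsByState({
    'c86e0c65-2746-4e56-9141-866d51bbca43': 'Pending',
  })
);
```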

Submit a transcription job

Submit a local audio or video file for speech-to-text processing. Jobs are queued and executed by the background transcription worker.

URL: /v1/transcriptions
Method: POST
Content-Type: multipart/form-data

Request payload

Field name	Type	Required	Description
`file`	Binary	Yes	The audio or video file to transcribe.
`model_id`	String	Yes	The identifier of a local ASR model, such as `sensevoice`, or of a configured online ASR provider, such as `volcengine-doubao`.
`language`	String	No	Target language code, such as `zh`, `en`, `ja`, `ko`, or `yue`. Defaults to `auto`.
`hotwords`	String	No	Custom vocabulary or keywords to enhance recognition, separated by newlines.
`webhook_url`	String	No	HTTP URL to receive a POST notification once transcription finishes or fails.
`webhook_secret`	String	No	Secret used to sign the webhook payload with HMAC-SHA256.

Response (`200 OK`)

Returns the unique job_id allocated for the transcription task:

{
  "job_id": "c86e0c65-2746-4e56-9141-866d51bbca43"
}

Curl example

curl -X POST http://127.0.0.1:14200/v1/transcriptions \
  -H "Authorization: Bearer your_secure_key" \
  -F "file=@/path/to/interview.wav" \
  -F "model_id=sensevoice" \
  -F "language=zh"
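The same request can be sketched in JavaScript with the `fetch`, `FormData`, and `Blob` globals available in Node.js 18 and later. `submitTranscription` and its option names are hypothetical; only the form field names and the URL come from this guide. The `fetchImpl` parameter exists so the function can be exercised without a running server.

```javascript
// Hypothetical helper: submits a file and resolves to the returned job_id.
// Assumes Node.js 18+ for the global fetch, FormData, and Blob.
async function submitTranscription(options, fetchImpl = fetch) {
  const { baseUrl, apiKey, fileBlob, fileName, modelId, language } = options;

  const form = new FormData();
  form.append('file', fileBlob, fileName);
  form.append('model_id', modelId);
  if (language) form.append('language', language);

  const response = await fetchImpl(`${baseUrl}/v1/transcriptions`, {
    method: 'POST',
    headers: apiKey ? { Authorization: `Bearer ${apiKey}` } : {},
    body: form, // fetch adds the multipart boundary to Content-Type itself
  });
  if (!response.ok) {
    throw new Error(`Submit failed with HTTP ${response.status}`);
  }
  const { job_id: jobId } = await response.json();
  return jobId;
}

// Usage (requires a running server):
// const fs = require('fs');
// const blob = new Blob([fs.readFileSync('/path/to/interview.wav')]);
// submitTranscription({
//   baseUrl: 'http://127.0.0.1:14200', apiKey: 'your_secure_key',
//   fileBlob: blob, fileName: 'interview.wav',
//   modelId: 'sensevoice', language: 'zh',
// }).then(console.log);
```

Do not set the Content-Type header by hand here; a manually written multipart header would lack the boundary that the form body uses.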

Query job status

Query the lifecycle state and transcription result for a submitted job.

URL: /v1/transcriptions/:job_id
Method: GET

Response structures

Depending on the job's state, the endpoint returns one of the following JSON values.

Pending

The job is queued and waiting for the transcription worker.

"Pending"

Processing

The job is active and transcription is currently underway.

"Processing"

Completed

The transcription succeeded. The response returns segment-level text with millisecond timestamps:

{
  "Completed": [
    {
      "id": 0,
      "start": 120,
      "end": 2840,
      "text": "Hello, welcome to Sona.",
      "speaker": "Speaker 0"
    },
    {
      "id": 1,
      "start": 3100,
      "end": 5600,
      "text": "We are processing speech locally on your machine.",
      "speaker": "Speaker 1"
    }
  ]
}
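Because start and end are in milliseconds, the segments map directly onto subtitle formats. As an illustration, the sketch below converts the segment array to SRT text. `formatSrtTime` and `segmentsToSrt` are hypothetical helpers, and the code assumes the timestamps are non-negative integers.

```javascript
// Formats integer milliseconds as an SRT timestamp: HH:MM:SS,mmm
function formatSrtTime(ms) {
  const pad = (n, width) => String(n).padStart(width, '0');
  const hours = Math.floor(ms / 3600000);
  const minutes = Math.floor((ms % 3600000) / 60000);
  const seconds = Math.floor((ms % 60000) / 1000);
  const millis = ms % 1000;
  return `${pad(hours, 2)}:${pad(minutes, 2)}:${pad(seconds, 2)},${pad(millis, 3)}`;
}

// Hypothetical helper: converts the "Completed" segment array into SRT text.
// SRT cues are numbered from 1 and separated by a blank line.
function segmentsToSrt(segments) {
  return segments
    .map(
      (seg, index) =>
        `${index + 1}\n${formatSrtTime(seg.start)} --> ${formatSrtTime(seg.end)}\n${seg.text}\n`
    )
    .join('\n');
}

console.log(
  segmentsToSrt([
    { id: 0, start: 120, end: 2840, text: 'Hello, welcome to Sona.', speaker: 'Speaker 0' },
  ])
);
```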

Failed

The transcription failed. The response includes the error message:

{
  "Failed": "Failed to decode audio file: invalid format"
}

Curl example

curl http://127.0.0.1:14200/v1/transcriptions/c86e0c65-2746-4e56-9141-866d51bbca43 \
  -H "Authorization: Bearer your_secure_key"
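Since the response is a bare string in two states and an object in the other two, client code usually benefits from normalizing it first. The sketch below shows one way to do that, plus a simple polling loop. `parseJobStatus` and `waitForJob` are hypothetical helpers, the 2-second interval is an arbitrary choice, and `fetch` is assumed to be the Node.js 18+ global.

```javascript
// Hypothetical helper: normalizes the four response shapes into one object.
function parseJobStatus(body) {
  if (body === 'Pending' || body === 'Processing') {
    return { state: body, done: false };
  }
  if (body && typeof body === 'object') {
    if (Array.isArray(body.Completed)) {
      return { state: 'Completed', done: true, segments: body.Completed };
    }
    if ('Failed' in body) {
      return { state: 'Failed', done: true, error: String(body.Failed) };
    }
  }
  throw new Error(`Unrecognized job status: ${JSON.stringify(body)}`);
}

// Polls the status endpoint until the job completes or fails.
async function waitForJob(baseUrl, jobId, headers = {}, intervalMs = 2000, fetchImpl = fetch) {
  for (;;) {
    const response = await fetchImpl(`${baseUrl}/v1/transcriptions/${jobId}`, { headers });
    if (!response.ok) {
      throw new Error(`Status query failed with HTTP ${response.status}`);
    }
    const status = parseJobStatus(await response.json());
    if (status.done) return status;
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
}
```

For long recordings, the webhook_url field avoids polling altogether.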

Webhooks and verification

If webhook_url was specified when submitting the job, Sona posts the final JSON state to that URL on job completion or failure.

Webhook signature (`X-Sona-Signature`)

To secure webhooks, specify a webhook_secret when submitting the job. Sona computes an HMAC-SHA256 signature of the JSON payload string using this secret and sends it in a request header:

Header name: X-Sona-Signature
Format: sha256=<hex_encoded_signature>

Verification algorithm

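const crypto = require('crypto');

// Returns true only when the header carries a valid HMAC-SHA256 of the payload.
// payloadString must be the request body exactly as received.
function verifySignature(payloadString, secret, receivedSignatureHeader) {
  if (typeof receivedSignatureHeader !== 'string') return false;
  const [algorithm, signature] = receivedSignatureHeader.split('=');
  if (algorithm !== 'sha256' || !signature) return false;

  const expected = crypto
    .createHmac('sha256', secret)
    .update(payloadString)
    .digest();
  const received = Buffer.from(signature, 'hex');

  // timingSafeEqual throws when lengths differ, so reject those first.
  if (received.length !== expected.length) return false;
  return crypto.timingSafeEqual(received, expected);
}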
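When testing a receiver locally, it helps to generate signed requests yourself. The sketch below pairs a signing helper with a minimal handler that checks the header before parsing the body. `signPayload` and `handleWebhook` are hypothetical names, the 401 and 200 status codes are a design choice rather than something Sona requires, and the lowercase header key reflects how Node.js exposes incoming headers.

```javascript
const crypto = require('crypto');

// Hypothetical test helper: builds an X-Sona-Signature value in the
// documented format sha256=<hex_encoded_signature>.
function signPayload(rawBody, secret) {
  const digest = crypto.createHmac('sha256', secret).update(rawBody).digest('hex');
  return `sha256=${digest}`;
}

// Hypothetical receiver logic: verifies the raw body before parsing it.
// `headers` is expected to use lowercase names, as in Node's http module.
function handleWebhook(rawBody, headers, secret) {
  const received = Buffer.from(String(headers['x-sona-signature'] || ''));
  const expected = Buffer.from(signPayload(rawBody, secret));
  if (received.length !== expected.length || !crypto.timingSafeEqual(received, expected)) {
    return { status: 401 };
  }
  return { status: 200, event: JSON.parse(rawBody) };
}

const body = JSON.stringify({ Failed: 'Failed to decode audio file: invalid format' });
const header = signPayload(body, 'your_webhook_secret');
console.log(handleWebhook(body, { 'x-sona-signature': header }, 'your_webhook_secret').status);
```

The signature is computed over the raw body, so a receiver should capture the bytes before any JSON middleware re-serializes them.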
