GPU and Speech Inference

Optional self-hosted speech-to-text (STT) and text-to-speech (TTS) inference stack for feros.

The inference module provides self-hosted STT and TTS services as an alternative to external provider APIs.

Build and Run

From repository root:

make inf-build-stt   # build the STT service image
make inf-build-tts   # build the TTS service image
make inf-stt         # run the STT service
make inf-tts         # run the TTS service

Default ports:

  • STT: 9001
  • TTS: 9002

Typical Usage

Use this stack when you want local GPU inference instead of external provider APIs.

Hardware Notes

  • NVIDIA GPUs and NVIDIA Container Toolkit are required.
  • Running STT and TTS on separate GPUs improves stability under barge-in traffic.
