Documentation

Complete documentation for voicetyped — from installation to production deployment.

Welcome to the voicetyped documentation. voicetyped is a private, self-hosted, API-first voice automation platform. It terminates SIP calls and exposes them as programmable, real-time sessions with local speech recognition and dialog control.

Choose a section to get started:

Getting Started

Install and run voicetyped in under 10 minutes.

Architecture Overview

How voicetyped's four core services work together to process voice calls.

Media Gateway

SIP termination, RTP audio handling, and codec management for voicetyped.

Speech Gateway

Local ASR and TTS engines, with speech recognition powered by whisper.cpp. No cloud, no data leakage.

Conversation Runtime

Deterministic dialog execution engine with finite state machines, turn detection, and optional LLM hooks.

Integration Gateway

Production-grade backend integration with retry logic, circuit breaking, and rate limiting.

Call Event Stream API

Subscribe to real-time call events via the voicetyped REST API.

Call Control API

Programmatically control active calls — play TTS, transfer, hang up, and more.

Speech API

Direct access to the local ASR and TTS engines via REST API and WebSocket.

Dialog Hooks API

Implement webhook endpoints that voicetyped calls during dialog execution.

Single-Node Deployment

Deploy voicetyped as a single binary on a Linux server with systemd.

Kubernetes Deployment

Deploy voicetyped on Kubernetes with Helm, GPU scheduling, and autoscaling.

Air-Gapped Deployment

Deploy voicetyped in isolated networks with no internet connectivity.

Security & Compliance

mTLS, encryption, audit logging, and compliance controls for voicetyped.

Observability

Prometheus metrics, OpenTelemetry tracing, and structured logging for voicetyped.

Roadmap

voicetyped development roadmap — from MVP to enterprise platform.