Roadmap

voicetyped follows a phased development roadmap. Each phase builds on the previous one, adding capabilities while maintaining the core principles: self-hosted, private, and programmable.

Current Status

v0.1 — MVP (in progress)

The initial release focuses on proving the core value proposition: a self-hosted service that terminates SIP calls and exposes them as programmable sessions with local ASR.

v0.1 Scope

SIP inbound call termination
PCM audio extraction from RTP
Streaming ASR (whisper.cpp)
Voice activity detection
Turn detection
One dialog FSM (YAML-defined)
TTS playback (Piper)
REST API & webhook integration
Simple CLI installer
Prometheus metrics endpoint
DTMF detection
Call transfer (REFER)
Air-gapped installer
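
As an illustration of the "PCM audio extraction from RTP" item, the following is a minimal Python sketch, not voicetyped's actual implementation. It assumes G.711 μ-law audio (RTP payload type 0) and the fixed header layout from RFC 3550. The names RtpPacket, parse_rtp, and ulaw_payload_to_pcm16 are hypothetical.

```python
import struct
from dataclasses import dataclass


@dataclass
class RtpPacket:
    payload_type: int
    sequence: int
    timestamp: int
    ssrc: int
    payload: bytes


def parse_rtp(datagram: bytes) -> RtpPacket:
    """Split one RTP datagram into header fields and payload (RFC 3550)."""
    if len(datagram) < 12:
        raise ValueError("RTP packet shorter than the 12-byte fixed header")
    b0, b1, seq, ts, ssrc = struct.unpack("!BBHII", datagram[:12])
    if b0 >> 6 != 2:
        raise ValueError("unsupported RTP version")
    offset = 12 + 4 * (b0 & 0x0F)          # skip the CSRC list
    if b0 & 0x10:                          # header extension present
        if len(datagram) < offset + 4:
            raise ValueError("truncated header extension")
        _profile, ext_words = struct.unpack("!HH", datagram[offset:offset + 4])
        offset += 4 + 4 * ext_words
    end = len(datagram)
    if b0 & 0x20:                          # last byte holds the padding count
        end -= datagram[-1]
    if offset > end:
        raise ValueError("malformed RTP packet")
    return RtpPacket(b1 & 0x7F, seq, ts, ssrc, datagram[offset:end])


def ulaw_to_linear(byte: int) -> int:
    """Decode one G.711 mu-law byte to a signed 16-bit sample."""
    u = ~byte & 0xFF
    magnitude = (((u & 0x0F) << 3) + 0x84) << ((u >> 4) & 0x07)
    return (0x84 - magnitude) if (u & 0x80) else (magnitude - 0x84)


def ulaw_payload_to_pcm16(payload: bytes) -> bytes:
    """Convert a mu-law payload to little-endian 16-bit PCM."""
    samples = [ulaw_to_linear(b) for b in payload]
    return struct.pack(f"<{len(samples)}h", *samples)
```

A real media path would also reorder packets by sequence number and smooth out jitter before handing PCM to the ASR engine; the sketch leaves that out.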

Phase 1 — Infrastructure

Focus: Solid telephony foundation and speech pipeline.

Feature	Status	Description
Media Gateway	Complete	SIP endpoint, RTP handling, codec support
Speech Gateway	Complete	whisper.cpp ASR, Piper TTS
Streaming APIs	Complete	SSE streaming for events, WebSocket for audio
Call Control API	Complete	PlayTTS, Hangup, Transfer
DTMF Support	In Progress	RFC 2833 (superseded by RFC 4733) and in-band detection
Opus Codec	In Progress	WebRTC-compatible audio
SIP Registration	Planned	Register with external PBX
WebRTC Gateway	Planned	Browser-based voice sessions
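
For the DTMF Support row, out-of-band digits arrive as RTP "telephone-event" payloads. The sketch below shows how such a payload could be decoded in Python, assuming the 4-byte layout defined in RFC 2833 and its successor RFC 4733. TelephoneEvent, parse_telephone_event, and DtmfCollector are hypothetical names, and this is not voicetyped's actual detector.

```python
import struct
from typing import List, NamedTuple, Optional

# Event codes 0-15 map to the sixteen DTMF keys.
DTMF_EVENTS = "0123456789*#ABCD"


class TelephoneEvent(NamedTuple):
    digit: str
    end: bool       # E bit: set on the final packet(s) of an event
    volume: int     # power level in -dBm0 (0-63)
    duration: int   # in RTP timestamp units


def parse_telephone_event(payload: bytes) -> TelephoneEvent:
    """Decode one telephone-event payload (event, E/R/volume, duration)."""
    if len(payload) < 4:
        raise ValueError("telephone-event payload must be at least 4 bytes")
    event, flags, duration = struct.unpack("!BBH", payload[:4])
    if event >= len(DTMF_EVENTS):
        raise ValueError(f"not a DTMF event code: {event}")
    return TelephoneEvent(DTMF_EVENTS[event], bool(flags & 0x80),
                          flags & 0x3F, duration)


class DtmfCollector:
    """Report each key press once, on its first end-of-event packet."""

    def __init__(self) -> None:
        self.digits: List[str] = []
        self._last_timestamp: Optional[int] = None

    def feed(self, rtp_timestamp: int, payload: bytes) -> Optional[str]:
        event = parse_telephone_event(payload)
        # Every packet of one event carries the same RTP timestamp, and the
        # end packet is usually retransmitted, so deduplicate on timestamp.
        if event.end and rtp_timestamp != self._last_timestamp:
            self._last_timestamp = rtp_timestamp
            self.digits.append(event.digit)
            return event.digit
        return None
```

In-band detection is a separate problem: it has to find the tone pairs in the decoded audio itself, typically with a Goertzel filter bank.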

Phase 2 — Dialog Runtime

Focus: Deterministic conversation engine and integration framework.

Feature	Status	Description
FSM Engine	Complete	YAML-defined state machines
Per-call State	Complete	Isolated session state
Timeout Handling	Complete	Configurable per-state timeouts
Barge-in	Complete	Caller can interrupt TTS
Dialog Hooks	Complete	Webhook callbacks to customer backends
Multi-digit DTMF	Planned	Collect account numbers, PINs
Dialog Variables	Complete	Template variables in dialogs
Retry & Fallback	In Progress	Graceful error handling
Dialog Routing	Planned	Route calls to dialogs by SIP headers
Conditional Transitions	Complete	Expression-based routing
Sub-dialogs	Planned	Reusable dialog fragments
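
To show how several rows in this table fit together (state machines, per-call state, per-state timeouts, template variables, conditional transitions), here is a small Python sketch. The actual YAML schema is not shown on this page, so the DIALOG dictionary below stands in for a parsed YAML file. Its keys (say, timeout_s, on_timeout, if_contains, final) and the DialogSession class are assumptions made for illustration only.

```python
from typing import Any, Dict, Optional

# Hypothetical dialog definition, shaped like a parsed YAML document.
DIALOG: Dict[str, Any] = {
    "initial": "greet",
    "states": {
        "greet": {
            "say": "Hello {caller_name}, say billing or support.",
            "timeout_s": 5,
            "on_timeout": "goodbye",
            "transitions": [
                {"if_contains": "billing", "to": "billing"},
                {"if_contains": "support", "to": "support"},
            ],
        },
        "billing": {"say": "Connecting you to billing.", "final": True},
        "support": {"say": "Connecting you to support.", "final": True},
        "goodbye": {"say": "Goodbye.", "final": True},
    },
}


class DialogSession:
    """One call's view of the dialog: current state plus its own variables."""

    def __init__(self, dialog: Dict[str, Any], variables: Dict[str, str]) -> None:
        self.dialog = dialog
        self.variables = dict(variables)   # copied so calls stay isolated
        self.state: str = dialog["initial"]

    @property
    def _spec(self) -> Dict[str, Any]:
        return self.dialog["states"][self.state]

    def prompt(self) -> str:
        """Text to synthesize for the current state, with variables filled in."""
        return self._spec["say"].format_map(self.variables)

    def timeout_s(self) -> Optional[float]:
        return self._spec.get("timeout_s")

    def is_final(self) -> bool:
        return bool(self._spec.get("final", False))

    def on_transcript(self, text: str) -> str:
        """Take the first transition whose keyword occurs in the transcript."""
        lowered = text.lower()
        for transition in self._spec.get("transitions", []):
            if transition["if_contains"] in lowered:
                self.state = transition["to"]
                break
        return self.state

    def on_timeout(self) -> str:
        """Follow the state's timeout transition, if it defines one."""
        self.state = self._spec.get("on_timeout", self.state)
        return self.state
```

Because transitions are plain data evaluated in order, the same transcript always leads to the same next state, which is the sense of "deterministic" used above.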

Phase 3 — Enterprise Features

Focus: Multi-tenancy, access control, and production hardening.

Feature	Status	Description
Multi-tenant Isolation	Planned	Separate tenants with resource quotas
RBAC	Planned	Role-based access to APIs
Quotas	Planned	Per-tenant call and resource limits
HA Deployment	Planned	Active-active with Redis state store
Audit Logging	In Progress	Comprehensive audit trail
mTLS	In Progress	Mutual TLS between all components
Encryption at Rest	Planned	AES-256-GCM for stored data
Certificate Rotation	Planned	Automated cert management
Admin API	Planned	Configuration management via API
Helm Chart	In Progress	Kubernetes deployment

Phase 4 — Intelligence Layer

Focus: Optional AI capabilities layered on top of the deterministic runtime.

Feature	Status	Description
Intent Classifiers	Planned	Local intent classification models
LLM Nodes	Planned	Optional LLM-powered dialog states
RAG Adapters	Planned	Retrieval-augmented generation over knowledge bases
Sentiment Analysis	Planned	Real-time caller sentiment scoring
Language Detection	Planned	Automatic language detection and routing
Keyword Spotting	Planned	Real-time keyword alerts
Multilingual ASR	Planned	Multi-language support in single calls

Future — Expansion Opportunities

These features are being evaluated for future phases:

Voice Analytics — call quality scoring, agent performance metrics
Compliance Redaction — automatic PII detection and redaction in transcripts
Keyword Alerts — real-time alerting on specific phrases
Multilingual Support — per-call language switching
Offline Translation — local translation between languages
Recording & Playback — secure call recording with consent management
Call Queuing — built-in queue management for high-volume scenarios
Outbound Dialing — programmatic outbound call initiation

What We Will NOT Build

These are explicitly out of scope — they are customer-side concerns:

CRM integration (customers build this via hooks)
Ticketing system (customers use their existing tools)
Agent desktop (customers build or buy this separately)
Visual dialog designers (dialogs are YAML — code, not clicks)
Analytics dashboards (customers use Grafana or build their own)
Consumer-facing applications (we are infrastructure)

Release Cadence

Patch releases (v0.x.y): bug fixes and security patches, as needed
Minor releases (v0.x.0): new features, backward compatible, monthly
Major releases (vX.0.0): breaking changes, annually (with migration guides)
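
The three release types differ by which version component changes. As a rough sketch, upgrade tooling could classify a release from its tag as follows. It assumes tags of the form vMAJOR.MINOR.PATCH, and the helper names parse_version and release_kind are hypothetical.

```python
from typing import Tuple


def parse_version(tag: str) -> Tuple[int, int, int]:
    """Turn 'v0.2.1' (or '0.2.1') into (0, 2, 1)."""
    text = tag[1:] if tag.startswith("v") else tag
    major, minor, patch = text.split(".")
    return int(major), int(minor), int(patch)


def release_kind(previous: str, new: str) -> str:
    """Classify an upgrade as 'major', 'minor', or 'patch'."""
    old, cur = parse_version(previous), parse_version(new)
    if cur <= old:
        raise ValueError(f"{new} is not newer than {previous}")
    if cur[0] != old[0]:
        return "major"   # breaking changes: read the migration guide
    if cur[1] != old[1]:
        return "minor"   # new features, backward compatible
    return "patch"       # bug fixes and security patches
```

For example, release_kind("v0.1.3", "v0.2.0") returns "minor".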

Contributing

voicetyped uses an open-core model:

Open Source (Apache 2.0):

Speech Gateway
Basic Media Gateway
CLI tools
Client libraries

Commercial License:

Conversation Runtime
Integration Gateway
Admin tooling
Enterprise security features
Multi-tenant support

Contributions to open-source components are welcome. See the GitHub repository for contribution guidelines.

Feedback

We prioritize features based on customer and community feedback. To request a feature or report a bug:

GitHub Issues — for bug reports and feature requests
Discussions — for architecture questions and use-case discussions
Email — enterprise@voicetyped.com for commercial inquiries