Integration Gateway

Production-grade backend integration with retry logic, circuit breaking, and rate limiting.

The Integration Gateway is the boundary between voicetyped and your existing systems. It handles outbound calls to customer backends via REST and webhooks, with built-in authentication, retry logic, rate limiting, and circuit breaking. This is where voicetyped’s infrastructure DNA meets your business logic.

Responsibilities

  • Backend connectivity — REST and HTTP calls to customer services
  • Authentication — mTLS, API keys, OAuth2 client credentials
  • Retry logic — Exponential backoff with jitter
  • Rate limiting — Per-service and per-call rate limits
  • Circuit breaking — Prevents cascading failures
  • Request logging — Audit trail for all outbound calls
  • Timeout management — Configurable per-service timeouts

Configuration

# /etc/voice-gateway/config.yaml — integration section

integration:
  api_port: 8080                     # REST API port
  http_port: 8080                    # HTTP API port
  default_timeout: 5s                # Default timeout for backend calls
  max_retries: 3                     # Default retry count

  # Backend service definitions
  services:
    ticketing:
      type: http
      url: https://ticketing.internal/api/v1
      method: POST
      headers:
        Authorization: "Bearer ${TICKETING_TOKEN}"
      timeout: 10s
      retry:
        max_attempts: 3
        initial_backoff: 100ms
        max_backoff: 5s
        backoff_multiplier: 2.0
      rate_limit:
        requests_per_second: 50
        burst: 10
      circuit_breaker:
        threshold: 5               # Failures before opening
        timeout: 30s               # Time in open state before half-open
        half_open_requests: 3      # Requests to try in half-open state

    issue_classifier:
      type: http
      url: https://classifier.internal/classify
      method: POST
      headers:
        Authorization: "Bearer ${CLASSIFIER_TOKEN}"
      timeout: 3s

    webhook_notify:
      type: http
      url: https://hooks.internal/voice-events
      method: POST
      headers:
        Authorization: "Bearer ${WEBHOOK_TOKEN}"
        Content-Type: application/json
      timeout: 5s
      retry:
        max_attempts: 2
        initial_backoff: 200ms

Service Registration

Backend services are registered in the configuration file under integration.services. Each service has a name (used in dialog call_hook actions) and connection details.

HTTP/Webhook Services

services:
  my_webhook:
    type: http
    url: https://api.internal/webhook
    method: POST                      # GET, POST, PUT, PATCH
    headers:
      Authorization: "Bearer ${API_TOKEN}"
      Content-Type: application/json
    timeout: 5s

Authentication

mTLS (Mutual TLS)

mTLS is the recommended authentication method for service-to-service communication:

services:
  secure_backend:
    type: http
    url: https://backend.internal/api
    tls:
      cert: /etc/voice-gateway/certs/client.pem
      key: /etc/voice-gateway/certs/client-key.pem
      ca: /etc/voice-gateway/certs/ca.pem

API Key

For HTTP services that use API key authentication:

services:
  api_backend:
    type: http
    url: https://api.internal/v1
    headers:
      X-API-Key: "${API_KEY}"

OAuth2 Client Credentials

services:
  oauth_backend:
    type: http
    url: https://api.internal/v1
    auth:
      type: oauth2
      token_url: https://auth.internal/oauth/token
      client_id: "${OAUTH_CLIENT_ID}"
      client_secret: "${OAUTH_CLIENT_SECRET}"
      scopes:
        - voice.read
        - voice.write
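
For orientation, the token request implied by this configuration can be sketched as follows. It follows the client-credentials grant from RFC 6749 (section 4.4) and assumes the credentials are sent with HTTP Basic authentication. The source does not say how voicetyped transmits them, and `client_credentials_request` is a hypothetical helper, not part of voicetyped.

```python
import base64
from urllib.parse import urlencode


def client_credentials_request(token_url, client_id, client_secret, scopes):
    """Build (url, headers, body) for an OAuth2 client-credentials token request.

    Assumption: credentials go in an HTTP Basic header. Some authorization
    servers expect them in the form body instead.
    """
    basic = base64.b64encode(f"{client_id}:{client_secret}".encode()).decode()
    headers = {
        "Authorization": f"Basic {basic}",
        "Content-Type": "application/x-www-form-urlencoded",
    }
    # Scopes are sent as a single space-delimited string.
    body = urlencode({"grant_type": "client_credentials", "scope": " ".join(scopes)})
    return token_url, headers, body
```

The returned body is sent as a POST to `token_url`. The access token in the JSON response is then attached to backend calls as a Bearer token until it expires.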

Retry Logic

voicetyped implements configurable retry with exponential backoff and jitter to handle transient failures:

Attempt 1: immediate
Attempt 2: wait 100ms ± jitter
Attempt 3: wait 200ms ± jitter
Attempt 4: wait 400ms ± jitter
... up to max_backoff

Configuration

retry:
  max_attempts: 3                    # Total attempts (including first)
  initial_backoff: 100ms             # Wait before first retry
  max_backoff: 5s                    # Maximum wait between retries
  backoff_multiplier: 2.0            # Multiply backoff each attempt
  retryable_http_codes:              # HTTP codes to retry on
    - 429
    - 502
    - 503
    - 504
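
The schedule these settings produce can be sketched in a few lines of Python. This is an illustration, not voicetyped's implementation: the `jitter` fraction (here ±10% of the delay) and the `backoff_delays` helper are assumptions, since the source only says "± jitter".

```python
import random


def backoff_delays(max_attempts=3, initial_backoff=0.1, max_backoff=5.0,
                   multiplier=2.0, jitter=0.1, rng=random.random):
    """Return the wait in seconds before each retry.

    The first attempt is immediate, so max_attempts=3 yields two delays.
    """
    delays = []
    base = initial_backoff
    for _ in range(max_attempts - 1):
        capped = min(base, max_backoff)          # never wait longer than max_backoff
        # Assumed jitter model: shift the delay by up to +/- jitter * delay.
        spread = capped * jitter * (2 * rng() - 1)
        delays.append(max(0.0, capped + spread))
        base *= multiplier                       # exponential growth per retry
    return delays
```

With the jitter neutralised (`rng=lambda: 0.5`) and `max_attempts=4`, the delays are 0.1 s, 0.2 s, and 0.4 s, matching the schedule shown earlier.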

Non-Retryable Errors

These errors are never retried:

  • HTTP 400 — request is malformed
  • HTTP 401 / 403 — authentication/authorization failure
  • HTTP 404 — resource does not exist
  • HTTP 409 — conflict / duplicate request
  • All other HTTP 4xx (except 429)
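
Taken together with `retryable_http_codes`, the retry decision reduces to a small predicate. The sketch below assumes the default code list from the configuration above and a 1-based attempt counter; `should_retry` is a hypothetical name.

```python
RETRYABLE_HTTP_CODES = frozenset({429, 502, 503, 504})


def should_retry(status, attempt, max_attempts=3, retryable=RETRYABLE_HTTP_CODES):
    """Return True if a response with `status` on 1-based `attempt` is retried."""
    if attempt >= max_attempts:
        return False              # attempt budget exhausted
    # Everything outside the retryable list (400, 401, 403, 404, 409, ...) fails fast.
    return status in retryable
```

For example, `should_retry(503, attempt=1)` is true, while `should_retry(404, attempt=1)` and `should_retry(429, attempt=3)` are both false.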

Rate Limiting

Rate limiting keeps the gateway from overwhelming backend services:

rate_limit:
  requests_per_second: 50           # Sustained rate
  burst: 10                          # Burst allowance above sustained rate

When the limit is exceeded, requests are queued (up to queue_depth) rather than failed immediately. If the queue fills, further requests fail with a 429 Too Many Requests error.
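
A common way to combine a sustained rate with a burst allowance is a token bucket. The sketch below assumes that model; the source does not say which algorithm voicetyped uses. `TokenBucket`, `admit`, and the explicit `now` argument are hypothetical names chosen for illustration.

```python
from collections import deque


class TokenBucket:
    """Token bucket: refills at `requests_per_second`, holds at most `burst` tokens."""

    def __init__(self, requests_per_second, burst, now=0.0):
        self.rate = float(requests_per_second)
        self.capacity = float(burst)
        self.tokens = self.capacity
        self.last = now

    def try_acquire(self, now):
        # Refill in proportion to elapsed time, never above capacity.
        self.tokens = min(self.capacity, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False


def admit(bucket, queue, queue_depth, request, now):
    """Return 'sent', 'queued', or 'rejected' (the 429 case) for one request."""
    if bucket.try_acquire(now):
        return "sent"
    if len(queue) < queue_depth:
        queue.append(request)     # held until tokens become available
        return "queued"
    return "rejected"
```

With `TokenBucket(50, 10)` and a queue depth of 2, fourteen requests arriving at the same instant split into 10 sent, 2 queued, and 2 rejected. Draining the queue as tokens refill is left out to keep the sketch short.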

Per-Call Rate Limiting

You can also limit the rate per individual call session:

rate_limit:
  requests_per_second: 50           # Global limit
  per_call_rps: 5                   # Per-call limit

Circuit Breaker

The circuit breaker prevents cascading failures by stopping requests to unhealthy backends:

  CLOSED ──── failures >= threshold ───→ OPEN
     ▲                                   │  ▲
     │                   timeout expires │  │ failure
     │                                   ▼  │
     └────────── all succeed ───────── HALF-OPEN

States

State       Behavior
Closed      Requests flow normally. Failures are counted.
Open        All requests immediately fail with UNAVAILABLE. No backend calls.
Half-Open   A limited number of requests are allowed through to test recovery.

Configuration

circuit_breaker:
  threshold: 5                       # Consecutive failures to open
  timeout: 30s                       # Time in open state before half-open
  half_open_requests: 3              # Requests allowed in half-open
  success_threshold: 3               # Successes needed to close from half-open
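
The three states and four settings can be modelled as a small state machine. This is a minimal sketch under stated assumptions, not voicetyped's code: the `CircuitBreaker` class, its method names, and the explicit `now` argument are hypothetical, and results that arrive while the breaker is open are simply ignored.

```python
class CircuitBreaker:
    def __init__(self, threshold=5, timeout=30.0, half_open_requests=3,
                 success_threshold=3):
        self.threshold = threshold
        self.timeout = timeout
        self.half_open_requests = half_open_requests
        self.success_threshold = success_threshold
        self.state = "closed"
        self.failures = 0      # consecutive failures while closed
        self.opened_at = 0.0
        self.probes = 0        # requests admitted while half-open
        self.successes = 0     # successes seen while half-open

    def allow(self, now):
        """Return True if a request may be sent at time `now` (seconds)."""
        if self.state == "open":
            if now - self.opened_at < self.timeout:
                return False                      # fail fast, no backend call
            self.state, self.probes, self.successes = "half_open", 0, 0
        if self.state == "half_open":
            if self.probes >= self.half_open_requests:
                return False                      # probe budget used up
            self.probes += 1
        return True

    def record(self, success, now):
        """Report the outcome of a request that `allow` admitted."""
        if self.state == "half_open":
            if not success:
                self._open(now)                   # any probe failure reopens
            else:
                self.successes += 1
                if self.successes >= self.success_threshold:
                    self.state, self.failures = "closed", 0
        elif self.state == "closed":
            self.failures = 0 if success else self.failures + 1
            if self.failures >= self.threshold:
                self._open(now)

    def _open(self, now):
        self.state, self.opened_at = "open", now
```

A caller checks `allow(now)` before each backend call and reports the result with `record(success, now)`. Passing the clock in explicitly keeps the example deterministic.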

Request/Response Logging

All outbound requests are logged for auditing:

integration:
  logging:
    enabled: true
    level: info                      # debug, info, warn, error
    include_payload: false           # Include request/response bodies
    include_headers: false           # Include HTTP headers
    redact:                          # Fields to redact from logs
      - password
      - ssn
      - credit_card

Example log entry:

{
  "timestamp": "2024-01-15T10:30:45Z",
  "service": "ticketing",
  "method": "CreateTicket",
  "call_session_id": "abc-123-def",
  "duration_ms": 234,
  "status": "OK",
  "retry_count": 0
}
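
When include_payload is enabled, the redact list determines which fields are masked before a body is written to the log. One plausible reading is sketched below: keys are matched exactly at any nesting depth. The matching rule, the `[REDACTED]` mask, and the `redact` helper are all assumptions for illustration.

```python
REDACT_FIELDS = frozenset({"password", "ssn", "credit_card"})


def redact(value, fields=REDACT_FIELDS, mask="[REDACTED]"):
    """Return a copy of a JSON-like value with the listed keys masked."""
    if isinstance(value, dict):
        return {
            key: mask if key in fields else redact(item, fields, mask)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [redact(item, fields, mask) for item in value]
    return value  # scalars pass through unchanged
```

The function builds a new structure rather than mutating its input, so the payload sent to the backend is never altered by logging.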

Implementing a Backend Service

To integrate with voicetyped, implement an HTTP server that handles webhook requests. voicetyped POSTs JSON to your endpoint and expects a JSON response. See the Dialog Hooks API for full webhook schemas and examples.
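
As a starting point, a backend can be as small as the standard-library server below. It accepts a JSON POST and answers with JSON. The response shape (`{"received": ...}`) and the names `HookHandler` and `make_server` are placeholders; the actual request and response fields are defined in the Dialog Hooks API.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


class HookHandler(BaseHTTPRequestHandler):
    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        try:
            payload = json.loads(self.rfile.read(length) or b"{}")
        except json.JSONDecodeError:
            self._reply(400, {"error": "invalid JSON"})   # 400s are not retried
            return
        # Placeholder response: echo the payload back to the caller.
        self._reply(200, {"received": payload})

    def _reply(self, status, body):
        data = json.dumps(body).encode()
        self.send_response(status)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(data)))
        self.end_headers()
        self.wfile.write(data)

    def log_message(self, *args):
        pass  # silence the default per-request stderr logging


def make_server(host="127.0.0.1", port=0):
    """Create the server; port 0 lets the OS pick a free port."""
    return HTTPServer((host, port), HookHandler)
```

Calling `make_server(port=9000).serve_forever()` runs it locally. A production backend would add authentication that matches the service definition, for example a Bearer token check or mTLS.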

Metrics

Metric                                     Type        Description
vg_integration_requests_total              Counter     Total outbound requests
vg_integration_request_duration_seconds    Histogram   Request latency
vg_integration_errors_total                Counter     Failed requests
vg_integration_retries_total               Counter     Retry attempts
vg_integration_circuit_breaker_state       Gauge       Circuit breaker state (0=closed, 1=open, 2=half-open)
vg_integration_rate_limited_total          Counter     Rate-limited requests
