Dialog Hooks API

Implement webhook endpoints that voicetyped calls during dialog execution.

Dialog Hooks are the integration point between voicetyped’s conversation runtime and your business logic. When a dialog state machine executes a call_hook action, the Integration Gateway sends an HTTP POST request with JSON to your webhook URL and uses the JSON response to continue the dialog. You implement standard HTTP endpoints that voicetyped calls — this is the primary way to add custom logic to voice flows.

How It Works

voicetyped acts as the HTTP client. You run an HTTP server that exposes one or more webhook endpoints. During dialog execution, voicetyped POSTs a JSON payload to your endpoint and expects a JSON response.

voicetyped                        Your Server
     |                                    |
     |--- POST /on-intent (JSON) -------->|
     |                                    |  (your business logic)
     |<---------- 200 OK (JSON) ----------|
     |                                    |
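To see the request/response cycle end to end, the sketch below starts a throwaway webhook server on a local port and posts a sample payload to it, playing the role of voicetyped. It uses only the Python standard library. The names `HookHandler`, `post_json`, and `demo` are illustrative, and the echo response stands in for real business logic.

```python
import json
import threading
import urllib.request
from http.server import BaseHTTPRequestHandler, HTTPServer


class HookHandler(BaseHTTPRequestHandler):
    """Toy webhook: echoes the transcript back as response_text."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        event = json.loads(self.rfile.read(length) or b"{}")
        if self.path != "/on-intent":
            self.send_error(404)
            return
        reply = {"response_text": f"You said: {event.get('transcript', '')}"}
        body = json.dumps(reply).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # keep the demo quiet


def post_json(url, payload):
    """POST a JSON payload and return (status, decoded JSON body)."""
    request = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    # An empty ProxyHandler bypasses any proxy configured in the environment.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    with opener.open(request, timeout=5) as response:
        return response.status, json.loads(response.read())


def demo():
    """Run the server on an ephemeral port and send one sample event."""
    server = HTTPServer(("127.0.0.1", 0), HookHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    try:
        port = server.server_address[1]
        event = {"session_id": "call-abc-123",
                 "transcript": "I need a password reset"}
        return post_json(f"http://127.0.0.1:{port}/on-intent", event)
    finally:
        server.shutdown()
        server.server_close()
```

Calling `demo()` returns the HTTP status and the decoded JSON body, which is a quick way to check a handler before pointing voicetyped at it.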

Webhook Endpoints

Your server implements up to three endpoints. Only /on-intent is required.

| Endpoint | Method | Description |
| --- | --- | --- |
| {your_url}/on-intent | POST | Called when the dialog FSM triggers a hook action. Required. |
| {your_url}/on-call-start | POST | Called when a call starts. Optional. |
| {your_url}/on-call-end | POST | Called when a call ends. Optional. |

OnIntent

The primary hook. Called when a dialog state machine triggers a call_hook action.

Request (POST from voicetyped)

{
  "session_id": "call-abc-123",
  "caller_id": "+15551234567",
  "called_number": "+18001234567",
  "dialog_name": "helpdesk",
  "current_state": "process_request",
  "transcript": "I need a password reset",
  "confidence": 0.94,
  "language": "en",
  "variables": {"caller_name": "John"},
  "payload": {},
  "timestamp": "2024-01-15T10:30:45Z",
  "sip_headers": {}
}

| Field | Type | Description |
| --- | --- | --- |
| session_id | string | Unique call session ID |
| caller_id | string | Caller phone number or SIP URI |
| called_number | string | Dialed number |
| dialog_name | string | Active dialog name |
| current_state | string | Current FSM state |
| transcript | string | Full transcript of caller's speech |
| confidence | float | ASR confidence score (0.0 to 1.0) |
| language | string | Detected language code |
| variables | object | Variables set during the dialog |
| payload | object | Custom payload from call_hook action |
| timestamp | string | ISO 8601 timestamp |
| sip_headers | object | SIP headers from the call |
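One use of the `confidence` field is to avoid acting on a transcript the recognizer was unsure about. The sketch below asks the caller to repeat themselves when the score is low or the transcript is empty. The 0.6 threshold and the function names are assumptions chosen for illustration, not values defined by voicetyped.

```python
def needs_reprompt(event, min_confidence=0.6):
    """Return True when the transcript is empty or the ASR score is too low.

    min_confidence is a hypothetical threshold; tune it for your own traffic.
    """
    transcript = (event.get("transcript") or "").strip()
    confidence = float(event.get("confidence", 0.0))
    return not transcript or confidence < min_confidence


def reprompt_action():
    """A response that only speaks, leaving the FSM on its normal transitions."""
    return {"response_text": "Sorry, I didn't catch that. Could you repeat it?"}


clear = {"transcript": "I need a password reset", "confidence": 0.94}
noisy = {"transcript": "pass... set", "confidence": 0.31}
```

A handler could call `needs_reprompt(event)` first and return `reprompt_action()` before running any business logic.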

Response (your server returns)

{
  "response_text": "I've created a password reset ticket for you.",
  "next_state": "",
  "variables": {"ticket_id": "TK-45678", "issue_type": "password_reset"},
  "metadata": {},
  "send_dtmf": "",
  "transfer_to": "",
  "hangup": false
}

| Field | Type | Description |
| --- | --- | --- |
| response_text | string | Text played as TTS to the caller |
| next_state | string | Force a state transition. If empty, the dialog FSM uses its normal transitions. |
| variables | object | Variables to set in the dialog context |
| metadata | object | Metadata to attach to the call session |
| send_dtmf | string | DTMF digits to send (optional) |
| transfer_to | string | SIP URI or number to transfer the call to (optional) |
| hangup | bool | End the call (optional) |
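A small builder can keep responses consistent across handlers. This sketch leaves out any optional field that is empty, which mirrors the `omitempty` tags in the Go example further down. Whether voicetyped needs every key present is not stated here, so treat the omission as an assumption. `build_action` is a hypothetical helper name.

```python
def build_action(response_text, next_state="", variables=None, metadata=None,
                 send_dtmf="", transfer_to="", hangup=False):
    """Build an OnIntent response dict, dropping optional fields that are unset."""
    action = {"response_text": response_text}
    optional = {
        "next_state": next_state,
        "variables": variables,
        "metadata": metadata,
        "send_dtmf": send_dtmf,
        "transfer_to": transfer_to,
        "hangup": hangup,
    }
    for key, value in optional.items():
        if value:  # skips "", None, {}, and False
            action[key] = value
    return action


ticket_reply = build_action(
    "I've created a password reset ticket for you.",
    variables={"ticket_id": "TK-45678", "issue_type": "password_reset"},
)
```

With a web framework, the returned dict can be passed straight to the JSON serializer, for example `jsonify(**ticket_reply)` in Flask.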

Implementation Examples

Go

package main

import (
	"encoding/json"
	"fmt"
	"log"
	"net/http"
	"strings"
)

type IntentEvent struct {
	SessionID    string            `json:"session_id"`
	CallerID     string            `json:"caller_id"`
	CalledNumber string            `json:"called_number"`
	DialogName   string            `json:"dialog_name"`
	CurrentState string            `json:"current_state"`
	Transcript   string            `json:"transcript"`
	Confidence   float64           `json:"confidence"`
	Language     string            `json:"language"`
	Variables    map[string]string `json:"variables"`
	Payload      json.RawMessage   `json:"payload"`
	Timestamp    string            `json:"timestamp"`
	SipHeaders   map[string]string `json:"sip_headers"`
}

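type DialogAction struct {
	ResponseText string            `json:"response_text"`
	NextState    string            `json:"next_state,omitempty"`
	Variables    map[string]string `json:"variables,omitempty"`
	Metadata     map[string]string `json:"metadata,omitempty"`
	SendDTMF     string            `json:"send_dtmf,omitempty"`
	TransferTo   string            `json:"transfer_to,omitempty"`
	Hangup       bool              `json:"hangup,omitempty"`
}

// CallStartEvent is the /on-call-start request body.
type CallStartEvent struct {
	SessionID    string            `json:"session_id"`
	CallerID     string            `json:"caller_id"`
	CalledNumber string            `json:"called_number"`
	DialogName   string            `json:"dialog_name"`
	SipHeaders   map[string]string `json:"sip_headers"`
	Timestamp    string            `json:"timestamp"`
}

// CallStartAction is the /on-call-start response body.
type CallStartAction struct {
	Variables      map[string]string `json:"variables,omitempty"`
	OverrideDialog string            `json:"override_dialog,omitempty"`
	RejectCall     bool              `json:"reject_call,omitempty"`
	RejectReason   string            `json:"reject_reason,omitempty"`
}

// CallEndEvent is the /on-call-end request body.
type CallEndEvent struct {
	SessionID        string            `json:"session_id"`
	CallerID         string            `json:"caller_id"`
	DialogName       string            `json:"dialog_name"`
	FinalState       string            `json:"final_state"`
	DurationSeconds  int               `json:"duration_seconds"`
	Reason           string            `json:"reason"`
	Variables        map[string]string `json:"variables"`
	StateTransitions int               `json:"state_transitions"`
}

// createTicket, getTicketStatus, lookupCaller, and logCallRecord are
// placeholders for your own business logic; define them before building.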

func onIntent(w http.ResponseWriter, r *http.Request) {
	var event IntentEvent
	if err := json.NewDecoder(r.Body).Decode(&event); err != nil {
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}

	log.Printf("OnIntent: session=%s state=%s transcript=%q",
		event.SessionID, event.CurrentState, event.Transcript)

	transcript := strings.ToLower(event.Transcript)
	var action DialogAction

	switch {
	case strings.Contains(transcript, "password reset"):
		ticketID, err := createTicket(event.CallerID, "password_reset")
		if err != nil {
			action = DialogAction{
				ResponseText: "I'm sorry, I couldn't create your ticket. " +
					"Let me transfer you to an agent.",
				TransferTo: "sip:support@pbx.internal",
			}
		} else {
			action = DialogAction{
				ResponseText: fmt.Sprintf(
					"I've created a password reset ticket for you. "+
						"Your ticket number is %s. "+
						"You should receive an email shortly.", ticketID),
				Variables: map[string]string{
					"ticket_id":  ticketID,
					"issue_type": "password_reset",
				},
			}
		}

	case strings.Contains(transcript, "check status"):
		ticketID := event.Variables["ticket_id"]
		if ticketID == "" {
			action = DialogAction{
				ResponseText: "I don't have a ticket number. " +
					"Could you please provide your ticket number?",
				NextState: "collect_ticket_number",
			}
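		} else {
			status, err := getTicketStatus(ticketID)
			if err != nil {
				// Don't read out an empty status; hand the caller to an agent.
				action = DialogAction{
					ResponseText: "I couldn't look up that ticket. " +
						"Let me transfer you to an agent.",
					TransferTo: "sip:support@pbx.internal",
				}
			} else {
				action = DialogAction{
					ResponseText: fmt.Sprintf(
						"Your ticket %s is currently %s.", ticketID, status),
				}
			}
		}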

	default:
		action = DialogAction{
			ResponseText: "I understand you need help. " +
				"Could you describe your issue in a few words?",
		}
	}

	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(action)
}

func onCallStart(w http.ResponseWriter, r *http.Request) {
	var event CallStartEvent
	if err := json.NewDecoder(r.Body).Decode(&event); err != nil {
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}
	log.Printf("Call started: %s from %s", event.SessionID, event.CallerID)

	name, _ := lookupCaller(event.CallerID)

	w.Header().Set("Content-Type", "application/json")
	json.NewEncoder(w).Encode(CallStartAction{
		Variables: map[string]string{"caller_name": name},
	})
}

func onCallEnd(w http.ResponseWriter, r *http.Request) {
	var event CallEndEvent
	if err := json.NewDecoder(r.Body).Decode(&event); err != nil {
		http.Error(w, "bad request", http.StatusBadRequest)
		return
	}
	log.Printf("Call ended: %s (duration: %ds, reason: %s)",
		event.SessionID, event.DurationSeconds, event.Reason)

	logCallRecord(event)
	w.WriteHeader(http.StatusOK)
}

func main() {
	http.HandleFunc("/on-intent", onIntent)
	http.HandleFunc("/on-call-start", onCallStart)
	http.HandleFunc("/on-call-end", onCallEnd)

	log.Printf("Dialog hooks server listening on :8080")
	log.Fatal(http.ListenAndServe(":8080", nil))
}

Python

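# create_ticket, lookup_caller, and log_call_record are placeholders for your
# own business logic; define or import them before running this server.
from flask import Flask, request, jsonify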

app = Flask(__name__)

@app.route("/on-intent", methods=["POST"])
def on_intent():
    event = request.get_json()
    transcript = event["transcript"].lower()

    if "password" in transcript:
        ticket_id = create_ticket(event["caller_id"], "password_reset")
        return jsonify(
            response_text=f"Ticket {ticket_id} created for password reset.",
            variables={"ticket_id": ticket_id},
        )

    if "transfer" in transcript:
        return jsonify(
            response_text="Transferring you now.",
            transfer_to="sip:support@pbx.internal",
        )

    return jsonify(response_text="How can I help you today?")

@app.route("/on-call-start", methods=["POST"])
def on_call_start():
    event = request.get_json()
    caller_name = lookup_caller(event["caller_id"])
    return jsonify(variables={"caller_name": caller_name})

@app.route("/on-call-end", methods=["POST"])
def on_call_end():
    event = request.get_json()
    log_call_record(event)
    return "", 200

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8080)

OnCallStart

Optional webhook called when a new call is connected. Use it to look up caller information, set initial variables, or log the call start.

Request

{
  "session_id": "call-abc-123",
  "caller_id": "+15551234567",
  "called_number": "+18001234567",
  "dialog_name": "helpdesk",
  "sip_headers": {},
  "timestamp": "2024-01-15T10:30:00Z"
}

Response

{
  "variables": {"caller_name": "John Smith"},
  "override_dialog": "",
  "reject_call": false,
  "reject_reason": ""
}

| Field | Type | Description |
| --- | --- | --- |
| variables | object | Initial variables to set in the dialog context |
| override_dialog | string | Use a different dialog for this call (optional) |
| reject_call | bool | Reject the call (optional) |
| reject_reason | string | Reason for rejection, logged for diagnostics |
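As a sketch of how these fields might be used together, the function below rejects callers on a blocklist and otherwise seeds the dialog with a caller name. The `directory` and `blocked` arguments are hypothetical in-memory stand-ins for whatever CRM or database lookup you actually use.

```python
def call_start_response(event, directory, blocked):
    """Build an OnCallStart response from a caller directory and a blocklist.

    directory: dict mapping caller_id to a display name (hypothetical).
    blocked:   set of caller_ids that should be rejected (hypothetical).
    """
    caller = event["caller_id"]
    if caller in blocked:
        return {
            "reject_call": True,
            "reject_reason": "caller is on the blocklist",
        }
    variables = {}
    name = directory.get(caller)
    if name:
        variables["caller_name"] = name
    return {"variables": variables}


directory = {"+15551234567": "John Smith"}
blocked = {"+15550000000"}
```

Unknown callers get an empty `variables` object, so the dialog starts without a `caller_name` rather than with a blank one.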

OnCallEnd

Optional webhook called when a call terminates. Use it for analytics, logging, or cleanup. Return an empty 200 OK response.

Request

{
  "session_id": "call-abc-123",
  "caller_id": "+15551234567",
  "dialog_name": "helpdesk",
  "final_state": "goodbye",
  "duration_seconds": 142,
  "reason": "caller_hangup",
  "variables": {"ticket_id": "TK-45678", "issue_type": "password_reset"},
  "state_transitions": 5
}

Response

Return an empty 200 OK. No response body is required.
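For analytics, it can help to flatten the event into a fixed-shape record before storing it. The sketch below is one possible shape. The `resolved` flag (true when a `ticket_id` variable was set) is an invented metric for illustration, not something voicetyped defines.

```python
def call_record(event):
    """Flatten an OnCallEnd event into a record suitable for logging or storage."""
    variables = event.get("variables") or {}
    return {
        "session_id": event["session_id"],
        "dialog": event.get("dialog_name", ""),
        "final_state": event.get("final_state", ""),
        "duration_seconds": int(event.get("duration_seconds", 0)),
        "reason": event.get("reason", "unknown"),
        # Hypothetical metric: the call counts as resolved if a ticket exists.
        "resolved": bool(variables.get("ticket_id")),
    }


sample_end_event = {
    "session_id": "call-abc-123",
    "caller_id": "+15551234567",
    "dialog_name": "helpdesk",
    "final_state": "goodbye",
    "duration_seconds": 142,
    "reason": "caller_hangup",
    "variables": {"ticket_id": "TK-45678", "issue_type": "password_reset"},
    "state_transitions": 5,
}
```

The handler can write `call_record(event)` to its log or warehouse and then return the empty 200 response.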

Registration

Register your webhook in the voicetyped configuration:

integration:
  services:
    dialog_hooks:
      type: http
      url: https://hook-server.internal:50051/on-intent
      method: POST
      headers:
        Authorization: "Bearer ${WEBHOOK_TOKEN}"
      timeout: 5s
      retry:
        max_attempts: 2
        initial_backoff: 100ms
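On the receiving side, the `Authorization` header configured above can be checked before any other work is done. This is a minimal sketch that assumes the header arrives exactly as configured (`Bearer <token>`). `is_authorized` is a hypothetical helper, and `hmac.compare_digest` is used so the comparison takes constant time.

```python
import hmac


def is_authorized(authorization_header, expected_token):
    """Return True only for a 'Bearer <token>' header matching expected_token."""
    if not authorization_header or not expected_token:
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(token.strip().encode(), expected_token.encode())
```

In a Flask handler this could be called as `is_authorized(request.headers.get("Authorization"), os.environ["WEBHOOK_TOKEN"])`, returning a 401 when it is false.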

Then reference it in your dialog YAML:

states:
  process:
    on_enter:
      - action: call_hook
        service: dialog_hooks
        method: OnIntent
        payload:
          transcript: "{{ .Event.Transcript }}"

Best Practices

  1. Keep hooks fast — target < 200ms response time. The caller is waiting in silence while voicetyped waits for your webhook response.
  2. Return proper status codes — return 200 for success. voicetyped treats 4xx and 5xx responses as errors and will retry or fall back based on your retry configuration.
  3. Handle errors gracefully — always return a JSON DialogAction even on internal errors. Use response_text to inform the caller and transfer_to as a fallback.
  4. Authenticate requests — use the Authorization header (configured in registration) to verify that requests originate from your voicetyped instance. Reject requests without a valid token.
  5. Use variables — store context in dialog variables rather than external state. Variables survive state transitions and are available in templates.
  6. Set Content-Type — always return Content-Type: application/json. voicetyped will reject responses with other content types.
  7. Log everything — use the /on-call-end webhook to record call analytics in your data warehouse.
  8. Test with the Speech API — use the Speech API to simulate calls without SIP infrastructure.
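Graceful error handling (item 3) can be sketched as a decorator that turns any unhandled exception into a spoken apology plus a transfer. The transfer target reuses the SIP URI from the examples above, and `safe_hook` is a hypothetical name. The decorated `on_intent` here is a deliberately fragile stand-in for real logic.

```python
import functools
import logging

FALLBACK_TRANSFER = "sip:support@pbx.internal"  # same URI as the examples above


def safe_hook(handler):
    """Wrap a handler so it always returns a response dict, even on errors."""

    @functools.wraps(handler)
    def wrapped(event):
        try:
            return handler(event)
        except Exception:
            logging.exception("hook failed for session %s", event.get("session_id"))
            return {
                "response_text": "I'm sorry, something went wrong. "
                                 "Let me transfer you to an agent.",
                "transfer_to": FALLBACK_TRANSFER,
            }

    return wrapped


@safe_hook
def on_intent(event):
    # Raises KeyError when ticket_id is missing, which triggers the fallback.
    return {"response_text": "Ticket " + event["ticket_id"] + " is open."}
```

The wrapper logs the full traceback, so the failure stays visible to operators while the caller hears a response instead of silence.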

Next Steps