Release Notes: v0.10.0
Release Date: 2026-06-13
Highlights
- Voice Gateway Interface: Provider-agnostic interfaces for PSTN and WebRTC voice communication
- Session Persistence: Call state storage with in-memory and Redis backends
- Barge-in Detection: User interruption handling during agent speech
- WAV Codec Support: Audio format conversion for STT provider compatibility
New Features
Voice Gateway Interface
The new gateway package provides provider-agnostic interfaces for voice communication:
import "github.com/plexusone/omnivoice-core/gateway"

// Gateway handles PSTN phone calls (Twilio, Telnyx, Vonage, Plivo)
type Gateway interface {
    Name() ProviderName
    Start(ctx context.Context) error
    Stop() error
    OnCall(handler CallHandler)
    MakeCall(ctx context.Context, to string) (Session, error)
    GetSession(callID string) (Session, bool)
    ListSessions() []Session
}

// WebRTCGateway handles browser/mobile voice (LiveKit, Daily)
type WebRTCGateway interface {
    Name() ProviderName
    Start(ctx context.Context) error
    Stop() error
    OnParticipantJoined(handler ParticipantHandler)
    JoinRoom(ctx context.Context, roomName string) error
    LeaveRoom() error
    GetSession(participantID string) (WebRTCSession, bool)
    ListSessions() []WebRTCSession
    GenerateClientToken(roomName, identity, displayName string) (string, error)
}
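A provider implementing `GetSession` and `ListSessions` needs some concurrency-safe bookkeeping of live calls. The sketch below shows one possible shape for that, assuming nothing about the real package: `Session`, `fakeSession`, and `sessionRegistry` are hypothetical local stand-ins, not types from `gateway`.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Session is a hypothetical stand-in for gateway.Session; the real
// interface is not shown in these notes.
type Session interface {
	ID() string
}

// fakeSession is a trivial Session used only for illustration.
type fakeSession struct{ id string }

func (s fakeSession) ID() string { return s.id }

// sessionRegistry is a mutex-guarded map of active sessions keyed by call ID.
type sessionRegistry struct {
	mu       sync.RWMutex
	sessions map[string]Session
}

func newSessionRegistry() *sessionRegistry {
	return &sessionRegistry{sessions: make(map[string]Session)}
}

// Add registers a session under its ID, replacing any previous entry.
func (r *sessionRegistry) Add(s Session) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.sessions[s.ID()] = s
}

// Remove drops a session, for example when the call ends.
func (r *sessionRegistry) Remove(callID string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.sessions, callID)
}

// GetSession mirrors the shape of Gateway.GetSession.
func (r *sessionRegistry) GetSession(callID string) (Session, bool) {
	r.mu.RLock()
	defer r.mu.RUnlock()
	s, ok := r.sessions[callID]
	return s, ok
}

// ListSessions mirrors the shape of Gateway.ListSessions. The result is
// sorted by ID so callers see a deterministic order.
func (r *sessionRegistry) ListSessions() []Session {
	r.mu.RLock()
	defer r.mu.RUnlock()
	out := make([]Session, 0, len(r.sessions))
	for _, s := range r.sessions {
		out = append(out, s)
	}
	sort.Slice(out, func(i, j int) bool { return out[i].ID() < out[j].ID() })
	return out
}

func main() {
	reg := newSessionRegistry()
	reg.Add(fakeSession{id: "call-2"})
	reg.Add(fakeSession{id: "call-1"})
	if s, ok := reg.GetSession("call-1"); ok {
		fmt.Println("found", s.ID())
	}
	reg.Remove("call-1")
	fmt.Println("active sessions:", len(reg.ListSessions()))
}
```

A read-write mutex keeps lookups cheap while calls are being added and removed from other goroutines.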
Session Events
The gateway emits events for call lifecycle management:
| Event Type | Description |
|---|---|
| `SessionStarted` | Call connected |
| `SessionEnded` | Call terminated |
| `UserSpeechStart` | User began speaking |
| `UserSpeechEnd` | User stopped speaking |
| `UserTranscript` | STT transcription available |
| `AgentThinking` | LLM processing input |
| `AgentSpeechStart` | Agent began speaking |
| `AgentSpeechEnd` | Agent stopped speaking |
| `AgentTranscript` | Agent response text |
| `ToolCall` | Function/tool invocation |
| `Interruption` | User interrupted agent |
| `Error` | Error occurred |
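Consumers of these events will typically switch on the event type. The following is a minimal sketch of that pattern; `EventType`, `Event`, the string values, and the `Text` field are hypothetical local stand-ins, since these notes do not show the real `gateway.Event` definition.

```go
package main

import "fmt"

// EventType and Event are hypothetical stand-ins for the gateway types;
// the string values and the Text field are assumptions.
type EventType string

const (
	SessionStarted EventType = "session_started"
	UserTranscript EventType = "user_transcript"
	Interruption   EventType = "interruption"
	SessionEnded   EventType = "session_ended"
)

type Event struct {
	Type EventType
	Text string
}

// describe maps an event to a log line. Types without a dedicated case
// fall through to the default branch instead of being dropped silently.
func describe(e Event) string {
	switch e.Type {
	case SessionStarted:
		return "call connected"
	case UserTranscript:
		return "user said: " + e.Text
	case Interruption:
		return "user interrupted agent"
	case SessionEnded:
		return "call terminated"
	default:
		return "unhandled event: " + string(e.Type)
	}
}

func main() {
	events := []Event{
		{Type: SessionStarted},
		{Type: UserTranscript, Text: "hello"},
		{Type: Interruption},
		{Type: SessionEnded},
	}
	for _, e := range events {
		fmt.Println(describe(e))
	}
}
```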
Session Persistence
The storage package provides call state persistence for session recovery:
import (
    "time"

    "github.com/plexusone/omnivoice-core/storage"
)

// In-memory storage (default)
store := storage.NewMemoryStore()

// Redis storage (distributed), as an alternative to the in-memory store
store, err := storage.NewRedisStore("redis://localhost:6379",
    storage.WithTTL(24*time.Hour),
    storage.WithPrefix("omnivoice:"),
)

// Save session state (err is already declared above, so assign with =)
err = store.Save(ctx, &storage.SessionState{
    ID:        callSID,
    Provider:  "twilio",
    Direction: "inbound",
    From:      "+14155551234",
    To:        "+14155556789",
    Status:    storage.StatusActive,
    History:   turns,
    Metrics:   metrics,
})

// Recover active sessions after restart
sessions, err := store.ListActive(ctx)
for _, id := range sessions {
    state, _ := store.Load(ctx, id)
    // Reconnect to active call...
}
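To make the save, list, and load cycle concrete, here is a self-contained sketch of an in-memory store with the same method names. It is an illustration under assumptions, not the package's `MemoryStore`: the types are simplified local stand-ins, and `errNotFound` and `recoverDemo` are hypothetical names.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"sort"
	"sync"
)

// SessionStatus and SessionState are simplified, hypothetical stand-ins
// for the storage package types.
type SessionStatus string

const (
	StatusActive SessionStatus = "active"
	StatusEnded  SessionStatus = "ended"
)

type SessionState struct {
	ID     string
	Status SessionStatus
}

var errNotFound = errors.New("session not found")

// memoryStore keeps session state in a map guarded by a mutex.
type memoryStore struct {
	mu     sync.RWMutex
	states map[string]SessionState
}

func newMemoryStore() *memoryStore {
	return &memoryStore{states: make(map[string]SessionState)}
}

// Save stores a copy so later changes by the caller do not leak in.
func (m *memoryStore) Save(ctx context.Context, s *SessionState) error {
	if err := ctx.Err(); err != nil {
		return err
	}
	m.mu.Lock()
	defer m.mu.Unlock()
	m.states[s.ID] = *s
	return nil
}

// Load returns a copy of the stored state, or errNotFound.
func (m *memoryStore) Load(ctx context.Context, id string) (*SessionState, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	s, ok := m.states[id]
	if !ok {
		return nil, errNotFound
	}
	return &s, nil
}

// ListActive returns the sorted IDs of sessions whose status is active.
func (m *memoryStore) ListActive(ctx context.Context) ([]string, error) {
	m.mu.RLock()
	defer m.mu.RUnlock()
	var ids []string
	for id, s := range m.states {
		if s.Status == StatusActive {
			ids = append(ids, id)
		}
	}
	sort.Strings(ids)
	return ids, nil
}

// recoverDemo saves two sessions and returns the IDs a restart would recover.
func recoverDemo() ([]string, error) {
	ctx := context.Background()
	store := newMemoryStore()
	for _, s := range []SessionState{
		{ID: "call-1", Status: StatusActive},
		{ID: "call-2", Status: StatusEnded},
	} {
		if err := store.Save(ctx, &s); err != nil {
			return nil, err
		}
	}
	ids, err := store.ListActive(ctx)
	if err != nil {
		return nil, err
	}
	for _, id := range ids {
		if _, err := store.Load(ctx, id); err != nil {
			return nil, err
		}
	}
	return ids, nil
}

func main() {
	ids, err := recoverDemo()
	if err != nil {
		fmt.Println("recovery failed:", err)
		return
	}
	fmt.Println("sessions to recover:", ids)
}
```

Only the session marked active is returned for recovery; the ended one stays in the map but is skipped.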
SessionState Fields
| Field | Type | Description |
|---|---|---|
| `ID` | string | Unique session identifier |
| `CallID` | string | Provider-specific call ID |
| `Provider` | string | Voice gateway provider name |
| `Direction` | string | "inbound" or "outbound" |
| `From` | string | Caller phone number |
| `To` | string | Called phone number |
| `Status` | SessionStatus | pending, active, ended, failed |
| `History` | []Turn | Conversation transcript |
| `Metrics` | SessionMetrics | Performance metrics |
| `RecoveryData` | map[string]any | Provider-specific recovery data |
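A distributed backend has to serialize this state somehow. The notes do not say which encoding the Redis store uses, so the sketch below simply assumes JSON; `sessionState`, its tag names, `encodeState`, and `decodeState` are hypothetical and cover only a subset of the fields.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// sessionState is a hypothetical, trimmed-down mirror of the fields above;
// the JSON tag names are assumptions.
type sessionState struct {
	ID           string         `json:"id"`
	CallID       string         `json:"call_id"`
	Provider     string         `json:"provider"`
	Direction    string         `json:"direction"`
	Status       string         `json:"status"`
	RecoveryData map[string]any `json:"recovery_data,omitempty"`
}

// encodeState serializes the state for storage.
func encodeState(s sessionState) ([]byte, error) {
	return json.Marshal(s)
}

// decodeState restores a state previously produced by encodeState.
func decodeState(b []byte) (sessionState, error) {
	var s sessionState
	err := json.Unmarshal(b, &s)
	return s, err
}

func main() {
	in := sessionState{
		ID:        "session-1",
		CallID:    "call-1",
		Provider:  "twilio",
		Direction: "inbound",
		Status:    "active",
		RecoveryData: map[string]any{
			"stream_id": "stream-1",
			"seq":       42,
		},
	}
	raw, err := encodeState(in)
	if err != nil {
		fmt.Println("encode failed:", err)
		return
	}
	out, err := decodeState(raw)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%s %T\n", out.RecoveryData["stream_id"], out.RecoveryData["seq"])
}
```

With `encoding/json`, numbers inside a `map[string]any` come back as `float64`, so provider code reading recovery data after a round trip should not assume integer types.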
Barge-in Detection
The bargein package detects when users interrupt agent speech:
import (
    "log"

    "github.com/plexusone/omnivoice-core/bargein"
    "github.com/plexusone/omnivoice-core/gateway"
)

detector := bargein.New(bargein.Config{
    Mode:                bargein.ModeImmediate, // or ModeAfterSentence, ModeDisabled
    MinSpeechDurationMs: 200,                   // Minimum speech to trigger interruption
    SilenceThresholdMs:  500,                   // Silence before considering speech ended
})

// Connect to STT events and TTS pipeline
detector.AttachSTTEvents(sttEvents)
detector.AttachTTS(ttsPipeline)

// Handle interruptions
detector.OnInterrupt(func(event gateway.Event) {
    log.Println("User interrupted agent")
    // TTS automatically stopped
})

detector.Start(ctx)
Interruption Modes
| Mode | Behavior |
|---|---|
| `ModeImmediate` | Stop TTS immediately when user speaks |
| `ModeAfterSentence` | Wait for agent to finish current sentence |
| `ModeDisabled` | Never interrupt agent speech |
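The interplay between the mode and the minimum speech duration can be sketched as a small decision function. This is an illustration of the behavior described in the table, not the detector's actual logic; `Mode`, `Config`, `Action`, and `decide` are hypothetical local definitions.

```go
package main

import "fmt"

// Mode is a hypothetical local mirror of the interruption modes above.
type Mode int

const (
	ModeImmediate Mode = iota
	ModeAfterSentence
	ModeDisabled
)

// Config holds the two settings this sketch needs.
type Config struct {
	Mode                Mode
	MinSpeechDurationMs int
}

// Action describes what should happen to the agent's TTS output.
type Action int

const (
	Continue Action = iota
	StopNow
	StopAfterSentence
)

// decide returns the action for a user utterance of userSpeechMs
// milliseconds that arrives while the agent may be speaking.
func decide(cfg Config, agentSpeaking bool, userSpeechMs int) Action {
	// Nothing to interrupt, or interruptions are switched off.
	if !agentSpeaking || cfg.Mode == ModeDisabled {
		return Continue
	}
	// Speech shorter than the threshold is treated as noise.
	if userSpeechMs < cfg.MinSpeechDurationMs {
		return Continue
	}
	if cfg.Mode == ModeAfterSentence {
		return StopAfterSentence
	}
	return StopNow
}

func main() {
	cfg := Config{Mode: ModeImmediate, MinSpeechDurationMs: 200}
	fmt.Println(decide(cfg, true, 50) == Continue)
	fmt.Println(decide(cfg, true, 250) == StopNow)
}
```

Checking the duration threshold before the mode means a short cough or click is ignored in every mode, which is the role `MinSpeechDurationMs` plays in the configuration above.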
WAV Codec Support
The audio/codec package now includes WAV encoding for STT provider compatibility:
import "github.com/plexusone/omnivoice-core/audio/codec"
// Convert raw μ-law audio to WAV format
// Required for providers like OpenAI Whisper that need WAV containers
wavData := codec.MulawToWAV(mulawAudio)
// WAV format: RIFF header + 8kHz mono μ-law data
// Compatible with most audio processing tools
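For readers curious what such a container looks like on the wire, the sketch below wraps raw μ-law bytes in a RIFF/WAVE header. It is not the library's `MulawToWAV`: `wrapMulawWAV` is a hypothetical function, and the exact chunk layout (an 18-byte `fmt ` chunk plus a `fact` chunk, as commonly used for non-PCM formats) is an assumption about one valid encoding.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// wrapMulawWAV is a hypothetical sketch, not the library's MulawToWAV.
// It wraps raw 8 kHz mono μ-law samples in a RIFF/WAVE container.
func wrapMulawWAV(mulaw []byte) []byte {
	const (
		formatMulaw   = 7 // WAVE_FORMAT_MULAW
		channels      = 1
		sampleRate    = 8000
		bitsPerSample = 8
		blockAlign    = channels * bitsPerSample / 8
		byteRate      = sampleRate * blockAlign
	)
	dataLen := uint32(len(mulaw))
	pad := dataLen % 2 // chunks are padded to an even length
	// "WAVE" + fmt chunk (8+18) + fact chunk (8+4) + data chunk (8+data+pad)
	riffSize := 4 + (8 + 18) + (8 + 4) + (8 + dataLen + pad)

	le := binary.LittleEndian
	out := make([]byte, 0, 8+riffSize)
	out = append(out, "RIFF"...)
	out = le.AppendUint32(out, riffSize)
	out = append(out, "WAVE"...)

	out = append(out, "fmt "...)
	out = le.AppendUint32(out, 18) // non-PCM formats carry a cbSize field
	out = le.AppendUint16(out, formatMulaw)
	out = le.AppendUint16(out, channels)
	out = le.AppendUint32(out, sampleRate)
	out = le.AppendUint32(out, byteRate)
	out = le.AppendUint16(out, blockAlign)
	out = le.AppendUint16(out, bitsPerSample)
	out = le.AppendUint16(out, 0) // cbSize: no extra format bytes

	out = append(out, "fact"...)
	out = le.AppendUint32(out, 4)
	out = le.AppendUint32(out, dataLen) // one sample per byte for mono μ-law

	out = append(out, "data"...)
	out = le.AppendUint32(out, dataLen)
	out = append(out, mulaw...)
	if pad == 1 {
		out = append(out, 0)
	}
	return out
}

func main() {
	wav := wrapMulawWAV(make([]byte, 160)) // 20 ms of audio at 8 kHz
	fmt.Println("header bytes:", len(wav)-160)
	fmt.Println("format tag:", binary.LittleEndian.Uint16(wav[20:22]))
}
```

The audio samples are copied unchanged; only the header is added, so the conversion is cheap enough to run per utterance.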
API Reference
Gateway Package
| Type | Description |
|---|---|
| `Gateway` | PSTN voice gateway interface |
| `WebRTCGateway` | WebRTC voice gateway interface |
| `Session` | Active PSTN call session |
| `WebRTCSession` | Active WebRTC session |
| `Event` | Session lifecycle event |
| `EventType` | Event type constants |
| `Turn` | Conversation turn (user/agent) |
| `ToolCall` | LLM tool invocation |
| `Metrics` | Performance metrics |
| `CallInfo` | Incoming call information |
| `CallHandler` | Incoming call handler function |
Storage Package
| Type | Description |
|---|---|
| `SessionStore` | Storage interface |
| `MemoryStore` | In-process storage |
| `RedisStore` | Redis-backed storage |
| `SessionState` | Persisted session data |
| `SessionStatus` | Status constants |
| `Turn` | Conversation turn |
| `SessionMetrics` | Aggregated metrics |
Bargein Package
| Type | Description |
|---|---|
| `Detector` | Barge-in detector |
| `Config` | Detector configuration |
| `InterruptionMode` | Mode constants |
Codec Package (New)
| Function | Description |
|---|---|
| `MulawToWAV(data []byte) []byte` | Convert μ-law to WAV format |
Installation
Migration Guide
From v0.9.0
No breaking changes. New packages are additive:
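- Update the dependency to v0.10.0, for example with `go get github.com/plexusone/omnivoice-core@v0.10.0` (module path inferred from the import paths in these notes).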
- Import new packages as needed:
import (
    "github.com/plexusone/omnivoice-core/gateway"
    "github.com/plexusone/omnivoice-core/storage"
    "github.com/plexusone/omnivoice-core/bargein"
)
- The Twilio example has moved to the `omni-twilio` repository.
Dependencies
- Added `github.com/redis/go-redis/v9` for Redis session storage
Full Changelog
See CHANGELOG.md for the complete list of changes.