
Release Notes: v0.3.0

Release Date: 2026-06-14

Highlights

Added OpenAI Realtime API support for native voice-to-voice processing with ~100ms latency.

What's New

OpenAI Realtime API

The new omnivoice/realtime package provides native voice-to-voice capabilities via the OpenAI Realtime API, eliminating the need for separate STT and TTS steps:

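import (
    "context"
    "os"

    "github.com/plexusone/omni-openai/omnivoice/realtime"
)

ctx := context.Background()
provider := realtime.NewProvider(os.Getenv("OPENAI_API_KEY"),
    realtime.WithVoice("alloy"),
    realtime.WithInstructions("You are a helpful assistant."),
)

// Send PCM16 input chunks on audioIn; read response audio and transcripts from the returned channels.
audioIn := make(chan []byte, 100)
audioCh, transcriptCh, err := provider.ProcessAudioStream(ctx, audioIn, realtime.ProcessConfig{})
if err != nil {
    // handle the setup or connection error
}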
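
The audioIn channel above expects raw audio as []byte chunks. The following is a minimal sketch of one way to feed it, using only the standard library. The helper names chunkPCM and feed, the 20 ms chunk size, and the idea that closing the channel signals end of input are all assumptions for illustration, not part of the package's documented API.

```go
package main

import "fmt"

// chunkPCM splits a raw PCM byte buffer into chunks of at most chunkSize bytes.
// chunkSize should be even so that 16-bit samples are never split across chunks.
func chunkPCM(pcm []byte, chunkSize int) [][]byte {
	if chunkSize <= 0 {
		return nil
	}
	var chunks [][]byte
	for start := 0; start < len(pcm); start += chunkSize {
		end := start + chunkSize
		if end > len(pcm) {
			end = len(pcm)
		}
		chunks = append(chunks, pcm[start:end])
	}
	return chunks
}

// feed sends every chunk on audioIn, then closes the channel.
// Assumption: the consumer treats a closed channel as end of input.
func feed(audioIn chan<- []byte, pcm []byte, chunkSize int) {
	defer close(audioIn)
	for _, c := range chunkPCM(pcm, chunkSize) {
		audioIn <- c
	}
}

func main() {
	audioIn := make(chan []byte, 100) // same shape as the audioIn channel above
	pcm := make([]byte, 2400)         // 50 ms of silence at 24 kHz, 16-bit mono
	go feed(audioIn, pcm, 960)        // 960 bytes = 20 ms per chunk

	// Stand-in for the provider: drain the channel and report chunk sizes.
	for c := range audioIn {
		fmt.Println("chunk bytes:", len(c))
	}
}
```

In a real program the draining loop would be replaced by the provider, and feed would run in its own goroutine so that audio capture and response playback can proceed concurrently.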

Features:

  • Ultra-low latency - ~100ms voice-to-voice response time
  • Native audio - Model handles audio directly, no intermediate STT/TTS
  • WebSocket streaming - Bidirectional audio via wss://api.openai.com/v1/realtime
  • Voice activity detection - Server-side VAD with configurable thresholds
  • Function calling - Execute tools during a conversation via the OnFunctionCall callback
  • Multiple voices - alloy, echo, shimmer, ash, ballad, coral, sage, verse
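
For readers curious about what the WebSocket connection involves, here is a rough sketch of how the endpoint URL and handshake headers can be assembled with the standard library. The package handles this internally, and its actual implementation may differ. The model query parameter and the OpenAI-Beta header follow OpenAI's public documentation for the beta Realtime API; the function names realtimeURL and realtimeHeaders are hypothetical.

```go
package main

import (
	"fmt"
	"net/http"
	"net/url"
)

// realtimeURL builds the WebSocket endpoint URL for a given model name.
func realtimeURL(model string) string {
	u := url.URL{
		Scheme:   "wss",
		Host:     "api.openai.com",
		Path:     "/v1/realtime",
		RawQuery: url.Values{"model": {model}}.Encode(),
	}
	return u.String()
}

// realtimeHeaders returns the headers sent with the WebSocket handshake.
// Assumption: the beta API's "OpenAI-Beta: realtime=v1" header is required.
func realtimeHeaders(apiKey string) http.Header {
	h := http.Header{}
	h.Set("Authorization", "Bearer "+apiKey)
	h.Set("OpenAI-Beta", "realtime=v1")
	return h
}

func main() {
	fmt.Println(realtimeURL("gpt-4o-realtime-preview-2024-12-17"))
	// Print only non-secret headers; never log the Authorization value.
	fmt.Println(realtimeHeaders("sk-example").Get("OpenAI-Beta"))
}
```

A WebSocket client such as gorilla/websocket would take this URL and header set when dialing.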

Audio Format

| Property    | Value                               |
|-------------|-------------------------------------|
| Format      | PCM16 (signed 16-bit little-endian) |
| Sample Rate | 24 kHz                              |
| Channels    | Mono                                |
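
To make the format concrete, the sketch below converts between int16 samples and little-endian bytes and computes how many bytes a given duration occupies at these settings. The function and constant names are illustrative and do not come from the package.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const (
	sampleRate     = 24000 // samples per second
	bytesPerSample = 2     // 16-bit samples
	channels       = 1     // mono
)

// bytesForMillis returns the number of PCM16 bytes covering ms milliseconds of audio.
func bytesForMillis(ms int) int {
	return sampleRate * bytesPerSample * channels * ms / 1000
}

// encodePCM16 converts signed 16-bit samples to little-endian bytes.
func encodePCM16(samples []int16) []byte {
	out := make([]byte, len(samples)*bytesPerSample)
	for i, s := range samples {
		binary.LittleEndian.PutUint16(out[i*bytesPerSample:], uint16(s))
	}
	return out
}

// decodePCM16 converts little-endian bytes back to samples.
// A trailing odd byte, if any, is ignored.
func decodePCM16(b []byte) []int16 {
	out := make([]int16, len(b)/bytesPerSample)
	for i := range out {
		out[i] = int16(binary.LittleEndian.Uint16(b[i*bytesPerSample:]))
	}
	return out
}

func main() {
	fmt.Println(bytesForMillis(20))             // 960 bytes per 20 ms
	fmt.Println(encodePCM16([]int16{1, -2}))    // [1 0 254 255]
	fmt.Println(decodePCM16([]byte{1, 0, 254})) // [1], the odd trailing byte is dropped
}
```

One second of audio in this format is 48,000 bytes, which is useful when sizing buffers or channel capacities.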

Configuration Options

realtime.WithModel("gpt-4o-realtime-preview-2024-12-17")
realtime.WithVoice("alloy")
realtime.WithInstructions("System prompt here")
realtime.WithServerVAD()  // Enable voice activity detection
realtime.WithTools(...)   // Function calling
realtime.WithTemperature(0.8)
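
The With... calls above look like Go's functional options pattern. The following is a minimal, self-contained sketch of how such options typically work. The config struct, the Option type, and the default values here are hypothetical and are not the package's actual implementation or documented defaults.

```go
package main

import "fmt"

// config is a hypothetical settings struct for illustration only.
type config struct {
	model       string
	voice       string
	temperature float64
}

// Option mutates a config; each With... function returns one.
type Option func(*config)

func WithModel(m string) Option         { return func(c *config) { c.model = m } }
func WithVoice(v string) Option         { return func(c *config) { c.voice = v } }
func WithTemperature(t float64) Option { return func(c *config) { c.temperature = t } }

// newConfig starts from illustrative defaults and applies options in order,
// so a later option overrides an earlier one that sets the same field.
func newConfig(opts ...Option) config {
	c := config{
		model:       "gpt-4o-realtime-preview-2024-12-17",
		voice:       "alloy",
		temperature: 0.8,
	}
	for _, opt := range opts {
		opt(&c)
	}
	return c
}

func main() {
	c := newConfig(WithVoice("echo"), WithTemperature(0.6))
	fmt.Printf("%+v\n", c)
}
```

The appeal of this design is that new options can be added without changing the constructor's signature, and callers only mention the settings they want to override.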

Gateway Integration

The provider implements the realtime.Provider interface from omnivoice-core, so it plugs directly into the gateway:

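import (
    "os"

    "github.com/plexusone/omnivoice-core/gateway"
    corereal "github.com/plexusone/omnivoice-core/realtime" // assumed import path for the core realtime package
    openaiRealtime "github.com/plexusone/omni-openai/omnivoice/realtime"
)

provider := openaiRealtime.NewProvider(os.Getenv("OPENAI_API_KEY"))

// Use with RealtimeBridge
bridge := gateway.NewRealtimeBridgeForTwilio(provider, corereal.ProcessConfig{
    Instructions: "You are a helpful assistant.",
    Voice:        "alloy",
})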

Dependencies

| Package                             | Change           |
|-------------------------------------|------------------|
| github.com/gorilla/websocket        | Added v1.5.3     |
| github.com/plexusone/omnivoice-core | v0.9.0 → v0.12.1 |

Getting Started

go get github.com/plexusone/omni-openai@v0.3.0

See the Realtime API Guide for detailed usage.

Full Changelog

See CHANGELOG.md for the complete list of changes.