Release Notes: v0.9.0

Release Date: 2026-06-15

Highlights

  • RealtimeProvider Registry: Native voice-to-voice APIs with ~100-300ms latency
  • OpenAI Realtime & Gemini Live: Two providers registered out of the box

New Features

Realtime Provider Registry

Get native voice-to-voice providers by name at runtime:

import (
    "fmt"
    "log"
    "os"

    "github.com/plexusone/omnivoice"
    _ "github.com/plexusone/omnivoice/providers/all" // registers all providers
)

// OpenAI Realtime API (~100ms latency)
openaiRT, err := omnivoice.GetRealtimeProvider("openai-realtime",
    omnivoice.WithAPIKey(os.Getenv("OPENAI_API_KEY")))
if err != nil {
    log.Fatal(err)
}

// Google Gemini Live API (~200ms latency)
geminiRT, err := omnivoice.GetRealtimeProvider("gemini-live",
    omnivoice.WithAPIKey(os.Getenv("GOOGLE_API_KEY")))
if err != nil {
    log.Fatal(err)
}

// List registered providers
fmt.Println(omnivoice.ListRealtimeProviders()) // [openai-realtime gemini-live]

Type Aliases

New types re-exported from omnivoice-core:

Type                  Description
--------------------  ---------------------------------------
RealtimeProvider      Interface for voice-to-voice providers
ProcessConfig         Configuration for realtime sessions
FunctionDeclaration   Function the model can call
RealtimeAudioChunk    Audio data from the model
RealtimeTranscript    Transcript update during conversation
RealtimeClient        Multi-provider client with failover

Registry Functions

Function                   Description
-------------------------  -----------------------------
RegisterRealtimeProvider   Register a provider factory
GetRealtimeProvider        Create provider by name
ListRealtimeProviders      List registered providers
HasRealtimeProvider        Check if provider exists
NewRealtimeClient          Create multi-provider client
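
To illustrate the pattern behind these functions, here is a minimal, self-contained sketch of a name-to-factory registry. It does not import omnivoice and makes no claim about how the library is implemented: the Provider interface, the Factory signature, and the Registry type are assumptions made for the example.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Provider is a hypothetical stand-in for a realtime provider.
type Provider interface {
	Name() string
}

// Factory builds a Provider from an API key (assumed signature).
type Factory func(apiKey string) (Provider, error)

// Registry maps provider names to factories and is safe for concurrent use.
type Registry struct {
	mu        sync.RWMutex
	factories map[string]Factory
}

func NewRegistry() *Registry {
	return &Registry{factories: make(map[string]Factory)}
}

// Register adds or replaces the factory for name.
func (r *Registry) Register(name string, f Factory) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.factories[name] = f
}

// Has reports whether a factory is registered under name.
func (r *Registry) Has(name string) bool {
	r.mu.RLock()
	defer r.mu.RUnlock()
	_, ok := r.factories[name]
	return ok
}

// List returns the registered names in sorted order.
func (r *Registry) List() []string {
	r.mu.RLock()
	defer r.mu.RUnlock()
	names := make([]string, 0, len(r.factories))
	for name := range r.factories {
		names = append(names, name)
	}
	sort.Strings(names)
	return names
}

// Get looks up the factory for name and uses it to create a Provider.
func (r *Registry) Get(name, apiKey string) (Provider, error) {
	r.mu.RLock()
	f, ok := r.factories[name]
	r.mu.RUnlock()
	if !ok {
		return nil, fmt.Errorf("realtime provider %q is not registered", name)
	}
	return f(apiKey)
}

// stubProvider is a placeholder implementation used only in this example.
type stubProvider struct{ name string }

func (s stubProvider) Name() string { return s.name }

func main() {
	reg := NewRegistry()
	for _, n := range []string{"openai-realtime", "gemini-live"} {
		name := n // capture a per-iteration copy for the closure
		reg.Register(name, func(string) (Provider, error) {
			return stubProvider{name: name}, nil
		})
	}
	fmt.Println(reg.List())
	if _, err := reg.Get("unknown", ""); err != nil {
		fmt.Println("error:", err)
	}
}
```

In a design like this, a blank import of a providers package would typically call Register from an init function, which is why importing the package is enough to make its providers available by name.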

Voice Architecture Comparison

Approach                               Latency      Use Case
-------------------------------------  -----------  ----------------------------------
Native Voice-to-Voice                  100-300ms    Low latency, natural conversation
Traditional Pipeline (STT->LLM->TTS)   500-1500ms   Custom voices, domain-specific STT

When to Use Native Voice-to-Voice

  • Real-time voice agents requiring <500ms response time
  • Natural conversational experiences
  • Applications where latency is critical

When to Use Traditional Pipeline

  • Custom voice synthesis (ElevenLabs voices)
  • Domain-specific STT models (medical, legal)
  • Multi-provider flexibility

Dependencies

Package          Change
---------------  ------------------------------------------
omni-twilio      v0.5.0 -> v0.6.0 (migrated from twilio-go)
omni-openai      v0.2.2 -> v0.3.0 (Realtime API support)
omni-google      Added v0.5.0 (Gemini Live API)
omnivoice-core   v0.9.0 -> v0.13.0 (Realtime package)

Installation

go get github.com/plexusone/omnivoice@v0.9.0

Migration Guide

From v0.8.0

No breaking changes. New features are additive:

  1. Realtime providers: The new realtime types and registry functions are exported from github.com/plexusone/omnivoice
  2. Provider registration: Keep the blank import _ "github.com/plexusone/omnivoice/providers/all"; it now also registers the new realtime providers

Quick Start

package main

import (
    "context"
    "os"

    "github.com/plexusone/omnivoice"
    _ "github.com/plexusone/omnivoice/providers/all"
)

func main() {
    ctx := context.Background()

    // Get OpenAI Realtime provider
    rt, err := omnivoice.GetRealtimeProvider("openai-realtime",
        omnivoice.WithAPIKey(os.Getenv("OPENAI_API_KEY")))
    if err != nil {
        panic(err)
    }

    // Start a realtime session
    session, err := rt.Connect(ctx, omnivoice.ProcessConfig{
        Instructions: "You are a helpful voice assistant.",
    })
    if err != nil {
        panic(err)
    }
    defer session.Close()

    // Process audio...
}

Full Changelog

See CHANGELOG.md for the complete list of changes.