Release Notes: v0.9.0

Release Date: 2026-05-02

Highlights

Canonical Transcript Format: Standardized JSON output for STT transcription results
DurationMilliseconds Type: Durations serialize as integer milliseconds for JSON interoperability
Embedded JSON Schema: Validate transcripts with the embedded JSON Schema v1

New Features

Canonical Transcript Format

The stt package now includes a canonical Transcript type for consistent output across all STT providers:

import "github.com/plexusone/omnivoice-core/stt"

// Convert transcription result to canonical format
transcript := stt.NewTranscript(result, "deepgram", "nova-2", "audio.mp3", config)

// Save as JSON
err := transcript.SaveJSON("output.transcript.json")

// Load from JSON
loaded, err := stt.LoadTranscript("output.transcript.json")

Transcript Structure

The transcript format includes:

Field	Description
`$schema`	JSON Schema URL for validation
`version`	Format version (currently "1.0")
`text`	Complete transcription text
`language`	BCP 47 language tag (e.g., "en-US")
`duration_ms`	Audio duration in milliseconds
`segments`	Array of transcript segments
`metadata`	Provider, model, and options used

Segment and Word Timing

// Access segment timing
for _, seg := range transcript.Segments {
    fmt.Printf("Segment: %s (%.1fs - %.1fs)\n",
        seg.Text,
        seg.Start.Duration().Seconds(),
        seg.End.Duration().Seconds())

    // Word-level timing (if enabled)
    for _, word := range seg.Words {
        fmt.Printf("  %s: %dms\n", word.Text, word.Start.Duration().Milliseconds())
    }
}
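Segment timing maps naturally onto subtitle formats. The following is a minimal sketch, assuming a hypothetical `segment` struct that stands in for `stt.TranscriptSegment` with its timings already converted to `time.Duration`. The helpers `srtTimestamp` and `srtCue` are illustrative names and not part of the library.

```go
package main

import (
	"fmt"
	"time"
)

// segment is a hypothetical stand-in for stt.TranscriptSegment. Only the
// fields needed here are modeled, with timings held as time.Duration.
type segment struct {
	Text  string
	Start time.Duration
	End   time.Duration
}

// srtTimestamp formats an offset as HH:MM:SS,mmm, the layout used by SubRip.
func srtTimestamp(d time.Duration) string {
	ms := d.Milliseconds()
	h := ms / 3_600_000
	m := ms % 3_600_000 / 60_000
	s := ms % 60_000 / 1_000
	return fmt.Sprintf("%02d:%02d:%02d,%03d", h, m, s, ms%1_000)
}

// srtCue renders one numbered subtitle cue for a segment.
func srtCue(index int, seg segment) string {
	return fmt.Sprintf("%d\n%s --> %s\n%s\n",
		index, srtTimestamp(seg.Start), srtTimestamp(seg.End), seg.Text)
}

func main() {
	seg := segment{Text: "Hello world", Start: 0, End: 2500 * time.Millisecond}
	fmt.Print(srtCue(1, seg))
}
```

With the real types, the same helpers could be fed `seg.Start.Duration()` and `seg.End.Duration()`.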

DurationMilliseconds Type

All duration fields use duration.DurationMilliseconds from github.com/grokify/mogo:

Go semantics: Full time.Duration functionality via .Duration() method
JSON interop: Serializes as integer milliseconds (not nanoseconds)
Type safety: Distinct type prevents mixing with raw integers

import (
    "encoding/json"
    "time"

    "github.com/grokify/mogo/time/duration"
)

// Create from time.Duration
d := duration.FromDuration(5 * time.Second)

// Create from milliseconds (same value as above)
d = duration.FromMilliseconds(5000)

// Access as time.Duration
td := d.Duration()

// JSON serialization: an integer number of milliseconds
data, _ := json.Marshal(d) // -> 5000
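To illustrate why a dedicated type helps: a plain `time.Duration` marshals as integer nanoseconds, so five seconds becomes `5000000000`. The sketch below shows one way a millisecond-serializing wrapper could be built with only the standard library. The `Millis` type is hypothetical and is not the mogo implementation, which may differ in detail.

```go
package main

import (
	"encoding/json"
	"fmt"
	"time"
)

// Millis is a hypothetical illustration type, not the mogo implementation.
// It wraps time.Duration and serializes as integer milliseconds.
type Millis time.Duration

// Duration returns the value as a standard time.Duration.
func (m Millis) Duration() time.Duration { return time.Duration(m) }

// MarshalJSON writes the duration as a JSON integer of milliseconds.
func (m Millis) MarshalJSON() ([]byte, error) {
	return json.Marshal(time.Duration(m).Milliseconds())
}

// UnmarshalJSON reads a JSON integer of milliseconds.
func (m *Millis) UnmarshalJSON(data []byte) error {
	var ms int64
	if err := json.Unmarshal(data, &ms); err != nil {
		return err
	}
	*m = Millis(time.Duration(ms) * time.Millisecond)
	return nil
}

func main() {
	data, err := json.Marshal(Millis(5 * time.Second))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(data)) // 5000

	var back Millis
	if err := json.Unmarshal([]byte("2500"), &back); err != nil {
		panic(err)
	}
	fmt.Println(back.Duration()) // 2.5s
}
```

Because `Millis` is a distinct named type, passing a bare `int64` variable where a `Millis` is expected fails to compile, which is the type-safety property listed above.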

Embedded JSON Schema

The schema package provides an embedded JSON Schema for validation:

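import "github.com/plexusone/omnivoice-core/schema"

// Get the transcript schema
schemaJSON := schema.TranscriptV1Schema

// Use with any JSON Schema validator. Here jsonschema stands in for the
// validator package of your choice (its import is not shown), and the
// exact compile call depends on that library.
validator := jsonschema.MustCompile(schemaJSON)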

JSON Format

Example transcript JSON:

{
  "$schema": "https://omnivoice.dev/schema/transcript-v1.json",
  "version": "1.0",
  "text": "Hello world",
  "language": "en-US",
  "duration_ms": 5000,
  "segments": [
    {
      "text": "Hello world",
      "start_ms": 0,
      "end_ms": 2500,
      "speaker": "speaker_1",
      "confidence": 0.98,
      "words": [
        {
          "text": "Hello",
          "start_ms": 0,
          "end_ms": 1000,
          "confidence": 0.99
        },
        {
          "text": "world",
          "start_ms": 1200,
          "end_ms": 2500,
          "confidence": 0.97
        }
      ]
    }
  ],
  "metadata": {
    "provider": "deepgram",
    "model": "nova-2",
    "created_at": "2026-05-02T12:00:00Z",
    "audio_file": "audio.mp3",
    "options": {
      "enable_punctuation": true,
      "enable_word_timestamps": true,
      "enable_speaker_diarization": true
    }
  }
}
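Because the format is plain JSON, consumers that do not import the library can still read it. The following is a rough sketch using only `encoding/json`. The `transcript`, `segment`, and `word` structs and the `parseTranscript` helper are hypothetical stand-ins that mirror a subset of the keys above, not the library's own types.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// word, segment, and transcript are hypothetical stand-ins that mirror a
// subset of the JSON keys; they are not the library's stt types.
type word struct {
	Text    string `json:"text"`
	StartMs int64  `json:"start_ms"`
	EndMs   int64  `json:"end_ms"`
}

type segment struct {
	Text    string `json:"text"`
	StartMs int64  `json:"start_ms"`
	EndMs   int64  `json:"end_ms"`
	Speaker string `json:"speaker"`
	Words   []word `json:"words"`
}

type transcript struct {
	Version    string    `json:"version"`
	Text       string    `json:"text"`
	Language   string    `json:"language"`
	DurationMs int64     `json:"duration_ms"`
	Segments   []segment `json:"segments"`
}

// parseTranscript decodes transcript JSON into the stand-in structs.
// Keys that are not modeled (such as metadata) are ignored by encoding/json.
func parseTranscript(data []byte) (transcript, error) {
	var t transcript
	err := json.Unmarshal(data, &t)
	return t, err
}

// sample is a trimmed copy of the example document.
const sample = `{
  "version": "1.0",
  "text": "Hello world",
  "language": "en-US",
  "duration_ms": 5000,
  "segments": [
    {"text": "Hello world", "start_ms": 0, "end_ms": 2500, "speaker": "speaker_1",
     "words": [
       {"text": "Hello", "start_ms": 0, "end_ms": 1000},
       {"text": "world", "start_ms": 1200, "end_ms": 2500}
     ]}
  ],
  "metadata": {"provider": "deepgram", "model": "nova-2"}
}`

func main() {
	t, err := parseTranscript([]byte(sample))
	if err != nil {
		panic(err)
	}
	fmt.Printf("%s: %d segment(s), %d ms\n", t.Language, len(t.Segments), t.DurationMs)
	for _, w := range t.Segments[0].Words {
		fmt.Printf("  %s: %d-%d ms\n", w.Text, w.StartMs, w.EndMs)
	}
}
```

Within Go code that already depends on the library, `stt.LoadTranscript` is the more direct route.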

API Reference

Types

Type	Description
`Transcript`	Complete transcription with metadata
`TranscriptSegment`	Segment (sentence/phrase) with timing
`TranscriptWord`	Word with timing and confidence
`TranscriptMetadata`	Provider and options provenance
`TranscriptOptions`	Transcription options record

Functions

Function	Description
`NewTranscript(result, provider, model, audioFile, config)`	Create Transcript from TranscriptionResult
`LoadTranscript(filePath)`	Load Transcript from JSON file
`(t *Transcript) ToJSON()`	Serialize to JSON bytes
`(t *Transcript) SaveJSON(filePath)`	Save to JSON file
`(t *Transcript) TotalDuration()`	Get duration as time.Duration
`(s *TranscriptSegment) SegmentDuration()`	Get segment duration
`(w *TranscriptWord) WordDuration()`	Get word duration

Constants

const TranscriptFormatVersion = "1.0"
const TranscriptSchemaURL = "https://omnivoice.dev/schema/transcript-v1.json"
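One possible use of these values is a cheap pre-check before processing a stored transcript. The sketch below redeclares the two constants locally so it compiles on its own. The `header` struct and `checkHeader` helper are hypothetical, and rejecting any other schema URL or version is a deliberately strict policy chosen for illustration, not something the library is known to enforce.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the documented constants, so the sketch is self-contained.
const (
	transcriptFormatVersion = "1.0"
	transcriptSchemaURL     = "https://omnivoice.dev/schema/transcript-v1.json"
)

// header captures only the two identifying keys of a transcript document.
type header struct {
	Schema  string `json:"$schema"`
	Version string `json:"version"`
}

// checkHeader is a hypothetical helper. It returns an error when the document
// does not declare the expected schema URL and format version.
func checkHeader(data []byte) error {
	var h header
	if err := json.Unmarshal(data, &h); err != nil {
		return fmt.Errorf("decode header: %w", err)
	}
	if h.Schema != transcriptSchemaURL {
		return fmt.Errorf("unexpected $schema %q", h.Schema)
	}
	if h.Version != transcriptFormatVersion {
		return fmt.Errorf("unsupported version %q", h.Version)
	}
	return nil
}

func main() {
	doc := []byte(`{"$schema": "https://omnivoice.dev/schema/transcript-v1.json", "version": "1.0"}`)
	if err := checkHeader(doc); err != nil {
		fmt.Println("rejected:", err)
		return
	}
	fmt.Println("header ok")
}
```

This only inspects two keys. Full structural validation still needs a JSON Schema validator run against the embedded schema.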

Installation

go get github.com/plexusone/omnivoice-core@v0.9.0

Migration Guide

From v0.8.0

No breaking changes. To use the new Transcript format:

Update the dependency:

go get github.com/plexusone/omnivoice-core@v0.9.0

Convert transcription results to canonical format:

transcript := stt.NewTranscript(result, "provider", "model", "file.mp3", config)

Use JSON serialization for storage or interop:

// Save
err := transcript.SaveJSON("transcript.json")

// Load
loaded, err := stt.LoadTranscript("transcript.json")

Full Changelog

See CHANGELOG.md for the complete list of changes.