Release Notes: v0.9.0
Release Date: 2026-05-02
Highlights
- Canonical Transcript Format: Standardized JSON output for STT transcription results
- DurationMilliseconds Type: Durations serialize as integer milliseconds for JSON interoperability
- Embedded JSON Schema: Validate transcripts with the embedded JSON Schema v1
New Features
Canonical Transcript Format
The stt package now includes a canonical Transcript type for consistent output across all STT providers:
import "github.com/plexusone/omnivoice-core/stt"
// Convert transcription result to canonical format
transcript := stt.NewTranscript(result, "deepgram", "nova-2", "audio.mp3", config)
// Save as JSON
err := transcript.SaveJSON("output.transcript.json")
// Load from JSON
loaded, err := stt.LoadTranscript("output.transcript.json")
Transcript Structure
The transcript format includes:

| Field | Description |
|---|---|
| `$schema` | JSON Schema URL for validation |
| `version` | Format version (currently "1.0") |
| `text` | Complete transcription text |
| `language` | BCP-47 language code (e.g., "en-US") |
| `duration_ms` | Audio duration in milliseconds |
| `segments` | Array of transcript segments |
| `metadata` | Provider, model, and options used |
Segment and Word Timing
// Access segment timing
for _, seg := range transcript.Segments {
	fmt.Printf("Segment: %s (%.1fs - %.1fs)\n",
		seg.Text,
		seg.Start.Duration().Seconds(),
		seg.End.Duration().Seconds())
	// Word-level timing (if enabled)
	for _, word := range seg.Words {
		fmt.Printf(" %s: %dms\n", word.Text, word.Start.Milliseconds())
	}
}
DurationMilliseconds Type
All duration fields use duration.DurationMilliseconds from github.com/grokify/mogo:
- Go semantics: Full time.Duration functionality via the .Duration() method
- JSON interop: Serializes as integer milliseconds (not nanoseconds)
- Type safety: Distinct type prevents mixing with raw integers
import (
	"encoding/json"
	"time"

	"github.com/grokify/mogo/time/duration"
)
// Create from time.Duration
d := duration.FromDuration(5 * time.Second)
// Create from milliseconds (assigns to the existing variable; a second := would not compile)
d = duration.FromMilliseconds(5000)
// Access as time.Duration
td := d.Duration()
// JSON serialization: the value is written as a bare integer
data, _ := json.Marshal(d) // -> 5000
Embedded JSON Schema
The schema package provides embedded JSON Schema for validation:
import "github.com/plexusone/omnivoice-core/schema"
// Get the transcript schema
schemaJSON := schema.TranscriptV1Schema
// Use with any JSON Schema validator
validator := jsonschema.MustCompile(schemaJSON)
JSON Format
Example transcript JSON:
{
  "$schema": "https://omnivoice.dev/schema/transcript-v1.json",
  "version": "1.0",
  "text": "Hello world",
  "language": "en-US",
  "duration_ms": 5000,
  "segments": [
    {
      "text": "Hello world",
      "start_ms": 0,
      "end_ms": 2500,
      "speaker": "speaker_1",
      "confidence": 0.98,
      "words": [
        {
          "text": "Hello",
          "start_ms": 0,
          "end_ms": 1000,
          "confidence": 0.99
        },
        {
          "text": "world",
          "start_ms": 1200,
          "end_ms": 2500,
          "confidence": 0.97
        }
      ]
    }
  ],
  "metadata": {
    "provider": "deepgram",
    "model": "nova-2",
    "created_at": "2026-05-02T12:00:00Z",
    "audio_file": "audio.mp3",
    "options": {
      "enable_punctuation": true,
      "enable_word_timestamps": true,
      "enable_speaker_diarization": true
    }
  }
}
API Reference
Types

| Type | Description |
|---|---|
| `Transcript` | Complete transcription with metadata |
| `TranscriptSegment` | Segment (sentence/phrase) with timing |
| `TranscriptWord` | Word with timing and confidence |
| `TranscriptMetadata` | Provider and options provenance |
| `TranscriptOptions` | Record of the transcription options used |
Functions

| Function | Description |
|---|---|
| `NewTranscript(result, provider, model, audioFile, config)` | Create a Transcript from a TranscriptionResult |
| `LoadTranscript(filePath)` | Load a Transcript from a JSON file |
| `(t *Transcript) ToJSON()` | Serialize to JSON bytes |
| `(t *Transcript) SaveJSON(filePath)` | Save to a JSON file |
| `(t *Transcript) TotalDuration()` | Get the total duration as time.Duration |
| `(s *TranscriptSegment) SegmentDuration()` | Get the segment duration |
| `(w *TranscriptWord) WordDuration()` | Get the word duration |
Constants
const TranscriptFormatVersion = "1.0"
const TranscriptSchemaURL = "https://omnivoice.dev/schema/transcript-v1.json"
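One way a consumer might use these two values is as a cheap pre-check before full schema validation. The sketch below is standard-library only and redeclares the constants locally so that it is self-contained. The `header` type and `checkHeader` function are hypothetical helpers written for this example, not part of the stt or schema packages.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Local copies of the documented constant values.
const (
	formatVersion = "1.0"
	schemaURL     = "https://omnivoice.dev/schema/transcript-v1.json"
)

// header captures only the two identifying top-level fields.
type header struct {
	Schema  string `json:"$schema"`
	Version string `json:"version"`
}

// checkHeader reports an error if the document does not declare the
// expected schema URL and format version. It does not validate the body.
func checkHeader(data []byte) error {
	var h header
	if err := json.Unmarshal(data, &h); err != nil {
		return fmt.Errorf("decode header: %w", err)
	}
	if h.Schema != schemaURL {
		return fmt.Errorf("unexpected $schema %q", h.Schema)
	}
	if h.Version != formatVersion {
		return fmt.Errorf("unsupported version %q", h.Version)
	}
	return nil
}

func main() {
	good := []byte(`{"$schema": "` + schemaURL + `", "version": "1.0", "text": "Hello world"}`)
	old := []byte(`{"$schema": "` + schemaURL + `", "version": "0.5"}`)

	fmt.Println("good:", checkHeader(good))
	fmt.Println("old: ", checkHeader(old))
}
```

A check like this only rejects files that identify themselves as something else. It is not a substitute for validating against the embedded schema.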
Installation
Migration Guide
From v0.8.0
No breaking changes. To use the new Transcript format:
- Update dependency:
- Convert transcription results to canonical format:
- Use JSON serialization for storage or interop:
// Save
err := transcript.SaveJSON("transcript.json")
// Load
loaded, err := stt.LoadTranscript("transcript.json")
Full Changelog
See CHANGELOG.md for the complete list of changes.