tts
stable
Text-to-speech preparation utilities: build SSML markup for TTS engines, split text into sentences, normalize abbreviations, estimate audio duration, and count phonemes.
use plugin tts::{ssml_wrap, ssml_emphasis, ssml_break, …}
Functions (16)
- ssml_wrap Wrap text in a `<speak>` root element
- ssml_emphasis Wrap text in an `<emphasis>` tag
- ssml_break Insert a timed `<break>` pause
- ssml_prosody Wrap text with rate/pitch/volume controls
- ssml_say_as Wrap text with an `interpret-as` hint
- ssml_sub Substitute a spoken alias for text
- ssml_voice Wrap text in a named `<voice>` tag
- ssml_lang Wrap text in a language tag
- ssml_phoneme Provide an explicit phoneme pronunciation
- ssml_audio Insert an `<audio>` element with a URL
- strip_ssml Strip all SSML tags from a string
- split_sentences Split text into a list of sentences
- estimate_duration_seconds Estimate spoken duration from word count
- normalize_text Expand common abbreviations for TTS
- phoneme_count_estimate Estimate phoneme count of text
- word_count Count whitespace-separated words
ssml_wrap Wrap text in a `<speak>` root element
Wraps text inside <speak>...</speak>, which is the required root element for SSML documents submitted to TTS engines like Google Cloud TTS or Amazon Polly.
use plugin tts::{ssml_wrap, ssml_emphasis}
let inner = ssml_emphasis("Hello!", "strong")
let doc = ssml_wrap(inner)
print(doc)
ssml_emphasis Wrap text in an `<emphasis>` tag
Wraps text in <emphasis level="...">. Valid levels are "strong", "moderate" (the default), and "reduced". Use it to stress particular words in synthesized speech.
use plugin tts::{ssml_emphasis}
let s = ssml_emphasis("important", "strong")
print(s)
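The two wrappers above amount to simple string construction. The following Python sketch illustrates the markup they are described as producing; it is not the plugin's implementation. The names `emphasis_markup` and `speak_markup` are hypothetical, and both the level check and the XML escaping of the text are assumptions about behavior the page does not specify.

```python
from xml.sax.saxutils import escape

# Levels listed in the description above.
VALID_LEVELS = {"strong", "moderate", "reduced"}


def emphasis_markup(text: str, level: str = "moderate") -> str:
    """Build an <emphasis> element (hypothetical helper, not the plugin API)."""
    if level not in VALID_LEVELS:
        # Assumption: unknown levels are rejected rather than passed through.
        raise ValueError(f"unsupported emphasis level: {level!r}")
    # Assumption: plain text is XML-escaped so "&" and "<" stay valid SSML.
    return f'<emphasis level="{level}">{escape(text)}</emphasis>'


def speak_markup(inner: str) -> str:
    """Wrap already-built SSML in the <speak> root element."""
    return f"<speak>{inner}</speak>"


print(speak_markup(emphasis_markup("Hello!", "strong")))
```

Escaping is applied only to the raw text, not to the inner markup passed to `speak_markup`, so nested elements survive wrapping.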
ssml_break Insert a timed `<break>` pause
Inserts a <break time="Nms"/> element representing a pause of time_ms milliseconds. Returns a self-closing tag with no surrounding text.
use plugin tts::{ssml_wrap, ssml_break}
let pause = ssml_break(500)
let doc = ssml_wrap("Hello.{pause} How are you?")
print(doc)
ssml_prosody Wrap text with rate/pitch/volume controls
Wraps text in a <prosody> element that controls speech rate, pitch, and volume. Each parameter accepts either a keyword ("slow", "medium", or "fast" for rate; "low", "medium", or "high" for pitch and volume) or a percentage string such as "+20%".
use plugin tts::{ssml_prosody, ssml_wrap}
let slow = ssml_prosody("Take your time.", "slow", "medium", "medium")
let doc = ssml_wrap(slow)
print(doc)
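The keyword-or-percentage rule described above can be sketched in Python as a small validator plus a string builder. This is an illustration under stated assumptions, not the plugin's code: `prosody_markup` and `_check` are hypothetical names, and whether the plugin validates its arguments at all (or accepts unsigned percentages) is not stated on this page.

```python
import re
from xml.sax.saxutils import escape

# Keywords taken from the description above.
RATE_KEYWORDS = {"slow", "medium", "fast"}
LEVEL_KEYWORDS = {"low", "medium", "high"}

# Assumption: a percentage is an optionally signed number followed by "%".
PERCENT = re.compile(r"[+-]?\d+(\.\d+)?%")


def _check(value: str, keywords: set, name: str) -> str:
    """Accept a known keyword or a percentage string; reject anything else."""
    if value not in keywords and not PERCENT.fullmatch(value):
        raise ValueError(f"unsupported {name}: {value!r}")
    return value


def prosody_markup(text: str, rate: str = "medium",
                   pitch: str = "medium", volume: str = "medium") -> str:
    """Build a <prosody> element (hypothetical helper, not the plugin API)."""
    rate = _check(rate, RATE_KEYWORDS, "rate")
    pitch = _check(pitch, LEVEL_KEYWORDS, "pitch")
    volume = _check(volume, LEVEL_KEYWORDS, "volume")
    return (f'<prosody rate="{rate}" pitch="{pitch}" volume="{volume}">'
            f'{escape(text)}</prosody>')


print(prosody_markup("Take your time.", "slow", "medium", "medium"))
print(prosody_markup("A little faster.", "+20%"))
```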
ssml_say_as Wrap text with an `interpret-as` hint
Wraps text in <say-as interpret-as="..."> to tell the TTS engine how to interpret the content. Common values: "cardinal", "ordinal", "characters", "spell-out", "date", "time", "telephone".
use plugin tts::{ssml_say_as, ssml_wrap}
let phone = ssml_say_as("555-1234", "telephone")
let doc = ssml_wrap("Call us at {phone}.")
print(doc)
ssml_sub Substitute a spoken alias for text
Produces <sub alias="...">text</sub>, instructing the TTS engine to speak the alias aloud in place of the written text. Useful for acronyms and abbreviations.
use plugin tts::{ssml_sub, ssml_wrap}
let abbr = ssml_sub("TTS", "text to speech")
let doc = ssml_wrap("Welcome to {abbr}.")
print(doc)
ssml_voice Wrap text in a named `<voice>` tag
Wraps text in <voice name="...">, switching to the named TTS voice for that span. Voice names are engine-specific (e.g. "en-US-Wavenet-A" for Google Cloud TTS).
use plugin tts::{ssml_voice, ssml_wrap}
let narrated = ssml_voice("Once upon a time...", "en-US-Wavenet-D")
let doc = ssml_wrap(narrated)
print(doc)
ssml_lang Wrap text in a language tag
Wraps text in <lang xml:lang="..."> to switch the language for that span. Use standard BCP-47 language codes such as "en-US", "fr-FR", or "de-DE".
use plugin tts::{ssml_lang, ssml_wrap}
let greeting = ssml_lang("Bonjour!", "fr-FR")
let doc = ssml_wrap("She said: {greeting}")
print(doc)
ssml_phoneme Provide an explicit phoneme pronunciation
Provides an explicit phoneme pronunciation for text using the given phonetic alphabet ("ipa" or "x-sampa") and phoneme string ph. Overrides the engine's default pronunciation.
use plugin tts::{ssml_phoneme, ssml_wrap}
let word = ssml_phoneme("tomato", "ipa", "təˈmeɪtoʊ")
let doc = ssml_wrap("I say {word}.")
print(doc)
ssml_audio Insert an `<audio>` element with a URL
Inserts an <audio src="..."> element pointing to an audio file URL. If alt text is provided, it is placed inside the element as a fallback for engines that do not support audio elements.
use plugin tts::{ssml_audio, ssml_wrap}
let sound = ssml_audio("https://example.com/chime.mp3", "chime")
let doc = ssml_wrap("{sound} Welcome back.")
print(doc)
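The fallback behavior described above can be pictured as two output shapes, one with alt text inside the element and one self-closing. The Python sketch below is illustrative only: `audio_markup` is a hypothetical name, and emitting a self-closing tag when no alt text is given is an assumption, since the page does not show the output for that case.

```python
from xml.sax.saxutils import escape, quoteattr


def audio_markup(url: str, alt: str = "") -> str:
    """Build an <audio> element (hypothetical helper, not the plugin API)."""
    # quoteattr() adds the surrounding quotes and escapes "&" inside URLs.
    src = quoteattr(url)
    if alt:
        # The alt text sits inside the element as the spoken fallback.
        return f"<audio src={src}>{escape(alt)}</audio>"
    # Assumption: with no alt text the element is emitted self-closing.
    return f"<audio src={src}/>"


print(audio_markup("https://example.com/chime.mp3", "chime"))
print(audio_markup("https://example.com/chime.mp3"))
```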
strip_ssml Strip all SSML tags from a string
Removes all SSML XML tags from a string, returning only the plain text content. Useful for extracting readable text from an SSML document for logging or display.
use plugin tts::{ssml_wrap, ssml_emphasis, strip_ssml}
let doc = ssml_wrap(ssml_emphasis("Hello world", "strong"))
let plain = strip_ssml(doc)
print(plain)
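Tag stripping of this kind is commonly done with a single regular expression. The Python sketch below shows that approach as an illustration; `strip_tags` is a hypothetical name, and the plugin may handle edge cases (entities, comments, a literal ">" inside attribute values) differently.

```python
import re

# Matches anything that looks like an XML tag: "<", then non-">" chars, then ">".
TAG = re.compile(r"<[^>]+>")


def strip_tags(ssml: str) -> str:
    """Remove tag-like spans, keeping the text between them unchanged."""
    return TAG.sub("", ssml)


doc = '<speak><emphasis level="strong">Hello world</emphasis></speak>'
print(strip_tags(doc))
```

Self-closing elements such as `<break time="500ms"/>` are removed the same way, so the surrounding text is joined without the pause.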
split_sentences Split text into a list of sentences
Splits plain text into a list of sentences by breaking on ".", "!", and "?". Ellipses are not treated as sentence boundaries. Returns an indexed list of sentence strings.
use plugin tts::{split_sentences}
let sentences = split_sentences("Hello world. How are you? Fine!")
print(sentences[1])
print(sentences[2])
print(sentences[3])
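One way to break on sentence punctuation while leaving ellipses alone is a regular expression with lookbehinds. The Python sketch below illustrates that idea; `split_into_sentences` is a hypothetical name, and details such as keeping the terminator with each sentence and requiring whitespace after it are assumptions, not documented plugin behavior. Note that Python lists are 0-indexed, unlike the example above.

```python
import re

# Split on whitespace that follows ".", "!" or "?", unless the two characters
# before the whitespace are ".." (i.e. the dot is part of an ellipsis).
BOUNDARY = re.compile(r"(?<=[.!?])(?<!\.\.)\s+")


def split_into_sentences(text: str) -> list:
    """Return the sentences in text, each keeping its final punctuation."""
    return [part for part in BOUNDARY.split(text.strip()) if part]


for sentence in split_into_sentences("Hello world. How are you? Fine!"):
    print(sentence)

# The ellipsis does not end a sentence here.
print(split_into_sentences("Wait... what? Fine."))
```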
estimate_duration_seconds Estimate spoken duration from word count
Estimates how many seconds a TTS engine would take to speak text, based on a words-per-minute rate. wpm defaults to 150 if omitted. Useful for timing audio segments before rendering.
use plugin tts::{estimate_duration_seconds}
let text = "Welcome to our application. This is a short introduction."
let secs = estimate_duration_seconds(text, 150.0)
print("Estimated duration: {secs} seconds")
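A words-per-minute estimate reduces to `words * 60 / wpm`. The Python sketch below works through that arithmetic for the example text; `estimate_seconds` is a hypothetical name, and counting words by splitting on whitespace is an assumption about how the plugin counts them.

```python
def estimate_seconds(text: str, wpm: float = 150.0) -> float:
    """Estimate speaking time in seconds from a whitespace word count."""
    if wpm <= 0:
        raise ValueError("wpm must be positive")
    words = len(text.split())  # assumption: words are whitespace-separated
    return words * 60.0 / wpm


text = "Welcome to our application. This is a short introduction."
# 9 words at 150 words per minute: 9 * 60 / 150 = 3.6 seconds.
print(estimate_seconds(text, 150.0))
```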
normalize_text Expand common abbreviations for TTS
Expands common abbreviations (e.g. "Dr." → "Doctor", "etc." → "et cetera") and collapses multiple spaces. Improves pronunciation quality for TTS engines that handle abbreviations inconsistently.
use plugin tts::{normalize_text}
let raw = "Dr. Smith works at the Dept. of Health, etc."
let clean = normalize_text(raw)
print(clean)
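Abbreviation expansion of this kind is typically a lookup table followed by whitespace cleanup. The Python sketch below illustrates the idea using only the two pairs named in the description above; `expand_abbreviations` is a hypothetical name, and the plugin's actual table (including how it treats "Dept.") is not documented here.

```python
import re

# Only the pairs named in the description; the plugin's full table is unknown.
ABBREVIATIONS = {
    "Dr.": "Doctor",
    "etc.": "et cetera",
}


def expand_abbreviations(text: str) -> str:
    """Expand known abbreviations, then collapse runs of spaces."""
    for abbr, full in ABBREVIATIONS.items():
        # (?<!\w) avoids matching an abbreviation in the middle of a word.
        text = re.sub(r"(?<!\w)" + re.escape(abbr), full, text)
    return re.sub(r" {2,}", " ", text).strip()


print(expand_abbreviations("Dr.  Smith said hello, etc."))
```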
phoneme_count_estimate Estimate phoneme count of text
Estimates the total number of phonemes in text using a simple vowel-group and consonant heuristic (3 phonemes per vowel group, 1 per consonant). Useful for rough timing estimates or batch processing.
use plugin tts::{phoneme_count_estimate}
let n = phoneme_count_estimate("Hello world")
print("Estimated phonemes: {n}")
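The weighting stated above (3 per vowel group, 1 per consonant) can be sketched in Python as follows. This is an illustration of the stated heuristic only: `estimate_phonemes` is a hypothetical name, and treating "y" as a consonant, ignoring non-ASCII letters, and defining a vowel group as a run of adjacent vowels are all assumptions, so the plugin may return a different number for the same input.

```python
import re

VOWELS = "aeiou"
CONSONANTS = "bcdfghjklmnpqrstvwxyz"  # assumption: "y" counts as a consonant


def estimate_phonemes(text: str, per_vowel_group: int = 3,
                      per_consonant: int = 1) -> int:
    """Apply the vowel-group/consonant weighting described above."""
    lowered = text.lower()
    # A vowel group is a maximal run of adjacent vowels, e.g. "ou" in "you".
    vowel_groups = len(re.findall(f"[{VOWELS}]+", lowered))
    consonants = sum(1 for ch in lowered if ch in CONSONANTS)
    return vowel_groups * per_vowel_group + consonants * per_consonant


# "Hello world": 3 vowel groups (e, o, o) and 7 consonants under this sketch.
print(estimate_phonemes("Hello world"))
```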
word_count Count whitespace-separated words
Counts whitespace-separated words in the text. A fast utility for estimating content length before passing text to a TTS engine.
use plugin tts::{word_count}
let n = word_count("The quick brown fox jumps over the lazy dog")
print("Words: {n}")