SSML and HTML Support
SSML (Speech Synthesis Markup Language)
SSML consists of XML-like tags, for example:
Did you mean the <emphasis level="strong"><prosody pitch="75">green</prosody></emphasis> beans?
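As a rough illustration of the "XML-like" structure, the example above can be wrapped in a speak root element and inspected with Python's standard xml.etree.ElementTree module. This is only a sketch: the describe_ssml helper is a hypothetical name, it is not part of eSpeak, and it says nothing about how eSpeak parses the markup internally.

```python
import xml.etree.ElementTree as ET

# The example fragment from the text above.
FRAGMENT = ('Did you mean the <emphasis level="strong">'
            '<prosody pitch="75">green</prosody></emphasis> beans?')


def describe_ssml(fragment):
    """Return (plain_text, [(tag, attributes), ...]) for an SSML fragment.

    Hypothetical helper: the fragment is wrapped in <speak> so that it
    becomes a single well-formed XML document before parsing.
    """
    root = ET.fromstring(f"<speak>{fragment}</speak>")
    tags = [(el.tag, dict(el.attrib)) for el in root.iter() if el is not root]
    return "".join(root.itertext()), tags


text, tags = describe_ssml(FRAGMENT)
print(text)  # Did you mean the green beans?
print(tags)  # [('emphasis', {'level': 'strong'}), ('prosody', {'pitch': '75'})]
```

The nesting is what matters here: the prosody element sits inside emphasis, so both settings apply to the single word "green".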
The following markup tags and attributes are recognised:
speak
- xml:base (the value is simply passed back as a parameter to the
UriCallback() function)
- xml:lang
voice
- xml:lang
- name
- age
- variant
- gender
prosody
- rate (x-slow, slow, medium, fast, x-fast or a percentage such as 125%)
- volume (silent, x-soft, soft, medium, loud, x-loud, +1dB or -1dB)
- pitch (a number, for example “75”)
- range (default, x-low, low, medium, high, x-high)
say-as
- interpret-as="characters"
- interpret-as="characters" format="glyphs"
- interpret-as="tts:key"
- interpret-as="tts:char"
- interpret-as="tts:digits"
mark
s
p
sub
tts:style
- field="punctuation" mode=none,all,some
- field="capital_letters" mode=no,spelling,icon,pitch
audio
emphasis
- level (none, reduced, moderate, strong or x-strong)
break
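The attribute values listed above lend themselves to a simple pre-flight check before text is handed to the synthesizer. The following is a minimal sketch, not eSpeak code: the ALLOWED table, is_listed_value and unlisted_attributes are hypothetical names, the keyword sets are copied from the lists above, and two details are assumptions: pitch is treated as a plain integer, and a rate percentage is treated as digits followed by "%". How eSpeak itself handles a value outside these lists is not covered here.

```python
import re
import xml.etree.ElementTree as ET

# Keyword values copied from the lists above (hypothetical lookup table).
ALLOWED = {
    ("prosody", "rate"): {"x-slow", "slow", "medium", "fast", "x-fast"},
    ("prosody", "volume"): {"silent", "x-soft", "soft", "medium", "loud",
                            "x-loud", "+1dB", "-1dB"},
    ("prosody", "range"): {"default", "x-low", "low", "medium", "high",
                           "x-high"},
    ("emphasis", "level"): {"none", "reduced", "moderate", "strong",
                            "x-strong"},
}
PITCH = ("prosody", "pitch")


def is_listed_value(tag, attr, value):
    """True if value matches one of the documented forms for tag/attr."""
    if (tag, attr) == PITCH:
        # Assumption: "a number" means an unsigned integer such as 75.
        return re.fullmatch(r"\d+", value) is not None
    if (tag, attr) == ("prosody", "rate") and re.fullmatch(r"\d+%", value):
        return True  # a percentage such as 125%
    return value in ALLOWED.get((tag, attr), set())


def unlisted_attributes(ssml):
    """Return (tag, attr, value) triples whose value is not in the lists."""
    problems = []
    for el in ET.fromstring(ssml).iter():
        for attr, value in el.attrib.items():
            known = (el.tag, attr) in ALLOWED or (el.tag, attr) == PITCH
            if known and not is_listed_value(el.tag, attr, value):
                problems.append((el.tag, attr, value))
    return problems


good = '<speak><prosody rate="125%" volume="loud">Hello</prosody><break/></speak>'
bad = '<speak><emphasis level="huge">Hello</emphasis></speak>'
print(unlisted_attributes(good))  # []
print(unlisted_attributes(bad))   # [('emphasis', 'level', 'huge')]
```

Attributes that the table does not cover, such as those of voice or say-as, are deliberately left unchecked so that the sketch only asserts what the lists above state.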
HTML
eSpeak can speak HTML text directly, or text containing both SSML and
HTML markup.
Any unrecognised tags are ignored.
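The effect of ignoring a tag can be pictured with Python's standard html.parser module. This sketch assumes that "ignored" means the tag itself is skipped while the text inside it is still kept. TextOnly and strip_tags are hypothetical names, and the code only models that one rule, not eSpeak's own HTML handling.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect character data and skip every tag (hypothetical helper)."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []

    def handle_data(self, data):
        # Called for the text between tags; the tags themselves are dropped.
        self.chunks.append(data)


def strip_tags(markup):
    """Return the text of markup with all tags removed."""
    parser = TextOnly()
    parser.feed(markup)
    parser.close()
    return "".join(parser.chunks)


print(strip_tags("The <blink>green</blink> beans."))  # The green beans.
```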
The following tags cause a sentence break:
The following tags cause a paragraph break:
Text between the following tags is ignored: