Phoneme Model
Evan Kirshenbaum’s feature set used in his ASCII transcription of the
International Phonetic Alphabet (IPA)[1],
[2] describes the phonemes in a way consistent
with how the phonemes are organised in the IPA code chart. That is the
approach used in the Phonemes document to
describe the phonemes in a phoneme definition file.
Those phoneme features often represent the action of more than one
articulatory mechanism used to produce speech, or affect the same area.
Internally, espeak-ng makes use of the articulatory model, not the IPA
descriptions. This document describes how the feature-based IPA model is
mapped to the articulatory model.
People working on adding new voices or languages do not need to read
this document, but should instead read the Phonemes document. This is intended for people
working on the espeak-ng codebase, or people interested in how espeak-ng
works internally.
NOTE: This model is in the process of being
implemented. As such, the current implementation does not reflect this
document.
Manner of Articulation
The manner of articulation is described in terms of several distinct
feature types. The possible manners of articulation are:
nasal |
nas |
pmc egs nsl occ |
plosive (stop) |
stp |
pmc egs orl occ |
affricate |
afr |
pmc egs orl occ frr |
fricative |
frc |
pmc egs orl frv |
tap/flap |
flp |
pmc egs orl fla |
trill |
trl |
pmc egs orl tri |
approximant |
apr |
pmc egs orl app |
click |
clk |
vlc igs orl |
ejective |
ejc |
vlc igs orl occ |
implosive |
imp |
gtc igs |
vowel |
vwl |
pmc egs orl vow |
For imp
consonants, they use the features of the base
phoneme except for the pmc
and egs
features.
Thus, a nas imp
is a gtc igs nsl occ
.
The vwl
phonemes are described using vowel height and
backness features, while consonants (the other manners of articulation)
are described using place of articulation features.
Additionally, the manner of articulation can be refined using the
following features:
lat |
lateral |
The air flow is directed along the sides of the tongue. |
sib |
sibilant |
The air flow is directed through the teeth with the tongue. |
Air Flow
egs |
egressive |
The air flow is moving outwards from the initiator to the
target. |
igs |
ingressive |
The air flow is moving inwards from the target to the
initiator. |
Initiator
pmc |
pulmonic |
The diaphragm and lungs are used to generate the airstream. |
gtc |
glottalic |
The glottis is used to generate the airstream. |
vlc |
velaric |
The velum is closed and the tongue is used to generate the
airstream. |
pcv |
percussive |
There is no airstream used to produce this sound. |
Target
nsl |
nasal |
The air flows through the nose. |
orl |
oral |
The air flows through the mouth. |
Co-articulation
Manner
occ |
occlusive |
The air flow is blocked within the vocal tract. |
frv |
fricative |
The air flow is constricted, causing turbulence. |
fla |
flap |
A single tap of the tongue against the secondary articulator. |
tri |
trill |
A rapid vibration of the primary articulator against the secondary
articulator. |
app |
approximant |
The vocal tract is narrowed at the place of articulation without
being turbulant. |
vow |
vowel |
The phoneme is articulated as a vowel instead of a consonant. |
Place of Articulation
The place of articulation is described in terms of an active
articulator and one or more passive
articulators[9]. The possible places of
articulation are:
bilabial |
blb |
lbl |
ulp |
|
|
linguolabial |
lgl |
lmn |
ulp |
|
|
labiodental |
lbd |
lbl |
|
utt |
|
bilabial-labiodental |
bld |
bld |
ulp |
utt |
|
interdental |
idt |
lmn |
|
utt |
|
dental |
dnt |
apc |
|
utt |
|
denti-alveolar |
dta |
lmn |
|
utt |
alf |
alveolar |
alv |
lmn |
|
|
alf |
apico-alveolar |
apa |
apc |
|
|
alf |
palato-alveolar |
pla |
lmn |
|
|
alb |
apical retroflex |
arf |
sac |
|
|
alb |
retroflex |
rfx |
apc |
|
|
hpl |
alveolo-palatal |
alp |
dsl |
|
|
alb |
palatal |
pal |
dsl |
|
|
hpl |
velar |
vel |
dsl |
|
|
spl |
labio-velar |
lbv |
dsl |
ulp |
|
spl |
uvular |
uvl |
dsl |
|
|
uvu |
pharyngeal |
phr |
rdl |
|
|
prx |
epiglotto-pharyngeal |
epp |
lyx |
|
|
prx |
(ary-)epiglottal |
epg |
lyx |
|
|
egs |
glottal |
glt |
lyx |
|
|
gts |
Active Articulators
lbl |
labial |
lower lip |
lmn |
laminal |
tongue blade |
apc |
apical |
tongue tip |
sac |
subapical |
underside of the tongue |
dsl |
dorsal |
tongue body |
rdl |
radical |
tongue root |
lyx |
laryngeal |
larynx |
Passive Articulators
ulp |
upper lip |
utt |
upper teeth |
alf |
alveolar ridge (front) |
alb |
alveolar ridge (back) |
hpl |
hard palate |
spl |
soft palate (velum) |
uvu |
uvular |
prx |
pharynx |
egs |
epiglottis |
gts |
glottis |
Co-articulation
pzd |
palatalized |
hpl |
vzd |
velarized |
spl |
fzd |
pharyngealized |
prx |
nzd |
nasalized |
nsl |
rzd |
rhoticized |
apc hpl |
Phonation
The phonation features describe the degree to which the glottis
(vocal chords) are open or closed.
vls |
voiceless |
The glottis is fully open, such that the vocal chords do not
vibrate. |
brv |
breathy voice |
The glottis is closed slightly, to produce a whispered or murmured
sound. |
slv |
slack voice |
The glottis is opened wider than mdv , but not enough to
be brv . |
mdv |
modal voice |
The glottis is opened to provide the optimal vibration of the vocal
chords. |
stv |
stiff voice |
The glottis is closed narrower than mdv , but not enough
to be crv . |
crv |
creaky voice |
The glottis is closed to produce a vocal or glottal fry. |
glc |
glottal closure |
The glottis is fully closed. |
Voice
voiceless |
vls |
vls |
voiced |
vcd |
mdv |
Vowel Height
hgh |
close (high) |
smh |
near-close (semi-high) |
umd |
close-mid (upper-mid) |
mid |
mid |
lmd |
open-mid (lower-mid) |
sml |
near-open (semi-low) |
low |
open (low) |
Vowel Backness
fnt |
front |
cnt |
center |
bck |
back |
Rounding and Labialization
unr |
unrounded |
No |
Close to the jaw. |
ptr |
protruded |
Yes |
Protrude outward from the jaw. |
cmp |
compressed |
Yes |
Close to the jaw. |
The degree of rounding/labialization is specified using the following
features:
mrd |
more rounded |
lrd |
less rounded |
Vowel Rounding
unrounded |
unr |
unr |
rounded |
rnd |
ptr if bck or cnt ;
cmp if fnt . |
Syllabicity
syl |
syllabic |
nsy |
non-syllabic |
Consonant Release
frr |
fricative release |
asp |
aspirated |
nrs |
nasal release |
lrs |
lateral release |
unx |
no audible release (unexploded) |
Tongue Root
The tongue root position can be specified using the following
features:
atr |
◌̘ |
advanced tongue root |
rtr |
◌̙ |
retracted tongue root |
Fortis and Lenis
Stress
st1 |
primary stress |
st2 |
secondary stress |
st3 |
extra stress |
Length
est |
extra short |
hlg |
half-long |
lng |
long |
Rhythm
sbr |
syllable break |
lnk |
linked (no break) |
Intonation
fbr |
minor (foot) break |
ibr |
major (intonation) break |
glr |
global rise |
glf |
global fall |
Tone Stepping
Tones
Tones are defined using the following 3 properties:
tone_start <value>
tone_middle <value>
tone_end <value>
The <value>
field for these properties is a number
with one of the following values:
extra high (top) |
5 |
high |
4 |
mid |
3 |
low |
2 |
extra low (bottom) |
1 |
A level tone can be specified by just using the
tone_start
value. A raising or falling
tone can be specified using the tone_start
and
tone_end
values. A raising-falling
(peaking) or falling-raising (dipping) tone
can be specified using all three values.
References
Kirshenbaum, Evan, Representing
IPA phonetics in ASCII (HTML). 1993.
Kirshenbaum, Evan, Representing
IPA phonetics in ASCII (PDF). 2001.
International Phonetic Association, The
International Phonetic Alphabet and the IPA Chart. 2015. Creative
Commons Attribution-Sharealike 3.0 Unported License (CC-BY-SA).
Wikipedia. International
Phonetic Alphabet. 2017. Creative Commons Attribution-Sharealike 3.0
Unported License (CC-BY-SA).
Dunn, R. H., Cainteoir
Text-to-Speech Phoneme Features. 2013-2015.
Wikipedia. Voiced
glottal fricative. 2017, Creative Commons Attribution-Sharealike 3.0
Unported License (CC-BY-SA).
Wikipedia. Extensions
to the International Phonetic Alphabet. 2017, Creative Commons
Attribution-Sharealike 3.0 Unported License (CC-BY-SA).
Wikipedia. Fortis and
lenis. 2017, Creative Commons Attribution-Sharealike 3.0 Unported
License (CC-BY-SA).
Wikipedia. Place of
articulation. 2017, Creative Commons Attribution-Sharealike 3.0
Unported License (CC-BY-SA).