In order to narrate a children's story using a variety of synthesized
voices, ESPER steps through a number of stages to identify the speaker
for each piece of quoted speech in the story. ESPER must first
identify all the pieces of quoted speech in the story, as well as all
the characters who are potential speakers. It must then associate
each piece of quoted speech with the story character who spoke it.
At each processing step, ESPER encapsulates the acquired speech
information in a markup format such as HTML, Sable (an XML-based
speech synthesis markup language) [3], or CSML (Children's Story
Markup Language), a markup language created specifically for speech
information in children's stories.
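The three stages described above (find the quoted speech, find the candidate speakers, associate each quote with a speaker) can be illustrated with a minimal Python sketch. This is not ESPER's implementation, which runs inside Festival; it is a toy stand-in under loudly stated assumptions: quotes are delimited by straight double quotes, candidate characters are capitalized words in the narration, and a quote is attributed to the first candidate named in the narration that follows it (falling back to the last one named before it). The function names, the stopword list, and the `<story>` and `<quote speaker="...">` tags are hypothetical and are not taken from CSML or Sable.

```python
import re
from xml.sax.saxutils import escape, quoteattr

QUOTE_RE = re.compile(r'"([^"]+)"')        # assumption: straight double quotes only
NAME_RE = re.compile(r'\b[A-Z][a-z]+\b')   # assumption: names are capitalized words
STOPWORDS = {"The", "Then", "A", "An", "He", "She", "It", "They", "And", "But"}


def find_quotes(text):
    """Stage 1: return (start, end, spoken_text) for each quoted span."""
    return [(m.start(), m.end(), m.group(1)) for m in QUOTE_RE.finditer(text)]


def narration_segments(text, quotes):
    """Yield the stretches of text that lie outside the quoted spans."""
    pos = 0
    for start, end, _ in quotes:
        yield text[pos:start]
        pos = end
    yield text[pos:]


def find_characters(text, quotes):
    """Stage 2: collect candidate speakers from the narration, in order."""
    seen = []
    for segment in narration_segments(text, quotes):
        for m in NAME_RE.finditer(segment):
            word = m.group(0)
            if word not in STOPWORDS and word not in seen:
                seen.append(word)
    return seen


def names_in(segment, characters):
    """Return the known character names mentioned in a narration segment."""
    return [m.group(0) for m in NAME_RE.finditer(segment) if m.group(0) in characters]


def attribute_speakers(text, quotes, characters):
    """Stage 3: attach a speaker to each quote using the naive heuristic."""
    attributed = []
    for i, (start, end, spoken) in enumerate(quotes):
        next_start = quotes[i + 1][0] if i + 1 < len(quotes) else len(text)
        prev_end = quotes[i - 1][1] if i > 0 else 0
        after = names_in(text[end:next_start], characters)
        before = names_in(text[prev_end:start], characters)
        speaker = after[0] if after else (before[-1] if before else "UNKNOWN")
        attributed.append((start, end, spoken, speaker))
    return attributed


def to_markup(text, attributed):
    """Wrap each quote in a hypothetical <quote speaker="..."> element."""
    out, pos = [], 0
    for start, end, spoken, speaker in attributed:
        out.append(escape(text[pos:start]))
        out.append(f"<quote speaker={quoteattr(speaker)}>{escape(spoken)}</quote>")
        pos = end
    out.append(escape(text[pos:]))
    return "<story>" + "".join(out) + "</story>"


if __name__ == "__main__":
    story = 'Alice looked up. "Where are you going?" asked Alice. "To the river," said Bob.'
    quotes = find_quotes(story)
    characters = find_characters(story, quotes)
    print(to_markup(story, attribute_speakers(story, quotes, characters)))
```

The heuristic in the third stage fails readily, for example on quotes with no nearby attribution or on pronoun references, which is why the association step needs the richer text analysis discussed in the remainder of this section.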
ESPER is implemented within the Festival Speech Synthesis framework
[4]. Although ESPER itself does not speak, it will be a
component of the larger storyteller system. Festival also provides
much of the infrastructure that detailed text analysis requires, such as
punctuation handling and tokenization, part-of-speech tagging,
utterance representation, and extraction of data for machine
learning techniques. In addition, we make use of Festival's XML support.
Alan W Black
2003-10-20