It is not enough merely to identify the pieces of speech in a story. To render each piece of speech in an appropriately different voice, we must also identify the characters in the story, all of whom are potential speakers. In a sense this is a named-entity extraction task: we first identify all the proper names in a story and then keep only those named entities that are names of characters. We therefore considered using a named-entity extraction system and evaluated one of the most commonly used, the BBN IdentiFinder [5], which scans a body of text, locates the names of people, places, and other named entities of interest to the user, and outputs these entities in a markup format. We tested the BBN IdentiFinder on two manually labeled children's stories selected for their contrasting styles. Each story contains between 14 and 16 characters. Results are shown in Tables 3 and 4.
However, character identification cannot be confined to the proper names in a story. The works of Hans Christian Andersen, for instance, contain a considerable number of characters who are never named but are referred to by description, such as "the peasant's wife" or "the man with the sheep". In these cases additional linguistic information is needed to identify the character.
To this end, we have created a character identification module within ESPER. Like the BBN IdentiFinder, this module uses pattern matching to extract proper names from the story. In addition, it uses part-of-speech information from the probabilistic POS tagger in the Festival speech synthesis system to extract non-proper names, such as definite noun phrases, as potential character names. Also like the BBN IdentiFinder, the module marks up the extracted entities in the text, in this case using CSML: each character is assigned an ID for reference purposes and a CLASS that records its type (for example, proper name or definite NP), which is useful in the speaker identification stage. An example of CSML markup:
<CHARACTER ID="LITTLE_TUK" CLASS="properName"> Little Tuk</CHARACTER> sprang out of bed quickly and read over his lesson in the book... <CHARACTER ID="THE_OLD_WASHERWOMAN" CLASS="defNP"> The old washerwoman </CHARACTER> put her head in at the door, and nodded to him quite kindly...
We tested the performance of both the BBN IdentiFinder and ESPER's character identification module, and the results are as follows:
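Before turning to those results, the markup step described above can be illustrated with a minimal sketch. It is not ESPER's implementation. It assumes the story has already been tokenized and tagged with Penn Treebank-style POS tags, and for brevity it finds proper names from NNP/NNPS tags where the module itself uses pattern matching. The function names (`tag_characters`, `make_id`) and the definite-NP pattern ("the", optional adjectives, one or more nouns) are hypothetical choices made for illustration.

```python
from typing import List, Optional, Tuple

# (word, POS tag); Penn Treebank-style tags are an assumption of this sketch.
Token = Tuple[str, str]

PROPER_TAGS = {"NNP", "NNPS"}
NOUN_TAGS = {"NN", "NNS"}


def match_candidate(tokens: List[Token], i: int) -> Optional[Tuple[int, str]]:
    """Return (end_index, class_name) for a candidate starting at i, else None."""
    word, tag = tokens[i]
    n = len(tokens)
    if tag in PROPER_TAGS:
        # Proper name: a maximal run of proper-noun tokens.
        j = i
        while j < n and tokens[j][1] in PROPER_TAGS:
            j += 1
        return j, "properName"
    if tag == "DT" and word.lower() == "the":
        # Definite NP: "the" + optional adjectives + at least one common noun.
        j = i + 1
        while j < n and tokens[j][1] == "JJ":
            j += 1
        k = j
        while k < n and tokens[k][1] in NOUN_TAGS:
            k += 1
        if k > j:
            return k, "defNP"
    return None


def make_id(words: List[str]) -> str:
    """Build an ID such as LITTLE_TUK from the words of the candidate."""
    return "_".join(w.upper() for w in words)


def tag_characters(tokens: List[Token]) -> str:
    """Wrap every candidate character mention in a CHARACTER element."""
    out = []
    i = 0
    while i < len(tokens):
        match = match_candidate(tokens, i)
        if match is None:
            out.append(tokens[i][0])
            i += 1
            continue
        end, cls = match
        words = [w for w, _ in tokens[i:end]]
        out.append('<CHARACTER ID="%s" CLASS="%s">%s</CHARACTER>'
                   % (make_id(words), cls, " ".join(words)))
        i = end
    return " ".join(out)


example = [("The", "DT"), ("old", "JJ"), ("washerwoman", "NN"),
           ("nodded", "VBD"), ("to", "TO"), ("Little", "NNP"), ("Tuk", "NNP")]
print(tag_characters(example))
```

A pattern this loose also marks inanimate definite noun phrases such as "the door", so its output is best read as a list of potential character names that a later stage must filter.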