Word endpoint correction techniques for a text-to-multimedia composition system

Title: Word endpoint correction techniques for a text-to-multimedia composition system
Publication Type: Conference Paper
Year of Publication: 2002
Authors: Turkowski, K., B. Hamidzadeh, and R. Ward
Conference Name: Tools with Artificial Intelligence, 2002. (ICTAI 2002). Proceedings. 14th IEEE International Conference on
Pagination: 325-332
Keywords: audio concatenation, endpoint detection, multimedia computing, phoneme based, speech recognition, text-to-multimedia, uttered audio words, word detection, word endpoint detection
Abstract:

In concatenative Text-to-Multimedia composition systems, the uttered audio words used as input must have accurately detected endpoints to ensure optimal audio concatenation. In this paper, we outline a method of endpoint detection that takes advantage of the fact that the text of the audio word is known before endpoint detection takes place. From the text, the phonemes associated with an uttered audio word can be determined and used to fine-tune the endpoint detection for each specific phoneme.
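
The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea, not the authors' method. It assumes a simple short-time energy detector whose threshold is chosen from the phoneme class at each edge of the word, which is known in advance from the text. The names `THRESHOLD_BY_CLASS` and `find_endpoints`, the three phoneme classes, and the threshold values are all hypothetical and chosen for illustration.

```python
# Hypothetical thresholds, expressed as a fraction of the peak frame energy.
# Assumption: low-energy phonemes (e.g. fricatives) need a lower threshold
# so that their quiet onsets and offsets are not clipped.
THRESHOLD_BY_CLASS = {"vowel": 0.10, "plosive": 0.05, "fricative": 0.02}


def frame_energies(samples, frame_len):
    """Mean squared amplitude of each non-overlapping frame."""
    return [
        sum(x * x for x in samples[i:i + frame_len]) / frame_len
        for i in range(0, len(samples) - frame_len + 1, frame_len)
    ]


def find_endpoints(samples, frame_len, first_class, last_class):
    """Return (start, end) sample indices of the word, or None if silent.

    first_class / last_class are the phoneme classes of the word's first
    and last phonemes, derived from the known text of the word.
    """
    energies = frame_energies(samples, frame_len)
    if not energies or max(energies) == 0:
        return None
    peak = max(energies)
    start_thr = THRESHOLD_BY_CLASS[first_class] * peak
    end_thr = THRESHOLD_BY_CLASS[last_class] * peak
    # First frame that exceeds the onset threshold.
    start = next(i for i, e in enumerate(energies) if e >= start_thr)
    # Last frame that exceeds the offset threshold.
    end = max(i for i, e in enumerate(energies) if e >= end_thr)
    return start * frame_len, (end + 1) * frame_len


# Synthetic word: silence, a weak onset, a loud nucleus, silence.
signal = [0.0] * 100 + [0.2, -0.2] * 50 + [1.0, -1.0] * 100 + [0.0] * 100
print(find_endpoints(signal, 50, "fricative", "vowel"))  # (100, 400)
print(find_endpoints(signal, 50, "vowel", "vowel"))      # (200, 400)
```

In this toy signal, treating the first phoneme as a fricative keeps the weak onset inside the detected word, whereas a single vowel-level threshold would cut it off. A real system would presumably need a pronunciation lexicon to map text to phonemes and thresholds calibrated on recorded speech.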

URL: http://dx.doi.org/10.1109/TAI.2002.1180821
DOI: 10.1109/TAI.2002.1180821
