Enlighten Publications

A Data Driven Approach to Audiovisual Speech Mapping

Abel, A., Marxer, R., Barker, J., Watt, R., Whitmer, B., Derleth, P. and Hussain, A. (2016) A Data Driven Approach to Audiovisual Speech Mapping. In: 8th International Conference on Brain Inspired Cognitive Systems (BICS 2016), Beijing, China, 28-30 Nov 2016, pp. 331-342. ISBN 9783319496856 (doi: 10.1007/978-3-319-49685-6_30)

Abstract

The concept of using visual information as part of audio speech processing has been of significant recent interest. This paper presents a data-driven approach that estimates audio speech acoustics using only temporal visual information, without considering linguistic features such as phonemes and visemes. Audio (log filterbank) and visual (2D-DCT) features are extracted, and various MLP configurations and datasets are evaluated to identify the best-performing setup. The results show that, given a sequence of prior visual frames, a reasonably accurate estimate of the corresponding audio frame can be produced.
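
As a rough illustration of the visual side of the pipeline the abstract describes, the sketch below computes 2D-DCT features for each video frame and stacks a window of prior frames into one input vector, of the kind that could be fed to an MLP regressor. It is not the authors' implementation: the frame size (16x16), the number of retained coefficients (`keep`), the context length (`context`) and all function names are assumptions chosen for the example, and random arrays stand in for real mouth-region images.

```python
import numpy as np

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]   # frequency index (rows)
    i = np.arange(n)[None, :]   # sample index (columns)
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row uses a different scale factor
    return m

def dct2_features(frame, keep=4):
    """2D-DCT of a grayscale frame; keep the low-frequency keep x keep block."""
    rows, cols = frame.shape
    coeffs = dct_matrix(rows) @ frame @ dct_matrix(cols).T
    return coeffs[:keep, :keep].ravel()

def stack_prior_frames(features, context):
    """For each t >= context - 1, concatenate features of frames t-context+1 .. t."""
    n_frames = features.shape[0]
    return np.stack([
        features[t - context + 1 : t + 1].ravel()
        for t in range(context - 1, n_frames)
    ])

# Hypothetical data: 10 random 16x16 frames standing in for mouth-region images.
rng = np.random.default_rng(0)
frames = rng.random((10, 16, 16))

feats = np.stack([dct2_features(f) for f in frames])  # one feature row per frame
inputs = stack_prior_frames(feats, context=3)         # one row per usable time step
```

Each row of `inputs` would then be paired with the log filterbank vector of the matching audio frame as the regression target. How many prior frames to include and how many DCT coefficients to keep are tuning choices that this sketch does not attempt to settle.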

Item Type:	Conference Proceedings
Status:	Published
Refereed:	Yes
Glasgow Author(s) Enlighten ID:	Whitmer, Dr William
Authors:	Abel, A., Marxer, R., Barker, J., Watt, R., Whitmer, B., Derleth, P., and Hussain, A.
College/School:	College of Medical Veterinary and Life Sciences > School of Health & Wellbeing > MRC/CSO SPHSU
ISSN:	0302-9743
ISBN:	9783319496856
Published Online:	13 November 2016
Copyright Holders:	Copyright © 2016 Springer International Publishing
First Published:	First published in Lecture Notes in Computer Science 10023:331-342
Publisher Policy:	Reproduced in accordance with the copyright policy of the publisher

Deposit and Record Details

ID Code:	151711
Depositing User:	Ms Mary Anne Meyering
Datestamp:	15 Nov 2017 11:12
Last Modified:	19 Feb 2018 11:24
Date of first online publication:	13 November 2016
Data Availability Statement:	No