Continuous Interaction With a Smart Speaker via Low-Dimensional Embeddings of Dynamic Hand Pose

Xu, S., Kaul, C., Ge, X. and Murray-Smith, R. (2023) Continuous Interaction With a Smart Speaker via Low-Dimensional Embeddings of Dynamic Hand Pose. In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023), Rhodes, Greece, 4-10 June 2023, ISBN 9781728163277 (doi: 10.1109/ICASSP49357.2023.10096097)

Text: 298426.pdf - Accepted Version (1MB)

Abstract

This paper presents a new continuous interaction strategy for a smart music speaker that combines visual feedback of hand pose with mid-air gesture recognition and control, and that uses only two video frames to recognize gestures. Frame-based hand pose features from MediaPipe Hands, containing 21 landmarks, are embedded into a two-dimensional pose space by an autoencoder. The corresponding space for interaction with the music content is created by embedding high-dimensional music track profiles into a compatible two-dimensional space. A PointNet-based model is then applied to classify gestures, which are used to control the device interaction or to explore music spaces. By jointly optimising the autoencoder with the classifier, we learn an embedding space that is more useful for discriminating gestures. We demonstrate the functionality of the system with experienced users selecting different musical moods by varying their hand pose.
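
As an illustration of the joint objective described in the abstract, the following is a minimal sketch and not the authors' implementation. It assumes three coordinates per landmark, a single-layer encoder and decoder, and a plain linear softmax head standing in for the PointNet-based classifier; the two-frame input is not modelled. The names `init_params`, `encode`, `joint_loss`, `N_GESTURES` and `weight` are hypothetical.

```python
import numpy as np

N_LANDMARKS = 21   # landmarks per hand from MediaPipe Hands (stated in the abstract)
N_COORDS = 3       # assumption: x, y, z per landmark
LATENT_DIM = 2     # two-dimensional pose space (stated in the abstract)
N_GESTURES = 4     # hypothetical number of gesture classes


def init_params(rng, n_classes=N_GESTURES):
    """Randomly initialise a linear encoder, decoder and classifier head."""
    d = N_LANDMARKS * N_COORDS
    return {
        "enc": rng.normal(0.0, 0.1, size=(d, LATENT_DIM)),
        "dec": rng.normal(0.0, 0.1, size=(LATENT_DIM, d)),
        "cls": rng.normal(0.0, 0.1, size=(LATENT_DIM, n_classes)),
    }


def encode(params, poses):
    """Flatten (batch, 21, 3) landmarks and map them to the 2-D pose space."""
    flat = poses.reshape(poses.shape[0], -1)
    return np.tanh(flat @ params["enc"])


def joint_loss(params, poses, labels, weight=1.0):
    """Reconstruction loss plus weighted classification loss on the shared embedding."""
    flat = poses.reshape(poses.shape[0], -1)
    z = encode(params, poses)

    # Autoencoder term: mean squared error between input and reconstruction.
    recon = z @ params["dec"]
    recon_loss = np.mean((recon - flat) ** 2)

    # Classifier term: softmax cross-entropy computed from the same embedding
    # (a linear head here, standing in for the PointNet-based model).
    logits = z @ params["cls"]
    logits = logits - logits.max(axis=1, keepdims=True)
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    class_loss = -np.mean(log_probs[np.arange(len(labels)), labels])

    return recon_loss + weight * class_loss, z


# Usage with synthetic data in place of real MediaPipe output.
rng = np.random.default_rng(0)
params = init_params(rng)
poses = rng.uniform(0.0, 1.0, size=(8, N_LANDMARKS, N_COORDS))
labels = rng.integers(0, N_GESTURES, size=8)
loss, embedding = joint_loss(params, poses, labels)
print(embedding.shape, float(loss))
```

Because both loss terms depend on the same embedding, minimising their sum pushes the two-dimensional space to preserve pose information while also separating gesture classes, which is the intuition behind the joint optimisation the abstract reports.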

Item Type: Conference Proceedings
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Murray-Smith, Professor Roderick and Ge, Ms Xuri and Xu, Ms Songpei and Kaul, Dr Chaitanya
Authors: Xu, S., Kaul, C., Ge, X., and Murray-Smith, R.
College/School: College of Science and Engineering > School of Computing Science
ISBN: 9781728163277
Published Online: 05 May 2023
Copyright Holders: Copyright © 2023 IEEE
First Published: First published in IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP 2023)
Publisher Policy: Reproduced in accordance with the publisher copyright policy

Project Code | Award No | Project Name | Principal Investigator | Funder's Name | Funder Ref | Lead Dept
300982 | | Exploiting Closed-Loop Aspects in Computationally and Data Intensive Analytics | Roderick Murray-Smith | Engineering and Physical Sciences Research Council (EPSRC) | EP/R018634/1 | Computing Science
190841 | | UK Quantum Technology Hub in Enhanced Quantum Imaging | Miles Padgett | Engineering and Physical Sciences Research Council (EPSRC) | EP/M01326X/1 | P&S - Physics & Astronomy