VILT: Video Instructions Linking for Complex Tasks

Fischer, S., Gemmell, C., Mackie, I. and Dalton, J. (2022) VILT: Video Instructions Linking for Complex Tasks. In: 2nd International Workshop on Interactive Multimedia Retrieval (IMuR’22), 14th October 2022, Lisbon, Portugal, pp. 41-47. ISBN 9781450394970 (doi: 10.1145/3552467.3554794)

Full text: 277092.pdf (Accepted Version, 2MB)
Available under License Creative Commons Attribution.

Abstract

This work addresses challenges in developing conversational assistants that support rich multimodal video interactions for accomplishing real-world tasks. We introduce the task of automatically linking instructional videos to task steps, which we call "Video Instructions Linking for Complex Tasks" (VILT). Specifically, we focus on the cooking domain and on empowering users to cook meals interactively with a video-enabled Alexa skill. We create a reusable benchmark with 61 queries from recipe tasks and curate a collection of 2,133 instructional "How-To" cooking videos. Studying VILT with state-of-the-art retrieval methods, we find that dense retrieval with ANCE is the most effective, achieving an NDCG@3 of 0.566 and P@1 of 0.644. We also conduct a user study that measures the effect of incorporating videos in a real-world task setting, in which 10 participants perform several cooking tasks under varying multimodal experimental conditions using a state-of-the-art Alexa TaskBot system. Users interacting with manually linked videos said they learned something new 64% of the time, 9 percentage points more than users interacting with automatically linked videos (55%), indicating that linked video relevance is important for task learning.
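
For readers unfamiliar with the two retrieval metrics quoted in the abstract, the following is a minimal sketch of how P@k and NDCG@k are commonly computed for a single query. It assumes graded relevance labels and a linear-gain DCG with a log2 rank discount; the record does not state which NDCG variant or relevance scale the authors used, and the function names and example labels here are purely illustrative.

```python
import math
from typing import Optional, Sequence


def precision_at_k(relevances: Sequence[int], k: int, threshold: int = 1) -> float:
    """Fraction of the top-k ranked items whose label is >= threshold."""
    return sum(1 for r in relevances[:k] if r >= threshold) / k


def dcg_at_k(relevances: Sequence[int], k: int) -> float:
    """Discounted cumulative gain with linear gain and log2(rank + 1) discount."""
    return sum(r / math.log2(rank + 1)
               for rank, r in enumerate(relevances[:k], start=1))


def ndcg_at_k(relevances: Sequence[int], k: int,
              judged_pool: Optional[Sequence[int]] = None) -> float:
    """DCG of the ranking divided by the DCG of the ideal ordering.

    judged_pool holds the labels of all judged items for the query; if it is
    omitted, the ideal ordering is built from the ranked list itself.
    """
    pool = judged_pool if judged_pool is not None else relevances
    ideal_dcg = dcg_at_k(sorted(pool, reverse=True), k)
    return dcg_at_k(relevances, k) / ideal_dcg if ideal_dcg > 0 else 0.0


# Hypothetical labels for one query: the ranked list as retrieved, and the
# labels of every judged video for that query (assumed 0-2 scale).
ranked = [2, 0, 1]
pool = [2, 1, 1, 0]
p1 = precision_at_k(ranked, k=1)
ndcg3 = ndcg_at_k(ranked, k=3, judged_pool=pool)
```

A benchmark-level score such as those reported above would typically be the mean of these per-query values over all queries.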

Item Type: Conference Proceedings
Additional Information: This work is supported by the Engineering and Physical Sciences Research Council grant EP/V025708/1. It was also supported by the Amazon Alexa Prize TaskBot challenge.
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Gemmell, Mr Carlos and Fischer, Ms Sophie and Mackie, Iain and Dalton, Dr Jeff
Authors: Fischer, S., Gemmell, C., Mackie, I., and Dalton, J.
College/School: College of Science and Engineering > School of Computing Science
ISBN: 9781450394970
Copyright Holders: Copyright © 2022 The Authors
First Published: First published in IMuR '22: Proceedings of the 2nd International Workshop on Interactive Multimedia Retrieval
Publisher Policy: Reproduced in accordance with the publisher copyright policy

Project Code: 310549
Award No: (blank in record)
Project Name: Dalton-UKRI-Turing Fellow
Principal Investigator: Jeff Dalton
Funder's Name: Engineering and Physical Sciences Research Council (EPSRC)
Funder Ref: EP/V025708/1
Lead Dept: Computing Science