Query Embedding Pruning for Dense Retrieval

Tonellotto, N. and Macdonald, C. (2021) Query Embedding Pruning for Dense Retrieval. In: 30th ACM International Conference on Information and Knowledge Management, Virtual Event Queensland, Australia, 01-05 Nov 2021, pp. 3453-3457. ISBN 9781450384469 (doi: 10.1145/3459637.3482162)

Text: 249267.pdf - Accepted Version (875kB)

Abstract

Recent advances in dense retrieval techniques have offered the promise of being able not just to re-rank documents using contextualised language models such as BERT, but also to use such models to identify documents from the collection in the first place. However, when using dense retrieval approaches that use multiple embedded representations for each query, a large number of documents can be retrieved for each query, hindering the efficiency of the method. Hence, this work is the first to consider efficiency improvements in the context of a dense retrieval approach (namely ColBERT), by pruning query term embeddings that are estimated not to be useful for retrieving relevant documents. Our proposed query embedding pruning reduces the cost of the dense retrieval operation, as well as reducing the number of documents that are retrieved and hence need to be fully scored. Experiments conducted on the MSMARCO passage ranking corpus demonstrate that, when reducing the number of query embeddings used from 32 to 3 based on the collection frequency of the corresponding tokens, query embedding pruning results in no statistically significant differences in effectiveness, while reducing the number of documents retrieved by 70%. In terms of mean response time for the end-to-end system, this results in a 2.65x speedup.
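
The pruning step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `prune_query_embeddings`, the toy tokens, the two-dimensional embeddings and the frequency values are all invented for illustration. It also assumes that the embeddings kept are those of the tokens with the lowest collection frequency; the abstract states only that pruning is based on collection frequency.

```python
from typing import Dict, List, Sequence, Tuple

Embedding = List[float]


def prune_query_embeddings(
    tokens: Sequence[str],
    embeddings: Sequence[Embedding],
    collection_freq: Dict[str, int],
    keep: int = 3,
) -> List[Tuple[str, Embedding]]:
    """Keep the `keep` query embeddings whose tokens are rarest in the collection.

    Assumption (not stated in the abstract): lower collection frequency
    means the embedding is more useful and is therefore retained.
    """
    if len(tokens) != len(embeddings):
        raise ValueError("tokens and embeddings must have the same length")
    # Rank token positions by ascending collection frequency.
    # Tokens missing from the table are treated as frequency 0 (assumption).
    ranked = sorted(
        range(len(tokens)),
        key=lambda i: (collection_freq.get(tokens[i], 0), i),
    )
    # Restore the original query order among the retained positions.
    kept = sorted(ranked[:keep])
    return [(tokens[i], embeddings[i]) for i in kept]


# Toy data: hypothetical frequencies and placeholder 2-d embeddings.
tokens = ["what", "is", "the", "capital", "of", "mongolia"]
embeddings = [[float(i), 0.0] for i in range(len(tokens))]
collection_freq = {
    "what": 900_000,
    "is": 5_000_000,
    "the": 9_000_000,
    "capital": 40_000,
    "of": 7_000_000,
    "mongolia": 3_000,
}

pruned = prune_query_embeddings(tokens, embeddings, collection_freq, keep=3)
print([token for token, _ in pruned])  # ['what', 'capital', 'mongolia']
```

In a dense retrieval pipeline, only the retained embeddings would then be used for the nearest-neighbour lookup, which is where the reduction in retrieved documents reported in the abstract would come from.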

Item Type: Conference Proceedings
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Macdonald, Dr Craig and Tonellotto, Dr Nicola
Authors: Tonellotto, N., and Macdonald, C.
College/School: College of Science and Engineering > School of Computing Science
ISBN: 9781450384469
Copyright Holders: Copyright © 2021 Association for Computing Machinery
First Published: First published in CIKM '21: Proceedings of the 30th ACM International Conference on Information & Knowledge Management
Publisher Policy: Reproduced in accordance with the publisher copyright policy

Project Code: 300982
Award No:
Project Name: Exploiting Closed-Loop Aspects in Computationally and Data Intensive Analytics
Principal Investigator: Roderick Murray-Smith
Funder's Name: Engineering and Physical Sciences Research Council (EPSRC)
Funder Ref: EP/R018634/1
Lead Dept: Computing Science