Wang, X., Macdonald, C. and Ounis, I. (2022) Improving zero-shot retrieval using dense external expansion. Information Processing and Management, 59(5), 103026. (doi: 10.1016/j.ipm.2022.103026)
Text: 274807.pdf - Published Version. Available under License Creative Commons Attribution. 1MB
Abstract
Pseudo-relevance feedback (PRF) is a classical technique to improve search engine retrieval effectiveness by closing the vocabulary gap between users’ query formulations and the relevant documents. While PRF is typically applied on the same target corpus as the final retrieval, in the past, external expansion techniques have sometimes been applied to obtain a high-quality pseudo-relevant feedback set using an external corpus. However, such external expansion approaches have only been studied for sparse (BoW) retrieval methods, and their effectiveness for recent dense retrieval methods remains under-investigated. Indeed, dense retrieval approaches such as ANCE and ColBERT, which conduct similarity search based on encoded contextualised query and document embeddings, are of increasing importance. Moreover, pseudo-relevance feedback mechanisms have been proposed to further enhance dense retrieval effectiveness. In particular, in this work, we examine the application of dense external expansion to improve zero-shot retrieval effectiveness, i.e. evaluation on corpora without further training. Zero-shot retrieval experiments with six datasets, including two TREC datasets and four BEIR datasets, using the MSMARCO passage collection as the external corpus, indicate that obtaining external feedback documents using ColBERT can significantly improve NDCG@10 for sparse retrieval (by up to 28%) and dense retrieval (by up to 12%). In addition, using ANCE on the external corpus brings up to 30% NDCG@10 improvements for sparse retrieval and up to 29% for dense retrieval.
Item Type: | Articles |
---|---|
Status: | Published |
Refereed: | Yes |
Glasgow Author(s) Enlighten ID: | Macdonald, Professor Craig and Ounis, Professor Iadh and Wang, Ms Xiao |
Creator Roles: | Wang, X.: Conceptualization, Data curation, Formal analysis, Investigation, Methodology, Project administration, Resources, Software, Validation, Visualization, Writing – original draft; Macdonald, C.: Writing – review and editing, Project administration, Investigation, Validation, Supervision, Methodology, Software, Resources, Conceptualization; Ounis, I.: Writing – review and editing, Project administration, Investigation, Supervision, Methodology, Conceptualization |
Authors: | Wang, X., Macdonald, C., and Ounis, I. |
College/School: | College of Science and Engineering > School of Computing Science |
Journal Name: | Information Processing and Management |
Publisher: | Elsevier |
ISSN: | 0306-4573 |
ISSN (Online): | 1873-5371 |
Published Online: | 02 August 2022 |
Copyright Holders: | Copyright © 2022 The Authors |
First Published: | First published in Information Processing and Management 59(5): 103026 |
Publisher Policy: | Reproduced under a Creative Commons License |