A relative information gain-based query performance prediction framework with generated query variants

Datta, S., Ganguly, D., Mitra, M. and Greene, D. (2023) A relative information gain-based query performance prediction framework with generated query variants. ACM Transactions on Information Systems, 41(2), 38. (doi: 10.1145/3545112)

Text: 273315.pdf - Accepted Version (3MB)

Abstract

Query performance prediction (QPP) methods, which aim to predict the performance of a query, often rely on evidence in the form of characteristic patterns in the distribution of Retrieval Status Values (RSVs). However, for neural IR models, RSVs are often less reliable for QPP because they are bounded within short intervals, unlike those of statistical models. To address this limitation, we propose a model-agnostic QPP framework that gathers additional evidence by leveraging the characteristic patterns of RSV distributions computed over a set of automatically generated query variants, relative to that of the current query. Specifically, the idea behind our proposed method, Weighted Relative Information Gain (WRIG), is that a substantial relative decrease or increase in the standard deviation of the RSVs of the query variants is likely to be a relative indicator of how easy or difficult the original query is. To cater for the absence of human-annotated query variants in real-world scenarios, we further propose an automatic query variant generation method. This can produce variants in a controlled manner by substituting terms from the original query with new ones sampled from a weighted distribution, constructed either via a relevance model or with the help of an embedded representation of query terms. Our experiments on the TREC-Robust, ClueWeb09B and MS MARCO datasets show that WRIG, by using these relative changes in QPP estimates, leads to significantly better results than a state-of-the-art baseline method which leverages information from (manually created) query variants by the application of additive smoothing [64]. The results also show that our approach can improve the QPP effectiveness of neural retrieval approaches in particular.
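
The two ideas in the abstract (comparing the RSV spread of a query with that of its variants, and generating variants by weighted term substitution) can be illustrated with a minimal Python sketch. This is not the paper's WRIG formula, which the abstract does not give. The function names, the top-k cutoff, the weighted-mean aggregation, and the candidate term weights are all assumptions made for illustration.

```python
import random
import statistics


def rsv_spread(scores, k=10):
    """Population standard deviation of the top-k retrieval status values."""
    top = sorted(scores, reverse=True)[:k]
    return statistics.pstdev(top)


def weighted_relative_change(query_scores, variant_scores, weights, k=10):
    """Weighted mean of the relative change in RSV spread, query vs. variants.

    Illustrative aggregation only (an assumption): a positive value means
    the variants' RSVs are, on average, less spread out than the query's.
    """
    base = rsv_spread(query_scores, k)
    total = sum(weights)
    if base == 0 or total == 0:
        return 0.0
    changes = [(base - rsv_spread(v, k)) / base for v in variant_scores]
    return sum(w * c for w, c in zip(weights, changes)) / total


def generate_variant(query_terms, candidates, rng):
    """Replace one randomly chosen query term with a candidate term.

    `candidates` maps term -> weight (hypothetical values); in the abstract
    such weights would come from a relevance model or term embeddings.
    """
    terms = list(query_terms)
    # Exclude terms already in the query so the variant always differs.
    pool = {t: w for t, w in candidates.items() if t not in terms}
    if not pool:
        return terms
    position = rng.randrange(len(terms))
    terms[position] = rng.choices(list(pool), weights=list(pool.values()), k=1)[0]
    return terms
```

In this sketch the per-variant weights would typically reflect how similar each variant is to the original query, so that closer variants contribute more to the estimate; that choice is likewise an assumption.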

Item Type: Articles
Additional Information: This work was partially supported by Science Foundation Ireland (SFI) under Grant Number SFI/12/RC/2289_P2.
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Ganguly, Dr Debasis
Authors: Datta, S., Ganguly, D., Mitra, M., and Greene, D.
College/School: College of Science and Engineering > School of Computing Science
Journal Name: ACM Transactions on Information Systems
Publisher: ACM Press
ISSN: 1046-8188
ISSN (Online): 1558-2868
Published Online: 23 June 2022
Copyright Holders: Copyright © 2022 Association for Computing Machinery
First Published: First published in ACM Transactions on Information Systems 41(2): 38
Publisher Policy: Reproduced in accordance with the publisher copyright policy
