A study of the Dirichlet Priors for term frequency normalisation

He, B. and Ounis, I. (2005) A study of the Dirichlet Priors for term frequency normalisation. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Salvador, Brazil, 15-19 August 2005, pp. 465-471. ISBN 1595930345 (doi: 10.1145/1076034.1076114)

Publisher's URL: http://doi.acm.org/10.1145/1076034.1076114

Abstract

In Information Retrieval (IR), the Dirichlet Priors have been applied to the smoothing technique of the language modeling approach. In this paper, we apply the Dirichlet Priors to the term frequency normalisation of the classical BM25 probabilistic model and the Divergence from Randomness PL2 model. The contributions of this paper are twofold. First, through extensive experiments on four TREC collections, we show that the newly generated models, to which the Dirichlet Priors normalisation is applied, provide robust and effective performance. Second, we propose a novel theoretically-driven approach to the automatic parameter tuning of the Dirichlet Priors normalisation. Experiments show that this tuning approach optimises the retrieval performance of the newly generated Dirichlet Priors-based weighting models.

Item Type:Conference Proceedings
Additional Information:© ACM, 2005. This is the author's version of the work. It is posted here by permission of ACM for your personal use. Not for redistribution. The definitive version was published in Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, (2005) http://doi.acm.org/10.1145/1076034.1076114
Keywords:Term frequency normalisation, weighting model, Dirichlet Priors.
Status:Published
Refereed:Yes
Glasgow Author(s) Enlighten ID:He, Mr Ben and Ounis, Professor Iadh
Authors: He, B., and Ounis, I.
Subjects:Q Science > QA Mathematics > QA75 Electronic computers. Computer science
Q Science > QA Mathematics > QA76 Computer software
College/School:College of Science and Engineering > School of Computing Science
Publisher:ACM
ISBN:1595930345
Copyright Holders:Copyright © 2005 ACM
First Published:First published in Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher Policy:Reproduced in accordance with the copyright policy of the publisher.
