A Python Interface to PISA!

MacAvaney, S. and Macdonald, C. (2022) A Python Interface to PISA! In: SIGIR 2022: 45th International ACM SIGIR Conference on Research and Development in Information Retrieval, Madrid, Spain, 11-15 Jul 2022, pp. 3339-3344. ISBN 9781450387323 (doi: 10.1145/3477495.3531656)

Text: 268397.pdf - Accepted Version



PISA (Performant Indexes and Search for Academia) provides very efficient implementations of various retrieval algorithms over sparse inverted indices. The highly optimized C++ implementation, however, has previously been accessible only via command-line tools. From indexing to retrieval, 5-6 commands need to be executed in sequence, making the process relatively involved. Further complications when using PISA include a lengthy build process and minimal interoperability with other tools. In this work, we demonstrate a new tool that provides a native Python wrapper around PISA. The wrapper features a simplified interface that adheres to the PyTerrier API, making it easy to use (e.g., via Pandas DataFrames), apply to a multitude of datasets (e.g., those from the ir_datasets package), and combine with other methods (e.g., neural re-ranking and dense retrieval methods).
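As a rough illustration of what "retrieval over a sparse inverted index" means, the following pure-Python toy builds a postings list per term and scores documents with a common BM25 variant. It is only a sketch under stated assumptions: it is not PISA's implementation or the wrapper's API, the function names `build_index` and `bm25_search` are hypothetical, and the output columns (`qid`, `docno`, `score`, 0-based `rank`) are merely modeled on the result format PyTerrier uses.

```python
import math
from collections import Counter, defaultdict


def build_index(docs):
    """Map each term to a postings list of (docno, term frequency)."""
    postings = defaultdict(list)
    doc_len = {}
    for docno, text in docs.items():
        terms = text.lower().split()
        doc_len[docno] = len(terms)
        for term, tf in Counter(terms).items():
            postings[term].append((docno, tf))
    return postings, doc_len


def bm25_search(query, postings, doc_len, qid="q1", k1=1.2, b=0.4, k=10):
    """Score documents term-at-a-time with BM25 and return the top k rows."""
    n_docs = len(doc_len)
    avg_len = sum(doc_len.values()) / n_docs
    scores = defaultdict(float)
    for term in query.lower().split():
        plist = postings.get(term, [])
        if not plist:
            continue  # term does not occur in the collection
        idf = math.log(1 + (n_docs - len(plist) + 0.5) / (len(plist) + 0.5))
        for docno, tf in plist:
            norm = tf + k1 * (1 - b + b * doc_len[docno] / avg_len)
            scores[docno] += idf * tf * (k1 + 1) / norm
    # Sort by descending score, breaking ties by docno for determinism.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))[:k]
    return [
        {"qid": qid, "docno": docno, "score": score, "rank": rank}
        for rank, (docno, score) in enumerate(ranked)
    ]


# Hypothetical three-document collection, for illustration only.
docs = {
    "d1": "efficient retrieval over inverted indices",
    "d2": "neural re-ranking of search results",
    "d3": "inverted indices and efficient query processing",
}
postings, doc_len = build_index(docs)
results = bm25_search("efficient inverted indices", postings, doc_len)
for row in results:
    print(row)
```

Only documents that share at least one term with the query appear in the output, which is why a sparse index can skip most of the collection; engines such as PISA add compressed postings and dynamic pruning on top of this basic idea.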

Item Type: Conference Proceedings
Glasgow Author(s) Enlighten ID: MacAvaney, Dr Sean and Macdonald, Professor Craig
Authors: MacAvaney, S., and Macdonald, C.
College/School: College of Science and Engineering > School of Computing Science
Copyright Holders: Copyright © 2022 Association for Computing Machinery
First Published: First published in SIGIR 2022: 45th International ACM SIGIR Conference on Research and Development in Information Retrieval: 3339-3344
Publisher Policy: Reproduced in accordance with the publisher copyright policy

Project Code | Award No | Project Name | Principal Investigator | Funder's Name | Funder Ref | Lead Dept
300982 | | Exploiting Closed-Loop Aspects in Computationally and Data Intensive Analytics | Roderick Murray-Smith | Engineering and Physical Sciences Research Council (EPSRC) | EP/R018634/1 | Computing Science