Bermejo, P., Hopfgartner, F., Gamez, J., Callejon, J. and Jose, J. (2009) Comparison of Balancing Techniques for Multimedia IR over Imbalanced Datasets. In: 24th International Symposium on Computer and Information Sciences, 2009. ISCIS 2009, Guzelyurt, Turkey, 14-16 Sep 2009, pp. 674-679. ISBN 9781424450213 (doi: 10.1109/ISCIS.2009.5291904)
Text: 39579.pdf - Accepted Version, 295kB
Abstract
A promising method to improve the performance of information retrieval systems is to approach retrieval tasks as a supervised classification problem. Previous user interactions, e.g. gathered from a thorough log file analysis, can be used to train classifiers which aim to infer the relevance of retrieved documents based on user interactions. A problem with this approach, however, is the large imbalance ratio between relevant and non-relevant documents in the collection. In standard test collections, as used in academic evaluation frameworks such as TREC, non-relevant documents outnumber relevant documents by far. In this work, we address this imbalance problem in the multimedia domain. We focus on the logs of two multimedia user studies which are highly imbalanced. We compare a naïve solution of randomly deleting documents belonging to the majority class with various balancing algorithms coming from different fields: data classification and text classification. Our experiments indicate that all of these algorithms improve on the classification performance obtained by just deleting documents at random from the dominant class.
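The naïve baseline mentioned in the abstract, randomly deleting documents from the majority class, can be illustrated with a minimal sketch. This is not the authors' implementation. The function name `random_undersample`, the 1:1 target ratio, the fixed seed and the toy data are all assumptions made for illustration only.

```python
import random
from collections import Counter


def random_undersample(samples, labels, majority_label, seed=0):
    """Randomly drop majority-class items until both groups are the same size.

    Hypothetical helper: the 1:1 target ratio is an assumption, not a
    setting taken from the paper.
    """
    rng = random.Random(seed)
    majority = [i for i, y in enumerate(labels) if y == majority_label]
    minority = [i for i, y in enumerate(labels) if y != majority_label]
    # Keep at most as many majority items as there are minority items.
    keep = rng.sample(majority, min(len(majority), len(minority)))
    kept = sorted(minority + keep)  # preserve the original ordering
    return [samples[i] for i in kept], [labels[i] for i in kept]


# Toy data (invented): 2 relevant vs 8 non-relevant documents.
docs = [f"doc{i}" for i in range(10)]
rel = ["relevant"] * 2 + ["non-relevant"] * 8

balanced_docs, balanced_rel = random_undersample(docs, rel, "non-relevant")
print(Counter(balanced_rel))
```

On the toy data, both relevant documents are retained and two of the eight non-relevant documents are kept at random, so the classifier would be trained on a balanced set at the cost of discarding most of the majority class.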
Item Type: | Conference Proceedings |
---|---|
Status: | Published |
Refereed: | Yes |
Glasgow Author(s) Enlighten ID: | Jose, Professor Joemon and Hopfgartner, Dr Frank |
Authors: | Bermejo, P., Hopfgartner, F., Gamez, J., Callejon, J., and Jose, J. |
Subjects: | Z Bibliography. Library Science. Information Resources > Z665 Library Science. Information Science; Q Science > QA Mathematics > QA75 Electronic computers. Computer science |
College/School: | College of Arts & Humanities > School of Humanities > Information Studies; College of Science and Engineering > School of Computing Science |
ISBN: | 9781424450213 |
Copyright Holders: | Copyright © 2009 IEEE |
First Published: | First published in Computer and Information Sciences, 2009: |
Publisher Policy: | Reproduced in accordance with the copyright policy of the publisher |