Optimised Scheduling of Online Experiments

Kharitonov, E., Macdonald, C. , Serdyukov, P. and Ounis, I. (2015) Optimised Scheduling of Online Experiments. In: 38th Annual ACM SIGIR Conference (SIGIR 2015), Santiago, Chile, 9-13 Aug 2015, pp. 453-462. ISBN 9781450336215 (doi: 10.1145/2766462.2767706)

105811.pdf - Accepted Version


Publisher's URL: http://www.sigir2015.org/


Modern search engines increasingly rely on online evaluation methods such as A/B tests and interleaving. These online evaluation methods make use of interactions by the search engine's users to test various changes in the search engine. However, since the number of the user sessions per unit of time is limited, the number of simultaneously running on-line evaluation experiments is bounded. In an extreme case, it might be impossible to deploy all experiments since they arrive faster than are proccessed. Consequently, it is very important to efficiently use the limited resource of the user's interactions. In this paper, we formulate the novel problem of schedule optimisation for the queue of the online experiments: given a limited number of the user interactions available for experimentation, we want to re-order the queue so that the number of successful experiments is maximised. In order to build a schedule optimisation algorithm, we start by formulating a model of an online experimentation pipeline. Next, we propose to reduce the task of finding the optimal schedule to a learning-to-rank problem, where we require the most promising experiments to be ranked first in the schedule. To evaluate the proposed approach, we perform an evaluation study using two datasets containing 82 interleaving and 35 A/B test experiments, performed by a commercial search engine. We measure the quality of a schedule as the number of successful experiments executed under the limited resource of the user interactions. Our proposed schedulers obtain improvements of up to 342% compared to the un-optimised baseline schedule on the dataset of interleaving experiments and up to 43% on the dataset of A/B tests.

Item Type:Conference Proceedings
Glasgow Author(s) Enlighten ID:Macdonald, Professor Craig and Ounis, Professor Iadh
Authors: Kharitonov, E., Macdonald, C., Serdyukov, P., and Ounis, I.
College/School:College of Science and Engineering > School of Computing Science
Copyright Holders:Copyright © 2015 ACM
Publisher Policy:Reproduced in accordance with the copyright policy of the publisher

University Staff: Request a correction | Enlighten Editors: Update this record