Enlighten Publications

Partially Observable Reinforcement Learning for Dialog-based Interactive Recommendation

Wu, Y., Macdonald, C. and Ounis, I. (2021) Partially Observable Reinforcement Learning for Dialog-based Interactive Recommendation. In: 15th ACM Conference on Recommender Systems (RecSys21), Amsterdam, The Netherlands, 27 Sep - 01 Oct 2021, pp. 241-251. (doi: 10.1145/3460231.3474256)

Text
246701.pdf - Accepted Version
2MB

Abstract

In a dialog-based interactive recommendation task, users can express natural-language feedback when interacting with the recommender system. However, the users’ feedback, which takes the form of natural-language critiques of the recommendation at each iteration, allows the recommender system to obtain only a partial portrayal of the users’ preferences. Such partial observations make it challenging to correctly track the users’ preferences over time, which can result in poor recommendation performance and less effective satisfaction of the users’ information needs when the number of iterations is limited. Reinforcement learning, in the form of a partially observable Markov decision process (POMDP), can simulate the interactions between a partially observable environment (i.e. a user) and an agent (i.e. a recommender system). To alleviate this partial observation issue, we propose a novel dialog-based recommendation model, the Estimator-Generator-Evaluator (EGE) model, with Q-learning for POMDP, to effectively incorporate the users’ preferences over time. Specifically, we leverage an Estimator to track and estimate users’ preferences, a Generator to match the estimated preferences with the candidate items to rank the next recommendations, and an Evaluator to judge the quality of the estimated preferences considering the users’ historical feedback. Following previous work, we train our EGE model using a user simulator, which is itself trained to describe, in natural language, the differences between the target users’ preferences and the recommended items. Extensive experiments conducted on two recommendation datasets of fashion product images (dresses and shoes) demonstrate that our proposed EGE model yields significant improvements over the existing state-of-the-art baseline models.

Item Type:	Conference Proceedings
Additional Information:	The authors acknowledge support from EPSRC grant EP/R018634/1, entitled Closed-Loop Data Science for Complex, Computationally and Data-Intensive Analytics.
Status:	Published
Refereed:	Yes
Glasgow Author(s) Enlighten ID:	Macdonald, Professor Craig and Wu, Mr Yaxiong and Ounis, Professor Iadh
Authors:	Wu, Y., Macdonald, C., and Ounis, I.
College/School:	College of Science and Engineering > School of Computing Science
Copyright Holders:	Copyright © 2021 Association for Computing Machinery
Publisher Policy:	Reproduced in accordance with the copyright policy of the publisher

Funder and Project Information

Project Code	Award No	Project Name	Principal Investigator	Funder's Name	Funder Ref	Lead Dept
300982		Exploiting Closed-Loop Aspects in Computationally and Data Intensive Analytics	Roderick Murray-Smith	Engineering and Physical Sciences Research Council (EPSRC)	EP/R018634/1	Computing Science

Deposit and Record Details

ID Code:	246701
Depositing User:	Mr Alastair Arthur
Datestamp:	13 Jul 2021 08:40
Last Modified:	07 Nov 2022 13:31
Date of acceptance:	6 July 2021
Date of first online publication:	13 September 2021
Date Deposited:	3 August 2021
Data Availability Statement:	No