Contrastive State Augmentations for Reinforcement Learning-Based Recommender Systems

Ren, Z., Huang, N., Wang, Y., Ren, P., Ma, J., Lei, J., Shi, X., Luo, H., Jose, J. M. and Xin, X. (2023) Contrastive State Augmentations for Reinforcement Learning-Based Recommender Systems. In: 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '23), Taipei, Taiwan, 23-27 July 2023, pp. 922-931. ISBN 9781450394086 (doi: 10.1145/3539618.3591656)

Full text: 299230.pdf - Accepted Version (1MB)

Abstract

Learning reinforcement learning (RL)-based recommenders from historical user-item interaction sequences is vital for generating high-reward recommendations and improving long-term cumulative benefits. However, existing RL recommendation methods have difficulty (i) estimating the value functions of states that are not contained in the offline training data, and (ii) learning effective state representations from implicit user feedback, owing to the lack of contrastive signals. In this work, we propose contrastive state augmentations (CSA) for training RL-based recommender systems. To tackle the first issue, we propose four state augmentation strategies that enlarge the state space of the offline data. This improves the generalization capability of the recommender by making the RL agent visit local state regions and by ensuring that the learned value functions are similar between the original and augmented states. For the second issue, we introduce contrastive signals between augmented states and states randomly sampled from other sessions to further improve state representation learning. To verify the effectiveness of CSA, we conduct extensive experiments on two publicly accessible datasets and one dataset collected from a real-life e-commerce platform. We also conduct experiments in a simulated environment as the online evaluation setting. Experimental results demonstrate that CSA can effectively improve recommendation performance.
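
The two ideas in the abstract (a value-consistency term between original and augmented states, and a contrastive term against a state from another session) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the abstract does not name the four augmentation strategies, so item masking and cropping stand in as hypothetical examples; the mean-of-embeddings encoder replaces a real sequence model; and the InfoNCE-style loss with a single negative is one common choice of contrastive objective, not necessarily the one the authors use.

```python
import math
import random

MASK_ID = 0  # hypothetical mask/padding item id, assumed unused by real items


def mask_items(session, ratio, rng):
    """Replace a random subset of items with MASK_ID (illustrative augmentation)."""
    n_mask = max(1, int(len(session) * ratio))
    positions = set(rng.sample(range(len(session)), n_mask))
    return [MASK_ID if i in positions else item for i, item in enumerate(session)]


def crop_items(session, keep, rng):
    """Keep a random contiguous window of `keep` items (illustrative augmentation)."""
    keep = min(keep, len(session))
    start = rng.randint(0, len(session) - keep)
    return session[start:start + keep]


def item_vector(item, dim=8):
    """Toy fixed embedding: a pseudo-random vector seeded by the item id."""
    if item == MASK_ID:
        return [0.0] * dim
    gen = random.Random(item)
    return [gen.uniform(-1.0, 1.0) for _ in range(dim)]


def encode_state(session, dim=8):
    """Toy state encoder: mean of item vectors (stands in for a sequence model)."""
    vectors = [item_vector(item, dim) for item in session]
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def cosine(u, v):
    """Cosine similarity, defined as 0.0 when either vector is all zeros."""
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return sum(x * y for x, y in zip(u, v)) / (norm_u * norm_v)


def contrastive_loss(anchor, positive, negative, temperature=0.5):
    """InfoNCE-style loss with one negative: low when anchor is closer to positive."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = math.exp(cosine(anchor, negative) / temperature)
    return -math.log(pos / (pos + neg))


def value_consistency(q_original, q_augmented):
    """Mean squared gap between Q-values of the original and augmented state."""
    gaps = [(a - b) ** 2 for a, b in zip(q_original, q_augmented)]
    return sum(gaps) / len(gaps)


if __name__ == "__main__":
    rng = random.Random(0)
    session = [3, 7, 7, 12, 5]   # item ids from one user session
    other_session = [21, 4, 9]   # a state sampled from a different session
    augmented = mask_items(session, 0.2, rng)
    loss = contrastive_loss(
        encode_state(session), encode_state(augmented), encode_state(other_session)
    )
    print(augmented, loss)
```

In a training loop these two terms would be added to the usual RL objective with weighting coefficients; the Q-value lists passed to `value_consistency` are assumed to come from whatever value network the recommender uses.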

Item Type: Conference Proceedings
Additional Information: This research was funded by the Natural Science Foundation of China (62272274, 61972234, 62072279, 62102234, 62202271), Meituan, the Natural Science Foundation of Shandong Province (ZR2021QF129), the Key Scientific and Technological Innovation Program of Shandong Province (2019JZZY010129), the Shandong University multidisciplinary research and innovation team of young scholars (No. 2020QNQT017), the Tencent WeChat Rhino-Bird Focused Research Program (JR-WXG2021411), and the Fundamental Research Funds of Shandong University.
Keywords: Recommender system, reinforcement learning, contrastive learning, data augmentation, sequential recommendation.
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Jose, Professor Joemon
Authors: Ren, Z., Huang, N., Wang, Y., Ren, P., Ma, J., Lei, J., Shi, X., Luo, H., Jose, J. M., and Xin, X.
College/School: College of Science and Engineering > School of Computing Science
ISBN: 9781450394086
Copyright Holders: Copyright © 2023 held by the owner/author(s)
First Published: First published in SIGIR '23: Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval
Publisher Policy: Reproduced in accordance with the publisher copyright policy
