Weakly supervised deep metric learning on discrete metric spaces for privacy-preserved clustering

Biswas, C., Ganguly, D., Roy, D. and Bhattacharya, U. (2022) Weakly supervised deep metric learning on discrete metric spaces for privacy-preserved clustering. Information Processing and Management, 60(1), 103109. (doi: 10.1016/j.ipm.2022.103109)

Full text not currently available from Enlighten.

Abstract

As the number of data instances grows, data processing operations (e.g. clustering) require an increasing amount of computational resources, and for considerably large datasets such operations often cannot be executed on a single workstation. This requires the use of a server computer for carrying out the operations. However, to ensure privacy of the shared data, a privacy-preserving data processing workflow applies an encoding transformation to the set of data points before the computation. This encoding should ideally cater to two objectives: first, it should be difficult to reconstruct the data; second, the results of the operation executed on the encoded space should be as close as possible to the results of the same operation executed on the original data. While standard encoding mechanisms, such as locality sensitive hashing, cater to the first objective, the second objective may not always be adequately satisfied. In this paper, we specifically focus on ‘clustering’ as the data processing operation. We apply a deep metric learning approach to learn a parameterized encoding transformation function with the objective of maximizing the alignment of the clusters in the encoded space with those in the original data. We conduct experiments on four standard benchmark datasets: MNIST and Fashion-MNIST (each containing 70K grayscale images), CIFAR-10 (60K color images), and 20-Newsgroups (18K news articles). Our experiments demonstrate that the proposed method yields better clusters than approaches in which the encoding process is agnostic of the clustering objective.
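
To make the two objectives concrete, the following is a minimal sketch of a clustering-agnostic baseline of the kind the abstract contrasts against. It is not the paper's learned encoder. It assumes sign random projection (a common locality sensitive hashing scheme) as the encoding and the plain Rand index as the measure of agreement between a clustering of the original data and a clustering of the encoded data. The function names `simhash_encode`, `hamming` and `rand_index` are illustrative, and the paper may use a different hashing scheme and evaluation metric.

```python
import random


def simhash_encode(points, n_bits, seed=0):
    """Encode each real-valued vector as a tuple of n_bits sign bits.

    Assumption: sign random projection stands in for a generic,
    clustering-agnostic LSH encoding; it is not the learned encoder.
    """
    rng = random.Random(seed)
    dim = len(points[0])
    # One random Gaussian hyperplane per output bit.
    planes = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]
    codes = []
    for x in points:
        bits = tuple(
            1 if sum(w * v for w, v in zip(plane, x)) >= 0 else 0
            for plane in planes
        )
        codes.append(bits)
    return codes


def hamming(a, b):
    """Distance in the discrete (encoded) space: number of differing bits."""
    return sum(p != q for p, q in zip(a, b))


def rand_index(labels_a, labels_b):
    """Fraction of point pairs on which two clusterings agree.

    A pair counts as agreeing when both clusterings put the two points
    in the same cluster, or both put them in different clusters.
    """
    n = len(labels_a)
    agree = 0
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            same_a = labels_a[i] == labels_a[j]
            same_b = labels_b[i] == labels_b[j]
            agree += same_a == same_b
            total += 1
    return agree / total


if __name__ == "__main__":
    points = [[1.0, 0.9], [0.9, 1.0], [-1.0, -0.8]]
    codes = simhash_encode(points, n_bits=16)
    # Nearby vectors tend to share more bits than distant ones.
    print(hamming(codes[0], codes[1]), hamming(codes[0], codes[2]))
    # Cluster labels are arbitrary, so a relabelled clustering agrees fully.
    print(rand_index([0, 0, 1, 1], [1, 1, 0, 0]))
```

In this sketch the server would only see the bit codes and would cluster them under Hamming distance. A Rand index close to 1 between that clustering and one computed on the original vectors would indicate that the second objective is being met. The first objective is only loosely addressed here, since sign bits discard vector magnitudes but offer no formal privacy guarantee.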

Item Type: Articles
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Ganguly, Dr Debasis
Authors: Biswas, C., Ganguly, D., Roy, D., and Bhattacharya, U.
College/School: College of Science and Engineering > School of Computing Science
Journal Name: Information Processing and Management
Publisher: Elsevier
ISSN: 0306-4573
ISSN (Online): 1873-5371
Published Online: 19 October 2022
