[1] E. S. Cross, R. Hortensius, and A. Wykowska, “From social brains to
social robots: applying neurocognitive insights to human-robot interaction,” Philosophical Transactions of the Royal Society B: Biological
Sciences, vol. 374, no. 1771, p. 20180024, 2019.
[2] A. Henschel, G. Laban, and E. S. Cross, “What Makes a Robot Social?
A Review of Social Robots from Science Fiction to a Home or Hospital
Near You,” Current Robotics Reports, no. 2, pp. 9–19, 2021.
[3] C. Catmur, E. S. Cross, and H. Over, “Understanding self and others:
from origins to disorders,” Philosophical Transactions of the Royal
Society B: Biological Sciences, vol. 371, no. 1686, p. 20150066, 2016.
[4] D. Premack and G. Woodruff, “Does the chimpanzee have a theory of
mind?,” Behavioral and Brain Sciences, vol. 1, no. 4, pp. 515–526, 1978.
[5] L. J. Byom and B. Mutlu, “Theory of mind: mechanisms, methods, and
new directions,” Frontiers in Human Neuroscience, vol. 7, p. 413, Aug.
2013.
[6] A. Kappas, R. Stower, and E. J. Vanman, “Communicating with robots:
What we do wrong and what we do right in artificial social intelligence,
and what we need to do better,” 2020.
[7] C. Antaki, R. Barnes, and I. Leudar, “Diagnostic formulations in
psychotherapy,” Discourse Studies, vol. 7, no. 6, pp. 627–647, 2005.
[8] H. Kreiner and Y. Levi-Belz, “Self-Disclosure Here and Now: Combining Retrospective Perceived Assessment With Dynamic Behavioral
Measures,” Frontiers in Psychology, vol. 10, p. 558, 2019.
[9] J. Omarzu, “A Disclosure Decision Model: Determining How and When
Individuals Will Self-Disclose,” Pers Soc Psychol Rev, vol. 4, no. 2,
pp. 174–185, 2000.
[10] G. Laban, J.-N. George, V. Morrison, and E. S. Cross, “Tell me more!
Assessing interactions with social robots from speech,” Paladyn, Journal
of Behavioral Robotics, vol. 12, no. 1, pp. 136–159, 2021.
[11] G. Laban, V. Morrison, and E. S. Cross, “Let’s talk about it! Subjective
and objective disclosures to social robots,” pp. 328–330, Association for
Computing Machinery, 2020.
[12] S. M. Jourard, Self-disclosure: An experimental analysis of the transparent self. Oxford, England: John Wiley, 1971.
[13] H. Meng, T. Yan, F. Yuan, and H. Wei, “Speech emotion recognition
from 3d log-mel spectrograms with deep learning network,” IEEE
Access, vol. 7, pp. 125868–125881, 2019.
[14] C. Etienne, G. Fidanza, A. Petrovskii, L. Devillers, and B. Schmauch,
“Speech emotion recognition with data augmentation and layer-wise
learning rate adjustment,” CoRR, vol. abs/1802.05630, 2018.
[15] F. Eyben, K. R. Scherer, B. W. Schuller, J. Sundberg, E. André,
C. Busso, L. Y. Devillers, J. Epps, P. Laukka, S. S. Narayanan, and K. P.
Truong, “The Geneva minimalistic acoustic parameter set (GeMAPS) for
voice research and affective computing,” IEEE Transactions on Affective
Computing, vol. 7, no. 2, pp. 190–202, 2016.
[16] “Anonymized for peer-review process - anonymized version of the paper
available upon request.”
[17] N. Jaitly and G. E. Hinton, “Vocal tract length perturbation (VTLP)
improves speech recognition,” 2013.
[18] C. Kim, M. Shin, A. Garg, and D. Gowda, “Improved vocal tract length
perturbation for a state-of-the-art end-to-end speech recognition system,”
pp. 739–743, Sept. 2019.
[19] I. Rebai, Y. BenAyed, W. Mahdi, and J.-P. Lorré, “Improving speech
recognition using data augmentation and acoustic model fusion,” Procedia
Computer Science, vol. 112, pp. 316–322, 2017. Knowledge-Based
and Intelligent Information Engineering Systems: Proceedings
of the 21st International Conference, KES-2017, 6–8 September 2017,
Marseille, France.
[20] J. H. Ahrens and U. Dieter, “Sequential random sampling,” ACM Trans.
Math. Softw., vol. 11, pp. 157–169, June 1985.
[21] J. Byrd and Z. C. Lipton, “Weighted risk minimization & deep learning,”
CoRR, vol. abs/1812.03372, 2018.
[22] F. Eyben, M. Wöllmer, and B. Schuller, “openSMILE: The Munich
versatile and fast open-source audio feature extractor,” in Proceedings of
the 18th ACM International Conference on Multimedia, MM ’10, (New
York, NY, USA), pp. 1459–1462, Association for Computing Machinery,
2010.
[23] Y. LeCun, B. Boser, J. S. Denker, D. Henderson, R. E. Howard,
W. Hubbard, and L. D. Jackel, “Backpropagation applied to handwritten
zip code recognition,” Neural Computation, vol. 1, no. 4, pp. 541–551,
1989.
[24] N. Kalchbrenner, E. Grefenstette, and P. Blunsom, “A convolutional
neural network for modelling sentences,” CoRR, vol. abs/1404.2188,
2014.
[25] M. Schmitt and B. Schuller, “Deep recurrent neural networks for emotion
recognition in speech,” in Fortschritte der Akustik - DAGA 2018:
Proceedings der 44. Jahrestagung für Akustik, München, Deutschland,
19–22 März 2018 (B. Seeber, ed.), 2018.
[26] T. N. Sainath, O. Vinyals, A. Senior, and H. Sak, “Convolutional,
long short-term memory, fully connected deep neural networks,” in
2015 IEEE International Conference on Acoustics, Speech and Signal
Processing (ICASSP), pp. 4580–4584, 2015.
[27] R. Pascanu, T. Mikolov, and Y. Bengio, “Understanding the exploding
gradient problem,” CoRR, vol. abs/1211.5063, 2012.
[28] D. Bahdanau, K. Cho, and Y. Bengio, “Neural machine translation
by jointly learning to align and translate,” arXiv:1409.0473, 2014.
Accepted at ICLR 2015 as oral presentation.
[29] Y. Wang, M. Huang, X. Zhu, and L. Zhao, “Attention-based LSTM for
aspect-level sentiment classification,” in Proceedings of the 2016 Conference
on Empirical Methods in Natural Language Processing, pp. 606–615,
2016.
[30] M. Yang, W. Tu, J. Wang, F. Xu, and X. Chen, “Attention-based LSTM for
target-dependent sentiment classification,” in Proceedings of the Thirty-First
AAAI Conference on Artificial Intelligence, pp. 5013–5014, 2017.
[31] Y. Xie, R. Liang, Z. Liang, C. Huang, C. Zou, and B. Schuller,
“Speech emotion classification using attention-based lstm,” IEEE/ACM
Transactions on Audio, Speech, and Language Processing, vol. 27,
no. 11, pp. 1675–1685, 2019.
[32] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez,
Ł. Kaiser, and I. Polosukhin, “Attention is all you need,” in Advances
in Neural Information Processing Systems 30 (I. Guyon, U. V. Luxburg,
S. Bengio, H. Wallach, R. Fergus, S. Vishwanathan, and R. Garnett,
eds.), pp. 5998–6008, Curran Associates, Inc., 2017.
[33] D. Kingma and J. Ba, “Adam: A method for stochastic optimization,”
International Conference on Learning Representations, Dec. 2014.