Personal name resolution of web people search

Balog, K., Azzopardi, L. and de Rijke, M. (2008) Personal name resolution of web people search. In: Workshop NLP Challenges in the Information Explosion Era (NLPIX 2008), Beijing, China, 22 April 2008.

Full text not currently available from Enlighten.

Abstract

Disambiguating personal names in a set of documents (such as a set of web pages returned in response to a person name) is a difficult and challenging task. In this paper, we explore the extent to which the "cluster hypothesis" for this task holds (i.e., that similar documents tend to represent the same person). We explore two clustering techniques which use either (1) term-based matching (single pass clustering) or (2) semantic-based matching (Probabilistic Latent Semantic Analysis). We compare and contrast these strategies and provide strong evidence to suggest that the hypothesis holds for the former. In fact, on the new evaluation platform of the SemEval 2007 Web People Search task, we show that using single pass clustering with a standard IR document representation fits well with the assumptions about the data and the task, which yields state-of-the-art performance.

Item Type: Conference Proceedings
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Azzopardi, Dr Leif
Authors: Balog, K., Azzopardi, L., and de Rijke, M.
Subjects: Q Science > QA Mathematics > QA75 Electronic computers. Computer science
College/School: College of Science and Engineering > School of Computing Science
