Skill set profile clustering: the empty K-means algorithm with automatic specification of starting cluster centers

Nugent, R., Dean, N. and Ayers, E. (2010) Skill set profile clustering: the empty K-means algorithm with automatic specification of starting cluster centers. In: EDM2010: 3rd International Conference on Educational Data Mining, Pittsburgh, USA, 11-13 June 2010.


Abstract

While students’ skill set profiles can be estimated with formal cognitive diagnosis models [8], their computational complexity makes simpler proxy skill estimates attractive [1, 4, 6]. These estimates can be clustered to generate groups of similar students. Often hierarchical agglomerative clustering or k-means clustering is utilized, requiring, for K skills, the specification of 2^K clusters. The number of skill set profiles/clusters can quickly become computationally intractable. Moreover, not all profiles may be present in the population. We present a flexible version of k-means that allows for empty clusters. We also specify a method to determine efficient starting centers based on the Q-matrix. Combining the two substantially improves the clustering results and allows for analysis of data sets previously thought impossible.
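The idea of a k-means variant that tolerates empty clusters can be illustrated with a minimal Python sketch. It is not the authors' implementation, and the abstract does not give the details of either step. Two assumptions are made here: starting centers are placed at all 2^K corners of the unit hypercube (a stand-in for the Q-matrix-based starting centers, which are not reproduced), and a cluster that receives no students simply keeps its current center instead of causing an error or being re-seeded. The function names corner_centers and kmeans_allow_empty and the toy data are hypothetical.

```python
import itertools
import numpy as np

def corner_centers(n_skills):
    """Return all 2^K corners of the unit hypercube, one per candidate profile.

    Assumption: this stands in for the paper's Q-matrix-based starting centers.
    """
    return np.array(list(itertools.product([0.0, 1.0], repeat=n_skills)))

def kmeans_allow_empty(X, centers, max_iter=100):
    """K-means in which clusters are allowed to stay empty.

    Assumption: an empty cluster keeps its current center unchanged.
    """
    centers = centers.copy()
    labels = None
    for _ in range(max_iter):
        # Squared Euclidean distance from every point to every center: shape (n, k)
        dist = ((X[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        new_labels = dist.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        for j in range(len(centers)):
            members = X[labels == j]
            if len(members) > 0:
                centers[j] = members.mean(axis=0)
            # else: cluster j is empty, so its center is left where it is
    return labels, centers

# Toy skill estimates for four students on K = 2 skills (made-up values)
X = np.array([[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]])
labels, centers = kmeans_allow_empty(X, corner_centers(2))
occupied = np.unique(labels)  # indices of profiles that actually received students
```

In this toy run only the profiles (0, 1) and (1, 0) receive students; the clusters for (0, 0) and (1, 1) stay empty and their centers do not move, which is the behavior the abstract describes as allowing for profiles that are absent from the population.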

Item Type: Conference Proceedings
Status: Published
Refereed: Yes
Glasgow Author(s) Enlighten ID: Dean, Dr Nema
Authors: Nugent, R., Dean, N., and Ayers, E.
Subjects: H Social Sciences > HA Statistics
College/School: College of Science and Engineering > School of Mathematics and Statistics > Statistics
Copyright Holders: Copyright © 2010 The Authors
Publisher Policy: Reproduced with the permission of the authors
