Multiple choice and true/false tests: reliability measures and some implications of negative marking

Burton, R. F. (2004) Multiple choice and true/false tests: reliability measures and some implications of negative marking. Assessment and Evaluation in Higher Education, 29(5), pp. 585-595. (doi: 10.1080/02602930410001689153)



The standard error of measurement usefully provides confidence limits for scores in a given test, but is it possible to quantify the reliability of a test with just a single number that allows comparison of tests of different format? Reliability coefficients do not do this, being dependent on the spread of examinee attainment. Better in this regard is a measure produced by dividing the standard error of measurement by the test's ‘reliability length’, the latter defined as the maximum possible score minus the most probable score obtainable by blind guessing alone. This, however, can be unsatisfactory with negative marking (formula scoring), as shown by data on 13 negatively marked true/false tests. In these the examinees displayed considerable misinformation, which correlated negatively with correct knowledge. Negative marking can improve test reliability by penalizing such misinformation as well as by discouraging guessing. Reliability measures can be based on idealized theoretical models instead of on test data. These do not reflect the qualities of the test items, but can be focused on specific test objectives (e.g. in relation to cut‐off scores) and can be expressed as easily communicated statements even before tests are written.
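
The measure described in the abstract, the standard error of measurement divided by the test's reliability length, can be illustrated with a minimal sketch. The abstract gives no formulas, so the details here are assumptions: the standard error of measurement is computed by the classical formula (score standard deviation times the square root of one minus the reliability coefficient), each item is worth one mark, the blind-guessing score is taken as its expected value, and negative marking is assumed to deduct 1/(k - 1) marks per wrong answer on a k-option item so that blind guessing yields zero on average. All function and parameter names are hypothetical.

```python
import math


def standard_error_of_measurement(score_sd, reliability):
    """Classical SEM: score SD times sqrt(1 - reliability coefficient)."""
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability coefficient must lie in [0, 1]")
    return score_sd * math.sqrt(1.0 - reliability)


def reliability_length(n_items, n_options, negative_marking=False):
    """Maximum possible score minus the score expected from blind guessing.

    Assumes one mark per item. With negative marking (assumed penalty of
    1/(n_options - 1) per wrong answer) the expected guessing score is zero;
    without it, a blind guesser expects n_items / n_options marks.
    """
    if n_items <= 0 or n_options < 2:
        raise ValueError("need at least one item and two options per item")
    max_score = float(n_items)
    guess_score = 0.0 if negative_marking else n_items / n_options
    return max_score - guess_score


def sem_per_reliability_length(score_sd, reliability, n_items, n_options,
                               negative_marking=False):
    """SEM expressed as a fraction of the test's reliability length."""
    sem = standard_error_of_measurement(score_sd, reliability)
    return sem / reliability_length(n_items, n_options, negative_marking)
```

Under these assumptions, a 100-item true/false test has a reliability length of 50 marks when scored by number right and 100 marks when negatively marked. The score standard deviation and reliability coefficient must be taken from results under the same scoring rule, since both change with the marking scheme.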

Item Type: Articles
Glasgow Author(s) Enlighten ID: Burton, Dr Richard
Authors: Burton, R. F.
College/School: College of Medical Veterinary and Life Sciences > School of Life Sciences
Journal Name: Assessment and Evaluation in Higher Education
Publisher: Taylor & Francis
ISSN (Online): 1469-297X
