Research on multiple choice questions

Posted on 30-10-2013

Since my last posts on multiple choice questions (here and here), Kris Boulton and Joe Kirby have pointed me in the direction of Robert Bjork’s work on remembering and forgetting. Here’s an extract from a paper titled ‘Multiple-Choice Tests Exonerated, at Least of Some Charges: Fostering Test-Induced Learning and Avoiding Test-Induced Forgetting’. The authors accept that multiple choice questions are often not good tests, but argue that this is because they are done badly rather than because of any intrinsic flaw.

The present work demonstrates that properly constructed multiple-choice practice tests can be important learning events for students. Achieving “proper construction” of such tests—which requires that incorrect alternatives be plausible, but not so plausible that they are unfair—is, however, a challenge. As any teacher who has used multiple-choice tests can testify, writing good multiple-choice items is very hard work, whereas writing poor ones is relatively easy. Thus, when people accuse multiple-choice tests of being bad tests, that accusation, statistically, has some truth to it. We argue, however, that the statistical accuracy of such accusations has more to do with human nature than with the multiple-choice format per se.

Having designed a lot of multiple choice questions, I completely agree with this. They are very difficult to write. But of course, once you have made the up-front investment of time and effort, you can use them over and over again. If you use them formatively as a class activity, they can generate great class discussions. If you use them summatively, they are easier to mark than essay questions, which have the reverse effort profile: i.e. relatively easy to set, relatively effortful and time-consuming to mark.

There is also more on multiple choice questions here, by Dylan Wiliam. Wiliam notes that another way to reduce the problem of pupils guessing is to have more than one right answer.

Most (although not all) questions are in a multiple-choice format, although many of the items have multiple correct answers. The advantage of using items with multiple correct answers is that the outcome space is much larger than with traditional multiple-choice items. For example, if the item asks students which of a list of six things is living, there are 2^6 possible outcomes, only one of which is correct, so that a student’s chance of guessing correctly is less than 2%, compared with approximately 17% with a traditional multiple-choice item.

I would add that such a question would have a very high cognitive load – you would have to remember the features of the question and how they applied to all six options. This would make it a good test not just of knowledge, but of how fragile or secure that knowledge was. For that reason, whilst it would be an excellent test for someone who had been studying the topic for a while, it might not be ideal for a relative novice.
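To check the guessing figures in the Wiliam extract above, here is a minimal sketch in Python. It assumes that each of the six things can be independently marked as living or not living (including marking none of them), which is what gives two outcomes per option. The function names are my own, made up for illustration.

```python
def guess_chance_single_answer(n_options: int) -> float:
    """Chance of guessing right when exactly one of n options is correct."""
    return 1 / n_options


def guess_chance_multiple_answers(n_options: int) -> float:
    """Chance of guessing right when each option is independently
    ticked or left blank (assumption: every combination is allowed)."""
    outcomes = 2 ** n_options  # two choices per option
    return 1 / outcomes


n = 6
print(f"Possible outcomes with {n} options: {2 ** n}")                      # 64
print(f"Traditional item: {guess_chance_single_answer(n):.1%}")             # 16.7%
print(f"Multiple correct answers: {guess_chance_multiple_answers(n):.1%}")  # 1.6%
```

So with six options there are 64 possible response patterns, and a blind guess succeeds about 1.6% of the time rather than about 17%.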

Then there is this PowerPoint, also by Wiliam, which has some excellent examples of good multiple choice ‘hinge’ questions that target key misconceptions. Here’s a nice example.

What can you say about the means of the following two datasets?
Set 1: 10, 12, 13, 15
Set 2: 10, 12, 13, 15, 0

A The two sets have the same mean.
B The two sets have different means.
C It depends on whether you choose to count the zero.
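For anyone who wants to check the arithmetic behind this question, here is a short sketch in Python. The variable names are my own. The only assumption is the usual definition of the mean: the sum of the values divided by how many values there are, with the zero counted as a value like any other.

```python
set_1 = [10, 12, 13, 15]
set_2 = [10, 12, 13, 15, 0]

# The zero adds nothing to the total, but it does add one to the count.
mean_1 = sum(set_1) / len(set_1)  # 50 / 4
mean_2 = sum(set_2) / len(set_2)  # 50 / 5

print(mean_1)  # 12.5
print(mean_2)  # 10.0
```

The totals are the same but the counts differ, so the means differ. A pupil who thinks the zero ‘doesn’t count’ would be drawn to A or C, which is presumably the misconception the question is built to expose.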

Finally, Macmillan Publishing have designed an interesting computer adaptive test site called Learning Curve for use with their university level economics textbooks. It contains questions like the following:

If the value of another unit of a good in its next best use is greater than the opportunity cost of producing another unit of the good:

  • in a free market, the production of another unit of this good would be considered economically inefficient.
  • in a free market, at least one more unit of the good will be produced.
  • under central planning, this information is easily gathered.
  • under central planning, the amount of the good produced would decrease.

Macmillan have also put together an excellent page summarising the research showing that such an approach is effective – it contains references to some of Bjork’s work and some of the classic research on the retrieval effect and massed and distributed practice.