Are specific types of exam question formats biased towards one gender?: an examination of Medical School records
Shona Kelly (School of Community Health Sciences).
There is considerable debate, but little empirical research, about gender differences in school performance. Gender differences have variously been attributed to gender preference for type of question format, innate skills, and a tendency among females to avoid taking risks. Questions that use the true/false/abstain (T/F) format have been targeted as a specific example of gender bias.
Data, provided by the Medical Education Unit at the University of Nottingham, consisted of all final course grades in the first two years of the undergraduate programme between 1995 and 2002. For each exam there was also information on the subject area/content (theme), and on the number and format of questions, which were categorized as course work, essay, in-class assessment, lab studies, OSCE, short answer, single phrase, spotter, single word answer, T/F questions, or Viva. In addition, another variable was created to indicate when an exam consisted solely of an essay format, in-class assessments, or T/F questions. Within each year, ANOVA was used to identify statistically significant differences between the mean male and female scores for each course. A data file was then created that indicated, for each course within each year, the presence of a statistically significant gender difference, the magnitude of the difference, the theme and the format of the exam questions. Logistic regression was then used to test whether the theme, calendar year or each exam format, individually, predicted that 1) males or 2) females did better.
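The per-course step described above can be sketched in Python. This is a minimal illustration, not the analysis code used in the study: the function names, the toy scores, the 0.05 significance level and the Bonferroni correction are all assumptions (the abstract does not name the correction method). With only two groups, the one-way ANOVA F statistic has 1 numerator degree of freedom, so its p-value equals the two-sided p-value of a t statistic with t = sqrt(F); the sketch uses that identity to stay within the standard library.

```python
import math
from statistics import mean


def anova_f_two_groups(male, female):
    """One-way ANOVA F statistic for two groups; returns (F, within-group df)."""
    groups = [male, female]
    n_total = sum(len(g) for g in groups)
    grand_mean = mean(male + female)
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1          # 1 for two groups
    df_within = n_total - len(groups)
    # Note: identical scores within both groups (ss_within == 0) are not handled.
    return (ss_between / df_between) / (ss_within / df_within), df_within


def t_pdf(u, df):
    """Density of Student's t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(u * u / df))


def p_value_f1(f_stat, df, steps=2000):
    """Upper-tail p-value for F(1, df), by Simpson integration of the t density."""
    t = math.sqrt(f_stat)
    h = t / steps                          # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3                   # P(0 < T < t)
    return max(0.0, 1.0 - 2.0 * area)


def summarise_course(male, female, alpha):
    """Build one row of the course-level data file (hypothetical layout)."""
    f_stat, df = anova_f_two_groups(male, female)
    p = p_value_f1(f_stat, df)
    diff = mean(female) - mean(male)       # positive: females scored higher
    significant = p < alpha
    return {
        "p_value": p,
        "significant": significant,
        "difference": diff,
        "females_better": significant and diff > 0,
        "males_better": significant and diff < 0,
    }


# Toy scores for a single course (invented for illustration only).
male_scores = [55, 60, 58, 62, 57, 59]
female_scores = [70, 72, 68, 74, 71, 69]

# Assumed Bonferroni correction across the 359 course offerings.
alpha = 0.05 / 359
record = summarise_course(male_scores, female_scores, alpha)
print(record)
```

Each such record, joined with the course's theme, calendar year and question formats, would form one row of the data file; the `females_better` and `males_better` flags would then serve as the two outcomes for the logistic regressions.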
Data were available from 359 course offerings. After correction for multiple comparisons, statistically significant differences between the genders were found in 111 (31%) of assessments, with females doing better than males in 85 (77% of the assessments with a gender difference) and males doing better in 26 (23%). Univariate analysis showed that the exam question formats most likely to differ between genders were in-class assessments (females do better) and T/F questions (males do better). In multivariate analysis, however, the female advantage with in-class assessments was 'explained' by the theme and calendar year rather than by the fact that the course was assessed in-class. The final model showed that if an exam included at least some T/F questions then males were 16.7 times more likely to score higher, and if the exam consisted of T/F questions only then males were 10.9 times more likely to do better than females, with no other explanatory variables adding significantly to the model.
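The reported proportions can be rechecked from the counts given above. The short Python sketch below does that arithmetic and, as a separate step, converts the "times more likely" figures to logistic regression coefficients. That conversion rests on an assumption: that the figures are odds ratios (the exponentiated coefficients that logistic regression usually reports), which the abstract does not state explicitly. Variable and function names are illustrative.

```python
import math

# Counts taken from the results paragraph.
n_offerings = 359
n_significant = 111
n_females_better = 85
n_males_better = 26

# The two directions should account for every significant difference.
assert n_females_better + n_males_better == n_significant


def percent(part, whole):
    """Percentage rounded to the nearest whole number."""
    return round(100 * part / whole)


pct_significant = percent(n_significant, n_offerings)         # share of all offerings
pct_females_better = percent(n_females_better, n_significant)  # share of significant ones
pct_males_better = percent(n_males_better, n_significant)

print(pct_significant, pct_females_better, pct_males_better)  # prints: 31 77 23


def coefficient_from_odds_ratio(odds_ratio):
    """Logistic regression coefficient implied by an odds ratio (beta = ln OR)."""
    return math.log(odds_ratio)


# ASSUMPTION: 16.7 and 10.9 are odds ratios from the final model.
beta_some_tf = coefficient_from_odds_ratio(16.7)
beta_only_tf = coefficient_from_odds_ratio(10.9)
print(round(beta_some_tf, 2), round(beta_only_tf, 2))
```

If the figures are indeed odds ratios, they describe a multiplicative change in the odds that males score significantly higher, not in the probability itself.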
This analysis suggests that true/false/abstain questions on examinations can considerably bias results in favour of males. There was no support for the hypothesis that in-class assessments, or any other type of exam question format, would benefit females. Given that 1) T/F questions tap only the two lowest levels of Bloom’s taxonomy of cognitive domains and 2) the US National Board of Medical Examiners no longer uses T/F questions on its certification exams, I recommend that T/F questions not be used in examinations.