Project Details

Better than Multiple Choice? Knowledge assessment with alternative answer and scoring procedures

Subject Area General, Cognitive and Mathematical Psychology
Term since 2022
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 498045313
 
Among the well-known and often-lamented disadvantages of multiple-choice tests, which have become popular because of their cost-effectiveness, are the insufficient assessment of partial knowledge and the contamination of test scores by guessing variance. Alternative, and nowadays more easily available, computer-based answer and scoring procedures promise to solve these problems. They include the Pick-N format, the axiomatically derived test format of Zapechelnyuk (2015), and the Answer-Until-Correct (AUC) format. What these three methods have in common, however, is that their validity has not yet been sufficiently investigated, and only in correlative designs that are often difficult to interpret. Furthermore, important procedural aspects of these alternative test formats have not yet been clarified. The planned studies will address these questions. To this end, an experimental knowledge induction will be used for the first time to provide a meaningful external validation criterion. The principal advantages of this approach have already been demonstrated for empirical option weighting (Diedenhofen & Musch, 2017). Unfortunately, empirical option weights cannot be used in examination contexts for legal reasons, owing to their dependency on the answer behavior of other test takers.
The experimental approach can, however, be used to investigate which of the new, legally safe formats provides an improvement in reliability and validity over conventional multiple-choice tests.

Experiments 1 and 2 will determine the optimal scoring procedure for the Pick-N format and investigate the theoretically and empirically unresolved question of whether disclosing the number of correct answer options moderates the validity of Pick-N test scores.

Experiments 3 and 4 will test for the first time whether the innovative test format recently proposed by Zapechelnyuk (2015) allows for more valid measurements, and whether its validity is reduced by individual differences in response threshold and ticking propensity, which will be manipulated as an experimental factor.

Finally, in light of the inconsistent correlational results obtained previously, Experiments 5 and 6 will experimentally test whether the partial-knowledge-revealing AUC format improves the validity of test scores, and whether this depends on properties of the items used, which would then need to be taken into account in the construction of such tests.

Based on our experiments, test developers will be able to make a more informed decision when choosing among currently available test formats. The results will be relevant to all areas in which standardized instruments are needed to measure achievement and learning outcomes.
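To illustrate why the AUC format can reveal partial knowledge: the examinee keeps choosing options until the correct one is found, and the score for an item decreases with the number of attempts needed. The linear rule sketched below is a common illustrative assumption, not necessarily the scoring procedure evaluated in this project.

```python
# Minimal sketch of a linear Answer-Until-Correct (AUC) scoring rule
# (an illustrative assumption, not the project's actual procedure).
# Full credit for a first-attempt success; zero credit when every
# option had to be tried before the correct one was found.

def auc_item_score(num_options: int, attempts_needed: int) -> int:
    """Return the points earned on one AUC item."""
    if not 1 <= attempts_needed <= num_options:
        raise ValueError("attempts_needed must lie between 1 and num_options")
    return num_options - attempts_needed

# Example: on a four-option item, solving on the second attempt earns
# 2 of a maximum of 3 points, reflecting partial knowledge; pure
# guessing down to the last option earns 0.
```

Under such a rule, intermediate scores distinguish examinees who can rule out some distractors from those who cannot, which a single dichotomous multiple-choice response conflates.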
DFG Programme Research Grants
Co-Investigator Professor Dr. Jochen Musch
 
 
