Optimizing the selection and training of raters: How do rater characteristics and training features affect the quality of judgments of teaching quality?

Applicants Professor Dr. Aiso Heinze; Professor Dr. Thilo Kleickmann; Professor Dr. Mirjam Steffensky

Subject Area General and Domain-Specific Teaching and Learning

Term since 2022

Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 490658944

Project Description

Classroom observations are frequently used in teacher evaluation, training, professional development, and research. Ensuring that these observations are reliable and valid is therefore of great importance. Studies have shown, however, that they are often neither, even when the raters are trained. There is limited systematic and causal evidence about ways of enhancing the reliability and validity of such ratings. One promising avenue of research is to examine the characteristics of the raters and how they are trained.
The aim of this project is to systematically investigate fundamental rater characteristics and aspects of rater training and their effect on the quality (i.e., reliability and validity) of judgments of teaching quality. We aim to generate evidence for how the selection and training of raters can be optimized, taking into account feasibility issues. The project focuses on the Three Basic Dimensions of teaching quality (Praetorius, Klieme, Herbert, & Pinger, 2018): classroom management, student support, and cognitive activation. We will investigate the relevance of the following aspects for observations of teaching quality: (RQ 1) the raters’ content-related characteristics (content knowledge and pedagogical content knowledge with respect to the content of the observed lessons, and their study major) and their generic characteristics (personality, pedagogical knowledge, beliefs, and teaching experience); (RQ 2) specific rater constellations based on typically used numbers of raters (n = 2-4) in a heterogeneous pool of raters; (RQ 3) experimental manipulations of raters’ content knowledge and pedagogical content knowledge of the topic of the observed lessons; (RQ 4) experimental manipulations of the practice phase of rater training (guided discussion vs. self-directed reflection); and (RQ 5) experimental manipulations of training duration (12 h vs. 6 h).
We will maximize the variance of the characteristics of the project’s rater pool by selecting participants who have a range of educational attainments, are on different tracks, and specialize in a variety of subjects. We will also take advantage of the uniquely heterogeneous nature of education settings in Germany and Switzerland. The lessons that will be rated cover two grade levels and two core subjects: science (focus on floating and sinking, grades 3-4) and mathematics (focus on the Pythagorean theorem, grades 8-9). RQ 1 and RQ 2 will be investigated using the control groups in four experiments. RQ 3-5 will be investigated using experimental manipulations of each of the areas of interest. The central dependent variables are inter-rater reliability, rater agreement with reference ratings, and predictive validity with respect to students’ conceptual understanding as well as content-specific interest. For RQ 3 and RQ 4, think-alouds and cognitive interviews with subsamples will be conducted to gain further insight into the differences between the experimental conditions.
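To make two of the dependent variables more concrete, the following is a minimal sketch of how rater agreement with reference ratings and a chance-corrected inter-rater coefficient could be computed, and of how rater constellations of size 2-4 could be enumerated. The project description does not state which coefficients will be used; Cohen's kappa is an assumption chosen for illustration, and all rater names and scores are invented.

```python
from collections import Counter
from itertools import combinations


def percent_agreement(a, b):
    """Share of lessons on which two rating vectors give the same score."""
    if len(a) != len(b) or not a:
        raise ValueError("rating vectors must be non-empty and of equal length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def cohen_kappa(a, b):
    """Cohen's kappa: agreement between two raters corrected for chance."""
    n = len(a)
    p_observed = percent_agreement(a, b)
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement from each rater's marginal score distribution.
    p_expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if p_expected == 1:
        return 1.0  # both raters gave the same single score throughout
    return (p_observed - p_expected) / (1 - p_expected)


def constellations(pool, sizes=(2, 3, 4)):
    """All rater subsets of the given sizes drawn from a pool."""
    return [c for k in sizes for c in combinations(pool, k)]


# Hypothetical scores on a 4-point scale for eight lessons.
reference = [1, 2, 3, 4, 2, 3, 1, 4]
ratings = {
    "rater_a": [1, 2, 3, 4, 2, 3, 2, 4],
    "rater_b": [1, 2, 2, 4, 3, 3, 1, 3],
}

agreement_with_reference = {
    name: percent_agreement(scores, reference)
    for name, scores in ratings.items()
}
kappa_a_b = cohen_kappa(ratings["rater_a"], ratings["rater_b"])
n_constellations = len(constellations(["r1", "r2", "r3", "r4"]))
```

With ordinal rating scales, a weighted kappa or an intraclass correlation would often be preferred over the unweighted kappa shown here; the sketch only illustrates the general idea of comparing raters with each other and with a reference.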

DFG Programme Research Grants

International Connection Switzerland

Cooperation Partner Professor Dr. Anna-Katharina Praetorius
