Examining Rater Reliability When Using an Analytical Rubric for Oral Presentation Assessments
The assessment of English speaking in EFL environments can be inherently subjective and influenced by factors beyond linguistic ability, including the choice of assessment criteria and even the rubric type. In classroom assessment, the type of rubric recommended for English speaking tasks is the analytical rubric. Guided by three aims, this study analyzes the scores and comments of two raters on 28 video-recorded oral presentations by Thai engineering students, rated with a detailed analytical rubric that covers content, delivery, and visuals. First, it investigates rater reliability by comparing the raters’ scores using the Intraclass Correlation Coefficient (ICC) and ANOVA. Second, it applies generalizability theory (G-theory) and examines the correlations between the scores to understand how the assessment dimensions relate to one another and how they contribute to a comprehensive evaluation of speaking proficiency. Third, it performs a thematic analysis of the raters’ comments to gain a deeper understanding of their rationale. The findings suggest that a higher number of raters increases the reliability of the ratings, although diminishing returns are observed above a certain threshold. Several key themes also emerged in relation to the criteria. The study highlights the crucial role of detailed analytical rubrics and cooperation sessions between raters in improving the reliability of oral EFL assessments.
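The diminishing-returns pattern reported for the number of raters can be illustrated with a minimal sketch based on the Spearman-Brown prophecy formula. This is not the study's own analysis: the formula corresponds to a G-theory decision study only in the simplest fully crossed persons-by-raters design, and the single-rater reliability used here (`ASSUMED_SINGLE_RATER_RELIABILITY`) is a hypothetical value chosen for illustration, not a result from the paper. The function name `projected_reliability` is likewise an assumption.

```python
def projected_reliability(single_rater_reliability: float, n_raters: int) -> float:
    """Spearman-Brown projection: reliability of the mean of n_raters ratings."""
    if n_raters < 1:
        raise ValueError("n_raters must be at least 1")
    r = single_rater_reliability
    return n_raters * r / (1 + (n_raters - 1) * r)


# Hypothetical single-rater reliability; NOT a value reported in the study.
ASSUMED_SINGLE_RATER_RELIABILITY = 0.60

# Projected reliability for panels of 1 to 6 raters.
projections = {
    k: projected_reliability(ASSUMED_SINGLE_RATER_RELIABILITY, k)
    for k in range(1, 7)
}

# Gain in reliability from adding one more rater to a panel of k - 1.
gains = {k: projections[k] - projections[k - 1] for k in range(2, 7)}

for k in range(1, 7):
    gain = f"{gains[k]:+.3f}" if k in gains else "  n/a"
    print(f"raters={k}  projected reliability={projections[k]:.3f}  gain={gain}")
```

Under this assumed starting value, each additional rater raises the projected reliability by less than the previous one did, which is the shape of the trade-off the findings describe.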
