Technical Report

The ACASE online Assessment Information System permits users to check the degree to which their ratings concur with those of other users or with a standard set of ratings provided by their institution or by ACASE. What follows is a report on recent studies of the reliability of ratings on Cubes & Liquids v. 4.

Reliability of ratings is calculated as the percent match of ratings to a standard as follows:

Match
When someone rates samples of student performance on a given learning goal for a particular rating event, any single rating is said to ‘match’ if it is the same as the standard rating for that sample. In some cases the standard rating is provided by ACASE. In other cases the class instructor’s ratings, or ratings provided by the developer of an assessment, serve as the standard.

Percent Match
Percent Match can be calculated for any set of ratings by counting the number of ratings in the set that “match”, dividing by the total number of ratings in the set, and expressing the result as a percentage.
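The calculation can be sketched in a few lines of Python. This is only an illustration of the definition given here, not the code used by the ACASE system; the function name percent_match and the sample ratings are hypothetical.

```python
from typing import Sequence


def percent_match(ratings: Sequence[int], standard: Sequence[int]) -> float:
    """Return the percentage of ratings equal to the corresponding standard rating."""
    if len(ratings) != len(standard):
        raise ValueError("ratings and standard must have the same length")
    if not ratings:
        raise ValueError("cannot compute Percent Match for an empty set")
    # A rating "matches" when it is identical to the standard rating.
    matches = sum(1 for r, s in zip(ratings, standard) if r == s)
    return 100.0 * matches / len(ratings)


# Hypothetical ratings for five student samples on one learning goal
# (illustrative values only, not data from the studies in this report).
standard = [3, 2, 4, 1, 3]
rater = [3, 2, 3, 1, 3]
print(f"{percent_match(rater, standard):.1f}%")  # 4 of 5 ratings match
```

Because a match requires identical ratings, a rating that is one level away from the standard counts the same as one that is several levels away.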

Thoughtful users of the system can attain reliable ratings of the 6 learning goals underlying the Cubes & Liquids v. 4 assessment activity, at levels of 90% agreement or greater, through a simple training process described in the Summer 2004 reliability study below. In the most recent study, the following levels of agreement were obtained in comparing three raters:

Learning Goal                                                            Percent Match
Distinguishes Observation from Inference                                 97.6%
Technical Description                                                    97.6%
Density of Solid Objects (Coordinates Mass and Volume of Solid Objects)  95.2%
Density of Liquids (Coordinates Mass and Volume of Liquid)               90.5%
Uses a 2×2 Classification Scheme to Organize Relevant Factors            100.0%
Proportional Reasoning                                                   100.0%

Summer 2004 Validity and Reliability Study

In the summer of 2004, a study was conducted to estimate the reliability of ratings of the 6 Cubes & Liquids v. 4 learning goals.

Participants were 1) an ACASE representative who had played a lead role in the development and refinement of Cubes & Liquids, 2) a middle school science teacher who was familiar with the system, and 3) a professor of physics and science education at a State University of New York college center who was also familiar with ASID and Cubes & Liquids.

Two sets of 20 complete samples of student responses to the task were made available to the participants. The first set of samples was used to orient participants to the rating task. Participants worked together to rate each sample of student performance (6 learning goals for each student sample). During this process many weaknesses were discovered, not only in the instructions for rating but also in the levels of performance on the learning goals themselves. Revisions were made to both.

Participants then took the second set of samples home with them and rated the remaining examples of student performance independently. The ACASE member’s ratings served as a provisional standard.

The ASID system generates reliability values as soon as ratings are entered into the system. When ratings were examined, participants were asked to check the non-matches and to change them in cases where they perceived that their rating was a data-entry error.
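The review step described here amounts to listing every rating that differs from the provisional standard so the rater can re-examine it. A minimal sketch follows, assuming ratings are stored in dictionaries keyed by a sample identifier and a learning goal; the key layout, the function name find_non_matches, and the values are hypothetical and are not taken from ASID.

```python
from typing import Dict, List, Tuple

# Hypothetical key layout: (sample id, learning goal name).
Key = Tuple[str, str]


def find_non_matches(
    rater: Dict[Key, int], standard: Dict[Key, int]
) -> List[Tuple[Key, int, int]]:
    """Return (key, rater rating, standard rating) for each rating that differs."""
    non_matches = []
    for key in sorted(standard):
        # Only compare ratings the rater actually entered.
        if key in rater and rater[key] != standard[key]:
            non_matches.append((key, rater[key], standard[key]))
    return non_matches


# Illustrative values only; these are not data from the study.
standard = {
    ("S01", "Technical Description"): 3,
    ("S02", "Technical Description"): 2,
}
rater = {
    ("S01", "Technical Description"): 3,
    ("S02", "Technical Description"): 4,
}

for (sample, goal), given, expected in find_non_matches(rater, standard):
    print(f"{sample} / {goal}: rated {given}, standard {expected}")
```

Whether a listed non-match is a data-entry error or a genuine disagreement remains a judgment for the rater; the sketch only surfaces the candidates.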

The reliability values resulting from Phase II (the independent rating of the second set of samples) were as follows:

Learning Goal                                              Percent Match
Describes Experimental Actions                             100.0%
Presents Experimental Results                              100.0%
Differentiates critical experimental objects               100.0%
Conceptualizes Density of Solid Objects                    60.0%
Conceptualizes Density of Liquids                          100.0%
Uses the Word ‘density’                                    100.0%
Coordinates Densities in Floating & Sinking                60.0%
Distinguishes Observation from Inference                   80.0%
Uses a classification scheme to organize relevant objects  80.0%
Technical Description                                      100.0%

Conclusion of 2004 Summer Reliability Study

Conversations with participants led to the following conclusions.

Several learning goals were deemed to be unnecessary or trivial:

  • Describes Experimental Actions
  • Presents Experimental Results
  • Differentiates critical experimental objects
  • Uses the Word ‘density’

These were eliminated from the next version of Cubes & Liquids.

Coordinates Densities in Floating & Sinking was recognized to be fundamentally flawed in its conception and was eliminated from the system.

These eliminations resulted in a greatly simplified and more efficient assessment package for Cubes & Liquids.

The rating instructions for Conceptualizes Density of Solid Objects and Distinguishes Observation from Inference were recognized as flawed and were revised.

The task administration instructions for Uses a classification scheme to organize relevant objects were found to be flawed and were revised.

This experience showed us that Cubes & Liquids offers an opportunity to identify student competence in applying ratios and proportions to the task at hand. A new learning goal, Proportional Reasoning, was added to Cubes & Liquids.

Fall 2004 Reliability Study & Current Status of Reliability

A second reliability study was conducted in the fall of 2004 with Cubes & Liquids v. 4.

Student samples were taken from a 7th grade physical science class. The teacher’s ratings (the same teacher as in the first study) were taken as the standard. Two members of ACASE also rated student performance: the previous participant and a newer member.

The study was conducted in two phases as above and the reliability values attained were as follows:

Learning Goal                                                            Percent Match
Distinguishes Observation from Inference                                 97.6%
Technical Description                                                    97.6%
Density of Solid Objects (Coordinates Mass and Volume of Solid Objects)  95.2%
Density of Liquids (Coordinates Mass and Volume of Liquid)               90.5%
Uses a 2×2 Classification Scheme to Organize Relevant Factors            100.0%
Proportional Reasoning                                                   100.0%
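A per-goal summary of this kind can be produced by grouping ratings by learning goal before computing Percent Match. The sketch below assumes that the ratings of all non-standard raters are pooled for each goal; the report does not state how the two raters' ratings were combined, so the pooling, the function name percent_match_by_goal, and the data are all assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Tuple

# Hypothetical key layout: (sample id, learning goal name).
Key = Tuple[str, str]


def percent_match_by_goal(
    standard: Dict[Key, int], raters: Dict[str, Dict[Key, int]]
) -> Dict[str, float]:
    """Pool all raters' ratings and return Percent Match per learning goal."""
    matches = defaultdict(int)
    totals = defaultdict(int)
    for ratings in raters.values():
        for key, rating in ratings.items():
            if key not in standard:
                continue  # no standard rating to compare against
            goal = key[1]
            totals[goal] += 1
            if rating == standard[key]:
                matches[goal] += 1
    return {g: round(100.0 * matches[g] / totals[g], 1) for g in totals}


# Illustrative values only; these are not data from the study.
standard = {
    ("S01", "Technical Description"): 3,
    ("S02", "Technical Description"): 2,
    ("S01", "Proportional Reasoning"): 1,
    ("S02", "Proportional Reasoning"): 4,
}
raters = {
    "rater_1": {
        ("S01", "Technical Description"): 3,
        ("S02", "Technical Description"): 2,
        ("S01", "Proportional Reasoning"): 1,
        ("S02", "Proportional Reasoning"): 3,  # differs from the standard
    },
    "rater_2": dict(standard),  # agrees with the standard everywhere
}

for goal, pct in percent_match_by_goal(standard, raters).items():
    print(f"{goal}: {pct:.1f}%")
```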

Reliable rating procedures have been verified for the following learning goals on Cubes & Liquids v. 4:

  • Distinguishes Observation from Inference
  • Technical Description
  • Density of Solid Objects — Coordinates Mass and Volume of Solid Objects
  • Density of Liquids — Coordinates Mass and Volume of Liquid

Reliability studies are being conducted for the following learning goals in the same activity:

  • Uses a 2×2 Classification Scheme to Organize Relevant Factors
  • Proportional Reasoning   — Coordinating Solid and Liquid Densities