U.S. Department of Education: Promoting Educational Excellence for all Americans
OSEP Ideas that Work - U.S. Office of Special Education Programs
Technical Assistance Products: Assessment
Instructional Practices

Massachusetts: One State's Approach to Setting Performance Levels on the Alternate Assessment

Defining Performance Levels

The task force recommended that performance levels be identical to those on standard MCAS tests, except that the lowest performance level, called "Warning/Failing at Grade 10" for tested students, would be subdivided into three distinct levels in order to provide more meaningful descriptions of performance at these lower levels. Figure 1 shows the performance levels and definitions Massachusetts uses to report results on the standard and the alternate assessments, and the relationship between the two reporting scales.

Figure 1

MCAS Performance Levels

Standard MCAS Tests

  Warning (Failing at Grade 10): Students at this level demonstrate a minimal understanding of subject matter and do not solve even simple problems.
  Needs Improvement: Students at this level demonstrate a partial understanding of subject matter, and solve some simple problems.
  Proficient: Students at this level demonstrate a solid understanding of challenging subject matter and solve a wide variety of problems.
  Advanced: Students at this level demonstrate a comprehensive and in-depth understanding of subject matter and provide sophisticated solutions to complex problems.

MCAS Alternate Assessments

  Warning (Failing at Grade 10), subdivided into three levels:
    Awareness: Students at this level demonstrate very little understanding of learning standards in the content area.
    Emerging: Students at this level demonstrate a rudimentary understanding of a limited number of learning standards in the content area, and have addressed these at below grade level expectations.
    Progressing: Students at this level demonstrate a partial understanding of some learning standards in the content area, and have addressed these at below grade level expectations.
  Needs Improvement: (same as above)
  Proficient: (same as above)
  Advanced: (same as above)

Counting Scores Toward an Overall Performance Level

On several occasions, the task force revisited the question of which scores to count in calculating the overall level of performance. In reviewing the goals, methods, and purpose of the general assessment, they realized, in essence, that regular MCAS tests measure the ability of a student to respond to test items accurately, with no assistance from peers or from the adult(s) administering the test, and that test results are based solely on the correctness of the student’s responses.

In the end, their recommendation was to "parallel the goals, methods, and purpose of the general assessment, where possible," when no other solution is obvious. With this advice, the task force established a foundation for future decision-making, and returned to this guidance frequently.

Given these assumptions about the general assessment, and the task force's advice to parallel it where possible, the Department decided to base alternate assessment performance levels on raw numerical portfolio scores in the areas of completeness, complexity, accuracy, and independence only. Scores for self-evaluation and generalized performance would not count toward the performance level, since scores in these two areas depended on opportunities provided to the student, not on the student’s direct performance of the skill being assessed. Scores in all rubric areas, however, would be reported to schools and parents, as shown in Figure 2, in order to give those who work most closely with the student detailed information on his or her performance.

Separate scores are reported for each strand in Level of Complexity, Demonstration of Skills and Concepts (accuracy), and Independence, while scores in the secondary areas of Self-Evaluation and Generalized Performance are combined for the entire content area.

Figure 2

Excerpt of Sample Parent/Guardian Report

MCAS Alternate Assessment
Parent/Guardian Report

Performance Level: EMERGING

Strand | Level of Complexity | Demo of Skills | Independence | Self Eval | Generalized Performance
Number Sense and Operations
Patterns, Relations and Algebra
Data Analysis, Statistics and Probability

How Will Numerical Scores be Combined to Yield a Performance Level?

The Massachusetts Department of Education consulted with Ed Roeber of Measured Progress to assist in developing a strategy or formula for combining scores to obtain an overall performance level for each content area. Over time, Dr. Roeber recommended several options for calculating a numerical score total in each content area of a portfolio. The following were two mathematical formulas considered by the Department:

Method #1 - Calculate the sum of scores in three rubric areas:
LC + DSC + Ind = Total Score

Method #2 - Multiply LC by the sum of the other two rubric areas:
LC x (DSC + Ind) = Total Score

LC = Level of Complexity
DSC = Demonstration of Skills and Concepts
Ind = Independence

Consider the following scenario using both scoring methods:

Student A
Raw Scores: LC = 3, DSC + Ind = 6
Total Score (Method #1) = 3 + 6 = 9
Total Score (Method #2) = 3 x 6 = 18

Student B
Raw Scores: LC = 2, DSC + Ind = 8
Total Score (Method #1) = 2 + 8 = 10
Total Score (Method #2) = 2 x 8 = 16

Using Method #1, Student A scored lower (9) than Student B (10), although Student A worked on more challenging subject matter (LC=3) than Student B (LC=2). Using Method #2, on the other hand, Student A scored higher (18) than Student B (16), thereby rewarding Student A for attempting more challenging material. For certain score combinations, Method #1 appeared to create a disincentive for students to attempt increasingly complex skills and content, and discouraged teachers from providing more challenging instruction to their students, which was certainly not the intent of the alternate assessment.
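
The comparison can be reproduced with a short Python sketch. The function names are illustrative only and do not come from the Department. The split between DSC and Ind for each student is also an assumption: the totals reported above imply only that DSC + Ind equals 6 for Student A and 8 for Student B.

```python
def total_method_1(lc: int, dsc: int, ind: int) -> int:
    """Method #1: LC + DSC + Ind."""
    return lc + dsc + ind


def total_method_2(lc: int, dsc: int, ind: int) -> int:
    """Method #2: LC x (DSC + Ind)."""
    return lc * (dsc + ind)


# Hypothetical raw scores. LC values match the scenario; the DSC/Ind
# split is assumed, chosen only so that DSC + Ind matches the totals.
students = {
    "Student A": {"lc": 3, "dsc": 3, "ind": 3},
    "Student B": {"lc": 2, "dsc": 4, "ind": 4},
}

for name, scores in students.items():
    m1 = total_method_1(**scores)
    m2 = total_method_2(**scores)
    print(f"{name}: Method #1 = {m1}, Method #2 = {m2}")
```

Under these assumed scores, the ranking of the two students reverses between the methods: Student B is ahead under Method #1, and Student A is ahead under Method #2.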

Because the LC score is used as a multiplier in Method #2, scores also were spread over a wider range (1-40), avoiding the possibility of overlapping totals. Method #1, on the other hand, spreads scores across a narrow range (1-13) since scores are simply added together. It was agreed that Method #2 would be explored further for its effectiveness, impact, and unintended consequences, if any.
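
The difference in spread can be checked by enumerating every score combination. The sketch below assumes LC is scored 1 to 5 and DSC and Ind are each scored 1 to 4. These scales are not stated in this section; they are chosen only because they reproduce the stated maximum totals of 13 and 40. The lower end of each range depends on the minimum rubric score, which is likewise an assumption here.

```python
from itertools import product

# Assumed rubric scales (not given in the source text).
LC_SCORES = range(1, 6)   # Level of Complexity: 1 to 5
DSC_SCORES = range(1, 5)  # Demonstration of Skills and Concepts: 1 to 4
IND_SCORES = range(1, 5)  # Independence: 1 to 4


def score_ranges():
    """Return (min, max) of the totals under Method #1 and Method #2."""
    sums = set()
    products = set()
    for lc, dsc, ind in product(LC_SCORES, DSC_SCORES, IND_SCORES):
        sums.add(lc + dsc + ind)        # Method #1
        products.add(lc * (dsc + ind))  # Method #2
    return (min(sums), max(sums)), (min(products), max(products))


method_1_range, method_2_range = score_ranges()
print("Method #1 maximum total:", method_1_range[1])
print("Method #2 maximum total:", method_2_range[1])
```

With these assumed scales, the highest possible total is 13 under Method #1 and 40 under Method #2, which matches the upper bounds given above.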

