[From an article written by Diana Go for the March 10, 2000 issue of Nixon News]
What do those numbers mean?
When you received your child’s Standardized Testing and Reporting (STAR) Test results, were you confused about the meaning of the numbers? If so, you were not alone. Jim Cox, of JK Educational Associates, spent a few days in our school district to help our community interpret the STAR test results.
SAT 9 Test
The STAR Program currently uses the Stanford Achievement Test – Ninth Edition (SAT 9) for standardized testing. Cox assures us that the team that developed the SAT 9 test is very competent. They carefully analyzed state curricula and recent textbooks used across the country, and the test questions they developed underwent extensive review and revision.
Norm Group
In 1995, a very large norm group of 250,000 students across the country took the test. Your child’s test scores are compared with that norm group. They are NOT a comparison with your child’s peers in the classroom, school, district, or even the state.
Number Correct versus Percentiles
Cox asked: “If I ran a race with 100 people and came in last, would you think that I was a slow runner?” Most people nodded their heads. “What if I told you it was the Olympic trials and we were the 100 top runners in the country? What if I said: ‘I ran the mile in 10 seconds.’ Do you still think I was a slow runner? To understand how fast I am, you need to understand my time, my peer group, AND my place in my peer group.” The test results you received in the mail include the number of items asked on the test, the number your child answered correctly, and the national percentile. The national percentile reported in the test results shows how your child’s score compared to the peer group, namely the 250,000 students in the 1995 norm group.
Definition of Grade Level
To determine grade level, the SAT 9 test publishers sorted all the results from the 1995 norm group for each grade level and test category. They sorted all 250,000 tests in sequence, starting with the tests with the fewest correct answers and working up to the most correct answers. The test publishers then divided the stack into 100 even stacks. They call the lowest scoring stack the 0th percentile and the highest scoring stack the 99th percentile. Grade level, or the fiftieth percentile, is the exact middle of this sorted stack. If your child scored at the fiftieth percentile on the test, it does not mean he is below average; it means he performed at grade level. With this definition of grade level, half of the test participants are below grade level and the other half are above grade level. Naturally, scoring at grade level is NOT necessarily the same as functioning at grade level in the classroom.
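The sorting-into-stacks idea can be sketched in a few lines of Python. This is only an illustration, not the publisher's actual procedure: the function name `percentile_rank` and the 20 raw scores in `norm_group` are invented for the example (the real norm group had 250,000 students), and it assumes a percentile is the share of the norm group that scored strictly lower.

```python
from bisect import bisect_left


def percentile_rank(raw_score, norm_scores):
    """Percent of the norm group that scored strictly below raw_score.

    Assumption for this sketch: the result is capped at 99, matching the
    article's description of stacks numbered 0 through 99.
    """
    ordered = sorted(norm_scores)            # the "sorted stack" of tests
    below = bisect_left(ordered, raw_score)  # how many tests sit lower in the stack
    return min(99, below * 100 // len(ordered))


# Hypothetical norm group of 20 raw scores, made up for illustration.
norm_group = [12, 15, 18, 20, 22, 24, 25, 26, 27, 28,
              29, 30, 31, 32, 33, 35, 36, 38, 40, 42]

# A raw score of 29 has exactly half of this small group below it.
print(percentile_rank(29, norm_group))
```

In this made-up group, a raw score of 29 sits in the exact middle of the sorted stack, which is what the article calls grade level.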
Amazing Scores
Cox admitted he did not understand why our school district hired him because our scores were so high. Cox said that the Palo Alto Unified School District has “amazing scores” with averages consistently in the 80th-percentile range, and in some cases, even higher. Customarily, districts with scores in the 30th-percentile range call in Cox for his expertise. What does the 80th percentile mean? If your child scored at the 80th percentile on the test, it means she outperformed 80 percent of the norm group. In concrete terms, she scored better than 200,000 of the 250,000 students who took the test in 1995. With these scores, it is time to celebrate our children’s successes. This does not mean we should rest on our laurels. Cox asked the room full of educators: “If you were Michael Jordan’s coach two and a half years ago, would you expect him to be able to improve his game?” The room answered with a resounding “yes!” Just as we can expect a top athlete to improve his game, we can expect to improve student performance in the Palo Alto Unified School District.
Standard Error of Measurement
Cox enjoys using clever “real-life” examples to make his points in the seminar. He asked: “If Tiger Woods played the same round of golf under the same conditions four days in a row, would he get the same score each day?” Statisticians describe the expected size of this score-to-score variation with the Standard Error of Measurement (SEM). The SEM for fifth grade language on the SAT 9 test is plus or minus 3.0. If a child scored 30, then, statistically, he would likely score anywhere between 27 and 33 if he took the test again. A raw score of 27 correct answers translates to the “at risk” 35th percentile and a score of 33 correct answers is “above grade level”, between the 65th and 70th percentiles. How could there be such a big difference in percentiles with only a variation of plus or minus three correct answers? Let us examine the stack of sorted tests again. Many of the one hundred percentile stacks represent the same score because many students had the same score. The most common scores are located towards the middle of the stacks. In the example above, there were many duplicates within the stacks representing 27 through 33 correct answers. Since each stack represents one percentile, there was a large percentile difference between 27 and 33 correct answers.
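To see why a band of plus or minus three correct answers can span so many percentiles, here is a rough Python sketch. The student counts per raw score are invented for illustration and are not SAT 9 data, so the percentiles printed will not match the 35th and 65th to 70th figures quoted above. The helper names `sem_band` and `percentile_from_counts` are also hypothetical.

```python
def sem_band(raw_score, sem):
    """Return the (low, high) raw scores one SEM below and above raw_score."""
    return raw_score - sem, raw_score + sem


def percentile_from_counts(raw_score, counts):
    """Percent of students who scored strictly below raw_score, capped at 99."""
    total = sum(counts.values())
    below = sum(n for score, n in counts.items() if score < raw_score)
    return min(99, below * 100 // total)


# Hypothetical number of students at each raw score. Scores near the middle
# are shared by many students, as the article describes.
counts = {24: 50, 25: 50, 26: 100, 27: 100, 28: 100, 29: 100,
          30: 100, 31: 100, 32: 100, 33: 100, 34: 50, 35: 50}

low, high = sem_band(30, 3)  # SEM of 3 taken from the article's example
for score in (low, 30, high):
    print(score, percentile_from_counts(score, counts))
```

With these made-up counts, each extra correct answer near the middle passes 100 of the 1,000 students, so moving from 27 to 33 correct answers covers a wide range of percentiles even though the raw scores differ by only six.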
Conclusion
To understand how your child is performing in school, combine the results of the STAR tests with other measures available, including the input of the school staff.

