Is any test reliable? CRCT? SAT? NAEP? ACT? Pick one.

I have to tell you that after writing about education for 12 years, I am still baffled by whether any test results can be trusted, whether in Gwinnett County (see comments on Gwinnett and the Broad Award below) or on the national level.

The New York Times has a chilling story on its state tests, noting that it found that even random guessing could produce a passing score.

Says the Times:  “A side effect of the adjustments in scoring is that on 5 of the 12 math and English tests this year, a student had a better-than-even chance of earning a Level 2 mark— a failing grade that reflects “partially meeting learning standards” — simply by guessing. On the sixth-grade English test, for instance, a student had an 89 percent chance of reaching Level 2 this year by randomly guessing, according to an analysis by The New York Times.”

But can we go back to relying purely on teacher assessments to measure student achievement and progress? After all, we have been reporting for years that kids get promoted on to high school without even being able to read.

But for every test, there are heavily armed critics pointing out flaws in how the tests are designed, scored or applied. (Nobody does it better than FairTest.)

Even the significance of the highly regarded NAEP tests has come under fire.

I will say this: I have found that high SAT and ACT scorers are, in fact, smart kids. I have found that kids who score at the very highest levels on the Georgia CRCT are smart kids. Individual NAEP scores aren’t available, but I presume the same would hold true.

In other words, these tests seem to be fair reflections of the students operating at the highest proficiencies. Is their integrity in doubt with lower-performing students?

32 comments

Ernest

September 16th, 2009
2:03 pm

I had the greatest confidence in the ITBS, which is a nationally normed test. I was disappointed when this was eliminated in favor of the CRCT. I question the value of the results we get from the CRCT.

I believe we need a combination of measures (teacher assessments, standardized test results, parental assessment, etc.) to help determine student performance. Each measure has its good and bad aspects. There are many variables involved, most importantly how motivated the individual student is.

DeKalb Conservative

September 16th, 2009
2:12 pm

“The New York Times has a chilling story on its state tests, noting that it found that even random guessing could produce a passing score.”

Of course this is true. It’s no different than hitting the same color many times over on the roulette wheel.

I’ll speculate that the small percentage of cases in which random guessing produces a passing score is insignificant.

ConcernedMom

September 16th, 2009
2:13 pm

What about the students who may know the material but have never had good results on standardized tests? Some students’ knowledge is more hands-on. I think students who have difficulty passing standardized tests may become discouraged about graduating from high school since a test is the determining factor.

jim d

September 16th, 2009
2:46 pm

Only one test is reliable: Gwinnett’s infamous 8+ million dollar Gateway.

Here’s an example.

A 10th grade student is in the top 4% of his class, with a GPA of 3.89, taking gifted and AP classes only, and he failed part of the test.

Just think: without this wonderful tool, this student may have gone on without ever knowing he wasn’t up to par.

SallyB

September 16th, 2009
2:55 pm

I, like Ernest, believe that nationally normed tests can be valuable tools, especially when comparing different systems across the country, as well as when trying to evaluate the student’s knowledge.
CONCERNED MOM: There are, indeed, those students whose scores on standardized tests do not give a true picture of their knowledge. However, and it’s a big HOWEVER… if these students are planning to enter any of the professional fields, and many that do not fall in that category, they must learn how to take and pass these standardized tests. Almost every profession requires one or more, either to enter a program of study, to graduate, and/or to be licensed.
I actually think that the number of students for whom this is the case is minimal.

SallyB

September 16th, 2009
2:57 pm

Monster post eater is back!!!

oldtimer

September 16th, 2009
3:08 pm

Years ago Clayton county gave a pre and post ITBS to each and every student 1-9 or 10. I found it to be a good measure.
The CRCT is tooooo easy and for those with diagnosed learning problems it can be read orally. Not really a test of reading ability!

just browsing

September 16th, 2009
3:30 pm

Hi Sally,
The sad part is, learning to take tests is an art. As a product of the Georgia School System of the 80’s and 90’s, I find it interesting that the rigor that once existed is no longer there. How can we expect students to rise to standards which are subpar in many areas? The reading and language arts CRCT for middle grades students was too easy, while the math and science portions were too hard (at least in 8th grade). We have gone overboard with the whole concept of standards-based classrooms minus the rigor. Rigorous teachers are often viewed unfavorably, particularly if they hold students accountable for the learning opportunities provided. We have to move towards more rigor, and with that an understanding that students (and some parents) will not always be “happy” with the responsibilities it entails. CRCT tests will not accurately reflect student understanding. The lexiles on the 8th grade tests are all over the place and different for each subject. The tests lacked consistency, and the students commented about it.

Tony

September 16th, 2009
4:14 pm

No test in and of itself is reliable enough to use it as the sole basis for placement of students, evaluation of teachers, or determining the AYP status of schools. Yet we continue to push multiple choice style exams as indicators for these things.

There are some things that can be measured reliably and there are some things that can’t be. Blood sugar can be measured reliably and consistently because there is a chemical standard that is used for the test. Student learning can not be measured with the same degree of reliability because, believe it or not, all children are different.

While many steps are involved in the development of test items for the CRCT, ITBS, SAT, and other multiple choice style assessments, there are still four choices for students to pick. That is why random guessing can produce passing results on these tests. Some tests, like SAT, use a formula to account for the guessing factor, but it is still not perfect.
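To put rough numbers on the guessing formula mentioned above, here is a minimal sketch of the classic "correction for guessing" (one point per right answer, minus 1/(choices - 1) per wrong answer). It is an illustration under stated assumptions, not the scoring rule of any particular test; the function names and the four-choice default are made up for this example.

```python
from fractions import Fraction

def formula_score(num_right, num_wrong, num_choices=4):
    """Classic correction for guessing: right minus wrong / (choices - 1).
    Hypothetical helper; no specific test's official rule is implied."""
    return Fraction(num_right) - Fraction(num_wrong, num_choices - 1)

def expected_gain_per_guess(num_choices=4, remaining=None):
    """Average points gained from one guess among `remaining` choices
    (defaults to a blind guess among all of them)."""
    if remaining is None:
        remaining = num_choices
    p_right = Fraction(1, remaining)
    penalty = Fraction(1, num_choices - 1)
    return p_right - (1 - p_right) * penalty

blind_guess = expected_gain_per_guess(4)                   # guess among all four
educated_guess = expected_gain_per_guess(4, remaining=2)   # two choices ruled out
```

Under these assumptions a blind guess is worth nothing on average, but a guess after ruling out two choices is still worth a third of a point, which is one way to see why such a formula is not a perfect fix.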

Another reason these tests are unreliable is because they do not really allow children to explain themselves. Children from rural areas have different experiences than urban children. These cultural differences have an impact on test taking. The item review process is supposed to reduce the number of items where this might be a problem, but that process is not perfect.

Reading the questions and understanding what is being asked for within the item can also be a problem. Adults who develop the questions try really hard to use the language from the Georgia Performance Standard, but the truth is there can be many words that mean the same thing. Because of this, all portions of the CRCT become reading tests.

Good assessments provide students with a means to show how they arrive at an answer. This is impossible with multiple choice tests.

The short of it is that our politicians have placed too much emphasis on the standardized testing programs and our learning opportunities are being reduced. If it’s not on the test, many teachers are not allowed to teach it in the classroom.

Finally, when the voting public speaks, the politicians will listen. Right now, nearly everyone in the US has been convinced that our schools are horrible. We have heard it for years. Some of you have had bad experiences with schools that reinforce the belief. Yet, what you are not told is that US schools actually do quite well in comparison to other countries in the world when the full analysis is given. Since this does not provide a crisis for politicians to solve, it is not newsworthy.

Let’s take back our schools from the pundits and politicians. Become active with your local school. Expect your child to complete assignments, do homework, stop goofing off in class, and take school seriously. When you get right down to the brass tacks of what’s wrong in our schools, it is in the values that we place on learning. The rigor can be there. The content expertise can be there. We have to stop making excuses for our children and we have to stop bowing down to the test gods.

Mom in Gwinnett

September 16th, 2009
4:31 pm

I agree with Tony and Just Browsing.

Maureen's accountability metric

September 16th, 2009
4:49 pm

Let me start by saying some students who do great on the CRCT are doing great on national norm tests, and doing great period, and will continue to do great on ACTs and SATs. Fair enough?

But I think if people knew the number of students who “exceeded” on the CRCT but scored below average on a national norm assessment, they’d be appalled.

And if they knew the number of students who “met expectations” on the CRCT and scored below average, or worse, way below average, on a national norm assessment, they’d be absolutely terrified.

Reality 2

September 16th, 2009
4:59 pm

I’m not a psychometrician, but a big part of the issue implied in this blog is the poor understanding of what “reliability” of a test means from a psychometric perspective, and the resulting confusion with “reliability” in the more everyday sense. I think from a psychometric perspective, reliability simply means that if the same kid takes the same test in similar settings several times, his/her scores would be close to each other. All those tests that were named are probably reliable psychometrically. The issue is how we use the results, and that has nothing to do with the quality of the test.

A normed test, like ITBS, has its own issues. It simply compares someone’s score to the scores of those students in the “norm group.” But how that group is picked is also an issue. A normed test like ITBS may be a poor choice to see if students are learning what they are expected to learn in a particular grade level if the test items don’t align with the set of expectations. You can still get the scores, but the results don’t mean a thing about whether or not students mastered the expected content. Normed tests (I think the SAT is one) are fine for some purposes where there may be a reasonable agreement about what should be known/understood. I think it may be easier to reach that agreement with a test like the SAT (by the end of HS, or preparing to go to college, students should know this set of ideas) than for a particular grade level (what all 3rd graders should know). A normed test can be meaningful if we actually end up having national standards. Otherwise, ITBS scores don’t tell us whether or not students mastered what they are expected to master.

To me, the problem with testing is more often than not what we do with the results – in other words, they are more about policies than the instruments themselves.

Katy Johnston

September 16th, 2009
7:14 pm

You can learn standardized tests. I sent my son to SAT tutoring at C2, which raised his SAT score by 600 points. It’s not a test of innate knowledge; it’s a test that can be prepped for.

catlady

September 16th, 2009
7:49 pm

The CRCT is not reliable OR valid. Period. When you have a kid who doesn’t speak/read/understand English (in country for less than 6 months) that can nearly pass the CRCT reading, it says something about the test.

When the state sets cut scores at less than 60% correct to “pass”, you have a problem.

When the test asks the same question 2-3 times (state bird of GA), you have a problem.

When that important information–SO important it is included on the CRCT–cannot be recalled by 4th or 5th graders (for whom it is no longer a GPS)–you have a problem.

When a math test really tests reading skills above 3rd grade, you have a problem.

I could go on and on.

SAT is okay, but more difficult for lower than middle class kids. NAEP is a good test, but not for Georgia’s kids (we don’t have that high an expectation of them). ITBS: good, but dated. It does more closely show important skill achievement. No mention of ANY state bird on it!

ScienceTeacher671

September 16th, 2009
8:53 pm

I agree with Tony and MAM on this issue…and most particularly, with MAM on the issue of how students who “meet expectations” perform on nationally normed tests – it’s pathetic. Also, I think Reality 2 is correct about the psychometric definition of “reliable” – but I think what we (also) want to know is whether or not these tests are VALID – do they measure what they are supposed to measure? When a student who is working at a 4th grade level can pass a test saying s/he has mastered the 8th grade standards, I would say that the CRCT is not valid, although it might be reliable.

I would put more stock in any of the nationally normed tests, although as has been stated already, they don’t necessarily measure whether the student has met the Georgia curriculum standards.

And ConcernedMom, I don’t think I have ever seen a child who did poorly on the CRCT but still “knew the material” – some of them have gotten good grades, but that was because their teachers graded them on effort, rather than on mastery of the material.

ScienceTeacher671

September 16th, 2009
9:20 pm

BTW, there was another article about these NY tests in the Times a few weeks ago. Apparently “Christmas-treeing” the tests will get a passing score.

AND New York has the old tests, the answer keys, and the scoring guides including cut scores online so that anyone can see them, according to the article a few weeks ago. I bet you can’t find that sort of transparency with any Georgia tests.

NumbNutz

September 16th, 2009
9:45 pm

Year after year, people complain about tests being biased, unfair, blah, blah, blah, and year after year the powers that be feel they have to dumb down the tests so little Johnny, who is disrespectful and plays around all day, will fit into their little bell curve.

Let our K-12 teachers teach and to hell with all these standardized tests. If your child gets held back a grade because they do not understand the material, then so be it. Not every child is meant to be an astronaut, engineer, scientist, and on, and on.

What we can do for our children is to give them a good educational foundation to build upon. Our school system does them a great disservice in emphasizing these damned tests.

Reality 2

September 16th, 2009
10:24 pm

catlady, ScienceTeacher:

Again, “passing scores” aren’t about the test itself. It’s how the results are interpreted. Just as a particular score is picked to be the “average” in a norm-referenced test, those cut off scores are picked by policy makers. We can’t fault a test just because it is being used by people incorrectly.

Just a parent

September 17th, 2009
8:57 am

Not all teachers are reliable. As parents we can’t give all the power to the teachers. Over the past 18 years, my children have been exposed to many substandard teachers. My experience is that teachers do not report other teachers who are not doing their jobs. There have to be checks and balances in the system, but the CRCT and EOCT tests in Georgia are a joke. It’s sad that Georgia educators feel the need to cheat on tests that are already watered down. Nationalized tests may not be perfect, but they are necessary since we can’t trust our Board of Education.

philosopher

September 17th, 2009
10:55 am

First, let me say that the ITBS is still given (as well as the CRCT) here in Georgia…at least in the schools my child attends. Also, I guess I don’t get all the hoopla about the testing results. My kids have always tested consistently on any of the standardized tests. And…I remember from a very young age how to Christmas-tree a test in order to pass…nothing new…I do have a problem with wasting my child’s education time teaching material over and over just for that CRCT….I feel like either a child gets the material or he/she doesn’t. And I want to know if my child doesn’t get it so I can follow up with measures to help him/her get it.

philosopher

September 17th, 2009
11:11 am

Addendum to prior statement- my kids tested consistently on the standardized tests AND the tests were consistent with their performance academically. However…as those scores and performances were on the high end, as opposed to the low, I cannot say if it works the same way in the other direction… I guess I’m just trying to say that I don’t mean to belittle the concerns…it just hasn’t affected me personally…yet.

Just a parent

September 17th, 2009
11:53 am

Philosopher: One problem with Georgia is that the standards are low. I have a daughter who graduated from high school (in Georgia) with a great GPA and good scores on her SAT and ACT. She goes to one of the better colleges in Georgia (UGA) and is having to work twice as hard to make up for some missing material her high school teachers did not teach. I do believe there are many teachers in Georgia who do not teach and nothing is ever done about it.

philosopher

September 17th, 2009
12:03 pm

I agree- I had to go to the curriculum coordinator at my older daughter’s high school just a few years ago because her English teacher was giving them work sheets to do that my kindergartener could do…no lie! What I have noticed is that my kids who were (are) in the gifted program and accelerated classes are getting an excellent education- I know because my 7th grader and my college kid are helping each other with math and science…the same material.
But I still wish our public schools were up to snuff…got lots of ideas about why they’re not but…a subject for another time.

J.Dewey

September 17th, 2009
12:06 pm

Kudos to Tony, Reality2 and MAMS for excellent explanations. Here is a textbook definition of reliability and validity.
Test Reliability and Validity Defined
Reliability
Test reliability refers to the degree to which a test is consistent and stable in measuring what it is intended to measure. Most simply put, a test is reliable if it is consistent within itself and across time. To understand the basics of test reliability, think of a bathroom scale that gave you drastically different readings every time you stepped on it regardless of whether you had gained or lost weight. If such a scale existed, it would be considered not reliable.
Validity
Test validity refers to the degree to which the test actually measures what it claims to measure. Test validity is also the extent to which inferences, conclusions, and decisions made on the basis of test scores are appropriate and meaningful. The Hoover Study presents evidence that OPT is not valid, that the conclusions and decisions that are made on the basis of OPT performance are not based upon what the test claims to be measuring.
The Relationship of Reliability and Validity
Test validity is requisite to test reliability. If a test is not valid, then reliability is moot. In other words, if a test is not valid there is no point in discussing reliability because test validity is required before reliability can be considered in any meaningful way. Likewise, if a test is not reliable it is also not valid. The Hoover Study does not examine or make any claims about OPT reliability.
http://cc.ysu.edu/~rlhoover/OPTISM/reliability_validity.html
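For anyone who wants to see the bathroom-scale idea in numbers, here is a rough sketch of one common way to gauge reliability: correlate the scores from two sittings of the same test. The scores below are invented for illustration and the helper name is hypothetical; real reliability studies use more students and more careful methods.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Invented scores for five students on two sittings of the same test
first_sitting = [410, 520, 480, 610, 550]
second_sitting = [420, 510, 490, 600, 560]
retest_r = pearson_r(first_sitting, second_sitting)  # near 1 means consistent

# A "scale" that always reads 50 points high is still perfectly consistent
inflated = [score + 50 for score in first_sitting]
inflated_r = pearson_r(first_sitting, inflated)
```

The second comparison is the useful one here: a test that inflates every score by the same amount stays perfectly consistent, so a high correlation by itself says nothing about whether the scores mean what they claim to mean.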

J.Dewey

September 17th, 2009
12:12 pm

Check out the NY Times article for the full story.
http://www.nytimes.com/2009/09/14/education/14scores.html?_r=1&ref=education

The key to understanding is that there were fewer questions, which made random guessing more effective. The more questions per domain or strand, the harder it is to get a proficient score by guessing. The rule of thumb is there should be at least 6 items per tested element to ensure an accurate measure of understanding.
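A back-of-the-envelope sketch of why test length matters: if every item has four choices and the student guesses blindly, the number of right answers follows a binomial distribution. The test lengths and the 40% cut score below are hypothetical, chosen only to show the trend, and are not taken from the New York or Georgia tests.

```python
from math import comb

def chance_of_passing_by_guessing(num_items, cut_score, num_choices=4):
    """Probability of getting at least `cut_score` items right by blind guessing."""
    p = 1.0 / num_choices
    return sum(
        comb(num_items, k) * p**k * (1 - p) ** (num_items - k)
        for k in range(cut_score, num_items + 1)
    )

# Two hypothetical tests, both "passed" at 40% correct
short_test = chance_of_passing_by_guessing(num_items=10, cut_score=4)   # about 0.22
long_test = chance_of_passing_by_guessing(num_items=50, cut_score=20)
```

With the same percentage needed to pass, the guesser's odds on the 50-item test come out far lower than on the 10-item test, which is the point about more questions per strand.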

philosopher

September 17th, 2009
1:54 pm

Thanks, J.Dewey- appreciate the info!

Gregg Williams

September 17th, 2009
3:45 pm

No test is perfect, and standardized tests will always be with us one way or another. These tests offer a quick and fairly accurate way to filter potential candidates for college. In a society as large and as complex as ours we need an efficient system, albeit an imperfect one. For that reason, tests shouldn’t be the be-all and end-all determining factor, which thankfully they are not. We also look at GPA, rigor of class schedule, personal statements, recommendations, and activities. But to make testing more equitable, we all ultimately must be able to provide a standard for education to our kids. That is what is challenging and that is what runs into serious policy debates and accusations of socialism and cults of personality. There is an interesting debate online with David Kim of C2. You can find it on their website http://www.c2educate.com.

ScienceTeacher671

September 17th, 2009
8:44 pm

I’d be a little worried if my college student was doing the same work as a 7th grader….

ScienceTeacher671

September 17th, 2009
8:51 pm

If the same grade from different teachers always equated to equivalent amounts of material mastered, we would not need standardized tests. The trouble with Georgia state tests is that they don’t give any real indication of whether the material has been mastered either.

philosopher

September 17th, 2009
9:23 pm

Not if they’re both doing precal.

ScienceTeacher671

September 17th, 2009
10:22 pm

Wow, where do they offer precal in 7th grade?

Tom James

September 18th, 2009
12:22 pm

I know the best digital textbooks online, ranging from kindergarten to the 12th grade. Get the best material for children’s curriculum online. Teachers can submit their grade-level education content online too.