Subject: Standardized tests (was: Yummy!)
Newsgroups: lugnet.off-topic.debate
Date: Sun, 7 May 2000 01:31:03 GMT

Sorry for the length of this, folks. I so rarely get to use my Masters of
Education ;-)
In lugnet.off-topic.debate, Shiri Dori writes:
> Oh, and I absolutely agree with this article, and also what they say about the
> SATs (take the link to their homepage):
> http://www.fairtest.org/facts/whatwron.htm
Hi Shiri. Which parts of the article do you agree with?
I think that whether one agrees or disagrees depends largely on what they think
they are agreeing to. That page says a good many things that are correct as
isolated points, but they are strung together in a poor context that suggests
that standardized tests are a bad thing. This is pretty clearly not so.
For example:
<<_Are standardized tests fair and helpful evaluation tools?_
Not really. >>
BS. This shallow analysis clearly shows the authors' agenda. Let's
continue...
<<Standardized tests are tests on which all students answer the same questions,
usually in multiple-choice format, and each question has only one correct
answer. They reward the ability to quickly answer superficial questions that do
not require real thought.>>
These claims tend to be true, but need not be. Many standardized tests have such
lengthy time allotments that virtually no one uses all of the time. And the
high-stakes tests (SAT and ACT) allow people with qualifying learning
disabilities special testing circumstances and extra time.
<<They do not measure the ability to think or create in any field.>>
Somewhat true. Luckily, this is not what the producers of these instruments are
working toward. They arguably do measure the subject's ability to think in
certain ways.
<<Their use encourages a narrowed curriculum, outdated methods of instruction,
and harmful practices such as retention in grade and tracking.>>
No. This is simply not true. Perhaps their _mis_use encourages those things.
It is certainly foolish of the system to encourage schools to teach to those
tests. This is most true in more centrally controlled areas. I just moved
(eight months ago) to New Jersey from Missouri. When hurricane Floyd came
sputtering up the coast, the governor cancelled all public schools in the
state. I don't think the governor in Missouri has the authority to do that,
and I'm certain that schools would ignore such a declaration anyway. "Teaching
to the tests" was a problem in MO, and I suspect it's worse out here, but I
don't know that.
I think it's just generally a silly blame-assignment exercise to say that these
tests are at fault for poor curriculum decisions, poor instruction, or grade
retention and tracking. Also, it has not been conclusively demonstrated that
retention is a bad thing. (I happen to think it's moot since the whole concept
of grades (either kind!) is fatally flawed, but that's beside the point.)
<<They also assume all test-takers have been exposed to a white, middle-class
background.>>
Tests are biased against people who don't know the answers. Tough. As long as
we consider it appropriate to evaluate knowledge, progress, ability, etc., this
will be true.
These tests are generally fair and helpful when used appropriately. I have
heard stories of teachers using these tests (generally not the HS-level tests,
but more like the ITBS and other grade-school tests) for a variety of bad
things. But the tests are not at fault. The ignorant teachers, the schools,
the school districts, and the teacher education system share the fault.
<<_Are standardized tests objective?_
The only objective part of most standardized tests is the scoring, when it is
done by machine. What items to include on the test, the wording and content of
the items, the determination of the "correct" answer, choice of test, how the
test is administered, and the uses of the results are all decisions made by
subjective human beings.>>
Pray tell, how would the alternatives to standardized tests improve on this?
Oh wait, I know the answer to that...they wouldn't. Evaluating student
performance through a portfolio presentation would be even more subjective. I
love the way that the word correct appears in double quotes, implying that
there could be several reasonable answers to each question. Bzzzzt, thanks for
playing. It's really, really, really, OK for us to acknowledge that some
answers are right and some are wrong. That's how we learn.
<<_Are test scores "reliable"?_
A test is completely reliable if you would get exactly the same results the
second time you administered it.>>
This is simplified, but basically correct. For assessment instruments, there
are two basic types of measures that test makers are concerned with:
reliability and validity. Each has several forms, but they can be boiled down
thus: A test is valid if it measures what it was engineered to measure. A
test is reliable if it provides the same evaluation repeatedly. One form of
reliability is test-retest reliability and it's a good thing to shoot for. The
ACT, SAT, and LSAT (at least) all have very good reliabilities.
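As a rough illustration of test-retest reliability, the sketch below computes the Pearson correlation between two administrations of the same test. The scores are invented for the example, and real reliability studies use far larger samples and more than one statistic; a value near 1.0 would suggest that the test ranks students about the same way both times.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Hypothetical scores for five students on two sittings of one test.
first_sitting = [150, 155, 160, 165, 170]
second_sitting = [152, 154, 161, 163, 171]

print(round(pearson(first_sitting, second_sitting), 3))
```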
<<All existing tests have "measurement error." This means an
individual's score may vary from day to day due to testing conditions or the
test-taker's mental or emotional state. As a result, many individual's scores
are frequently wrong.>>
This is technically correct, but misleading. Sort of. All test scores carry a
certain (knowable and generally known) standard error of measurement. So a
score of 162 on some exam might actually mean 162 +/- 7 points. I hope that
guidance folks all know how to interpret these and understand the various
implications. In my example above, this means that a student with a 162
cannot definitely be claimed to be better at taking that test than a student with
a 152, but that assumption can be made when comparing to a student with a score
of 147. (And even then, statistics (as a field) is based on probability, so
sometimes you'll still make incorrect assumptions.)
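To make the arithmetic concrete, here is a minimal Python sketch of that comparison. It assumes the +/- 7 band from the made-up 162 example and treats two scores as distinguishable only when their bands do not overlap. The function names are just for illustration, and real test publishers use more careful statistics than this.

```python
def score_band(score, sem):
    """Return the (low, high) range implied by score +/- sem."""
    return score - sem, score + sem


def bands_overlap(score_a, score_b, sem):
    """True when the two +/- sem bands share at least one point."""
    low_a, high_a = score_band(score_a, sem)
    low_b, high_b = score_band(score_b, sem)
    return low_a <= high_b and low_b <= high_a


SEM = 7  # the +/- 7 figure from the example above, not a real test's value

print(bands_overlap(162, 152, SEM))  # True: 155-169 overlaps 145-159
print(bands_overlap(162, 147, SEM))  # False: 155-169 sits above 140-154
```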
If we understand that, then we realize that the scores aren't wrong, just
imprecise. And we realize that they provide a great diagnostic, that just
happens to not be the end-all be-all of student evaluation. And typically a
subject's score does not vary widely from day to day on a given instrument, for
the big, well-researched tests.
<<Test scores of young children and scores on sub-sections of tests are much
less reliable than test scores on adults or whole tests.>>
So what?
<<_Do test scores reflect real differences among people?_
Not necessarily. To construct a norm-referenced test (a test on which half the
test-takers score above average, the other half below), test makers must make
small differences among people appear large.>>
Appear large to whom? Anyone familiar with the bell-shaped curve of a normal
distribution (all teachers, I would hope) knows that those clustered around the
averages are basically the same as each other, regardless of score. And, the
raw scores can be translated into a scoring system that takes into account the
standard deviation and thus emphasizes significant differences, rather than
paltry ones.
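One such translation is the z-score, which expresses each raw score as a number of standard deviations from the mean. The sketch below uses invented raw scores and Python's standard library; it is only an illustration, not how any particular test publisher scales its results.

```python
from statistics import mean, pstdev


def z_scores(raw_scores):
    """Convert raw scores to z-scores (standard deviations from the mean)."""
    mu = mean(raw_scores)
    sigma = pstdev(raw_scores)  # population standard deviation
    return [(score - mu) / sigma for score in raw_scores]


# Hypothetical raw scores: most cluster near the mean, one stands apart.
raw = [48, 49, 50, 51, 52, 70]

for score, z in zip(raw, z_scores(raw)):
    print(score, round(z, 2))
```

In this made-up data the clustered scores all land within one standard deviation of the mean, while the outlier lands more than two above it.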
<<Because item content differs from one test to another, even tests that claim
to measure the same thing often produce very different results.>>
Nope. One of the forms of reliability that I mentioned above is the reliability
between equivalent (but different) forms of a given instrument. If a test can
advertise a high degree of reliability, then the above statement doesn't apply.
I would further posit that many of these arguments -- and this one in
particular -- apply more aptly to the alternate forms of assessment that the
author(s) favor.
<<Because of measurement error, two people with very different scores on one
test administration might get the same scores on a second administration.>>
See above. (But basically, my argument comes down to "nuh-uh.")
<<On the SAT, for example, the test-makers admit that two students' scores must
differ by at least 144 points (out of 1600) before they are willing to say the
students' measured abilities really differ.>>
I like that...they admit...as if this were a thing to be ashamed of.
Poppycock. It is simply a reality of standardized tests. That number, 144,
probably includes padding to be safe. But if it doesn't, that means that a
student's "real" score could be up to 72 points higher or lower than that
reported. OK, the reason they announce that information is so that people
don't misuse the numbers. If a college is basing admission largely on SAT
scores (a practice of which I approve), then they need to make sure not to
assume that a 1500 is actually better than a 1450. As long as they process the
statistics appropriately, the SAT is a good diagnostic.
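The same check can be written down in a few lines. This sketch takes the 144-point figure quoted above at face value; the function name and the sample scores are only illustrative.

```python
MIN_MEANINGFUL_GAP = 144  # the figure quoted from the article, on a 1600 scale


def really_differ(score_a, score_b, min_gap=MIN_MEANINGFUL_GAP):
    """True when two scores are at least min_gap points apart."""
    return abs(score_a - score_b) >= min_gap


print(MIN_MEANINGFUL_GAP / 2)     # 72.0, the half-width used above
print(really_differ(1500, 1450))  # False: only 50 points apart
print(really_differ(1500, 1300))  # True: 200 points apart
```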
<<_Don't test-makers remove bias from tests?_
Most test-makers review items for obvious biases, such as offensive words. But
this is inadequate, since many forms of bias are not superficial. Some
test-makers also use statistical bias-reduction techniques.>>
Understand this: There are differences in how well certain ethnic groups,
socio-economic classes, etc. perform on certain tests.
Now, rather than trying to make it so that these groups all perform at the same
level on a test that is being developed, what the test makers rightly do is
analyze every question and the responses to ensure that the ethnic, sex, and
SES biases don't vary from that of the instrument as a whole. And they work to
match the bias to a previous instrument that is considered an exemplar of good
assessment. Only when trying to enact a specific change in bias do they alter
the overall biases of the instrument.
Remember validity from above? If the test is assessing the things that the
makers think are important (e.g. things that predict success in college) then it
is valid. The only real bias is against folks who don't know the answers. If
there is inappropriate bias, it is at a higher level - like what the colleges
want their incoming freshmen to be able to do.
<<However, these techniques cannot detect underlying bias in the test's form or
content. As a result, biased cultural assumptions built into the test as a
whole are not exposed or removed by test-makers.>>
I'm not willing to ridicule the concept of bias in test form, just yet, but I'd
want more compelling evidence than a brief suggestion made by people who are
obviously willing to mislead the uninformed to further their personal agenda.
<<_Do IQ tests measure intelligence?_>>
I almost skipped this entry, but since I'm on a roll, I figured I'd include it.
:-)
IQ tests are very different from one another and it is hard to generalize about
them. They certainly measure how good someone is at taking that test. They also
have valid uses, such as serving as one of various measures used for diagnosing
giftedness.
<<IQ tests assume that intelligence is one thing that can be easily measured
and put on a scale>>
Bzzzt, thanks for playing. Most IQ test researchers acknowledge that it's a
hugely complex issue and they seek to evaluate a variety of forms of
intelligence and assign multiple vectors which get rolled into one as a central
score.
<<rather than a variety of abilities. They also assume intelligence is fixed
and permanent.>>
No they don't. The research points to a variable decrease in raw intelligence
as people age.
<<However, psychologists cannot agree whether there is one thing that can be
called intelligence, or whether it is fixed, let alone meaningfully measure
"it.">>
This is meaningless in light of my corrections of the authors' earlier
statements.
<<Studies have shown that IQ scores can be changed by training, nutrition,>>
Not much. Not on reliable tests.
<<or simply by having more friendly people administer the test.>>
Maybe. There is a fair chance for misdiagnosis if the psychometrist
administering the test isn't completely unbiased, scrupulous, and thoroughly
trained.
<<In reality, IQ tests are nothing more than a type of achievement test which
primarily measures knowledge of standard English and exposure to the cultural
experiences of middle class whites.>>
This is one of the many statements that are interesting. Too bad they don't
back up these assertions with pointers to research. This may be true for some
IQ tests. Maybe even for some of the well respected ones. On the other hand,
there are IQ tests written in "ebonics" and IQ tests that use nothing but
logical symbols.
<<_Do tests reflect what we know about how students learn?_
No. Standardized tests are based in behaviorist psychological theories from the
nineteenth century. While our understanding of the brain and how people learn
and think has progressed enormously, tests have remained the same. Behaviorism
assumed that knowledge could be broken into separate bits and that people
learned by passively absorbing these bits.>>
Learning, like intelligence, is an enormously complex process that we know
little about. While it is clearly true that the behaviorists had an even more
simplified view of it than we do today, they also had the foundation right.
Much of our knowledge is learned in exactly the way described. What they
really fail to account for is complex learning and processing at higher
cognitive levels.
<<Today, cognitive and developmental psychologists understand that knowledge is
not separable bits and that people (including children) learn by connecting
what they already know with what they are trying to learn. If they cannot
actively make meaning out of what they are doing, they do not learn or
remember.>>
Completely agreed.
<<But most standardized tests do not incorporate the modern theories and are
still based on recall of isolated facts and narrow skills.>>
And are still valid as a wildly cheap way of processing millions of students
and providing one of the many diagnostics that the educational system must use
to help all students excel.
<<_Do multiple-choice tests measure important student achievement?_
Multiple-choice tests are a very poor yardstick of student performance.>>
Right...grades work better for that. And grades "are a very poor yardstick of
student" knowledge. Tests more accurately reflect that which they are
examining, and grades more accurately reflect what the students have done
during a class. It's good that we have both.
<<They do not measure the ability to write,>>
They could measure the ability to critically judge writing, or to edit, or
recognize good technique.
<<to use math,>>
Excuse me? This is so patently wrong that I don't know what to say. If
someone disagrees, maybe we'll get into it. Otherwise I'll assume you agree.
<<to make meaning from text when reading,>>
I disagree. I think it's easy to devise MC questions that determine whether or
not a student understood a passage.
<<to understand scientific methods or reasoning>>
Disagree.
<<or to grasp social science concepts.>>
Disagree.
<<Nor do these tests adequately measure thinking skills or assess what people
can do on real-world tasks.>>
Well, they assess some kinds of thinking skills. But certainly not all of
them. And they don't really work for performance of "real-world" tasks at all.
<<_Are test scores helpful to teachers?_
Standardized, multiple choice tests were not originally designed to provide
help to teachers.>>
Right, so who cares? Duh! If the tests aren't meant to provide teachers with
data, then how can you call the fact that they don't a negative attribute?
<<Classroom surveys show teachers do not find scores from standardized tests
very helpful, so they rarely use them.>>
What survey? How was it phrased? Through my two full degrees in education, I
encountered several teachers who used the test scores and found them helpful.
I also encountered too many teachers who didn't know how to use them, but
thought they did. And if they rarely use them, then why do we complain that
teachers are forced to teach to these tests? Obviously this is just false.
<<The tests do not provide information that can help a teacher understand what
to do next in working with a student because they do not indicate how the
student learns or thinks.>>
They can quickly and cheaply indicate that a student already knows subject
matter x, and needn't go over it once again. Or that student Y is behind in
spelling. What they can't tell the teacher is how to help student Y who is
behind in spelling (or whatever). That's why the teachers hang out with the
students for 35 hours a week, so that they get to know them and their learning
styles, and can provide individual attention.
<<Good evaluation would provide helpful information to teachers.>>
What a great comment. Not only is it simply untrue - because not all
evaluation tools are meant to assist the teacher - but the implication that
standardized tests are poor evaluation is also false. See above.
<<_Are readiness or screening tests helpful?_
Readiness tests, used to determine if a child is ready for school, are very
inaccurate and unsound.>>
I don't know. I've never read anything substantial about these. My first
reaction was negative, but that's just because this source is clearly
untrustworthy. They may have this one spot on.
<<They encourage overly academic, developmentally inappropriate primary
schooling.>>
Again, I think it's inappropriate to blame an instrument for this type of
behavior. The misuse of the test is the problem, not the test itself. I do
side pretty heavily with the Montessori method for early childhood education,
in that self-paced and self-guided learning seems more likely to promote
stronger intellectual skills later in life. (I hope so anyway. I pay $800 a
month to send my kindergartner to a private Montessori instead of "free"
public school.)
<<Screening tests for disabilities are often not adequately validated; they
also promote a view of children as having deficits to be corrected, rather than
having individual differences and strengths on which to build.>>
Maybe. On the other hand, a learning disability is a good thing to
identify...so that you can identify the child's strengths and build on them. It
seems that every complaint these folks have with standardized testing is more
correctly a complaint about the misuse of the tools by humans. I think that's
the real problem.
<<_Are there better ways to evaluate student achievement or ability?_
Yes.>>
Are there more cost effective ways? Someone has to pay for the new, fancy and
time-consuming evaluations that are probably more comprehensive. Who should?
It could raise your property tax by 100%.
<<Good teacher observation,>>
Shouldn't this already be happening? Show me the school where the teachers
aren't observing the students and I'll show you a school where the entire
educational staff should be fired -- and quite possibly cast into the streets
and stoned.
<<documentation of student work, and performance-based assessment, all of which
involve the direct evaluation of student effort on real learning tasks,>>
And still contain biases, cost more, aren't reliable, are of dubious validity,
and are still not really indicative of what the student can do. And they
certainly don't evaluate what the student knows. Traditional tests are better
at getting to that kind of info. These evaluation techniques should be used to
whatever extent is practicable in conjunction with standardized tests.
<<provide useful material for teachers, parents, the community and the
government.>>
Just like standardized, multiple choice, paper-pencil tests.
Whew!
Chris