The Rotting of the College Board
Testing is necessary. The SAT’s creator is not
In 1947, the
College Board opened an office in Berkeley, California. Previously, from the
turn of the century onward, the organization had been administering entrance
examinations for schools in the Northeast, and in 1926 it created and began
using the original Scholastic Aptitude Test, or SAT. The Board’s western
expansion after World War II was a sign, according to the writer Nicholas
Lemann, of its “national aspirations and of the University of California’s high
status in public higher education.” But it would take another two decades for
the University of California to begin requiring its applicants to take the
SAT—and therein hangs a tale about the changes in American education in which
testing and the College Board itself have played a decisive and highly
problematic role.
It is a tale
about how the testing regime of the College Board began to dominate the
higher-education admissions system on its way to becoming a behemoth that now
grosses $1 billion annually. More important, it is a tale about how
universities in the United States changed the way they interact with the
primary and secondary educational institutions that feed into them—a change
that had, and continues to have, parlous consequences for America’s youth.
After World War
II, President Truman formed a commission to “chart the future of higher
education,” Lemann explains in his new book, Higher Admissions:
The Rise, Decline, and Return of Standardized Testing. In 1947, the
commission issued a report that “recommended the democratization of access to
college. It called for a massive financial aid program at the undergraduate and
graduate levels, free tuition for those attending two-year schools, and a
program of continuing education.” It also called for the end of segregation in
the South and of the quotas then employed against Jews.
But while the
commission’s purpose was to open higher education to all, there was a competing
idea, says Lemann—one that “came to value and champion a select number of
private and public institutions that would nurture a kind of talented elite.”
How to manage a large influx of students from across the country with
different educational backgrounds and also funnel them into a small number of
slots was a question that standardized testing might answer.
The SAT began as
an IQ test, a way for higher education to locate students with “aptitude” at
educational institutions not necessarily known for producing college-level
talent. It was modeled on an IQ test that was given to U.S. Army inductees during
World War I. But as such tests became more associated with racial sorting and
eugenics, the test eventually changed too—even altering its name from the
Scholastic Aptitude Test to the Scholastic Assessment Test.
The story of
California’s journey to the testing regime overseen by a third-party
institution is instructive. In 1947, the University of California was
responsible for accrediting public high schools in the Golden State—meaning it
oversaw curriculum and testing practices. The same was true for public
universities in other states as well. “This process, rather
than standardized testing, was the primary way California managed the fit
between high school applicants and university entrants,” Lemann writes. It was
perfectly logical: If a college knew what high-school applicants were learning
and got reports in the form of grades showing how they had done in their
courses, there was no need for any other measure of their performance.
But over the
decades following the Second World War and with the general glorification of
the paradisial life in the Golden State, the University of California became a
destination for students across the country. This made it more difficult for
the school to understand the level of preparation of its growing pool of
applicants. And once the public-university system was divorced from K–12
education, which happened in 1963 when UC stopped accrediting high schools, its
leaders lost their insight into what the students in California knew.
This was the case
everywhere, especially as the number of students grew with the
baby boom and as more people, in keeping with the commission’s
recommendations, began attending college. At the highest level, the old
pipeline from Exeter and Andover to Harvard and Yale had once meant that
university administrators had a pretty good idea of what they were getting when
students matriculated. Maybe some were mediocre students with important last
names, but even so, their grades in biology could easily be compared with those
of their peers. Once the applicant pool expanded, depending on a high-school
record alone became more of a crapshoot.
And well beyond
the Holden Caulfields and the kids in A Separate Peace, there were
thousands of high schools across the country. Without some uniform method of
evaluating them, it would have been left to a few admissions officers in each
college to familiarize themselves with the quality and requirements of
America’s wildly expanding secondary educational institutions serving the
baby-boom generation.
Standardized
tests were designed to help solve this problem. And the College Board was,
after a time, not the only third-party group to create and administer one. In
1958, E.F. Lindquist of the University of Iowa created the ACT (which used to
be an acronym for American College Testing but now is just a three-letter
brand). And though it is often thought of as interchangeable with the SAT, it
was based on an entirely different premise. Lemann describes it as “a direct
challenge to the SAT and everything it stood for.” Lindquist, Lemann writes,
“believed that the tests standing between high school and college should
measure academic achievement, something not akin to IQ, which was a concept he
rejected, and that they should be used for placement, not selection.”
From the outset,
the tests had a peculiar regional quality to them: “The SAT’s territory was
more on the coasts and the ACT’s more in the middle of the country, the SAT’s
more private universities and the ACT’s more public universities, and the SAT’s
more selective research universities and the ACT’s more open admissions,
teaching oriented universities.”
The original
distinctions between the two were lost as a growing movement against
standardized testing gathered steam. The push to do away with testing began to
find traction in the late 1990s and early 2000s. An organization called
FairTest, now the National Center for Fair and Open Testing, started lobbying
the NCAA to drop a minimum SAT requirement. Then it campaigned against
school-based assessments. It filed briefs in the Michigan affirmative-action
case in 2003 and lobbied against the No Child Left Behind bill, which sought to
measure school performance through testing and which became law in 2002.
In 2003, as
Lemann writes, “a retired ETS [Educational Testing Service] researcher named
Roy Freedle published a much-discussed article in the Harvard Educational
Review arguing that the SAT is, in fact, racially biased in the
technical sense—especially the verbal section of the test.” Divvying up the
test into easier and more difficult parts, he found that “Black test takers
overperformed on the hard part and underperformed on the easy part.” He
concluded that “Black students underperformed on easier items because they
often don’t live in the same culture as white students,” and he advocated for a
revision of the test, called the R-SAT.
Leaders of the
movement managed to persuade mostly middle-class and wealthy parents that
standardized tests were both discriminatory and doing more harm
than good—stressing out children in good schools for no reason, distracting
from more interesting and important parts of school, and not actually revealing
any useful information about a student’s achievement or potential for
achievement in the process. In addition, the fact that extra studying and
tutoring at parental expense could improve a kid’s SAT score
significantly (which wasn’t supposed to be possible, theoretically, since the test is
designed to measure a kid’s ability to learn, not what he knows) suggested to
many people that the whole system was rigged in favor of the well-to-do.
In 2001,
University of California President Richard Atkinson gave a public address
discussing the trend among wealthier families to pay for test prep, suggesting
that it was not only racially discriminatory but also unfair to students coming
from poorer backgrounds. Atkinson raised the possibility of dropping the SAT entirely
or of using the SAT II (subject tests that are no longer offered) as a
better way of sorting students. But neither this idea nor the R-SAT took off.
_____________
The idea that
testing is either unfair or designed not for the benefit of students but to
generate dollars for educators (a school with good overall scores can be
rewarded for them in many states) has had significant effects on the views of
parents. In 2023, for instance, 200,000 kids in New York State—mostly from
middle- and upper-class families—refused to take the state’s standardized
reading and math exams for grades 3–8.
The anti-testing
movement reached its zenith thanks to the combination of the 2020 racial
awakening and the Covid lockdowns. No one was learning anything in school, no
one could show up to a testing site, and the whole project was basically an
extension of Jim Crow—so why bother? Colleges across the country went
test-optional. And some, as in the case of the University of California, went
test-blind—that is, the schools refused to look at test scores at all.
This happened
despite the objections by the University of California’s faculty—whose leaders
in 2020 issued a report recommending the continued use of the SAT and ACT as
admission requirements, citing data showing the tests may actually increase
enrollment of disadvantaged students. The report noted that the faculty “did
not find evidence that UC’s use of test scores played a major role in worsening
the effects of disparities already present among applicants and did find
evidence that UC’s admissions process helped to make up for the potential
adverse effect of score differences between groups.” Teachers in classrooms,
who play no role in the selection of the people they instruct, did not want to
be stuck with students who didn’t know how to do the work. Or they didn’t want
to be pressured to lower their standards to meet the expectations of such
students.
Nonetheless, in
2021, like most other elite schools, UC ended its testing regimen. Lemann,
for one, largely rejects the idea that the tests themselves are racist, taking
great pains to note that disparities in scoring do not automatically mean that there
is structural racism going on. He is even one of the few mainstream authors
willing actually to acknowledge not only the small number of black students who
score well on the SATs, but also the huge gaps among racial groups when it
comes to college admissions. And Lemann acknowledges that SATs do have some
predictive value when it comes to college (and particularly freshman year)
performance.
That being said,
he also argues that tests simply tend to reinforce the class system that they
were meant to undermine because they so closely reflect students’ family
background and schooling. He cites recent research by Raj Chetty showing that
“of currently enrolled students at Ivy Plus schools, the top 10 percent of the
family income distribution accounts for 98 percent of the students with
combined SAT scores of 1500 or above.” But Chetty’s work also found that
wealthy students had a leg up in admissions, beyond what their SAT scores
should have offered them. Indeed, the test scores were the most equalizing measure—when
compared with things like extracurricular activities or athletics or essays
that were guided by highly paid college counselors.
And, as it turned
out, the strategy of going test-optional had the opposite of the intended
effect for some schools. A study by Dartmouth College researchers, for
instance, found that the school’s test-optional policy resulted in lower-income kids
actually withholding scores that would have helped them get in. And this
doesn’t even cover the long-term implications. Evidence from both public and
private universities has shown that admitting kids with lower scores (no matter
what their race) made it less likely that they would graduate and more likely
that they would switch to an easier major.
By the end of
2023, with the pandemic gone and the racial awakening in full swing, many
assumed that the last pieces of dirt were being shoveled onto the corpse of
standardized testing. Colleges had admitted two years of classes without the
tests and seemed content to continue down that path. And something else was
happening as well. The cases testing the constitutionality of affirmative
action at Harvard and the University of North Carolina were about to hit the
Supreme Court—and the key piece of data used by the plaintiffs to argue they
had been discriminated against was test scores. Without testing, schools could
adopt an entirely subjective approach to assembling their ideal student bodies,
a subjective approach that could use race without providing any evidence that
they had done so.
And then,
shockingly, there came a vibe shift.
_____________
In January 2024,
David Leonhardt published a long piece in the New York Times called
“The Misguided War on the SAT.” According to the subtitle,
“colleges have fled standardized tests, on the theory that they hurt diversity.
That’s not what the research shows.” And suddenly it was OK to say what
everyone knew:
[A] growing number of experts and
university administrators wonder whether the switch [away from testing] has
been a mistake. Research has increasingly shown that standardized test scores
contain real information, helping to predict college grades, chances of
graduation and post-college success. Test scores are more reliable than high
school grades, partly because of grade inflation in
recent years.
Without test scores, admissions
officers sometimes have a hard time distinguishing between applicants who are
likely to do well at elite colleges and those who are likely to struggle.
Researchers who have studied the issue say that test scores can be particularly
helpful in identifying lower-income students and underrepresented minorities
who will thrive. These students do not score as high on average as students
from affluent communities or white and Asian students. But a solid score for a
student from a less privileged background is often a sign of enormous
potential.
Dartmouth decided
to reinstitute its testing requirement a few weeks later, a move that sent
shock waves through the world of higher education. University leaders told me
they had had no advance notice of Dartmouth’s decision. Though MIT had already
reinstituted testing, it was seen as something of an outlier, given its focus
on empirical fields of study. But Dartmouth?
All it took was
one, though. And within a few months, dozens of schools, including Harvard and
Yale, reinstituted their testing requirements. But, as Lemann notes, the Supreme Court’s
decision barring affirmative action will “make it much harder for universities to
pursue their preferred course of embracing both standardized tests and racial
integration. They will have to choose one or the other.”
It is clear that
a number of schools are already trying to cheat the system. How else to explain
a drop in the share of black students from 15 percent to 5 percent at MIT while
at the University of North Carolina—one of the defendants in the Supreme Court
litigation—the black population has remained stable? Can anyone explain why
Yale’s percentage of black students was unchanged even as the percentage of
Asian admissions fell, while at Harvard the percentage of black students
dropped while the share of Asian students remained the same?
Assuming all this
corrupt and loathsome gamesmanship is eventually sorted out with additional
lawsuits and enforcement—of the kind that was necessary after Brown v.
Board of Education was decided—how are schools going to use
standardized tests going forward? While all this has been going on, the people
who create and sell college admissions tests have been scrambling. A few years
ago, they thought they were about to go the way of the buggy whip. They have
been spared, it appears, from obsolescence. But now they have to find
innovative ways to sell and promote their products in a changing world.
In the spring of
2024, private-equity firm Nexus Capital Management purchased the ACT. In a
somewhat embarrassing display of trash-talking, the CEO of the College Board
peevishly argued that the acquisition of its competitor was a bad sign for the
test because it showed how hidebound and backward the ACT was. “The most
striking difference between us and the ACT is that the ACT is essentially the
same test as it was 15 years ago,” David Coleman told Inside Higher Ed.
“Innovating is hard. We receive criticism every time we innovate.… But leaving
things unchanged is a perilous position to take as an institution. If there’s
one thing to take from this acquisition, it’s that.”
Sure, David. The
fact of the matter is, private-equity firms don’t buy companies that are
worthless. And the truth is that the ACT has been growing in popularity,
including in the very places that the SAT used to dominate. The college
counselors and tutors I’ve spoken to all say they have been much more likely in
recent years to recommend the ACT than the SAT for students. Part of the reason
is the “innovation” Coleman mentions.
Students and
teachers and college counselors want tests to be predictable from one year to
the next. The SAT keeps altering itself and the way it works. Recently, it
became “adaptive,” in a manner that has entirely unnerved everyone in secondary
education. Under its new system, the test can be taken only on a computer. This
gives the College Board the ability to change the test as it’s being
taken. If students do well in the early questions of the math section, for
example, the test will adapt itself and force those kids into harder questions
for the second half. But if they do badly, they will be sent to an easier part.
The College Board says it has done this mostly to shorten test-taking time from
a little over three hours to a little over two. (Since a significant percentage
of students now get extra time on these tests, many students were actually
spending upward of four and a half hours on the test.)
But what on earth
could this mean? The theory behind the validity of standardized tests is that
everyone in the country is given the same questions on the same day so that
they all compete with one another under the same rules. Under this new system,
the scores are adjusted so that students who get the easy section on the second
half of the test will have a ceiling on their score; they won’t end up with a
ruinous score, but they also won’t be able to get into the highest echelons.
This alters the very definition of the word “standard.” It makes an
apples-to-apples comparison impossible. It has also already created unwarranted
and unpleasant anxiety for kids taking the test who have found themselves
suddenly confronted with a schizophrenic test that is easy in the early going
and increasingly impossible as it progresses. They leave the test thinking they
have failed and have to wait weeks to learn otherwise, just as they are trying
to decide what colleges they might want to apply to. The College Board is imposing
psychological duress on the people who are forced to pay to take its test.
Given the novelty
of “adaptive testing,” educators and parents alike are going to have to parse
the exact distribution of the scores on the new tests to understand whether the
College Board is trying to use this new test to rearrange things for its own benefit
and for some idea of what will be best for society as a whole. Are they doing
this to boost the scores of mediocre students and thereby implicitly aid
colleges in reorienting themselves in a world without affirmative action?
Over the years,
the SAT has futzed around with itself. It got rid of the notorious “analogies”
section (cat:dog as teleology:_____). At one point, it added an essay—and then
removed it. And scores have been “recentered” multiple times as American
students perform worse and worse, “recentering” simply being a euphemism for
score inflation.
The ACT, on the
other hand, has remained mostly true to its roots. It is designed to figure out
what students have learned and where they should be placed and is even
expanding its market beyond higher education. Janet Godwin, ACT’s CEO since
2020, told Inside Higher Ed that the company is interested in “college and career
readiness,” where the results could be useful not only to admissions officers
but also to employers.
Godwin explains,
“We’ve been working in the four-year college domain since our inception, but
we’re increasingly also helping learners who are going from high school into
community college, or the 40 percent of learners going straight to the
workforce…. We want to double down in those areas.”
And, as Lemann
argues, that’s exactly what this next phase of standardized testing should do.
He thinks it’s time to stop focusing so much on what tests can do for the elite
and think about how they can help everyone else. He writes: “The number of
students who come from disadvantaged backgrounds and who went to
underperforming high schools, but who nevertheless get superior scores on
standardized tests is tiny.” He concludes, “It’s not worth it to design a whole
system just to catch a few people who, in the twenty-first century, will be
going to college anyway.”
Instead, he
argues, “if, today, we define the problem that testing is meant to solve not as
improving selection for a few elite universities but as improving the too-low
graduation rates and other aspects of the student learning experience at a
large number of relatively unselective universities, we would be drawn to
diagnostic rather than predictive tests, to achievement, rather than aptitude
tests.”
Many large
schools have already done this. In his book The College Dropout Scandal,
David Kirp describes how Georgia State increased its graduation rate to 54
percent from roughly 33 percent in 15 years. “Rather than blaming the
students,” one of its administrators told Kirp, “we took a hard look in the
mirror.” The school now monitors grades in particular introductory classes
after determining that students who do badly are much more likely to drop out
of school than seek help. It offers tutoring and other supports before a bad
grade has even been registered. Georgia State also uses test scores to help
students find majors for which they are qualified and steer them into such
areas early on before they have wasted two years on a path they won’t be able
to complete.
_____________
The gap between
what is necessary for a high-school diploma and what a decent university might
expect of its incoming class is vast for many large public schools. More than
40 percent of students entering the California State University system now
require remedial education. In other words, Cal State freshmen are not ready
for freshman classes at Cal State. More tests can help colleges identify which
majors students might be able to succeed in and at least give the kids fair
warning of the likelihood they can graduate with a degree in a particular
field.
Tests along the
lines of the ACT will aid in this process. Maybe it’s of greater importance to
find out where millions of kids are and what they actually know than what their
potential is. All the years of using the SAT to find diamonds in the rough have
effectively given a pass to a lot of mediocre elementary and secondary schools.
You want our smart kids? Come and find them for yourself.
Of course, there
are all sorts of structural reasons why the signals sent to leaders in K–12
education may not move the needle. And Lemann is startlingly naive about this
process when he lists the “broad educational goals that tests should serve.” He
notes that “every American child should have a decent public education that
leaves him or her truly literate, numerate, and able to think and act as an
empowered citizen.” To accomplish that, he says, “would require significantly
reforming the school system so that the all-poor, usually all-minority,
severely underresourced schools that occupy the system’s bottom tier would
become much better.”
Oh, is that all?
Nevertheless,
Lemann is correct that testing sends important signals to students, parents,
teachers, and politicians about our education system. And the more of those we
have, the more likely we will be able to help the largest number of kids. The
College Board era should become a memory.