Tuesday, November 26, 2024

The Rotting of the College Board

 

Testing is necessary. The SAT’s creator is not

by Naomi Schaefer Riley

 

In 1947, the College Board opened an office in Berkeley, California. Previously, from the turn of the century onward, the organization had been administering entrance examinations for schools in the Northeast, and in 1926 it created and began using the original Scholastic Aptitude Test, or SAT. The Board’s western expansion after World War II was a sign, according to the writer Nicholas Lemann, of its “national aspirations and of the University of California’s high status in public higher education.” But it would take another two decades for the University of California to begin requiring its applicants to take the SAT—and therein hangs a tale about the changes in American education in which testing and the College Board itself have played a decisive and highly problematic role.

It is a tale about how the testing regime of the College Board began to dominate the higher-education admissions system on its way to becoming a behemoth that now grosses $1 billion annually. More important, it is a tale about how universities in the United States changed the way they interact with the primary and secondary educational institutions that feed into them—a change that had, and continues to have, parlous consequences for America’s youth.

After World War II, President Truman formed a commission to “chart the future of higher education,” Lemann explains in his new book, Higher Admissions: The Rise, Decline, and Return of Standardized Testing. In 1947, the commission issued a report that “recommended the democratization of access to college. It called for a massive financial aid program at the undergraduate and graduate levels, free tuition for those attending two-year schools, and a program of continuing education.” It also called for an end to segregation in the South and to the quotas that were employed against Jews.

But while the commission’s purpose was to open higher education to all, there was a competing idea, says Lemann—one that “came to value and champion a select number of private and public institutions that would nurture a kind of talented elite.” How to both manage a large influx of students from across the country with different educational backgrounds and also funnel them into a small number of slots was a question that might be solved by standardized testing.

The SAT began as an IQ test, a way for higher education to locate students with “aptitude” at educational institutions not necessarily known for producing college-level talent. It was modeled on an IQ test that was given to U.S. Army inductees during World War I. But as such tests became more associated with racial sorting and eugenics, the test eventually changed too, even altering its name from the Scholastic Aptitude Test to the Scholastic Assessment Test.

The story of California’s journey to the testing regime overseen by a third-party institution is instructive. In 1947, the University of California was responsible for accrediting public high schools in the Golden State—meaning it oversaw curriculum and testing practices. The same was true for public universities in states elsewhere in the country as well. “This process, rather than standardized testing, was the primary way California managed the fit between high school applicants and university entrants,” Lemann writes. It was perfectly logical: If a college knew what high-school applicants were learning and got reports in the form of grades showing how they had done in their courses, there was no need for any other measure of their performance.

But over the decades following the Second World War and with the general glorification of the paradisial life in the Golden State, the University of California became a destination for students across the country. This made it more difficult for the school to understand the level of preparation of its growing pool of applicants. And once the public-university system was divorced from K–12 education, which happened in 1963 when UC stopped accrediting high schools, its leaders lost their insight into what the students in California knew.

This was the case everywhere, especially with the growth in the number of students due to the baby boom and the number of people who followed the commission’s recommendations and began attending college. At the highest level, the old pipeline from Exeter and Andover to Harvard and Yale had once meant that university administrators had a pretty good idea of what they were getting when students matriculated. Maybe some were mediocre students with important last names, but even so, their grades in biology could easily be compared with those of their peers. Once the applicant pool expanded, depending on a high-school record alone became more of a crapshoot.

And well beyond the Holden Caulfields and the kids in A Separate Peace, there were thousands of high schools across the country. Without some uniform method of evaluating them, it would have been left to a few admissions officers in each college to familiarize themselves with the quality and requirements of America’s wildly expanding secondary educational institutions serving the Baby Boom generation.

Standardized tests were designed to help solve this problem. And the College Board was, after a time, not the only third-party group to create and administer one. In 1959, E.F. Lindquist of the University of Iowa created the ACT (which used to be an acronym for American College Testing but now is just a three-letter brand). And though it is often thought of as interchangeable with the SAT, it was based on an entirely different premise. Lemann describes it as “a direct challenge to the SAT and everything it stood for.” Lindquist, Lemann writes, “believed that the tests standing between high school and college should measure academic achievement, something not akin to IQ, which was a concept he rejected, and that they should be used for placement, not selection.”

From the outset, the tests had a peculiar regional quality to them: “The SAT’s territory was more on the coasts and the ACT’s more in the middle of the country, the SAT’s more private universities and the ACT’s more public universities, and the SAT’s more selective research universities and the ACT’s more open admissions, teaching oriented universities.”

The original distinctions between the two were lost as a growing movement against standardized testing gathered steam. The push to do away with testing began to find traction in the late 1990s and early 2000s. An organization called FairTest, now the National Center for Fair and Open Testing, started lobbying the NCAA to drop a minimum SAT requirement. Then it campaigned against school-based assessments. It filed briefs in the Michigan affirmative-action case in 2003 and lobbied against the No Child Left Behind bill, which sought to measure school performance through testing and which was signed into law in January 2002.

In 2003, as Lemann writes, “a retired ETS [Educational Testing Service] researcher named Roy Freedle published a much-discussed article in the Harvard Educational Review arguing that the SAT is, in fact, racially biased in the technical sense—especially the verbal section of the test.” Divvying up the test into easier and more difficult parts, he found that “Black test takers overperformed on the hard part and underperformed on the easy part.” He concluded that “Black students underperformed on easier items because they often don’t live in the same culture as white students,” and he advocated for a revision of the test, called the R-SAT.

Leaders of the movement managed to persuade mostly middle-class and wealthy parents both that standardized tests were discriminatory and that they were doing more harm than good: stressing out children in good schools for no reason, distracting from more interesting and important parts of school, and not actually revealing any useful information about a student’s achievement or potential for achievement in the process. In addition, the fact that a kid could improve his SAT score significantly through extra studying and tutoring at parental expense (which wasn’t supposed to be possible, theoretically, since the test is designed to measure a kid’s ability to learn, not what he knows) suggested to many people that the whole system was rigged in favor of the well-to-do.

In 2001, University of California President Richard Atkinson gave a public address discussing the trend among wealthier families to pay for test prep, suggesting that it was not only racially discriminatory but also unfair to students coming from poorer backgrounds. Atkinson raised the issue of dropping the SAT entirely or considering the use of the SAT II (subject tests no longer offered) as a better way of sorting students. But neither this idea nor the R-SAT took off.

_____________

The idea that testing is either unfair or designed not for the benefit of students but to generate dollars for educators (a school with good overall scores can be rewarded for them in many states) has had significant effects on the views of parents. In 2023, for instance, 200,000 kids in New York State—mostly from middle- and upper-class families—refused to take the state’s standardized reading and math exams for grades 3–8.

The anti-testing movement reached its zenith thanks to the combination of the 2020 racial awakening and the Covid lockdowns. No one was learning anything in school, no one could show up to a testing site, and the whole project was basically an extension of Jim Crow—so why bother? Colleges across the country went test-optional. And some, as in the case of the University of California, went test-blind—that is, the schools refused to look at test scores at all.

This happened despite the objections by the University of California’s faculty—whose leaders in 2020 issued a report recommending the continued use of the SAT and ACT as admission requirements, citing data showing the tests may actually increase enrollment of disadvantaged students. The report noted that the faculty “did not find evidence that UC’s use of test scores played a major role in worsening the effects of disparities already present among applicants and did find evidence that UC’s admissions process helped to make up for the potential adverse effect of score differences between groups.” Teachers in classrooms, who play no role in the selection of the people they instruct, did not want to be stuck with students who didn’t know how to do the work. Or they didn’t want to be pressured to lower their standards to meet the expectations of such students.

Nonetheless, in 2021, like most other elite schools, UC ended its testing regimen. Lemann, for one, largely rejects the idea that the tests themselves are racist, taking great pains to note that disparities in scoring do not automatically mean that there is structural racism going on. He is even one of the few mainstream authors willing actually to acknowledge not only the small number of black students who score well on the SATs, but also the huge gaps among racial groups when it comes to college admissions. And Lemann acknowledges that SATs do have some predictive value when it comes to college (and particularly freshman year) performance.

That being said, he also argues that tests simply tend to reinforce the class system that they were meant to undermine because they so closely reflect students’ family background and schooling. He cites recent research by Raj Chetty showing that “of currently enrolled students at Ivy Plus schools, the top 10 percent of the family income distribution accounts for 98 percent of the students with combined SAT scores of 1500 or above.” But Chetty’s work also found that wealthy students had a leg up in admissions, beyond what their SAT scores should have offered them. Indeed, the test scores were the most equalizing measure—when compared with things like extracurricular activities or athletics or essays that were guided by highly paid college counselors.

And, as it turned out, the strategy of going test-optional had the opposite of the intended effect for some schools. A study by Dartmouth College researchers, for instance, found that the school’s test-optional policy resulted in lower-income kids actually withholding scores that would have helped them get in. And this doesn’t even cover the long-term implications. Evidence from both public and private universities has shown that admitting kids with lower scores (no matter what their race) made it less likely that they would graduate and more likely that they would switch to an easier major.

By the end of 2023, with the pandemic gone and the racial awakening in full swing, many assumed that the last pieces of dirt were being shoveled onto the corpse of standardized testing. Colleges had admitted two years of classes without the tests and seemed content to continue down that path. And something else was happening as well. The cases testing the constitutionality of affirmative action at Harvard and the University of North Carolina were about to hit the Supreme Court—and the key piece of data used by the plaintiffs to argue they had been discriminated against was test scores. Without testing, schools could adopt an entirely subjective approach to assembling their ideal student bodies, a subjective approach that could use race without providing any evidence that they had done so.

And then, shockingly, there came a vibe shift.

_____________

In January 2024, David Leonhardt published a long piece in the New York Times called “The Misguided War on the SAT.” According to the subtitle, “colleges have fled standardized tests, on the theory that they hurt diversity. That’s not what the research shows.” And suddenly it was OK to say what everyone knew:

[A] growing number of experts and university administrators wonder whether the switch [away from testing] has been a mistake. Research has increasingly shown that standardized test scores contain real information, helping to predict college grades, chances of graduation and post-college success. Test scores are more reliable than high school grades, partly because of grade inflation in recent years.

Without test scores, admissions officers sometimes have a hard time distinguishing between applicants who are likely to do well at elite colleges and those who are likely to struggle. Researchers who have studied the issue say that test scores can be particularly helpful in identifying lower-income students and underrepresented minorities who will thrive. These students do not score as high on average as students from affluent communities or white and Asian students. But a solid score for a student from a less privileged background is often a sign of enormous potential.

Dartmouth decided to reinstitute its testing requirement a few weeks later, a move that sent shock waves through the world of higher education. University leaders told me they had had no advance notice of Dartmouth’s decision. Though MIT had already reinstituted testing, it was seen as something of an outlier, given its focus on empirical fields of study. But Dartmouth?

All it took was one, though. And within a few months, dozens of schools, including Harvard and Yale, reinstituted their testing requirement. But, as Lemann notes, the court’s decision barring affirmative action will “make it much harder for universities to pursue their preferred course of embracing both standardized tests and racial integration. They will have to choose one or the other.”

It is clear that a number of schools are already trying to cheat the system. How else to explain a drop in the share of black students from 15 percent to 5 percent at MIT while at the University of North Carolina—one of the defendants in the Supreme Court litigation—the black population has remained stable? Can anyone explain why Yale’s percentage of black students was unchanged even as the percentage of Asian admissions fell, while at Harvard the percentage of black students dropped while the share of Asian students remained the same?

Assuming all this corrupt and loathsome gamesmanship is eventually sorted out with additional lawsuits and enforcement—of the kind that was necessary after Brown v. Board of Education was decided—how are schools going to use standardized tests going forward? While all this has been going on, the people who create and sell college admissions tests have been scrambling. A few years ago, they thought they were about to go the way of the buggy whip. They have been spared, it appears, from obsolescence. But now they have to find innovative ways to sell and promote their products in a changing world.

In the spring of 2024, private-equity firm Nexus Capital Management purchased the ACT. In a somewhat embarrassing display of trash-talking, the CEO of the College Board peevishly argued that the acquisition of its competitor was a bad sign for the test because it showed how hidebound and backward the ACT was. “The most striking difference between us and the ACT is that the ACT is essentially the same test as it was 15 years ago,” David Coleman told Inside Higher Ed. “Innovating is hard. We receive criticism every time we innovate.… But leaving things unchanged is a perilous position to take as an institution. If there’s one thing to take from this acquisition, it’s that.”

Sure, David. The fact of the matter is, private-equity firms don’t buy companies that are worthless. And the truth is that the ACT has been growing in popularity, including in the very places that the SAT used to dominate. The college counselors and tutors I’ve spoken to all say they have been much more likely in recent years to recommend the ACT than the SAT for students. Part of the reason is the “innovation” Coleman mentions.

Students and teachers and college counselors want tests to be predictable from one year to the next. The SAT keeps altering itself and the way it works. Recently, it became “adaptive,” in a manner that has entirely unnerved everyone in secondary education. Under its new system, the test can be taken only on a computer. This gives the College Board the ability to change the test as it’s being taken. If students do well in the early questions of the math section, for example, the test will adapt itself and force those kids into harder questions for the second half. But if they do badly, they will be sent to an easier part. The College Board says it has done this mostly to shorten test-taking time from a little over three hours to a little over two. (Since a significant percentage of students now get extra time on these tests, many students were actually spending upward of four and a half hours on the test.)

But what on earth could this mean? The theory behind the validity of standardized tests is that everyone in the country is given the same questions on the same day so that they all compete with one another under the same rules. Under this new system, the scores are adjusted so that students who get the easy section on the second half of the test will have a ceiling on their score; they won’t end up with a ruinous score, but they also won’t be able to get into the highest echelons. This alters the very definition of the word “standard.” It makes an apples-to-apples comparison impossible. It has also already created unwarranted and unpleasant anxiety for kids taking the test who have found themselves suddenly confronted with a schizophrenic test that is easy in the early going and increasingly impossible as it progresses. They leave the test thinking they have failed and have to wait weeks to learn otherwise, just as they are trying to decide what colleges they might want to apply to. The College Board is imposing psychological duress on the people who are forced to pay to take its test.

Given the novelty of “adaptive testing,” educators and parents alike are going to have to parse the exact distribution of the scores on the new tests to understand whether the College Board is trying to use this new test to rearrange things for its own benefit and for some idea of what will be best for society as a whole. Are they doing this to boost the scores of mediocre students and thereby implicitly aid colleges in reorienting themselves in a world without affirmative action?

Over the years, the SAT has futzed around with itself. It got rid of the notorious “analogies” section (cat:dog as teleology:_____). At one point, it added an essay—and then removed it. And scores have been “recentered” multiple times as American students perform worse and worse, “recentering” simply being a euphemism for score inflation.

The ACT, on the other hand, has remained mostly true to its roots. It is designed to figure out what students have learned and where they should be placed and is even expanding its market beyond higher education. Janet Godwin, ACT’s CEO since 2020, told Inside Higher Ed that the company is interested in “college and career readiness,” where the results could be useful not only to admissions officers but also to employers.

Godwin explains, “We’ve been working in the four-year college domain since our inception, but we’re increasingly also helping learners who are going from high school into community college, or the 40 percent of learners going straight to the workforce…. We want to double down in those areas.”

And, as Lemann argues, that’s exactly what this next phase of standardized testing should do. He thinks it’s time to stop focusing so much on what tests can do for the elite and think about how they can help everyone else. He writes: “The number of students who come from disadvantaged backgrounds and who went to underperforming high schools, but who nevertheless get superior scores on standardized tests is tiny.” He concludes, “It’s not worth it to design a whole system just to catch a few people who, in the twenty-first century, will be going to college anyway.”

Instead, he argues, “if, today, we define the problem that testing is meant to solve not as improving selection for a few elite universities but as improving the too-low graduation rates and other aspects of the student learning experience at a large number of relatively unselective universities, we would be drawn to diagnostic rather than predictive tests, to achievement, rather than aptitude tests.”

Many large schools have already done this. In his book The College Dropout Scandal, David Kirp describes how Georgia State increased its graduation rate to 54 percent from roughly 33 percent in 15 years. “Rather than blaming the students,” one of its administrators told Kirp, “we took a hard look in the mirror.” The school now monitors grades in particular introductory classes after determining that students who do badly are much more likely to drop out of school than seek help. It offers tutoring and other supports before a bad grade has even been registered. Georgia State also uses test scores to help students find majors for which they are qualified and steer them into such areas early on before they have wasted two years on a path they won’t be able to complete.

_____________

The gap between what is necessary for a high-school diploma and what a decent university might expect of its incoming class is vast for many large public schools. More than 40 percent of students entering the California State University system now require remedial education. In other words, Cal State freshmen are not ready for freshman classes at Cal State. More tests can help colleges identify which majors students might be able to succeed at and at least give the kids fair warning of the likelihood they can graduate with a degree in a particular field.

Tests along the lines of the ACT will aid in this process. Maybe it’s of greater importance to find out where millions of kids are and what they actually know than what their potential is. All the years of using the SAT to find diamonds in the rough have effectively given a pass to a lot of mediocre elementary and secondary schools. You want our smart kids? Come and find them for yourself.

Of course, there are all sorts of structural reasons why the signals sent to leaders in K–12 education may not move the needle. And Lemann is startlingly naive about this process when he lists the “broad educational goals that tests should serve.” He notes that “every American child should have a decent public education that leaves him or her truly literate, numerate, and able to think and act as an empowered citizen.” To accomplish that, he says, “would require significantly reforming the school system so that the all-poor, usually all-minority, severely underresourced schools that occupy the system’s bottom tier would become much better.”

Oh, is that all?

Nevertheless, Lemann is correct that testing sends important signals to students, parents, teachers, and politicians about our education system. And the more of those we have, the more likely we will be able to help the largest number of kids. The College Board era should become a memory.

 
