"It should never have appeared in the press," blared an article from The Guardian's Lois Beckett last week—seeming to refer to the findings of a recent Brookings Institution survey of college students that had been reported on widely, including here at Reason.
According to the lead researcher in that study, John Villasenor of the University of California, Los Angeles (UCLA), "The survey results establish with data what has been clear anecdotally to anyone who has been observing campus dynamics in recent years: Freedom of expression is deeply imperiled on U.S. campuses." Among other things, the poll found that 44 percent of students at four-year universities don't know the First Amendment protects so-called hate speech from censorship.
Beckett's suggestion—amplified by a number of others, including Daniel Drezner at The Washington Post—is that these numbers should not be trusted. The Brookings poll "was not administered to a randomly selected group of college students nationwide, what statisticians call a 'probability sample,'" she wrote. "Instead, it was given to an opt-in online panel of people who identified as current college students."
So what? Should you disregard the findings? Do we consider the survey "debunked"? My answer is no. The meat of Beckett's critique—that the poll didn't draw a random sample of college students—is grounded in real science. Random sampling is the gold-standard methodology in survey research, and this study did not use it. But it would not be accurate to say there's a consensus in the field that no other kind of poll is ever valid or worth citing, though the Beckett piece strongly implies as much.
These days, lots of well-respected outfits are doing sophisticated work outside the confines of traditional probability polls. All surveys can be done well or poorly, and some online surveys really are garbage. Nonetheless, it's a stretch to claim that any poll that uses an opt-in panel is necessarily junk, and it's far from clear from the facts presented in the Guardian story that the study in question should be included among the bad ones.
That's the short answer, and perhaps I should leave it there. But for those who might be interested in the nitty-gritty details behind this dispute, I'll go a little further below the fold.
The charges being leveled here are threefold: that the methodology section appended to the study is overly vague; that because the poll was done via an opt-in nonprobability web panel, its findings are not trustworthy; and that Villasenor estimated a "margin of error" for the poll, a no-no for this type of research.
The first complaint is fair enough. The author says that he "requested that UCLA contract with a vendor for the data collection," omitting whom the school contracted with and how the survey was actually conducted. Disclosing that sort of thing is an important part of research transparency, so it's understandable that its absence raised questions. But of course, this doesn't necessarily mean the results are bad, only that there's good reason to seek more clarity. Catherine Rampell did that, and laid out her findings, in this piece for The Washington Post.
The second complaint is the serious one. And here it helps to have a sense of the history of online polling efforts.
For years, most good survey researchers eschewed nonprobability polling on the grounds that drawing a random sample (i.e., one where everyone has an equal chance of being interviewed) is how you know that the opinions of the relatively small number of people you actually hear from are reflective of the opinions of the population as a whole.
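To see why that logic is so compelling, here is a quick simulation sketch (using invented numbers, not data from any real poll) of how random samples of modest size land close to the true population figure:

```python
# A quick simulation (illustrative numbers only) of why random sampling works:
# repeated random samples of modest size cluster tightly around the true
# population value, which is what lets roughly 1,000 interviews stand in
# for millions of people.
import random

random.seed(1)

# A made-up population of one million people, 44 percent of whom say "yes."
population = [1] * 440_000 + [0] * 560_000

estimates = []
for _ in range(200):
    # random.sample gives every member an equal chance of being "interviewed."
    sample = random.sample(population, 1000)
    estimates.append(sum(sample) / len(sample))

print(f"true share: 44.0%; 200 sample estimates ranged from "
      f"{min(estimates):.1%} to {max(estimates):.1%}")
```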
"I think most of us who have been part of the survey research community for more than 10 years were trained with exactly that notion—that the center of truth starts with random sampling," says Mark Blumenthal, the Pollster.com founder who now heads election polling at SurveyMonkey, a company that (in addition to the peer-to-peer web surveys you may be familiar with) offers an online nonprobability panel like the one the Brookings survey relied on. "But many of us have evolved in our thinking?."
Probability sampling is indeed the ideal. Unfortunately, as I've written about fairly extensively, getting a truly random group of people to answer your questions is difficult to the point of being almost impossible in an age when so many refuse to pick up their phones and answer their doors. Even the very best polling companies have seen response rates plummet into the single digits, meaning their raw numbers have to be adjusted ("weighted," in pollster parlance) more aggressively to try to approximate a representative sample. And it's becoming more and more expensive over time.
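For the curious, here is a minimal sketch of what that weighting adjustment looks like in practice. The demographic categories and percentages below are invented for illustration, not taken from any actual survey:

```python
# A minimal sketch of post-stratification weighting, using made-up numbers.
# The idea: if a group makes up 20% of the population but only 10% of your
# respondents, each respondent in that group counts roughly twice as much.

# Hypothetical population shares and sample shares for one demographic variable.
population_share = {"18-29": 0.20, "30-49": 0.35, "50-64": 0.25, "65+": 0.20}
sample_share     = {"18-29": 0.10, "30-49": 0.30, "50-64": 0.30, "65+": 0.30}

# Weight for each group = population share / sample share.
weights = {g: population_share[g] / sample_share[g] for g in population_share}

# Suppose these are the (made-up) percentages answering "yes" in each group.
pct_yes = {"18-29": 0.60, "30-49": 0.50, "50-64": 0.40, "65+": 0.30}

# The unweighted estimate treats every respondent equally...
unweighted = sum(sample_share[g] * pct_yes[g] for g in pct_yes)

# ...while the weighted estimate re-balances the sample toward the population.
weighted = sum(sample_share[g] * weights[g] * pct_yes[g] for g in pct_yes)

print(f"unweighted: {unweighted:.1%}, weighted: {weighted:.1%}")
```

The more the raw sample's makeup diverges from the population's, the heavier that correction has to be, which is the "more aggressively" part of the problem.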
Under these conditions, researchers began to look for creative alternative ways to estimate public opinion—silver standards, one might say, to random sampling's gold. One of the main methods that has emerged is the online nonprobability panel–based poll.
"All forms of surveys today—whether they start with a probability sample or not—the completed sample is not truly random, and there has to be some sort of correction," Blumenthal says. "We believe we can offer something of similar quality, at a very different price point [compared to traditional probability sampling], and with more speed."
A number of companies are now working in this sphere. The British firm YouGov, probably the best in class when it comes to this type of online panel research, partners with such outlets as The New York Times, CBS News, and The Economist to conduct surveys. In 2013, the American Association for Public Opinion Research (AAPOR) cautiously sanctioned nonprobability polling and began building out a framework to govern its use. Sandy Berry of Rand Survey Research Group, the company UCLA brought in to oversee Villasenor's study, tells me the methodology they used "is consistent with" AAPOR best practices.
None of which means there are no legitimate critics of nonprobability survey research still out there. Cliff Zukin, a former AAPOR president quoted in the Guardian story, says nonprobability research should be reserved for internal decision-making purposes. If a number is going to be released publicly, he believes it should employ the best methodology available—and for now, that means probability sampling.
"I might not feel the same way in two years," Zukin says. "There's a lot of research being done, and [nonprobability surveys have] gotten much better than they used to be, but there's still a gap?."
How big of a gap? It's extremely hard to say. In 2016, Pew Research Center released a landmark report on the state of nonprobability survey research. Interestingly, it found that Pew's own probability-based panel (where researchers go to great lengths and enormous expense to try to reach a truly random sample of people) did not outperform all of the nonprobability polls on all of the metrics. "While the differences between probability and nonprobability samples may be clear conceptually," the authors concluded, "the practical reality is more complicated."
Among other things, that report confirmed what researchers have long understood: that web-only research is, for obvious reasons, weakest at estimating the opinions of groups that aren't as active online, such as the elderly, Latinos, and very low-income populations. On the flip side, tech-savvy college kids (the population the Brookings study was looking at) are arguably particularly well-suited to being reached this way.
And that brings us to the third complaint.
It's accurate to point out, as Beckett does, that it doesn't make sense to report a margin of sampling error for a survey that wasn't drawn from a random sample. I emailed AAPOR Vice President David Dutwin, who confirmed that "AAPOR does not advocate using margin of error in non-probability samples." He hastened to add, however, that "some form of error estimation has to be used to assess statistical inference [and] at this time there is no universally accepted measure of error that survey researchers use to apply to non-probability samples."
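To make the distinction concrete, this is roughly the textbook calculation the term "margin of error" usually refers to. It assumes a simple random sample, which is precisely what an opt-in panel is not; the 44 percent figure comes from the survey discussed above, while the sample size is just an illustrative stand-in:

```python
# A back-of-the-envelope sketch of the textbook margin of sampling error,
# which only carries its usual interpretation when respondents are drawn as
# a simple random sample. The sample size here is illustrative, not the
# study's actual n.
import math

def margin_of_error(p: float, n: int, z: float = 1.96) -> float:
    """95% margin of error for a proportion p from a simple random sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

# E.g., a finding of 44% among 1,500 randomly sampled respondents:
print(f"+/- {margin_of_error(0.44, 1500):.1%}")  # roughly +/- 2.5 points
```

Applying that formula to an opt-in panel imports an assumption the design doesn't satisfy, which is the substance of the complaint.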
So Villasenor broke with convention by using the same term ("margin of error") to describe the uncertainty in a nonprobability survey that is customarily reserved for probability surveys. Rand SRG's Berry concedes this was a mistake on his part*. But is that enough to render the poll "junk science"?
Personally, I think not.
*UPDATE: Villasenor emailed me to note that his original writeup said the margin of error can be estimated only "to the extent" the weighted demographics of his interviews are "probabilistically representative." He writes: "Stating the margin of error that would apply in the case of a theoretically perfect sample, accompanied by the appropriate caveat, which I gave, provides more information than staying silent on the issue. It provides information on the limiting case."