My column last week reviewed the controversy over whether representative surveys can be conducted over the Internet using non-random "opt-in" panels of volunteer respondents recruited online. The key issue is the challenge that these surveys pose to one of the "first principles" of polling: Can we get reliable, valid results from a survey that does not begin as a random sample?
This week I want to look more closely at a recently released analysis of a 2004 study examining the accuracy of a conventional telephone survey as compared to Internet surveys sampled from opt-in panels. The study also involved a hybrid of the two approaches that recruits Internet panel members using conventional telephone interviewing.
Knowledge Networks, the company that conducts surveys using the hybrid method, was founded by two Stanford academics, Norman Nie and Doug Rivers. That fact is relevant to this story because, according to Stanford professor Jon Krosnick, Nie had seen an earlier study conducted by Krosnick comparing the accuracy of different survey methods used during the 2000 election. Krosnick also said that Nie offered to have his research center, the Stanford Institute for the Quantitative Study of Society, support a similar study covering more topics and including more opt-in Internet surveys.
So with Nie's support, Krosnick and Rivers, who by then had left as CEO of Knowledge Networks (he remained chairman until 2006), collaborated on a new study in 2004 that involved parallel questionnaires used on surveys conducted at roughly the same time: one using a conventional random-digit-dial (RDD) sample, one using the Knowledge Networks panel and seven that used non-random, opt-in Internet panels. The surveys included many items that could be scored for accuracy against benchmark statistics gleaned from government records or high-quality government surveys that have very high response rates.
At about the same time, Rivers founded Polimetrix, a company that now conducts Internet surveys drawn from opt-in panels (the original plan, according to Rivers, was to recruit panelists from phone calls made by pollster partners). Polimetrix was acquired by the British firm YouGov. (YouGov/Polimetrix is the owner and primary sponsor of my Web site, Pollster.com.)
In 2005, Krosnick did a preliminary presentation on the results (based on a draft paper co-authored with Rivers) at the annual conference of the American Association for Public Opinion Research. Then, according to Krosnick, the data sat until a graduate student named David Yeager did the additional analyses that are the basis of the newly released paper.
The most critical aspect of the new Krosnick-Yeager analysis was that it used the same procedure to weight the raw data demographically so that all nine surveys would be equally representative in terms of gender, age, race, Hispanic origin, education and census region. They then calculated the average error for each survey on 13 additional measures: "secondary demographics" (such as home ownership, income and household size) and non-demographic questions (such as self-reports of drinking, smoking or having a passport).
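The scoring idea is simple enough to sketch. Here is a minimal Python illustration of the kind of calculation described above, with entirely invented numbers (not the authors' actual data, code or benchmark items): each survey's weighted estimate on a benchmark item is compared to the "true" value from government records, and the survey is scored by its average absolute error across items.

```python
# Hypothetical benchmark values and one survey's weighted estimates (percent).
# These numbers are invented for illustration only.
benchmarks = {
    "owns_home": 68.0,
    "has_passport": 27.0,
    "smokes": 21.0,
}

survey_estimates = {
    "owns_home": 65.5,
    "has_passport": 30.0,
    "smokes": 22.5,
}

def average_absolute_error(estimates, truth):
    """Mean absolute difference, in percentage points, across benchmark items."""
    errors = [abs(estimates[item] - truth[item]) for item in truth]
    return sum(errors) / len(errors)

print(round(average_absolute_error(survey_estimates, benchmarks), 2))  # → 2.33
```

A lower score means a more accurate survey; the study's comparisons amount to computing a score like this for each of the nine surveys and comparing them.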
As the chart shows, they found larger errors for the opt-in Internet panels than for the telephone and Internet surveys based on probability samples. How big were those errors? That's where the debate starts.
Krosnick and Yeager conclude that "probability sample surveys were consistently more accurate than the non-probability sample surveys" and argue in a separate blog post that the average errors for the opt-in surveys were not "slight" but rather "almost twice as big as for the probability samples."
Doug Rivers, who had no involvement in the more recent analysis, read the same numbers differently: he saw the accuracy scores for the opt-in panel surveys as only "around 2% worse" than the probability samples in both the 2004 analysis and the newer reweighted results from the paper by Krosnick, Yeager, et al.
The paper that Rivers drafted with Krosnick in 2004 includes the costs paid to each of the nine organizations that conducted surveys. The cost per interview for the telephone survey ($53) was more than double the Knowledge Networks survey ($25) and more than three times the average cost for the opt-in Internet surveys ($14, ranging from $7 to $21). "Even if it were impossible to eliminate the extra 2% of error from opt-in samples," Rivers wrote last month, "they could still be a better choice for many purposes than an RDD sample that cost several times as much."
But what about the big question posed in last week's column and today? Can surveys produce valid and reliable data without the theoretical underpinning of a random sample?
To try to answer that question, consider first what the Krosnick-Yeager study says about the conventional telephone survey that did begin with a random sample. Despite a degree of methodological rigor that far surpasses most national political surveys (a five-month field period, up to 12 attempts to reach a "no answer" and advance letters offering a $10 incentive), the unweighted data still managed to under-represent women, whites and those with a high school education by statistically significant margins.
The accuracy that real-world probability samples achieve is partly the result of weighting and other adjustments that pollsters do. What "theoretical framework" explains how the probability samples in this study produced accurate results despite response rates ranging from 35.6 percent (for the telephone survey) to just 15.3 percent (for the probability sample Internet survey)?
A statistical approach, known as model-based inference, does exist that helps explain what pollsters do when they use weighting or other forms of modeling to correct for response bias and produce accurate results. The existence of that approach does not, by itself, solve the problem of opt-in samples, but opt-in Internet pollsters argue that they can extend weighting and modeling techniques to significantly reduce sources of bias from non-random panels.
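One standard example of the weighting such a framework covers is "raking" (iterative proportional fitting), which adjusts respondent weights until the sample's margins match known population totals. Here is a minimal sketch; the respondent records and population targets are invented, and this is an illustration of the general technique, not any particular pollster's procedure.

```python
# Invented respondent records and population targets for illustration.
respondents = [
    {"sex": "F", "age": "18-44"},
    {"sex": "F", "age": "45+"},
    {"sex": "M", "age": "18-44"},
    {"sex": "M", "age": "45+"},
    {"sex": "M", "age": "45+"},
]

targets = {  # known population shares for each weighting variable
    "sex": {"F": 0.52, "M": 0.48},
    "age": {"18-44": 0.45, "45+": 0.55},
}

weights = [1.0] * len(respondents)

for _ in range(50):  # alternate over the variables until the margins converge
    for var, shares in targets.items():
        total = sum(weights)
        factors = {}
        for category, share in shares.items():
            current = sum(w for w, r in zip(weights, respondents)
                          if r[var] == category) / total
            factors[category] = share / current
        weights = [w * factors[r[var]] for w, r in zip(weights, respondents)]

female_share = sum(w for w, r in zip(weights, respondents)
                   if r["sex"] == "F") / sum(weights)
print(round(female_share, 2))  # → 0.52
```

After raking, the weighted sample matches the target margins even though the raw sample (here, 40 percent female) did not. Whether such adjustments can fully repair the biases of a non-random panel is exactly the point in dispute.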
Krosnick wants to be clear that he sees no evidence yet that "opt-in sample surveys are as accurate as or more accurate than probability sample surveys," and given their lack of foundation in probability sampling, he is not optimistic that they ever will be. "It's essential for us to be honest about what our data are and what they are not," he added.
True enough, but I would add one thought. Our honesty should extend to the limitations of probability samples as well. In the Krosnick-Yeager study, for example, despite very sophisticated weighting, that very expensive, very rigorous telephone survey still produced errors outside the margin of error on 4 of 13 benchmarks. By random chance alone, it should have produced no more than 1.
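The "no more than 1" figure is easy to check with a back-of-the-envelope binomial calculation. Assuming (simplistically) that each of the 13 benchmark comparisons has an independent 5 percent chance of falling outside the margin of error by luck alone:

```python
from math import comb

n, p = 13, 0.05  # 13 benchmarks; 5% chance each misses by chance at 95% confidence

expected_misses = n * p  # expected number of misses under pure chance

# Binomial probability of seeing 4 or more misses by chance alone
prob_4_or_more = sum(comb(n, k) * p**k * (1 - p)**(n - k)
                     for k in range(4, n + 1))

print(round(expected_misses, 2))   # → 0.65
print(round(prob_4_or_more, 3))    # ≈ 0.003
```

In other words, under chance alone we would expect well under one miss, and four or more misses would be a roughly 1-in-300 event, which is why the telephone survey's 4-of-13 result is hard to attribute to sampling error.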
CORRECTION: The original version of this column misstated when Doug Rivers left Knowledge Networks.
CLARIFICATION: The original study was funded by a grant from SPSS Inc. to the Stanford Institute for the Quantitative Study of Society. At the time, Norman Nie served as both director of SIQSS and chairman of the board of SPSS, the company that also provided non-probability sample No. 6 for the original study. According to Rivers, SPSS also contracted individually with him and Jon Krosnick for their work on the study.