The Survey Says...
What Everyone Should Know About Statistics
January 9, 2006
Whenever discussions arise on issues important to gays and lesbians, you can pretty much count on the fact that before too long someone is going to cite a survey or a research article to claim one thing or another. Whether the claims involve child sexual abuse, divorce, promiscuity, the quality of relationships or what is best for children, percentages are tossed around with abandon on both sides. And since each side can provide competing and apparently contradictory statistics to support their claims, it’s hard to know who to believe.
To get to the bottom of things, you would have to read the research itself, which few people are able to do. Even newspaper reports of survey results don’t provide the important details you would need in order to know how accurate or trustworthy the survey is. But if you could look at the actual report, you’d be able to figure out how the research was conducted and what the numbers really mean.1
The Response Rate
The first factor that we need to understand is how many people were invited to participate in a study versus how many actually responded. This standard measure is known as the response rate. All reputable surveys provide a response rate, although news accounts of these surveys virtually never do. This is unfortunate, because the response rate is a key measure of the survey’s reliability.
When a survey is being conducted, people are contacted and invited to participate, but not everyone agrees to do so. Then, as the survey gets underway, others drop out or otherwise fail to complete the study. When people decline to participate or drop out, they may do so for reasons that matter a great deal to the study’s results. This is more likely to happen when the study calls for revealing deeply personal or embarrassing information, and it can easily skew the final results.
If the response rate is low, it indicates that the sample consisted of self-selected participants — in other words, it consisted of people who are motivated for some reason to participate. Perhaps they were interested in influencing the study’s outcome. On the other hand, people who drop out because they are offended or embarrassed by the study’s questions will end up altering the results as well. After all, their opinions and behavior shape society as much as those who participate, but their inputs are not represented in the results. That’s why a lower response rate is a strong indicator that the results of the study are to be taken with a grain of salt.
There are no hard and fast rules for what counts as an acceptable response rate. It depends on how the share of people who did not respond compares with the proportions reported for a given question. For example, let’s consider a survey to determine what percentage of American taxpayers cheat on their taxes. And let’s suppose that in this survey, only 3% of participants admitted to cheating on their taxes. On the face of it, this sounds like a ringing endorsement of the honesty of your typical taxpayer. But what if we were to learn that the response rate was 73%? With 27% refusing to respond to a question on which only 3% admitted to breaking the law, it’s reasonable to ask why that 27% refused to respond. Is it because they didn’t want to admit to breaking the law? We cannot know, and that uncertainty calls our 3% figure into serious question.
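To put rough numbers on the tax example, here is a minimal sketch of the best-case and worst-case arithmetic. It assumes the 3% figure is a share of those who answered, and it brackets the truth by supposing that either none or all of the people who did not answer were cheating. The function name `nonresponse_bounds` is made up for this illustration.

```python
def nonresponse_bounds(observed_rate, response_rate):
    """Return (low, high) bounds on the rate among everyone who was asked.

    low  assumes that nobody who declined to answer has the trait;
    high assumes that everybody who declined to answer has it.
    """
    if not (0.0 <= observed_rate <= 1.0 and 0.0 < response_rate <= 1.0):
        raise ValueError("rates must be proportions between 0 and 1")
    # Share of everyone asked who both answered and admitted to the behavior.
    known = observed_rate * response_rate
    # Share of everyone asked who gave no answer at all.
    missing = 1.0 - response_rate
    return known, known + missing


# Figures from the example: 3% admitted cheating, 73% response rate.
low, high = nonresponse_bounds(observed_rate=0.03, response_rate=0.73)
print(f"{low:.1%} to {high:.1%}")  # 2.2% to 29.2%
```

The real figure could sit anywhere in that range, which is why a 3% headline number says very little when more than a quarter of the people asked gave no answer.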
The response rate is a vital component in understanding how reliable the survey is. But just as important is the type of survey itself. Most studies and surveys fall into three basic types: the probability sample survey, the convenience sample survey, and the casual survey. Let’s look at these three survey types, one by one.
Probability Sample Surveys
Surveys need to be well constructed in order for the results to be accurate. But the only way we could achieve a perfectly accurate survey of the U.S. population would be to contact every single person in the country, an impossible task. And even if that were possible, our survey wouldn’t be perfectly accurate unless every single person in the U.S. agreed to participate for a 100% response rate, which again is impossible.
That’s why surveys are conducted over smaller groups or samples of the particular population of interest. Groups like the Gallup and Harris organizations design their samples to ensure that they are representative of the larger population that they wish to study. The members of the sample are randomly selected, that is, they are selected according to a system that gives everybody an equal chance of being asked to participate. One way to do this is to dial telephone numbers at random and ask whoever answers the phone to participate. Another way is to comb through census data and randomly select participants to match the demographics you’re looking for.
If it is done correctly, everyone in the particular population of interest would have exactly the same odds of being selected for the survey as anyone else. We can demonstrate that our sample is truly representative by comparing several factors of our sample to what we know about the general population, such as economic status, religion, ethnicity, gender, age, geographic location, education, and so forth. If those factors match, then we can be confident that the sample is reasonably representative of the population we wish to measure.
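The idea of drawing a random sample and then checking it against known facts about the population can be sketched in a few lines. Everything here is hypothetical: the population of 100,000, the three age bands, and their 30/40/30 split are invented for illustration, and real polling organizations use far more elaborate designs.

```python
import random
from collections import Counter

rng = random.Random(2006)  # fixed seed so the sketch is repeatable

# Hypothetical population of 100,000 people with one known trait (age band).
population = ["18-34"] * 30_000 + ["35-54"] * 40_000 + ["55+"] * 30_000

# Simple random sample: every person has the same chance of being selected.
sample = rng.sample(population, 1_000)


def shares(group):
    """Return each category's share of the group as a proportion."""
    counts = Counter(group)
    return {band: counts[band] / len(group) for band in sorted(counts)}


pop_shares = shares(population)
sample_shares = shares(sample)

# Compare the sample's makeup with what we know about the population.
for band in pop_shares:
    in_sample = sample_shares.get(band, 0.0)
    print(f"{band}: population {pop_shares[band]:.1%}, sample {in_sample:.1%}")
```

With a random draw of this size, the sample shares should land within a few percentage points of the population shares. A large mismatch on a known trait would be a warning that the sample is not representative.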
When a sample is properly constructed, we can predict within a specified margin of error the opinions or behaviors of the general population. This margin of error is easy to calculate, using a simple formula based on the sample size and the responses given by the members of our sample.2 When all of this is taken into account, you can predict the opinions of the general population, or even the likely winner of an election. Get it wrong, and you end up with a headline declaring “Dewey Defeats Truman”. This type of sampling is known as probability sampling, and as you can imagine, it is a very complicated and expensive process.
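As a rough illustration of that calculation, the sketch below uses the common normal-approximation formula for a proportion, z * sqrt(p * (1 - p) / n), at a 95% confidence level (z = 1.96). This is an assumption about which formula is meant; it applies only to simple random samples, and the function name `margin_of_error` is chosen for this example.

```python
import math


def margin_of_error(p, n, z=1.96):
    """Approximate margin of error for a proportion p observed in a
    simple random sample of size n (z=1.96 gives roughly 95% confidence)."""
    if n <= 0 or not 0.0 <= p <= 1.0:
        raise ValueError("need n > 0 and p between 0 and 1")
    return z * math.sqrt(p * (1.0 - p) / n)


# Worst case is p = 0.5; watch how slowly the margin shrinks as n grows.
for n in (100, 1_000, 10_000):
    print(n, f"{margin_of_error(0.5, n):.1%}")
# 100 9.8%
# 1000 3.1%
# 10000 1.0%
```

Note that the formula says nothing about how the sample was recruited. It describes random sampling error only, so it cannot rescue a sample that was not randomly selected in the first place.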
Convenience Sample Surveys
Probability sampling is far too expensive and cumbersome for most psychological, medical or sociological research. Fortunately, it is almost never necessary. These researchers are more interested in observing a relatively narrow set of conditions, for example simple observations of outcomes for various conditions and experimental treatments. That’s why researchers typically use a number of non-probability sampling techniques, the most common of which is known as a convenience sample.
A convenience sample is just what it sounds like – a group of people who are readily available to the researcher. Members of this group are selected only according to the specific characteristics that the researcher cares about. These participants may come from any number of sources: patients from a clinic or medical practice, student volunteers, advertisements in newspapers and magazines, and so forth.
But no matter how a convenience sample is recruited, the key point is this: since there is no attempt to match the characteristics of the convenience sample to the general population, the extent to which a convenience sample represents the traits or behaviors of the general population cannot be known – and this is true regardless of how large the sample may be. That’s why when someone tries to draw a conclusion about the broader general population by using research from a convenience sample, they run a very real danger of drawing exceptionally wrong conclusions.
For example, let’s suppose a researcher wants to study the relationship between three diets and weight loss in persons with heart disease. To find participants for his study, he goes to a local clinic and enrolls overweight volunteers with heart disease. The volunteers are then divided into four groups: three groups are put on three different diets, and the fourth group continues to eat normally, serving as a control group. All four groups are monitored for changes in weight.
Researchers often record many pieces of seemingly unrelated information during the course of their study in case something unexpected should happen to “pop out”, suggesting the need for further research. Here, our researcher records the volunteers’ prescription medications in order to monitor their health during the study. Let’s say that 25% of the patients in this sample are prescribed medicine ‘A’, 45% take medicine ‘B’, and 30% are on medicine ‘C’.
A Real-World Example
Anti-gay activists point to the “Dutch Study” to claim that gay couples are promiscuous. But this study was a convenience sample that didn’t intend to study monogamous couples. To learn more about how anti-gay activists have misused this study, see What the “Dutch Study” Really Says About Gay Couples.
Now that the study is over, let’s imagine a drug salesman who reads this report and tells his customers that medicine ‘B’ is the most popularly prescribed medication for patients with heart disease. In doing so, he is likely to be feeding his clients false or misleading information. Remember, our researcher didn’t care which medication was more popular, so he didn’t try to construct a representative sample based on that criterion. What if the clinic that treated these patients had contracted its services to an HMO that has a low-cost agreement with the manufacturer of medicine ‘B,’ and so prefers it over the others? A convenience sample from a different clinic may have shown a very different proportion of prescribed medicines.
Using the data from a convenience sample to describe characteristics which the study did not intend to model will almost certainly result in erroneous conclusions. That’s why study authors who use convenience samples nearly always warn against using their data to characterize the general population.
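The salesman’s mistake can be shown with a small sketch. Only the study clinic’s 25/45/30 split comes from the example; the other two clinics and all of their counts are invented purely to show how one clinic’s numbers can point the wrong way.

```python
# Prescription counts per 100 patients. "clinic 2" and "clinic 3" are
# hypothetical; only the study clinic's figures come from the example.
clinics = {
    "study clinic": {"A": 25, "B": 45, "C": 30},
    "clinic 2": {"A": 60, "B": 10, "C": 30},
    "clinic 3": {"A": 50, "B": 15, "C": 35},
}


def most_prescribed(counts):
    """Return the medicine with the highest count."""
    return max(counts, key=counts.get)


# Pool all clinics together to stand in for the wider patient population.
pooled = {}
for counts in clinics.values():
    for drug, n in counts.items():
        pooled[drug] = pooled.get(drug, 0) + n

print("study clinic:", most_prescribed(clinics["study clinic"]))  # B
print("all clinics: ", most_prescribed(pooled))                   # A
```

In this made-up setting, medicine ‘B’ leads at the study clinic but comes last overall. Nothing in the diet study’s own data would reveal that.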
Casual Surveys
Casual surveys represent the least reliable type of survey, and unfortunately they are in many ways the most popular. For the most part, casual surveys involve indiscriminately distributing questionnaires, casting a wide net hoping to get as many participants as possible. Whether they take the form of online polls, toll-free telephone numbers, or a complex questionnaire printed in a popular magazine, these surveys only reflect the opinions of those who are specially motivated to participate.
Another Real-World Example
The Gay Report was the result of 5,400 responses to a very large casual survey. But in reading the report, it quickly becomes apparent that size doesn’t matter much when it comes to accuracy. For more information, see The Gay Report.
By relying on these motivated participants, casual surveys nearly always suffer from participation bias. This bias can be so great as to render the survey meaningless, no matter how many participants there are. With these casual surveys, there is something about the survey (the topic or specific questions, for example) which encourages some to participate and others to ignore it. Those who participate may feel that they have something particularly relevant or interesting to share, or they may feel especially motivated to influence the results of the survey. Conversely, the opinions and experiences of those who decide not to bother are not reflected in the survey. This allows the biases of those who participate to skew the survey’s results. Such surveys, while very popular, virtually never produce meaningful data.
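A small simulation makes the point that size does not cure participation bias. All of the numbers are assumptions chosen for illustration: a population of 200,000 in which 10% hold some opinion, and a casual survey in which people who hold it are ten times as likely to bother replying.

```python
import random

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: True means the person holds the opinion.
TRUE_SHARE = 0.10
population = [rng.random() < TRUE_SHARE for _ in range(200_000)]

# Casual survey: assume holders of the opinion reply far more often.
RESPOND_IF_YES = 0.20
RESPOND_IF_NO = 0.02
casual = [
    holds
    for holds in population
    if rng.random() < (RESPOND_IF_YES if holds else RESPOND_IF_NO)
]

# Probability sample: 1,000 people chosen at random, all assumed to answer.
random_sample = rng.sample(population, 1_000)


def share(group):
    """Proportion of the group that holds the opinion."""
    return sum(group) / len(group)


print(f"population:     {share(population):.1%}")
print(f"casual survey:  {share(casual):.1%} of {len(casual)} replies")
print(f"random sample:  {share(random_sample):.1%} of {len(random_sample)} replies")
```

Under these assumptions the casual survey collects several thousand replies and still reports that roughly half of respondents hold the opinion, while the much smaller random sample lands close to the true 10%.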
When evaluating research, it is vitally important to know how the research sample was put together in order to properly understand the results. That’s why when looking at the results, it is important to ask these questions:
- How was the sample population recruited?
- If the sample population was a truly random sample, what is the margin of error?
- If the sample population was not a random sample, how were they selected? Where did they come from?
- What were the researchers looking for? Who were the sample members and what were the criteria for accepting or rejecting them?
- How large or small is the sample?
- How was the control group recruited if there is one?
- What is the response rate?
- If the response rate was low, what were the reasons for the low response rate?
Whenever you hear survey statistics bandied about, you should keep these questions in mind. Without understanding the answers to these questions, it is impossible to know whether the results can be trusted. Unfortunately, many people are quick to cite casual and non-random surveys without knowing or revealing how the survey was conducted. This is exactly how so much misinformation passes into common knowledge and popular lore.
1. For a more detailed description of sampling techniques, please see the British Market Research Association, “Sampling Techniques; Chapter 7: Surveys and Sampling” web page (undated) http://www.deakin.edu.au/~agoodman/sci101/chap7.php (accessed March 17, 2004).
2. Polaris Marketing Research. “Basic statistical testing”. web page (undated) http://www.polarismr.com/education/tools_stat_testing.html (accessed October 14, 2005).