Statistics How To

Fake Statistics: How to Figure out if it’s Real or Not

Fake Statistics: Overview

How do you know whether to trust results from a survey or not? Is it easy to spot fake statistics? Do you believe an egg company when it tells you 50% of consumers in a taste test preferred their eggs? How about if a voluntary survey of U.S. Marines showed overwhelming support for massive pay increases for military personnel? Sometimes it isn’t enough to just accept the data as it is presented. Dig a little deeper and you might uncover one of these common problems with stats.

Finding Fake Statistics: Steps

Step 1: Take a close look at who paid for the survey. If you read a statistic stating that 90% of people lost 20 pounds in a month on a certain “miracle” diet, look at who paid. If it was the company that owns that “miracle” product, then it’s likely you have what’s called a self-interest study. In a self-interest study, whoever sponsors or runs the trial or survey stands to gain financially from the results. (Don’t confuse this with self-selection, where participants decide for themselves whether to take part; see Step 2.) You may have seen those soda ads where “90% of people prefer the taste of product X.” But if the manufacturer of product X paid for that survey, you probably can’t trust the results.

Step 2: Check whether the statistics came from a voluntary survey. A voluntary response sample is a sample where the participants can choose whether or not to be included. For example, if your professor sent you an email with an invitation to comment on what you think of a new textbook, then that would be a voluntary response sample. If responding was a mandatory part of your course, then that would not be a voluntary response sample. Voluntary response samples are poorly suited to drawing conclusions about a whole population because they are heavily biased toward people who have strong opinions (often negative ones). In other words, students are more likely to respond to the above survey if they hate the textbook. The students who like it are less likely to respond.
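The effect described in Step 2 can be seen in a small simulation. This is only a sketch: the 70% true approval rate and the response probabilities (5% for students who like the textbook, 40% for those who dislike it) are made-up assumptions for illustration, not figures from any real survey.

```python
import random

def simulate_voluntary_survey(n_students=10_000, true_like_rate=0.70,
                              p_respond_like=0.05, p_respond_dislike=0.40,
                              seed=0):
    """Return (true share who like the book, share who like it among volunteers).

    All rates are hypothetical values chosen for illustration.
    """
    rng = random.Random(seed)
    # True opinions of every student in the class.
    likes = [rng.random() < true_like_rate for _ in range(n_students)]
    # Each student decides whether to answer; unhappy students answer more often.
    responses = [like for like in likes
                 if rng.random() < (p_respond_like if like else p_respond_dislike)]
    true_share = sum(likes) / len(likes)
    observed_share = sum(responses) / len(responses)
    return true_share, observed_share

true_share, observed_share = simulate_voluntary_survey()
print(f"Students who actually like the textbook: {true_share:.0%}")
print(f"Share who like it among volunteers:      {observed_share:.0%}")
```

Under these assumptions about 70% of students like the textbook, yet the expected share of satisfied students among those who volunteer an answer is only about 23% (0.035 / 0.155), because the dissatisfied minority responds far more often.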

Step 3: Look for the faulty conclusion that one variable causes another. For example, you might read a statistic that states unemployment causes an increase in corn production because corn products (like high fructose corn syrup) are cheap and people are more likely to buy cheap foods when unemployed. But there may be many other factors causing an increase in production, including an increase in government subsidies for corn. The fact that one factor appears to be connected to another (correlation) doesn’t necessarily imply causation (that one caused the other).
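To see how correlation can arise without causation, here is a minimal sketch in which two series are both driven by a hidden third factor and never influence each other. The data are randomly generated, and the names `hidden`, `x` and `y` are hypothetical stand-ins, not real unemployment or corn figures.

```python
import random
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(xs, ys))
    var_x = sum((a - mean_x) ** 2 for a in xs)
    var_y = sum((b - mean_y) ** 2 for b in ys)
    return cov / sqrt(var_x * var_y)

rng = random.Random(1)
# Hypothetical hidden factor that affects both measured variables.
hidden = [rng.gauss(0, 1) for _ in range(500)]
# x and y each depend on the hidden factor plus independent noise.
# Neither one depends on the other.
x = [h + rng.gauss(0, 0.5) for h in hidden]
y = [h + rng.gauss(0, 0.5) for h in hidden]

r_raw = pearson(x, y)
# Subtract the hidden factor's contribution and correlate what is left.
r_adjusted = pearson([a - h for a, h in zip(x, hidden)],
                     [b - h for b, h in zip(y, hidden)])

print(f"Correlation between x and y:            {r_raw:.2f}")
print(f"After removing the hidden factor:       {r_adjusted:.2f}")
```

The raw correlation is strong even though neither variable causes the other, and it largely disappears once the hidden factor is accounted for. In real data the hidden factor is usually not known in advance, which is why a correlation on its own is weak evidence of causation.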

Step 4: Beware of journal bias (often called publication bias). Journals are more likely to publish positive results (for example, a drug trial that had a positive outcome) than negative ones (a drug trial that failed). Just because a journal publishes a positive result doesn’t mean that there aren’t other trials out there that found a negative result.

Step 5: Make sure the sample isn’t too limited in scope. It’s unlikely you can make generalizations about student achievement in the U.S. by studying a single inner city school in Brooklyn. And it’s unlikely you can make generalizations about American voting behavior by standing outside a polling place in Ponte Vedra Beach, Florida. Just as one inner city school isn’t representative of every other school, one affluent neighborhood can’t be used to generalize about the voting population. Also, make sure the sample size is large enough. If your voting precinct contains 1 million voters, it’s unlikely you’ll get any good results from surveying 20 people.
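To put a rough number on “large enough,” the sketch below computes an approximate 95% margin of error for a surveyed proportion. It assumes a simple random sample and uses the standard normal approximation with a finite population correction; the function name and the default `p=0.5` (the worst case) are choices made for this illustration, and the approximation is crude for very small samples.

```python
from math import sqrt

def margin_of_error(sample_size, population_size, p=0.5, z=1.96):
    """Approximate 95% margin of error for a proportion.

    Assumes a simple random sample. Uses the normal approximation
    z * sqrt(p * (1 - p) / n), scaled by the finite population correction.
    """
    standard_error = sqrt(p * (1 - p) / sample_size)
    fpc = sqrt((population_size - sample_size) / (population_size - 1))
    return z * standard_error * fpc

# Compare a tiny sample with a more typical poll size for 1 million voters.
for n in (20, 1_000):
    moe = margin_of_error(n, 1_000_000)
    print(f"n = {n:>5}: about +/- {moe:.1%}")
```

With 20 respondents the result is only good to within roughly 22 percentage points, while 1,000 respondents bring that down to roughly 3. Note that this only addresses sample size. A large sample drawn from a single unrepresentative neighborhood still has the scope problem described above.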

Step 6: Watch out for misleading percentages. Growth in unemployment may have “slowed by 50%,” but if there were previously 100,000 new unemployment claims per month, that still means 50,000 people are joining the unemployed ranks every month.
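A quick way to check a percentage claim is to convert it back into absolute numbers. The helper below is a hypothetical illustration that uses the figures from the example above.

```python
def after_percent_drop(previous_value, percent_drop):
    """Return the value that remains after dropping by percent_drop percent."""
    return previous_value * (1 - percent_drop / 100)

previous_claims = 100_000  # new claims per month before the slowdown
new_claims = after_percent_drop(previous_claims, 50)
print(f"New claims per month after a 50% slowdown: {new_claims:,.0f}")
```

The headline says “50%,” but the absolute figure shows that 50,000 people a month are still filing new claims.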

Step 7: Beware of precise numbers. If a national survey reports that 3,150,023 households in the U.S. are dog owners, you might be inclined to believe that exact figure. However, it’s highly unlikely that anyone seriously surveyed all of the households in the U.S. It’s much more likely they surveyed a sample, and that 3,150,023 is an estimate that should have been reported as about 3 million to avoid being misleading.
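One way to avoid false precision is to round an estimate to the number of significant figures the data can support. The function below is a minimal sketch; its name and the choice of how many figures to keep are assumptions for illustration, and in practice the right number depends on the survey’s margin of error.

```python
from math import floor, log10

def round_to_significant(value, figures=2):
    """Round an integer estimate to the given number of significant figures."""
    if value == 0:
        return 0
    # Position of the leading digit (6 for a number in the millions).
    magnitude = floor(log10(abs(value)))
    return round(value, figures - 1 - magnitude)

estimate = 3_150_023
print(f"{round_to_significant(estimate, 1):,}")  # one significant figure
print(f"{round_to_significant(estimate, 2):,}")  # two significant figures
```

Rounded to one significant figure the estimate becomes 3,000,000, and to two it becomes 3,200,000. Either is a more honest way to report a sample-based estimate than 3,150,023.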

There are many other examples of fake statistics. Newspapers sometimes print erroneous figures, drug companies sometimes publish fake test results, and governments sometimes present fake statistics in their favor. The golden rule is: question every statistic that you read!

One thought on “Fake Statistics: How to Figure out if it’s Real or Not”

  1. Someone

    1. Nice site!

    2. “Also make sure the sample size is large enough: if your voting precinct contains 1 million voters, it’s unlikely you’ll get any good results from surveying 20 people”.

    Technically, both parts are correct, but I fear the way you phrase this might plant the impression that surveying 20 of, say, 1,000 voters might produce a more reliable result, while the difference is (I guesstimate) negligible. The question asked makes a much larger difference. For example, I would guess (again) that it is possible to verify that 50% of the population is male using a survey of 20 people. Finding out what fraction is over 80 years old, however, would require many more.