What is Statistical Significance?

Statistics isn’t an exact science. In fact, you can think of stats as very finely tuned guesswork. As stats is guesswork, we need to know how close our “guess” is. That’s where statistical significance comes in.

What is Statistical Significance: Overview

Stats is all about taking a piece of the population (a sample) and making a guess about what that population’s behavior might be like. If you were working with parameters (see parameter vs. statistic), there would be no need for guesswork; you’d have all the data. In real life, getting all of the data can be costly, time-consuming, or impossible.

For example, the Gallup Poll uses stats to estimate who will win the next election. Drug manufacturers use stats to estimate how many people might have side effects from their drugs. And businesses use stats to forecast sales figures for the future.

What is Statistical Significance a Measure of?

Statistical significance is a measure of whether your research findings are meaningful. More specifically, it’s whether the result you found in your sample is unlikely to be a fluke of chance, so that you can reasonably expect it to hold in the entire population. To test for statistical significance, perform these steps:

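1. Decide on an alpha level. An alpha level is the error rate you are willing to work with: the chance of calling a result significant when it is really just due to chance (usually 5% or less).
2. Conduct your research. For example, conduct a poll or collect data from an experiment.
3. Calculate your statistic. A statistic is just a piece of information about your sample, like a mean, mode or median. For a hypothesis test, this is usually converted into a test statistic such as a z-score or t-score.
4. Compare the test statistic you calculated in Step 3 with the critical value for your alpha level from a statistical table. If your statistic is more extreme than the table value, the result is statistically significant.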

There are many different types of tables you can use to figure out if your data is significant or not (a z-table or a t-table, for example); which one you need depends on the test you are running. For more on hypothesis testing, see our Hypothesis Testing index.
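The four steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the poll numbers are made up, the test is a two-tailed z-test for a proportion (one of many possible tests), and the standard library's NormalDist stands in for a printed z-table.

```python
from math import sqrt
from statistics import NormalDist

# Step 1: decide on an alpha level (5% here).
alpha = 0.05

# Step 2: conduct your research.
# Hypothetical poll (invented numbers): 560 of 1,000 voters back Candidate A.
n = 1000
supporters = 560

# Step 3: calculate your statistic (the sample proportion), then convert it
# to a z-score against the "no lead" value of 50%.
p_null = 0.5
p_hat = supporters / n
standard_error = sqrt(p_null * (1 - p_null) / n)
z = (p_hat - p_null) / standard_error

# Step 4: compare with the value a z-table would give for this alpha level
# (two-tailed, so alpha is split between both tails; about 1.96 for 5%).
critical_value = NormalDist().inv_cdf(1 - alpha / 2)
significant = abs(z) > critical_value

# Equivalent check: the p-value is the chance of a z-score at least this
# extreme if the true support were exactly 50%.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))

print(f"z = {z:.2f}, critical value = {critical_value:.2f}, p-value = {p_value:.4f}")
print("Statistically significant" if significant else "Not statistically significant")
```

With these invented numbers the z-score is about 3.8, well beyond the table value of 1.96, so the lead in the poll would be called statistically significant at the 5% level.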

The Vioxx Scandal: How Significant is “Significant?”

Just over a decade ago, I had a rare disorder that affected my chest cartilage, called Tietze’s syndrome. Tietze’s syndrome is an inflammation of the costal cartilage. It’s extremely painful, and the treatment at the time, the “miracle drug” for rheumatism of any kind, was a little pill called Vioxx. There was one slight problem with Vioxx. The manufacturer, Merck, used statistics to hide the fact that the drug had a minor side effect: it caused heart attacks. Thankfully, I wasn’t one of the unlucky ones.

Vioxx is a type of nonsteroidal anti-inflammatory drug (NSAID), similar to aspirin or ibuprofen. The manufacturer, Merck, spent millions ($160 million in 2000) on direct-to-consumer advertising for the drug. The drug was approved by the FDA in 1999 and withdrawn in 2004, after a slew of lawsuits claimed the drug caused 23,800 cardiovascular events (including heart attacks) and a 2004 study (APPROVe) found a statistically significant excess of cardiovascular events in Vioxx patients compared to placebo patients.

The Vioxx Scandal: What happened?

So what happened? How could a drug get approved by the FDA and put out to market if the drug was unsafe? The answer is that when the original studies about the drug were published, several of the Merck authors neglected to include three myocardial infarctions (heart attacks) in the final revision of the paper. They knew about the heart attacks, but also knew that including those three cases would make the excess of heart attacks statistically significant, so the drug would not be approved. Three cases swept under the carpet were enough to turn a statistically insignificant result (i.e. the drug got approved) into a significant one (the drug would not have been approved).
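To see how just three cases can tip a result over the line, here is a rough sketch. The counts are invented for illustration and are NOT the figures from the Vioxx studies. It assumes two equal-sized groups and uses a simple one-sided binomial test: if the drug made no difference, each heart attack would be equally likely to turn up in either group. The function name one_sided_p_value is just a label chosen for this example.

```python
from math import comb

def one_sided_p_value(drug_events, placebo_events):
    """Chance of at least `drug_events` of all events landing in the drug group
    if every event were equally likely to fall in either (equal-sized) group."""
    total = drug_events + placebo_events
    tail = sum(comb(total, k) for k in range(drug_events, total + 1))
    return tail / 2 ** total

alpha = 0.05

# Hypothetical counts (NOT the real trial data).
p_reported = one_sided_p_value(drug_events=7, placebo_events=2)   # three cases left out
p_complete = one_sided_p_value(drug_events=10, placebo_events=2)  # all cases included

for label, p in [("3 cases left out", p_reported), ("all cases included", p_complete)]:
    verdict = "significant" if p < alpha else "not significant"
    print(f"{label}: p = {p:.3f} ({verdict})")
```

With these made-up counts, the p-value drops from about 0.09 (not significant at the 5% level) to about 0.02 (significant) once the three extra cases are counted.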

I can only imagine what went on at Merck, and the pressure these scientists were under to “forget” those three cases after Merck had spent millions on development (not to mention the billions they were going to get from sales). Most likely, the scientists (or their bosses) didn’t understand the meaning of “statistically significant,” because I can’t believe they thought that hiding three cases in a sample would lead to tens of thousands of heart attacks in the population…
