
Statistics isn’t an exact science. In fact, you can think of stats as very finely tuned guesswork. As stats is guesswork, we need to know how close our “guess” is. That’s where **statistical significance** comes in.

## What is Statistical Significance: Overview

Stats is all about taking a piece of the population and making a guess about what that population’s behavior might be like. If you were working with parameters (values that describe a whole population, as opposed to statistics, which describe a sample), there would be no need for guesswork; you’d have *all* the data. In real life, getting all of the data can be **costly, time-consuming**, or **impossible**.

For example, Gallup polls use stats to estimate who will win the next election. Drug manufacturers use stats to estimate how many people might have side effects from their drugs. And businesses use stats to forecast sales figures for the future.

## What is Statistical Significance a Measure of?

Statistical significance is a measure of **whether your research findings are meaningful**. More specifically, it tells you whether the result in your sample is likely to reflect the entire population or could easily be due to chance. To test for statistical significance, perform these steps:

1. Decide on an alpha level. An alpha level is the error rate you are willing to work with (usually 5% or less).
2. Conduct your research. For example, conduct a poll or collect data from an experiment.
3. Calculate your statistic. A statistic is just a piece of information about your sample, like a mean, mode or median.
4. Compare the statistic you calculated in Step 3 with a critical value from a statistical table.
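
The steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses a two-tailed one-sample z-test (one of many possible tests), the function name `z_test` is made up for this example, the sample values are invented, and the population mean and standard deviation are assumed to be known. The standard library's `NormalDist` stands in for the printed z-table.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_test(sample, pop_mean, pop_sd, alpha=0.05):
    """Two-tailed one-sample z-test (hypothetical helper for illustration).

    Returns (z, critical_value, is_significant).
    """
    n = len(sample)
    # Calculate the test statistic from the sample mean.
    z = (mean(sample) - pop_mean) / (pop_sd / sqrt(n))
    # Look up the critical value for the chosen alpha level;
    # this replaces reading a value off a z-table.
    critical = NormalDist().inv_cdf(1 - alpha / 2)
    # The result is significant if the statistic is beyond the critical value.
    return z, critical, abs(z) > critical


# Invented data: nine scores, tested against an assumed population
# mean of 50 and an assumed population standard deviation of 6.
scores = [56, 52, 58, 61, 49, 55, 60, 57, 56]
z, critical, significant = z_test(scores, pop_mean=50, pop_sd=6)
print(f"z = {z:.2f}, critical value = {critical:.2f}, significant: {significant}")
```

A smaller alpha level pushes the critical value further out, so the same statistic has a harder time counting as significant.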

There are many different types of tables you can use to figure out if your data is significant or not. For more on hypothesis testing, see our hypothesis testing index.

## The Vioxx Scandal: How Significant is “Significant?”

Just over a decade ago, I had a rare disorder that affected my chest cartilage, called Tietze’s syndrome. Tietze’s syndrome is an extremely painful inflammation of the costal cartilage, and the treatment at the time, the “**miracle drug**” for rheumatism of any kind, was a little pill called Vioxx. There was one *slight* problem with Vioxx. The manufacturer, Merck, used statistics to hide the fact that the drug had a *minor* side effect: it caused heart attacks. Thankfully, I wasn’t one of the unlucky ones.

Vioxx is a type of nonsteroidal anti-inflammatory drug (NSAID), similar to aspirin or ibuprofen. Merck spent millions ($160 million in 2000) on direct-to-consumer advertising for the drug. The drug was approved by the FDA in 1999 and withdrawn in 2004, after a slew of lawsuits claimed the drug had caused **23,800 cardiovascular events** (including heart attacks) and after a 2004 study (APPROVe) found a statistically significant excess of cardiovascular events in Vioxx patients compared to placebo patients.

## The Vioxx Scandal: What happened?

So what happened? How could a drug get approved by the FDA and put on the market if it was unsafe? The answer is that when the original studies about the drug were published, several of the Merck authors neglected to include three myocardial infarctions (heart attacks) in the final revision of the paper. They knew about the heart attacks, but they also knew that including those three cases would make the excess of heart attacks statistically significant, and the drug would not be approved. Three cases swept under the carpet were enough to turn a **statistically insignificant** result (i.e. the drug got approved) into a **significant one** (the drug would not have been approved).
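
To see how a handful of cases can tip a result across the significance line, here is a rough sketch using a pooled two-proportion z-test. Everything in it is an assumption made for illustration: the counts are invented and are **not** the real trial data, the helper name `two_proportion_p_value` is made up, and the normal approximation is a simplification that a real analysis of rare events would likely replace with an exact test.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_p_value(events_a, n_a, events_b, n_b):
    """Two-tailed p-value for a difference in event rates (pooled z-test).

    Hypothetical helper for illustration only.
    """
    rate_a = events_a / n_a
    rate_b = events_b / n_b
    # Pool both groups to estimate the event rate under "no difference".
    pooled = (events_a + events_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (rate_a - rate_b) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


alpha = 0.05

# Invented counts (NOT the real trial data): 1,000 patients per group,
# 8 events in the comparison group.
p_without = two_proportion_p_value(17, 1000, 8, 1000)  # three cases left out
p_with = two_proportion_p_value(20, 1000, 8, 1000)     # three cases included

print(f"without the three cases: p = {p_without:.3f}, significant: {p_without < alpha}")
print(f"with the three cases:    p = {p_with:.3f}, significant: {p_with < alpha}")
```

With these made-up numbers, leaving three events out keeps the p-value above the 5% alpha level, and putting them back drops it below.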

I can only imagine what went on at Merck, and the pressure these scientists were under to “forget” those three cases after Merck had spent **millions** on development (not to mention the billions they were going to get from sales). Most likely, the scientists (or their bosses) **didn’t understand the meaning of statistically significant**, because I can’t believe they realized that hiding three cases in a sample would lead to tens of thousands of heart attacks in the population…
