What is a Case-Control Study?
A case-control study is a retrospective study that looks back in time to measure the association between a specific exposure (e.g. secondhand tobacco smoke) and an outcome (e.g. cancer). People who have the outcome (the cases) are compared with a control group of people who do not have the disease or who did not experience the event. The goal is to figure out the relationship between risk factors and a disease or outcome. Because participants are selected by outcome rather than by exposure, the association is expressed as an odds ratio; relative risk cannot be calculated directly from this design.
Case-control studies have four main steps:
- The study begins by enrolling people who already have a certain disease or outcome.
- A second group of similar size, the control group, is sampled, preferably from a population identical in every way except that they don’t have the disease or condition being studied. Controls should not be selected because of their exposure status.
- People are asked about their exposure to risk factors.
- Finally, an odds ratio is calculated.
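The last step can be sketched in a few lines of Python. This is a minimal illustration, not a full analysis: the function name `odds_ratio`, the counts, and the choice of a Woolf (log-based) 95% confidence interval are all assumptions made for the example, not something taken from a real study.

```python
import math

def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Return (odds ratio, lower, upper) using a Woolf 95% confidence interval.

    The odds ratio compares the odds of exposure among cases
    with the odds of exposure among controls.
    """
    a, b, c, d = exposed_cases, unexposed_cases, exposed_controls, unexposed_controls
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for this simple formula")
    estimate = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = 1.96  # approximate 97.5th percentile of the standard normal
    lower = math.exp(math.log(estimate) - z * se_log)
    upper = math.exp(math.log(estimate) + z * se_log)
    return estimate, lower, upper

# Hypothetical counts: 100 cases (40 exposed) and 100 controls (20 exposed)
estimate, lower, upper = odds_ratio(40, 60, 20, 80)
print(f"OR = {estimate:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

An odds ratio above 1 suggests the exposure is more common among cases than controls; an interval that includes 1 means the data are compatible with no association.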
The two types of case-control studies are:
- Non-matched case-control study: this is the simplest form. Find a person with the disease and enroll them in the study. Then enroll a control and determine their exposure status.
- Matched case-control study: Find a person with the disease and enroll them in the study. Match the person with a control on one or more characteristics (e.g. sex, age, weight). This can eliminate or minimize confounding by the matched characteristics. However, it generally results in a longer study; the more characteristics being “matched”, the longer it takes to find suitable controls.
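For a 1:1 matched design, the odds ratio is usually estimated from the discordant pairs only, that is, pairs where the case and its matched control differ in exposure. The sketch below assumes each pair is recorded as a `(case_exposed, control_exposed)` tuple; the function name and the data are hypothetical.

```python
def matched_pairs_odds_ratio(pairs):
    """Estimate the odds ratio from 1:1 matched pairs.

    Each pair is (case_exposed, control_exposed). Only discordant pairs
    contribute: pairs where exactly one member was exposed.
    """
    case_only = sum(1 for case_exp, ctrl_exp in pairs if case_exp and not ctrl_exp)
    control_only = sum(1 for case_exp, ctrl_exp in pairs if ctrl_exp and not case_exp)
    if control_only == 0:
        raise ValueError("no pairs with only the control exposed; ratio is undefined")
    return case_only / control_only

# Hypothetical data: 15 matched pairs
pairs = (
    [(True, False)] * 6     # only the case was exposed
    + [(False, True)] * 2   # only the control was exposed
    + [(True, True)] * 3    # both exposed (concordant, ignored)
    + [(False, False)] * 4  # neither exposed (concordant, ignored)
)
matched_or = matched_pairs_odds_ratio(pairs)
print(matched_or)
```

Concordant pairs carry no information about the exposure-outcome association in this estimate, which is why they drop out of the calculation.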
Advantages and Disadvantages
A case-control study is often the best choice for rare conditions or diseases. Let’s say 10 people in Duval County, Florida had a particularly rare disease. Random sampling for a cohort study would involve large numbers of people and might not pick up any of the people with the disease at all. With a case-control study, all 10 people who have the disease can be identified (assuming they are in a medical database) and enrolled in the study. Random sampling could then be used on the non-diseased population to form the control group.
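To see why random sampling struggles with rare diseases, the chance that a simple random sample contains no cases at all can be worked out directly. The population size (1,000,000) and the sample size (1,000) below are assumptions chosen for illustration; only the 10 cases come from the example above.

```python
def prob_no_cases(population, cases, sample_size):
    """Probability that a simple random sample (without replacement)
    contains none of the people who have the disease."""
    p = 1.0
    for i in range(sample_size):
        # Chance that the next person drawn is a non-case,
        # given that everyone drawn so far was a non-case
        p *= (population - cases - i) / (population - i)
    return p

population = 1_000_000  # assumed population size, for illustration only
cases = 10              # people with the rare disease
sample_size = 1_000     # assumed size of the random sample

p_none = prob_no_cases(population, cases, sample_size)
expected_cases = sample_size * cases / population
print(f"P(no cases in sample) = {p_none:.3f}")
print(f"Expected cases in sample = {expected_cases:.2f}")
```

Under these assumptions the sample will almost certainly miss every case, whereas a case-control design starts from the known cases.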
Advantages:
- Short-term study that doesn’t require waiting for events to happen, as they have already occurred.
- Multiple risk factors can be studied at the same time.
- Quickly establishes associations between risk factors and disease. This can be especially useful with disease outbreaks, as likely causes can be identified with small sample sizes.
- Stronger than cross-sectional studies for establishing causation.
Disadvantages:
- Control groups can be difficult to find.
- Results can easily be tainted by recall bias, where people with the disease or condition are more likely to remember past details compared to people who don’t have the disease or condition.
- Weaker than a cohort study for establishing causation.
- Results are usually not generalizable.
Examples from Real Life
- This study of non-Hodgkin lymphoma found a connection between the disease and inflammatory disorders like Sjögren’s syndrome, celiac disease and rheumatoid arthritis.
- This study investigated how increased consumption of fruits and vegetables protects against cervical intraepithelial neoplasia.
- This INTERHEART study looked at secondhand tobacco smoke and increased risk of myocardial infarction.