Cohort Study
A cohort study, used in medicine and the social sciences, is an observational study used to estimate how often a disease or life event happens in a certain population. Measures commonly estimated from a cohort study include the incidence rate, relative risk and absolute risk.
A cohort is a defined group, like “nurses,” “10-19 year-olds,” or “college students.” Participants are chosen for a reason, rather than randomly.
The study usually has two groups: exposed and not exposed. If the exposure is rare (for example, exposure to an industrial solvent), then the cohort is called a “special exposure cohort.” Both groups are followed to see who develops a disease and who does not. For example, you could look at cigarette smokers to see who gets breast cancer and who does not. The study would include a group of smokers, and a group of non-smokers.
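The exposed versus not-exposed comparison comes down to a two-by-two table of counts. The following is a minimal sketch of how the risk in each group, the relative risk and the risk difference might be computed. The function name `risk_measures` and all of the counts are hypothetical, chosen only for illustration; they are not taken from any real study.

```python
def risk_measures(exposed_cases, exposed_total, unexposed_cases, unexposed_total):
    """Return the risk in each group, the relative risk and the risk difference."""
    risk_exposed = exposed_cases / exposed_total        # absolute risk, exposed group
    risk_unexposed = unexposed_cases / unexposed_total  # absolute risk, unexposed group
    return {
        "risk_exposed": risk_exposed,
        "risk_unexposed": risk_unexposed,
        "relative_risk": risk_exposed / risk_unexposed,
        "risk_difference": risk_exposed - risk_unexposed,
    }

# Hypothetical counts: 1,000 smokers and 1,000 non-smokers followed for the same period.
result = risk_measures(exposed_cases=30, exposed_total=1000,
                       unexposed_cases=20, unexposed_total=1000)
print(result)
```

A relative risk above 1 means the outcome was more common in the exposed group. On its own it does not show that the exposure caused the outcome.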
Cohort Study Classification: Prospective, Retrospective, Case
Cohort studies can be grouped in several ways:
- Prospective: none of the subjects have the disease (or other outcome) being measured when the study commences; data analysis happens after a period of time has elapsed.
- Retrospective (Historical): the researcher looks at historical data for a group. Some of the people in this group have developed the disease, and some have not. This can result in finding out who has the disease and when they developed it.
- Case-control nested within a cohort: a smaller group is chosen from within the cohort for a deeper look. These investigations may include genotyping, collecting tissue samples or other factors.
- Case-cohort: similar to case-control nested within a cohort. The difference is that in a case-cohort study, participants are evaluated for outcome risk factors at any time before the first outcome (i.e. the first incidence of disease).
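As a rough illustration of choosing a smaller group from within a cohort, the sketch below keeps every participant who developed the outcome and draws a random sample of those who did not. This is a deliberate simplification: real nested designs often match controls to cases on follow-up time or other factors. The function name, the field names and the two-controls-per-case ratio are all assumptions made for the example.

```python
import random

def nested_case_control(cohort, controls_per_case=2, seed=0):
    """Pick every case plus a random sample of non-cases from the same cohort."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    cases = [p for p in cohort if p["has_outcome"]]
    non_cases = [p for p in cohort if not p["has_outcome"]]
    n_controls = min(len(non_cases), controls_per_case * len(cases))
    controls = rng.sample(non_cases, n_controls)
    return cases, controls

# Hypothetical cohort of 100 participants; every 10th person developed the outcome.
cohort = [{"id": i, "has_outcome": i % 10 == 0} for i in range(100)]
cases, controls = nested_case_control(cohort)
print(len(cases), len(controls))  # 10 cases, 20 controls
```

Only the selected cases and controls would then go on to the more expensive investigations, such as genotyping.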
What is a Prospective Cohort Study?
A prospective cohort study takes a group of similar people (a cohort) and studies them over time. At the time the baseline data is collected, none of the people in the study have the condition of interest. This is in contrast to a retrospective cohort study, in which the outcomes have already occurred by the time the study begins and the researcher works through historical records to piece together the reasons why. The now famous Framingham Heart Study is one example of a prospective cohort study; the researchers have, to date, studied three generations of Framingham residents in order to understand the causes of heart disease and stroke.
Although none of the participants actually have the disease of interest in a prospective cohort study, some of the cohort are expected to develop the disease in the future. For example, a cohort of thirty-year-old people in a certain town might be studied to see who develops lung cancer. Half of the cohort might be smokers and half may not. This enables comparisons between the two groups.
Once the prospective cohort study has been established, researchers follow up with the participants and track their progress. Follow-ups can include:
- In-person interviews.
- Imaging tests.
- Internet questionnaires.
- Lab tests.
- Mail questionnaires.
- Phone interviews.
- Physical exams.
A combination of the above methods may also be used.
Advantages and Disadvantages
- One major advantage of a prospective cohort study is that researchers don’t have to grapple with the ethical issues surrounding randomized controlled trials (e.g. who receives a placebo and who gets the actual treatment).
- Incidence and prevalence of a disease can be easily calculated.
- Multiple diseases and outcomes can be studied at the same time.
- Selection bias and confounding variables can be a problem.
- Cohort studies can be expensive and time consuming.
- Sample sizes required are usually very large.
What is a Retrospective Cohort Study?
A retrospective cohort study (also known as a historical cohort study) is a study in which the disease or outcome has already occurred in some participants by the time the study begins. The study looks back into the past to try to determine why those participants have the disease or outcome and when they may have been exposed. In a retrospective cohort study the researcher:
- Uses historical data to identify members of a population who have been exposed (or not exposed) to a suspected risk factor.
- Assembles a group to be studied.
- Determines the current status of the disease or outcome in the participants.
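The steps above can be sketched as a tabulation over existing records. In this minimal example, exposure was recorded in the past and outcome status is determined now; the records, the field names and the `tabulate` helper are hypothetical and stand in for whatever archive a real study would draw on.

```python
from collections import Counter

# Hypothetical historical records (e.g. from employment or medical files).
records = [
    {"id": 1, "exposed": True,  "has_outcome": True},
    {"id": 2, "exposed": True,  "has_outcome": False},
    {"id": 3, "exposed": True,  "has_outcome": True},
    {"id": 4, "exposed": False, "has_outcome": False},
    {"id": 5, "exposed": False, "has_outcome": True},
    {"id": 6, "exposed": False, "has_outcome": False},
]

def tabulate(records):
    """Count participants by (exposed, has_outcome) to assemble the study groups."""
    counts = Counter()
    for r in records:
        counts[(r["exposed"], r["has_outcome"])] += 1
    return counts

table = tabulate(records)
print("exposed with outcome:", table[(True, True)])
print("exposed without outcome:", table[(True, False)])
print("unexposed with outcome:", table[(False, True)])
print("unexposed without outcome:", table[(False, False)])
```

The four counts form the same two-by-two table a prospective study would produce; the difference is that the data already existed when the study began.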
One early study often mentioned in this context is Lane-Claypon’s 1926 study of breast cancer risk factors, titled “A Further Report on Cancer of the Breast, With Special Reference to Its Associated Antecedent Conditions.” The study compared 500 hospitalized patients with 500 controls, a design that is now usually classified as case-control rather than cohort, and it led to the identification of many of the risk factors for breast cancer that are still recognized today.
Prospective vs. Retrospective Cohort Study
In a retrospective cohort study, the disease/outcome has already occurred by the time the study begins. In a prospective cohort study, no participant has the disease/outcome at the start, although some participants usually have risk factors that put them at high risk.
Retrospective example: a group of 100 people with AIDS might be asked about their lifestyle choices and medical history in order to study the origins of the disease. A second group of 100 people without AIDS is also studied, and the two groups are compared.
Prospective example: a group of 100 people with high risk factors for AIDS are followed for 20 years to see if they develop the disease. A control group of 100 people who have low risk factors are also followed for comparison.
A retrospective cohort study can be combined with a prospective cohort study: the researcher takes the retrospective study groups, and then follows the cohort into the future.
What is a Cohort Effect?
A cohort effect is the influence of a group’s shared life experience on the outcome of a study. It’s the effect of being born at the same time (e.g. GenXer vs. Baby Boomer), or in the same region (e.g. born in New Orleans vs. Seattle), or some other factor that makes the group unique. Cohorts in schools are usually defined by age group, while cohorts in organizations are usually defined by their date of entry into the job.
Cohort Effect Example
Let’s say you were conducting cross-sectional research (a method that compares different age groups at the same point in time) to find out how basic mathematics ability improves with age. You give the same basic math standardized test to groups of students who are 7 years old, 14 years old, and 21 years old. You get the following mean results:
- 7 years old: 24% correct
- 14 years old: 48% correct
- 21 years old: 72% correct
You might conclude that every 7 years that pass add 24 percentage points to the score. However, what you haven’t accounted for is the cohort effect. The students differ not only in age; they also belong to different cohorts (in this case, groups of people born around the same time), some of which may have grown up when basic mathematics was strongly emphasized in schools. If the 21-year-old cohort in the above study experienced a strong emphasis on basic math, its members might have scored higher than today’s 14-year-olds and 7-year-olds did when they were those ages, so part of the 72% could reflect their schooling rather than their age.
The problems associated with the cohort effect can be lessened by testing the same cohort over a period of time, a method called longitudinal research. In the above example, you would test a group of 7-year-olds, then test the same group every 7 years. A disadvantage to longitudinal research is that it’s costly, and dropout rates can affect the results.
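To make the difference between the two designs concrete, here is a small sketch with a made-up model in which each score is the sum of an age effect and a cohort effect. Every number, including the birth years and the test year, is an assumption chosen so that the cross-sectional results reproduce the 24%, 48% and 72% from the example; none of them comes from real data.

```python
# Hypothetical model: score = age effect + cohort effect (percentage points).
AGE_EFFECT = {7: 24, 14: 36, 21: 48}          # assumed effect of age alone
COHORT_BONUS = {2003: 24, 2010: 12, 2017: 0}  # assumed boost by birth year

def score(age, birth_year):
    """Mean test score for people of a given age born in a given year."""
    return AGE_EFFECT[age] + COHORT_BONUS[birth_year]

TEST_YEAR = 2024  # assumed year of the cross-sectional test

# Cross-sectional: three different cohorts tested in the same year.
cross_sectional = {age: score(age, TEST_YEAR - age) for age in (7, 14, 21)}

# Longitudinal: the cohort born in 2017 tested at ages 7, 14 and 21.
longitudinal = {age: score(age, 2017) for age in (7, 14, 21)}

print("cross-sectional:", cross_sectional)
print("longitudinal:   ", longitudinal)
```

Under these assumptions the cross-sectional design shows a gain of 24 points per 7 years, while following a single cohort shows a gain of only 12; the remainder is the cohort effect.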