What is Neyman Bias?
Neyman bias is a selection bias in which the very sick, the very well, or both are erroneously excluded from a study. The exclusion can skew your results in either of two directions:
- Excluding patients who have died will make conditions look less severe.
- Excluding patients who have recovered will make conditions look more severe.
Often you will not know which group (recovered or died) you are excluding, which makes it impossible to adjust for the resulting bias.
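The two directions can be seen in a toy calculation. The sketch below uses a made-up ten-patient cohort with a hypothetical severity score from 1 to 10; the scores, the status labels, and the `mean_severity` helper are illustrative assumptions, not data from any study.

```python
from statistics import mean

# Hypothetical cohort: (severity score, status at the time of the study).
# The mildest cases have recovered and the most severe have died.
cohort = [
    (1, "recovered"), (2, "recovered"),
    (3, "ill"), (4, "ill"), (5, "ill"), (6, "ill"), (7, "ill"),
    (8, "died"), (9, "died"), (10, "died"),
]

def mean_severity(patients, excluded=()):
    """Mean severity after dropping patients whose status is in `excluded`."""
    return mean(sev for sev, status in patients if status not in excluded)

true_mean = mean_severity(cohort)                          # 5.5
without_dead = mean_severity(cohort, {"died"})             # 4
without_recovered = mean_severity(cohort, {"recovered"})   # 6.5

print("full cohort:              ", true_mean)
print("deaths excluded:          ", without_dead)
print("recovered cases excluded: ", without_recovered)
```

Dropping the deaths pulls the average below the full-cohort value, and dropping the recoveries pushes it above. If both groups are missing in unknown proportions, the direction of the error cannot be read off the remaining data.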
This type of bias often happens when a significant amount of time has passed between exposure and investigation. Patients who have died or recovered in the meantime are erroneously excluded from the analysis, skewing the results towards individuals who are more “average.” For example, a study of patients currently hospitalized for the flu will miss both the patients who have died and those who have been discharged after recovery. Neyman bias is less of a problem with acute, short-lived cases than with long-term diseases like HIV or tuberculosis.
This type of bias is also called prevalence-incidence bias, because it arises when a study uses prevalent cases where incident cases would be preferable. Incident cases are newer cases, such as first-time admissions. Prevalent cases are pre-existing cases, which are usually sicker, with more progressed disease, than incident cases. Combining prevalent and incident cases can actually make prevalence-incidence bias worse, obscuring the true relationship between your study variables (Magnus, 2008).
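The difference between sampling incident cases and sampling prevalent cases can be sketched with a small simulation. Everything here is an assumption made for illustration: onsets are spread uniformly over 100 days, episode durations (ending in either recovery or death) follow an exponential distribution with a mean of 5 days, and a single cross-sectional survey takes place on day 100.

```python
import random

random.seed(0)  # fixed seed so the sketch is repeatable

SURVEY_DAY = 100      # hypothetical day of the cross-sectional survey
MEAN_DURATION = 5.0   # hypothetical mean episode length, in days
N_CASES = 10_000      # hypothetical number of cases

# Each case is (onset day, episode duration in days).
cases = [
    (random.uniform(0, SURVEY_DAY), random.expovariate(1 / MEAN_DURATION))
    for _ in range(N_CASES)
]

# Incident sampling: every case is enrolled at onset, so none is missed.
incident_durations = [duration for _, duration in cases]

# Prevalent sampling: only episodes still ongoing on the survey day are seen.
prevalent_durations = [
    duration for onset, duration in cases if onset + duration > SURVEY_DAY
]

def average(values):
    """Arithmetic mean of a non-empty list."""
    return sum(values) / len(values)

incident_mean = average(incident_durations)
prevalent_mean = average(prevalent_durations)

print(f"enrolled at onset:  {len(incident_durations)} cases, "
      f"mean duration {incident_mean:.1f} days")
print(f"seen on survey day: {len(prevalent_durations)} cases, "
      f"mean duration {prevalent_mean:.1f} days")
```

Under these assumptions the survey-day sample contains only a small fraction of all cases, and the episodes it does contain are longer on average, because short episodes (quick recoveries and early deaths alike) have usually ended before the survey takes place.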
Avoiding Neyman Bias
Some study designs are more susceptible to prevalence-incidence bias than others, so a careful choice of study type can lessen its effects. The bias usually arises in case-control and cross-sectional research, although it sometimes occurs in experimental or cohort studies. A carefully designed follow-up study, by contrast, can help to lessen it.
Streiner and Norman (2009) offer the following example: the long-term outlook for schizophrenia patients is considered poor, but that view is mostly based on a natural history study of patients hospitalized for the disease, which misses discharged patients living productive lives. Follow-up studies of patients admitted for the first time reveal a more optimistic picture: between 60% and 80% of patients are actually living productive lives in the community.