A ratio estimator, commonly used in survey sampling, is the ratio of the means of two random variables. Ratio estimators are biased, so corrections for that bias must be made when using them in experiments.
Why Use a Ratio Estimator?
A couple of circumstances could lead you to use a ratio estimator instead of a simpler estimator:
- y and x are highly linearly correlated through the origin (i.e. x contributes to predicting y).
- You don’t know the number of elements in the population.
Auxiliary Statistics
Auxiliary statistics can be used, by way of a ratio estimator, to calculate hard-to-get statistics. An auxiliary variable is an easy-to-study variable x, used to gather information on the harder-to-research target variable y. For example (Borkowski, n.d.):
Variable of Interest (y) | Auxiliary Variable (x) |
---|---|
Amount of lumber a tree produces | Tree diameter |
Income level of a 50-year-old | Number of years of education completed |
Number of farms per county in the US | Number of farms per county in the previous census |
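
As a minimal sketch of how an auxiliary variable feeds a ratio estimator, the first row of the table can be turned into a calculation. The measurements and the known diameter total below are made-up illustrative values, and `ratio_estimate_total` is a hypothetical helper name, not a function from any library. Note that the estimate needs the population total of x, but not the number of elements in the population.

```python
def ratio_estimate_total(y_sample, x_sample, x_population_total):
    """Estimate the population total of y as (sum of y / sum of x) * known total of x."""
    if len(y_sample) != len(x_sample):
        raise ValueError("y_sample and x_sample must be paired")
    r = sum(y_sample) / sum(x_sample)  # identical to mean(y) / mean(x)
    return r * x_population_total


# Hypothetical sample of 5 trees: diameter (x, cm) and lumber volume (y, m^3).
diameters = [30.0, 42.0, 25.0, 51.0, 38.0]
volumes = [0.60, 0.84, 0.50, 1.02, 0.76]

# Assumed to be known for the whole stand, e.g. from a quick diameter survey.
total_diameter = 9300.0

estimated_volume = ratio_estimate_total(volumes, diameters, total_diameter)
print(f"Estimated total lumber: {estimated_volume:.1f} m^3")
```

The design choice here is to work with sums rather than means: the sample size cancels in the ratio, so the two forms give the same value.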
Historically, John Graunt (1662) was the first person to use the ratio estimate y/x, where y is the total population and the auxiliary variable x is the number of registered births in the same area for the prior year (Sen, 1993). Laplace later used a similar method to estimate the total population of France. There wasn’t a census at the time, so Laplace sampled 30 French communities, getting the following information:
- Number of communities sampled: n = 30.
- Total population of the sampled communities = 2,037,615.
Additional (auxiliary) information that Laplace obtained from government records:
- Total registered births in the n = 30 sampled communities = 71,866.33.
Dividing the community population by the actual number of registered births:
- 2,037,615 / 71,866.33 = 28.35.
There was only one registered birth for every 28.35 people. Laplace used this auxiliary information to produce a formula estimating total population in France:
Total population = total number of annual births * 28.35
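The arithmetic above can be checked with a few lines of Python. The two sample figures come from the text; the national birth count is a hypothetical round number used only to illustrate the final multiplication, since the text does not give the figure Laplace used.

```python
# Figures reported in the text for Laplace's 30 sampled communities.
sample_population = 2_037_615
sample_births = 71_866.33

# Ratio of people to registered births in the sample.
people_per_birth = sample_population / sample_births
print(f"People per registered birth: {people_per_birth:.2f}")

# Hypothetical national figure, used only to show the final step.
national_annual_births = 1_000_000

estimated_population = people_per_birth * national_annual_births
print(f"Estimated total population: {estimated_population:,.0f}")
```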
Cautions
The bias and the variance of a ratio estimator both shrink as the sample size increases; each is approximately of order 1/n. The mean square error of the ratio estimator (the variance plus the squared bias) therefore also decreases as the sample grows, and for large samples it is dominated by the variance.
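One way to see this behavior is a small simulation. The sketch below is only an illustration under stated assumptions: the population is synthetic, the model y = 5 + 2x + noise is invented for the example, and `simulate_ratio_estimator` is a hypothetical helper name. It draws repeated simple random samples of several sizes and reports the estimated bias and mean square error of r = sum(y) / sum(x).

```python
import random


def simulate_ratio_estimator(n, population, true_ratio, reps, rng):
    """Return (bias, mse) of r = sum(y)/sum(x) over repeated simple random samples."""
    errors = []
    for _ in range(reps):
        sample = rng.sample(population, n)  # sampling without replacement
        r = sum(y for _, y in sample) / sum(x for x, _ in sample)
        errors.append(r - true_ratio)
    bias = sum(errors) / reps
    mse = sum(e * e for e in errors) / reps
    return bias, mse


rng = random.Random(0)  # fixed seed so the run is repeatable

# Synthetic population (an assumption): y = 5 + 2x + noise. The non-zero
# intercept is deliberate, since it makes the bias large enough to see.
population = []
for _ in range(5000):
    x = rng.uniform(1.0, 10.0)
    population.append((x, 5.0 + 2.0 * x + rng.gauss(0.0, 1.0)))

true_ratio = sum(y for _, y in population) / sum(x for x, _ in population)

results = {}
for n in (5, 20, 80):
    results[n] = simulate_ratio_estimator(n, population, true_ratio, 2000, rng)
    bias, mse = results[n]
    print(f"n={n:3d}  bias={bias:+.4f}  mse={mse:.4f}")
```

Because the estimates come from a finite number of replications, the printed bias values carry some simulation noise, especially at the larger sample sizes where the true bias is small.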
Sometimes, it’s advantageous to stratify the population before using a ratio estimator (Scheaffer, 2011). However, small sample sizes within strata can lead to problems with bias, because each stratum's ratio carries its own bias. This effect can be lessened using a combined ratio estimator (i.e. a single ratio formed from the stratified estimates of y and x across all strata, rather than a separate ratio for each stratum).
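The difference between the two approaches to stratified data can be sketched as follows. The "separate" version computes one ratio per stratum and adds up the per-stratum totals, while the "combined" version builds a single ratio from the stratum-weighted estimates. The strata, their sizes, and the function names are all hypothetical, chosen only for illustration.

```python
def separate_ratio_total(strata):
    """Sum of per-stratum ratio estimates: each stratum gets its own ratio."""
    total = 0.0
    for s in strata:
        r_h = sum(s["y"]) / sum(s["x"])
        total += r_h * s["x_total"]
    return total


def combined_ratio_total(strata):
    """One ratio built from the stratified estimates of the y and x totals."""
    y_st = sum(s["N"] * sum(s["y"]) / len(s["y"]) for s in strata)
    x_st = sum(s["N"] * sum(s["x"]) / len(s["x"]) for s in strata)
    x_total = sum(s["x_total"] for s in strata)
    return (y_st / x_st) * x_total


# Hypothetical strata: N is the stratum size, x_total the known stratum total
# of x, and x, y are the paired sample values drawn from that stratum.
strata = [
    {"N": 100, "x_total": 420.0, "x": [2.0, 4.0, 6.0], "y": [5.0, 7.0, 13.0]},
    {"N": 300, "x_total": 4400.0, "x": [10.0, 20.0], "y": [19.0, 43.0]},
]

print(f"Separate ratio estimate: {separate_ratio_total(strata):.1f}")
print(f"Combined ratio estimate: {combined_ratio_total(strata):.1f}")
```

Note that the combined version needs the stratum sizes N in order to weight the sample means, whereas the separate version only needs each stratum's known total of x.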
References
Borkowski, J. (n.d.). Ratio and Regression Estimation. http://www.math.montana.edu/jobo/st446/documents/ho5a.pdf
Scheaffer, R. et al. (2011). Elementary Survey Sampling. Cengage Learning.
Sen, A. (1993). Some Early Developments in Ratio Estimation. https://doi.org/10.1002/bimj.4710350102