Ratio Estimator

A ratio estimator, commonly used in survey sampling, is the ratio of the sample means of two random variables. Ratio estimators are biased, so a correction for that bias must be made when they are used in experiments.

Why Use a Ratio Estimator?

A couple of circumstances could lead you to use a ratio estimator instead of a simpler estimator:

If y and x are highly linearly correlated through the origin (i.e., y is roughly proportional to x, so x helps to predict y).
If you don’t know the number of elements in the population.

Auxiliary Statistics

Auxiliary statistics can be used, by way of a ratio estimator, to calculate hard-to-get statistics. An auxiliary variable x is an easy-to-study variable used to gather information on the harder-to-research target variable y. For example (Borkowski, n.d.):

Variable of Interest	Auxiliary Variable
Amount of lumber a tree produces	Tree diameter
Income level of a 50-year-old	Number of years of education completed
Number of farms per county in the US	Number of farms per county in the previous census
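As a minimal sketch of how an auxiliary variable is put to work: the sample ratio r = (mean of y) / (mean of x) is multiplied by a known population total of x to estimate the population total of y. The function name, the tree measurements, and the population total of diameters below are all hypothetical values chosen for illustration; none of them come from Borkowski.

```python
def ratio_estimate_total(y_sample, x_sample, x_population_total):
    """Return (r, r * known total of x) for paired samples of y and x."""
    if len(y_sample) != len(x_sample):
        raise ValueError("y_sample and x_sample must be paired")
    # r = ybar / xbar; the 1/n factors cancel, so a ratio of sums is equivalent
    r = sum(y_sample) / sum(x_sample)
    return r, r * x_population_total

# Hypothetical data: lumber volume (y) and diameter (x) for 5 sampled trees
lumber = [10.0, 14.0, 18.0, 22.0, 16.0]
diameter = [5.0, 7.0, 9.0, 11.0, 8.0]

# Hypothetical known total of all tree diameters in the population
r, total_lumber = ratio_estimate_total(lumber, diameter, x_population_total=4000.0)
print(r, total_lumber)  # prints "2.0 8000.0"
```

Note that the population size never appears in this calculation; only the population total of the auxiliary variable is needed.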

Historically, John Graunt (1662) was the first person to use the ratio estimate y/x, where y is the total population and x (the auxiliary variable) is the number of registered births in the same area for the prior year (Sen, 1993). Laplace later used a similar method to estimate the total population of France. There was no census at the time, so Laplace sampled 30 French communities and obtained the following information:

Number of communities sampled: n = 30
Total population of the sampled communities = 2,037,615.

Additional (auxiliary) information that Laplace obtained from government records:

Annual registered births in the n = 30 sampled communities = 71,866.33 (an annual average, hence the fractional value).

Dividing the population of the sampled communities by their number of registered births gives:

2,037,615 / 71,866.33 = 28.35.

In other words, there was about one registered birth per year for every 28.35 people. Laplace used this auxiliary information to produce a formula estimating the total population of France:
Total population = total number of annual births * 28.35
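The arithmetic above can be reproduced in a few lines. This is only an illustrative sketch: the sample figures are the ones quoted in this article, but the national birth count is a hypothetical placeholder, since no actual figure for France is given here.

```python
# Figures quoted above for Laplace's sample of 30 communities
sample_population = 2_037_615
sample_births = 71_866.33

# Sample ratio of people to registered births
people_per_birth = sample_population / sample_births

def estimate_population(annual_births, ratio=people_per_birth):
    """Scale a birth count by the sample ratio of people to births."""
    return annual_births * ratio

# HYPOTHETICAL input: a placeholder national birth count, not a historical figure
hypothetical_births = 1_000_000

print(round(people_per_birth, 2))  # prints 28.35
print(round(estimate_population(hypothetical_births)))
```

Using the unrounded ratio inside the function, rather than 28.35, avoids carrying a rounding error into the scaled-up total.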

Cautions

The bias and the variance of a ratio estimator both decrease rapidly as the sample size on which they are based increases. Because the mean squared error is the variance plus the squared bias, the mean squared error of the ratio estimator, or of the estimator of a ratio, also decreases rapidly.
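To make the shrinking bias concrete, the sketch below computes the exact bias of r = (mean of y) / (mean of x) by enumerating every simple random sample (without replacement) from a tiny population. Everything here is an assumption made for illustration: the function name, the six-element population, and the choice of y = 2 + 3x, whose nonzero intercept means the line does not pass through the origin and so makes the bias visible.

```python
from itertools import combinations
from statistics import mean

def exact_ratio_bias(x, y, n):
    """Exact bias of ybar/xbar over all size-n samples drawn without replacement."""
    true_ratio = sum(y) / sum(x)
    estimates = [
        sum(y[i] for i in idx) / sum(x[i] for i in idx)
        for idx in combinations(range(len(x)), n)
    ]
    # Expected value of the estimator minus the population ratio
    return mean(estimates) - true_ratio

# Hypothetical toy population; the intercept of 2 is what creates the bias
x_pop = [1, 2, 3, 4, 5, 6]
y_pop = [2 + 3 * xi for xi in x_pop]

biases = {n: exact_ratio_bias(x_pop, y_pop, n) for n in (2, 4, 6)}
for n, b in biases.items():
    print(n, b)
```

In this toy population the bias is positive for the smaller samples, shrinks as n grows, and vanishes when the sample is the whole population. Real populations are far too large to enumerate, so this is a demonstration of the behaviour rather than a practical method.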

Sometimes, it’s advantageous to stratify the population before using a ratio estimator (Scheaffer, 2011). However, small sample sizes within strata can lead to problems with bias. This effect can be lessened by using a combined ratio estimator, which pools the weighted sample means of y and x across all strata and then forms a single ratio, instead of computing a separate ratio within each stratum.
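The difference between the two approaches can be sketched as follows, using the usual textbook form in which each stratum is weighted by its share of the population. The dictionary keys, function names, and stratum figures are hypothetical and serve only to show the order of operations.

```python
def combined_ratio(strata):
    """Pool the weighted stratum means first, then form ONE ratio."""
    y_st = sum(s["weight"] * s["y_mean"] for s in strata)
    x_st = sum(s["weight"] * s["x_mean"] for s in strata)
    return y_st / x_st

def separate_ratios(strata):
    """Form one ratio per stratum; each rests on that stratum's small sample."""
    return [s["y_mean"] / s["x_mean"] for s in strata]

# Hypothetical strata; "weight" is the stratum's share of the population
strata = [
    {"weight": 0.50, "y_mean": 10.0, "x_mean": 4.0},
    {"weight": 0.25, "y_mean": 20.0, "x_mean": 8.0},
    {"weight": 0.25, "y_mean": 40.0, "x_mean": 20.0},
]

print(combined_ratio(strata))
print(separate_ratios(strata))  # prints [2.5, 2.5, 2.0]
```

Because the combined version divides only once, after pooling all strata, its bias depends on the total sample size and not on the size of the smallest stratum.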

References

Borkowski, J. (n.d.). Ratio and Regression Estimation. http://www.math.montana.edu/jobo/st446/documents/ho5a.pdf
Scheaffer, R. et al. (2011). Elementary Survey Sampling. Cengage Learning.
Sen, A. (1993). Some Early Developments in Ratio Estimation. https://doi.org/10.1002/bimj.4710350102