Normalized Data / Normalization
About Normalized Data
The word “normalization” is used informally in statistics, and so the term normalized data can have multiple meanings. In most cases, when you normalize data you eliminate the units of measurement for data, enabling you to more easily compare data from different places. Some of the more common ways to normalize data include:
- Transforming data using a z-score or t-score. This is usually called standardization. In the vast majority of cases, if a statistics textbook talks about normalizing data, this is the definition of “normalization” it is probably using.
- Rescaling data to have values between 0 and 1. This is usually called feature scaling. One possible formula to achieve this is x′ = (x − min(x)) / (max(x) − min(x)), where min(x) and max(x) are the smallest and largest values in the data set.
- Standardizing residuals: Ratios used in regression analysis can force residuals into the shape of a normal distribution.
- Normalizing moments using the formula μ_k/σ^k, where μ_k is the k-th central moment and σ is the standard deviation.
- Normalizing vectors (in linear algebra) to a norm of one. Normalization in this sense means to transform a vector so that it has a length of one.
This list is by no means all-inclusive. I’ve included the most common ones, but be aware there are many, many other meanings for the word normalization.
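Two of the meanings listed above, rescaling to the range 0 to 1 and normalizing a vector to length one, are easy to mix up, so here is a minimal Python sketch of each. The function names `min_max_scale` and `unit_vector` and the sample numbers are illustrative assumptions, not part of any particular library.

```python
import math


def min_max_scale(values):
    """Rescale values linearly so the smallest maps to 0 and the largest to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # The formula divides by (max - min), which is zero here.
        raise ValueError("all values are identical; the range is zero")
    return [(v - lo) / (hi - lo) for v in values]


def unit_vector(vector):
    """Divide a vector by its Euclidean length so the result has norm 1."""
    length = math.sqrt(sum(c * c for c in vector))
    if length == 0:
        raise ValueError("the zero vector cannot be normalized")
    return [c / length for c in vector]


# Hypothetical sample data, chosen only for illustration.
scaled = min_max_scale([10, 15, 20, 30])
unit = unit_vector([3.0, 4.0])

print(scaled)  # values now lie between 0 and 1
print(unit)    # a vector pointing the same way, with length 1
```

Note that the two operations answer different questions: feature scaling fixes the range of a single variable, while vector normalization fixes the length of a whole vector and generally does not map its components onto the interval from 0 to 1.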
Normalization vs. Standardization
The terms normalization and standardization are sometimes used interchangeably, but they usually refer to different things. Normalization usually means to scale a variable to have values between 0 and 1, while standardization transforms data to have a mean of zero and a standard deviation of 1. The standardized value is called a z-score, and data points can be standardized with the following formula:
z = (x − μ) / σ
where x is the data point, μ is the mean, and σ is the standard deviation.
Z-scores are very common in statistics. They allow you to compare different sets of data and to find probabilities for sets of data using standardized tables (called z-tables). For more about z-scores, see: Z-score: Definition, Formula, and Calculation.
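As a rough illustration of standardization (subtract the mean, then divide by the standard deviation), the sketch below uses only Python's standard `statistics` module. The function name `z_scores` and the sample data are assumptions made for the example, and it uses the population standard deviation (`statistics.pstdev`); if your data are a sample, `statistics.stdev` may be the more appropriate choice. The last lines use `statistics.NormalDist` in place of a printed z-table.

```python
import statistics


def z_scores(values):
    """Return the z-score of each value: (x - mean) / standard deviation."""
    mu = statistics.mean(values)
    sigma = statistics.pstdev(values, mu)  # population standard deviation
    if sigma == 0:
        raise ValueError("standard deviation is zero; z-scores are undefined")
    return [(v - mu) / sigma for v in values]


# Hypothetical sample data: mean 5, population standard deviation 2.
data = [2, 4, 4, 4, 5, 5, 7, 9]
z = z_scores(data)
print(z)

# After standardizing, the data have mean 0 and standard deviation 1.
print(statistics.mean(z), statistics.pstdev(z))

# What a z-table lookup gives: the probability that a standard normal
# variable falls below a given z-score (here, z = 1).
p_below = statistics.NormalDist().cdf(1.0)
print(round(p_below, 4))
```

Because z-scores are unit-free, values computed this way from two different data sets can be compared directly, which is the main reason the transformation is so widely used.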