## What is Hierarchical Linear Modeling?

Hierarchical linear modeling, also called multi-level modeling, is a way to analyze hierarchical data. Hierarchical data is data that is nested in some way.

Nested data is a common occurrence in real life. For example:

- Employees are nested within departments, companies, geographic regions and sectors of the economy.
- Schoolchildren are nested in grades, schools, districts and states.
- School district employees are nested in families, geographic areas and sectors of the economy.

Hierarchical linear modeling is an extension of **ordinary least squares regression**. The technique takes the nesting into account and can include many levels of the hierarchy. Participants can also be *cross-classified* by groupings that are not nested within each other. For example, employees could be cross-classified by family type and department.

The types of questions that can be answered with HLM include:

- Do a student's classroom, school, and district affect standardized test results?
- Do family type, department worked in, and industry contribute to job satisfaction?

Hierarchical linear modeling can be thought of as a two-step process. For the family type example above, an OLS regression could first be performed on family type and job satisfaction. The results of that regression (the estimated coefficients) then become the outcome variables for a second regression on industry.
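The two-step idea can be sketched in a few lines of plain Python. This is a conceptual illustration only, not how dedicated software fits the model: mixed-model packages typically estimate both levels together rather than in two separate passes. The data, the level-1 predictor `tenure`, the level-2 predictor `team_size`, and the helper `simple_ols` are all hypothetical names made up for this example.

```python
from statistics import mean


def simple_ols(x, y):
    """Return (intercept, slope) of the least-squares line y = a + b * x."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


# Hypothetical toy data: employees (level 1) nested in departments (level 2).
departments = {
    "A": {"team_size": 5, "tenure": [1, 2, 3, 4], "satisfaction": [6.5, 7.0, 7.5, 8.0]},
    "B": {"team_size": 10, "tenure": [1, 2, 3, 4], "satisfaction": [5.5, 6.0, 6.5, 7.0]},
    "C": {"team_size": 15, "tenure": [1, 2, 3, 4], "satisfaction": [4.5, 5.0, 5.5, 6.0]},
}

# Step 1: fit a separate regression of satisfaction on tenure in each department.
level1 = {
    name: simple_ols(d["tenure"], d["satisfaction"])
    for name, d in departments.items()
}

# Step 2: the department intercepts from step 1 become the outcome variable
# in a second regression on a department-level predictor.
team_sizes = [d["team_size"] for d in departments.values()]
intercepts = [level1[name][0] for name in departments]
level2_intercept, level2_slope = simple_ols(team_sizes, intercepts)

print("Level-1 (intercept, slope) per department:", level1)
print("Level-2 intercept and slope:", level2_intercept, level2_slope)
```

The same second step could be repeated with the level-1 slopes as the outcome, which asks whether the effect of tenure itself differs from one department to the next.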

## Problems with Hierarchical Data

One of the assumptions for most statistical tests is independence of observations. This assumption is usually violated for hierarchical data.

Let’s say you were conducting an experiment to see if certain teaching methods improved kindergartners’ math performance. Budget and time restrictions would prevent you from sampling from the entire population of kindergartners in the US (imagine getting one child from every state!), so you might decide to focus on one randomly sampled class in an inner-city school. The children in this class are going to be more similar to one another in many ways (socioeconomic status, for one) than the children in a true random sample would be. Because that similarity violates the independence of observations assumption, hierarchical linear modeling is needed as an alternative.
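One common way to quantify how strongly observations cluster within groups is the intraclass correlation (ICC). The original text does not name it, so treat the following as a supplementary sketch: it computes the one-way ANOVA version of the ICC for equally sized groups. The classroom scores and the function name `icc_oneway` are hypothetical.

```python
from statistics import mean


def icc_oneway(groups):
    """One-way ANOVA intraclass correlation; assumes equal group sizes."""
    n_groups = len(groups)
    k = len(groups[0])  # observations per group
    if any(len(g) != k for g in groups):
        raise ValueError("this sketch assumes every group has the same size")

    grand_mean = mean([v for g in groups for v in g])
    group_means = [mean(g) for g in groups]

    # Between-group and within-group sums of squares.
    ss_between = k * sum((m - grand_mean) ** 2 for m in group_means)
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    ms_between = ss_between / (n_groups - 1)
    ms_within = ss_within / (n_groups * (k - 1))
    return (ms_between - ms_within) / (ms_between + (k - 1) * ms_within)


# Hypothetical math scores for three classrooms of three children each.
classrooms = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
icc = icc_oneway(classrooms)
print(round(icc, 3))
```

A value near 1 means that children in the same classroom resemble each other far more than children in different classrooms do, which is the situation where treating the observations as independent is least defensible.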

## Assumptions for Hierarchical Linear Modeling

- Normality: the residuals at each level of the model should be approximately normally distributed.
- Homogeneity of variance: the residual variances should be equal across groups.
