
## What is a Dendrogram?

A dendrogram is a type of tree diagram that shows hierarchical clustering: the relationships between similar sets of data. Dendrograms are frequently used in biology to show how genes or samples are clustered, but they can represent any type of grouped data.

## Parts of a Dendrogram

A dendrogram can be a column graph (as in the image below) or a row graph. Some dendrograms are circular or have a fluid shape, but software will usually produce a row or column graph. Whatever the shape, the basic graph is made of the same parts:

- The *clade* is the branch. Clades are usually labeled with Greek letters from left to right (e.g. α, β, δ…).
- Each clade has one or more *leaves*. The leaves in the above image are:
  - Single (simplicifolius): F
  - Double (bifolius): D E
  - Triple (trifolious): A B C

A clade can theoretically have an unlimited number of leaves. However, the more leaves a clade has, the harder the graph is to read with the naked eye.
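To make the clade and leaf vocabulary concrete, here is a minimal Python sketch that stores a dendrogram as nested tuples and lists the leaves under each clade. The nesting is an assumed reading of the figure described in this article (a triple A B C, a double D E, and a single F), and the variable names `abc`, `de`, `beta`, and `root` are hypothetical labels chosen for illustration.

```python
# A clade is a tuple of children; a leaf is a plain string label.
# The nesting below is an assumed reading of the article's figure.
abc = ("A", "B", "C")   # triple clade
de = ("D", "E")         # double clade
beta = (abc, de)        # clade joining A through E
root = (beta, "F")      # F hangs off the root as a single leaf


def leaves(clade):
    """Return every leaf label under a clade, left to right."""
    if isinstance(clade, str):
        return [clade]
    found = []
    for child in clade:
        found.extend(leaves(child))
    return found


for name, clade in [("abc", abc), ("de", de), ("beta", beta), ("root", root)]:
    print(name, leaves(clade))
```

A clade with many leaves is just a deeper or wider tuple here, which is also why large dendrograms become hard to read: every extra leaf adds another branch to draw.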

## How to Read a Dendrogram

The clades are arranged according to how similar (or dissimilar) they are. Clades that are close to the same height are similar to each other; clades with different heights are dissimilar: **the greater the difference in height, the more dissimilarity**. Similarity can be measured in many different ways; one of the most popular measures is Pearson’s Correlation Coefficient.

- Leaves A, B, and C are more similar to each other than they are to leaves D, E, or F.
- Leaves D and E are more similar to each other than they are to leaves A, B, C, or F.
- Leaf F is substantially different from all of the other leaves.

Note that on the above graph, leaves A, B, C, D, and E are joined by the same clade, β. That means that the two groups (A, B, C and D, E) are more similar to each other than they are to F.
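The reading rules above can be checked with a small, self-contained Python sketch. It is not the method used to draw the article's figure: the six numeric values are made up for illustration, the distance is a plain absolute difference rather than a correlation, and average linkage is an assumed choice of clustering rule. The sketch builds the merges bottom-up and reports the height at which any two leaves first join; a lower joining height means the leaves are more similar.

```python
from itertools import combinations

# Hypothetical 1-D measurements for six leaves (not taken from the article's image).
points = {"A": 1.0, "B": 1.4, "C": 2.0, "D": 5.0, "E": 5.5, "F": 12.0}


def average_linkage(c1, c2):
    """Mean absolute difference over all cross-cluster pairs of leaves."""
    total = sum(abs(points[a] - points[b]) for a in c1 for b in c2)
    return total / (len(c1) * len(c2))


def build_dendrogram(labels):
    """Agglomerative clustering; returns a list of (height, left, right) merges."""
    clusters = [frozenset([label]) for label in labels]
    merges = []
    while len(clusters) > 1:
        # Find the two closest clusters and join them at that height.
        left, right = min(combinations(clusters, 2),
                          key=lambda pair: average_linkage(*pair))
        merges.append((average_linkage(left, right), left, right))
        clusters = [c for c in clusters if c not in (left, right)] + [left | right]
    return merges


def join_height(merges, a, b):
    """Height of the first merge that puts leaves a and b in the same clade."""
    for height, left, right in merges:
        if (a in left and b in right) or (a in right and b in left):
            return height
    raise ValueError(f"{a} and {b} never join")


merges = build_dendrogram(sorted(points))
for height, left, right in merges:
    print(f"{height:5.2f}  {sorted(left)} + {sorted(right)}")
```

With these made-up values, A, B, and C join each other at lower heights than any of them joins D or E, and F is the last leaf to join, which mirrors the three bullet points above. Calling `join_height(merges, "A", "F")` returns the height of the final merge.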
