## What is a Decision Tree?

A decision tree is a type of probability tree that helps you choose between alternative courses of action. For example, you might want to choose between manufacturing item A or item B, or investing in choice 1, choice 2, or choice 3. Trees are a good way to lay out these kinds of complex decisions, which often involve many different factors and usually some degree of uncertainty. Although they can be drawn by hand, software is often used because the trees can become complex very quickly.

## Components

There are three broad areas usually displayed in a tree:

- **The Decision**: displayed as a square node with two or more arcs (called “decision branches”) pointing to the options.
- **The Event sequence**: displayed as a circle node with two or more arcs pointing out the events. Probabilities may be displayed with the circle nodes, which are sometimes called “chance nodes”.
- **The Consequences**: the costs or utilities associated with different pathways of the decision tree. The endpoint is called a “Terminal” and is represented by a triangle or bar on a computer.

For a decision tree to be effective, it must contain all possibilities, i.e. all possible pathways and event sequences. In addition, the events must be mutually exclusive; in other words, if one event happens, the other cannot.
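The three components and the requirement above can be sketched in code. The following is a minimal illustration, not a standard library or an established API: the class names `Decision`, `Chance`, and `Terminal`, the `validate` helper, and all payoffs and probabilities are assumptions made up for this example. If the branches leaving a chance node are mutually exclusive and cover every possibility, their probabilities must sum to 1, so the check below is a necessary condition rather than a proof that the tree is complete.

```python
from dataclasses import dataclass
import math


@dataclass
class Terminal:
    """Endpoint of a pathway; holds its consequence (a cost or utility)."""
    payoff: float


@dataclass
class Chance:
    """Circle node: a list of (probability, child_node) pairs."""
    branches: list


@dataclass
class Decision:
    """Square node: maps each option name to the node it leads to."""
    options: dict


def validate(node):
    """Raise ValueError if any node in the tree is malformed."""
    if isinstance(node, Chance):
        total = sum(p for p, _ in node.branches)
        # Mutually exclusive, exhaustive events have probabilities summing to 1.
        if not math.isclose(total, 1.0):
            raise ValueError(f"chance node probabilities sum to {total}, not 1")
        for _, child in node.branches:
            validate(child)
    elif isinstance(node, Decision):
        if len(node.options) < 2:
            raise ValueError("a decision node needs two or more options")
        for child in node.options.values():
            validate(child)


# Hypothetical tree: choose between manufacturing item A or item B.
# Every number here is invented purely for illustration.
tree = Decision({
    "item A": Chance([(0.6, Terminal(100.0)), (0.4, Terminal(-20.0))]),
    "item B": Chance([(0.5, Terminal(80.0)), (0.5, Terminal(10.0))]),
})

validate(tree)  # passes silently: each chance node's probabilities sum to 1
```

A chance node whose branches carried probabilities 0.6 and 0.3 would fail this check, which signals that some possible event is missing from the tree.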

The following image, based on an infographic from John DeGroote’s website, shows how a decision tree can be used to give plaintiffs realistic expectations about going to court.

For complete calculation steps of the probabilities shown on this tree, see the example in Expected Monetary Value.
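As a rough sketch of the kind of calculation involved, the expected monetary value (EMV) of an option is the sum of each outcome's payoff weighted by its probability. The figures below are hypothetical and are not taken from the infographic; the function name `expected_monetary_value` and the variables are likewise assumptions for illustration.

```python
def expected_monetary_value(outcomes):
    """Return the probability-weighted sum of payoffs.

    outcomes: list of (probability, payoff) pairs for a single option.
    """
    return sum(p * payoff for p, payoff in outcomes)


# Hypothetical plaintiff's choice: go to court or accept a settlement.
legal_costs = 30_000  # assumed to be paid whether the case is won or lost

go_to_court = [
    (0.6, 100_000 - legal_costs),  # assumed 60% chance of winning 100,000
    (0.4, 0 - legal_costs),        # assumed 40% chance of winning nothing
]
settle = [(1.0, 40_000)]           # assumed certain settlement offer

emv_court = expected_monetary_value(go_to_court)
emv_settle = expected_monetary_value(settle)

# Pick the option with the higher EMV.
best = "settle" if emv_settle > emv_court else "go to court"
print(f"EMV court: {emv_court:.0f}, EMV settle: {emv_settle:.0f}, best: {best}")
```

With these made-up numbers the trial looks attractive at first glance (a 60% chance of 100,000), but its EMV after costs is lower than the settlement offer, which is the kind of expectation-setting the tree is meant to support.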

## Advantages and Disadvantages

### Advantages

- A decision tree is easy to understand and interpret.
- Expert opinion and preferences can be included, as well as hard data.
- Can be used with other decision techniques.
- New scenarios can easily be added.

### Disadvantages

- If a decision tree is built from categorical variables with different numbers of levels, information gain is biased toward the variables with more levels, so they can be favored even when they are no more informative.
- Calculations can quickly become very complex, although this is usually only a problem if the tree is being created by hand.
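The first disadvantage can be illustrated with a small, made-up data set. The sketch below computes information gain from Shannon entropy for two attributes: a two-level attribute that is somewhat related to the outcome, and a row identifier that takes a different value on every row and so says nothing useful about new cases. The data and the function names `entropy` and `information_gain` are assumptions for illustration only.

```python
from collections import Counter
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Reduction in label entropy after splitting on an attribute."""
    n = len(labels)
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(v, []).append(y)
    # Weighted average entropy of the groups produced by the split.
    remainder = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


# Hypothetical data: eight cases with a yes/no outcome.
labels = ["yes", "no", "yes", "no", "yes", "no", "yes", "no"]
two_level = ["s", "s", "s", "r", "r", "r", "s", "r"]  # attribute with 2 levels
row_id = list(range(8))                               # attribute with 8 levels

gain_two_level = information_gain(two_level, labels)
gain_row_id = information_gain(row_id, labels)

print(f"two-level attribute: {gain_two_level:.3f} bits")
print(f"row identifier:      {gain_row_id:.3f} bits")
```

Because every row identifier isolates a single case, each resulting group is perfectly pure and the identifier receives the maximum possible gain, even though it cannot generalize.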
