Statistics Definitions > Prime Numbers

## Prime numbers: Contents

**Click to skip to the section:**

## What is a Prime Number?

Prime numbers are whole numbers (numbers that aren’t fractions) greater than 1 that are **divisible only by itself and one.** For example, 13 is a prime number because it cannot be divided by anything but 13 and 1.

One way to figure out if a number is a prime is to arrange the units (1s) into rows and columns. If you can make equal rows and columns, the result is **not** a prime. For example, 12 is not a prime because you can arrange it in 3 rows of 4 (or 6 rows of 2). As the picture below shows, not matter how you try to arrange the number 11 in even rows and column, you can’t. Therefore, the number 11 is a prime.

## Is 2 a prime number?

Yes. It’s the first (and only) even prime number.

## How many prime numbers are there?

There are an infinite amount of prime numbers. Although new prime numbers are being discovered every day, there’s no end to the amount of primes to be discovered. The proof of this fact goes back to 300 B.C.E. when Euclid outlined it in book IX of *The Elements*. Proposition 20 states every list of primes (no matter how large) is missing at least one prime number.

## What is the Largest Prime Number?

As of the beginning of 2016, the largest prime number was 2^{74,207,281}-1 (record stands as of time of writing — September 17, 2017). It is calculated by multiplying 2 by itself 74,207,281 times, then subtracting one. New primes are being found all the time. For the most up-to-date list of the largest prime number, see: GIMPS.

Back to top

## What are primes used for in probability and statistics?

Prime numbers aren’t generally used *in* statistics (other than those number appearing in data), but statistics and probabilities are used to work with prime numbers in **number theory**. For example, you might want to find the probability of choosing a prime number from a series of numbers. The odds depend on what interval you choose:

- The probability of finding a prime in the set {0,1,2} is .333, because one out of the three numbers is a prime (1/3 = .333).
- The probability for the set of numbers from 1 to 100 is .25, because 25/100 numbers in that set are primes (which are 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37, 41, 43, 47, 53, 59, 61, 67, 71, 73, 79, 83, 89, 97).

Prime numbers *look* random, but there’s some research using statistical mechanics that suggests a chaos pattern (statistical mechanics is a branch of mathematical physics that studies the behavior of systems). “It is evident that the primes are randomly distributed but, unfortunately, we don’t know what ‘random’ means” (Vaughan, 1990).

Why do we even care about prime numbers in real life or the probability of finding them? You may not realize it, but prime numbers play an important role in many areas of science, including the math behind internet shopping. Prime numbers are the nuts and bolts behind the cryptography that keeps your personal information secure when you shop online.

## Prime Counting Function

The prime-counting function answers the question “How many primes are there less than or equal to a real number x?” The function is denoted by π(x) (which has nothing to do with the number π).

The first few values of π(n) for n = {1,2,3,…n} are 0, 1, 2, 2, 3, 3, 4, 4, 4, 4, 5, 5, 6. For example, at n = 12 there are 5 primes (2, 3, 5, 7, 11).

While the prime counting function tells you *how many primes,* it isn’t practical to list all the possible primes in order to figure every π(x). For example, π(10^{5}) = 9592. Although **many functions exist** that can count these numbers, there isn’t one single function that can be called **the **counting function. All are approximations (i.e. all have margins of error). Two functions from the literature:

## Prime Number Theorem

The Prime Number Theorem helps to calculate probabilities of prime numbers in larger sets. It gives an **approximate number of primes less than or equal to any positive real number x.**

The theorem states that the “density” of prime numbers in the interval from 1 to x is approximately 1 / ln[x].

The following image shows how the approximation works. The black line is the actual density of primes from 0 to 200. For example, if you look at 40 on the chart, the density is 0.3. This is because there are 12 primes up to x = 40: 2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31 and 37. So the **actual density** is 12/40 = 0.3. The approximation 1/ln(40) = 0.27, which is a **reasonable approximation**, albeit a little low. The red line in the graph shows an alternate approximation 1/(ln(40)-1) = .37 — which is a little on the high side.

## Consequence of the Prime Number Theorem

An interesting **consequence of the prime number theorem **is that the average distance between consecutive primes (in the vicinity of *n*) is the logarithm of *n* (Robbins, 2006). For example, let’s say you took the interval 900 to 1100 (centered at 1,000). There are 30 prime numbers in the interval 900 to 1100.

The total distances are given in parentheses between each pair:

907 (4) 911 (8) 919 (10) 929 (8) 937 (4) 941 (6) 947 (6) 953 (14) 967 (4) 971 (6) 977 (6) 983 (8) 991 (6) 997 (12) 1009 (4) 1013 (6) 1019 (2) 1021 (10) 1031 (2) 1033 (6) 1039 (10) 1049 (2) 1051 (10) 1061 (2) 1063 (6) 1069 (18) 1087 (4) 1091 (2) 1093 (4) 1097

The average distance is 6.55:

(4 + 8 + 10 + 8 + 4 + 6 + 6 + 14 + 4 + 6 + 6 + 8 + 6 + 12 + 4 + 6 + 2 + 10 + 2 + 6 + 10 + 2 + 10 + 2 + 6 + 18 + 4 + 2 + 4) / 29 = 6.55.

Log(1000) gives us about 6.9. That’s a reasonable approximation, which gets a little better for a greater interval; The average distance from 800 to 1200 is 6.8.

## Primes and the Poisson Distribution

There is a connection between the Poisson distribution and the prime number theorem: Short intervals of primes fall into the approximate shape of a Poisson distribution.

The Poisson Distribution formula is: P(x; μ) = (e

^{-μ}) (μ^{x}) / x!

Let’s say that that x (as in the prime counting function) is a very big number, like x = 10^{100}. If you choose a random number that’s less than or equal to x, the probability of that number being prime is about 0.43 percent. Furthermore, if you make that interval very short, with μ^{x} > 0 and *j* under about 20, then the number of primes in the interval roughly follows a Poisson distribution (Croot, 2010).

## Ramanujan prime

A Ramanujan prime is another theory about the number of primes between certain points. The *n*th Ramanujan prime p is the smallest prime such that there are at least n primes between x and 2x. This is true for any x such that 2x > p.

### References:

Croot, E. (2010). Some notes on the Poisson distribution. Retrieved September 17, 2017 from: people.math.gatech.edu/~ecroot/3225/poisson_notes2.pdf.

Feldman, P. Prime Numbers and their statistical properties. Retrieved September 16, 2017 from: http://phillipmfeldman.org/primes/primes.html

Great Internet Mersenne Prime Search (GIMPS). Retrieved September 17, 2017 from: ttp://www.mersenne.org/.

Ribenboim, P. (1995). The Little Book of Big Primes. Springer Verlag.

Robbins, N. (2006). Beginning Number Theory. Jones and Bartlett Learning.

Vaughan, R. (1990). Harald Cramér and the distribution of prime numbers. Retrieved September 16, 2017 from: a href=”http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.129.6847″ target=”_blank”.

If you prefer an online interactive environment to learn R and statistics, this *free R Tutorial by Datacamp* is a great way to get started. If you're are somewhat comfortable with R and are interested in going deeper into Statistics, try *this Statistics with R track*.

*Facebook page*and I'll do my best to help!