If you’re looking for help with probability and statistics, you’re in the right place. StatisticsHowTo.com has a comprehensive database of articles covering all the material you’re likely to find in an AP statistics, elementary statistics or college statistics class. Several hundred articles include short, how-to videos that you can also find on our YouTube Channel.

### Probability and Statistics Topic Indexes

- Basic Statistics.
- Descriptive Statistics: Charts, Graphs and Plots.
- Probability.
- Binomial Theorem.
- Definitions for Common Statistics Terms.
- Critical Values.
- Hypothesis Testing.
- Normal Distributions.
- T-Distributions.
- Central Limit Theorem.
- Confidence Intervals.
- Chebyshev’s Theorem.
- Sampling and Finding Sample Sizes.
- Chi Square.
- Online Tables (z-table, chi-square, t-dist etc.).
- Regression Analysis / Linear Regression.
- Non Normal Distributions.

### Technology Topic Indexes

- Online calculators.
- Microsoft Excel for Statistics.
- TI 83 for elementary statistics.
- TI 89 for elementary statistics.
- SPSS Statistics.
- Statistics Help.
- What is the best calculator for statistics?

## What is Probability and Statistics?

Probability and Statistics usually refers to an introductory course in probability and statistics. The “probability” part of the class covers calculating the probability that an event happens. The class usually starts with basic scenarios like playing cards and dice rolls; these basic tools are used later in the class to find more complex probabilities, like the probability of contracting a certain disease. The “statistics” part covers a wide variety of methods for finding actual statistics, which are numbers you can use to generalize about a population. For example, you could measure the heights of all your male classmates and find the mean height to be 5’9″; this is a statistic. You could then take that statistic and say, “I think the average height of an American male is 5’9″.” How accurate your guess is depends on many factors, including how many men you measured and how many men are in the entire population. Statistics are useful because we often don’t have the resources to measure, survey or poll every member of a population, so instead we take a sample (a small part of the population).
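As a rough illustration of the height example, the sketch below computes a sample mean and its estimated standard error. The ten heights are made-up values, and the helper name `sample_mean_and_se` is hypothetical. The standard error formula assumes a simple random sample, which a group of classmates may not be.

```python
import math
import statistics

# Hypothetical heights (in inches) of ten male classmates; made-up data.
heights_in = [69.0, 70.0, 68.0, 71.0, 67.0, 69.0, 70.0, 68.0, 69.0, 69.0]


def sample_mean_and_se(values):
    """Return the sample mean and its estimated standard error."""
    n = len(values)
    mean = statistics.mean(values)
    # Sample standard deviation (n - 1 in the denominator).
    sd = statistics.stdev(values)
    # The standard error shrinks as the sample size n grows.
    se = sd / math.sqrt(n)
    return mean, se


mean, se = sample_mean_and_se(heights_in)
feet, inches = divmod(round(mean), 12)
print(f"sample mean: {mean:.1f} in ({feet} ft {inches} in)")
print(f"estimated standard error: {se:.2f} in")
```

A smaller standard error suggests the sample mean is a more precise guess at the population mean, which is one way to see why the number of men measured matters.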

If you prefer an online interactive environment to learn R and statistics, this free R Tutorial by Datacamp is a great way to get started. If you’re somewhat comfortable with R and are interested in going deeper into statistics, try this Statistics with R track.

I am taking Elementary Statistics this summer and have a TI-83 Plus calculator. I did not do very well in algebra but passed. I am thinking of buying the book. Does it come with training on how to use the calculator? I was not able to find the app that says stats.

Thank you, and please let me know. I would like to pass this class, and it is only a 10-week class.

Trish

Trish,

The TI-83 guide walks you through the steps to using the calculator for a wide variety of problems. It comes free with the statistics handbook if you buy it here on the site.

Good luck!

Stephanie

download link not visible. Paid for statistics handbook. 1/28/15

Hello, Glory,

I am resending you the link right now. Please let me know if you don’t get it.

The filling machine used by a dairy company to fill 1kg containers of yoghurt produces output which follows a normal distribution with mean 1030g (slightly more than 1kg) and standard deviation 20g. Suppose that the company can change either the mean or the standard deviation of the filling amount (but not both). If they require the probability of underfilling a container (i.e. contents less than 1000g) to equal 0.05, find (i) the smallest value of the mean; and (ii) the largest value of the standard deviation that satisfies this requirement (in each case to the nearest gram).
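One possible way to work this problem is sketched below, using Python’s standard-library `statistics.NormalDist`. The helper names are hypothetical. The idea is that the 1000g limit must sit at the 5th percentile of the filling distribution, so 1000 = mean + z × sd, where z is the standard normal value with 5% of the distribution below it (about −1.645).

```python
from statistics import NormalDist

TARGET = 1000.0     # grams; anything below this counts as underfilled
P_UNDERFILL = 0.05  # required probability of underfilling

# z-value with 5% of the standard normal distribution below it (about -1.645).
z = NormalDist().inv_cdf(P_UNDERFILL)


def mean_for_fixed_sd(sd):
    """Mean that puts exactly P_UNDERFILL of the output below TARGET."""
    return TARGET - z * sd


def sd_for_fixed_mean(mean):
    """Standard deviation that puts exactly P_UNDERFILL below TARGET."""
    return (TARGET - mean) / z


mean_i = mean_for_fixed_sd(20.0)    # part (i): sd stays at 20g
sd_ii = sd_for_fixed_mean(1030.0)   # part (ii): mean stays at 1030g

print(f"(i)  mean = {mean_i:.1f}g, nearest gram: {round(mean_i)}g")
print(f"(ii) sd   = {sd_ii:.1f}g, nearest gram: {round(sd_ii)}g")
```

Under these assumptions the mean comes out near 1033g and the standard deviation near 18g.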

I have taken Elementary Statistics / Math 300 through Aleks, and I failed it. This is the last class for my BA degree, and I am worried I will not pass it again. Will your book help me get through this class? I am on SS only and have to pay for this class, which is going to hurt me a lot. I want to order your book; it looks like it will give me some hope in this class. If I order it, how long will it take to get to me?

Thank you for your time

Mary Ann K

Your site’s definition of “variable” (a quantity that has changing values, although better: that could be any of a set of values) is horrible. A variable is a “place holder in a data matrix.” The idea is that you plan out your research project before any data are collected: how many respondents (n), how many variables (v1…vk), & each of the variables’ possible attributes (v1{a1…z1},…, vk{ak…zk}). The upshot of this plan is an empty data matrix (technically referred to as your sample space), which only gets filled in when your sample is drawn. Filling in a data matrix involves n-times-k events, where each set of k events for any individual respondent is random (i.e., it has the same probability as every other such respondent-set in the population). (Since events among respondents are not constrained to be random, associations among events are interpreted as being among variables. This follows because due to one’s random selection process, associations among respondents could only result due to sampling error [i.e., only if the random selection process yielded a peculiar, unrepresentative sample].) So a variable is not a quantity; at best it is a “possible quantity.” And it does not have changing values: Its values are fixed at the moment one’s sample is drawn (at which time it is no longer a random variable, but forms a column of numbers in your data matrix). With respect, your definition does not simplify matters; it leaves the novice with little insight into one of the most fundamental of statistical ideas. Beyond this, I do not see any section of your site devoted to the Central Limit Theorem. Don’t “the rest of us” need to understand this?

Hi, Carl,

Thanks for your comments. I do get hits from people searching for algebra topics as well as statistics. I’ve had questions in the past from people confused about the difference between algebra variables, statistics variables etc., so I wanted the term “variable” to be as general as possible. Most of the people visiting this site are not math majors; they are liberal arts majors struggling to understand statistics, so I try to simplify the terms as much as possible. I fear that putting the definition “a place holder in a data matrix” will completely confuse most liberal arts folks.

On the specific variable posts/pages (e.g., confounding variable) I do go more in depth.

Here is the CLT page: Central Limit Theorem.

Regards,

Stephanie

This information is worth everyone’s attention. How can I find out more?

This is my first visit here, and I am truly happy to read everything in one place.

hello im joan rose…is this title enough?? “application of spiral curriculum on Statistics and Probability”

i am conducting a research and my study is all about the aforementioned study…..

can i have to revised it or not…..please suggest another title…

thank you very much

im sorry….do i have to revised it or not???

(my previous question is wrong grammar)

Hi, Joan,

That title sounds fine by me, although only you know your study well enough to know if it is truly suitable :)

Stephanie

I need a write-up on some of the Burr distributions, especially types III, IV, and V. Can you kindly help?

Thank you.

As you probably are aware, there’s very little research into Burr distributions so many are “unknowns.” That said, III, IV and V are really just derivations of other distributions like the Rayleigh. See: Burr Distributions

Dear Dr Laxmisri,

Thanks for this simple explanation. I have a question concerning this issue.

In my last research paper I developed a new quantitative analytical assay (detection method) for the detection and quantification of Hepatitis C Virus. I have the virus number for each sample using this new method (I ran each sample in triplicate and got the mean for each sample).

To confirm, that my new method is working well, I have done quantification for the same samples in triplicate also, using the standard method that are usually used in the laboratories, and also got the mean for each sample.

There is no statistical difference between the readings between the two methods, which is fine.

Now, I want to calculate the RSD or CV for my method to prove that the method is reproducible.

Here is the problem: I have negative and positive samples. The result for the negative samples in the viral quantification is zero, not a number. I don’t know what to do for the calculation.

I have measured the CV for my new method using Origin software, by entering the means of all the samples (except the negative samples, because they were zero; I don’t know if this was right or not) in one column in Origin, and the CV was 1.2. Actually, I don’t know if this is right, and if it is, whether 1.2 means that the method is reproducible or whether this result is good. Note: when I added the negative samples, whose values are zero, the CV increased dramatically, so I am confused now.

could I try it on SPSS software or not?

By the way, I have done a ROC curve for the method to get the specificity, sensitivity and the limit of detection of the assay. Is there any other method to calculate the limit of detection or the cut-off value?

Can I use one-way ANOVA or a t-test?

Thanks for your support

Regards

Hello, Sherif, I think you may have posted in the wrong place, as this comment was meant for “Dr Laxmisri” and there is no one of that name here. I’m not sure where you meant to post it. Best of luck!

Hello Andale,

I am very sorry it was a typing error.

Could you kindly help with the issue described in my comment above?

CV is just a ratio, so it wouldn’t matter if you have + or -. The CV or RSD do not prove that your method is reproducible. You need to run a hypothesis test for that. As to “can i use one way anova or t-test”, that depends on a LOT of factors. The two tests are quite different.
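For reference, the CV is commonly computed as the sample standard deviation divided by the mean. The sketch below uses made-up readings and a hypothetical helper name, `coefficient_of_variation`. It illustrates two points from the question: the ratio is undefined when the mean is zero, and pooling zero readings with positive readings inflates the CV.

```python
import statistics


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (often reported as a %)."""
    mean = statistics.mean(values)
    if mean == 0:
        # The ratio has no meaning when the mean is zero.
        raise ValueError("CV is undefined when the mean is zero")
    return statistics.stdev(values) / mean


# Hypothetical triplicate readings for one positive sample.
triplicate = [98.0, 100.0, 102.0]
cv_single = coefficient_of_variation(triplicate)

# The same readings pooled with two zero (negative) readings.
pooled = [0.0, 0.0] + triplicate
cv_pooled = coefficient_of_variation(pooled)

print(f"CV of the triplicate: {cv_single:.1%}")
print(f"CV with zeros pooled in: {cv_pooled:.1%}")
```

Computing the CV within each sample’s replicates, rather than across the means of different samples, keeps real differences between samples from being counted as measurement noise.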

Hi, I am taking elementary statistics and I do not understand the binomial probability distribution at all. Any tips?

Start here :)

I don’t know which section this belongs in. Say a ticket costs $2 for a 33% chance of a win, and $1 for an 18% chance of a win. Each ticket pays the same, so the goal is to get the highest number of ‘win’ tickets. Which is the better buy?

The $2 ticket. I’m assuming you want to spend the same amount of money? Let’s say you spend $4.

If you buy two $2 tickets, the probability that both will win is .33 × .33 ≈ .11.

If you buy four $1 tickets, the probability that all four are winners is .18^4 ≈ .001.
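The comparison can also be made on other measures. The sketch below assumes each ticket wins independently and a $4 budget, as in the reply above. The function names are hypothetical. It prints the expected number of winning tickets, the probability that every ticket wins, and the probability of at least one win.

```python
def expected_wins(n_tickets, p_win):
    """Expected number of winning tickets among n independent tickets."""
    return n_tickets * p_win


def prob_all_win(n_tickets, p_win):
    """Probability that every one of n independent tickets wins."""
    return p_win ** n_tickets


def prob_at_least_one_win(n_tickets, p_win):
    """Probability that at least one of n independent tickets wins."""
    return 1 - (1 - p_win) ** n_tickets


BUDGET = 4  # dollars, as in the reply above
options = {"$2 tickets": (2, 0.33), "$1 tickets": (1, 0.18)}

for name, (price, p_win) in options.items():
    n = BUDGET // price
    print(
        f"{name}: n={n}, "
        f"expected wins={expected_wins(n, p_win):.2f}, "
        f"P(all win)={prob_all_win(n, p_win):.4f}, "
        f"P(at least one win)={prob_at_least_one_win(n, p_win):.4f}"
    )
```

Which ticket looks better depends on the measure. The $2 tickets come out ahead on the probability that all tickets win, while on the expected number of winning tickets the four $1 tickets are slightly ahead (0.72 versus 0.66).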

Good evening! I want to ask about DEFF. I’ll be using cluster sampling for my research among university students. From 16 faculties, 3 faculties will be randomly selected, and 1 program will be selected under each selected faculty. All students from the selected programs will be invited. May I know if I can just estimate the DEFF value? Can I just estimate that my sample size will be 1.3 times higher than if I used simple random sampling? Thank you in advance!

Why estimate the value? Why not use the formula, as it’s pretty straightforward?

DEFF Formula.

As for whether it’s OK just to ballpark sample sizes, it would depend on what you’re using it for. If you’re trying to get published, for example, you probably don’t want to ballpark :)
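The linked formula is not reproduced in this thread. One commonly used approximation for cluster samples is DEFF = 1 + (m − 1) × ρ, where m is the average cluster size and ρ is the intraclass correlation. The sketch below assumes that approximation. Every number in it is a made-up placeholder, not a value from the study described above.

```python
import math


def design_effect(avg_cluster_size, icc):
    """Approximate design effect: DEFF = 1 + (m - 1) * rho."""
    return 1 + (avg_cluster_size - 1) * icc


def cluster_sample_size(n_srs, deff):
    """Inflate a simple-random-sampling sample size by the design effect."""
    return math.ceil(n_srs * deff)


# Hypothetical inputs: 31 students per cluster, intraclass correlation 0.01.
deff = design_effect(avg_cluster_size=31, icc=0.01)

# Hypothetical sample size that simple random sampling would have needed.
n_needed = cluster_sample_size(n_srs=385, deff=deff)

print(f"DEFF = {deff:.2f}")
print(f"sample size under cluster sampling = {n_needed}")
```

With these placeholder inputs the design effect happens to be 1.3, but larger clusters or a higher intraclass correlation would push it well above that, which is one reason to compute it rather than assume it.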

Good day sir.

I am doing my thesis and I’m going to cite you as my resource on how to test retest validity (thank you for that straightforward article). I can’t find your complete name. Do you mind disclosing your name?

Hi, Moira,

My name is Stephanie Glen.

Good luck with your thesis,

Regards,

S

Andale, thank you so much for your simple and informative definition of the null hypothesis. I am attaining my master’s in forensic psychology and doing a report on false positives in the DNA evidence used in courtroom trials. I hate, loathe, despise statistics, but you explained it in such an easy way, I didn’t feel stupid. Thank you…. TDE MSRRT

I just bought your book. Great read. Can you revisit the Bayes problem? In the prescription pill example, you state that a probability is 5% (.05) but used .5 in the solution, yielding 16%.

Did I miss something? Please send or post the corrected solution. I think it’s on page 74.

Regards

Thanks. This is a great site. I am glad I found you. I am interested in becoming a statistician, so I will be using your site a lot. How can I benefit fully from your site? Is there a way I can download the material I need to study and practice with?

Thanks

Jonathan

Hi, Paula,

Thanks for spotting that. It’s corrected on the site (http://www.statisticshowto.com/bayes-theorem-problems/) and I will fix it in the upcoming new edition of the book.

Regards,

Stephanie

Would like to point out a historical inaccuracy on the following page:

http://www.statisticshowto.com/what-is-the-null-hypothesis/

This page repeats the common misconception that Copernicus was somehow the first to suggest that the earth was not flat. The sphericity of the earth has in fact been commonly accepted since classical antiquity. Copernicus’ revolutionary idea rather pertained to a completely different issue: the issue of geocentrism (the erroneous classical viewpoint that the sun revolved around the earth) versus heliocentrism (the Copernican viewpoint that the earth revolves around the sun).

Hello, Elbert,

I’m going to politely disagree. While I agree he did argue for the sun-at-the-center-of-the-universe theory, he also argued for the round Earth theory. I don’t state that Copernicus was the first, just that he was one of several scientists. Here is my source.

Stephanie