Statistical Analysis with Excel For Dummies. Joseph Schmuller


When statisticians make decisions, they express their confidence about those decisions in terms of probability. They can never be certain about what they decide; they can only tell you how probable their conclusions are.

      So, what is probability? The best way to attack this is with a few examples. If you toss a coin, what's the probability that it comes up heads? Intuitively, you know that if the coin is fair, you have a 50-50 chance of heads and a 50-50 chance of tails. In terms of the kinds of numbers associated with probability, that’s ½.

      How about rolling a die? (That’s one member of a pair of dice.) What’s the probability that you roll a 3? Hmm… a die has six faces and one of them is 3, so that ought to be 1⁄6, right? Right.

      Here’s one more. You have a standard deck of playing cards. You select one card at random. What’s the probability that it’s a club? Well, a deck of cards has four suits, so that answer is ¼.
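The book works these examples by hand; as an illustrative aside (not from the book, which uses Excel), here is a minimal Python sketch of the same three probabilities, using the standard library's Fraction class so the answers stay exact:

```python
from fractions import Fraction

# Each probability is (favorable outcomes) / (total equally likely outcomes).
p_heads = Fraction(1, 2)    # 1 "heads" face out of 2 coin faces
p_three = Fraction(1, 6)    # 1 "3" face out of 6 die faces
p_club  = Fraction(13, 52)  # 13 clubs out of 52 cards; reduces to 1/4

print(p_heads, p_three, p_club)  # 1/2 1/6 1/4
```

Fraction reduces 13⁄52 to 1⁄4 automatically, matching the answer in the text.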

      Things can get a bit more complicated. When you toss a die, what’s the probability you roll a 3 or a 4? Now you're talking about two ways the event you're interested in can occur, so that's 2⁄6, or 1⁄3. What about the probability of rolling an even number? That has to be 2, 4, or 6 — three of the six faces — and the probability is 3⁄6, or 1⁄2.
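Counting favorable faces is easy to check by brute force. This quick Python sketch (my own illustration, not the book's) counts the favorable outcomes directly:

```python
from fractions import Fraction

die_faces = range(1, 7)  # the six faces: 1 through 6

# P(3 or 4): count the favorable faces, divide by six
p_3_or_4 = Fraction(sum(1 for f in die_faces if f in (3, 4)), 6)

# P(even): the favorable faces are 2, 4, and 6
p_even = Fraction(sum(1 for f in die_faces if f % 2 == 0), 6)

print(p_3_or_4, p_even)  # 1/3 1/2
```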

      On to another kind of probability question. Suppose you roll a die and toss a coin at the same time. What's the probability you roll a 3 and the coin comes up heads? Consider all the possible events that can occur when you roll a die and toss a coin at the same time. The outcome can be a head and 1-6 or a tail and 1-6. That's a total of 12 possibilities. The head-and-3 combination can happen only one way, so the answer is 1⁄12.
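You can verify the 12-outcome count by enumerating every die-and-coin combination. This is an illustrative sketch of mine, not anything from the book:

```python
from itertools import product
from fractions import Fraction

# Every equally likely outcome of rolling a die and tossing a coin together
outcomes = list(product(range(1, 7), ["H", "T"]))  # 6 faces x 2 sides = 12

# Only one outcome is "rolled a 3 and the coin came up heads"
favorable = [o for o in outcomes if o == (3, "H")]

p = Fraction(len(favorable), len(outcomes))
print(len(outcomes), p)  # 12 1/12
```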

      In general, the formula for the probability that a particular event occurs is

      pr(event) = (number of ways the event can occur) ÷ (total number of possible events)
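That formula is a one-liner in code. As a hedged sketch (the function name probability is my own, not the book's), the general rule and the earlier examples look like this:

```python
from fractions import Fraction

def probability(ways_event_can_occur: int, total_possible_events: int) -> Fraction:
    """pr(event) = (ways the event can occur) / (total possible events)."""
    return Fraction(ways_event_can_occur, total_possible_events)

print(probability(1, 6))   # 1/6  -> rolling a 3
print(probability(3, 6))   # 1/2  -> rolling an even number
print(probability(1, 12))  # 1/12 -> a 3 and heads together
```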
      I begin this section by saying that statisticians express their confidence about their decisions in terms of probability, which is really why I brought up this topic in the first place. This line of thinking leads me to conditional probability — the probability that an event occurs given that some other event occurs. For example, suppose I roll a die, take a look at it (so that you can't see it), and tell you I’ve rolled an even number. What’s the probability that I've rolled a 2? Ordinarily, the probability of a 2 is 1⁄6, but I’ve narrowed the field. I’ve eliminated the three odd numbers (1, 3, and 5) as possibilities. In this case, only the three even numbers (2, 4, and 6) are possible, so now the probability of rolling a 2 is 1⁄3.
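Conditioning just shrinks the set of possible outcomes before you count. Here's a small Python illustration of the die example (mine, not the book's): once you know the roll is even, only three faces remain in play.

```python
from fractions import Fraction

faces = range(1, 7)

# Condition on "the roll is even": only 2, 4, and 6 remain possible.
evens = [f for f in faces if f % 2 == 0]

# P(roll is 2 | roll is even) = favorable outcomes within the condition
p_2_given_even = Fraction(sum(1 for f in evens if f == 2), len(evens))
print(p_2_given_even)  # 1/3
```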

      Exactly how does conditional probability play into statistical analysis? Read on.

      In advance of doing a study, a statistician draws up a tentative explanation — a hypothesis — of why the data might come out a certain way. After the study is complete and the sample data are all tabulated, the statistician faces the essential decision every statistician has to make: whether or not to reject the hypothesis.

      That decision is wrapped in a conditional probability question — what’s the probability of obtaining the sample data, given that this hypothesis is correct? Statistical analysis provides tools to calculate the probability. If the probability turns out to be low, the statistician rejects the hypothesis.

      Suppose, for example, that your hypothesis is that a coin is fair, and you toss it 100 times. If it turns out to be 99 heads and 1 tail, you’d undoubtedly reject the fair coin hypothesis. Why? The conditional probability of getting 99 heads and 1 tail given a fair coin is very low. Wait a second. The coin could still be fair and you just happened to get a 99-1 split, right? Absolutely. In fact, you never really know. You have to gather the sample data (the results from 100 tosses) and make a decision. Your decision might be right, or it might not.
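Just how low is "very low"? Under a fair coin, each specific number of heads follows the binomial formula C(n, k) · p^k · (1 − p)^(n − k). This Python sketch (my illustration; the book computes such probabilities in Excel) evaluates it for 99 heads in 100 tosses:

```python
from math import comb

# P(exactly 99 heads in 100 tosses | fair coin)
# = C(100, 99) * (1/2)^99 * (1/2)^1 = 100 / 2^100
n, k, p = 100, 99, 0.5
prob = comb(n, k) * p**k * (1 - p)**(n - k)

print(prob)  # roughly 7.9e-29: astronomically unlikely under the fair-coin hypothesis
```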

      Juries face this dilemma all the time. They have to decide among competing hypotheses that explain the evidence in a trial. (Think of the evidence as data.) One hypothesis is that the defendant is guilty. The other is that the defendant is not guilty. Jury members have to consider the evidence and, in effect, answer a conditional probability question: What’s the probability of the evidence given that the defendant is not guilty? The answer to this question determines the verdict.

      Null and alternative hypotheses

      Consider once again the coin tossing study I mention in the preceding section. The sample data are the results from the 100 tosses. Before tossing the coin, you might start with the hypothesis that the coin is a fair one so that you expect an equal number of heads and tails. This starting point is called the null hypothesis. The statistical notation for the null hypothesis is H0. According to this hypothesis, any heads-tails split in the data is consistent with a fair coin. Think of it as the idea that nothing in the results of the study is out of the ordinary.

      An alternative hypothesis is possible: The coin isn’t a fair one, and it's loaded to produce an unequal number of heads and tails. This hypothesis says that any heads-tails split is consistent with an unfair coin. The alternative hypothesis is called, believe it or not, the alternative hypothesis. The statistical notation for the alternative hypothesis is H1.

      

Notice that I did not say “accept H0.” The way the logic works, you never accept a hypothesis. You either reject H0 or don't reject H0.

      Here’s a real-world example to help you understand this idea. Whenever a defendant goes on trial, that person is presumed innocent until proven guilty. Think of innocent as H0. The prosecutor’s job is to convince the jury to reject H0. If the jurors reject H0, the verdict is guilty.
