Population Genetics. Matthew B. Hamilton
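In more general terms, the expected frequency of an event, p, times the number of trials or samples, n, gives the expected number of events, np. To test the hypothesis that p is the frequency of an event in an actual population, we compare np with the observed number of events using the chi-squared (χ²) statistic:
$$\chi^2 = \sum \frac{(\text{observed} - \text{expected})^2}{\text{expected}} \tag{2.7}$$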
where ∑ (pronounced “sigma”) indicates taking the sum of multiple terms.
The χ² formula makes intuitive sense. The numerator is the difference between the observed and the Hardy–Weinberg expected number of individuals. This difference is squared, as in a variance, because we care only about the magnitude of the difference and not its direction. The denominator then divides by the expected number of individuals to make the squared difference relative. For example, a squared difference of 4 is small if the expected number is 100 (it is 4%) but relatively larger if the expected number is 8 (it is 50%). Adding all of these relative squared differences gives the total relative squared deviation observed over all genotypes.
(2.8)
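The calculation described above can be sketched in a few lines of Python. This is a minimal illustration and not the book's own code: the function name `hardy_weinberg_chi2` and the genotype counts are hypothetical, chosen only to give round numbers, and the allele frequency is assumed to be estimated from the sample by counting alleles.

```python
def hardy_weinberg_chi2(n_AA, n_Aa, n_aa):
    """Chi-squared statistic for observed genotype counts at a two-allele locus.

    Assumes the frequency of allele A is estimated from the same sample and
    that both alleles are present (otherwise an expected count is zero).
    """
    n = n_AA + n_Aa + n_aa                     # number of individuals sampled
    p = (2 * n_AA + n_Aa) / (2 * n)            # estimated frequency of allele A
    q = 1 - p                                  # frequency of the other allele
    observed = [n_AA, n_Aa, n_aa]
    expected = [p * p * n, 2 * p * q * n, q * q * n]  # Hardy-Weinberg counts
    # Sum of (observed - expected)^2 / expected over the three genotypes.
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    return chi2, expected


# Hypothetical counts for illustration only (not data from the text).
chi2, expected = hardy_weinberg_chi2(30, 60, 10)
print("expected counts:", expected)
print("chi-squared:", chi2)
```

With these made-up counts the estimated allele frequency is 0.6, the expected counts are close to 36, 48, and 16, and the three terms of the sum are about 1, 3, and 2.25. The heterozygote class contributes the most because its squared difference is largest relative to its expected count.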
We need to compare our statistic to values from the χ² distribution. But first we need to know how much information, called the degrees of freedom (commonly abbreviated as df), was used to estimate the χ² statistic. In general, degrees of freedom are based on the number of categories of data: df = (number of classes compared) − (number of parameters estimated) − 1, where the final 1 is subtracted for the χ² test itself. In this case, df = 3 − 1 − 1 = 1 for three genotypes and one estimated allele frequency (with two alleles, the other allele frequency is fixed once the first has been estimated).
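The degrees-of-freedom rule can be written as a short function. The sketch below uses hypothetical function names. The extension to more than two alleles is an assumption that goes beyond the text: it supposes that every possible genotype is treated as its own class and that all but one of the allele frequencies are estimated from the sample.

```python
def chi2_degrees_of_freedom(n_classes, n_estimated_params):
    """df = classes compared - parameters estimated - 1 for the test itself."""
    return n_classes - n_estimated_params - 1


def hardy_weinberg_df(n_alleles):
    """Degrees of freedom for a Hardy-Weinberg test at one locus.

    Assumption (not stated in the text): each of the k(k + 1)/2 possible
    genotypes is one class, and k - 1 allele frequencies are estimated.
    """
    n_genotypes = n_alleles * (n_alleles + 1) // 2
    n_free_allele_freqs = n_alleles - 1
    return chi2_degrees_of_freedom(n_genotypes, n_free_allele_freqs)


print(chi2_degrees_of_freedom(3, 1))  # three genotypes, one estimated frequency
print(hardy_weinberg_df(2))           # the two-allele case from the text
```

Both calls reproduce the 3 − 1 − 1 = 1 result for the two-allele case.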
Figure 2.9 shows a χ² distribution for one degree of freedom. Small deviations of the observed from the expected are more probable since they leave more area of the distribution to the right of the χ² value. As the χ² value gets larger, the probability that the difference between the observed and expected is just due to chance sampling decreases (the area under the curve to the right gets smaller). Another way of saying this is that as the observed and expected get increasingly different, it becomes more improbable that our null hypothesis of Hardy–Weinberg is actually the process that is determining genotype frequencies.
Using Table 2.5, we see that a χ² value of 7.46 with 1 df has a probability between 0.01 and 0.001. The conclusion is that a deviation this large or larger would be observed less than 1% of the time in a sample from a population that actually had Hardy–Weinberg expected genotype frequencies. Under the null hypothesis, we do not expect this much difference or more from Hardy–Weinberg expectations to occur often.
By convention, we reject chance as the explanation for the differences if the χ² value has a probability of 0.05 or less. In other words, if chance explains the difference in five or fewer trials out of 100, then we reject the hypothesis that the observed and expected patterns are the same. The critical value above which we reject the null hypothesis for a χ² test with 1 df is 3.84, or in notation $\chi^2_{0.05,\,1} = 3.84$. In this case, we can clearly see an excess of heterozygotes and deficits of homozygotes, and employing the χ² test allows us to conclude that Hardy–Weinberg expected genotype frequencies are not present in the population.
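As a check on the table lookup, the right-tail probability for one degree of freedom can be computed directly with the Python standard library. The sketch relies on a standard identity that is not part of the text: a χ² variable with 1 df is the square of a standard normal variable, so its right-tail area equals `erfc(sqrt(x / 2))`. The function name `chi2_right_tail_1df` is hypothetical, and the shortcut applies only to 1 df.

```python
import math


def chi2_right_tail_1df(x):
    """Area under the 1-df chi-squared curve to the right of x.

    Valid only for one degree of freedom: if Z is standard normal, then
    P(Z**2 >= x) = P(|Z| >= sqrt(x)) = erfc(sqrt(x / 2)).
    """
    return math.erfc(math.sqrt(x / 2.0))


p_value = chi2_right_tail_1df(7.46)        # statistic from the MN example
p_critical = chi2_right_tail_1df(3.8415)   # critical value from Table 2.5
reject_null = p_value <= 0.05              # conventional 5% threshold

print(f"P(chi2 >= 7.46)   = {p_value:.4f}")     # roughly 0.006
print(f"P(chi2 >= 3.8415) = {p_critical:.4f}")  # very close to 0.05
print("reject Hardy-Weinberg null:", reject_null)
```

The computed probability for 7.46 falls between 0.001 and 0.01, in agreement with the bracket read from Table 2.5. For other degrees of freedom a statistics library would be the usual choice, since this closed form does not apply.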
Figure 2.9 A χ² distribution with one degree of freedom. The χ² value for the Hardy–Weinberg test with MN blood group genotypes as well as the critical value to reject the null hypothesis are shown. The area under the curve to the right of the arrow indicates the probability of observing that much or more difference between the observed and expected outcomes.
Table 2.5 χ² values and associated cumulative probabilities in the right‐hand tail of the distribution for one through five degrees of freedom.
df | P = 0.5 | P = 0.25 | P = 0.10 | P = 0.05 | P = 0.01 | P = 0.001 |
---|---|---|---|---|---|---|
1 | 0.4549 | 1.3233 | 2.7055 | 3.8415 | 6.6349 | 10.8276 |
2 | 1.3863 | 2.7726 | 4.6052 | 5.9915 | 9.2103 | 13.8155 |
3 | 2.3660 | 4.1083 | 6.2514 | 7.8147 | 11.3449 | 16.2662 |
4 | 3.3567 | 5.3853 | 7.7794 | 9.4877 | 13.2767 | 18.4668 |
5 | 4.3515 | 6.6257 | 9.2364 | 11.0705 | 15.0863 |