Biostatistics Decoded. A. Gouveia Oliveira

is a simple matter to calculate the frequency with which each of these results will appear. We write down all possible combinations of males and females that can be obtained in samples of four, and count in how many cases there are 0, 1, 2, 3, and 4 females. In this example, there are 16 possible outcomes. There is only one way of having 0 females, so the theoretical relative frequency of this outcome is once out of 16 outcomes, or 6.25%. There are four possible ways out of 16 of having 25% of females, which is when the first, or the second, or the third, or the fourth sampled individual is a female. Hence, the relative frequency of this outcome, at least theoretically, is 25%. There are six possible ways of having 50% of females, so the relative frequency of this outcome is 37.5%. There are four possible ways of having 75% of females, so the frequency of this result is 25%. Finally, there is only one possible way of having 100% of females, and the relative frequency of this result is 6.25%.
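
The counting described above can be reproduced by brute-force enumeration. The following is a minimal sketch, assuming (as the paragraph implicitly does) that males and females are equally likely, so that all 16 ordered outcomes have the same probability. The labels "M" and "F" and the variable names are illustrative choices, not taken from the book.

```python
from collections import Counter
from itertools import product

# Enumerate every ordered sample of four individuals ("M" = male, "F" = female).
outcomes = list(product("MF", repeat=4))

# Count how many of the outcomes contain 0, 1, 2, 3 and 4 females.
ways = Counter(sample.count("F") for sample in outcomes)

total = len(outcomes)
for females in range(5):
    share = 100 * ways[females] / total
    print(f"{females} females: {ways[females]} of {total} outcomes ({share:.2f}%)")
```

The printed counts should match the 1, 4, 6, 4, 1 pattern given in the text.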

An illustration of the phenomenon of sampling variation. The pie charts show the observed proportions of a binary variable in random samples of size n, and the graph below them shows the distribution of sample proportions over a large number of random samples.

      Therefore, with interval attributes we know the probability distribution of sample means only when the sample sizes are large or the attribute has a normal distribution. By contrast, with binary attributes we always know what the probability distribution of sample proportions is: it is the binomial distribution.

$$P(k) = \frac{n!}{k!\,(n-k)!}\,\pi^{k}\,(1-\pi)^{n-k}$$

Bar chart depicting the probability distribution of a proportion, which is the binomial distribution.

      We can use the formula to make the above calculations. For example, to calculate the probability of having k = 3 women in a sample of n = 4 observations, assuming that the proportion of women in the population is π = 0.5:

$$P(k=3) = \frac{4!}{3!\,(4-3)!}\times 0.5^{3}\times (1-0.5)^{4-3} = 4 \times 0.125 \times 0.5 = 0.25$$

      as before.
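
The same calculation can be checked in a few lines of code. This is a minimal sketch of the standard binomial probability formula, using k, n, and π as defined in the text. The function name `binomial_pmf` is a hypothetical helper introduced here for illustration.

```python
from math import comb

def binomial_pmf(k: int, n: int, pi: float) -> float:
    """Probability of exactly k occurrences in n independent observations,
    each with probability pi of showing the attribute."""
    return comb(n, k) * pi**k * (1 - pi) ** (n - k)

# Probability of k = 3 women in a sample of n = 4 when pi = 0.5.
p_three_women = binomial_pmf(3, 4, 0.5)
print(p_three_women)
```

Calling `binomial_pmf(k, 4, 0.5)` for k = 0 to 4 gives the whole distribution obtained earlier by counting outcomes.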

      Since the means of binary attributes in random samples follow a probability distribution, we can calculate the mean and the variance of sample proportions in the same way as we did with interval‐scaled attributes. If we view a sample proportion as the sum of single observations from binary variables with identical distribution, divided by the sample size, then the properties of means allow us to conclude that the mean of the distribution of sample proportions is equal to the population proportion of the attribute.

      By the same reasoning, we conclude that the variance of sample proportions must be the population variance of a binary attribute (the product of the probability of each value), divided by the sample size. If we call π the probability of an attribute having the value 1 (or, if we prefer, the proportion of the population having the attribute) and n the sample size, the variance of sample proportions is, therefore:

$$\operatorname{var}(\text{sample proportion}) = \frac{\pi\,(1-\pi)}{n}$$
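
Both results, a mean equal to π and a variance equal to π(1 − π)/n, can be verified numerically from the binomial probabilities themselves. The sketch below is one way to do this, under the assumption that the sample proportion is k/n with k following the binomial distribution. The helper name `proportion_moments` and the values n = 4 and π = 0.5 are illustrative choices.

```python
from math import comb, isclose

def proportion_moments(n, pi):
    """Exact mean and variance of the sample proportion k/n,
    computed by weighting each possible value by its binomial probability."""
    probs = [comb(n, k) * pi**k * (1 - pi) ** (n - k) for k in range(n + 1)]
    mean = sum((k / n) * p for k, p in enumerate(probs))
    variance = sum((k / n - mean) ** 2 * p for k, p in enumerate(probs))
    return mean, variance

n, pi = 4, 0.5
mean, variance = proportion_moments(n, pi)

print("mean of sample proportions:", mean, "| population proportion:", pi)
print("variance of sample proportions:", variance, "| pi(1 - pi)/n:", pi * (1 - pi) / n)
```

Changing n and π should leave each pair of printed values in agreement, up to floating-point rounding.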

      To sum up, let us review what can be said about the distribution of means of random samples of binary variables:

       The distribution of the sample proportions is always known, and is called the binomial distribution.

       The mean of the distribution of sample proportions is equal to the population proportion of the attribute.

       The standard error of sample proportions is equal to the square root of the product of the probability of each value divided by the sample size.
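
The three statements above can also be illustrated by simulation. The following is a rough sketch, not a method from the book: it draws many random samples from a hypothetical population in which the proportion with the attribute is 0.3, and compares the mean and standard deviation of the resulting sample proportions with the theoretical values. The sample size, the number of samples, the seed, and the function name `simulate_proportions` are all assumptions made for the example.

```python
import random
from math import sqrt
from statistics import mean, pstdev

def simulate_proportions(n, pi, n_samples, seed=0):
    """Draw n_samples random samples of size n from a binary population
    with proportion pi, and return the proportion observed in each sample."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    return [
        sum(rng.random() < pi for _ in range(n)) / n
        for _ in range(n_samples)
    ]

n, pi = 25, 0.3
proportions = simulate_proportions(n, pi, n_samples=20_000)

theoretical_se = sqrt(pi * (1 - pi) / n)
print("mean of sample proportions:", mean(proportions), "| expected:", pi)
print("standard deviation of sample proportions:", pstdev(proportions),
      "| expected standard error:", theoretical_se)
```

Because the samples are random, the simulated values only approximate the theoretical ones. The agreement should tighten as the number of samples grows.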

Graphs depict the convergence of the binomial to the normal distribution.
