Biostatistics Decoded. A. Gouveia Oliveira


If two variables have the value 1 and two have the value 2, then the sum will be 6, and this may occur in six different ways. If one variable has value 1 and three have value 2, then the result will be 7 and this may occur in four different ways. Finally, if all four variables have value 2, the result will be 8 and this can occur in only one way.

An illustration of the origin of the normal distribution.
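The counts quoted above can be checked by listing every possible outcome. The following is a minimal sketch, assuming four variables that each take the value 1 or 2 with equal probability; the function name `count_sums` is ours, not the book's.

```python
from collections import Counter
from itertools import product


def count_sums(n_variables, values=(1, 2)):
    """Count how many equally likely outcomes give each possible sum."""
    outcomes = product(values, repeat=n_variables)
    return Counter(sum(outcome) for outcome in outcomes)


ways = count_sums(4)
for total in sorted(ways):
    # Prints each possible sum next to the number of ways it can occur.
    print(total, ways[total])
```

With four variables there are 16 equally likely outcomes, so dividing each count by 16 gives the probability of that sum.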

      If we repeat the experiment with not two, but a much larger number of variables, the variable that results from adding all those variables will have not just five different values, but many more. Consequently, the graph will be smoother and more bell‐shaped. The same will happen if we add variables taking more than two values.

      As the number of variables grows without limit, the number of values that their sum can take also grows without limit, and the graph of its probability distribution approaches a perfectly smooth curve. This curve is called the normal curve. It is also called the Gaussian curve, after the German mathematician Carl Friedrich Gauss, who described it.

      What was presented in the previous section is known as the central limit theorem. This theorem states that the sum of a large number of independent variables with identical distribution (and finite variance) has, approximately, a normal distribution, and that the approximation improves as the number of variables increases. The central limit theorem plays a major role in statistical theory, and the following experiment illustrates how the theorem operates.

      With a computer, we generated random numbers between 0 and 1, obtaining observations from two continuous variables with the same distribution. The variables had a uniform probability distribution, which is a probability distribution where all values in the range (here, 0 to 1) are equally likely.

Graphs depict the frequency distribution of sums of identical variables with uniform distribution.

      Notice that the more variables we add together, the more the shape of the frequency distribution approaches the normal curve. The fit is already fair for the sum of four variables. This result is a consequence of the central limit theorem.
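A similar experiment can be sketched in a few lines. This is only an illustration under our own assumptions, not the procedure used to produce the graphs: the number of observations, the seed, and the helper names `simulate_sums` and `share_within_one_sd` are hypothetical choices. As a rough measure of fit, the code reports the share of observations lying within one standard deviation of the mean, which is about 0.68 for a normal curve.

```python
import random
import statistics


def simulate_sums(n_variables, n_observations=20000, seed=1):
    """Draw n_observations sums of n_variables uniform(0, 1) random numbers."""
    rng = random.Random(seed)
    return [
        sum(rng.random() for _ in range(n_variables))
        for _ in range(n_observations)
    ]


def share_within_one_sd(values):
    """Fraction of observations no further than one SD from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    inside = sum(1 for v in values if abs(v - mean) <= sd)
    return inside / len(values)


for n in (1, 2, 4, 8):
    sums = simulate_sums(n)
    # Columns: number of variables added, mean, variance, share within 1 SD.
    print(
        n,
        round(statistics.fmean(sums), 3),
        round(statistics.pvariance(sums), 3),
        round(share_within_one_sd(sums), 3),
    )
```

The mean and variance of the sum should come out close to n/2 and n/12, and the share within one standard deviation should move from about 0.58 for a single uniform variable toward 0.68 as more variables are added.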

      The normal distribution has many interesting properties, but we will present just a few of them. They are very simple to understand and, occasionally, we will have to call on them further on in this book.

      First property. The normal curve is a function solely of the mean and the variance. In other words, given only the mean and the variance of a normal distribution, we can find all the values of the distribution and plot its curve using the equation of the normal curve (technically, that equation is called the probability density function). This means that, for normally distributed attributes, we can completely describe the distribution by using only the mean and the variance (or, equivalently, the standard deviation). This is the reason why the mean and the variance are called the parameters of the normal distribution, and it is what makes these two summary measures so important. It also means that if two normally distributed variables have the same variance, then the shape of their distributions will be the same; if they have the same mean, their position on the horizontal axis will be the same.
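To make the first property concrete, here is a minimal sketch of the equation of the normal curve written as a function of nothing but the mean and the variance. The function names and the example values (means of 0 and 5, variance of 4) are illustrative assumptions of ours.

```python
import math
from statistics import NormalDist


def normal_pdf(x, mean, variance):
    """Height of the normal curve at x, given only the mean and the variance."""
    sd = math.sqrt(variance)
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def curve_points(mean, variance, n_points=9, width=4.0):
    """(x, height) pairs from mean - width*SD to mean + width*SD."""
    sd = math.sqrt(variance)
    step = 2.0 * width * sd / (n_points - 1)
    xs = [mean - width * sd + i * step for i in range(n_points)]
    return [(x, normal_pdf(x, mean, variance)) for x in xs]


# Same variance, different means: same shape, different position.
curve_a = curve_points(mean=0.0, variance=4.0)
curve_b = curve_points(mean=5.0, variance=4.0)
for (xa, ha), (xb, hb) in zip(curve_a, curve_b):
    print(round(xa, 2), round(xb, 2), round(ha, 5), round(hb, 5))
```

Each printed row pairs a point on the first curve with the corresponding point on the second: the x values differ by the difference between the means, while the heights match. The standard library class `statistics.NormalDist` offers the same calculation through its `pdf` method.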

      Third property. Adding a constant to, or subtracting a constant from, a normally distributed variable results in a new variable with a normal distribution. According to the properties of means and variances, the constant is added to or subtracted from the mean, and the variance does not change (Figure 1.26).

      Fourth property. The multiplication, or division, of the values of a normally distributed variable by a constant will result in a new variable with a normal distribution. Because of the properties of means and variances, its mean will be multiplied, or divided, by that constant and its variance will be multiplied, or divided, by the square of that constant (Figure 1.26).
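The effect of the third and fourth properties on the mean and the variance can be checked numerically. The sketch below uses hypothetical values (a variable with mean 10 and standard deviation 2, and a constant of 3) and verifies only the mean and variance, not the normality of the resulting variables.

```python
import random
import statistics


def summarize(values):
    """Return the (mean, variance) of a list of observations."""
    return statistics.fmean(values), statistics.pvariance(values)


rng = random.Random(7)
# Hypothetical normally distributed variable: mean 10, SD 2 (variance 4).
x = [rng.gauss(10.0, 2.0) for _ in range(50000)]
c = 3.0  # hypothetical constant

shifted = [v + c for v in x]  # third property: add a constant
scaled = [v * c for v in x]   # fourth property: multiply by a constant

mean_x, var_x = summarize(x)
mean_shifted, var_shifted = summarize(shifted)
mean_scaled, var_scaled = summarize(scaled)

print("original:", round(mean_x, 3), round(var_x, 3))
print("shifted: ", round(mean_shifted, 3), round(var_shifted, 3))
print("scaled:  ", round(mean_scaled, 3), round(var_scaled, 3))
```

Adding the constant moves the mean by 3 and leaves the variance unchanged, while multiplying by the constant multiplies the mean by 3 and the variance by 9, the square of the constant.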
