Biostatistics Decoded. A. Gouveia Oliveira



      So what would be the consequences of that change of perspective? From this point of view, a sample mean corresponds to the sum of a large number of observations from identically distributed variables, each observation being divided by a constant, the sample size. Under these circumstances the central limit theorem applies, and we must therefore conclude that the means of large samples have an approximately normal distribution, regardless of the distribution of the attribute being studied.
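      A minimal simulation sketch of this claim follows. It is not from the book: the exponential attribute, the sample size of 100, the number of samples, and the helper names `sample_means` and `skewness` are all assumptions chosen for illustration. The exponential distribution is strongly right-skewed, so if the means of large samples are close to normal, their skewness should be much nearer to zero than that of single observations.

```python
import random
import statistics


def sample_means(draw, sample_size, n_samples, seed=0):
    """Return the means of n_samples samples, each made of sample_size draws."""
    rng = random.Random(seed)
    return [
        statistics.fmean(draw(rng) for _ in range(sample_size))
        for _ in range(n_samples)
    ]


def skewness(values):
    """Moment-based skewness; close to 0 for a symmetric distribution."""
    m = statistics.fmean(values)
    sd = statistics.pstdev(values)
    return statistics.fmean(((v - m) / sd) ** 3 for v in values)


def draw_exponential(rng):
    """Hypothetical attribute: exponential with mean 1 (clearly not normal)."""
    return rng.expovariate(1.0)


# Single observations keep the skewed shape of the attribute itself.
raw = sample_means(draw_exponential, sample_size=1, n_samples=5_000)
# Means of samples of 100 observations should look far more symmetric.
means = sample_means(draw_exponential, sample_size=100, n_samples=5_000)

print(f"skewness of single observations:       {skewness(raw):.2f}")
print(f"skewness of means of 100 observations: {skewness(means):.2f}")
print(f"average of the sample means:           {statistics.fmean(means):.3f}")
```

      Skewness is only one crude check of normality; a histogram or a normal probability plot of `means` would show the same pattern more fully.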

      In the case of small samples, however, the means will also have a normal distribution provided the attribute has a normal distribution. This is not because of the central limit theorem, but because of the properties of the normal distribution. If the means are sums of observations on identical normally distributed variables, then the sample means have a normal distribution whatever the number of observations, that is, the sample size.

Figure: The total obtained from the throw of six dice may be seen as the sum of observations on six identically distributed variables.

      We now know that the means of large samples may be defined as observations from a random variable with normal distribution. We also know that the normal distribution is completely characterized by its mean and variance. The next step in the investigation of sampling distributions, therefore, must be to find out whether the mean and variance of the distribution of sample means can be determined.

      As expected, since the variable we used had a normal distribution, the sample means also have a normal distribution. The average value of the sample means is, in all cases, the same as the population mean, that is, 0. However, the standard deviation of the sample means is not the same in all four runs of the experiment: in samples of size 4 it is 0.50, in samples of size 9 it is 0.33, in samples of size 16 it is 0.25, and in samples of size 25 it is 0.20.
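      The experiment can be approximated with a short simulation. This is a sketch under stated assumptions, not the author's procedure: the population is taken to be normal with mean 0 and standard deviation 1 (the text gives the mean; the standard deviation of 1 is an assumption consistent with the values quoted above), and the number of samples per run and the function name `run_experiment` are arbitrary.

```python
import random
import statistics


def run_experiment(sample_size, n_samples=20_000, seed=1):
    """Draw n_samples samples of the given size from an assumed N(0, 1)
    population and return (average, standard deviation) of their means."""
    rng = random.Random(seed)
    means = [
        statistics.fmean(rng.gauss(0.0, 1.0) for _ in range(sample_size))
        for _ in range(n_samples)
    ]
    return statistics.fmean(means), statistics.stdev(means)


# The four sample sizes used in the text.
for n in (4, 9, 16, 25):
    average, spread = run_experiment(n)
    print(f"n={n:2d}  average of sample means={average:+.3f}  "
          f"SD of sample means={spread:.3f}")
```

      With this many samples per run, the printed standard deviations should land close to the 0.50, 0.33, 0.25, and 0.20 reported in the text, while the averages stay near 0.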

Figure: Distribution of sample means for different sample sizes.

      In the next section we will present an explanation for this relationship, but for now let us consolidate some of the concepts we have discussed so far.

      The standard deviation of the sample means has its own name: the standard error of the mean or, simply, the standard error. If the standard error is equal to the population standard deviation divided by the square root of the sample size, then the variance of the sample means is equal to the population variance divided by the sample size.
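      The arithmetic of this relationship can be checked directly. In the sketch below the function names are illustrative, and the population standard deviation of 1 is an assumption chosen so that the results can be compared with the values quoted earlier for samples of size 4, 9, 16, and 25.

```python
import math


def standard_error(population_sd, sample_size):
    """Population standard deviation divided by the square root of n."""
    return population_sd / math.sqrt(sample_size)


def variance_of_sample_means(population_sd, sample_size):
    """Population variance divided by n; equals the squared standard error."""
    return population_sd ** 2 / sample_size


sigma = 1.0  # assumed population standard deviation
for n in (4, 9, 16, 25):
    se = standard_error(sigma, n)
    var = variance_of_sample_means(sigma, n)
    print(f"n={n:2d}  standard error={se:.2f}  variance of sample means={var:.4f}")

# n= 4  standard error=0.50  variance of sample means=0.2500
# n= 9  standard error=0.33  variance of sample means=0.1111
# n=16  standard error=0.25  variance of sample means=0.0625
# n=25  standard error=0.20  variance of sample means=0.0400
```

      The standard errors reproduce the 0.50, 0.33, 0.25, and 0.20 given above, and each variance is the square of the corresponding standard error.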

      Now we can begin to see why people tend to get confused with statistics. We have been talking about different means and different standard deviations, and students often become disoriented with so many measures. Let us review the meaning of each one of those measures.

      There is the sample mean, which is generally not equal in value to the population mean. Sample means have a probability distribution whose mean has the same value as the population mean.

      Next, there is the sample standard deviation, which is not equal in value to the population standard deviation. Sample means have a distribution whose standard deviation, also known as standard error, is different from the sample standard deviation and from the population standard deviation. The usual notation for the sample standard deviation is the letter s, and for the population standard deviation is the letter σ (“s” in the Greek alphabet). There is no specific
