Medical Statistics. David Machin


Z, the number of standard deviations 4.5 kg is away from the mean of 3.4 kg, that is, z = (4.5 − 3.4)/SD = 1.83. Then look up z = 1.83 in Table T1, the Normal distribution table, which gives the probability of being outside the range mean − 1.83SD to mean + 1.83SD as 0.0672. Therefore the probability of having a birthweight of 4.5 kg or higher is 0.0672/2 = 0.0336, or 3.4%.
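The table lookup can also be checked numerically. The following is a minimal sketch using `NormalDist` from Python's standard library; only z = 1.83 is taken from the text, and the variable names are illustrative.

```python
from statistics import NormalDist

# z-score from the text: 4.5 kg lies 1.83 SDs above the mean of 3.4 kg.
z = 1.83

standard_normal = NormalDist(mu=0.0, sigma=1.0)

# Probability of lying above mean + 1.83 SD (one tail only).
upper_tail = 1.0 - standard_normal.cdf(z)

# Probability of lying outside mean - 1.83 SD to mean + 1.83 SD (both tails),
# which is the quantity the text reads from Table T1.
two_tailed = 2.0 * upper_tail

print(f"two-tailed probability: {two_tailed:.4f}")
print(f"upper-tail probability: {upper_tail:.4f}")
```

Because the Normal distribution is symmetric, halving the two-tailed probability gives the upper tail, which is the step the text performs by hand.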

      The Normal distribution also has other uses in statistics and is often used as an approximation to the Binomial and Poisson distributions. Figure 4.4 shows that the Binomial distribution for any particular value of the parameter π approaches the shape of a Normal distribution as the other parameter n increases. The approach to Normality is more rapid for values of π near 0.5 than for values near to 0 or 1. Thus, provided n is large enough, a count may be regarded as approximately Normally distributed with mean nπ and SD = √[nπ(1 − π)]. The Poisson distribution with mean λ approaches Normality as λ increases (see Figure 4.5). When λ is large, a Poisson variable may be regarded as approximately Normally distributed with mean λ and SD = √λ.
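To see how close these approximations can be, the sketch below compares exact cumulative probabilities with their Normal approximations. The parameter values (n = 100, π = 0.5, λ = 100) and the cut-offs are hypothetical, chosen only for illustration, and the 0.5 continuity correction is an addition that the text above does not discuss.

```python
import math
from statistics import NormalDist


def binomial_cdf(k, n, p):
    """Exact P(X <= k) for X ~ Binomial(n, p)."""
    return sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def poisson_cdf(k, lam):
    """Exact P(X <= k) for X ~ Poisson(lam), built up term by term."""
    term = math.exp(-lam)  # P(X = 0)
    total = term
    for i in range(1, k + 1):
        term *= lam / i  # P(X = i) obtained from P(X = i - 1)
        total += term
    return total


# Hypothetical parameter values and cut-offs (not from the text).
n, p, k_binom = 100, 0.5, 55
lam, k_pois = 100.0, 110

# Normal approximations: mean n*p with SD sqrt(n*p*(1-p)); mean lam with SD sqrt(lam).
binom_normal = NormalDist(mu=n * p, sigma=math.sqrt(n * p * (1 - p)))
pois_normal = NormalDist(mu=lam, sigma=math.sqrt(lam))

binom_exact = binomial_cdf(k_binom, n, p)
binom_approx = binom_normal.cdf(k_binom + 0.5)  # +0.5 is the continuity correction
pois_exact = poisson_cdf(k_pois, lam)
pois_approx = pois_normal.cdf(k_pois + 0.5)

print(f"Binomial: exact {binom_exact:.4f}, Normal approximation {binom_approx:.4f}")
print(f"Poisson:  exact {pois_exact:.4f}, Normal approximation {pois_approx:.4f}")
```

Rerunning with a small n, or with π close to 0 or 1, should show the agreement getting worse, in line with the statement that the approach to Normality is fastest near π = 0.5.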

      Diagnostic tests use patient data to classify individuals as either normal or abnormal. A related statistical problem is the description of the variability in normal individuals, to provide a basis for assessing the test results of other individuals. The most common way of presenting such data is as a range of values, or interval, that contains the values obtained from the majority of a sample of normal subjects. This reference interval is often referred to as a normal range or reference range. To distinguish this use of the word from the Normal distribution, we use lower case for the normal range and upper case for the Normal distribution throughout this book.

      Worked Example – Reference Range – Birthweight

Lower limit = mean − 1.96 × SD; upper limit = mean + 1.96 × SD

      If the birthweight data were not Normally distributed, the reference range could instead be obtained from the calculated percentiles of the sample, as described in Chapter 2. The 2.5 percentile is the weight below which 2.5% of the babies fall, which is 2.19 kg. Correspondingly, the estimated 97.5 percentile suggests that only 2.5% of babies are heavier than 4.43 kg at birth. The percentile‐based reference range for birthweight is therefore estimated to be 2.19 to 4.43 kg. This is very close to that obtained when we assume that birthweight has a Normal distribution.
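The two ways of estimating a reference range can be compared side by side. The sketch below uses a simulated sample, not the book's data: the mean of 3.4 kg is taken from the text, while the SD of 0.6 kg, the sample size of 3500 and the random seed are assumptions made for illustration.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is reproducible

# Simulated birthweights in kg (hypothetical data, not the book's sample).
weights = [random.gauss(3.4, 0.6) for _ in range(3500)]

mean = statistics.mean(weights)
sd = statistics.stdev(weights)

# Normal-based 95% reference range: mean - 1.96 SD to mean + 1.96 SD.
normal_range = (mean - 1.96 * sd, mean + 1.96 * sd)

# Percentile-based range: with n=40 the cut points fall every 2.5%,
# so the first and last are the 2.5th and 97.5th percentiles.
cuts = statistics.quantiles(weights, n=40)
percentile_range = (cuts[0], cuts[-1])

print(f"Normal-based range:     {normal_range[0]:.2f} to {normal_range[1]:.2f} kg")
print(f"Percentile-based range: {percentile_range[0]:.2f} to {percentile_range[1]:.2f} kg")
```

Because the simulated data really are Normal, the two ranges should nearly coincide. With skewed data they would be expected to diverge, which is when the percentile method is the safer choice.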

      Most reference ranges are based on samples larger than 3500 people. Over many years, and millions of births, the World Health Organization (WHO) has come up with a normal birthweight range for new‐born babies. These ranges represent results that are acceptable in new‐born babies and actually cover the middle 80% of the population distribution, that is, the interval between the 10th and 90th centiles. Low birthweight babies are usually defined (by the WHO) as weighing less than 2500 g (the 10th centile) regardless of gestational age, and large birthweight babies are defined as weighing above 4000 g (the 90th centile). Hence the normal birthweight range is around 2.5 to 4.0 kg. For our sample data, the 10th to 90th centile range was similar, at 2.75 to 4.03 kg.

      There are many other probability distributions used in statistics. In this section we briefly list and describe those that are more commonly used.

      t‐distribution

      Student's t‐distribution is any member of a family of continuous probability distributions that arises when estimating the mean of a Normally distributed variable (in the population) in situations where the sample size is small and the population standard deviation is unknown. It was developed by William Sealy Gosset under the pseudonym Student.

      The t‐distribution plays an important role in a number of widely used statistical analyses, including Student's t‐test for assessing the statistical significance of the difference between two sample means, the construction of confidence intervals for the difference between two population means, and in linear regression analysis.
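The effect of estimating the standard deviation from a small sample can be seen by simulation. The sketch below is an illustration under stated assumptions: the sample size of 5, the number of replications and the seed are hypothetical choices. It counts how often the t statistic falls outside ±1.96, the limits that would enclose 95% of a standard Normal distribution.

```python
import math
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

n = 5            # small sample size (hypothetical)
reps = 20000     # number of simulated samples (hypothetical)
z_cutoff = 1.96  # two-sided 5% point of the standard Normal distribution

exceed = 0
for _ in range(reps):
    sample = [random.gauss(0.0, 1.0) for _ in range(n)]
    # t = (sample mean - population mean) / (sample SD / sqrt(n)); population mean is 0 here.
    t = statistics.mean(sample) / (statistics.stdev(sample) / math.sqrt(n))
    if abs(t) > z_cutoff:
        exceed += 1

proportion = exceed / reps
print(f"Proportion of |t| > 1.96 with n = {n}: {proportion:.3f}")
```

If the statistic followed a Normal distribution, about 5% of values would fall outside ±1.96. The noticeably larger proportion seen here reflects the heavier tails of the t‐distribution with few degrees of freedom.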

Figure 4.14 Examples of probability density functions for the t‐, chi‐squared, F‐ and uniform distributions: (a) t‐distribution, (b) chi‐squared distribution, (c) F‐distribution, (d) uniform distribution.

      Chi‐squared Distribution

      The chi‐squared distribution (or χ²‐distribution) with n degrees of freedom (Figure 4.14b) is the distribution of the sum of the squares of n independent standard Normal random variables. A chi‐squared variable can take only positive values, and the shape of its distribution is uniquely determined by the degrees of freedom. The distribution becomes more symmetrical as the degrees of freedom increase, and when the degrees of freedom are greater than 50 the chi‐squared distribution is very similar to the Normal distribution. The chi‐squared distribution is used in the common chi‐squared tests for the goodness of fit of an observed distribution to a theoretical one and for the independence of two criteria of classification of qualitative data, and in confidence interval estimation for the population standard deviation of a Normal distribution from a sample standard deviation.
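The definition as a sum of squared standard Normal variables translates directly into a simulation. The sketch below assumes 5 degrees of freedom, 20 000 replications and a fixed seed, all hypothetical choices. It checks the simulated values against two standard properties of the chi‐squared distribution: its mean equals the degrees of freedom and its variance equals twice the degrees of freedom.

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is reproducible

df = 5        # degrees of freedom (hypothetical)
reps = 20000  # number of simulated values (hypothetical)

# Each draw is the sum of the squares of df independent standard Normal variables.
draws = [sum(random.gauss(0.0, 1.0) ** 2 for _ in range(df)) for _ in range(reps)]

sample_mean = statistics.fmean(draws)
sample_var = statistics.variance(draws)

print(f"smallest simulated value: {min(draws):.3f}")
print(f"sample mean: {sample_mean:.2f} (theory: {df})")
print(f"sample variance: {sample_var:.2f} (theory: {2 * df})")
```

Plotting a histogram of `draws` for increasing values of `df` would show the right skew shrinking, consistent with the approach to Normality described above.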

      F‐distribution

      The F‐distribution (Figure 4.14c) is the distribution of the ratio of two independent chi‐squared variables, each divided by its degrees of freedom, and is used in hypothesis testing when we want to compare variances, such as in one‐way
