Applied Biostatistics for the Health Sciences. Richard J. Rossi


Population 1: 22, 24, 25, 27, 28, 28, 31, 32, 33, 35, 39, 41, 67
Population 2: 22, 24, 25, 27, 28, 28, 31, 32, 33, 35, 39, 41, 670

      These two populations are identical except for their largest values, 67 and 670. For population 1, the mean is

$$
\begin{aligned}
\mu_1 &= \frac{22 + 24 + 25 + 27 + 28 + 28 + 31 + 32 + 33 + 35 + 39 + 41 + 67}{13} \\
      &= \frac{432}{13} = 33.23
\end{aligned}
$$

      Now, because there are 13 units in population 1, the median is the seventh observation in the ordered list of population values. Thus, the median is 31. For population 2, the mean is

$$
\begin{aligned}
\mu_2 &= \frac{22 + 24 + 25 + 27 + 28 + 28 + 31 + 32 + 33 + 35 + 39 + 41 + 670}{13} \\
      &= \frac{1035}{13} = 79.62
\end{aligned}
$$

      Since there are also 13 units in population 2, the median is also the seventh observation in the ordered list of population values. Thus, the median of population 2 is also 31.

      Note that the mean of population 2 is more than twice the mean of population 1 even though the populations are identical except for their single largest values. The medians of these two populations are identical because the median is not influenced by extreme values in a population. In population 1, both the mean and median are representative of the central values of the population. In population 2, none of the population units is near the mean value, which is 79.62. Thus, the mean does not represent the value of a typical unit in population 2. The median does represent a fairly typical value in population 2 since all but one of the values in population 2 are relatively close to the value of the median.

      The previous example illustrates the sensitivity of the mean to the extremes in a long-tailed distribution. Thus, in a distribution with an extremely long tail to the right or left, the mean will often be less representative than the median for describing the typical values in the population.
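
      The comparison above can be checked numerically. The following is a minimal sketch using Python's standard `statistics` module; the variable names `population_1` and `population_2` are illustrative and not part of the text.

```python
from statistics import mean, median

# The two populations from the example; they differ only in the largest value.
population_1 = [22, 24, 25, 27, 28, 28, 31, 32, 33, 35, 39, 41, 67]
population_2 = population_1[:-1] + [670]

for name, values in (("Population 1", population_1),
                     ("Population 2", population_2)):
    # The mean uses every value, so the single extreme value pulls it upward;
    # the median depends only on the middle of the ordered list.
    print(f"{name}: N = {len(values)}, "
          f"mean = {mean(values):.2f}, median = {median(values)}")
```

      Replacing 67 with 670 changes only the last entry of the ordered list, so the middle observation, and therefore the median, is unaffected, while the sum in the numerator of the mean grows from 432 to 1035.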

       Example 2.15

      Figure 2.14 The distribution of the age of onset of obsessive compulsive disorder.

      Figure 2.15 The distributions of the age of onset of Child Onset and Adult Onset OCD.

$$
GM = \left(\text{product of all of the } X \text{ values}\right)^{1/N}
$$

      where N is the number of units in the population. That is,

$$
GM = \left(X_1 \times X_2 \times X_3 \times \cdots \times X_N\right)^{1/N}
$$

      where the X values for the N units in the population are $X_1, X_2, X_3, \ldots, X_N$.
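
      When all of the X values are positive, the same quantity can be rewritten with logarithms. This is a standard identity rather than a formula taken from the text, and it is often more convenient for computation because it avoids forming a very large product:

$$
GM = \exp\!\left(\frac{1}{N}\sum_{i=1}^{N} \ln X_i\right)
$$

      In other words, the geometric mean is the exponential of the arithmetic mean of the logged values, which is why a single very large value affects it less than it affects the ordinary mean.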

       Example 2.16

      The distribution given below has a long tail to the right.

22, 24, 25, 27, 28, 28, 31, 32, 33, 35, 39, 41, 670

      In a previous example, µ was computed to be 79.62. The geometric mean for this population is

$$
GM = \left(22 \times 24 \times 25 \times 27 \times 28 \times 28 \times 31 \times 32 \times 33 \times 35 \times 39 \times 41 \times 670\right)^{1/13} \approx 38.0
$$

      Thus, even though there is an extremely large and atypical value in this population, the geometric mean is far less sensitive to this value than the mean and is a more reasonable parameter for representing the typical value in this population. In fact, the geometric mean is much closer to the median than the mean is for this population, with GM ≈ 38.0, $\tilde{\mu} = 31$, and µ = 79.62.
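
      The geometric mean of this population can be computed in several equivalent ways. The sketch below, assuming Python 3.8 or later, compares the direct definition (the N-th root of the product), the log form (the exponential of the mean of the logged values), and the standard library's `statistics.geometric_mean`; the variable names are illustrative.

```python
import math
from statistics import geometric_mean, mean, median

values = [22, 24, 25, 27, 28, 28, 31, 32, 33, 35, 39, 41, 670]
n = len(values)

# Direct definition: N-th root of the product of all the values.
gm_direct = math.prod(values) ** (1 / n)

# Equivalent log form: exponentiate the mean of the natural logs.
# Valid only when every value is positive.
gm_logs = math.exp(sum(math.log(x) for x in values) / n)

# Standard-library implementation, for comparison.
gm_stdlib = geometric_mean(values)

print(f"geometric mean (direct) = {gm_direct:.2f}")
print(f"geometric mean (logs)   = {gm_logs:.2f}")
print(f"geometric mean (stdlib) = {gm_stdlib:.2f}")
print(f"mean = {mean(values):.2f}, median = {median(values)}")
```

      For long lists or large values the log form is usually the safer choice, since the raw product can overflow floating-point arithmetic well before its N-th root does.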

      2.2.5 Measures of Dispersion

      Figure 2.16 Two different populations having the same mean, median, and mode.

      Even though the mean, median, and mode of these two populations are the same, clearly, population I is much more spread out than population II. The density of population II is greater at the mean, which means
