Medical Statistics. David Machin
Interval and Ratio Scales
One can distinguish between interval and ratio scales. In an interval scale, such as body temperature or calendar dates, a difference between two measurements has meaning, but their ratio does not. For example, when temperature is measured in degrees centigrade, we cannot say that a temperature of 20 °C is twice as hot as a temperature of 10 °C. In a ratio scale, however, such as bodyweight, a 10% increase implies the same weight increase whether expressed in kilogrammes or pounds. The crucial difference is that in a ratio scale, the value of zero has real meaning, whereas in an interval scale, the position of zero is arbitrary.
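The distinction can be checked with a little arithmetic. The following minimal Python sketch converts the two temperatures from the text to Fahrenheit and a pair of bodyweights to pounds; the weights of 70 kg and 77 kg and the names used are illustrative assumptions, not values from the book.

```python
def celsius_to_fahrenheit(c):
    """Convert degrees centigrade to degrees Fahrenheit."""
    return c * 9 / 5 + 32

KG_TO_LB = 2.20462  # approximate conversion factor, kilogrammes to pounds

# Interval scale: the two temperatures used in the text.
t1_c, t2_c = 10.0, 20.0
ratio_c = t2_c / t1_c                                                # 2.0
ratio_f = celsius_to_fahrenheit(t2_c) / celsius_to_fahrenheit(t1_c)  # 68 / 50 = 1.36

# Ratio scale: hypothetical bodyweights, the second 10% heavier than the first.
w1_kg, w2_kg = 70.0, 77.0
ratio_kg = w2_kg / w1_kg
ratio_lb = (w2_kg * KG_TO_LB) / (w1_kg * KG_TO_LB)

print(f"Temperature ratio in Celsius:    {ratio_c:.2f}")
print(f"Temperature ratio in Fahrenheit: {ratio_f:.2f}")
print(f"Weight ratio in kilogrammes:     {ratio_kg:.2f}")
print(f"Weight ratio in pounds:          {ratio_lb:.2f}")
```

The temperature ratio changes with the unit because the two scales place zero at different points, whereas the weight ratio is the same in either unit because zero weight means the same thing in both.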
One difficulty with giving ranks to ordered categorical data is that one cannot assume that the scale is interval. Thus, as we have indicated when discussing ordinal data, one cannot assume that the risk of a corn healing for a current smoker, relative to a non‐smoker, is the same as the risk for a previous smoker relative to a non‐smoker. Were Farndon et al. (2013) simply to score the three levels of smoking as 0, 1, 2 in their subsequent analysis, this would imply that the intervals between the levels or scores have equal numerical value.
2.2 Summarising Categorical Data
Binary data are the simplest type of data: each individual has a label that takes one of two values, such as male or female, or corn healed or not healed. A simple summary would be to count the different types of label. However, a raw count is rarely useful. For example, in Table 2.1 there are more non‐smokers in the scalpel group (40 out of 99 or 40%) than in the corn plaster group (34 out of 98 or 35%). It is only when this number is expressed as a proportion that it becomes useful. Hence the first step in analysing categorical data is to count the number of observations in each category and express them as proportions of the total sample size.
Illustrative Example – Salicylic Acid Plasters for Treatment of Foot Corns
Farndon et al. (2013) report a randomised controlled trial that investigated the effectiveness of salicylic acid plasters compared with usual scalpel debridement for treatment of foot corns. As we have already mentioned, one categorical variable recorded was the centre where each trial participant was treated. Trial participants were treated at one of seven centres, and the corresponding categories are displayed in Table 2.2. The first column shows the category (treatment centre) names, whilst the second and third show the number of individuals in each category and its percentage contribution to the total. Since the total sample size is more than 100, we have reported the percentages to one decimal place. Table 2.2 clearly shows that the majority (54.5%) of patients were treated at the ‘Central’ treatment centre.
Table 2.2 Treatment centre for 202 patients with corns who were recruited to a randomised controlled trial of the effectiveness of salicylic acid plasters compared with ‘usual’ scalpel debridement for the treatment of corns
(Source: data from Farndon et al. 2013).
Treatment centre | Frequency | Percentage |
---|---|---|
Central | 110 | 54.5% |
Manor | 33 | 16.3% |
Jordanthorpe | 24 | 11.9% |
Limbrick | 9 | 4.5% |
Firth Park | 11 | 5.4% |
Huddersfield | 9 | 4.5% |
Darnall | 6 | 3.0% |
Total | 202 | 100.0% |
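The percentages in Table 2.2 can be reproduced from the raw counts. The sketch below is a minimal Python illustration using the frequencies from the table; the variable names are our own.

```python
# Frequencies per treatment centre, taken from Table 2.2.
counts = {
    "Central": 110,
    "Manor": 33,
    "Jordanthorpe": 24,
    "Limbrick": 9,
    "Firth Park": 11,
    "Huddersfield": 9,
    "Darnall": 6,
}

# Total sample size, then each count expressed as a percentage of it.
total = sum(counts.values())
percentages = {centre: 100 * n / total for centre, n in counts.items()}

# Print a frequency table with percentages to one decimal place.
for centre, n in counts.items():
    print(f"{centre:<13} {n:>4} {percentages[centre]:>6.1f}%")
print(f"{'Total':<13} {total:>4} {sum(percentages.values()):>6.1f}%")
```

Each percentage is simply the category count divided by the total sample size and multiplied by 100, which is the first step described at the start of this section.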
In addition to tabulating each variable separately, we might be interested in whether the distribution of patients across the centres is the same for each randomised group. Table 2.3 shows the number of patients treated at each centre by randomised group; in this case it can be said that treatment centre has been cross‐tabulated with randomised group. Table 2.3 is an example of a contingency table with seven rows (representing treatment centre) and two columns (randomised group). Note that we are interested in the distribution of patients across the seven centres in each randomised group (to see whether or not we have similar numbers of patients randomised to each treatment within each centre), and so the percentages add to 100 down each column, rather than across the rows. In this example, since we have 101 patients in each randomised group, the percentages are almost the same as the raw counts. However, for most studies you are unlikely to have exactly 100 participants in each group!
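Column percentages of this kind can be sketched in a few lines of Python. The example below uses only the five centres whose rows are shown in full in Table 2.3, together with the group sizes of 101 stated above; the function and variable names are illustrative assumptions. Percentages are printed to one decimal place, so they may differ slightly from the rounded whole numbers in the table.

```python
# Number of patients randomised to each group (from the text).
group_sizes = {"Corn plaster": 101, "Scalpel": 101}

# Counts by centre and randomised group; only the five centres whose rows
# are shown in full in Table 2.3 are included here.
counts = {
    "Central": {"Corn plaster": 58, "Scalpel": 52},
    "Manor": {"Corn plaster": 13, "Scalpel": 20},
    "Jordanthorpe": {"Corn plaster": 10, "Scalpel": 14},
    "Limbrick": {"Corn plaster": 3, "Scalpel": 6},
    "Firth Park": {"Corn plaster": 7, "Scalpel": 4},
}

def column_percentages(table, column_totals):
    """Express each cell as a percentage of its column (group) total."""
    return {
        row_name: {
            group: 100 * n / column_totals[group] for group, n in row.items()
        }
        for row_name, row in table.items()
    }

col_pct = column_percentages(counts, group_sizes)

# Print each cell as "count (column percentage)".
for centre, row in counts.items():
    cells = "  ".join(
        f"{n:>3} ({col_pct[centre][group]:.1f}%)" for group, n in row.items()
    )
    print(f"{centre:<13} {cells}")
```

Dividing by the column total rather than the row total answers the question posed in the text: of all the patients in a given randomised group, what share was treated at each centre?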
Table 2.3 Cross‐tabulation of treatment centre by randomised group for 202 patients with corns who were recruited to a randomised controlled trial of the effectiveness of salicylic acid plasters compared with ‘usual’ scalpel debridement for the treatment of corns
(Source: data from Farndon et al. 2013).
Treatment centre | Randomised group: corn plaster, n (%) | Randomised group: scalpel, n (%) | All, n (%) |
---|---|---|---|
Central | 58 (57) | 52 (52) | 110 (54.5) |
Manor | 13 (13) | 20 (20) | 33 (16.3) |
Jordanthorpe | 10 (10) | 14 (14) | 24 (11.9) |
Limbrick | 3 (3) | 6 (6) | 9 (4.5) |
Firth Park | 7 (7) | 4 (4) | 11 (5.4) |
Huddersfield
|