Biostatistics Decoded. A. Gouveia Oliveira

methods require that everything is measured. It is of great importance to select and identify the scale used for the measurement of each study variable, or attribute, because the scale determines the statistical methods that will be used for the analysis. There are only four scales of measurement.

      The simplest scale is the binary scale, which has only two values. Patient sex (female, male) is an example of an attribute measured on a binary scale. Everything that has a yes/no answer (e.g. obesity, previous myocardial infarction, family history of hypertension) is measured on a binary scale. Very often the values of a binary scale are not numbers but terms, and this is why the binary scale is also a nominal scale. However, the values of any binary attribute can readily be converted to 0 and 1. For example, the attribute sex, with values female and male, can be converted to the attribute female sex, with values 0 meaning no and 1 meaning yes.
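      The conversion to 0 and 1 described above can be sketched in a few lines of Python. The sample values and the variable names are hypothetical and serve only as an illustration. A convenient consequence of this coding is that the sum of the recoded values counts the "yes" answers and their mean is the proportion of "yes" answers.

```python
# Hypothetical sample; the values are illustrative only.
sex = ["female", "male", "male", "female", "female"]

# Recode the nominal attribute "sex" as the 0/1 attribute "female sex":
# 1 means yes (female), 0 means no.
female_sex = [1 if value == "female" else 0 for value in sex]

# With 0/1 coding, the sum counts the "yes" answers and the mean
# is the proportion of "yes" answers in the sample.
n_female = sum(female_sex)
proportion_female = n_female / len(female_sex)

print(female_sex)         # [1, 0, 0, 1, 1]
print(proportion_female)  # 0.6
```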

      Next in complexity is the categorical scale. This is simply a nominal scale with more than two values. In common with the binary scale, the values in the categorical scale are usually terms, not numbers, and the order of those terms is arbitrary: the first term in the list of values is not necessarily smaller than the second. Arithmetic operations with categorical scales are meaningless, even if the values are numeric. Examples of attributes measured on a categorical scale are profession, ethnicity, and blood type.

      When values can be ordered, we have an ordinal scale. An ordinal scale may have any number of values, the values may be terms or numbers, and the values must have a natural order. An example of an ordinal scale is the staging of a tumor (stage I, II, III, IV). There is a natural order of the values, since stage II is more invasive than stage I and less invasive than stage III. However, one cannot say that the difference, either biological or clinical, between stage I and stage II is larger or smaller than the difference between stage II and stage III. In ordinal scales, arithmetic operations are meaningless.
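      As a minimal sketch of what an ordinal scale does and does not allow, the tumor stages can be stored together with their natural order and then compared or sorted. The names STAGE_ORDER, stage_rank, and is_later_stage, as well as the sample data, are hypothetical. The ranks are used only for comparison; differences between ranks are deliberately never computed, because they carry no meaning.

```python
# The natural order of the values, from the earliest to the latest stage.
STAGE_ORDER = ["I", "II", "III", "IV"]


def stage_rank(stage: str) -> int:
    """Return the position of a stage in the natural order (0-based)."""
    return STAGE_ORDER.index(stage)


def is_later_stage(a: str, b: str) -> bool:
    """Return True if stage a comes after stage b in the natural order."""
    return stage_rank(a) > stage_rank(b)


# Hypothetical stages observed in five patients.
stages = ["III", "I", "IV", "II", "II"]

# Ordering and comparison are meaningful on an ordinal scale.
sorted_stages = sorted(stages, key=stage_rank)
print(sorted_stages)              # ['I', 'II', 'II', 'III', 'IV']
print(is_later_stage("II", "I"))  # True
```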

      If an interval scale, that is, a numeric scale in which equal differences between values represent equal differences in the attribute, has a meaningful zero, it is called a ratio scale. Examples of ratio scales are height and weight. An example of an interval scale that is not a ratio scale is the Celsius scale, where zero does not represent the absence of temperature, but rather the value that was by convention given to the temperature of melting ice. In ratio scales, not only are addition and subtraction possible, but multiplication and division as well. The latter two operations are meaningless in non‐ratio scales. For example, we can say that a weight of 21 g is half of 42 g, and a height of 81 cm is three times 27 cm, but we cannot say that a temperature of 40°C is twice as warm as 20°C. With very rare exceptions, all interval‐scaled attributes that are found in research are measured in ratio scales.
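      The temperature example can be checked numerically. The sketch below is an illustration under stated assumptions: it brings in the Kelvin and Fahrenheit scales, which the text does not mention, and the function names are hypothetical. The apparent ratio between 40°C and 20°C changes when the same two temperatures are expressed on a scale with a different zero, which is why that ratio has no meaning, whereas the ratio between two weights does.

```python
def celsius_to_kelvin(celsius: float) -> float:
    """Convert to Kelvin, a temperature scale whose zero is absolute zero."""
    return celsius + 273.15


def celsius_to_fahrenheit(celsius: float) -> float:
    """Convert to Fahrenheit, another scale with a conventional zero."""
    return celsius * 9 / 5 + 32


# The naive ratio on the Celsius scale suggests "twice as warm" ...
ratio_celsius = 40 / 20

# ... but the same two temperatures give different ratios on other scales.
ratio_fahrenheit = celsius_to_fahrenheit(40) / celsius_to_fahrenheit(20)
ratio_kelvin = celsius_to_kelvin(40) / celsius_to_kelvin(20)

# Weight is measured on a ratio scale, so this ratio is meaningful.
ratio_weight = 42 / 21

print(ratio_celsius)               # 2.0
print(round(ratio_fahrenheit, 3))  # 1.529
print(round(ratio_kelvin, 3))      # 1.068
print(ratio_weight)                # 2.0
```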

      We said above that one important purpose of biostatistics is to determine the characteristics of a population in order to be able to make predictions about any subject belonging to that population. In other words, what we want to know about the population is the expected value of the various attributes present in the elements of the population, because this is the value we will use to predict the value of each of those attributes for any member of that population. Alternatively, depending on the primary aim of the research, we may consider that biological attributes have a certain, unknown value that we attempt to measure in order to discover what it is. However, an attribute may, and usually does, exhibit variability because of the influence of a number of factors, including measurement error. Therefore, the mean value of the attribute may be seen as the true value of the attribute, and its variability as a sign of the presence of factors of variation influencing that attribute.

      There are several possibilities for expressing the expected value of an attribute, which are collectively called central tendency measures, and the ones most used are the mean, the median, and the mode.
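      As a small illustration, the three measures can be obtained with the statistics module of the Python standard library. The readings below are hypothetical values chosen for the example, not data from the text.

```python
import statistics

# Hypothetical systolic blood pressure readings (mmHg) from seven subjects.
values = [118, 120, 120, 125, 130, 135, 162]

# Mean: the sum of the values divided by the number of values.
mean_value = statistics.mean(values)

# Median: the middle value once the values are sorted.
median_value = statistics.median(values)

# Mode: the most frequent value.
mode_value = statistics.mode(values)

print(mean_value)    # 130
print(median_value)  # 125
print(mode_value)    # 120
```

      In this sample the single high reading of 162 pulls the mean above the median, a reminder that the three measures need not agree.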

      The mean is a very common measure of central tendency. We use the notion of mean extensively in everyday life, so it is not surprising that the mean plays an extremely important role in statistics. Furthermore, being computed from a sum of values, the mean is a mathematical quantity and therefore amenable to mathematical processing. This is the other reason why it is such a popular measure in statistics.
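      In symbols, the mean can be written as follows. The notation is an assumption made here for illustration and is not taken from the text: n is the number of observations, x_1, ..., x_n are the observed values, and the mean is denoted by x with a bar over it.

```latex
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i = \frac{x_1 + x_2 + \cdots + x_n}{n}
```

      For example, for the three values 2, 4, and 9, the sum is 15 and the mean is 15/3 = 5.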
