Read the book online - Medical Statistics. David Machin. Medicine. LiveLib

have been collected. They believe that the job of the statistician is simply to analyse the data and, with powerful computers available, even complex studies with many variables can be easily processed. However, analysis is only part of a statistician's job, and calculation of the final ‘P‐value’ a minor one at that!

A far more important task for the medical statistician is to ensure that results are comparable and generalisable.

Example from the Literature – Drinking Coffee and Cancer (IARC 2018)

In 2016, a working group of 23 scientists from 10 countries met at IARC in Lyon, France, to review the research evidence on whether drinking coffee is carcinogenic. They reviewed the available data from more than 1000 observational and experimental studies. In rating the evidence, the working group gave the greatest weight to well‐conducted studies that controlled satisfactorily for important potential confounders, including tobacco and alcohol consumption. For bladder cancer, they found no consistent evidence of an association with drinking coffee, or of a dose–response relationship, that is, of cancer incidence increasing with the amount of coffee drunk. In several studies, the relative risks of cancer for coffee drinkers compared with non‐drinkers were increased in men, whereas in women the risk was either unaffected or decreased. IARC (2018) concluded from this that there was no evidence that drinking coffee caused bladder cancer and, as Loomis et al. (2016) stated, ‘that positive associations reported in some studies could have been due to inadequate control for tobacco smoking, which can be strongly associated with heavy coffee drinking’.

In the above example, tobacco and alcohol consumption are confounding variables, as illustrated in Figure 1.1. The individuals exposed, that is, those drinking coffee, are typified by their tobacco and alcohol consumption, and these same factors are also known to influence cancer incidence rates.

Figure 1.1 Graphical representation of how confounding variables may influence both exposure (drinking coffee) and bladder cancer incidence.
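The pattern in Figure 1.1 can be illustrated with a small numerical sketch. The counts below are hypothetical, invented purely for illustration and not taken from IARC or any study, and the function names `relative_risk` and `pooled` are likewise assumptions. Within each smoking stratum coffee makes no difference to risk, yet the crude comparison suggests that coffee more than doubles it, because smokers are both more likely to drink coffee and more likely to develop cancer.

```python
# Hypothetical counts, NOT real data: (cases, total) for coffee drinkers
# and non-drinkers, split by smoking status (the confounder).
strata = {
    "smokers":     {"coffee": (40, 800), "no_coffee": (10, 200)},
    "non_smokers": {"coffee": (2, 200),  "no_coffee": (8, 800)},
}


def relative_risk(exposed, unexposed):
    """Risk in the exposed group divided by risk in the unexposed group."""
    cases_e, total_e = exposed
    cases_u, total_u = unexposed
    return (cases_e / total_e) / (cases_u / total_u)


def pooled(group):
    """Add up cases and totals over the strata, ignoring smoking status."""
    cases = sum(s[group][0] for s in strata.values())
    total = sum(s[group][1] for s in strata.values())
    return cases, total


# Relative risk within each stratum, with the confounder held fixed.
stratum_rr = {
    name: relative_risk(g["coffee"], g["no_coffee"])
    for name, g in strata.items()
}

# Crude relative risk from the pooled table.
crude_rr = relative_risk(pooled("coffee"), pooled("no_coffee"))

print(stratum_rr)          # relative risk of 1 in both strata
print(round(crude_rr, 2))  # about 2.33 when smoking is ignored
```

Comparing like with like (smokers with smokers, non‐smokers with non‐smokers) removes the apparent effect in this constructed example, which is the sense in which the studies reviewed were said to control for tobacco smoking.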

Any observational study that compares populations distinguished by a particular variable (such as a comparison of coffee drinkers and non‐coffee drinkers) and ascribes the differences found in other variables (such as bladder cancer rates) to the first variable is open to the charge that the observed differences are in fact due to some other, confounding, variables. Thus, the difference in bladder cancer rates between coffee drinkers and non‐drinkers has been ascribed to genetic factors; that is, some factor that makes people want to drink coffee also makes them more susceptible to bladder cancer. The difficulty with observational studies is that there is an infinite source of potential confounding variables. An investigator can measure all the variables that seem reasonable to them but a critic can always think of another, unmeasured, variable that just might explain the result. It is only in prospective randomised studies that this logical difficulty is avoided. In randomised trials, where the alternative interventions (the exposure variables) are assigned purely by a chance mechanism, it can be assumed that unmeasured confounding variables are comparable, on average, in the two groups. Unfortunately, in many circumstances it is not possible to randomise the exposure variable as part of the experimental design, as in the case of drinking coffee and bladder cancer, and so alternative interpretations are always possible. Observational studies are further discussed in Chapter 14.
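The claim that randomisation balances unmeasured confounders on average can be checked with a short simulation. This is only a sketch: the prevalence of 30%, the arm labels and the function name `smoker_share_by_arm` are assumptions made for illustration. Each simulated subject carries a smoking status that the investigator never records, and is allocated to one of two arms by a chance mechanism that ignores it.

```python
import random


def smoker_share_by_arm(n_subjects, smoker_prevalence=0.3, seed=1):
    """Randomise subjects to arms A and B and return the share of smokers in each.

    Smoking status stands in for an unmeasured confounder: it plays no part
    in the allocation, which is a fair coin toss per subject.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    arms = {"A": [], "B": []}
    for _ in range(n_subjects):
        is_smoker = rng.random() < smoker_prevalence
        arm = rng.choice("AB")
        arms[arm].append(is_smoker)
    return {arm: sum(flags) / len(flags) for arm, flags in arms.items()}


shares = smoker_share_by_arm(10_000)
print(shares)  # the two proportions should both be close to 0.3
```

With a large number of subjects the two proportions differ only by chance variation, and that variation shrinks as the trial grows. In a small trial an appreciable imbalance can still occur, which is why the statement holds on average rather than in every trial.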

1.4 How a Statistician Can Help

Statistical ideas relevant to good design and analysis are not easy and we would always advise an investigator to seek the advice of a statistician at an early stage of an investigation. Here are some ways the medical statistician might help.

Sample Size and Power Considerations

One of the commonest questions asked of a consulting statistician is: how large should my study be? If the investigator has a reasonable amount of knowledge as to the likely outcome of a study, and potentially large resources of finance and time, then the statistician has tools available to give a scientific answer to the question. However, the usual scenario is that the investigator has either a research grant of a limited size, or limited time, or a limited pool of patients. Nevertheless, given certain assumptions the medical statistician is still able to help. For a given number of patients, the probability of detecting an effect of a given size can be calculated. If the outcome variable is simply success or failure, the statistician will need to know the anticipated percentage of successes in each group so that the difference between them can be judged for its potential clinical relevance. If the outcome variable is a quantitative measurement, the statistician will need to know the anticipated size of the difference between the two groups, and the expected variability of the measurement. For example, in a survey to see if patients with diabetes have raised blood pressure the medical statistician might say ‘with 100 diabetics and 100 healthy subjects in this survey and a possible difference in blood pressure of 5 mmHg, with standard deviation 10 mmHg, you have a 20% chance of obtaining a statistically significant result at the 5% level’. (The term ‘statistically significant’ will be explained in Chapter 6.) This statement means that one would anticipate that in only one study in five (20%) of the proposed size would a statistically significant result be obtained. The investigator would then have to decide whether it was sensible or ethical to conduct a survey with such a small probability of success. One option would be to increase the size of the survey until success (defined as a statistically significant result if a difference of 5 mmHg or more does truly exist) becomes more probable.
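The kind of calculation behind such a statement can be sketched as follows. This is a minimal illustration using the normal approximation for a two‐sided comparison of two means, with equal group sizes and a common standard deviation. It is not necessarily the method the authors used, and the function names `power_two_means` and `n_per_group_for_power` are assumptions introduced here. Under this approximation the power depends strongly on the group size: with a 5 mmHg difference and a 10 mmHg standard deviation, a power near 20% corresponds to roughly 10 subjects per group, while 100 per group gives a power above 90%.

```python
from math import ceil, sqrt
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal distribution (mean 0, sd 1)


def power_two_means(n_per_group, delta, sd, alpha=0.05):
    """Approximate power of a two-sided test comparing two group means.

    Normal approximation; assumes equal group sizes and a common sd.
    """
    se = sd * sqrt(2.0 / n_per_group)          # standard error of the difference
    z_crit = STD_NORMAL.inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    shift = abs(delta) / se                     # true difference in SE units
    # Probability that the test statistic falls in either rejection region.
    return STD_NORMAL.cdf(shift - z_crit) + STD_NORMAL.cdf(-shift - z_crit)


def n_per_group_for_power(delta, sd, power=0.80, alpha=0.05):
    """Approximate subjects needed per group to reach the requested power."""
    z_alpha = STD_NORMAL.inv_cdf(1 - alpha / 2)
    z_beta = STD_NORMAL.inv_cdf(power)
    return ceil(2 * ((z_alpha + z_beta) * sd / abs(delta)) ** 2)


# Difference of 5 mmHg, standard deviation 10 mmHg, 5% significance level.
for n in (10, 25, 50, 100):
    print(n, round(power_two_means(n, delta=5, sd=10), 2))

print(n_per_group_for_power(delta=5, sd=10, power=0.80))
```

Running the loop shows how increasing the size of the survey makes success more probable, and the second function turns the question round by asking how many subjects per group a chosen power requires.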

Questionnaires

Rigby et al. (2004), in their survey of original articles in three UK general practice journals, found that the most common design was that of a cross‐sectional or questionnaire survey, with approximately one third of the articles classified as such.

For all but the smallest data sets it is desirable to use a computer for statistical analysis. The responses to a questionnaire will need to be easily coded for computer analysis and a medical statistician may be able to help with this. It is important to ask for help at an early stage so that the questionnaire can be piloted and modified before use in a study. Further details on questionnaire design and surveys are given in Chapter 14.
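As a small illustration of what coding for computer analysis can involve, the sketch below maps the answers to a single questionnaire item onto numeric codes. The item, its answer categories, the missing‐value code of -9 and the function name `code_response` are all hypothetical choices made for this example.

```python
# Hypothetical coding scheme for one questionnaire item on pain severity.
PAIN_CODES = {"none": 0, "mild": 1, "moderate": 2, "severe": 3}
MISSING = -9  # assumed code for blank or unrecognised answers


def code_response(raw, codes=PAIN_CODES, missing=MISSING):
    """Return the numeric code for an answer, or `missing` if it cannot be coded."""
    if raw is None:
        return missing
    # Ignore differences in capitalisation and surrounding spaces.
    return codes.get(raw.strip().lower(), missing)


responses = ["Mild", " severe ", "", None, "moderate", "unsure"]
coded = [code_response(r) for r in responses]
print(coded)  # [1, 3, -9, -9, 2, -9]
```

Answers that fall outside the planned categories, such as ‘unsure’ here, are exactly the kind of problem that piloting a questionnaire is meant to reveal before the main study.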

Choice of Sample and of Control Subjects

The question of whether one has a representative sample is a typical problem faced by statisticians. For example, it used to be believed that migraine was associated with intelligence, perhaps on the grounds that people who used their brains were more likely to get headaches, but a subsequent population study failed to reveal any social class gradient and, by implication, any association with intelligence. The fallacy arose, perhaps, because intelligent people were more likely than the less intelligent to consult their physician about migraine.
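The migraine fallacy can be made concrete with a short calculation. The figures are hypothetical and chosen only to show the mechanism, and the function name `share_higher_among_consulters` is an assumption. Migraine is taken to be equally common in two groups of equal size, labelled ‘higher’ and ‘lower’, but sufferers in the ‘higher’ group are assumed to be twice as likely to consult their physician.

```python
def share_higher_among_consulters(p_higher, p_consult_higher, p_consult_lower):
    """Proportion of consulting migraine sufferers who come from the 'higher' group.

    Assumes migraine is equally common in both groups, so any imbalance
    among those seen by the physician is due to consulting behaviour alone.
    """
    consult_higher = p_higher * p_consult_higher
    consult_lower = (1 - p_higher) * p_consult_lower
    return consult_higher / (consult_higher + consult_lower)


# Hypothetical inputs: half the population in each group,
# consultation probabilities of 0.6 and 0.3 among migraine sufferers.
share = share_higher_among_consulters(0.5, 0.6, 0.3)
print(round(share, 3))  # about 0.667, against 0.5 in the population
```

A physician who sees only consulting patients would find the ‘higher’ group over‐represented among migraine sufferers even though, by construction, there is no association in the population.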

In many studies an investigator will wish to compare patients suffering from a certain disease with healthy (control) subjects. The choice of the appropriate control population is crucial to a correct interpretation of the results. This is discussed further in Chapter 14.

Design of Study

It has been emphasised