Statistics for HCI. Alan Dix



Statistics for HCI - Alan Dix Synthesis Lectures on Human-Centered Informatics


penalties inhibiting them.


      This kind of learning is not quite a weighted sum of past experience: for example, negative experiences typically count more than positive ones, and once a pattern is established it takes a lot to shift it. However, it is not so far from a probability estimate. We humans share these subconscious learning processes with other animals. They are powerful and lead to very rapid reactions, but need very large numbers of exposures to similar situations to establish memories.

      Of course we are not just our subconscious! In addition, we have conscious thinking and reasoning, which enable us to learn from a single experience. Retrospectively we are able to retrieve a relevant past experience, compare it to what we are encountering now, and work out what to do based on it. This is very powerful, but unlike our more unconscious sea of overlapping memories and associations, our conscious mind is linear and is normally locked into a single model of the world. Because of that single model, this form of thinking is not so good at intuitively grasping probabilities, as is repeatedly evidenced by gambling behaviour and more broadly our assessment of risk.

      One experiment used four packs of cards with different penalties and rewards to see how quickly people could assess the difference [5]. The experiment included some patients with prefrontal brain damage, but we’ll just consider the non-patients. The subjects could choose cards from the different packs. Each pack had an initial reward attached to it, but when they turned over a card it might also have a penalty, such as “sorry, you’ve lost $500.” Some of the packs, those with the higher initial per-card reward, had more penalties, and the other packs had a better balance of rewards. After playing for a while most subjects realised that the packs were different and could tell which were better. The subjects were also wired up to a skin conductivity sensor of the kind used in a lie detector. Well before they were able to say that some of the card packs were worse than the others, they showed a response on the sensor when they were about to turn over a card from a disadvantageous pack; that is, subconsciously they knew it was likely to be a bad card.

      Because our conscious mind is not naturally good at dealing with probabilities we need to use the tool of mathematics to enable us to reason explicitly about them. For example, if the subjects in the experiment had kept a tally of good and bad cards, they would have seen, in the numbers, which packs were better.
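
      As a minimal sketch of such a tally, assume a hypothetical record of turned cards. The pack names, rewards, and penalties below are invented for illustration and are not the values used in the experiment [5]; the function name tally is likewise just a label chosen here.

```python
from collections import defaultdict

# Hypothetical record of turned cards: (pack, reward, penalty).
# All values are invented for illustration, not taken from the experiment.
draws = [
    ("A", 100, 0), ("A", 100, 500), ("A", 100, 0), ("A", 100, 500),
    ("B", 50, 0), ("B", 50, 0), ("B", 50, 100), ("B", 50, 0),
]


def tally(draws):
    """Return {pack: (cards_turned, penalty_cards, net_total)}."""
    counts = defaultdict(lambda: [0, 0, 0])
    for pack, reward, penalty in draws:
        entry = counts[pack]
        entry[0] += 1                        # cards turned from this pack
        entry[1] += 1 if penalty > 0 else 0  # cards that carried a penalty
        entry[2] += reward - penalty         # running net gain or loss
    return {pack: tuple(values) for pack, values in counts.items()}


summary = tally(draws)
for pack, (n, bad, net) in sorted(summary.items()):
    print(f"pack {pack}: {n} cards, {bad} with a penalty, "
          f"net {net:+d}, mean {net / n:+.1f} per card")
```

      With these made-up numbers the pack with the higher per-card reward ends up with a net loss, which is the kind of pattern that is hard to sense consciously but obvious once written down.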

      Some years ago, when I was first teaching statistics, I remember learning that statistics education was known to be particularly difficult. This is in part because it requires a combination of maths and real-world thinking.

      In statistics we use the explicit tallying of data and mathematical reasoning about probabilities to let us do quite complex reasoning from effects (measurements) back to causes (the real-world phenomena that are being measured). So you do need to feel reasonably comfortable with this mathematics. However, even if you are a whizz at maths, if you can’t relate this back to understanding about the real world, you are also stuck. It is a bit like the applied maths problems where people get so lost in the maths that they forget the units: “the answer is 42.” But 42 what? 42 degrees, 42 metres, or 42 bananas?

      On the whole, those who are good at mathematics are not always good at relating their thinking back to the real world, and those of a more practical disposition are not always best at maths—no wonder statistics is hard!

      However, knowing this we can try to make things better.

      It is likely that the majority of readers of this book will have a stronger sense of the practical issues, so I will try to explain some of the concepts that are necessary, without getting deep into the mathematics of how they are calculated—leave that to the computer!

      The fact that you have opened this book suggests that you think you should learn something about statistics. However, maybe the majority of your work is qualitative, or you typically do small-scale studies and you wonder if it is sufficient to eyeball the raw data and make a judgement.

      Sometimes no statistics are necessary. Perhaps you have performed a small user trial and one user makes an error; you look at the circumstances and think “of course lots of users will have the same problem.” Your judgement is based purely on past experience and professional knowledge.

      However, suppose you have performed a survey comparing two alternative systems and asked users which system they prefer. The results are shown in Fig. 1.2. It is clear that System A is far more popular than System B. Or is it?

      Notice that the left-hand scale has two notches, but no values. Let’s suppose first that the notches are at 1000 and 2000: the results of surveying 3000 people. This is obviously a clear result. However, if instead the notches were at 1 and 2, representing a survey of 3 users, you might not be so confident in the results. As you eyeball the data, you are performing some informal statistics.

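[Figure 1.2: survey results for System A and System B; the vertical scale has two notches but no values.]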

      What if it were 10 to 20, or 5 to 10? How clear a result would that be? The job of statistics is precisely to help you with judgements such as these.
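
      One conventional way to put numbers on these judgements is an exact two-sided binomial test. The text does not prescribe this test; it is chosen here only as a sketch, and it assumes that, if there were really no preference, each respondent would be equally likely to pick either system. The function name two_sided_p is a label invented for this example.

```python
from math import comb


def two_sided_p(a, b):
    """Exact two-sided binomial p-value for a split of a vs b votes,
    assuming each respondent is equally likely to prefer either system."""
    n = a + b
    k = max(a, b)
    # Count the equally likely outcomes in which one given system
    # receives at least k of the n votes.
    tail = sum(comb(n, i) for i in range(k, n + 1))
    # Double the tail to allow for a lead by either system; cap at 1.
    return min(1.0, 2 * tail / 2**n)


for a, b in [(2, 1), (10, 5), (20, 10), (2000, 1000)]:
    print(f"{a} vs {b}: p = {two_sided_p(a, b):.3g}")
```

      Under that assumption a 2 to 1 split gives p = 1, so it tells you nothing; 10 to 5 gives about 0.30 and 20 to 10 roughly 0.10, so neither is strong evidence on its own; and 2000 to 1000 gives a vanishingly small value. The ratio is the same in every case, but the strength of the evidence is not.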

      If you want to use statistics you will need to learn about t-tests and p-values, perhaps Bayesian statistics or Normal distributions, maybe a stats package such as SPSS or R. But why do this at all? What does statistics actually achieve?

      Fundamentally, statistics is about trying to learn dependable things about the real world based on measurements of it.

      However, what we mean by ‘real’ is itself a little complicated, from the actual users you have tested to the hypothetical idea of a ‘typical user’ of your system.

      We’ll start with the real world, but what is that?

      The sample. First of all, there is the actual data you have: results from an experiment, responses from a survey, or log data from a deployed application. This is the real world. The user you tested at 3 PM on a rainy day in March, after a slightly overfilling lunch, did make precisely 3 errors and finished the task in 17 minutes and 23 seconds. However, while this measured data is real, it is typically not what you wanted to know. Would the same user on a different day, under different conditions, have made the same errors? What about other users?

      The population. Another idea of ‘real’ is when there is a larger group of people you want to know about, say all the employees in your company, or all users of product A. This larger group is often referred to as the population. What would be the average (and variation in) error rate if all of them sat down and used the software you are testing? Or, as a more concrete kind of measurement, what is their average height? You might take a sample of 20 people and find their average height, but you are using this to make an estimate about your population as a whole.
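
      The relationship between a sample of 20 and its population can be sketched with simulated data. Everything below is an assumption made for illustration: the population size, the mean of 170 cm, and the spread of 10 cm are invented, and real heights need not follow this distribution.

```python
import random
from statistics import mean

random.seed(1)  # fixed seed so the sketch is repeatable

# Hypothetical population: 10,000 heights in cm (invented numbers).
population = [random.gauss(170, 10) for _ in range(10_000)]

# In practice only the sample is measured; the population mean is unknown.
sample = random.sample(population, 20)

population_mean = mean(population)
sample_mean = mean(sample)
print(f"population mean:    {population_mean:.1f} cm")
print(f"sample mean (n=20): {sample_mean:.1f} cm")
```

      The sample mean will usually land near the population mean without matching it exactly, and a different seed gives a different sample and a different estimate.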

      The ideal. However, while this idea of the actual population is very concrete, often the ‘real’ world
