We Humans and the Intelligent Machines. Jörg Dräger
Pablo Picasso (1881–1973)
“The computer says no.” It is one of the running gags in the popular British television show Little Britain. The idea is always the same: A customer approaches Carol, an office worker, with a request and is rejected. Sometimes the scene takes place in a travel agency, sometimes in a doctor’s office, sometimes in a bank. No matter what the customer wants, whether to book a vacation, make an appointment or open an account, “The computer says no” is inevitably the answer after Carol types in the relevant query.
She is so grumpy that the problem is completely obvious: It is not the computer, but Carol’s foul mood. Perhaps she lacks the skill to fulfil the customer’s request; she is, in any event, completely unwilling to be helpful. To prevent this from becoming an issue, she blames the computer. As if to say “Sorry, nothing I can do,” every request or complaint bounces off her, no matter how convincing or affecting it might be.
The situation that the British series parodied at the beginning of the 2000s can still be found in today’s world, since, together with incompetent, unthinking or unethical users, algorithmic systems like Carol’s computer sometimes produce unwanted, unfair and inappropriate results.
The following six stories from very different areas illustrate when, where and why algorithms can be wrong. And they show how serious the consequences can be. They also give an initial idea of what we have to do or stop doing in order to ensure algorithms actually serve society.
System error: Algorithms fail to do the job they are assigned to
Louise Kennedy was actually a bit nervous.2 Emigrating from Ireland to Australia was a big step for the veterinarian. She suspected that not everything would fall into her lap right away. However, she had not expected that the highest hurdle for the native speaker with two university degrees would, of all things, be an English-language test. She scored 74 points on the oral part; 79 were required. She was refused permanent residence. Who would not think of “Computer says no”?
The Irishwoman had indeed failed – because of voice recognition technology. The computer-based test is used by the Australian immigration authorities to assess oral speaking ability. Foreigners who want to live in Australia have to repeat sentences and retell a story. An algorithm then analyzes their ability to speak.
Alice Xu, originally from China, attempted to pass the test as well. She studied in Australia and speaks English fluently, but the algorithm refused to recognize her abilities, too. She scored a paltry 41 points on her oral examination. Xu did not want to give up so easily and hired a coach, who helped her pass on her second attempt with the maximum number of points. How do you improve your oral language skills so markedly in such a short time?
Her coach Clive Liebmann explains the leap in performance, revealing the absurdity of how the software works: “So I encourage students to exaggerate tonation in an over-the-top way and that means less making sense with grammar and vocabulary and instead focusing more on what computers are good at, which is measuring musical elements like pitch, volume and speed.”3 If pitch, volume and speed are correct, the test takers could, in extreme cases, talk utter nonsense as long as some of the vocabulary matches the topic.
Louise Kennedy did not go to a coach, but to the public. The media around the world then mocked the Australian immigration authorities. But they did not react at all. On the contrary, the company providing the automatic language tests just pointed out that the requirements for potential immigrants were very high. Of course, it was not the performance standards that prevented a young and highly qualified native speaker from getting permanent residency, it was the algorithmic system that was simply not able to process the Irish accent correctly. The voice recognition software used in Australia is not yet capable of testing sentence structure, vocabulary and the ability to logically render complex information. That is the heart of the problem. The refusal to admit the obvious makes the incident look like a parody.
Yet the story has a very serious side. Ultimately, the software does not safeguard the Australian state’s legitimate interests when it comes to immigration, nor does it provide justice for those individuals who have worked diligently in the hope of gaining a residence permit. Alice Xu and Louise Kennedy found ways to circumvent the deficient algorithmic system. One exploited the software’s weaknesses and told it exactly what it wanted to hear. The other married an Australian, allowing her to stay in the country permanently. But people should not have to adapt to meet the needs of a faulty algorithm; dysfunctional software should be adapted to meet people’s needs instead.
Wrong conclusions: Algorithms misinterpret data
Today, social media are an important source of information for many people, and users get the messages on their screen that interest them the most. Facebook, for example, tries to ensure that people will spend as much time as possible on the social network, viewing as many texts, videos and photos as possible and commenting on them. The messages that the site’s algorithms automatically present to each user should therefore be as relevant as possible to her or him. But how do you define and measure relevance?
For Facebook, the key indicator is individual user behavior. If someone lingers even a moment longer on a message, clicks on a button or calls up a video, the platform sees this as a sign of increased interest. The more intensively and the longer people interact with content, the more relevant that content must be – that, at least, is the assumption. Using this sort of analysis, the algorithm calculates who it will supply with which news from which source in the future. The problem with this is that the more disturbing a post, the more likely it is that someone will spend time with it. The software again will see this as interest and send the user additional messages of the same kind. If you wanted to measure relevance in a way that benefits society, it would have to be done differently. Basic values that are important to the common good, such as truth, diversity and social integration, play a subordinate role here at best. What counts instead is getting attention and screen time (see Chapter 13).
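For readers who think in code, the logic described above can be reduced to a small sketch. It is purely illustrative and is not Facebook's actual system: the Post fields, the weights and the function names are all assumptions made for this example. It shows only the principle that whatever holds attention longest gets scored as most relevant, whether or not it serves the user or society.

```python
from dataclasses import dataclass


@dataclass
class Post:
    """One item a user has seen (all fields are hypothetical)."""
    topic: str
    dwell_seconds: float  # how long the user lingered on the post
    clicks: int           # buttons clicked on the post
    video_plays: int      # videos started from the post


def engagement_score(post, w_dwell=1.0, w_click=5.0, w_video=10.0):
    """Weighted sum of interactions; the weights are invented for illustration."""
    return (w_dwell * post.dwell_seconds
            + w_click * post.clicks
            + w_video * post.video_plays)


def topic_preferences(history):
    """Accumulate engagement per topic from the user's past behavior."""
    totals = {}
    for post in history:
        totals[post.topic] = totals.get(post.topic, 0.0) + engagement_score(post)
    return totals


def rank_topics(candidate_topics, history):
    """Order candidate topics by past engagement, highest first."""
    prefs = topic_preferences(history)
    return sorted(candidate_topics, key=lambda t: prefs.get(t, 0.0), reverse=True)


# A user who lingered on one disturbing post is shown more of the same.
history = [
    Post("disturbing", dwell_seconds=40.0, clicks=1, video_plays=1),
    Post("local news", dwell_seconds=5.0, clicks=0, video_plays=0),
    Post("science", dwell_seconds=8.0, clicks=1, video_plays=0),
]
ranking = rank_topics(["science", "local news", "disturbing"], history)
print(ranking)
```

Nothing in this score asks whether a post is true, diverse or socially integrating; those values would have to be added deliberately as separate criteria.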
Facebook not only tries to find out what users like best but also what they do not like at all. This led to a long-standing misinterpretation because the wrong signal was being measured: If a user clicked on the “hide post” option, the algorithm interpreted this as a clear sign of dissatisfaction and accordingly did not show the person any further messages of a similar kind. This was true until 2015, when someone took a closer look and discovered that 5 percent of Facebook users were responsible for 85 percent of the hidden messages.
These so-called super hiders were a mystery. They hid almost everything that appeared in their news stream, even posts they had commented on shortly before. Surveys then revealed that the super hiders were by no means dissatisfied. They just wanted to clear away read messages, just as some people keep their inbox clean by continually deleting e-mails. Having discovered what was going on, Facebook changed its approach. Since then, it no longer necessarily interprets hiding a post as a strong signal of displeasure.4
In this case, the algorithm did what it was told, but with an unwanted result. Wrong criteria led to wrong conclusions. The algorithm was unable to detect the super-hider phenomenon. An investigation initiated and evaluated by humans was required to uncover what was truly happening. Anyone who uses algorithmic systems is well advised to regularly question and check the systems’ logic and meaningfulness.
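The kind of check that exposed the super hiders can be sketched in a few lines. This is a minimal illustration, not the analysis Facebook actually ran: the function name, the user IDs and the toy numbers are assumptions, chosen only to mirror the proportions mentioned above. The idea is to ask how much of a signal comes from a small share of users before trusting it as a measure of everyone's preferences.

```python
from collections import Counter


def hide_share_of_top_users(hide_events, all_users, top_fraction=0.05):
    """Return the share of all hide actions caused by the most active hiders.

    hide_events: list of user IDs, one entry per "hide post" action.
    all_users:   list of all user IDs, including users who never hide.
    """
    counts = Counter(hide_events)
    total = sum(counts.values())
    if total == 0 or not all_users:
        return 0.0
    # Size of the top group, at least one user.
    k = max(1, round(len(all_users) * top_fraction))
    top = counts.most_common(k)
    return sum(n for _, n in top) / total


# Toy data (invented): 20 users, one of whom hides almost everything.
users = [f"u{i}" for i in range(20)]
events = ["u0"] * 85 + [f"u{i}" for i in range(1, 16)]

share = hide_share_of_top_users(events, users)
print(f"Top 5% of users account for {share:.0%} of hides")
```

A result this lopsided does not say what the heavy hiders mean by their clicks. It only signals that the data need a human look, such as the surveys described above.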
Discriminatory data: Algorithms amplify inequalities
It is a mild spring day in Fort Lauderdale, Florida, in the