The Creativity Code. Marcus du Sautoy
It was a good idea but it didn’t work. Any conventional machine programmed on a database of accepted openings wouldn’t have known how to respond and would most likely have made a move that would have serious consequences in the grand arc of the game. But AlphaGo was not a conventional machine. It could assess the new moves and determine a good response based on what it had learned over the course of its many games. As David Silver, the lead programmer on AlphaGo, explained in the lead-up to the match: ‘AlphaGo learned to discover new strategies for itself, by playing millions of games between its neural networks, against themselves, and gradually improving.’ If anything, Sedol had put himself at a disadvantage by playing a game that was not his own.
As I watched I couldn’t help feeling for Sedol. You could see his confidence draining out of him as it gradually dawned on him that he was losing. He kept looking over at Huang, the DeepMind representative who was playing AlphaGo’s moves, but there was nothing he could glean from Huang’s face. By move 186 Sedol had to recognise that there was no way to overturn the advantage AlphaGo had built up on the board. He placed a stone on the side of the board to indicate his resignation.
By the end of day one it was: AlphaGo 1 Humans 0. Sedol admitted at the press conference that day: ‘I was very surprised because I didn’t think I would lose.’
But it was game 2 that was going to truly shock not just Sedol but every human player of the game of Go. The first game was one that experts could follow and appreciate why AlphaGo was playing the moves it was. They were moves a human champion would play. But as I watched game 2 on my laptop at home, something rather strange happened. Sedol played move 36 and then retired to the roof of the hotel for a cigarette break. While he was away, AlphaGo on move 37 instructed Huang, its human representative, to place a black stone on the line five steps in from the edge of the board. Everyone was shocked.
The conventional wisdom is that during the early part of the game you play stones on the outer four lines. The third line builds up short-term territory strength on the edge of the board while playing on the fourth line contributes to your strength later in the game as you move into the centre of the board. Players have always found that there is a fine balance between playing on the third and fourth lines. Playing on the fifth line has always been regarded as suboptimal, giving your opponent the chance to build up territory that has both short- and long-term influence.
AlphaGo had broken this orthodoxy built up over centuries of competing. Some commentators declared it a clear mistake. Others were more cautious. Everyone was intrigued to see what Sedol would make of the move when he returned from his cigarette break. As he sat down, you could see him physically flinch as he took in the new stone on the board. He was certainly as shocked as all of the rest of us by the move. He sat there thinking for over twelve minutes. Like chess, the game was being played under time constraints. Using twelve minutes of your time was very costly. It is a mark of how surprising this move was that it took Sedol so long to respond. He could not understand what AlphaGo was doing. Why had the program abandoned the region of stones they were competing over?
Was this a mistake by AlphaGo? Or did it see something deep inside the game that humans were missing? Fan Hui, who had been given the role of one of the referees, looked down on the board. His initial reaction matched everyone else’s: shock. And then he began to realise: ‘It’s not a human move. I’ve never seen a human play this move,’ he said. ‘So beautiful. Beautiful. Beautiful. Beautiful.’
Beautiful and deadly it turned out to be. Not a mistake but an extraordinarily insightful move. Some fifty moves later, as the black and white stones fought over territory from the lower left-hand corner of the board, they found themselves creeping towards the black stone of move 37. It was joining up with this stone that gave AlphaGo the edge, allowing it to clock up its second win. AlphaGo 2 Humans 0.
Sedol’s mood in the press conference that followed was notably different. ‘Yesterday I was surprised. But today I am speechless … I am in shock. I can admit that … the third game is not going to be easy for me.’ The match was being played over five games, so with AlphaGo already two up, the third game was the one Sedol needed to win to stop AlphaGo claiming the match.
The human fight-back
Sedol had a day off to recover. The third game would be played on Saturday, 12 March. He needed the rest, unlike the machine. The first game had been over three hours of intense concentration. The second lasted over four hours. You could see the emotional toll that losing two games in a row was having on him.
Rather than resting, though, Sedol stayed up till 6 a.m. the next morning analysing the games he’d lost so far with a group of fellow professional Go players. Did AlphaGo have a weakness they could exploit? The machine wasn’t the only one who could learn and evolve. Sedol felt he might learn something from his losses.
Sedol played a very strong opening to game 3, forcing AlphaGo to manage a weak group of stones within his sphere of influence on the board. Commentators began to get excited. Some said Sedol had found AlphaGo’s weakness. But then, as one commentator posted: ‘Things began to get scary. As I watched the game unfold and the realisation of what was happening dawned on me, I felt physically unwell.’
Sedol pushed AlphaGo to its limits but in so doing he revealed the hidden powers that the program seemed to possess. As the game proceeded, it started to make what commentators called lazy moves. It had analysed its position and was so confident in its win that it chose safe moves. It didn’t care if it won by half a point. All that mattered was that it won. To play such lazy moves was almost an affront to Sedol, but AlphaGo was not programmed with any vindictive qualities. Its sole goal was to win the game. Sedol pushed this way and that, determined not to give in too quickly. Perhaps one of these lazy moves was a mistake that he could exploit.
By move 176 Sedol eventually caved in and resigned. AlphaGo 3 Humans 0. AlphaGo had won the match. Backstage, the DeepMind team was going through a strange range of emotions. They’d won the match, but seeing the devastating effect it was having on Sedol made it hard for them to rejoice. The million-dollar prize was theirs. They’d already decided to donate the prize, if they won, to a range of charities dedicated to promoting Go and science subjects as well as to Unicef. Yet their human code was causing them to empathise with Sedol’s pain.
AlphaGo did not demonstrate any emotional response to its win. No little surge of electrical current. No code spat out with a resounding ‘YES!’ This lack of response is at once a source of hope for humanity and a cause for fear. Hope because it is this emotional response that is the drive to be creative and venture into the unknown: it was humans, after all, who’d programmed AlphaGo with the goal of winning. Scary because the machine won’t care if the goal turns out to be not quite what its programmers had intended.
Sedol was devastated. He came out in the press conference and apologised:
I don’t know how to start or what to say today, but I think I would have to express my apologies first. I should have shown a better result, a better outcome, and better content in terms of the game played, and I do apologize for not being able to satisfy a lot of people’s expectations. I kind of felt powerless.
But he urged people to keep watching the final two games. His goal now was to try to at least get one back for humanity.
Having lost the match, Sedol started game 4 playing far more freely. It was as if the heavy burden of expectation had been lifted, allowing him to enjoy his game. In sharp contrast