Can creativity be taught?

Many artists like to fuel their own creation myth, appealing to external forces as responsible for their creativity. In Ancient Greece poets were said to be possessed by the muses, who breathed inspiration into the minds of men, sometimes sending them insane in the process. For Plato ‘a poet is holy, and never able to compose until he has become inspired, and is beside himself and reason is no longer in him … for no art does he utter but by power divine’. Ramanujan, the great Indian mathematician, likewise attributed his insights to ideas he received in his dreams from his family goddess Namagiri. Is creativity a form of madness or a gift of the divine?

One of my mathematical heroes, Carl Friedrich Gauss, was one of the worst at covering his creative tracks. Gauss is credited with creating modern number theory with the publication in 1801 of one of the great mathematical works of all time: Disquisitiones Arithmeticae. When people tried to read the book to uncover where he got his ideas, they were mystified. The work has been described as a book of seven seals. Gauss seems to pull ideas like rabbits out of a hat, without ever really giving us an inkling of how he achieved this magic. Later, when challenged, he retorted that an architect does not leave up the scaffolding after the house is complete. Gauss, like Ramanujan, attributed one revelation to ‘the Grace of God’, saying he was ‘unable to name the nature of the thread which connected what I previously knew with that which made my success possible’.

Yet the fact that an artist may be unable to articulate where their ideas came from does not mean that they followed no rules. Art is a conscious expression of the myriad of logical gates that make up our unconscious thought processes. There was of course a thread of logic that connected Gauss’s thoughts: it was just hard for him to articulate what he was up to – or perhaps he wanted to preserve the mystery, to fuel his image as a creative genius. Coleridge’s claim that the drug-induced vision of Kubla Khan came to him in its entirety belies all the preparatory material that shows the poet working on the ideas before that fateful day when he was interrupted by the person from Porlock. Of course, this makes for a good story. Even my own account of creation will focus on the flash of inspiration rather than the years of preparatory work I put in.

We have an awful habit of romanticising creative genius. The solitary artist working in isolation is frankly a myth. In most instances what looks like a step change is actually a continuous growth. Brian Eno talks about the idea of ‘scenius’, not genius, to acknowledge the community out of which creative intelligence often emerges. The American writer Joyce Carol Oates agrees: ‘Creative work, like scientific work, should be greeted as a communal effort – an attempt by an individual to give voice to many voices, an attempt to synthesize and explore and analyze.’

What does it take to stimulate creativity? Might it be possible to program it into a machine? Are there rules we can follow to become creative? Can creativity, in other words, be a learned skill? Some would say that to teach or program is to show people how to imitate what has gone before, and that imitation and rule following are both incompatible with creativity. And yet we have examples of creative individuals all around us who have studied and learned and improved their skills. If we study what they do, could we imitate them and ultimately become creative ourselves?

These are questions I find myself asking every new semester. To receive their PhDs, doctoral candidates in mathematics have to create a new mathematical construct. They have to come up with something that has never been done before. I am tasked with teaching them how to do that. Of course, at some level they have been training to do this already. Solving problems involves personal creativity even if the answer is already known.

That training is an absolute prerequisite for the jump into the unknown. By rehearsing how others have come to their breakthroughs you hope to provide the environment to foster your own creativity. And yet that jump is far from guaranteed. I can’t take anyone off the street and teach them to be a creative mathematician. Maybe with ten years of training we could get there, but not every brain seems to be able to achieve mathematical creativity. Some people appear to be able to achieve creativity in one field but not another, yet it is difficult to understand what makes one brain a chess champion and another a Nobel Prize-winning novelist.

Margaret Boden recognises that creativity isn’t just about being Shakespeare or Einstein. She distinguishes between what she calls ‘psychological creativity’ and ‘historical creativity’. Many of us achieve acts of personal creativity that may be novel to us but are historically old news. These are what Boden calls moments of psychological creativity. It is by repeated acts of personal creativity that ultimately one hopes to produce something that is recognised by others as new and of value. While historical creativity is rare, it emerges from encouraging psychological creativity.

My recipe for eliciting creativity in students follows the three modes of creativity Boden identified. Exploration is perhaps the most obvious path. First understand how we’ve come to the place we are now and then try to push the boundaries just a little bit further. This involves deep immersion in what we have created to date. Out of that deep understanding might emerge something never seen before. It is often important to impress on students that the act of creation rarely arrives with a big bang. It is gradual. As Van Gogh wrote: ‘Great things are not done by impulse but by small things brought together.’

Boden’s second strategy, combinational creativity, is a powerful weapon, I find, in stimulating new ideas. I often encourage students to attend seminars and read papers in subjects that don’t appear to connect with the problem they are tackling. A line of thought from a disparate bit of the mathematical universe might resonate with the problem at hand and stimulate a new idea. Some of the most creative bits of science are happening today at the junctions between the disciplines. The more we can come out of our silos and share our ideas and problems, the more creative we are likely to be. This is where a lot of the low-hanging fruit is to be found.

At first sight transformational creativity seems hard to harness as a strategy. But again the goal is to test the status quo by dropping some of the constraints that have been put in place. Try seeing what happens if we change one of the basic rules we have accepted as part of the fabric of our subject. These are dangerous moments because you can collapse the system, but this brings me to one of the most important ingredients needed to foster creativity – and that is embracing failure.

Unless you are prepared to fail, you will not take the risks that will allow you to break out and create something new. This is why our education system and our business environment, both realms that abhor failure, are often terrible environments for fostering creativity. It is important to celebrate the failures as much as the successes in my students. Sure, the failures won’t make it into the PhD thesis, but we learn so much from failure. When I meet with my students I repeat again and again Beckett’s call: ‘Try again. Fail again. Fail better.’

Are these strategies that can be written into code? In the past the top-down approach to coding meant there was little prospect of creativity in the output of the code. Coders were never too surprised by what their algorithms produced. There was no room for experimentation or failure. But all this changed recently, when an algorithm built on code that learns from its failures did something new, shocked its creators, and produced a result of incredible value. This algorithm won a game that many believed was beyond the abilities of a machine to master. It was a game that required creativity to play.

It was news of this breakthrough that triggered my recent existential crisis as a mathematician.

3

READY, STEADY, GO

We construct and construct, but intuition is still a good thing.

Paul Klee

People often compare mathematics to playing chess. There certainly are connections, but when Deep Blue beat the best chess player the human race could offer in 1997, it did not lead to the closure of mathematics departments. Although chess is a good analogy for the formal quality of constructing a proof, there is another game that mathematicians have regarded as much closer to the creative and intuitive side of being a mathematician, and that is the Chinese game of Go.

I first discovered Go when I visited the mathematics department at Cambridge as an undergraduate to explore whether to do my PhD with the amazing group that had helped complete the classification of finite simple groups, a sort of Periodic Table of Symmetry. As I sat talking to John Conway and Simon Norton, two of the architects of this great project, about the future of mathematics, I kept being distracted by students at the next table furiously slamming black and white stones onto a large 19×19 grid carved into a wooden board.

Eventually I asked Conway what they were doing. ‘That’s Go. It’s the oldest game that is still being played to this day.’ In contrast to the war-like quality of chess, he explained, Go was a game of territory. Players take it in turn to place white and black pieces or stones onto the 19×19 grid. If you manage to surround a collection of your opponent’s stones with your own, you capture your opponent’s stones. The winner is the player who has captured the most stones by the end of the game. It sounded rather simple. The subtlety of the game, Conway explained, is that as you try to surround your opponent, you must avoid having your own stones captured.

‘It’s a bit like mathematics: simple rules that give rise to beautiful complexity.’ It was while watching the game evolve between two experts as they drank coffee in the common room that Conway discovered that the endgame was behaving like a new sort of number that he christened ‘surreal numbers’.

I’ve always been fascinated by games. Whenever I travel abroad I like to learn and bring back the game locals like to play. So when I got back from the wild outer reaches of Cambridge to the safety of my home in Oxford I decided to buy a Go set from the local toy shop to see what it was that was obsessing these students. As I began to explore the game with one of my fellow students in Oxford, I realised how subtle it was. It was hard to identify a clear strategy that would help me win. And as more stones were laid down on the board, the game seemed to get more complicated, unlike chess, where as pieces are gradually removed the game starts to simplify.

The American Go Association estimates that it would take a number with 300 digits to count the number of games of Go that are legally possible. In chess the computer scientist Claude Shannon estimated that a number with 120 digits (now called the Shannon number) would suffice. Neither is a small number, but the gap between them gives you a sense of how much vaster the space of possible Go games is.
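To put the two estimates side by side, here is the back-of-the-envelope comparison; the division is my own illustration, using only the digit counts quoted above:

```latex
\[
\frac{\text{legal games of Go}}{\text{Shannon number for chess}}
  \approx \frac{10^{300}}{10^{120}} = 10^{180}.
\]
```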

I had played a lot of chess as a kid. I enjoyed working through the logical consequences of a proposed move. It appealed to the mathematician that was growing inside me. The tree of possibilities in chess branches in a controlled manner, making it manageable for a computer and even a human to analyse the implications of going down different branches. In contrast Go just doesn’t seem like a game that would allow you to work out the logical implications of a future move. Navigating the tree of possibilities quickly becomes impossible. That’s not to say that a Go player doesn’t follow through the logical consequences of their next move, but this seems to be combined with a more intuitive feel for the pattern of play.

The human brain is acutely attuned to finding structure and pattern in a visual image, if there is any to be found. A Go player can look at the lie of the stones and tap into the brain’s ability to pick out these patterns and exploit them in planning the next move. Computers have traditionally struggled with vision. It is one of the big hurdles that engineers have wrestled with for decades.

The human brain’s highly developed sense of visual structure has been honed over millions of years and has been key to our survival. Any animal’s ability to survive depends in part on its ability to pick out structure in the visual mess that Nature confronts us with. A pattern in the chaos of the jungle is likely to be evidence of the presence of another animal – and you’d better take notice, because that animal might eat you (or maybe you could eat it). The human code is extremely good at reading patterns, interpreting how they might develop, and responding appropriately. It is one of our key assets, and it plays into our appreciation for the patterns in music and art.

It turns out that pattern recognition is precisely what I do as a mathematician when I venture into the unexplored reaches of the mathematical jungle. I can’t rely on a simple step-by-step logical analysis of the local environment. That won’t get me very far. It has to be combined with an intuitive feel for what might be out there. That intuition is built up by time spent exploring the known space. But it is often hard to articulate logically why you might believe that there is interesting territory out there to explore. A conjecture in mathematics is by its nature not yet proved, but the mathematician who has made the conjecture has built up a feeling that the mathematical statement they have made may have some truth to it. Observation and intuition go hand in hand as we navigate the thickets and seek to carve out a new path.

A mathematician who can make a good conjecture will often garner more respect than one who joins up the logical dots to reveal the truth of the conjecture. In the game of Go the final winning position is in some respects the conjecture and the plays are the logical moves on your way to proving that conjecture. But it is devilishly hard to spot the patterns along the way.

And so, although chess has been useful to help explain some aspects of mathematics, the game of Go has always been held up as far closer in spirit to the way mathematicians actually go about their business. That’s why mathematicians weren’t too worried when Deep Blue beat the best humans could offer at chess. The real challenge was the game of Go. For decades people had been claiming that the game of Go could never be played by a computer. Like all good absolutes, it invited creative coders to test that proposition. But even a junior player could outplay the most complex algorithms. And so mathematicians happily hid behind the cover that Go was providing them. If a computer couldn’t play Go then there was no chance it could play the even subtler and more ancient game of mathematics.

But just as the Great Wall of China was eventually breached, my defensive wall has just crumbled in spectacular fashion.

Game Boy extraordinaire

At the beginning of 2016 it was announced that a program had been created to play Go that its developers were confident could hold its own against the best humans had to offer. Go players around the world were extremely sceptical, given the failure of past efforts. So the company that developed the program offered a challenge. It set up a public contest with a huge prize and invited one of the world’s leading Go players to take up the challenge. An international champion, Lee Sedol from Korea, stepped forward. The competition would be played over five games with the winner taking home a prize of one million dollars. The name of Sedol’s challenger: AlphaGo.

AlphaGo is the brainchild of Demis Hassabis. Hassabis was born in London in 1976 to a Greek Cypriot father and a mother from Singapore. Both parents are teachers and what Hassabis describes as bohemian technophobes. His sister and brother went the creative route, one becoming a composer, the other choosing creative writing. So Hassabis isn’t quite sure where his geeky scientific side came from. But as a kid Hassabis was someone who quickly marked himself out as gifted and talented, especially when it came to playing games. His abilities at chess were such that at eleven he was the second-highest-ranked child of his age in the world.

But then at an international match in Liechtenstein that year Hassabis had an epiphany: what on earth were they all doing? The hall was full of so many great minds exploring the logical intricacies of this great game. And yet Hassabis suddenly recognised the total futility of such a project. In a radio interview on the BBC he admitted thinking at the time: ‘We were wasting our minds. What if we used that brain power for something more useful like solving cancer?’

His parents were pretty shocked when after the tournament (which he narrowly lost after battling for ten hours with the adult Dutch world champion) he announced that he was giving up chess competitions. Everyone had thought this was going to be his life. But those years playing chess weren’t wasted. A few years earlier he’d used the £200 prize money he’d won for beating a US opponent, Alex Chang, to buy his first computer: a ZX Spectrum. That computer sparked his obsession with getting machines to do the thinking for him.

Hassabis soon graduated on to a Commodore Amiga, which could be programmed to play the games he enjoyed. Chess was still too complicated, but he managed to program the Commodore to play Othello, a game that looks rather similar to Go with black and white stones that get flipped when they are trapped between stones of the opposite colour. It’s not a game that merits grandmasters, so he tried his program out on his younger brother. It beat him every time.

This was classic ‘if …, then …’ programming: he needed to code in by hand the response to each of his opponent’s moves. It was: ‘If your opponent plays that move, then reply with this move.’ The creativity all came from Hassabis and his ability to see what the right responses were to win the game. It still felt a bit like magic though. Code up the right spell and then, rather like the Sorcerer’s Apprentice, the Commodore would go through the work of winning the game.
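To make that style of programming concrete, here is a toy sketch in Python of hand-coded ‘if …, then …’ play. The situations and replies are invented for illustration; nothing here is Hassabis’s actual Othello code.

```python
# A toy illustration of hand-coded 'if ..., then ...' game play: every
# response is authored in advance by the programmer. The situations and
# replies below are made up for illustration.
RESPONSES = {
    'opponent takes a corner': 'defend the adjacent edge',
    'opponent takes an edge':  'strengthen the centre',
    'opponent takes centre':   'grab a corner',
}

def reply(opponent_move):
    # If your opponent plays that move, then reply with this move.
    return RESPONSES.get(opponent_move, 'pass')

print(reply('opponent takes a corner'))   # defend the adjacent edge
```

All the intelligence lives in the table: a program like this can never produce a reply its author did not foresee.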

Hassabis raced through school, culminating with an offer from Cambridge to study computer science at the age of sixteen. He’d set his heart on Cambridge after seeing Jeff Goldblum in the film The Race for the Double Helix. ‘I thought, is this what goes on at Cambridge? You go there and you invent DNA in the pub? Wow.’

Cambridge wouldn’t let him start his degree at the age of sixteen, so he had to defer for a year. To fill his time he won a place working for a game developer after having come second in a competition run by Amiga Power magazine. While he was there, he created his own game, Theme Park, where players had to build and run their own theme park. The game was hugely successful, selling several million copies and winning a Golden Joystick award. With enough funds to finance his time at university, Hassabis set off for Cambridge.

His course introduced him to the greats of the AI revolution: Alan Turing and his test for intelligence, Arthur Samuel and his program to play draughts, John McCarthy, who coined the term artificial intelligence, Frank Rosenblatt and his first experiments with neural networks. These were the shoulders on which Hassabis aspired to stand. It was while sitting in his lectures at Cambridge that he heard his professor repeating the mantra that a computer could never play Go because of the game’s creative and intuitive characteristics. This was like a red rag to the young Hassabis. He left Cambridge determined to prove his professor wrong.

His idea was that rather than trying to write a program himself that could play Go, he would write a meta-program that would be responsible for writing the program that would play Go. It sounded a crazy idea, but the point was that the meta-program would be created so that as the Go-playing program played more and more games it would learn from its mistakes.

Hassabis had learned about a similar idea implemented by the artificial-intelligence researcher Donald Michie in the 1960s. Michie had written an algorithm called ‘MENACE’ that learned from scratch the best strategy to play noughts and crosses. (MENACE stood for Machine Educable Noughts And Crosses Engine.) To demonstrate the algorithm, Michie had rigged up 304 matchboxes representing all the possible layouts of noughts and crosses encountered while playing. Each matchbox was filled with different-coloured balls to represent possible moves. Balls were removed or added to the boxes to punish losses or reward wins. As the algorithm played more and more games, the reassignment of the balls eventually led to an almost perfect strategy for playing. It was this idea of learning from your mistakes that Hassabis wanted to use to train an algorithm to play Go.
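For the curious, here is a minimal sketch in Python of a MENACE-style learner. The bead counts, rewards and the random opponent are my own illustrative choices rather than Michie’s exact setup, and of course the real MENACE used physical matchboxes, not code.

```python
# A MENACE-style learner for noughts and crosses: one 'matchbox' of
# weighted moves per board position, reinforced after each game.
# Bead numbers, rewards and the random opponent are illustrative assumptions.
import random
from collections import defaultdict

LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8), (0, 3, 6),
         (1, 4, 7), (2, 5, 8), (0, 4, 8), (2, 4, 6)]

def winner(board):
    for a, b, c in LINES:
        if board[a] != ' ' and board[a] == board[b] == board[c]:
            return board[a]
    return 'draw' if ' ' not in board else None

boxes = defaultdict(dict)  # board state -> {move: number of beads}

def menace_move(board):
    state = ''.join(board)
    if not boxes[state]:  # first visit: fill the box with beads
        boxes[state] = {i: 3 for i, sq in enumerate(board) if sq == ' '}
    moves = list(boxes[state])
    weights = [boxes[state][m] for m in moves]
    return state, random.choices(moves, weights=weights)[0]

def play_game():
    board, history, player = [' '] * 9, [], 'X'  # MENACE plays X
    while winner(board) is None:
        if player == 'X':
            state, move = menace_move(board)
            history.append((state, move))
        else:  # the opponent here just plays uniformly at random
            move = random.choice([i for i, sq in enumerate(board) if sq == ' '])
        board[move] = player
        player = 'O' if player == 'X' else 'X'
    return winner(board), history

def reinforce(result, history):
    for state, move in history:
        if result == 'X':    # reward a win: add beads for each move made
            boxes[state][move] += 3
        elif result == 'O':  # punish a loss: remove a bead (keep at least one)
            boxes[state][move] = max(1, boxes[state][move] - 1)
        # draws are left unchanged in this simplified version

for _ in range(20000):
    reinforce(*play_game())
```

After enough games the beads concentrate on the strong replies, just as the balls did in Michie’s matchboxes.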

Hassabis had a good model to base his strategy on. A newborn baby does not have a brain that is pre-programmed to cope with making its way through life. It is programmed instead to learn as it interacts with its environment.

If Hassabis was going to tap into the way the brain learned to solve problems, then knowing how the brain works was clearly going to help in his dream of creating a program to play Go. So he decided to do a PhD in neuroscience at University College London. It was during coffee breaks from lab work that Hassabis started discussing with a neuroscientist, Shane Legg, his plans to create a company to try out his ideas. It says something about the low status of AI even a decade ago that they never admitted to their professors their dream of dedicating their lives to it. But they felt they were on to something big, so in September 2010 the two scientists decided to create a company with Mustafa Suleyman, a friend of Hassabis from childhood. DeepMind was incorporated.

The company needed money but initially Hassabis just couldn’t raise any capital. Pitching a plan to play games and solve intelligence did not sound serious to most investors. A few, however, did see the vision. Among those who put money in right at the outset were Elon Musk and Peter Thiel. Thiel had never invested outside Silicon Valley and tried to persuade Hassabis to relocate to the West Coast. A born-and-bred Londoner, Hassabis held his ground, insisting that there was untapped talent in London waiting to be harnessed. Hassabis remembers a crazy conversation he had with Thiel’s lawyer. ‘Does London have law on IP?’ she asked innocently. ‘I think they thought we were coming from Timbuctoo!’ The founders had to give up a huge amount of stock to the investors, but they had their money to start trying to crack AI.

The challenge of creating a machine that could learn to play Go still felt like a distant dream. They set their sights at first on a seemingly less cerebral goal: playing 1980s Atari games. Atari is probably responsible for a lot of students flunking courses in the late 1970s and early 1980s. I certainly remember wasting a huge amount of time playing the likes of Pong, Space Invaders and Asteroids on a friend’s Atari 2600 console. The console was one of the first whose hardware could play multiple games that were loaded via a cartridge. It allowed a whole range of different games to be developed over time. Previous consoles could only play games that had been physically programmed into the units.

One of my favourite Atari games was called Breakout. A wall of coloured bricks was at the top of the screen and you controlled a paddle at the bottom that could be moved left or right using a joystick. A ball would bounce off the paddle and head towards the bricks. Each time it hit a brick, the brick would disappear. The aim was to clear the bricks. The yellow bricks at the bottom of the wall scored one point. The red bricks on top got you seven points. As you cleared bricks, the paddle would shrink and the ball would speed up to make the game play harder.

We were particularly pleased one afternoon when we found a clever way to hack the game. If you dug a tunnel up through the bricks on the edge of the screen, once the ball made it through to the top it bounced back and forward off the top of the screen and the upper high-scoring bricks, gradually clearing the wall. You could sit back and watch until the ball eventually came back down through the wall. You just had to be ready with the paddle to bat the ball back up again. It was a very satisfying strategy!

Hassabis and the team he was assembling also spent a lot of time playing computer games in their youth. Their parents may be happy to know that the time and effort they put into those games did not go to waste. It turned out that Breakout was a perfect test case to see if the team at DeepMind could program a computer to learn how to play games. It would have been a relatively straightforward job to write a program for each individual game. Hassabis and his team were going to set themselves a much greater challenge.

They wanted to write a program that would receive as an input the state of the pixels on the screen and the current score and set it to play with the goal of maximising the score. The program was not told the rules of the game: it had to experiment randomly with different ways of moving the paddle in Breakout or firing the laser cannon at the descending aliens of Space Invaders. Each time it made a move it could assess whether the move had helped increase the score or had had no effect.

The code implements an idea dating from the 1990s called reinforcement learning, which aims to update the probability of actions based on their effect on a reward function or score. For example, in Breakout the only decision is whether to move the paddle at the bottom left or right. Initially the choice will be 50:50. But if moving the paddle randomly results in it hitting the ball, then a short time later the score goes up. The code then recalibrates the probability of whether to go left or right based on this new information. It will increase the chance of heading in the direction towards which the ball is heading. The new feature was to combine this learning with neural networks that would assess the state of the pixels and decide which features correlated with an increase in score.
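Here is that probability update in miniature, as a toy Python sketch of the single left-or-right decision. It captures only the reinforcement idea, not DeepMind’s actual deep Q-network: the learning rate and the simulated reward are invented for illustration, and the real system fed pixels through a neural network rather than tracking two probabilities.

```python
# The reinforcement idea in miniature: one decision (paddle left or right),
# with the probabilities recalibrated whenever a reward arrives.
# Learning rate and simulated reward are illustrative assumptions.
import random

prefs = {'left': 0.5, 'right': 0.5}  # start at 50:50, as described above
LEARNING_RATE = 0.1

def choose_action():
    return random.choices(list(prefs), weights=list(prefs.values()))[0]

def update(action, reward):
    """Shift probability towards actions followed by a score increase."""
    if reward > 0:
        prefs[action] += LEARNING_RATE * (1.0 - prefs[action])
    else:
        prefs[action] -= LEARNING_RATE * prefs[action]
    total = sum(prefs.values())      # renormalise so probabilities sum to 1
    for a in prefs:
        prefs[a] /= total

# Pretend the ball keeps drifting left, so moving left earns the reward.
for _ in range(100):
    action = choose_action()
    reward = 1 if action == 'left' else 0  # stand-in for the game score
    update(action, reward)

print(prefs)  # the probability of 'left' has climbed well above 0.5
```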

At the outset, because the computer was just trying random moves, it was terrible, hardly scoring anything. But each time it made a random move that bumped up the score, it would remember the move and reinforce the use of such a move in future. Gradually the random moves disappeared and a more informed set of moves began to emerge: moves that the program had learned through experiment would boost its score.

It’s worth watching the supplementary video the DeepMind team included in the paper they eventually wrote. It shows the program learning to play Breakout. At first you see it randomly moving the paddle back and forward to see what will happen. Then, when the ball finally hits the paddle and bounces back and hits a brick and the score goes up, the program starts to rewrite itself. If the pixels of the ball and the pixels of the paddle connect, that seems to be a good thing. After 400 game plays it’s doing really well, getting the paddle to continually bat the ball back and forward.

The shock for me came when you see what it discovered after 600 games. It found our hack! I’m not sure how many games it took us as kids to find this trick, but judging by the amount of time I wasted with my friend it could well have been more. But there it is. The program manipulated the paddle to tunnel its way up the sides, such that the ball would be stuck in the gap between the top of the wall and the top of the screen. At this point the score goes up very fast without the computer’s having to do very much. I remember my friend and I high-fiving when we’d discovered this trick. The machine felt nothing.

By 2014, four years after the creation of DeepMind, the program had learned how to outperform humans on twenty-nine of the forty-nine Atari games it had been exposed to. The paper the team submitted to Nature detailing their achievement was published in early 2015. To be published in Nature is one of the highlights of a scientist’s career. But their paper achieved the even greater accolade of being featured as the cover story of the whole issue. The journal recognised that this was a huge moment for artificial intelligence.

It has to be reiterated what an amazing feat of programming this was. From just the raw data of the state of the pixels and the changing score, the program had changed itself from randomly moving the paddle of Breakout back and forth to learning that tunnelling the sides of the wall would win you the top score. But Atari games are hardly on a par with the ancient game of Go. Hassabis and his team at DeepMind decided they were ready to create a new program that could take it on.

It was at this moment that Hassabis decided to sell the company to Google. ‘We weren’t planning to, but three years in, focused on fundraising, I had only ten per cent of my time for research,’ he explained in an interview in Wired at the time. ‘I realised that there’s maybe not enough time in one lifetime to both build a Google-sized company and solve AI. Would I be happier looking back on building a multi-billion business or helping solve intelligence? It was an easy choice.’ The sale put Google’s firepower at his fingertips and provided the space for him to create code to realise his goal of solving Go … and then intelligence.

First blood

Previous computer programs built to play Go had not come close to playing competitively against even a pretty good amateur, so most pundits were highly sceptical of DeepMind’s dream to create code that could get anywhere near an international champion of the game. Most people still agreed with the view expressed in The New York Times by the astrophysicist Piet Hut after Deep Blue’s success at chess in 1997: ‘It may be a hundred years before a computer beats humans at Go – maybe even longer. If a reasonably intelligent person learned to play Go, in a few months he could beat all existing computer programs. You don’t have to be a Kasparov.’

Just two decades into that hundred years, the DeepMind team believed they might have cracked the code. Their strategy of getting algorithms to learn and adapt appeared to be working, but they were unsure quite how powerful the emerging algorithm really was. So in October 2015 they decided to test-run their program in a secret competition against the current European champion, the Chinese-born Fan Hui.

AlphaGo destroyed Fan Hui five games to nil. But the gulf between European players of the game and those in the Far East is huge. The top European players, when put in a global league, rank in the 600s. So, although it was still an impressive achievement, it was like building a driverless car that could beat a human driving a Ford Fiesta round Silverstone and then using it to challenge Lewis Hamilton in a Grand Prix.

Certainly when the press in the Far East heard about Fan Hui’s defeat they were merciless in their dismissal of how meaningless the win was for AlphaGo. Indeed, when Fan Hui’s wife contacted him in London after the news got out, she begged her husband not to go online. Needless to say he couldn’t resist. It was not a pleasant experience to read how dismissive the commentators in his home country were of his credentials to challenge AlphaGo.

Fan Hui credits his matches with AlphaGo with teaching him new insights into how to play the game. In the following months his ranking went from 633 to the 300s. But it wasn’t only Fan Hui who was learning. Every game AlphaGo plays affects its code and changes it to improve its play next time around.

It was at this point that the DeepMind team felt confident enough to offer their challenge to Lee Sedol, South Korea’s eighteen-time world champion and a formidable player of the game.

The match was to be played over five games scheduled between 9 and 15 March 2016 at the Four Seasons hotel in Seoul, and would be broadcast live across the internet. The winner would receive a prize of a million dollars. Although the venue was public, the precise location within the hotel was kept secret and was isolated from noise – not that AlphaGo was going to be disturbed by the chitchat of the press and the whispers of curious bystanders. It would assume a perfect Zen-like state of concentration wherever it was placed.

Sedol wasn’t fazed by the news that he was up against a machine that had beaten Fan Hui. Following Fan Hui’s loss he had declared: ‘Based on its level seen … I think I will win the game by a near landslide.’

Although he was aware of the fact that the machine he would be playing was learning and evolving, this did not concern him. But as the match approached, you could hear doubts beginning to creep into his view of whether AI would ultimately prove too powerful for humans to defeat, even in the game of Go. In February he stated: ‘I have heard that DeepMind’s AI is surprisingly strong and getting stronger, but I am confident that I can win … at least this time.’

Most people still felt that despite great inroads into programming, an AI Go champion was still a distant goal. Rémi Coulom, the creator of Crazy Stone, the only program to get close to playing Go at any high standard, was still predicting another decade before computers would beat the best humans at the game.

As the date for the match approached, the team at DeepMind felt they needed someone to really stretch AlphaGo and to test it for any weaknesses. So in the final few weeks they invited Fan Hui back to play the machine. Despite having suffered a 5–0 defeat and being humiliated by the press back in China, he was keen to help out. Perhaps a bit of him felt that if he could help make AlphaGo good enough to beat Sedol, it would make his defeat less humiliating.

As Fan Hui played he could see that AlphaGo was extremely strong in some areas but he managed to reveal a weakness that the team was not aware of. There were certain configurations in which it seemed to completely fail to assess who had control of the game, often becoming totally delusional that it was winning when the opposite was true. If Sedol tapped into this weakness, AlphaGo wouldn’t just lose, it would appear extremely stupid.

The DeepMind team worked around the clock trying to fix this blind spot. Eventually they just had to lock down the code as it was. It was time to ship the laptop they were using to Seoul.

The stage was set for a fascinating duel as the players, or at least one player, sat down on 9 March to play the first of the five games.

‘Beautiful. Beautiful. Beautiful’

It was with a sense of existential anxiety that I fired up the YouTube channel broadcasting the matches that Sedol would play against AlphaGo and joined 280 million other viewers to see humanity take on the machines. Having for years compared creating mathematics to playing the game of Go, I had a lot on the line.

Lee Sedol picked up a black stone and placed it on the board and then waited for the response. Aja Huang, a member of the DeepMind team, would play the physical moves for AlphaGo. This, after all, was not a test of robotics but of artificial intelligence. Huang stared at AlphaGo’s screen, waiting for its response to Sedol’s first stone. But nothing came.

We all stared at our screens wondering if the program had crashed! The DeepMind team was also beginning to wonder what was up. The opening moves are generally something of a formality. No human would think so long over move 2. After all, there was nothing really to go on yet. What was happening? And then a white stone appeared on the computer screen. It had made its move. The DeepMind team breathed a huge sigh of relief. We were off! Over the next couple of hours the stones began to build up across the board.

One of the problems I had as I watched the game was assessing who was winning at any given point. It turns out that this isn’t just because I’m not a very experienced Go player. It is a characteristic of the game. Indeed, this is one of the main reasons why programming a computer to play Go is so hard. There isn’t an easy way to turn the current state of the game into a robust scoring system of who leads by how much.

Chess, by contrast, is much easier to score as you play. Each piece has a different numerical value which gives you a simple first approximation of who is winning. Chess is destructive. One by one pieces are removed so the state of the board simplifies as the game proceeds. But Go increases in complexity as you play. It is constructive. The commentators kept up a steady stream of observations but struggled to say if anyone was in the lead right up until the final moments of the game.
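That first approximation is easy to state in code. Here is a minimal sketch using the standard textbook piece values; real chess engines layer positional judgement on top of this, and the sample position is invented.

```python
# A sketch of the first approximation described above: sum the standard
# piece values for each side. Real engines add positional terms on top.
PIECE_VALUES = {'p': 1, 'n': 3, 'b': 3, 'r': 5, 'q': 9}  # kings score 0

def material_balance(pieces):
    """pieces: letters for every piece on the board, uppercase for White,
    lowercase for Black. Positive means White leads in material."""
    score = 0
    for piece in pieces:
        value = PIECE_VALUES.get(piece.lower(), 0)
        score += value if piece.isupper() else -value
    return score

# An invented position: White has an extra knight, Black an extra pawn.
print(material_balance(['Q', 'N', 'P', 'P', 'q', 'p', 'p', 'p']))  # +2
```

No comparably simple running score exists for Go, which is exactly the difficulty described above.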

What they were able to pick up quite quickly was Sedol’s opening strategy. If AlphaGo had learned to play on games that had been played in the past, then Sedol was working on the principle that it would put him at an advantage if he disrupted the expectations it had built up by playing moves that were not in the conventional repertoire. The trouble was that this required Sedol to play an unconventional game – one that was not his own.

It was a good idea but it didn’t work. Any conventional machine programmed on a database of accepted openings wouldn’t have known how to respond and would most likely have made a move that would have serious consequences in the grand arc of the game. But AlphaGo was not a conventional machine. It could assess the new moves and determine a good response based on what it had learned over the course of its many games. As David Silver, the lead programmer on AlphaGo, explained in the lead-up to the match: ‘AlphaGo learned to discover new strategies for itself, by playing millions of games between its neural networks, against themselves, and gradually improving.’ If anything, Sedol had put himself at a disadvantage by playing a game that was not his own.

As I watched I couldn’t help feeling for Sedol. You could see his confidence draining out of him as it gradually dawned on him that he was losing. He kept looking over at Huang, the DeepMind representative who was playing AlphaGo’s moves, but there was nothing he could glean from Huang’s face. By move 186 Sedol had to recognise that there was no way to overturn the advantage AlphaGo had built up on the board. He placed a stone on the side of the board to indicate his resignation.

By the end of day one it was: AlphaGo 1 Humans 0. Sedol admitted at the press conference that day: ‘I was very surprised because I didn’t think I would lose.’

But it was game 2 that was going to truly shock not just Sedol but every human player of the game of Go. The first game was one that experts could follow and appreciate why AlphaGo was playing the moves it was. They were moves a human champion would play. But as I watched game 2 on my laptop at home, something rather strange happened. Sedol played move 36 and then retired to the roof of the hotel for a cigarette break. While he was away, AlphaGo on move 37 instructed Huang, its human representative, to place a black stone on the line five steps in from the edge of the board. Everyone was shocked.

The conventional wisdom is that during the early part of the game you play stones on the outer four lines. The third line builds up short-term territory strength on the edge of the board while playing on the fourth line contributes to your strength later in the game as you move into the centre of the board. Players have always found that there is a fine balance between playing on the third and fourth lines. Playing on the fifth line has always been regarded as suboptimal, giving your opponent the chance to build up territory that has both short- and long-term influence.

AlphaGo had broken this orthodoxy, built up over centuries of play. Some commentators declared it a clear mistake. Others were more cautious. Everyone was intrigued to see what Sedol would make of the move when he returned from his cigarette break. As he sat down, you could see him physically flinch as he took in the new stone on the board. He was certainly as shocked as all of the rest of us by the move. He sat there thinking for over twelve minutes. As in chess, the game was being played under time constraints, and using twelve minutes of your time was very costly. It is a mark of how surprising this move was that it took Sedol so long to respond. He could not understand what AlphaGo was doing. Why had the program abandoned the region of stones they were competing over?

Was this a mistake by AlphaGo? Or did it see something deep inside the game that humans were missing? Fan Hui, who had been given the role of one of the referees, looked down on the board. His initial reaction matched everyone else’s: shock. And then he began to realise: ‘It’s not a human move. I’ve never seen a human play this move,’ he said. ‘So beautiful. Beautiful. Beautiful. Beautiful.’

Beautiful and deadly it turned out to be. Not a mistake but an extraordinarily insightful move. Some fifty moves later, as the black and white stones fought over territory from the lower left-hand corner of the board, they found themselves creeping towards the black stone of move 37. It was joining up with this stone that gave AlphaGo the edge, allowing it to clock up its second win. AlphaGo 2 Humans 0.

Sedol’s mood in the press conference that followed was notably different. ‘Yesterday I was surprised. But today I am speechless … I am in shock. I can admit that … the third game is not going to be easy for me.’ The match was being played over five games. This was the game that Sedol needed to win to be able to stop AlphaGo claiming the match.

The human fight-back

Sedol had a day off to recover. The third game would be played on Saturday, 12 March. He needed the rest, unlike the machine. The first game had been over three hours of intense concentration. The second lasted over four hours. You could see the emotional toll that losing two games in a row was having on him.

Rather than resting, though, Sedol stayed up till 6 a.m. the next morning analysing the games he’d lost so far with a group of fellow professional Go players. Did AlphaGo have a weakness they could exploit? The machine wasn’t the only one who could learn and evolve. Sedol felt he might learn something from his losses.

Sedol played a very strong opening to game 3, forcing AlphaGo to manage a weak group of stones within his sphere of influence on the board. Commentators began to get excited. Some said Sedol had found AlphaGo’s weakness. But then, as one commentator posted: ‘Things began to get scary. As I watched the game unfold and the realisation of what was happening dawned on me, I felt physically unwell.’

Sedol pushed AlphaGo to its limits but in so doing he revealed the hidden powers that the program seemed to possess. As the game proceeded, it started to make what commentators called lazy moves. It had analysed its position and was so confident in its win that it chose safe moves. It didn’t care if it won by half a point. All that mattered was that it won. To play such lazy moves was almost an affront to Sedol, but AlphaGo was not programmed with any vindictive qualities. Its sole goal was to win the game. Sedol pushed this way and that, determined not to give in too quickly. Perhaps one of these lazy moves was a mistake that he could exploit.

By move 176 Sedol eventually caved in and resigned. AlphaGo 3 Humans 0. AlphaGo had won the match. Backstage, the DeepMind team was going through a strange range of emotions. They’d won the match, but seeing the devastating effect it was having on Sedol made it hard for them to rejoice. The million-dollar prize was theirs. They’d already decided to donate the prize, if they won, to a range of charities dedicated to promoting Go and science subjects as well as to Unicef. Yet their human code was causing them to empathise with Sedol’s pain.

AlphaGo did not demonstrate any emotional response to its win. No little surge of electrical current. No code spat out with a resounding ‘YES!’ It is this lack of response that gives humanity hope and is also scary at the same time. Hope because it is this emotional response that is the drive to be creative and venture into the unknown: it was humans, after all, who’d programmed AlphaGo with the goal of winning. Scary because the machine won’t care if the goal turns out to be not quite what its programmers had intended.

Sedol was devastated. He came out in the press conference and apologised:

I don’t know how to start or what to say today, but I think I would have to express my apologies first. I should have shown a better result, a better outcome, and better content in terms of the game played, and I do apologize for not being able to satisfy a lot of people’s expectations. I kind of felt powerless.

But he urged people to keep watching the final two games. His goal now was to try to at least get one back for humanity.