You and AI – the history, capabilities and frontiers of AI

Well hello, everybody, and welcome to the Royal Society. My name is Andy Hopper,
I’m the Treasurer, but perhaps of relevance to this, I’m Professor of Computer Technology
at Cambridge University as well. The Society’s been around
for a little while and over those centuries,
350-something years, it has played its part,
its Fellows have played their parts, in some of the most
important discoveries and actually their practical use
over the years as well. In April 2017, about a year ago, we launched a report
on machine learning. Actually, it’s a series of reports on
what might be called the digital area: on cybersecurity,
on machine learning, actually on teaching
computer science, that sort of thing. However, our report
on machine learning called for action in a number of areas
over the next years. We use the phrase
‘careful stewardship’ in relation to machine learning data
and that sort of thing to ensure that the benefits
of this technology are felt right across society and that we encourage,
facilitate, participate in a public debate on this broad topic and discuss how the benefits
are distributed, as well as trying to think ahead
of some of, perhaps, the dangers and other things. This series of events and lectures,
which is supported by DeepMind, we hope will help develop
a public conversation about machine learning, AI and so on, and provide a forum for a debate
about the way these technologies may, actually will, already do, affect the
lives of everybody on the planet. So it’s great to see you here
at what is our first event. We have Demis Hassabis, a superstar, to give our first lecture
in this series, and I’m very pleased to say he was
an undergraduate in my department so, you know, boy done good.
(laughter) We like that sort of thing
in my parts, well, everywhere else, I guess,
as well, so that’s good, but then he went on to do
a PhD in neuroscience at UCL. A very interesting thing
which you will see, actually, comes together in his work. And then he did a couple of things,
but co-founded DeepMind in 2010. He’s very distinguished, he’s received a Fellowship
of the Royal Academy of Engineering, also the Silver Medal, and is a fellow
of the Royal Society of Arts as well. In 2014, DeepMind was acquired
by Google and has grown enormously. I note it retains the name DeepMind, which I think is
a very interesting positive, but also has activities
in Edmonton, Montreal, and an applied team in Mountain View. So Demis has done well,
to the point that, for example, on one hand, Time listed him as one
of the 100 most influential
people in the world, but also he was awarded a CBE for
services to science and technology. So welcome, Demis,
and we look forward to all our minds being improved
on your favourite topic. Thank you very much. (applause)

Thank you, Andy,
for that very kind introduction, and thank you all for taking the time
to come this evening. It’s great to see you all here. We’re very proud at DeepMind
to be supporting this very important lecture series
here at the Royal Society. We think that,
given the potential of AI to both transform
and disrupt our lives, we think it’s incredibly important that there’s a public debate
and public engagement between the researchers
at the forefront of this field and the broader public, and we think that’s very critical
going forwards as more and more of this technology
comes to affect our everyday lives. What we’re hoping here, and the idea behind
this Royal Society lecture series, is to open up a forum to facilitate
an open and robust conversation about the potential
and the possible pitfalls inherent in the advancing of AI, so I look forward
to answering all your questions at the end of the talk. Today I’m going to talk about AI,
but specifically focused around how AI could be applied
to scientific discovery itself. I thought this was
particularly appropriate given this is a lecture
at the Royal Society, but it’s also the thing
that I’m most passionate about. The reason why I’ve spent
my whole life and my whole career trying to advance the state of AI
is that I believe if we build AI in the right way
and deploy it in the right way, it can actually help advance
the state of science itself, so I’ll come back to that theme
throughout this talk. To begin with, there’s no exact
definition of what AI is, but a loose heuristic, I think,
that’s worth keeping in mind is AI is the science
of making machines smart. That’s what we’re trying to do when we embark on this endeavour
of building AI. DeepMind itself, my company,
we founded it in London in 2010. We became part of Google in 2014,
but we still run independently right here in King’s Cross,
just up the road. The way to think about DeepMind
and the vision behind it was to try and bring together
some of the world’s top researchers in all the different subdisciplines
that were relevant to AI, from neuroscience to machine learning
to mathematics, bringing them together with
some of the world’s top engineers and a lot of compute power to see how far
could we push the frontiers of AI and how quickly
could we make progress? You can think of it as like
an Apollo Programme effort for AI. Nothing until that point,
until we founded DeepMind, existed that was really set up
to do this in this way. Another explicit thing behind
the vision that we had for DeepMind was to try and come up with
a new way to organise science. What I mean by that, and this
would be a whole lecture in itself, is could we fuse together the best
from the top academic labs, and the blue-sky thinking
and the rigorous science that goes on in those places, with a lot of the philosophy
behind the best start-ups or best technology businesses, in terms of the amount
of energy and focus and pace that they bring to bear
on their missions? Would it be possible
to fuse together the best from both
of these two worlds? That’s the way you can think about
the culture at DeepMind, as a hybrid culture that mixes the
best from both of those two fields. Now, what is our mission at DeepMind? We articulate it
as a two-step process. It’s slightly tongue-in-cheek
but we take it very seriously, so this is how we articulate it. Step one,
fundamentally solve intelligence, and then we feel
if we were to do that then step two would naturally follow,
use it to solve everything else. What we mean by ‘solve intelligence’
is actually, to just unpack that slightly,
to fundamentally understand this phenomenon of intelligence,
what it is. What kind of process is it? Then, if we can understand that, can we recreate the important aspects
of that artificially and make it universally
abundant and available? That’s what we mean by this
‘solve intelligence’, the first part of this mission. I think if we’re to do that
in a general way, both DeepMind
and the research community at large, then I think naturally
step two will follow in terms of, we could bring
this technology to bear on all sorts of problems that,
for the moment, seem quite intractable to us. So things perhaps as far afield
as climate science all the way to curing
diseases like cancer that we don’t know how to do yet. I think AI could have a role to play,
an important role to play, as a very powerful tool
in all of these different scientific and medical endeavours. So that’s the high-level mission and
that’s our guiding star at DeepMind, but how do we plan
to go about this more pragmatically? What we talk about is trying to build the world’s first
general-purpose learning machine, and the key words here are obviously
‘learning’ and ‘general’. All the algorithms that we work on
at DeepMind are learning algorithms, and what we mean by that is that they
learn how to master certain tasks automatically from raw experience
or raw input, so they find out for themselves
the solution to tasks, so they’re not pre-programmed
with that solution directly by the programmers or the designers. Instead of that,
we create a system that can learn, and then it experiences
things and data, and then it figures out for itself
how to solve the problem. This second word ‘general’,
this is this notion of generality, so the idea that the same system
or same single set of algorithms can operate out of the box
across a wide range of tasks, potentially tasks that it’s never
even seen before. Of course,
we have an example of a very powerful general-purpose learning algorithm,
and it’s our brains, the human mind. Our brains are capable, of course,
of doing both of those things, an exquisite example
of this being possible. Up until now, our algorithms
have not been able to do this, so the best computer science has to
offer has fallen, and still falls, way short of what the mind can do. Internally at DeepMind,
we call this type of AI artificial general intelligence,
or AGI, to distinguish it from
the traditional sort of AI. AI as a field has been going
for 60-70 years now, since the time of Alan Turing, and a lot of traditional AI
is hand-crafted. This is specifically
researchers and programmers coming up with solutions to problems and then directly codifying
those solutions in terms of programs, and then the program itself, the machine just dumbly executes
the program, the solution. It doesn’t adapt, it doesn’t learn. The problem with those
kinds of systems is they’re very inflexible
and they’re very brittle. If something unexpected happens that the programmers
didn’t cater for beforehand then it doesn’t know what to do, so it usually just
catastrophically fails. This will be obvious to you
if you’ve ever interacted with assistants on your phone. Often they’ll be fine
if you stick to the script that they already understand, but once you start conversing with
them freely, you very quickly realise there isn’t any real intelligence
behind these systems. They’re just template-based
question-answering systems. By contrast,
the hallmarks of AGI systems would be that they’re flexible and
they’re adaptive and they’re robust, and what gives them
these kinds of properties are these general
learning capabilities. In terms of rule-based AI
or traditional AI, probably still the most famous
example of that kind of system was Deep Blue,
IBM’s amazing chess computer that was able to beat the world chess
champion at the time, Garry Kasparov, in the late ’90s. These kinds of systems
are called expert systems and they’re pre-programmed with
all sorts of rules and heuristics to allow them to be experts
in the particular type of task that they were built for. So in this case,
Deep Blue was built to play chess. The problem with these systems is,
and what you can quickly see is, that Deep Blue was not able to do
anything else other than play chess. In fact, it couldn’t even
do something simpler, like play a strictly simpler game
like noughts and crosses. It would have to be
reprogrammed again from scratch. I remember I was doing my undergrad
in Cambridge, actually, when this match happened, and I remember coming away
from this match more impressed with Garry Kasparov’s
mind than I was with Deep Blue. That’s because
of course Deep Blue was an incredible technical achievement
and a big landmark in AI research, but Garry was able
to more-or-less compete equally with this brute of a machine, but of course, Garry Kasparov
can do all sorts of other things, engage in politics,
talk three languages and write books, all of these other things
that Deep Blue had no idea how to do, with this single human mind. So, to me, it felt like
there was something critical missing, if this was intelligence or AI, something missing
from the Deep Blue system. And I think what was missing was this
notion of adaptability and learning, so learning to cope
with new tasks or new problems, and this idea of generality, being able to operate across
a wide range of very differing tasks. The way we think about AI at DeepMind is in the framework of what’s called
reinforcement learning, and I’ll just quickly explain to you
what reinforcement learning is with the aid of this simple diagram. If you think of the AI system, and we call the AI systems ‘agents’
internally at DeepMind, here on the left
depicted by this little character, and this agent finds itself
in some kind of environment and it’s trying to achieve a goal
in that environment, a goal specified by the designers
in that environment. If the environment was the real world then you can think of the agent
as a robot, so a robot situated
in a real-world environment. Alternatively, the environment
could be a virtual world like a game environment, which is
what we mostly use at DeepMind, and in that case, you could think
of the agent as a virtual robot, a kind of avatar or game character. In either case,
the agent only interacts with the environment in two ways. Firstly, it gets observations
about the environment through its perceptual inputs. We normally use vision,
but we’re starting to experiment with other modalities
like sound and touch, but for now, almost everything
we use is vision input, so pixel input
in the case of a simulation. The first job of the agent
is to build a model of the environment out there, a statistical model of the
environment, as accurately as it can based on these noisy, incomplete
observations it’s getting about the world out there. It never has full information
about how this environment works. It can only surmise
and approximate it through the observations
and the experience that it gets. The second job of the agent comes once
it has that model of the environment and it’s trying to make plans
about how to reach its goal: it has a certain amount of time to pick the action
it should take next from the set of actions available
to it at that moment in time. It can do a lot of planning
and thinking about, if I do action A,
how will the world look? How will the environment change?
If I do action B, how will it change? Which one of those actions
will get me nearer towards my goal? Once it’s run out of thinking time,
it has to output the action, the best action it’s found so far.
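As a rough illustration of this observe, model, plan, act cycle, here is a minimal Python sketch. It is a toy under stated assumptions, not DeepMind’s code: the `LineWorld` environment, the `Agent` class, the action names and the `run_episode` helper are all hypothetical, the observations are exact rather than noisy, and the "model" is just a table of what each action was seen to do.

```python
import random

ACTIONS = ["left", "right", "stay"]  # hypothetical action set for this toy


class LineWorld:
    """Toy environment: the agent stands on a line of cells; the goal is the last cell."""

    def __init__(self, size=6):
        self.size = size
        self.goal = size - 1
        self.position = 0

    def observe(self):
        # The only thing the agent ever sees is its current position.
        return self.position

    def step(self, action):
        # The environment's real rules; the agent is never shown this table.
        delta = {"left": -1, "right": 1, "stay": 0}[action]
        self.position = min(self.size - 1, max(0, self.position + delta))


class Agent:
    def __init__(self, goal, seed=0):
        self.goal = goal          # the goal is specified by the designer
        self.model = {}           # learned: action -> observed change in position
        self.rng = random.Random(seed)

    def update_model(self, before, action, after):
        # Job one: refine the model of the environment from experience.
        # A move into a wall looks like "no change", so keep the largest effect seen.
        change = after - before
        if action not in self.model or abs(change) > abs(self.model[action]):
            self.model[action] = change

    def choose_action(self, observation):
        # Try every action once so the model has something to say about it.
        untried = [a for a in ACTIONS if a not in self.model]
        if untried:
            return self.rng.choice(untried)
        # Job two: imagine each action with the model and pick the one
        # predicted to end up nearest the goal.
        return min(ACTIONS, key=lambda a: abs(self.goal - (observation + self.model[a])))


def run_episode(size=6, max_steps=50, seed=0):
    env = LineWorld(size)
    agent = Agent(env.goal, seed)
    for step in range(1, max_steps + 1):
        before = env.observe()
        action = agent.choose_action(before)
        env.step(action)                      # the chosen action gets executed
        after = env.observe()                 # which drives a new observation
        agent.update_model(before, action, after)
        if after == env.goal:
            return step, agent.model
    return None, agent.model
```

Calling `run_episode()` returns the number of steps taken and the model the agent built along the way. A real system would replace the lookup table with a learned statistical model and would plan under a time budget, neither of which this sketch attempts.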
That action gets executed. That may make a change or not make
a change to the environment, which will then drive
a new observation. This whole system continues round
in a kind of feedback loop and all the time,
the agent is trying to pick actions that will ultimately get it
towards its goal. That’s reinforcement learning
in a nutshell and how it works. This diagram’s pretty simple but there are a lot
of very complex technicalities behind trying to solve
this reinforcement learning problem, in the fuller sense of the word. But we know that if we could solve
all of these technical issues, this framework is enough
to deliver general intelligence, and we know that because this is
the way biological systems learn, including our human minds. In the primate brain
and the human brain, it’s the dopamine neurons,
the dopamine system in our brain, that implements a form
of reinforcement learning, so this is one of the methods
that humans use to learn. The second big key
piece of technology that’s created this new renaissance
in AI in the last decade is called deep learning. Deep learning is to do with
hierarchical neural networks. You can think of them
as loose approximations to the way our real neural networks
work in our brain. Here’s an example
of a neural network working. Imagine that you’re trying to train one of these neural networks here on
the right, these layers of neurons, to distinguish between
pictures of cats and dogs. What you would do is
you would show this AI system many thousands,
perhaps even millions, of different pictures of cats
and different pictures of dogs, and this is called
supervised learning. What you would do is
you’d show them the pictures, so you’d show the input layer
at the bottom here, this picture, the raw pixels from this picture
of a cat or a dog, and then this neural network
would process that picture and then would ultimately
output a label, either a guess saying
‘I think that’s a cat’ or a guess saying
‘I think that’s a dog’. And depending on whether
it was correct or incorrect, the neural network adjusts
the weights between these neurons so that next time
it gets asked a question about ‘is this a cat
or is this a dog?’, it’s more likely to output
the right answer, and it uses an algorithm
called backpropagation to do that. It goes back and adjusts
the neural network weights depending on whether it got
the answer right or wrong so that it’s more likely
to get the answer right next time. Once you do this
incremental learning process many thousands,
perhaps even millions of times, eventually you get a neural network
that is really amazing at distinguishing between
pictures of cats and dogs. In fact, better than I am,
because I actually can’t tell whether that’s a cat or a dog
from that particular picture. One of our big innovations
at DeepMind was to pioneer the combination
of these two types of algorithms. We called this combination,
rather logically, deep reinforcement learning. We use deep learning
to process the perceptual inputs, to process the observations and
make sense of the world out there, these visual inputs
that the system is getting, and then we use reinforcement
learning to make the decisions, to pick the right action
to get the system towards its goal. We pioneered this sort of field, and one of the big things
that we demonstrated was we built the world’s first
end-to-end learning system, and it’s called DQN. What we mean by end-to-end
is it went all the way from raw perceptual inputs,
in this case pixels on a screen, to making a decision
about what action to take. So really, it was an example
of one of these full systems that can go all the way
from processing the vision to making a plan
and executing that plan. What we tested it on was Atari games. That was the first thing we tested it
on, Atari games from the ’80s, and we tested it on 50 classic games. For those of you in the audience who are old enough
to remember these games, which is probably not many of you, it was Space Invaders, Pac-Man, these kinds of games
that I’m showing here at the bottom. I’m going to show you the DQN system, how it learnt and how it progressed
through its learning, in a second,
in a video on the next slide. But just before I show that, I just want it to be clear
what you’re going to see. The only input
that the DQN system gets is the 20,000 pixel values
on the screen. Those are the inputs that it gets,
just these pixel numbers. It doesn’t know anything about what it’s supposed to be doing,
what it’s controlling. All it knows is
these are the pixel values and you’ve got to maximise the score,
that was the goal. It has to learn everything else
from scratch. The architecture we use
is here on the screen. This is a neural network
you can think of on its side. On the left-hand side, you can see
there’s the current screen and the pixels on the screen
being used as the input, then it gets processed
through a number of layers, and then at the output, you’ve got a whole bunch of actions
that can be taken. I think it’s 19 actions
that can be taken. The eight joystick movements, the eight joystick movements
with the fire button, or doing nothing. So it’s got to make a decision
about any of those actions to take in the next time step
based on the current screen input. This is how it works
on the classic game Breakout. Breakout is one of the most
famous games in Atari games. Here in this game,
you control the bat and the ball, the pink bat
at the bottom of the screen, and what you’re trying to do
is break through this rainbow-colour brick wall,
brick by brick. You’re not supposed to let the ball
go past your bat, otherwise you lose a life. I’m going to show you this video now of the agent learning
after many hundreds of games of play. This is DQN after 100 games. You can see
it’s not a very good agent yet, it’s missing the ball
most of the time, but it’s starting to get the idea that it should move
the bat towards the ball. This is after 300 games,
so 200 more games’ experience, and now it’s got
pretty good at the game. It’s about as good as any human
can play the game and it pretty much gets the ball back
every time, even when the ball’s going very fast,
at a very vertical angle. But then we let the system carry on
playing for another 200 games and then it did this amazing thing,
which was it figured out the optimal strategy was to dig
a tunnel around the left-hand side and then put the ball
behind the wall. So, of course, this gets it
more reward for less risk, and of course, gets rid
of the rainbow-colour brick wall more quickly. That was, for us,
really our first big ‘aha’ moment, watershed moment, at DeepMind. This is now from
four or five years ago. We realised we were onto something
with these kinds of techniques. It was able to discover something new
that even the programmers and the brilliant researchers of that
system did not know how to do. We hadn’t thought about
that solution to the game. More recently, a couple of years ago,
we started work on what is probably still
our most famous program, a program called AlphaGo. AlphaGo was a system to play
the ancient Chinese board game of Go. This is what Go looks like,
for those who don’t know, and this is what they play in China
and Korea and Japan instead of chess. Go is actually a very simple game.
There are only two rules, basically, and I could teach you it
in five minutes, but it takes at least a lifetime,
some would say many lifetimes, to master the game. The aim of the game is-, this is
a position from the end of the game. There are two players,
black and white, and they take turns
putting stones on the board. And eventually, when the board fills
up like this, you end up counting how many areas of territory
did you wall off with your stones? The side
that has walled off the most territory, the most squares,
with its stones wins the game. In this case, it’s a very close game
and white wins by one point. The question is, why is Go so hard
for computers to play? I just told you
at the beginning of the talk that chess was solved, was cracked,
twenty years ago, and since then, Go has been one of
the holy grails for AI research and it’s much, much harder. There are two real reasons,
two main reasons, why Go has been
much harder than chess. One is the huge search space, the
huge number of possibilities in Go. There are actually 10 to the power
170 possible positions in Go, which is way more than there are
atoms in the universe. There are about 10 to the power 80
atoms in the observable universe. What that means is if you ran
all the world’s computers for a million years
on calculating all the positions, you still wouldn’t have calculated
through all the possibilities in Go. There are just too many to do
through brute force. The second and even harder
thing about Go is that it was thought to be impossible
to explicitly write down by hand what’s called an evaluation function. That’s a function
that takes a board position and tells the computer which side
is winning and by how much, and that’s a critical part
of how the chess programs work. That’s why Deep Blue
was so powerful. A team of chess grandmasters,
with brilliant programmers at IBM, came together,
and the programmers distilled what was in the minds
of the chess grandmasters and tried to distil that
into an evaluation function that would allow the Deep Blue system
and its successors to evaluate whether the current
position was good or not, and then that’s what’s used to
plan out what move you should take. In Go, this was thought
to be impossible because the game is too esoteric. It’s too almost artistic, in a way,
to be able to evaluate in that sense with hard and fast rules. If you talk to a professional
Go player, they’ll tell you the game is a lot more
about intuition and feeling than it is about calculation, which is a game more like chess which is more about
explicit calculation and planning. So we made this big breakthrough
with AlphaGo, and the way we were able to do this
is we tackled those two problems, this problem of combinatorial
explosion and huge search spaces and this problem
of evaluation function, with two neural networks. The first neural network we used
was called a policy network. What we did here was
we fed in board positions from strong amateur games
that we downloaded off the Internet, and we trained a neural network
to predict the next move
the human player would make. In blue here is the board, the current board position with
the black and white stones on it, and then the output is
another board, but here with the probabilities
that AlphaGo assigned to each possible move in the position. The higher the green bar,
the higher probability it would give to a human player playing that move. What this policy network
allowed the system to do is rather than look at all the possible
moves in the current position and then all the possible replies
to those possible moves, and you can imagine
how quickly that escalates, it can instead look at
the top three or four most likely and most probable moves, rather than the hundreds of possible
moves that you could make, so that massively reduces down
the breadth of the search tree. The second thing we did
was we created what we called a value network. What we did is
we took the policy network and we played it against itself
millions of times, so AlphaGo played against itself
millions of times, and we took random positions
from the middle of those games and we of course know the result
of the game, which side won, and we trained AlphaGo to make a
prediction from the current position about who would end up
winning the game and how certain AlphaGo was
about its prediction. And eventually, once we trained it
through millions of positions, it was able to create
a very accurate evaluation function, this value network. What this value network did is
take a current board position,
at the bottom of the screen, and output a single real number
between zero and one. Zero meant white was going to win,
100% chance, 100% confidence in that, and one would mean black was going
to win, 100% confidence in that, and 0.5 would mean AlphaGo
judged the position to be equal. And so here, by combining
these two neural networks, we solved all of the hard problems
inherent in computer Go. What you’ll note is
instead of us building an explicit evaluation function
like they do for chess programs, typing in all these
hundreds of different rules, so in fact,
modern chess computers have on the order of
1,000 specific rules about chess
and about positions in chess. Instead of that,
we didn’t have any explicit rules. We just let the system learn
for itself through experience by playing the game against itself many thousands,
indeed millions, of times. Once we had this system, we decided to challenge one of the
greatest players in the world, an incredible South Korean
grandmaster called Lee Sedol. I describe him as
the Roger Federer of Go because that’s the equivalent
position he occupies. He’s won eighteen world titles,
a bit like Grand Slams, and he’s considered to be the
greatest player of the past decade. We challenged Lee Sedol to a match,
a $1 million challenge match in South Korea, in Seoul,
back in 2016. It was an amazing once-in-a-lifetime
experience, actually, and the whole country pretty much
came to a standstill. One thing you’ve got to know
about South Korea is they love AI, they love technology
and they love Go, so for them,
this was like the biggest confluence of all the things they find exciting
all together, and Lee Sedol is a sort of
national hero there. He’s the equivalent of David Beckham
or something with us. So that was an incredible experience. This is a picture, on the top left,
of the first press conference. There was literally a huge ballroom
full of journalists and TV cameras. There were over 200 million viewers
across Asia for the five-game match,
which was incredible. AlphaGo won the match four-one,
and it was hugely unexpected. Even just before the match, Lee Sedol was asked to predict
what he thought was going to happen and he predicted a five-nil victory
for himself, or four-one at minimum. And in fact, it was proclaimed to be
a decade before its time both by AI experts,
including computer Go experts, and also Go players and the Go world. The important thing here was not just
the fact that AlphaGo won, but actually it was how AlphaGo won
that was the critical thing. AlphaGo actually played lots
of creative, completely new moves and came up with lots of new ideas
that astounded the Go world and in fact are still being studied
now, nearly two years later, and are revolutionising the game. So it’s not a question
of AlphaGo just learning about human heuristics and motifs and then just regurgitating those
motifs, reliably regurgitating them. It actually created
its own unique ideas, and here’s the most famous
example of that, I just want to quickly show you. This is move 37 in game two. In Go, there is a whole history,
Go has been around for 3,000 years and was played professionally
for several hundred years in Japan and China and other places. There is this notion in Go
of famous games that are looked back on and studied
for hundreds of years, and indeed, famous moves in those
famous games go down in history, and this is considered to be
following that lineage, this move, move 37 from game two. This was the move here
on the right-hand side. AlphaGo here is black
and Lee Sedol is white. When AlphaGo played this move, Lee
Sedol literally fell off his chair and all the commentators commentating
it thought this was a terrible move. The reason for that
is that in the early parts of Go, in the opening phases of the Go game, you normally play
on the third and fourth lines. Go is played
on a nineteen-by-nineteen board and you normally play
on the third and fourth lines, and that’s the accepted wisdom of how
you should play in the opening. Those are the critical lines. But here, you’ll notice that AlphaGo
played this relatively early move, move 37 is still
very early in the game, on the fifth line,
so one line further up. This is normally considered
to be a huge mistake because you’re giving white,
your opponent, a huge amount of territory
on the side of the board, so it’s considered to be a very weak
move, so it’s the sort of thing no professional
would ever consider playing. The key thing about
what AlphaGo did here is that it played this move,
and the thing about Go is in Asia it’s considered to be
like an art form but it’s sort of objective art,
because you know later on any one of us could come up
with an original move. We could just play a random move
and it might be original, but the key thing is whether it made a difference and impacted
the game, the result of the game. That’s what determines whether it’s
a beautiful and truly creative move. And in fact move 37 did exactly that,
because you’ll see the two stones here that I’ve
outlined in the bottom left, they’re surrounded by white stones
so they’re in big trouble, but later on,
about 100 moves later on, the fighting that was going on
in the bottom-left hand corner spilled out into
the centre of the board, ran all the way across the board, and those two stones
down the bottom left ended up joining up
with that move 37 stone, and that move 37 stone
was in exactly the right place to decide that whole battle, which
ended up winning AlphaGo that game. It was almost as if AlphaGo
had placed that stone presciently, 100 moves earlier, to impact
this fight elsewhere on the board at exactly the right moment, so this
was really quite an astounding moment for Go and computer Go. Lee Sedol himself was incredibly
gracious and an absolute genius. What was really amazing
was he won a game, and it was an incredible game that
he won, he made an amazing move too, and he said afterwards
he was very inspired by the match. He said it made him realise that learning to play Go
had been a really good choice, that it was
sort of the reason he played Go, and that it had been
an unforgettable experience. He actually went on a three-month
unbeaten winning streak in human championship matches
after this match with AlphaGo, and he was trying out all sorts
of new ideas and techniques. If you’re interested in that, if you want to see
the behind-the-scenes story, I’d recommend you watch
this documentary, which was made
by an independent filmmaker, won all sorts of awards
at film festivals and is now available on Netflix. It will really give you
a behind-the-scenes look at how AlphaGo was created
and what went into it. Since then, we’ve continued working
on these kinds of systems, and now we’ve created
a new program called AlphaZero which advances what we did in AlphaGo
and takes it to the next level. What we’ve done with AlphaZero, and we just released this
just before Christmas, was we generalised AlphaGo to be able
to play not just Go now but any two-player game,
including chess and shogi, which is
the Japanese version of chess, both of which are played
professionally around the world. So the first thing we did was
to generalise it, so it plays more than one game, and don’t forget, this gets at
the notion of generality. That was something
I criticised about Deep Blue. Deep Blue,
that program could only play chess. Well, AlphaZero can play
any two-player game. The second thing is that
we removed the need for human games. In AlphaGo, if you remember
what I said about the policy network, it was first trained to mimic
strong human amateur players whose games we’d shown it from the Internet. But instead of that,
what AlphaZero does is it starts completely from scratch, so it only relies on self-learning,
playing against itself. It starts off, when it begins,
totally randomly, so it knows nothing about the game or anything about what are good moves
or likely moves. It has to learn all of that
literally starting from random, so it doesn’t require any human data
to bootstrap its learning. We tested this program-,
in chess, of course, there are many already
very, very strong chess programs, way stronger than
the human world champion. The current top program
is called Stockfish. It’s an open-source program,
and you can think of it as the descendant of Deep Blue,
twenty years later. It’s way, way stronger now
and you can run it on your laptop, and it’s so strong, no human player
would have a chance of beating it. In fact,
many of my chess player friends, and I used to play chess
when I was a lot younger, thought that Stockfish
could never be beaten, like, that was the limit
at which chess could be played. And amazingly, AlphaZero,
after just four hours of training, so it started off random and then, after four hours of this
self-playing and a few million games, it was able to beat Stockfish 28-0
with 72 draws in a 100-game match. So it was really
quite an astounding result, again, for the AI world,
but also for the chess world. We’re actually going to publish-, we’ve just released
preliminary data on this and we’re going to publish this
in a big peer-reviewed journal in the next few months. Again here, just like with Go, where
it came up with these new motifs, playing on the fifth line
in the opening, that overturned thousands of years
of received wisdom, human wisdom, here in chess, even more amazingly, it seems to have invented
a new style of playing chess. The summary of that
is that it favours mobility, and the amount of mobility
your pieces have, over materiality. In most chess programs, the way
that you write chess programs, one of the first rules
you input into a chess program is the value of the pieces. Rook is five points, knight is
three points, bishop is three points, and so obviously you don’t want
to swap your rook for a knight because that’s minus two points. That’s one of the very first things that were put into
the very first chess computers, those kinds of rules. And AlphaZero actually is very
contextual, so in certain positions it’ll be very happy to sacrifice
material to gain mobility, so as to increase the power
of its remaining pieces on the board. And what that means is
it can make incredible sacrifices to gain positional advantage,
really long-term sacrifices. We released ten sample games
from this 100-game match and these are being pored over
by chess grandmasters at the moment, and there are lots of great
YouTube commentaries on this. If you’re an amateur chess player
and you’re interested in chess, I recommend you have a look at a few of these great commentaries
on YouTube that talk about why
these games, this style, this AlphaZero style,
is so interesting. The second
interesting thing about it is that a lot of these professional
chess players commented on how AlphaZero seemed to have
a much more human style than the top chess programs, which have
a much more mechanical style that is a little bit ugly
to the human eye, the way that computer chess programs
have played until now. So these are some of
the breakthroughs that we’ve had, and there are many other
breakthroughs in many other domains from other groups around the world. AI right now is becoming
a huge buzzword, and a massive amount of progress has been made
in the last five to ten years, but I don’t want
to give you the impression that we’re anywhere close yet
to solving AI. There are actually tonnes
of key challenges that remain. In fact, it’s a very exciting time. In some senses, I feel like all we’ve
done is dealt with the preliminaries, and now we’re getting to the heart
of what intelligence is. I’ll just give you a little taster
of some of the things that I’m personally thinking about
and that my team is, and each one of these things would be
a whole lecture in itself, and indeed, I think some of the other
speakers in this lecture series will probably cover
some of these topics. Unsupervised learning is a
key challenge that is not solved yet. What I’ve been showing you
is supervised learning, like the cats and dogs,
where I tell the system the answer so that it tries to figure out
how to adjust itself so that it’s more likely
to get the right answer. I’ve also told you
about reinforcement learning where you get a score or reward, so in Go, the machine gets a one for
winning and zero reward for losing, and it wants to get reward. But what about the situation
where you don’t have any rewards and you also don’t know
the correct answer, which in fact is most of the time? In fact, when we do human learning
and babies learn, most of the time
they’re not getting any feedback, and yet they’re still
learning things for themselves. So how does that happen?
That’s called unsupervised learning. The second thing is memory
and one-shot learning. What I’ve shown you are systems
that are in the moment, so they process currently
what’s in front of them, they make a decision in the moment,
and then they execute that decision. Of course, to have true intelligence you need to remember
what you’ve done in the past and how that affects
your future plans, and you also need to be able to learn
much more efficiently. I’ve told you about AlphaGo. AlphaGo needs to play
millions of games against itself to learn to get to this level, but
humans can learn much more quickly. We are able to learn things
sometimes in one shot, just one example and that’s enough. Both of those things
are kind of related, and actually what I studied,
as Andy mentioned, for my neuroscience PhD
was how the brain does this. There’s a brain area
called the hippocampus, which is what I studied for my PhD, and it is critical to both one-shot
learning and episodic memory. Another thing is
imagination-based planning. It’s one thing to plan by trying out
possibilities, like in chess or Go. That’s quite simple. Although Go has got
lots of possibilities, the game itself, its dynamics,
is very simple. The rules are simple.
You know what will happen if you make a move,
and how the next state will look. Of course, the real world
is much more complicated, is very complicated,
and it’s not easy to figure out what’s going to happen next
when you make an action. This is where imagination comes in. This is how we make plans as humans. We imagine viscerally how we might
want a job interview to go, or a lunch, or a party,
or something like that. We actually kind of visualise it
in our minds, and then that allows us to adjust, ‘What if I said this thing
or if I did this thing, then how would this
other person react?’, and so on, and we play these scenarios through
in our minds before we actually
get to the situation, and that’s an extremely efficient way
to do planning and it’s something that we need
in our AI systems. Learning abstract concepts, so what I’ve shown you here
is implicit knowledge, so figuring out
what this perceptual world’s about, but what we really need to learn
is about abstractions, high-level concepts, and eventually
things like language and mathematics, which we’re nowhere near currently. Transfer learning is another
key thing, which is where you take some knowledge
you’ve learnt from one domain and you apply it
to a totally new domain that might look perceptually
completely different, but actually the underlying structure
of that domain is the same as some other domain
that you’ve experienced. Again, our computer systems are not
good at doing this kind of learning, but humans are exceptionally
good at this. Then finally, of course,
all the things I’ve shown you here, games, Atari games,
Go games, chess games, none of them yet involve language which, as we all know,
is key to intelligence, so that’s a whole field
that still needs to be addressed with these kinds of techniques. I just want to talk a little bit now about how this is being
already applied, even in the systems we have today. There are many challenges to come,
but I think already the systems we have today
can be usefully used in science, and in fact, we’ve seen that
by some work we’ve done, and many other groups are using
some of these systems I’ve already talked about, deep
learning and reinforcement learning, in all sorts of very interesting
scientific domains. It’s being used
to discover new exoplanets by analysing data from telescopes, AI systems have been used
for controlling plasma in nuclear fusion reactors, we and others have been working on how it can help with
quantum chemistry problems, and also it’s being used a lot
in healthcare domains. Actually, we have a partnership
with Moorfields to help the radiographers
quickly triage retinopathy scans, so scans of the retina,
to look for macular degeneration. So four very, very,
extremely diverse fields, and I could have done many slides
on different applications that are currently going on with AI, and I think this is
just the beginning. One of the things
I’m most excited about is applying it to
the problem of protein folding. This is the problem of,
you get an amino acid sequence, a 1D sequence
of the protein structure, and you need to figure out
the 3D structure the protein will eventually
fold into. That’s really key to a lot of
disease and drug discovery because the 3D structure of the
protein governs how it will behave. This is a huge long-standing
scientific challenge in biology and we’re working quite hard
with a project team on this with some collaborators
from the Crick. Other scientific applications
I see coming up are helping with things like drug
design, the design of new materials, and in biotechnology,
in areas like genomics. In fact, if I was to boil down
the kinds of problems, the properties of problems that are well-suited
to the AI we already have today, let alone what we’re going to create
in the future, I think it comes down
to three key properties. Property one, it’s got to be
a massive combinatorial search space, so that’s got to be
inherent in the problem. Secondly, can you specify a clear
objective function or metric to hill-climb against,
to optimise against? It’s almost like a score,
if you like, a score in a game. You have to be able to have
some kind of score of how well you’re doing
towards the final goal. And then you either need lots of data
to learn from, actual real data, or an accurate and efficient
simulation or simulator so you can generate a lot of data, in the way that we do
with our game systems. As long as you satisfy
those three constraints, properties, I think the AI systems
we already have today could potentially be usefully
deployed in those areas, and I think there are actually
a lot of areas in science that already would fit
these desired properties. Then, of course, there are all sorts
of applications to the real world that we’re working on
in combination with Google, including with healthcare, we work
with the NHS in many projects, making the assistant
on the phone more intelligent, and also in areas like education
and personalised education. I think AI is set to revolutionise
a lot of these other sectors. Just to sum up now, one of the reasons that I’ve spent
my whole career on AI is that I’ve always felt
that it’s a kind of meta-solution to many other problems
that face us today. If you think about how the world is
today, one of the big challenges is the amount of information
that we’re confronted with and that we’re producing
as a society, and I mean that both in our personal lives, in terms
of choosing our entertainment, to science,
where there’s just so much data now being produced from something
like CERN or in genomics, how do we make sense of it all? Indeed, the systems that we would
like to understand better and have more control over
are incredibly complex systems. Think about climate
or the nuclear fusion systems, these incredibly complex systems that in some cases
are bordering on chaotic systems, and so they’re very difficult
for us to describe with equations and to understand, even for the top human
scientists and experts. For a long time,
big data was the buzzword. Before AI became the buzzword,
big data was the buzzword, and I think that actually, in a way,
big data can be seen as the problem. If you think about it
from an industry point of view, everybody, all companies, have tonnes
of data now and talk about big data. The problem is, how do they get
the insights out of that data and how do they make use
of all of that data to be useful to their customers
and their clients and so on? I think AI is the answer to help find
the structure and insights in all of that unstructured data. In fact, I think one way to think
about intelligence is as a process, an automatic process that converts
unstructured information into useful, actionable knowledge, and I think AI could help us
automate that process. For me, my personal dream
and a lot of the dream of my team is to make
AI-assisted science possible, or even perhaps create AI scientists
that can work in tandem with their human expert counterparts. From a neuroscience point of view,
one of my dreams as well is to try and better understand
our own human minds, and I think building AI in this
neuroscience-inspired way, and then comparing that algorithmic construct
with the way the human mind works, will potentially shed some light,
I think, on some of the unique properties
of our own minds, things like creativity, dreaming, perhaps even the big question
of consciousness. To sum up, then, I think AI holds
enormous promise for the future and I think these are
incredibly exciting times to be alive and working
in these fields, but with all this potential
also comes a lot of responsibility. I just want to mention this, and I think some of you will probably
have questions about this later. We believe at DeepMind,
as with all powerful technologies, AI must be built
responsibly and safely and used for the benefit
of everyone in society, and we’ve been thinking about that
from the very beginning of DeepMind. This requires lots of things that
we’re actively engaged with right now with the wider community. Research on the impact
of the technology, how to control this technology
and deploy it, and we need a diversity of voices both in the development
and the use of the technology and meaningful public engagement, which is why we’re so happy
to be supporting this lecture series. We’ve just launched our own
ethics and society team at DeepMind that’s involved in working with
many outside stakeholders to figure out the best way
to go about deploying and using
these types of technologies to benefit everyone in society. We’ve also been involved
on an industry scale, across the whole field,
in co-founding the Partnership on AI, which is a cross-industry
collaboration with for-profit
and non-profit companies, some of the biggest companies
in the world, coming together to talk about this
and try and agree some best practices and some protocols around
how to research this technology and how to engage the public with it. All this is happening for us
right here in the centre,
in the heart of London, our home. We’re a very proudly British company
and we work here at King’s Cross with our colleagues
at the Crick Institute and the Alan Turing Institute, which
is based in the British Library, all around, and King’s Cross
is becoming quite a hotbed, and UCL of course is around there,
of AI research. And, as Andy mentioned at the start, we should leverage in the UK
all of our incredible strengths, these amazing universities
that we have here, Oxford, Cambridge,
UCL, Imperial and others, that have incredible strength
in computer science. I feel very strongly that DeepMind
needs to play its part in encouraging and supporting
this AI ecosystem through sponsorships,
scholarships and internships, and actually lectures
given by DeepMind staff. I’m very passionate about
establishing the UK as one of the world leaders in AI, and I think we have
an amazing position. We should really be building
on our heritage in computing that starts with actually Charles
Babbage inventing computing, really, 100 years before its time,
in some senses. And then of course that continued
with Alan Turing, who famously laid down the fundamentals
of computing in the ’40s and ’50s, then the World Wide Web,
with people like Sir Tim Berners-Lee instrumental in creating
it. I feel like the next thing
in the lineage of those types of technology
is artificial general intelligence, and I think the UK has a huge part
to play in that and I hope DeepMind will play
its part in that too. So, it’s great to be opening
this series of lectures. I think we need to capitalise on
what we have here in the UK, both from the ethical side
and the technological side. One thing I would say is that
it’s important for us to be at the forefront of technology if we
want to have a seat at the table when it comes to
influencing the debate on the ethical use
of this technology. Again, I would encourage you all
to get involved, understand these technologies better and how they’re going
to affect society, and then engage on how
you would like to see these technologies deployed
for the benefit of everyone. I think AI can be
of incredible benefit to society if we use it responsibly, I think it
could be one of the most exciting and transformative technologies
we’ll ever invent. Thanks for listening. (applause)
So, folks, we have time
for questions and so on, but let me start off with a question
just to get things going. You’ve outlined some of
the technological possibilities. What are the barriers,
what are the difficulties? What stands in the way on
the deployment, the use, the science of some of the things
you’ve talked about?
I mentioned in the slide
what the remaining challenges are. I think that it’s important
to remember that there are lots of very difficult
things about intelligence we still don’t know how to do, right, so I think I outlined
some of those key areas. We’re working hard on that
and many others are too, but we don’t know how quickly
those solutions will come. I think some really big breakthroughs
are still needed, at least as big as the ones we’ve
had, and possibly many of those, so I think that’s to come
over the next few years. In terms of the barriers
to using them, I think we have to think
very carefully about how we want to test
these kinds of systems, because in a sense, these systems
are adaptive and they learn, so it’s a very new type of software,
in a way. With software generally,
as you know better than most, Andy, we write some software
and then you test it and you stress-test it
and unit-test it, there are all these ways
of testing software, and then you know if it’s ready
to be shipped and deployed and out it goes, and you know it’s
going to behave the way you want. Of course, one of the advantages
of our systems is when you put them out in the world,
they’ll continue to learn and adapt to the new situations they encounter
that you may not have thought of, but then the question is
how do you make sure they still behave
the way you want them to and how do you test
that kind of system? So I actually see that
as a big challenge.
Just if I can be
a little light-hearted, in flying, I’m told a common phrase
from one pilot to another is, ‘What’s it doing now?’ So maybe it’s a little bit
of that sort of stuff. So questions, please. Let’s start off at the front here,
the lady right at the front. There’s a microphone coming
very quickly to you.
Is that on? Yes, sorry. Margi Murphy from the Telegraph, hi. I’ve got a question about
the implications AI may have for democracy further down the line. A lot of what you talked about
was predicting human behaviour. Do you think that there’s
a legitimate concern around predicting human behaviour
at scale, potentially manipulating people,
serving them political advertising, or even with private corporations,
perhaps gambling or gaming, where you have to pay to level up?
What are your thoughts on that?
I’m not sure I did talk about
predicting human behaviour, but-, – In the game. Yes, so what we’re
talking about here is, if you’re referring to
the Facebook stuff, we’re talking about finding structure
in any kind of data. So I’m thinking more about
scientific data that you’ve got or, in the case of our stuff,
the gaming data, so it playing against itself
and generating its own data and then finding paths, you can think
of it as like an intelligent search. You’ve got this huge combinatorial
space and you want to try and find, say, a new material design
or a new drug design, or indeed a new Go position. How do you efficiently search through
all of that data? What you’ve got to find
are structures and patterns that can help you reduce
the size of that search. Really, that’s what you can think
of AlphaZero and AlphaGo doing, and that’s where we are
at the moment. Of course, eventually these systems could predict all sorts
of things, potentially, but right now,
the first thing you’ve got to do is get the data
into some kind of format that you can actually express, and the second thing
is an objective function, some kind of goal
that you want the system to do. I think that’s quite different
to the kinds of goals that, say, Facebook has
with its systems.
Very good, so I have
a question from downstairs. I’m afraid I don’t know
who it comes from, but here it is. I’ll paraphrase. Humans are irrational.
What’s the approach? How do you approach
an automatic system which has to deal with
some level of irrationality? I think, in fact,
that’s one of the trickiest things about a lot of systems. Economics, I think, is one of
the most difficult areas of science because it really is a sort of
aggregate of human behaviour, right? And of course, economists will tell you
better than most scientists about human irrationality
and how that impacts things. I think what potentially we have here are systems that can be
quite rational, and then we’ve got to think about
what aspects of irrationality do they need to model, if at all,
to understand those systems? At some point, probably they’re
going to need something-, I can imagine they’re
going to need to understand, if they’re going to interact with
human experts and human systems, a little bit to empathise
about how humans behave and what they can expect from them. But I think part of the power
of these systems is that they could be
very rational systems. Maybe an earlier career person, the gentleman halfway down
with his hand up. I wouldn’t like to say whether
you’re earlier career or not, but-, Yes, kind of early career.
Mid-career, probably. Who are you? Sorry. Michael Folkestone,
a start-up founder. There’s this discussion between
Elon Musk and Mark Zuckerberg about the threat AI poses
to humanity. Firstly, what are your
thoughts on that? From a personal perspective,
it just seems crazy to me because what are we exactly
worried about? Are we worried about
a system we can’t turn off? Are we worried about
a Westworld-type scenario where there’s AI running around us
manipulating us? It just seems so far off when,
at the moment, we’re just talking about games. We’re not talking about
the complexity that you discussed towards the end of your lecture. Yes, it’s a good question. I think that’s why
I try to emphasise, although there have been
some impressive breakthroughs, we are still at a nascent stage
and we are talking about just board games and things like that
at the moment, but my view is somewhere in between that debate
that you’re talking about. Just for those of you who don’t know,
Elon has sounded the alarm bell a lot about the dangers of AI
and existential risks of AI, and then Mark Zuckerberg replied
that there aren’t any and we shouldn’t worry about that
and it’s all roses, and I think actually the real answer
comes somewhere in between. My view is that a lot more research
has to be done about what these systems are
and what their capabilities could be, so what type of systems
we’d want to design. I think a lot of these things,
because we’re very early still, aren’t known. And I think a lot of the things
people worry about are going to get a lot better, for example, the interpretability
of these systems. One thing
that I often get asked is, ‘Well, how does AlphaGo play Go?’
and we don’t know that yet. It’s a big neural network and it’s
a little bit like our brains. We roughly know what it’s doing,
but actually the specifics, it’s not like a normal program where you can point to it and go,
‘This bit of code is doing this.’ For safety-critical systems,
perhaps in healthcare and others, or if it was to control a plane, you’d actually want to know
exactly why the decision was made and be able to track that back
for accountability, and to make sure there’s no bias,
and for fairness, and all these other things. This is a very active area
of research and we have a whole team
that researches this, and I think things will get
a lot better on that front in the next five-plus years. It’s because we’re at
the very beginning of even having these systems working, that’s why we don’t yet know
how to build visualisation tools and other things,
but I think we will do. Having said that,
it’s a very, very powerful technology and the reason I work on it
is because I think it’s going to have this amazing transformative effect
on the world for the better in things like science and technology
and medicine. But, like all powerful technologies, I think the technology itself
is neutral, but it depends on what we
as a society decide to use it for. Obviously, if we decided to,
it could be used for things like weapons,
and that would be terrible. We’ve signed many letters
to the effect that there shouldn’t be
autonomous weapons systems. There should always be a human
decision-maker in the loop and so on, but that’s really a political
decision and a societal decision, which is why it’s important
we have debates like this because, in my view, nobody should be
building those kinds of systems, but that’s going to require
UN agreement and things like this. So I actually think
those are the things that we should be
worrying about near-term, and the sorts of things
I’ve just mentioned to Andy. If we’re going to have
self-driving cars, maybe we should test them
before putting them on the road, rather than beta-testing them
live on the road, which is sort of
what’s happening now. Is that responsible, really? And then the technical question
is how do you ensure mathematically, in some sense,
those systems are safe and they’ll only do what you think
they’re going to do when they’re going to be
out in the wild and they’re adaptable
learning systems? I think these are
technical questions. I certainly don’t think
there’s nothing to worry about and I definitely think
it’s worth worrying about it now, even though I think
it’s many, many decades away before the sorts of things
Elon’s worried about will come to pass, if ever, and I think we have plenty of time
to make sure that doesn’t happen, but I think we need to be
thinking about it now, and not just the researchers
but society at large. At the back, perhaps, there’s a lady
whose hand is up there. I’m in the old people section. No, it’s late career maybe,
I don’t know. Yes, that’s right.
Mid-career. I don’t use the word old. I run Society Inside,
which is a not-for-profit looking at cross-technology learning from biotech, GM,
quantum tech recently, and one of the areas we’re looking at
at the moment is this concept
of trusting governance. The lesson of past tech is that technologists are so over-excited
about themselves that they slightly resist
the governance, and we see that happening
in AI at the moment. So I would be really
interested to know, given the breadth of the places
that AI will be applied, what your views are
on the trustworthiness of governance and the work that you’re going to be
doing on governance? Yes, so we engage with government
quite regularly, lots of governments actually,
not just the UK government. I think it’s really important
in this phase for them to get up to speed with what’s actually
going on technically and the sorts of questions
that they should be thinking about and wrestling with. It’s not that I don’t think there’s
anything that needs to be done now. In fact, I think that would be bad, if there were some kind of knee-jerk
regulation or something like that, because I think
even if you were to ask me, wave a wand, what would you regulate?
We don’t know, as researchers, because we don’t actually know
what the right protocols are or the right safety mechanisms
or the right control mechanisms are. It’s still an active
area of research, but that’s coming down the line and when we do have some kind
of agreement around that then I could imagine some sort
of regulation around that, and that would be a good thing.
– We’ve got self-driving cars, though, already.
– Yes, exactly, so that’s with AGI, so general AI,
what I was talking about. In terms of specific deployed things,
I think what we need to do is upgrade the
existing regulations that we have, say,
in transport or in healthcare. There’s already a lot of regulation
around those areas, but they need to be upgraded
to deal with the new technologies
that are coming in, and I think
what we should be focusing on now is improving those regulations
so that they can actually cope with the new world
that’s coming very fast, and I think we already have
committees and organisations and departments that are
well capable of doing that with the right advice from experts. The young person right in front here. – Very early career.
– Very early career, or a genius,
in which case he’s mid-career. First of all, I would like to thank
you for the wonderful presentation. – I understood everything, luckily.
– Oh, great. – Definitely a genius, then.
– Yes, offer him a job. My question is
what’s your vision for DeepMind? What do you think its part will be
in the future of AI? That’s a great question. I hope it will play a major part
in the research, so I hope that we will
accelerate and progress some of the big breakthroughs
that I talked about that are still left to be done, like how concepts are done
or memory, these things. I hope that DeepMind will be
a part of discovering that, and then the second thing is
I’d like us to be a beacon for the ethical use of AI,
and to make sure that-, sort of be a role model, if you like, for other companies
and other organisations as to how they should approach
thinking about the ethical questions
and the philosophical questions behind AI and how we use it. We’ll go to the lady over here
towards the front, if we may. Amanda Dickins,
I’m currently a civil servant but this is definitely not
a government question. I’m very interested
that you’re thinking about how can we use AI to potentially one
day explore what consciousness is? And I just wanted to try flipping
the ethics in AI question. Are you thinking about-,
what would be your view on whether an artificial
general intelligence of a sufficiently powerful nature,
we might need to think about at some point does it acquire rights if it’s edging around
that border of consciousness?
– Yes, a great question again. Obviously we’re straying
into philosophy territory here which I actually do think AI quickly
becomes in some ways, when you start thinking about
the far future. We don’t really know
what consciousness is, neuroscientists don’t really agree and even philosophers
don’t agree currently so there’s a definition problem, although I think, interestingly,
we all feel we have it, right? If I was to guess, I would say
intelligence and consciousness are what I would call
doubly dissociable, so I don’t think
intelligence requires consciousness and vice versa. I think a lot of animals,
like if you have a pet dog or cat, I think a lot of us
would say they’re conscious. They certainly seem to dream,
at least my pet dog does, and it seems to be
some aspects of consciousness. Maybe not as high-level as human,
but some aspects of that. On the other hand, if you look at something like AlphaGo
or our Atari programs, there’s no question of any kind
of consciousness there. It feels to me like just
an algorithm, a sort of machine. And I think it’s going to be
interesting to see, but there are lots
of debates on that. Does intelligence require
consciousness or vice versa? I think the answer to that question’s
going to be interesting either way. I could easily imagine us building
a fantastically-intelligent system, an AGI system, that doesn’t
feel conscious in any way like you do to me or I do to you, and then that would be
quite interesting because then you could sort of
take apart the AI system and feel like what’s missing then,
in that case, as compared to the human brain? What is the missing ingredient? And it would also resolve
some philosophical arguments about the nature of intelligence. So these are the kinds
of things I think, as a collateral of what we’re doing, I think if we think about it
in the right way with the right collaborations
with neuroscientists and psychologists
and perhaps sociologists, it might be an interesting tool,
like an experimental world, to test things like the question
of consciousness in it and things like qualia
and related issues, which I think we’ll come up against as our systems
become more intelligent.
– We’ve got time for a couple more,
so let me go right to the back. The gentleman right at the back,
yes, right there, and then one more
I’ll pick at random. We can go for
three more or something. – Sorry?
– We can go three more.
– Three more, okay.
– Thank you very much
for your informative chat. A meta-question here
which is around paradigms. A natural consequence, as you were
showing, of the deployment of AI through its learning eventually
results in the transcending of the paradigm of the system that
it’s operating in, AlphaGo, Atari. What is the paradigm of DeepMind
in relation to its deployment of AI? What system do you use
to determine that paradigm? To give an example of what does
that mean practically, for example, what system do you use
to deploy your resources, your attention, your energy,
in a certain aspect of AI, for example, i.e. the AI itself,
the ‘who’ of the AI, as well as, for example,
in relation to the ‘what’, so what is AI deployed on? For example, you were talking about
some things, health, etc.
– If I understand the question
correctly, I think you’re talking about the actual organisational
process of what we’re doing, so how do we decide what to research
and what to apply it to? I think that’s what you’re asking,
is that right?
– That’s right.
– Okay.
That’s actually a great question and it relates to the one point
I made early in the talk about this new way
of organising science. It would be a whole lecture
in itself, that, like, what do we do differently? But you can think about, what I’ve
tried to do is bring in some, you can think of them as agile
software project management methods that I learnt from actually
writing computer games and big engineering projects I did
early in my career, and I’ve tried to translate
what does that mean in an analogous setting for science? Can you actually
project-manage science, even? And also can you assemble
a large team, we have a pretty large team
for a research organisation, around a collective goal, and actually build quickly
on top of each other’s work, much more rapidly iterate that
than you would get in academia, much more like you would get
if you were building a product in a normal technology company? And how can you do all of that
without damaging or hindering the bottom-up creativity that comes in the best scientific organisations? The reason science, big science,
science with a capital S, works is because you have
tonnes of brilliant minds all independently searching
for the truth, and having their own ideas
and then pursuing those ideas. So how can you co-ordinate that
in a light fashion whilst still encouraging this
bottom-up, flourishing creativity? I think, if there is a secret
to why DeepMind has been successful, I think we’ve got
that balance just right, so that hopefully answers
part one of your question. The second question
you had was about how do we decide
what to deploy it to? We actually have
a huge spreadsheet of factors, perhaps we should actually have
AI looking at that spreadsheet but we don’t do that yet.
– That was my next question.
– It’s looked at by humans. And what we do
in my applied division, which is led by Mustafa Suleyman,
one of my co-founders, he runs our applications group, is we have a whole bunch
of desired properties for any application
we are going to look at, and top of those things
are social good, a fit to the current level
of our technology, how much extra specialisation
is required, so the more specialisation
is required the less likely we want to do that, because ideally we want to use
our core technology. And then of course, secondarily,
there are things like commercial opportunity
and other things like that, but at the top of the things
are fit with research so that we’re not
pulling the researchers in directions
they wouldn’t go in otherwise, it’s very important to us
we’re a research-led organisation, research is what we’re
primarily there for, and then secondly is the societal
good that we think we can deliver off the back of
using these technologies. That’s why healthcare always comes up
top on all those categories. And there’s also the personal
motivations of many of my team. Many of them are just passionate,
of course, about solving diseases and helping with medicine and so on. But there are other things too,
like renewable energies. One of the big achievements we did was we actually used the AlphaGo
program, a variant of it, to control the cooling systems
in the datacentres at Google. The Google datacentres
are massive, right? Every time you do a Google search you’re pinging something
to one of these datacentres, and they use vast amounts of power. Mostly it’s renewable energy, but we
would like to reduce that power. What we did was, when we used these
processes on the cooling systems that were controlling
the fans and the pumps and the water pumps and so on, we saved 40% of the energy
that those cooling systems used, which is huge, so 15% overall, which was a huge saving in the energy
and obviously cost saving, and it’s good for the environment
as well as the cost saving.
– Let’s come towards the front,
perhaps the gentleman over here, and then one quickie
and then we’re done, okay? It’s just over here in the front.
– Thank you, you’re really quite
an inspiration to all of us. I studied neuroscience as you did, and I have a question about the
elements of deep neural networks. You know,
as well as most in the room, that neurons are electrically-active
distributed branched trees, synapses are probabilistic
and have short-term plasticity which is universal, it’s in every
neuron in both of our brains. Do you see any of these elements
finding their way into deep nets? When? Are you on that?
Is it for ten years from now?
– Yes. That’s a really great
question, actually. As you pointed out
and everyone should know, when we call these things
neural networks, they’re incredibly
simplified versions of what our real brains are doing,
as you just suggested. Our real neurons
are much more complex things and they are probabilistic. They’re called
spiking neural networks, they use timing of spike trains
for information-passing, so they have a lot of properties
that our simplified point-, these are called point neurons,
don’t have. These are just really simple
mathematical objects and I would say
they were inspired by neuroscience rather than actually in any way really mimicking
real neuroscience systems. But this is an open question for us. I have quite a large
neuroscience team, a proper neuroscience team
at DeepMind, and we collaborate
with many universities. It’s around 35-40 people
so it’s one of our biggest teams. A continual question for us is how neuroscience-inspired
are we going to be? Note I said neuroscience-inspired, not reverse-engineering neuroscience. There are other groups
around the world, there’s also a big EU
billion-euro project, the Blue Brain Project
based in Switzerland, that’s trying to actually explicitly
reverse-engineer the brain. So they’re trying to copy
cortical columns, real spiking neurons,
and all the nuances of those, of how real neurons work
and the messiness of that. In my view that’s too low-level,
that’s an implementation detail, because there’s no reason,
I don’t think, to assume that an in-silico system, so something that uses
a silicon-based system, should copy all of the implementation
details a carbon system needs to do. There are all sorts of reasons
why our brain has to do things due to our biological constraints. Those constraints are different
for a computer system, so I don’t think there’s any reason
to have to copy, including all the constraints
and the specifics of the biology. What I’m more interested in
is called systems neuroscience, which is what I studied, which is the
algorithmic and computational level. What I’m really interested in
is the functions the brain has and mimicking, or at least making
sure I have the same capabilities in my artificial system, not
the specific implementation details. But that line is a movable line, so it depends partly
on what neuroscientists discover. If they discover there’s some
real functional difference about using spiking neural networks,
for example, we would then start
investigating that in a proper way. In our neuroscience team,
we have experts in all of those different levels
of detail of the brain, and they keep us all up-to-date
with the latest literature on that. So it’s an active,
ongoing movable line that we have as to how much detail do we take in
from inspiration from neuroscience.
– One last quickie before I make
some concluding remarks. The gentleman over here.
– Hi, my name is Sumanis
and I’m an ML engineer, so it’s a slightly
technical question. What are your thoughts on the ability of the deep reinforcement
learning framework to deal with unforeseen events? Because one convenient
feature of games is that it’s an isolated environment. While training, the network is
seeing the game and only the game. There’s no interruption,
there’s no break. Something in the real world, like the
weather or the price of a stock, we don’t fully understand
every single detail that could influence the outcome. Do you still feel that the framework
will be applicable or do we need to evolve
something better?
– A fantastic question. Certainly, the systems that we have
at the moment will not be enough. So that is the next frontier, really, is how can you deal
with unexpected information or probabilistic information
or incomplete information? We do have researchers who are
looking at things like poker, for example, that has obviously
incomplete information, so you’re in a situation-, or StarCraft, which is a very rich
computer strategy game where you don’t have full information
about the board like you do in Go. But even they’re still relatively
simple compared to what you get in real-life situations,
as you pointed out. And then the other thing
is the amount of data they use and the amount of experience
they need, and I mentioned that
with the one-shot learning. That’s a very key thing
that we want to improve is how can you learn
from fewer examples, and ideally from just one example?
So then you could even deal with a black swan event
potentially, somehow. In fact, all of that list of things
that remain challenges, almost all of them you can think of
as part of the solution to the problem you’re talking about. Transfer learning, for example,
would help because there you’ve learnt
how to deal with some structure in some world
that you’ve got used to, and then suddenly this new thing
happens to you, or new domain, but then you realise that underlying
it there’s some similar structure, maybe a hierarchical structure
or something, even though it seems to look
on the surface, perceptually, totally different. And then maybe you can use something
from there to improve and speed up your learning in that new domain. Those are all amazingly complex
problems that are yet to be solved. Perhaps you’ll solve some of these.
– Let me just quickly make
a couple of concluding remarks. The Royal Society is a convener
and we hope to fuel this debate, but please remember the following
about the Royal Society. We’re independent of government,
we’re independent of industry, and we’re independent
of universities, so we can speak truth to power as
appropriate in all those dimensions. So you can be sure that we’ll make
every effort not just to be trusted, but by using that independence,
to be trustworthy in this debate which is going to continue
for some while. To Demis, I can just summarise
on behalf of everybody with one word, brilliant. (applause)
