Transcript

7ROelYvo8f0 • MIT AGI: Building machines that see, learn, and think like people (Josh Tenenbaum)
Kind: captions
Language: en
today we have Josh Tenenbaum he's a
professor here at MIT leading the
computational cognitive science group
among many other topics in cognition
and intelligence he is fascinated with
the question of how human beings learn
so much from so little and how these
insights can lead to build AI systems
that are much more efficient at learning
from data so please give Josh a warm
welcome all right thank you very much
thanks for having me excited to be part
of what looks like really quite a very
impressive lineup especially starting
after today and it's I think quite a
great opportunity to get to see
perspectives on artificial intelligence
from many of the leaders in industry and
other entities working on this
great quest so I'm going to talk to you
about some of the work that we do in our
group but also I'm gonna try to give a
broader perspective reflective of a
number of MIT faculty especially those
who are affiliated with the Center for
brains minds and machines so you can see
up there on my affiliation academically
I'm part of Brain and Cognitive Sciences
or Course 9 I'm also part of CSAIL
but I'm also part of the Center for
brains minds and machines which is an
NSF funded Center Science and Technology
Center which really stands for the
bridge between the science and the
engineering of intelligence
it literally straddles Vassar Street in
that we have CSAIL and BCS members we
also have partners at Harvard and other
academic institutions and again what we
stand for I want to try to convey some
of the specific things we're doing in
the center and where we want to go with
a vision that really is about jointly
pursuing the science the basic science
of how intelligence arises in the human
mind and brain and also the engineering
enterprise of how to build something
increasingly like human intelligence in
machines and we deeply believe that
these two projects have something to do
with each other and are best pursued
jointly now it's a really exciting time to
be doing anything related to
intelligence or certainly to AI for all
the reasons that you know brought you
all here I don't have to tell you this
we have all these ways in which AI is
kind of finally here we finally live in
the era of something like real practical
AI
or for those who've been around for a
while and have seen some of the rises
and falls you know AI is back in a big
way but from my perspective and I think
maybe this reflects you know why we
distinguish what we might call AGI from
AI we don't really have any real AI
basically we have what I like to call AI
technologies which are systems that do
things we used to think that only humans
could do and now we have machines that
do them often quite well maybe even
better than any human who's ever lived
right like a machine that plays go but
none of these systems I would say are
truly intelligent none of them have
anything like common sense none of them
have anything like the flexible
general-purpose intelligence that each
of you might use to learn every one of
these skills or tasks right each of
these systems had to be built by large
teams of engineers working together
often for a number of years often at
great cost to somebody who's willing to
pay for it and each of them just does
one thing so AlphaGo might beat the
world's best but it can't drive to the
match or even tell you what
go is it can't even tell you that go is a
game because it doesn't even know what a
game is right so what's missing
what is it that makes every one of your
brains maybe you can't beat you know the
world's best at go but any one of
you can get behind the wheel of a car I
think of this because my daughter is
gonna turn 16 tomorrow if she lived in
California she'd have a driver's license
it's a little bit down the line for us
here in Massachusetts but you know she
didn't have to be specially engineered
by billion dollar startups and you know
she got really into chess recently and
now she's taught herself chess by
playing just you know a handful of games
basically I mean she can do any one of
these activities and any one of us can
so what is it what makes up
the difference well there's many things
right I'll talk about the focus for
us in our research and a lot of us
again in CBMM is summarized here
what drives the successes right now in
AI especially in industry okay and all
these AI technologies is many many
things but where
the progress has been made most recently
and what's getting most of the attention
is of course deep learning but other
kinds of machine learning technologies
which essentially represent the
maturation of a decades-long
effort to solve the problem of pattern
recognition that means taking data and
finding patterns in the data that tell
you something you care about like how to
label a class or how to predict some
other signal okay
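
As a rough illustration of pattern recognition in this sense (taking data and finding a pattern that predicts a class label), here is a minimal sketch of a one-layer perceptron trained with the classic mistake-driven update rule. The toy data points, the function names, and the learning-rate and epoch settings are all assumptions made up for this example; they do not come from the talk.

```python
# Minimal perceptron sketch (illustrative only; data and names are hypothetical).

def train_perceptron(samples, labels, epochs=20, lr=1.0):
    """Learn weights and a bias that separate points labelled +1 from -1."""
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        mistakes = 0
        for x, y in zip(samples, labels):
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            predicted = 1 if activation > 0 else -1
            if predicted != y:
                # Mistake-driven update: nudge the boundary toward the true label.
                weights = [w + lr * y * xi for w, xi in zip(weights, x)]
                bias += lr * y
                mistakes += 1
        if mistakes == 0:
            break  # every training point is classified correctly
    return weights, bias


def predict(weights, bias, x):
    """Return +1 or -1 depending on which side of the boundary x falls."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else -1


# Hypothetical, linearly separable toy data.
points = [(2.0, 1.0), (3.0, 2.5), (-1.0, -2.0), (-2.5, -1.0)]
labels = [1, 1, -1, -1]
weights, bias = train_perceptron(points, labels)
```

Calling `predict(weights, bias, (4.0, 4.0))` then labels a new point using only the pattern found in the four training points, which is all this kind of system does: it has no model of what the points or the labels mean.
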
and pattern recognition is great it's an
important part of intelligence and it's
reasonable to say that deep learning as a
technology has really made great strides
on pattern recognition and maybe even
you know has come close to solving the
problems of pattern recognition but
intelligence is about many other things
intelligence is about a lot more in
particular it's about modeling the world
and think about all the activities that
a human does to model the world that
go beyond just say recognizing
patterns in data but actually trying to
explain and understand what we see for
instance okay or to be able to imagine
things that we've never seen
maybe even very different from
anything we've ever seen but might want
to see and then to set those as
goals to make plans and solve problems
needed to make those things real or
thinking about learning again the you
know some kinds of learning can be
thought of as pattern recognition if
you're learning sufficient statistics or
weights in a neural net that are used
for those purposes but many activities
of learning are about building out new
models right either refining reusing
improving old models or actually
building fundamentally new models as
you've experienced more of the world and
then think about sharing our models
communicating our models to others
modeling their models learning from them
all these activities of modeling these
are at the heart of human intelligence
and it requires a much broader set of
tools so I want to talk about the ways
we're studying these activities of
modeling the world and say something in a
pretty non-technical way about what are
the kind of tools that allow us to
capture these abilities now
I want to be very honest up front and to
say this is just the beginning of a
story right when you look at deep
learning successes that itself is a
story that goes back decades I'll say a
little bit about that history in a
minute but where we are now is just
looking forward to a future when we
might be able to capture these abilities
you know at a really mature engineering
scale and I would say we are far from
being able to capture all the ways
in which humans richly flexibly quickly
build models of the world at the kind of
scale that say Silicon Valley wants
either big tech companies like Google or
Microsoft or IBM or Facebook or small
startups right we can get there and I
think what what I want to talk to you
about here is one route for trying to
get there and this is the route that
CBMM stands for the idea that by reverse
engineering how intelligence works in
the human mind and brain that will give
us a route to engineering these
abilities in machines when we say
reverse engineering we're talking about
science but doing science like engineers
this is our fundamental principle that
if we approach cognitive science and
neuroscience like an engineer so
the output of our science isn't just a
description of the brain or the mind in
words but in the same terms that an
engineer would use to build an
intelligent system then that will be
both the basis for a much more rigorous
and deeply insightful science but also
allow direct translation of those insights
into engineering applications
now I said before I'd talk a little about
history what I mean by that is this
again if part of what brought you
here is deep learning and I know even if
you've never heard of deep learning
before which I'm sure is unlikely you
saw you know a good spectrum of
that in the overview session last
night okay it's really interesting and
important to look back on the history of
where did techniques for deep learning
come from or reinforcement learning
those are the two tools in the
current machine learning arsenal that
are getting the most attention things
like backpropagation or end-to-end
stochastic gradient descent or temporal
difference learning or Q-learning
here's a few papers from the literature
you know maybe some of you have read
these original papers here's the
original paper by Rumelhart Hinton and
colleagues in which they introduced the
backpropagation algorithm for training
multi-layer perceptrons right
multi-layer neural networks here's the
original perceptron paper by Rosenblatt
which introduced the one layer version
of that architecture and the basic
perceptron learning algorithm here's the
first paper on sort of the temporal
difference learning method for
reinforcement learning from Sutton and
Barto here's the original Boltzmann
machine paper also by Hinton and
colleagues which you know again for
those who don't know that architecture
is a kind of probabilistic
undirected multi-layer perceptron or for
example before there were LSTMs if you
know about current recurrent neural
network architectures earlier much
simpler versions of the same idea were
proposed by Jeff Elman in his simple
recurrent networks the reason I want to
put up the original papers here is
for you to look at both when they were
published and where they were published
so if you look at the dates you'll see
papers going back to you know the
80s but even the 60s or even the 1950s
and look at where they were published
most of them were published in
psychology journals so the journal
Psychological Review if you don't know
it is like the leading journal of
theoretical psychology and mathematical
psychology okay or Cognitive Science the
journal of the Cognitive Science Society
or the backprop paper was published
in Nature which is a general interest
science journal but by people who are
mostly affiliated with an Institute for
Cognitive Science in San Diego so what
you see here is already a long history
of scientists thinking like engineers
these are people who are in psychology
or cognitive science departments and
publishing in those places but by
formalizing even very basic insights
about how humans might learn or how you
know brains might learn in the right
kind of math that led to of course
progress on the science side but it led
to all the engineering that we see now
it wasn't sufficient right
we needed of course lots of innovations and
advances in computing hardware and
software systems right but this is where
the basic math came from and
it came from doing science like an
engineer so what I want to talk about in
our vision is what does the future of this
look like if we were to look 50 years
into the future what would we be looking
back on now or you know over this time
scale well here's a
long-term research roadmap that reflects
some of my ambitions and some of our
center's goals and many others too right
we'd like to be able to address basic
questions fundamental questions of what
it is to be and to think like a human
questions for example of consciousness
or meaning in language or real learning
right questions like you know even
beyond the individual like questions of
culture or creativity so those are big ideas
up there and for each of these there are
basic scientific questions right how do
we become aware of the world and
ourselves in it it starts with perception
but it really turns into awareness
awareness of yourself and of the world
and what we might call consciousness
right or how does a word start to have a
meaning what really is a meaning and how
does a child grasp it or how do
children actually learn what do babies'
brains actually start with are they
blank slates or do they start with some
kind of cognitive structure and then
what does real learning look like these
are just some of the questions that
we're interested in working on
or when we talk about culture we mean
how do you learn all the things you
didn't directly experience right but
that somehow you got from the
accumulation of knowledge in society
over many generations or how do you ever
think of new ideas or answers to new
questions how do you think of the new
questions themselves how do you decide
what to think about these are all key
activities of human intelligence when we
talk about how we model the world where
our models come from what we do with our
models this is what we're talking about
and if we could get machines that could
do these things well again on the bottom
row think of all the actual real
engineering payoffs now in our Center in
both my own activities and a lot of what
my group does these days and what a
number of other colleagues in the Center
for brains minds and machines do as well
as you know more broadly people
in BCS and CSAIL one place where we work
on the beginnings of these problems in
the near term this is the long term like
think 50 years okay maybe shorter maybe
longer I don't know but think well
beyond well beyond 10 years but in the
short term 5 to 10 years a lot of our
focus is around visual intelligence and
there's many reasons for that again we
can build on the successes of deep
networks and a lot of pattern
recognition and machine vision it's a
good way to put these ideas into
practice when we look at the
actual brain the visual system in the
brain in the human and other mammalian
brains for example is really very
clearly the best understood part of the
brain and at a circuit level it's the
part of the brain that's most inspired
current deep learning and neural network
systems but even there there's things
which we still don't really understand
like engineers so here's an example of a
basic problem in visual intelligence
that we and others in the center are
trying to solve look around you and you
feel like there's a whole world around
you and there is a whole world around
you and you feel like your brain captures it but
the actual sense data that's
coming in through your eyes looks more
like this photograph here where you can
see there's a crowd scene but it's
mostly blurry except for a small region
of high resolution in the center so that
corresponds biologically to what part of
the image is in your fovea that's the
central region of cells in the retina
where you have really high-resolution
visual data the size of your fovea is
roughly like if you hold out your thumb
at arm's length it's a little bit bigger
than that but not much bigger right
most of the image in terms of the actual
information coming in in a bottom-up
sense to your brain is really quite
blurry
but somehow by looking at just one part
and then by saccading around or making a
few eye movements you get a few glimpses
each not much bigger than the size of
your thumb at arm's length
somehow you stitch that information
together into what feels like and really
is a rich representation of the whole
world around you and when I say around
you I mean literally around you so
here's another kind of demonstration um
without turning around nobody's allowed
to turn around ask yourself what's
behind you now the answer is going to be
different for different people depending
on where you're sitting right for most
of you you might think well there's I
think there's a person pretty close
behind me all right you know you're in a
crowded auditorium although you haven't
seen that person you know that they're
there right for people in the very back
row you know there isn't a person behind
you and you're conscious of being in the
back row right you might be conscious
that there's a wall right behind you but
now for the people who are in the room
not in the very back think about how far
behind you is the back like where's the
nearest wall behind you so we can get
maybe we can call out try a little
demonstration so I don't know I'm
pointing to someone there can you see
phrase say something if you think I'm
pointing at you well I could have been
pointing at you but I'm pointing to someone
behind you okay I'll point to you yeah
I'm pointing to you all right
so how far is the nearest wall no you
can't turn around you've blown your
chance right without turning around okay
so you you were laughs okay do you see
I'm pointing to you there with the tie
okay so without turning around how far
is the nearest wall behind you that's
sorry how far five meters okay well I
mean that might be about right no other
people can turn around how about you how
far is the nearest wall behind you
ten meters okay that might be right yeah
how about here
how what do you think twenty okay see
yeah since I didn't grow up in the
metric system I barely know but yeah I
mean the point is that
each of you is
surely not exactly right but you're
certainly within an order of magnitude
and I guess if we actually tried to
measure you know my
guess is you're probably right within
you know fifty percent or less often you
know maybe just twenty percent error
okay so how do you know this I mean even
if it's not what did you say twenty
meters even if it's not twenty meters
it's probably closer to 20 meters than
it is to 5 or 10 meters and than it is
to 50 meters so how do you know this you
haven't turned around in a while right
but some part of your brain is tracking
the whole world around you right and how
many people are behind you yeah like a
few hundred right I mean I don't know if
it's 200 or 300 or but it's not a
thousand I mean I don't think so and
it's certainly not ten or 20 or 50 right
so you track these things and you use
them to plan your actions
okay so again think about how instantly
effortlessly and very reliably okay your
brain computes all these things so the
people and objects around you and it's
not just you know approximations
certainly when we're talking about
what's what's behind you in space
there's a lot of imprecision but when it
comes to reaching for things right in
front of you
very precise shape and physical property
estimates are needed to pick up and
manipulate objects and then when it
comes to people it's not just the
existence of the people but something
about what's in their head right you
track whether someone's paying attention
to you when you're talking to them what
they might want from you what they might
be thinking about you what they might be
thinking about other people okay so when
we talk about visual intelligence this
is the whole stuff we're talking about
and you can start to see how it turns
into basic questions I think of
what we might call the beginnings of
consciousness at least our awareness of
ourself in the world and of ourselves as
a self in the world but also other
aspects of higher-level intelligence and
cognition that are not just about
perception like symbols right to
describe even to ourselves what's around
us and where we are and what we can do
with it
you have to go beyond just what we would
normally call the stuff of perception to
say the thoughts in somebody's head and
your own thoughts about that okay so
what we've been doing in CBMM is trying
to develop an architecture for visual
intelligence and I'm not going to go
into any of the details of how this
works and this is just notional this is
just a picture it's like just a sketch
from a grant proposal of what we say we
want to do but it's based on a lot of
scientific understanding of how the
brain works there are different parts of
the brain that correspond to these
different modules in our architecture as
well as some kind of emerging
engineering way to try to capture at the
software and maybe even hardware levels
how these modules might work so we talk
about a sort of an early module of a
visual or perceptual stream which is
like bottom-up visual or other
perceptual input that's the kind of
thing that is pretty close to what we
currently have in say deep
convolutional neural networks but then
we talk about how the output of
that isn't just pattern class labels but
what we call the cognitive core core
cognition so we get an understanding of
space and objects their physics
other people their minds that's the real
stuff of cognition that has to be the
output of perception but somehow
this is what
we call the brain OS in this picture we
have to get there by stitching together
the bottom-up inputs from a glimpse here a
glimpse here a little bit here and there
and accessing prior knowledge that comes
from our memory systems to tell us how
to stitch these things together into the
really core cognitive representations of
what's out there in the world and then
if we're going to start to talk about it
in language or to build plans on top of
what we have seen and understood that's
where we talk about symbols coming into
the picture ok the building blocks of
language and plans and so on so now we
might say well ok this is an
architecture that is brain inspired and
cognitively inspired and we're
planning to turn it into real engineering
and you can say well do we need that
maybe you know again I know this is a
question you considered in the first
lecture
maybe the engineering toolkit that's
currently been making a lot of progress
in let's say industry maybe that's good
enough maybe you know let's take deep
learning to stand for a broader set
of modern pattern recognition based and
reinforcement learning based tools and
say ok well maybe that can scale up to
this and you know maybe
that's possible I'm happy in the
question period if people want to debate
this my sense is no I think that it's
not when I say no I don't mean like it
can't happen or it won't happen what I
mean is the highest
expected value route right now is to take this
more science-based reverse engineering
approach and that at least if you
follow the current trajectory that
industry incentives especially optimize
for it's not even really trying to take
us to these things so think about for
example a case study of visual
intelligence that is in some ways as
pattern recognition very much of a
success it's again been mostly driven by
industry it's something that if you read
in the
news or even play around with in certain
publicly available datasets feels
like we've made great progress and this
is an aspect of visual intelligence
which is sometimes called image
captioning or mapping images
to text you know basically there's been
a bunch of systems here's a couple of
press releases I guess this one's about
Google Google's AI can now caption
images almost as well as humans
here's one about Microsoft a couple of
years ago I think there were something
like eight papers all released onto
arXiv around the same time from
basically all the major industry
computer vision groups as well as a
couple of academic partners okay which
all driven by basically the same data
set produced by some Microsoft
researchers and other collaborators
trained a combination of deep
convolutional neural networks you know
state of the art visual pattern
recognition with recurrent neural
networks which had recently been
developed for you know basically kinds
of neural statistical language modeling
glued them together and produced a
system which produced very
impressive results on a big training set
and a held-out test set where the goal
was to take an image and write a
sentence like a short sentence caption
that that would seem like the kind of
way a human would describe that image
and these systems you know surpassed
human level accuracy on the held-out
test set from a big training set but
what you can see when you really dig
into these things is there's often a lot
of what I would call data set
overfitting it's not overfitting to the
training set but it's overfitting to
whatever are the particular
characteristics of this data set you
know wherever it came from a certain set
of photographs and certain ways of
captioning them okay which even with a big
data set it's not about quantity it's
more about the quality the nature of
what people are doing all right so one
way to test this system is to apply it
to what seems like basically the same
problem but not within a certain
curated or built data set and there's a
convenient Twitter bot that lets you do
this so there's something called
picdescbot which takes one of the state of
the art industry AI captioning systems a
very good one again this is not meant to
I'm not trying to critique these systems
for what they're trying to do I'm just
trying to point out what they don't
really even try to do so this takes the
Microsoft CaptionBot and just every
couple of hours takes a random image
from the web captions it and uploads
the results to Twitter and a couple of
months ago when I prepared a first
version of this talk I just took a few
days in the life of this Twitter bot I
didn't take every single image but I
took you know most of the images in a
way that was meant to be representative
of the successes and the kinds of
failures that such a system will make so
we can go through this and it's a little
bit entertaining and I think quite
informative so here's just a somewhat
random sample of a few days in the life
of one of these caption bots so here we
have a picture of a person holding
my screen is very small here and
I can't read up there so maybe you'll
have to tell me what's that a person
holding a cell phone I guess I'll just
read along with you so have a person
holding a cell phone well it's not a
person holding a cell phone but it's
kind of close it's a person holding some
kind of machine so I don't even know
what that is but it's some kind of
musical instrument right
so that's a mixed success or failure
here's a pretty good one a group of
people on a field playing football
I would call that a you know an A
result maybe even A+ here's a group of
people standing on top of a mountain
so less good there's a mountain but as
far as I can tell there's no people but
these systems like to see people because
of the combination that in the
data set they were trained on there's a
lot of people and people often talk
about people okay I mean and the fact
that you can appreciate both what I said
and why it's funny there you did
some of the cognitive activities that
this system is not even trying to do
okay here we've got a building with the
cake I'll go through these fast building
with the cake a large stone building
with the clock tower I think that's
pretty good I'd give that like a b-plus
there's no clock but it's plausibly
right there might be a clock in there
there's definitely something like that
here's a truck parked on the side of a
building I don't know maybe a b-minus
there is a car on the side of a
building but it's not a truck and
it doesn't seem like the
main thing in the image okay
here's a necklace made of bananas here's
a large ship in the water this is pretty
good I give this like an a-minus or
b-plus because there is a ship in the
water but it's not very large it's
really more of like a tugboat or
something here's a sign sitting on the
grass you know in some sense that's
great but in another sense
it's really missing what's actually
interesting and important and meaningful
to humans
here's a garden is in the dirt a pizza
sitting on top of the building a small
house with the red brick building that's
pretty good although a kind of weird way
of saying it a vintage photo of a pond
that's good they like vintage photos a
group of people that are standing in the
grass near a bridge again there's two
people and there's some grass and
there's a bridge but it's really not
what's going on a person in the yard
okay kind of a group of people standing
on top of the boat there's a boat
there's a group of people they're
standing but again it's what the
sentence that you see is is more based
on a bias of what people have said in
the past about images that are only
vaguely like this a clock tower is
lit up at night that's really I think
pretty impressive a large clock mounted
to the side of the building a little bit
less so a snow-covered field very good a
building with snow on the ground a
little bit less good there's no snow
white some people who I don't know them
but I bet that's probably right because
identifying faces and recognizing
people who are famous because they won
you know medals in the Olympics
probably I would trust current pattern
recognition systems to get that a
painting of a vase in front of a mirror
less good also a famous person there but
we didn't get him a person walking in
the rain again there is sort of a person
and there's some puddles but not you
know a group of stuffed animals a car
parked in a parking lot that's good a
car parked in front of a building less
good a plate with a fork and knife a
clear blue sky okay so you get the idea
again like if you actually go and play
with the system partly because I think
my friends at Microsoft told me
they've improved it some you know this
is partly for entertainment value you
know I chose what also would be the
funnier examples so I want to
be quite honest about it and
I'm not trying to take away from what are
impressive AI technologies but I think
it's clear that there's a sense of
understanding any one of these images
that it's important to see that even
when it seems to be correct right if it
can make the kind of errors that it
makes that even when it seems to be
correct it's probably not doing what
you're doing and it's probably not even
trying to scale towards the dimensions
of intelligence that we think about when
we're talking about human intelligence
okay another way to put this I'm going
to show you a really insightful blog
post from one of your other speakers so
in a couple of days I'm not sure you're
going to have Andrej
Karpathy who's one of the leading people
in deep learning this is a really great
blog post he wrote a couple of years ago
when he was I think still at Stanford he
got his PhD from Stanford he
worked at Google a little bit on some
early big neural net AI projects there
he was at OpenAI he was one of the
founders of OpenAI and recently he
joined Tesla as their director of AI
research but about five years ago he was
looking at the state of computer vision
from a human intelligence point of view
and and lamenting how far away we were
okay so this is the title of his blog
post the state of computer vision
and AI we are really really far away
and he took this image which was a sort
of a famous image in its own right it
was a popular image of Obama back when
he was president kind of playing around
as he liked to do when he was on tour so
if you take a look at this you can see
you probably all can recognize the
previous President of the United States
but you can also get the sense of where
he is and what's going on and you might
see people smiling and you might get the
sense that he's playing a joke on
someone can you see that right so how do
you know that he's playing a joke and
what that joke is well as Andrej goes on
to talk about in his blog post if
you think about all the things that
you have to really deploy in your mind
to understand that it's a huge list of
course it starts with seeing people and
objects and maybe doing some face
recognition but you have to do things
like for example notice his foot on the
scale and understand enough about how
scales work that when a foot presses
down it exerts force that the scale is
sensitive to it doesn't just magically measure
people's weight but it does that somehow
through force you have to see who can
see that he's doing that and who can't
who cannot see that he's doing that
right in particular the person on the
scale and why some people can see that
he's doing that and can see that some
other people can't see it why that makes
it funny to them okay and someday we
should have machines that can understand
this but hopefully you can see why
the kind of architecture
that I'm talking about would be the
building blocks of the ingredients to be
able to get them to do that now I when I
again I prepared a version of this talk
a few months ago and I wrote to Andrej
and I said I was gonna use this and I
was curious if he how what you know if
he had any reflections on this and where
he thought we were relative to five
years ago because certainly
a lot of progress has been made but he
said here's his email I hope he doesn't
mind me sharing it but I mean again he's
a very honest person and that's one of
the many reasons why he's such an
important person right now in AI okay
he's both very technically strong and
honest about what we can do what we
can't do and as he says well what does
he say it's nice to hear from you it's
funny you should bring this up I was
also thinking about writing a return
to this and in short basically I don't
believe we've made very much progress
right he points out that in his long
list of things that you'd need to
understand the image we have made
progress on some the ability to again
detect people and do face recognition
for well-known individuals okay but
that's kind of about it all right
and he wasn't particularly optimistic
that the current route that's being
pursued in industry is anywhere close
to solving or even really trying to
solve these larger questions um if we
give this image to that caption bot you
know what we see is again represents the
same point so here's the caption bot it
says I think it's a group of people
standing next to a man in a suit and tie
right so that's right right as far as it
goes it just doesn't go far enough and
the current ideas of build a
data set train a deep learning algorithm
on it and then repeat um aren't really
even I would venture trying to get to
what we're talking about or here's
another I'll just give you one other
example of a couple of photographs from
my recent vacation in a nice warm
tropical locale which I think
illustrates ways in which again the gap
where we have machines that can say beat
the world's best at go but can't even
beat a child at tic-tac-toe
now what do I mean by that well you know
of course we can build we don't even
need reinforcement learning or deep
learning to build a machine that can
win or tie that is do optimally
in tic-tac-toe but think about this this
is a real tic-tac-toe game which I saw
on the grass outside my hotel right what
do you have to do to look at this and
recognize that it's a tic-tac-toe game
you have to see the objects you have to
see what's you know in some sense
there's a three by three grid but it's
but it's only abstract right it's only
delimited by this these ropes or strings
okay it's not actually a grid in any
simple geometric sense all right but yet
a child can look at that and indeed
here's an actual child who was looking
at it and recognized oh it's a game of
tic-tac-toe and even know what they need
to do to win
we put the X and completed and now
they've got three in a row right that's
that's literally child's play okay
you show this sort of thing though to
one of these you know image
understanding caption bots and I think
it's a close-up of a sign okay again
saying that this is a
close-up of a sign is not the same
thing I would venture as a as a
cognitive or computational activity
that's going to give us what we need to
say recognize the objects to recognize
it as a game to understand the goal and
how to plan to achieve those goals
whereas this kind of architecture is
designed to try to do all of these
things ultimately right and I bring in
these examples of games or jokes to
really show where perception goes to
cognition you know that and all the way
up to symbols right so to get objects
and forces and mental states that's the
cognitive core but to be able to get
goals and plans and what do I do or how
do I talk about it that's symbols okay
here's another way into this and it's
one that also motivates I think a lot of
really good work on the engineering side
and a lot of our interest in the science
side is think about robotics and think
about what do you have to do to you know
what does the brain have to be like to
control the body so again you're gonna
hear from shortly I think maybe it's
next week from Marc Raibert who's one of
the founders of Boston Dynamics which is
one of my favorite companies anywhere
they're without doubt the leading maker
of humanoid robots legged locomoting
robots in industry they have all sorts
of other really cool robots robots like
dogs robots that have all you know I
think you'll even get to see a live
demonstration of my new robots this
really awesome impressive stuff okay um
but what about the minds and brains of
these robots well again if you ask Marc
ask him how much of human-like
cognition do they have in their robots
and I think he would say very little in
fact we have asked him that and he would
say very little he has said very little
he's actually one of the advisors of our
Center and I think in many ways we're
very much on the same page we both want
to know how do you build the kind of
intelligence that can control these
bodies like the way a human does alright
um here's another example of an industry
robotics effort this is Google's arm
farm
where you know they've they've got lots
of robot arms and they're trying to
train them to pick up objects using
various kinds of deep learning and
reinforcement learning techniques and I
think it's one approach I just think
it's very very different from the way
humans learn to say control their body
and manipulate objects and you can see
that in terms of things that go back to
what you were saying when you're
introducing me right think about how
quickly we learn things right here you
have these the arm farm is trying to
generate you know effectively maybe if
not infinite but hundreds of thousands
millions of examples of reaches and
pickups of objects even with just a
single gripper and yet a child who in
some ways can't control their body
nearly as well as robots can be
controlled at the low level and is able
to do so much more so I'll show you two
of my favorite videos from YouTube here
which motivate some of the research that
we're doing the one on the left is a one
and a half year old and the other ones a
one year old so just watch this one and
a half year old here doing a popular
activity for many kids as a playing hmm
you see video up there I'd okay there we
go okay so he's doing this
stacking cup activity alright he's
stacking up cups to make a tall tower
he's got a stack of three and what you
can see for the first part of this video
is it looks like he's trying to make a
second stack and that he's trying to
pick up at once basically he's trying to
make a stack of two that'll go on the
stack of three and you know he's trying
to debug his plan because it's it got a
little bit stuck here but and think
about I mean again if you know anything
about robots manipulating objects even
just what he just did no robot can
decide to do that and actually do it
right at some point he's almost got it
it's a little bit tricky but at some
point he's gonna get that stack of two
he realizes he has to move that object
out of the way look at what he just did
move it out of the way use two hands to
pick it up and now he's got a stack of
two on a stack of three and suddenly you
know subgoal completed he's now got a
stack of five and he gives himself a
hand because he knows he
accomplished a key waypoint along the
way to his final goal that's a kind of
early symbolic cognition right to
understand that I'm trying to build a
tall tower but a tower is made up of
little towers it's you know it can end
and you can take a tower and put it on
top of another tower or stack a stack on
a stack and you have a bigger stack right so
think about how he goes from bottom up
perception to the objects to the physics
needed to manipulate the objects to the
ability to make even those early kinds
of symbolic plans at some point he keeps
doing this he puts another stack on
there I'll just jump to the end
oops sorry you missed it so he he gets
really excited and he gives himself
another big hand but falls over okay
again Boston Dynamics now has robots
that could pick themselves up after that
that's really impressive again but all
the other stuff to get to that point we
don't really know how to do in a robotic
setting or think about this baby here
this is a younger baby this is one of
the Internet's very most popular videos
because it features a baby and a cat and
but the baby's doing something
interesting he's got the same cups but
he's decided he's again decided to try a
new thing so this think about creativity
he's decided that his goal is to stack
up cups on the back of a cat I guess
he's asking how many cups can I fit on
the back of a cat well three let's see
can I fit more let's try another one
okay well he can't fit more than three
it turns out and then he then does it's
not working so he changes his goal now
his goal appears to be to get the cups
on the other side of the cat now watch
that part when he reaches back behind
him there that's I'll just pause it
there for a moment
um so when he just reached back there
that's a particularly striking moment in
the video it shows a very strong form of
what we call in cognitive science object
permanence okay that's the idea that you
represent objects as these permanent
enduring entities in the world even when
you can't see them in this case he
hadn't seen or touched that object
behind him for like at least a minute
right maybe much longer I don't know and
yet he still knew it was there and he
was able to incorporate it in his plan
right there's a moment before that when
he's about to reach for it but then he
sees this other one right and it's only
when he's now exhausted all the other
objects here that he can see he's like
okay now time to get this object and
bring it into play right so think about
what has to be going on in his brain for
him to be able to do that right that's
like the analog of you understanding
what's behind you okay um it's not that
these things are impossible to capture in
machines far from it it's just that like
training a deep neural network or any
kind of pattern recognition system we
don't think is going to do it but we
think by reverse engineering how it
works in the brain
we might be able to do it I think we can
do it okay it's not just humans that
do this kind of activity here's a couple
of again rather famous videos you can
watch all of these on YouTube
crows are famous object manipulators and
tool users but also orangutans other
primates rodents we can watch if we just
hey let me pause this one for a second
if we watch this orangutan here he's got
a bunch of big Legos and over the course
of this video he's building up a stack
of Legos it's really quite impressive
you're just jumping to the end there's
actually some controversy out there of
whether this video is a fake but the
controversy isn't about you know it's
not like whether it was I don't know
done with computer animation some people
think the video was actually filmed
backwards that a human built up the
stack and the orangutan just slowly
disassembled it piece by piece and it
turns out it's remarkably hard to tell
whether it's played forward or backwards
in time and people have argued over
little details because you know it would
be quite impressive if an orangutan
actually was able to build up this
really impressive stack of Legos but I
would submit that it would be almost as
impressive if he disassembled it think
about the activity I mean if I wanted to
disassemble that the easiest thing to do
would just be to knock it over
that's really all most robots could do
but to piece by piece disassemble it
even if it's played backwards like this
that's still a really impressive act of
symbolic planning on physical objects or
here you've got this famous mouse
this you can find on the internet under
the mouse versus cracker video and what
you'll see here over the course of this
video is a mouse valiantly and mostly
hopelessly struggling with a cracker
that they're hoping to bring back to
their nest I guess it's a very appealing
big meal and at some point after just
trying to get it over the over the wall
at some point the mouse just gives up
because it's just never gonna happen and
he just goes away except that because
even mouses can dream or mice can dream
at some point he decides okay I'm just
gonna come out for one more try and he
tries one more time and this time
valiantly gets it over yeah isn't that
very impressive congratulations guys
okay you don't have to clap for me you can
clap for me at the end or clap for
whoever later okay but I want to applaud
the mouse there every time I see that
okay but again think what had to be
going on in his brain to be
able to do that all right it's a crazy
thing and yet he formulated the goal and
was able to achieve it I'll just show
one more video that is really more about
science these other ones are you know
some of them actually were from
scientific experiments but this is one
that motivates a lot of the science that
I do and it's to me it sets up kind of a
grand cognitive science challenge for AI
and robotics it's from an experiment
with humans again eighteen month olds or
one-and-a-half year olds so the kids
in this experiment were the same age as
the first baby I showed you the one who
did the stacking and 18 months is really
a very very good age to study if you're
interested in intelligence for reasons
we can talk about later if you're
interested this is from a very famous
experiment done by two psychologists
Felix Warneken and Michael Tomasello
and it was studying the spontaneous
helping behavior of young children it
also contrasted humans and chimps and
the punchline is that chimps sometimes do
things that are kind of like what this
human did but not nearly as reliably or
as flexibly okay and
I'll show you a particular kind of
unusual situation where human kids had
relatively little trouble figuring out
kind of what to do or even whether they
should do it whereas basically no chimp
did what you're gonna see humans
sometimes doing here so the experimenter
in this movie I'll turn on the sound
here if you can hear it the experimenter
is the tall guy and the participant is
the little kid in the corner there there
there's sound but no words right and at
some point he stops and then the kid
just does whatever they want to do so
watch what he does he goes over he opens
the cabinet looks inside then he steps
back and he looks up at Felix and then
looks down okay and then the action is
completed now I want you to
watch it one more time and think about
what's gotta be going on inside the kid's
head to understand this to understand
like so it seems like what it looks like
to us is the kid figured out that this
guy needed help and helped him and the
paper is full of many other situations
like this this is just one OK but the
key idea is that the situation is
somewhat novel people have seen people
holding books and opening cabinets but
probably it's very rare to see this kind
of situation exactly right it's
different in some important details from
what you might have seen before and
there's other ones in there that are
really truly novel because they just
made up a machine right there
okay but somehow he has to understand
causally from the way the guy's banging
the books against the thing that it's
it's sort it's sort of both a symbol but
it's also somehow he's got to understand
what he can do and what he can't do and
then what the kid can do to help and
I'll show this again but really just
watch the main part I want you to see is
I'll just sort of skip ahead so watch
this part here let's say I'll just jump
right when he watch right now he's about
to look up he looks up and makes eye
contact and then his eyes look down so
again he looks up he looks up and then a
saccade a sudden rapid eye movement down
down to his hands up down okay so that's
again that's this brain OS in action
right he's making one glance small
glance at the big guy's eyes just to
make eye contact to see to get a signal
did I understand what you wanted and did
you did you register that joint
attention and then he makes a prediction
about what the guy's gonna do so he
looks right down he doesn't just like
look around randomly he looks right down
to the guy's hands to track the action
that he expects to see happening if I
did the right thing to help you then I
expect you're gonna put the books there
okay so you can see these things
happening and we want to know what's
going on inside the mind that guides all
of that all right so that's the sort of
big scientific agenda that we're working
on over the next few years where we
think some kind of human understanding
of human intelligence in scientific
terms could lead to all sorts of AI
payoffs in particular suppose we could
build a robot that could do what this
kid and many other kids and these
experiments do just say help you out
around the house without having to be
programmed or even really instructed
just to kind of get a sense oh yeah you
need a hand with that sure let me help
you out okay even 18 month olds will do
that sometimes not very reliably or
effectively sometimes they'll try to
help and really do the opposite right
but imagine if you could take the the
flexible understanding of humans' actions
goals and so on and make those into reliable
engineering technology that would be
very useful and it would also be related
to say machines that you can actually
start to talk to and trust in some ways
right that shared understanding so how
are we gonna do this well let me spend
the rest of the time
talking about how we try to do this
right some of the some of the technology
that we're building both in our group
and more broadly to try to make these
kinds of architectures real and I'll
talk about two or three technical ideas
again not in any detail all right
um one is the idea of a probabilistic
program so this is a kind of a you think
of it as a computational abstraction
that we can use to capture the
common-sense knowledge of this core
cognition so when I say we have an
intuitive understanding of physical
objects and people's goals how do I build
a model of that model you have in the
head probabilistic programs a little bit
more technically one way to
understand them is as a generalization
of Bayesian networks or other kinds of
directed graphical models if you know
those okay but where instead of defining
a probability model on a graph you
define it on a program and thereby have
access to a much more expressive toolkit
of knowledge representation so data
structures other kinds of algorithmic
tools for representing knowledge okay
but you still have access to the ability
to do probabilistic inference like in a
graphical model but also causal
inference in a directed graphical model
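[Editor's illustration, not part of the talk.] The idea just described, a probability model defined on a program rather than on a graph, can be sketched in a few lines of plain Python. Everything here is a hypothetical toy: the `model` function, its probabilities, and the `infer` helper are assumptions made for illustration, and rejection sampling stands in for the more sophisticated inference that real probabilistic programming languages provide.

```python
import random

def model(rng):
    """A toy generative program (hypothetical numbers, not from the talk).

    Ordinary program structure (a loop whose length is itself random)
    defines the probability model, instead of a fixed graph.
    """
    heavy = rng.random() < 0.3          # latent cause: is the block heavy?
    n_pushes = rng.randint(1, 3)        # random number of pushes, 1 to 3
    fell = False
    for _ in range(n_pushes):
        p_fall = 0.1 if heavy else 0.6  # heavy blocks rarely topple
        if rng.random() < p_fall:
            fell = True
            break
    return {"heavy": heavy, "n_pushes": n_pushes, "fell": fell}

def infer(query, condition, n_samples=20000, seed=0):
    """Rejection sampling: run the program many times, keep only the runs
    consistent with the observation, and average the query over them."""
    rng = random.Random(seed)
    kept = []
    for _ in range(n_samples):
        trace = model(rng)
        if condition(trace):
            kept.append(query(trace))
    if not kept:
        raise ValueError("no samples satisfied the condition")
    return sum(kept) / len(kept)

# Prior belief that the block is heavy, before seeing anything.
prior = infer(lambda t: t["heavy"], lambda t: True)
# Belief after observing that the block did not fall.
posterior = infer(lambda t: t["heavy"], lambda t: not t["fell"])
print(f"P(heavy) = {prior:.2f}, P(heavy | did not fall) = {posterior:.2f}")
```

Conditioning on "did not fall" raises the estimated probability that the block is heavy, which is the kind of causal inference from an observed effect back to a hidden cause that the passage refers to.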
so for those of you who know about
graphical models that might make some
sense to you but just more broadly what
this is think of this as as a toolkit
that allows us to combine several of the
best ideas not just of the recent deep
learning era but over if you look back
over the whole scope of AI and as well
as cognitive science I think there's
three or four ideas there and more but
definitely like three ideas we could
really put up there that have proven
their worth and have risen
and fallen each of these ideas had
its time when the mainstream of the field
thought this was totally the way to go
and every other idea was was obviously a
waste of time and also had its time when
many people thought it was a waste of
time okay and these three big ideas I
would say are first of all the idea of
symbolic representation or symbolic
languages for knowledge representation
probabilistic inference in generative
models to capture uncertainty ambiguity
learning from sparse data and in their
hierarchical setting learning to learn
right and then of course the recent
developments with neural inspired
architectures for pattern recognition
okay each of these things each of these
ideas symbolic languages
probabilistic inference and neural
networks has some distinctive strengths
that are real weak points of the other
approaches right so to take one example
that I haven't really talked about here
but that you
mentioned as an outstanding challenge
for neural networks transfer learning
or learning to take knowledge across
a number of previous tasks to transfer
to others this is a real challenge and
has always been a challenge in a neural
net ok but is something that's addressed
very naturally and very scalably in for
example a hierarchical Bayesian model
and if you look at some of the recent
attempts really interesting attempts
within the deep learning world to try to
get kinds of transfer learning and
learning to learn they're really cool ok
but many of them are in some ways kind
of reinventing within a neural network
paradigm ideas that people you know
maybe just 10 or 15 years ago developed
in very sophisticated ways in let's say
hierarchical Bayesian models ok and a
lot of attempts to get sort of symbolic
algorithm like behavior in neural
networks again are really you know
they're very small steps towards
something which is a very mature
technology in computer systems and
programming languages probabilistic
programs I'll just sort of advertise
mostly are a way to combine the
strengths of all of these approaches to
have knowledge representations which are
as expressive as anything that anybody
ever did in the symbolic paradigm that
are as flexible at dealing with
uncertainty and sparse data as anything
in the probabilistic paradigm but that
also can support pattern recognition
tools to be able to for example to do
very fast efficient inference in very
complex scenarios so that's the kind
of conceptual framework there's a number
of actually implemented tools I point
to here on the slide a number of
probabilistic programming languages which
you can go explore
for example there's one that was
developed in our group a few years ago
almost 10 years ago now called Church
which was the antecedent of some of
these other languages built on a
functional programming core so Church
is a probabilistic programming language
built on the lambda calculus or really
in Lisp basically but there are many
other more modern tools especially if
you are interested in neural networks
there are tools like for example Pyro or
ProbTorch or BayesFlow that try to
combine all these ideas or for
example Gen here which is a project of
Vikash Mansinghka's probabilistic
computing group these are all things
which are
just in the very beginning stages very
very alpha but you can find out more
about them online or by writing to their
creators and I think this is a this is a
very exciting place where the
convergence of a number of different AI
tools is happening and this
will be absolutely necessary for making
the kind of architecture that I'm
talking about work another key idea
which we've been building on in our lab
and I think again many people are using
some version of this idea but maybe a
little bit different from the way we're
doing it is what what version of this
idea that I'd like to talk about is what
I call the game engine in the head so
this is the idea that it's really what
the programs are about when I talk about
probabilistic programs I haven't said
anything about what kind of programs
we're using we're just basically these
probabilistic programming languages at
their best and Church the language that
was developed by Noah Goodman and
Vikash and others and Dan Roy in our
group some 10 years ago was intended to
be a turing-complete probabilistic
programming language so any probability
model that was computable or for whose
inferences conditional inferences are
computable you could represent in these
languages but that leaves
completely open what I'm actually
gonna what kind of program I'm gonna
write to model the world and I've been
very inspired in the last few years by
thinking about the kinds of programs
that are in modern video game engines so
again probably most of you are
familiar with these and
increasingly they're playing a role in
all sorts of ways in AI but these are
tools that were developed by the video
game industry to allow a game designer
to make a new game without having
to do most of in some sense much
of the hard technical work from
scratch but rather to focus on the
characters the world the story okay the
things that are more interesting for
designing a novel game in particular
if we want a player to explore some
new three-dimensional world but to have
them be able to interact with the world
in real time and to render nice looking
graphics in real time in an
interactive way as the player moves
around and explores the world or if you
want to populate the world with
non-player characters that will behave
in an even vaguely intelligent way okay
game engines give you tools for doing
all of this without having to write all
of graphics from scratch or all of
physics the rules of physics from
scratch so what are called game physics
engines
in some sense are a set of
principles but also hacks from Newtonian
mechanics and other areas of physics
that allow you to simulate plausible
looking physical interactions in very
complex worlds very approximately but
very fast there's also what's called
game AI which are basically very simple
planning models so let's say I want to
have an AI in the game that is like
a guard that guards a base and a
player is gonna attack the base so back
in the old Atari days like when I was a
kid you know the guards would just be
like random things that would fire
missiles kind of randomly in random
directions at random times right but
let's say you want a guard to be a
little intelligent so to actually look
around him oh and I see the player and
then to actually start shooting at you
and to even maybe pursue you so that
requires putting a little AI in the game
and you do that by having basically
simple agent models in the game so what
we think and some of you might think
this is crazy and some of you might
think this is very natural idea I get
both kinds of reactions what we think is
that these tools of you know fast
approximate renderers physics engines
and sort of very simple kinds of AI
planning are an interesting first
approximation to the kinds of
common-sense knowledge representations
that evolution has built into our brains
so when we talk about the cognitive core
or how do babies start what's what you
know ways in which a baby's brain isn't
a blank slate one interesting idea is
that it starts with something like these
tools and then wrapped inside a
framework for probabilistic inference
that's what we mean by probabilistic
programs that can support many
activities of common sense perception
and thinking so I'll just give you one
example what we call this intuitive
physics engine okay so this is work that
we did in our group that Pete Battaglia
and Jess Hamrick started this work
about five years ago now
where we showed people you know in some
sense and this is this is also an
illustration of a kind of experiment
that you might do when I keep
talking about science like I'll show you
now a couple of experiments right so we
would show people simple physical scenes
like these blocks world scenes and ask
them to make a number of judgments and
the model we built does basically a
little bit of probabilistic inference in
a game style physics engine it perceives
the physical state and imagines a few
different possible ways the world could
go over the next one or two seconds to
answer questions like will the stack of
blocks fall
or if they fall how far will they fall
or which way will they fall or what
would happen if say
one color of blocks or one material
like the green stuff is ten times
heavier than the gray stuff or vice
versa how will that change the direction
of fall or look at those red and yellow
stack blocks some of which look like
they should be falling but aren't so why
can you infer from the fact that they're
not falling that one color block is much
heavier than the other let me show you a
sort of a slightly weird task it's in a
behavioral experiment sometimes we we do
weird things so that we can test ways in
which you use your knowledge that you
didn't just you know learn from pattern
recognition but use it to do new kinds
of tasks that you'd never seen before so
here's a task which you know many of you
have maybe seen me talk about these
things so you might have seen this task
but probably only if you saw me give a
talk around here before we call this the
red yellow task and again we'll make
this one interactive so imagine that the
table with the blocks on it is bumped hard
enough to knock some of the blocks onto
the floor so you tell me is it more
likely to be red blocks or yellow blocks
what do you say red okay good
how about here yellow good how about
here uh-huh here here okay here here
okay so
you just experienced for yourself what
it's like to be a subject in one of
these experiments we just did the
experiment here the data is all captured
on video sort of right okay you could
see that sometimes people were very
quick other times people were slower
sometimes there was a lot of consensus
sometimes there was a little bit less
consensus right
that reflects uncertainty so again
there's a long history of studying this
scientifically that you know you could
but you can see something you can see
the probabilistic inference at work
probabilistic inference over what well I
would say one way to describe it is over
one or a few short low precision
simulations of the physics of these
scenes so here is what I mean by this
I'm gonna show you a video of a game
engine reconstruction of one of these
scenes that simulates a small bump so
here's a small bump here's the same
scene with a big bump okay now notice
that at the micro level different things
happen but at the cognitive or macro
level that matters for common sense
reasoning the same thing happened namely
all the yellow blocks went over onto one
side of the table and few or none of the
red blocks did so it didn't matter which
of those simulations you ran in your
head you'd get the same answer in this
case right this is one that's very easy
and high confidence and quick also you
didn't have to run the simulation for
very long you only have to run it for a
few time steps like that to see what's
gonna happen or similarly here you only
have to run it for a few time steps okay
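[Editor's illustration, not part of the talk.] The idea of answering "will it fall?" from a handful of short, low-precision simulations can be sketched as follows. This is a deliberately simplified stand-in, not the model from the experiments: a static center-of-mass check replaces the game physics engine, all blocks are assumed to have the same width and mass, and the names `tower_falls`, `prob_fall`, and the noise level are hypothetical.

```python
import random

BLOCK_WIDTH = 1.0  # assumption: every block has the same width and mass

def tower_falls(xs):
    """Crude stability check for a 1-D stack (xs = block centers, bottom
    to top). The stack is judged to fall if, for any block, the center of
    mass of everything above it lies beyond that block's edge."""
    for i in range(len(xs) - 1):
        above = xs[i + 1:]
        center_of_mass = sum(above) / len(above)
        if abs(center_of_mass - xs[i]) > BLOCK_WIDTH / 2:
            return True
    return False

def prob_fall(xs, perceptual_noise=0.1, n_sims=10, seed=0):
    """Run a few noisy 'mental simulations': jitter the perceived block
    positions, check each jittered scene, and average the outcomes."""
    rng = random.Random(seed)
    falls = 0
    for _ in range(n_sims):
        noisy = [x + rng.gauss(0.0, perceptual_noise) for x in xs]
        falls += tower_falls(noisy)
    return falls / n_sims

stable = [0.0, 0.05, 0.0]      # nearly aligned blocks
ambiguous = [0.0, 0.3, 0.6]    # leaning, close to the tipping point
precarious = [0.0, 0.45, 0.9]  # leaning well past the tipping point

for name, scene in [("stable", stable), ("ambiguous", ambiguous),
                    ("precarious", precarious)]:
    print(name, prob_fall(scene, n_sims=200))
```

The noise-free check gives only a yes or no answer, while averaging over noisy runs gives a graded judgment: the borderline stack comes out uncertain rather than simply stable, which is the kind of graded response the averaged simulations are meant to capture.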
and it doesn't have to be even very
accurate even a fair amount of
imprecision will give you basically the
same answer at the level that matters
for common sense so that's the kind of
thing our model does it runs a few low
precision simulations for a few time
steps but if you take the average of
what happens there and you compare that
with people's judgments you get results
like what I show you here the
scatterplot shows on the y-axis the
average judgments of people on the
x-axis the average judgments of this
model and it does a pretty good job it's
not perfect but the model basically
captures people's graded sense of what's
going on in this scene and many of these
others okay and it doesn't do it with
any learning but I'll come back to that
in a second it just does it by
probabilistic reasoning over a game
physics simulation now we can use and we
have used the same kind of technology to
capture in very simple forms really just
proofs of concept at this point the kind
of common-sense physical scene
understanding in a child
playing with blocks or other objects or
in what might go on in a young child's
understanding of other people's actions
what we
called the intuitive psychology engine
where now the probabilistic programs are
defined over these kind of very simple
planning and perception programs and I
won't go into any details I'll just
point to a couple of papers that my
group played a very small role in but we
provided some models which together with
some infant researchers people working
on both of these are experiments that
that were done with 10 or 12 month
infants so younger than even some of the
babies I showed you before but basically
like that youngest baby the one with the
cat here's an example of showing simple
physical scenes these are moving objects
to 12 month olds where they saw a few
objects bouncing around inside a gumball
machine and after some point in time the
scene gets occluded you'll see the scene
is occluded and then after another
period of time one of the objects will
appear at the bottom and the question is
is that the object you expected to see
or not is it expected or surprising the
standard way you study what infants know
is by what's called looking time
methods just like an adult if I show you
something that's surprising you might
look longer okay if you're bored you'll
look away all right so you can do that
same kind of thing with infants and by
measuring how long they look at a scene
you can measure whether you've shown
them something surprising or not all
right
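The model described above for the block scenes, a few short, noisy simulations whose outcomes are averaged into a graded judgment, can be sketched roughly as follows. This is only a toy one-dimensional stand-in and not the game-engine model from the talk: the functions `simulate_once` and `judge`, the friction factor, and all numeric parameters are assumptions made for illustration.

```python
import random

def simulate_once(blocks, bump, noise, steps, rng):
    """One short, low-precision simulation of bumping the table.

    blocks: list of (color, x) with x in [0, 1]; x > 1 means off the table.
    Returns how many blocks of each color ended up past the edge.
    """
    fallen = {"red": 0, "yellow": 0}
    for color, x in blocks:
        # Noisy push: the observer is uncertain about the exact force.
        v = bump + rng.gauss(0.0, noise)
        for _ in range(steps):  # only a few time steps
            x += v
            v *= 0.5  # crude friction (assumed)
        if x > 1.0:
            fallen[color] += 1
    return fallen

def judge(blocks, bump=0.15, noise=0.05, steps=3, n_sims=5, seed=0):
    """Graded judgment that more red than yellow blocks fall (0 to 1)."""
    rng = random.Random(seed)
    votes = 0.0
    for _ in range(n_sims):
        fallen = simulate_once(blocks, bump, noise, steps, rng)
        if fallen["red"] > fallen["yellow"]:
            votes += 1.0
        elif fallen["red"] == fallen["yellow"]:
            votes += 0.5  # a tie counts as "not sure"
    return votes / n_sims

# Yellow blocks sit near the edge, red blocks far from it.
scene = [("yellow", 0.98), ("yellow", 0.95), ("red", 0.1), ("red", 0.2)]
p_red = judge(scene)
```

In this toy scene nearly every simulation agrees that yellow blocks fall, which corresponds to the quick, high-confidence cases; scenes where the simulations disagree would give intermediate values.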
people have there are literally hundreds
of studies if not more using looking
time measures to study what infants know
but only with this paper that we
published a few years ago did we have a
quantitative model where we were able to show a
relation between inverse probability in
this case and surprise so things which
were objectively lower probability under
one of these probabilistic physics
simulations across a number of different
manipulations of how fast the objects
were where they were when the scene was
occluded how long the delay was various
physically relevant variables how many
objects there were one type or another
infants' expectations connected with this
model or another paper that we published
that one the experiments
were done by Ernő Téglás
and Luca Bonatti's lab here is a
study that was done just recently by
Shari Liu in Liz Spelke's lab there
at Harvard but they're
partners with us in CBMM which was
about infants understanding of goals so
this is more like again understanding of
agents and intuitive psychology
where again in very simple cartoon
scenes you show an infant an agent that
seems to be doing something like an
animated cartoon character but it jumps
over a wall or
rolls up a hill or it jumps over a gap
and the question is basically how much
does the agent want the goal that it
seems to be trying to achieve and what
this study showed okay and the models
here were done by Tomer Ullman was that
infants appeared to be sensitive to the
physical work done by the agent the more
work the agent did in a sense of the
integral of force applied over a path
the more the infants thought the agent
wanted the goal we think of this as
representing what we've sometimes called
the naive utility calculus so the idea
that there's a basic calculus of cost
and benefit you know we take actions
which are a little bit costly to achieve
goal states which give us some reward
that's the most basic way the oldest way
to think about rational intentional
action and it seems that even
ten-month-olds
understand some version of that where
the cost can be measured in physical
terms
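As a rough sketch of the cost side of this naive utility calculus, the work done by the agent can be approximated as a sum of force times displacement along its path. The hill shapes, the unit mass, and the frictionless, slow-climb assumption below are all hypothetical choices for illustration and are not taken from the study.

```python
G = 9.81  # gravitational acceleration in m/s^2

def work_along_path(points, force_at):
    """Approximate W = integral of F . ds over a polyline path,
    using the applied force at the midpoint of each segment."""
    total = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        fx, fy = force_at((x0 + x1) / 2, (y0 + y1) / 2)
        total += fx * (x1 - x0) + fy * (y1 - y0)
    return total

def lifting_force(mass):
    # Assumption: the agent only has to push against gravity.
    return lambda x, y: (0.0, mass * G)

low_hill = [(0.0, 0.0), (0.5, 0.2), (1.0, 0.5)]   # climbs 0.5 m
high_hill = [(0.0, 0.0), (0.5, 1.0), (1.0, 2.0)]  # climbs 2.0 m

cost_low = work_along_path(low_hill, lifting_force(1.0))
cost_high = work_along_path(high_hill, lifting_force(1.0))

# Cost-benefit reading: an agent that chooses to pay a cost must value
# the goal at least that much, so the work is a lower bound on reward.
reward_lower_bound = {"low hill": cost_low, "high hill": cost_high}
```

Under these assumptions the agent that climbs the higher hill reveals a larger lower bound on how much it wants its goal, which is the qualitative pattern described for the infants.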
okay I see I'm running a little bit
behind on time and and I wanted to leave
some time for discussion so I'll I'll
just go very quickly through a couple of
other things and I'm happy to
stay around at the end for discussion
okay
the what I showed you here was the
science where does the engineering go so
one way one thing you can do with this
is say build a machine system that can
look not at a little animated cartoon like
these baby experiments but a real person
doing something and again combine
physical costs and constraints of actions
with some understanding of the agent's
utilities that's the math of planning to
figure out what they want so look in
this scene here and see if you can judge
which object that the woman is reaching
for so you can see there's there's a
grid of four by four objects there's
sixteen objects here and she's gonna be
reaching for one of them raise it's
gonna play in slow motion but raise your
hand when you know which one she's
reaching for ok so just watch and raise
your hand when you know which one she
wants okay so most of
them are up by now alright and notice I
was looking at your hands not here
but what happened is most of the
hands were up at about the time when
that gray or that dashed line shot up
okay
that's not human data you provided the
data this is our model so our model is
predicting more or less when you're able
to say what her goal was okay it's well
before she actually touched the object
how does the model work again I'll skip
the details but it does the same kind of
thing that that our models of those
infants did namely it but in this case
it does it with a full body model from
robotics so we use what's called the
MuJoCo physics engine which is a standard
tool in robotics for planning physically
efficient reaches of say a humanoid
robot and we say we can give this
planner program a goal object as input
we can give it each of the possible goal
objects as input and say plan the most
physically efficient action so the one
that uses like the least energy to get
to that object and then we can do a
Bayesian inference this is the
probabilistic inference part the program
is the MuJoCo planner okay but
then we can say I want to do Bayesian
inference to work backwards from what I
observed which was the action to the
input to that program what goal was
provided as input to the planner and
here you can see the full array of four
by four possible inputs and those bars
that are moving up and down that's the
Bayesian posterior probability of how
likely each of those was to be the goal
and what you can see is it converges on
the right answer at least well it turns
out to be the ground truth right answer
but it's also the right answer according
to what people think with about the same
kind of data that people took now you
might say well okay I'm sure if I just
wanted to build a system that could
detect what somebody was reaching for I
could generate a training data set of
this sort of scene and train something
up to analyze patterns of motion but
again because the engine in your head
actually does something we think more
like this it does what we call inverse
planning over a physics model it can
apply to much more interesting scenes
that you haven't really seen much of
before so take the scene on the left
right where again you see somebody
reaching for one of a four by four array
of objects but what you see is a strange
kind of reach can you see why he's doing
that strange reach up there it's a
little small but what is you can see
that he's reaching over something right
it's actually a pane of glass right you
see that and then there's this other guy
who's helping him who sees what he wants
and hands
the thing he wants so how does
the guy in the foreground see the other
guy's goal
how does he infer his goal and know
how to help him and then how do we look
at the two of them and figure out who's
trying to help who or that in a scene
like this one here that it's not
somebody trying to help somebody but
rather the opposite
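The inverse planning computation described a little earlier, giving a planner each candidate goal and then using Bayesian inference to work backwards from the observed action to the goal, can be sketched as follows. This is a minimal illustration only: the straight-line `plan_reach` planner stands in for a real physics-based planner, and the Gaussian-style likelihood, the `beta` parameter, and the grid layout are assumptions rather than details from the talk.

```python
import math

def plan_reach(start, goal, n_steps):
    """Hypothetical stand-in planner: a straight-line reach to the goal."""
    return [(start[0] + (goal[0] - start[0]) * t / n_steps,
             start[1] + (goal[1] - start[1]) * t / n_steps)
            for t in range(n_steps + 1)]

def goal_posterior(observed, start, goals, n_steps, beta=50.0):
    """Posterior over goals given the hand positions seen so far.

    Uniform prior; likelihood falls off with squared distance between
    the observed positions and the path planned for each goal.
    """
    log_post = []
    for goal in goals:
        planned = plan_reach(start, goal, n_steps)
        err = sum((ox - px) ** 2 + (oy - py) ** 2
                  for (ox, oy), (px, py) in zip(observed, planned))
        log_post.append(-beta * err)
    top = max(log_post)
    weights = [math.exp(lp - top) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]

# A four by four array of candidate goal objects.
goals = [(float(c), float(r)) for r in range(4) for c in range(4)]
start = (1.5, -2.0)
true_index = 7  # the object at column 3, row 1
full_reach = plan_reach(start, goals[true_index], 10)

early = goal_posterior(full_reach[:2], start, goals, 10)
later = goal_posterior(full_reach[:6], start, goals, 10)
```

With only the first couple of positions observed the posterior stays spread over several objects, and it concentrates on the true goal well before the reach is finished, similar in spirit to the moving bars in the demo.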
okay so here's a model on the left of
how that might work right and we think
this is the kind of model needed to
tackle this sort of challenge here right
basically it's a model it's a we take
this model of planning sort of maximal
expected utility planning which you can
run backwards but then we recursively
nest these models inside each other so
we say an agent is helping another agent
if this agent is acting apparently to us
seems to be maximizing an expected
utility that's a positive function of
that agent's expectation about another
agent's expected utility and that's what
it means to be a helper hindering is
sort of the opposite if one seems to be
trying to lower somebody else's utility
okay and we've used these same kind of
models to also describe infants
understanding of helping and hindering
in a range of scenes I'll just say one
last word about learning because
everybody wants to know about learning
and and the the key thing here and it's
definitely part of any picture of AGI
but the thought I want to leave you on
is really about what learning is about
ok I'll be just a few more slides and
then I'll stop I promise none of the
models I showed you so far really did
any learning they certainly didn't do
any task specific learning ok we set up
a probabilistic program and then we let
it do inference now that's not to say
that we don't think people learn to do
these things we do but the real learning
goes on when you're much younger right
everything I showed you in basic form
even a one-year-old baby can do ok the
basic learning goes on to support these
kinds of abilities not that there isn't
learning beyond one year but the basic
way you learn to say solve these physics
problems is what goes on
in the brain of a child between 0 and 12
months so this is just an example of
some phenomena that come from the
literature on infant cognitive
development these are very rough
timelines you can take pictures of this
if you like this is always a popular
slide because it really is quite
inspiring I think and I can give you
lots of literature pointers but I'm
summarizing in very broad strokes with
big error bars what we've learned in the
field of infant cognitive development
about when and how kids seem to
at least come to a certain understanding
of basic aspects of physics so if you
really want to study how people learn to
be intelligent a lot of what you have to
study are kids at this age you have to
study what's already in their brain at
zero months and what they learn and how
they learn between four six eight ten
twelve and so on and on up beyond that
okay now well effectively what that
amounts to we think is if what you're
learning is something like a let's say
an intuitive game physics engine to
capture these basic abilities then what
we need if we're gonna try to
reverse-engineer that is what we might
think of as a program learning program
if your knowledge is in the form of a
program then you have to have programs
that build other programs right this is
what I was talking about at the beginning
about learning as building models of the
world or ultimately if you think what we
start off with is something like a game
engine that can play any game then what
you have to learn is the program of the
game that you're actually playing or the
many different games that you might be
playing over your life so think of
learning as like programming the game
engine in your head to fit with your
experience and and to fit with the
possibilities that you seem like you can
take now this is what you could call the
hard problem of learning if you come to
learning from say neural networks or
other tools in machine learning right
so what makes most of
machine learning go right now and
certainly what makes neural networks so
appealing is that you can set up a
basically a big function approximator
that can approximate many of the
functions you might want to do in a
certain application or task but in a way
that's end-to-end differentiable and
with a meaningful cost function so you
can have one of these nice optimization
landscapes you can compute the gradients
and basically just roll downhill until
you get to an optimal solution but if
you're talking about learning as
something like search in the space of
programs we don't know how to do
anything like that yet we don't know how
to set this up as any kind of a nice
optimization problem with any notion of
smoothness or gradients okay rather what
we need is a instead of learning as like
rolling downhill effectively right a
process which just if you're willing to
wait long enough you know some you know
simple algorithm will take care of think
of what we call the idea of learning as
programming there's a popular metaphor
in cognitive development called the
child as scientist which emphasizes
children as active theory builders and
children's play as a kind of
casual experimentation but this is the
algorithmic complement to that what we
could call the child as
or around MIT we'll say the child as
hacker but in the rest of the world if you
say child as hacker they think of
something someone who breaks into your
email and steals your credit card
numbers we all know that hacking is you
know making your code more awesome right
if your knowledge is some kind of code
or library of programs then
learning is all the ways that a child
hacks on their code to make it more
awesome that more awesome can mean more
accurate but it can also mean faster
more elegant more transportable to other
applications or other tasks more
explainable to others maybe just more
entertaining okay children
have all of those goals in
learning and the activities by which
they make their code more awesome also
correspond to many of the activities of
coding alright so think about all the
ways on a day-to-day basis you might
make your code more awesome
all right you might tune you might have
a big library of existing functions with
some parameters that you can tune on a
data set that's basically what you do
with backprop or stochastic gradient
descent in training a deep learning
system but think about all the ways in
which you might actually modify the
underlying function so write new code or
take old code from some other thing and
map it over here or make a whole new
library of code or refactor your code to
some other you know some other basis for
that that will work more robustly and be
more extensible or transpiling or
compiling right or even just commenting
your code or asking someone else for
their code ok again these are all ways
that we make our code more awesome and
children's learning has analogs to all of
these that we would want to understand
as an engineer from an algorithmic point
of view so in our group we've been
working on on various early steps
towards this and again we don't have
anything like program writing programs
at the level of children's learning
algorithms but one example of something
that we did in our group which you might
not have thought of being about this but
it's definitely the AI work we did that
got the most attention in the last
couple of years from our group we had
this paper that was in science it was
actually on the cover of science sort of
just hit the market at the right time if
you like and it got about a hundred
times more publicity than anything else
I've ever done which is partly a
testament to the really great work that
Brendan Lake who was the first author
did for his PhD here but much more so
just about the hunger for AI systems at
the time when we published this in 2015
and we built a machine system that the
way we described it was
doing human-level concept learning for
very simple visual
concepts these handwritten characters in
many of the world's alphabets for those
of you who know the famous MNIST data
set the data set of handwritten
digits 0 through 10 or 0 through 9
sorry that drove so much good research
in deep learning and pattern recognition
it did that not because Yann LeCun who
put that together or Geoff Hinton who
did a lot of work on deep learning with
MNIST were interested fundamentally
in character recognition but because they saw
that as a very simple testbed for
developing more general ideas and
similarly we did this work on getting
machines to do what we call one-shot
learning of generative models also to
develop more general ideas we saw this
as learning very simple little mini
probabilistic programs in this case what
are those programs they're the programs
you use to draw a character so ask
yourself how can you look at any one of
these characters and see in a sense how
somebody might draw it the way we tested
this in our system was this little
visual Turing test where we showed
people one character in a novel alphabet
and we said draw another one and then we
compared nine people like say on the
left and nine samples from our machine
say on the right and we said we asked
other people could you tell which was
the human drawing another example or
imagining another example and which was
the machine and people couldn't tell
when I said ones on the left ones on the
right I don't actually remember and on
different ones you can see if you can
tell it's very hard to tell can you tell
which is for each one of these
characters which new set of examples
were drawn by a human versus a machine
here's the right answer and probably you
couldn't tell the way we did this was by
assembling a simple kind of program
learning program right
so we basically said when you draw a
character you're assembling strokes and
sub strokes with goals and sub goals
that produce ink on the page and when
you see a character you're working
backwards to figure out what was the
program the most efficient program that
did that so you're basically inverting a
probabilistic program doing Bayesian
inference to the program most likely to
have generated what you saw this is one
small step we think towards being able
to learn programs to being able to learn
something ultimately like a whole game
engine program the last thing I'll leave
you with is just a pointer to sort of
work in action
right so this is some work being done by
a current PhD student who works partly
with me but also with Armando
Solar-Lezama in CSAIL this is Kevin Ellis
it's an example of what's now I think
again an
emerging exciting area in AI well beyond
anything that we're doing which is
combining techniques from where Armando
comes from which is the world of
programming languages not machine
learning or AI but tools from
programming languages which can be used
to automatically synthesize code okay
with the machine learning toolkit in
this case a kind of Bayesian
minimum description length idea to be
able to make again what is really one
small step towards machines that can
learn programs by basically trying to
efficiently find the shortest simplest
program which can capture some data set
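The idea of looking for the shortest program that captures a data set can be illustrated with a brute-force toy. This sketch simply enumerates expressions in a tiny made-up grammar from smallest to largest and returns the first one consistent with the examples; it is not the program synthesis machinery referred to in the talk, and the grammar, the size measure, and the function names are all assumptions.

```python
LEAVES = ["x", "1", "2", "3"]
OPS = ["+", "*"]

def programs_of_size(n):
    """Yield every expression string that uses exactly n leaves."""
    if n == 1:
        yield from LEAVES
        return
    for left_size in range(1, n):
        for left in programs_of_size(left_size):
            for right in programs_of_size(n - left_size):
                for op in OPS:
                    yield f"({left} {op} {right})"

def run(program, x):
    # eval is used only because every program comes from the tiny
    # grammar above; builtins are disabled as a precaution.
    return eval(program, {"__builtins__": {}}, {"x": x})

def shortest_program(examples, max_size=4):
    """Return the first (hence shortest) program matching all examples."""
    for size in range(1, max_size + 1):
        for program in programs_of_size(size):
            if all(run(program, x) == y for x, y in examples):
                return program
    return None

# Input-output pairs generated by the rule y = 2x + 1.
examples = [(0, 1), (1, 3), (2, 5)]
found = shortest_program(examples)
```

Preferring smaller programs plays the role of the simplicity prior here; exhaustive enumeration grows very quickly with program size, which is one reason more efficient synthesis techniques matter.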
so we think by combining these kinds of
tools in this case let's say from
Bayesian inference over programs with a
number of tools that have been developed
in other areas of computer science that
don't look anything or haven't been
considered to be machine learning or AI
like programming languages it's one of
the many ways that going forward we're
gonna be able to build smarter more
human-like machines so just to end then
what I've tried to do here is
first of all identify the ways in
which human intelligence goes beyond
pattern recognition to really all these
activities of modeling the world okay to
give you a sense of some of the domains
where we can start to study this in
common sense scene understanding for
example or you know something like
one-shot learning for example like what
we were just doing there or learning as
programming the engine in your head okay
and to give you a sense of some of the
technical tools probabilistic programs
program synthesis game engines for
example as well as a little bit of deep
learning that bringing together we're
starting to be able to make these things
real okay now that's the science agenda
and the reverse engineering agenda but
think about for those of you who are
interested in technology what are the
many big AI frontiers that this opens up
so the one I'm most excited about is
this idea which is which I've
highlighted here in our big research
agenda this is one I'm most excited
about to work on for the you know it
could be the rest of my career honestly
but it's really what is what is the
oldest and maybe the best dream of AI
researchers of how to build a human-like
intelligence system a real AGI system
it's the idea that Turing proposed when
he proposed the Turing test or Marvin
Minsky proposed this at different times
in his life or many people have proposed
this right which is to build a system
that grows into intelligence the way a
human does that starts like a baby and
learns like a child
I've tried to show you how we're starting to
be able to understand those things what
a baby's mind starts with how children
actually learn and looking forward we
might we might imagine that someday
we'll be able to build machines that can
do this I think we can actually start
working on this right now and we're and
that's something that we're doing in our
group so if that kind of thing excites
you then I encourage you to work on it
maybe even with us or if any one of
these other activities of human
intelligence excite you I think taking
the kind of science-based reverse
engineering approach that we're doing
and then trying to put that into
engineering practice it's it's this is
this is a this is not just a possible
route but I think it's it's quite
possibly the most valuable route that
you could work on right now to try to
actually achieve at least some kind of
artificial general intelligence
especially the kind of intelligence AI
system that's going to live in a human
world and interact with humans there's
many kinds of AI systems that could live
in worlds of data that none of us can
understand or will ever live in
ourselves but if you want to build
machines that can live in our world and
interact with us the way we are used to
interacting with other people then I
think this is a route that you should
consider okay thank you
[Applause]
hi there so early in the talk you
expressed some skepticism about whether
or not industry would get us to
understanding human level intelligence
it seems that there's a couple of trends
that favor industry one is that industry
is better than academia at
accumulating resources and plowing them back
into the topic and it seems at the
moment we've got a bit of brain drain
going on from academia into industry and
that seems like an ongoing trend yeah if
you look at something like learning to
fly or learning to fly into space then
it looks like the story is one of industry
kind of taking over the field and going
off on its own yeah a little bit
academia academics still have a role but
industry kind of dominates so yes is
industry going to overtake the field you
think well that's a really good question
and it's got several good questions
packed into one there right I didn't
mean to say I didn't this wasn't meant
to say go academia bad industry right
what I tried to
say was the approaches that are
currently getting the most attention in
industry and they're really because
they're really the most valuable ones
right now for the short term you know
any industry is really focused on what
it can do what are the value
propositions on basically a two year
time scale at most I mean if you ask say
Google researchers to take the most
prominent example it's pretty much what
they'll all tell you okay maybe maybe
things that might you know pay off
initially in two years but maybe take
five years or more to really develop but
if if you can't show that it's gonna do
something practical for us in two years
in a way that matters for our bottom
line then it's not really worth doing
okay so what when we say what I'm
talking about is the technologies which
right now industry sees as meeting that
specification and what I'm saying is
right now I think those that's that's
not where the route is to something like
human-like but not the most valuable
promising route to human-like kinds of
AI systems all right but I hope that
like in the cases you said you know the
basic research that we're doing now will
be successful enough that it will get
the attention of industry when the time
is right but I think so you know I mean
I hope at some point you know it won't
it will only at least the engineering
side will have to be done in industry
not just in academia but you're also
pointing to issues of like brain drain
and other things like that
but I think it's these are real issues
confronting our community I think
everybody knows this and I'm
this will come up multiple times here
which is you know I think we have to
find ways to even now to combine the
best of the idea of the energy and the
resources of academia and industry if we
want to keep doing basically something
interesting right if we will if we just
want to redefine AI to be well whatever
people currently call AI but scaled up
well then then then fine forget about it
and or if we just want to say let me and
people like me do what we're doing at
what industry would consider a snail's
pace on toy problems okay fine but if
but if we want to if you know if I want
to take what I'm doing to the level that
that will really be you know paying off
at a level that industry can appreciate
or just that really has technological
impact on a broad scale right or I think
if industry wants to take what it's
doing and really build machines that are
actually intelligent right or machine
learning that actually learns like a
person then I think we need each other
now and not just at some point in the
future so this is a general challenge
for MIT and for everywhere and for
Google I mean we just spent a few days
talking to Google about exactly this
issue that this was a talk I prepared
partly for that purpose so we wanted to
raise those issues and and it's just I
mean really there I don't know what I
mean well rather I can think of some
solutions to that problem of what you
could call brain drain from the academic
point of view or what you could call
just narrowing in into certain local
minima in the industry point of view but
they will require the leadership of both
academic institutions like MIT and
companies like Google being creative
about how they might work together in
ways that are a little bit outside of
their comfort zone I hope that will
start to happen including at MIT and at
many other universities and at companies
like Google and many others and I think
we need it to happen for the health of
all parties concerned okay thank you
very much thanks I'm curious about sort
of the premise that you gave that one of
the big gaps missing at determining
intelligence is the fact that we need to
teach machines how to recognize models
and I'm curious as to what you think
sort of non goal oriented cognitive
activity comes into play there things
like feelings and emotions
and and y-you don't think that might not
necessarily be like that no it's
an important question the only reason
emotions didn't appear on my slide is
because there's a few reasons but the
slide is only so big I wanted the font
to be big readable for such an important
slide I've had versions of my slide in
which I do talk about that okay it's not
that I think feelings or emotions aren't
important I think they are important and
I used to not have many insights on it
about what to do about them but actually
partly based on some of my colleagues
here at MIT BCS Laura Schulz and
Rebecca Saxe two of my cognitive
colleagues who I work closely with
they've been starting to do research on
how people understand emotions both
their own and others and we've been
starting to work with them on
computational models so that's actually
something I'm actively interested in and
even working on but I would say and
again for those of you who study emotion
to know about this actually you're gonna
have Lisa coming in right oh so she's
gonna basically say a version of the
same thing I think the deepest way to
understand she's one of the world's
experts on this the deepest way to
understand emotion is very much based on
our mental models of ourselves of the
situation we're in and of other people
right think about for example all of the
different I mean if you you know if you
think about it I mean again Lisa will
talk all about this but if you think
about emotion as just a very small set
of what are sometimes called basic
emotions like being happy or angry or
sad or you know those are a small number
of them right there's usually only a few
right you might not say you might see
that it's somehow like very basic things
that are opposed to some kind of
cognitive activity but think about all
the different words we have for emotion
right for example think about a
famous cognitive emotion like regret
what does it mean to feel regret or
frustration right just to know both for
yourself when you're not just feeling
kind of down or negative but you're
feeling regret that that means something
like I have to feel like there's a
situation that came out differently from
how I hoped and I realize I could have
done something differently right so that
means you have to be able to understand
you have to have a model you have to be
able to do a kind of counterfactual
reasoning and to think oh if only I had
acted a different
way then I can predict that the world
would have come out differently and
that's the situation I wanted but
instead it came out this other way right
or think about frustration again that
requires something like understanding
okay I've tried a bunch of times I
thought this would work but it doesn't
seem to be working maybe I'm ready to
give up though those are all those are
those are very important human emotions
we have to understand to understand
ourselves we need that to understand
other people to understand communication
but those are all filtered through the
kinds of models of action that I was
just the ones I was talking about here
with these say cost-benefit analyses of
action so what I'm so I'm just trying to
say I think this is very basic stuff
that will be the basis for building I
think better engineering style models of
the full spectrum of human emotion
beyond just like well I'm feeling good
or bad or scared okay
and if I think when you see Lisa she
will in her own way say something very
similar interesting thanks yeah thanks
Josh for your nice talk so all is about
human cognition and try to build a model
to mimic those cognition but you don't
how much could help you to understand
how the circuit implement those things
hmm I mean like these circuits in the
brain yeah yeah that's the is that what
you work on by any chance is that what
you work on by any chance yeah yeah yeah
so so in the Center for Brains Minds and
Machines as well as in brain and
cognitive science yeah we I have a
number of colleagues who study the
actual hardware basis of this stuff in
the brain and that includes like the
large-scale architecture of the brain
say like what Nancy Kanwisher and
Rebecca Saxe study with functional
brain imaging or the more detailed
circuitry which usually requires
recording from say non-human brains
right at the level of individual neurons
and connections between neurons all
right so I'm very interested in those
things although it's not mostly what I
work on right but I would say you know
again like in many other areas of
science certainly in neuroscience the
kind of work I'm talking about here in a
sort of classic reductionist program
sets the target for what we might look
for like if I if I just want to go I
mean I would assert
right or my working conjecture is that
if if you do the kind of work that I'm
talking about here it gives you the
right targets or gives you a candidate
set of targets to look for what are the
neural circuits computing right whereas
if you just go in and just say
poking around in the brain or have some
idea that what you're gonna try to do is
find the neural circuits which underlie
behavior without a sense of the
computations needed to produce those
behaviors I don't I think it's gonna be
very difficult to even know what to
look for and to know when you've found
even viable answers so I think that's
you know that's the standard kind of
reductionist program but it's not that's
it's not I also think it's it's not one
that is divorced from the study of
neural circuits it's also one if you
look at the broad picture of reverse
engineering it's one where neural
circuits and understanding the circuits
in the brain play an absolutely critical
role okay I would say that
when you look at the brain at the
hardware level as an engineer I'm mostly
looking at the software level right but
when you look at the hardware level
there are some remarkable properties one
remarkable property again is you know
how much parallelism there is and in
many ways how fast the computations are
okay
neurons are slow but the computations
of intelligence are very fast so how do we
get elements that are in some sense
quite slow in their time constant to
produce such intelligent behavior so
quickly that's a great mystery and I
think if we understood that it would
have payoff for building all sorts of
you know basically application
embedded circuits okay but also maybe
most important is the power consumption
and again many people have
noted this right if you look at the
power consumption the power that the
brain consumes like what did I eat today
okay almost nothing um my daughter who's
again she's doing an internship here she
literally yesterday all she ate was a
burrito and yet she wrote 300 lines of
code for her internship project on
really cool computational linguistics
projects so somehow she turned a burrito
into you know a model of child language
acquisition okay but how did she do that
or how do any of us do this right um
we're if you look at the power that we
consume when we simulate even a very
very small chunk of cortex on our
conventional hardware or we do any kind
of machine learning thing we have
systems which are very very very very
far from the power of the human brain
computationally but in terms of physical
energy consumed way way past what any
individual brain is doing so how do we
get circuitry of any sort biological or
just any physical circuit
to be as smart as we are with as little
energy as we are this is this is a huge
problem for basically every area of
engineering right if you want to if you
want to have any kind of robot the power
consumption is a key bottleneck same for
self-driving cars if we want to build AI
without contributing to global warming
and climate change let alone use AI to
solve climate change we really need to
address these issues and the brain is a
is a huge guide there right I think
there are some people who are really
starting to think about this how can we
say for example build somehow brain
inspired computers which are very very
low-power but maybe only approximate so
I'm thinking here of Joe Bates I don't
know if any of you know Joe he's he's
been around MIT and other places for
quite a while can I tell them about your
company so so Joe has a start-up in
Kendall Square called Singular Computing
and they have some very interesting
ideas including some actual implemented
technology for low power approximate
computing in a sort of a brain like way
that might lead to possibly even like
the ability to build something this is
Joe's dream to build a thing that's about the
size of this table but that has a
billion cores a billion cores and runs
on a reasonable kind of power
consumption I would love to have such a
machine if anybody wants to help Joe
build it I think he'd love to talk to
you but that's it's one of a number of
ideas I mean Google X people are working
on similar things probably most of the
major chip companies are also inspired
by this idea and I think even if you
don't didn't think you were interested
in the brain if you want to build the
kind of AI we're talking about and run it
on physical hardware of any sort and
understanding how the brain circuits
compute what they do what what I'm
talking about with as little power as
they do I don't know any better place to
look it seems like a lot of the
improvements in AI have been driven by
increasing like computational power yeah
how far you would you say you mean like GPUs
or CPUs yeah yeah how far would you say
we are from hardware that could run a
general artificial intelligence of the
kind that I'm talking about yeah I don't
know I'll start with a billion cores and
then we'll see I mean I I think we're I
think we're I mean I think I think
there's no way to answer that question
in a way that's software independent I
don't know how to do that right but I
think that
it's and and you know I don't know like
when you say how far are we
you mean how far am I with the resources
I have right now how far am I if
Google decides to put all of its
resources at my disposal like they might
if I were working at DeepMind
I don't know the answer to that question
I but I think the I think what we can
say is this um individual neurons I mean
again this goes back to another reason
to study neural circuits um if you look
at what we currently call neural
networks in the AI side the model of a
neuron is this very very simple thing
right individual neurons are not only
much more complex but have a lot more
computational power it's not clear how
they use it or whether they use it but I
think it's just as likely that a neuron
is something like a ReLU right as
that a neuron is something like a
computer like one neuron in your
brain is more like a CPU node okay maybe
and thus the ten billion or trillion you
know the large number of neurons in your
brain I think it's like 10 billion
cortical pyramidal neurons or something
might be like 10 billion cores okay for
example that's at least as plausible I
think to me as any other estimate so and
I think so I think we're on the
definitely on the underside with very
big error bars so I completely agree
that or if this is what you might be
suggesting and may you know going back
to my answer to your question I don't
think we're gonna get to what I'm
talking about at anything like a real
brain scale without major innovations on
the hardware side and you know it's it's
interesting that what drove those
innovations that support current AI
was mostly not AI it was the video game
industry when I point to the video
game engine in your head that's a
similar thing that was driven by the
video game industry on the software side
I think we should all play as many video
games as we can and contribute to the
growth of the video game industry
because no because I mean I mean you can
see this in very like there are
companies out there for example there's
a company called Improbable which is a
London company London based startup a
pretty sizable start-up at this point
which is building something that they
call SpatialOS which is it's a it's not
a it's not a hardware idea but it's a
kind of software idea for very very big
distributed computing environments to
run much much more complex realistic
simulations of the world for
much more interesting immersive
permanent videogames I think that's one
thing that might hopefully that will
lead to more fun new kinds of games but
that's one example of where we might
look to that industry to drive some of
the you know just computer systems
really hardware and software systems
that will take our game to
the next level
just understanding on the algorithmic
level or cognitive level is just to
understanding the learning the meaning
of learning would be how to predict but
on the circuit level is different but at
the what level on the circuit level well
of course it's different right but
already you I think you made a mistake
there honestly like you said the
cognitive level is learning how to
predict but I'm not sure what you mean
by that there's many things you could
mean and what our cognitive science
is about is learning which of those
versions like I don't think it's
learning how to predict
I think it's learning what you need to
know to plan actions and to a map you
know all those things like it's not just
about predicting it's because there are
things we can imagine that you would
never predict because they'd never happen
unless we somehow make the world
different so generalizations are you're
not predicting okay when your model
could generalize but especially in the
transfer learning that you are
interested in a few hundred of neurons
in prefrontal cortex they have
generalize a lot yes but not kind of a
Bayesian model do that you said but a
Bayesian model won't do that or they don't
do it the way a Bayesian model does for
sure because that's in the abstract
level well I mean how do you really know
like and what does it mean to say that
some neurons do it like so maybe another
way to put this is to say look we have a
certain math that we use to capture
these you could call it abstract I call
it software level abstractions right I
mean all engineering is based on some
kind of abstraction but you might have a
circuit level abstraction a certain kind
of hardware level that you're interested
in describing the brain at and I'm
mostly working at or starting from a
more software level of abstraction right
they're all abstractions we're not
talking about molecules here right we're
talking about some abstract notion of
maybe a circuit or of a program okay
right now it's a really interesting
question if I look at some circuits how
do I know what program they're
implementing right if I look at the
circuits in this machine could I tell
what program they're implementing well
maybe but certainly it would be a lot
easier if I knew something about what
programs they might be
implementing before I start to look at
the circuitry if I just looked at the
circuitry without knowing what a program
was or what programs the thing might be
doing or what kind of programming
components would be mappable to circuits
in different ways right I don't even
know how to begin to answer that
question
so I think you know we've made some
progress at understanding what neurons
are doing in certain low-level parts of
sensory system and certain parts of the
motor system like primary motor cortex
like basically the parts of the neurons
that are closest to the inputs and
outputs of the brain right where we
don't need when you can say we don't need
the kind of software abstractions that
I'm talking about or where we sort of
agree on what those things already are
so we can make enough progress on
knowing what to look for and how to how
to know when we found it but if you want
to talk about flexible planning things
that are more like cognition that you
know go on in prefrontal cortex right at
this point I don't I don't think that
just by recording from those neurons
we're gonna be able to answer those
questions in a meaningful engineering
way a way that that any engineer
software or hardware whatever could
really say yeah okay I get it I get
those insights in a way that I can
engineer with and that's what my goal is
right so my goal that's my goal to do at
the software level the hardware level or
the entire systems level connecting them
and I think that you know we can do that
by taking what we're doing and bringing it
into contact with people studying neural
circuits but I don't think you can you
can leave this level out and just go
straight to the neural circuits and I
think the more you have the more
progress we make the more we can help
people who are studying at the neural
circuit level and they can help us
address these other engineering
questions that we don't really have
access to like the power issue or the
speed issue thank you okay thanks that
was great let's all give
Josh a hand
[Applause]