Transcript

0jspaMLxBig • Andrew Ng: Deep Learning, Education, and Real-World AI | Lex Fridman Podcast #73
Kind: captions
Language: en
The following is a conversation with Andrew Ng, one of the most impactful educators, researchers, innovators, and leaders in artificial intelligence and the technology space in general. He co-founded Coursera and Google Brain, launched deeplearning.ai, Landing AI, and the AI Fund, and was the chief scientist at Baidu. As a Stanford professor, and with Coursera and deeplearning.ai, he has helped educate and inspire millions of students, including me.
This is the Artificial Intelligence Podcast. If you enjoy it, subscribe on YouTube, give it five stars on Apple Podcasts, support it on Patreon, or simply connect with me on Twitter at lexfridman, spelled F-R-I-D-M-A-N. As usual, I'll do one or two minutes of ads now and never any ads in the middle that can break the flow of the conversation. I hope that works for you and doesn't hurt the listening experience.
This show is presented by Cash App, the number one finance app in the App Store. When you get it, use code LexPodcast. Cash App lets you send money to friends, buy Bitcoin, and invest in the stock market with as little as one dollar. Brokerage services are provided by Cash App Investing, a subsidiary of Square and member SIPC. Since Cash App allows you to buy Bitcoin, let me mention that cryptocurrency in the context of the history of money is fascinating. I recommend The Ascent of Money as a great book on this history. Debits and credits on ledgers started over 30,000 years ago, the US dollar was created over 200 years ago, and Bitcoin, the first decentralized cryptocurrency, was released just over ten years ago. So given that history, cryptocurrency is still very much in its early days of development, but it's still aiming to, and just might, redefine the nature of money.
So again, if you get Cash App from the App Store or Google Play and use the code LexPodcast, you'll get $10, and Cash App will also donate $10 to FIRST, one of my favorite organizations that is helping to advance robotics and STEM education for young people around the world.
And now, here's my conversation with Andrew Ng.
The courses you taught on machine learning at Stanford, and later on Coursera, which you co-founded, have educated and inspired millions of people. So let me ask you: what people or ideas inspired you to get into computer science and machine learning when you were young? When did you first fall in love with the field, is another way to put it.
Growing up in Hong Kong and Singapore, I started learning to code when I was five or six years old. At that time I was learning the BASIC programming language, and they would take these books and, you know, they'd tell you, type this program into your computer. So I typed that program into my computer, and as a result of all that typing I would get to play these very simple shoot-'em-up games that, you know, I had implemented on my own old computer. So I thought it was fascinating as a young kid that I could write this code, which was really just copying code from a book into my computer, to then play these cool little video games. Another moment for me was when I was a teenager and my father, because he's a doctor, was reading about expert systems and about neural networks. So he got me to read some of these books, and I thought it was really cool that you could write a computer program that started to exhibit intelligence. Then I remember doing an internship while I was in high school, this was in Singapore, where I remember doing a lot of photocopying. I was an office assistant, and the highlight of my job was when I got to use the shredder.
So the teenager me remembers thinking, boy, this is a lot of photocopying. If only we could write software, build a robot, something to automate this, maybe I could do something else. So I think a lot of my work since then has centered on the theme of automation. Even the way I think about machine learning today: we're very good at writing learning algorithms that can automate things that people can do. Or even launching the first MOOCs, massive open online courses, that later led to Coursera: I was trying to automate what could be automatable in how I was teaching on campus.
The process of education, trying to automate parts of that, to have more impact from a single teacher, a single educator.
Yeah, I felt, you know, teaching at Stanford, teaching machine learning to about 400 students a year at the time, I found myself filming the exact same video every year, telling the same jokes in the same room, and I thought, why am I doing this? Why don't we just take last year's video, and then I can spend my time building a deeper relationship with students. So that process of thinking through how to do that, that led to the first MOOCs that we launched.
And then you have more time to write new jokes. Are there favorite memories from your early days at Stanford, teaching thousands of people in person and then millions of people online?
You know, teaching online, what not many people know is that a lot of those videos were shot between the hours of 10:00 p.m. and 3:00 a.m. A lot of times, when we were launching the first MOOCs at Stanford, we had already announced the course, about a hundred thousand people had signed up, we had just started to write the code, and we had not yet actually filmed the videos. So, you know, a lot of pressure, a hundred thousand people waiting for us to produce the content. So many Fridays and Saturdays I would go out, have dinner with my friends, and then I would think, okay, do I want to go home now, or do I want to go to the office to film videos? And the thought of, you know, being able to help a hundred thousand people potentially learn machine learning, fortunately, that made me think, okay, I'm going to go to my office, go to my tiny little recording studio. I would adjust my Logitech webcam, adjust my, you know, Wacom tablet, make sure my lapel mic was on, and then I would start recording, often until 2:00 a.m. or 3:00 a.m. I think, fortunately, it doesn't show that it was recorded that late at night, but it was really inspiring, the thought that we could create content to help so many people learn about machine learning.
How does that feel, the fact that you're probably somewhat alone, maybe with a couple of friends, recording with a Logitech webcam and kind of going home alone at 1:00 or 2:00 a.m. at night, and knowing that that's going to reach sort of thousands of people, eventually millions of people? What's that feeling like? I mean, is there a feeling of just satisfaction, of pushing through?
I think it's humbling, and I wasn't thinking about what I was feeling. I think one thing that I'm proud to say we got right from the early days was that I told my whole team back then that the number one priority is to do what's best for learners, do what's best for students. And so when I went into the recording studio, the only thing on my mind was, what can I say, how can I design my slides, what do I need to draw, to make these concepts as clear as possible for learners.
I think, you know, I've seen that sometimes for instructors it's tempting to say, hey, let's talk about my work, maybe if I teach you about my research someone will cite my papers a couple more times. And I think one of the things we got right, launching the first few MOOCs and later building Coursera, was putting in place that bedrock principle of, let's just do what's best for learners and forget about everything else. And I think that as a guiding principle turned out to be really important to the rise of the MOOC movement.
And the kind of learner you imagined in your mind is as broad as possible, as global as possible, so really trying to reach as many people interested in machine learning and AI as possible.
I really want to help anyone that had an interest in machine learning to break into the field. And I think sometimes people ask me, hey, why are you spending so much time explaining gradient descent? And my answer was, if I look at what I think the learner needs and would benefit from, I felt that having a good understanding of the foundations, coming back to the basics, would put them in better stead to then build a long-term career.
So you've tried to consistently make decisions on that principle. So one of the things you actually revealed to the narrow AI community at the time, and to the world, is that the number of people who are actually interested in AI is much larger than we imagined. By you teaching the class, and how popular it became, it showed that, wow, this isn't just a small community of sort of people who go to NeurIPS; it's much bigger. It's developers, it's people from all over the world. I mean, I'm Russian, so everybody in Russia is really interested. There's a huge number of programmers who are interested in machine learning: India, China, South America, everywhere. There's just millions of people who are interested in machine learning. So how big do you get a sense that this number of people who are interested is, from your perspective?
I think the number has grown over time. I think it's one of those things that maybe feels like it came out of nowhere, but as an insider building it, it took years. It's one of those overnight successes that took years to get there.
My first foray into this type of education was when we were filming my Stanford class and sticking the videos on YouTube, and then some other things, uploading the homeworks and so on, but, you know, basically the one hour fifteen minute video that we put on YouTube. And then we had four or five other versions of websites that I had built, most of which you would never have heard of because they reached small audiences, but that allowed me to iterate, allowed my team and me to iterate, to learn what the ideas are that work and what doesn't. For example, one of the features I was really excited about and really proud of was building this website where multiple people could be logged into the website at the same time. So today, if you go to a website, you know, if you're logged in and then I want to log in, you need to log out, if it's the same browser, the same computer. But I thought, well, what if two people, say you and me, were watching a video together in front of the computer? What if a website could have you type your name and password, have me type my name and password, and then now the computer knows both of us are watching together, and it gives both of us credit for anything we do as a group. We implemented this feature and rolled it out in a high school in San Francisco. We had about 20-something users, where the teacher there, at Sacred Heart Cathedral Prep, the teacher's great. And guess what, zero people used this feature. It turns out people studying online want to watch the videos by themselves, so you can play back and pause at your own speed rather than in groups. So that was one example of a tiny lesson learned, out of many, that allowed us to hone in on the set of features.
And it sounds like a brilliant feature. So I guess the lesson to take from that is there's something that looks amazing on paper, and then nobody uses it; it doesn't actually have the impact that you think it might have. And so, yeah, I saw that you really went through a lot of different features and a lot of ideas to arrive at the final, at Coursera, the final kind of powerful thing that showed the world that MOOCs can educate millions.
And I think with the whole machine learning movement as well, I think it didn't come out of nowhere. Instead, what happened was, as more people learned about machine learning, they would tell their friends, and their friends would see how it's applicable to their work, and then the community kept on growing. And I think we're still growing. You know, I don't know in the future what percentage of all developers will be AI developers. I could easily see it being north of 50 percent, right? Because so many AI developers, broadly construed, not just people doing the machine learning modeling, but the people building the infrastructure, data pipelines, you know, all the software surrounding the core machine learning model, maybe it's even bigger.
I feel like today almost every software engineer has some understanding of the cloud. Not all, you know, maybe a microcontroller developer doesn't need to deal with the cloud, but I feel like the vast majority of software engineers today sort of have an appreciation of the cloud. I think in the future maybe we'll approach nearly a hundred percent of all developers being, you know, in some way an AI developer, or at least having an appreciation of machine learning.
And my hope is that there's this kind of effect that there's people who are not really interested in being a programmer or being into software engineering, like biologists, chemists, and physicists, even mechanical engineers, and all these disciplines that are now more and more sitting on large data sets. And here they didn't think they were interested in programming until they have this data set and they realize there's this set of machine learning tools that allow you to use the data set, so they actually learn to program and they become new programmers. So it's not just, as you've mentioned, that a larger percentage of developers become machine learning people; it seems like more and more the kinds of people who are becoming developers is also growing significantly.
Yeah, I think once upon a time only a small part of humanity was literate, could read and write, and maybe you thought, maybe not everyone needs to learn to read and write; you know, you just go listen to a few monks read to you, and maybe that was enough. Or maybe we just need a handful of authors to write the bestsellers, and then no one else needs to write. But what we found was that by giving as many people, you know, in some countries almost everyone, basic literacy, it dramatically enhanced human-to-human communications, and we can now write for an audience of one, such as if I send you an email or you send me an email. I think in computing we're still in that phase where so few people know how to code that the coders mostly have to code for relatively large audiences. But if everyone, well, most people, became developers at some level, similar to how most people in developed economies are somewhat literate, I would love to see the owners of a mom-and-pop store be able to write a little bit of code to customize the TV display for their special this week. And I think it would enhance human-to-computer communications, which is becoming more and more important today as well.
So you think it's possible that machine learning becomes kind of similar to literacy, where, like you said, the owners of a mom-and-pop shop, basically everybody in all walks of life, would have some degree of programming capability?
I could see society getting there. There's one other interesting thing. You know, if I go talk to the mom-and-pop store, if I talk to a lot of people in their daily professions, I previously didn't have a good story for why they should learn to code. Yeah, we could give them some reasons, but what I found with the rise of machine learning and data science is that I think the number of people with a concrete use for data science in their daily lives, in their jobs, may be even larger than the number of people with a concrete use for software engineering. For example, if you run a small mom-and-pop store, I think if you can analyze the data about your sales, your customers, I think there's actually real value there, maybe even more than traditional software engineering.
So I find that for a lot of my friends in various professions, be it recruiters or accountants or, you know, people that work in factories, which I deal with more and more these days, I feel if they were data scientists at some level, they could immediately use that in their work. So I think that data science and machine learning may be an even easier entrée into the developer world for a lot of people than software engineering.
That's interesting, and I agree with that; that's beautifully put. We live in a world where most courses and talks have slides, PowerPoint, Keynote, and yet you famously often still use a marker and a whiteboard. The simplicity of that is compelling, and for me at least fun to watch. So let me ask, why do you like using a marker and whiteboard, even on the biggest of stages?
I think it depends on the concepts you want to explain. For mathematical concepts, it's nice to build up the equation one piece at a time, and the whiteboard marker, or the pen and stylus, is a very easy way, you know, to build up the equation, to build up a complex concept one piece at a time while you're talking about it, and sometimes that enhances understandability. The downside of writing is that it's slow, and so if you want a long sentence it's very hard to write that. So I think there are pros and cons, and sometimes I use slides and sometimes I use a whiteboard or a stylus.
The slowness of a whiteboard is also its upside, in that it forces you to reduce everything to the basics. Some of your talks involve the whiteboard, I mean, you go very slowly and you really focus on the most simple principles, and that's beautiful. That enforces a kind of minimalism of ideas that I think is, surprisingly, at least for me, great for education. Like, a great talk, I think, is not one that has a lot of content; a great talk is one that just clearly says a few simple ideas, and I think the whiteboard somehow enforces that. Pieter Abbeel, who's now one of the top roboticists and reinforcement learning experts in the world, was your first PhD student. So I bring him up just because I kind of imagine this must have been an interesting time in your life. Do you have any favorite memories of working with Pieter, your first student, in those uncertain times, especially before deep learning really sort of blew up? Any favorite memories from those times?
You know, I was really fortunate to have had Pieter Abbeel as my first PhD student, and I think even my long-term professional success builds on early foundations, or early work, that Pieter was so critical to. So I was really grateful to him for working with me. You know, what not a lot of people know is just how hard research was, and so Pieter's PhD thesis was using reinforcement learning to fly helicopters. And so, you know, actually even today, the website heli.stanford.edu is still up; you can watch videos of us using reinforcement learning to make the helicopter fly upside down, flips, loops, rolls. So it's cool.
It's one of the most incredible robotics videos ever, so you should watch it.
Oh yeah, thank you.
It's inspiring. That's from like 2008 or seven or six, like that range?
Something like that, yeah, so it's over ten years old.
That was really inspiring to a lot of people, yeah.
What not many people see is how hard it was. So Pieter and Adam Coates and Morgan Quigley and I were working on various versions of the helicopter, and a lot of things did not work. For example, it turns out one of the hardest problems we had was, when the helicopter's flying around upside down doing stunts, how do you figure out the position? How do you localize the helicopter? So we wanted to try all sorts of things. Having one GPS unit doesn't work, because when you're flying upside down the GPS unit is facing down, so it can't see the satellites. So we experimented with trying to have two GPS units, one facing up and one facing down, so that one faces up if you flip over. That didn't work, because the downward-facing one couldn't synchronize if you're flipping quickly. Morgan Quigley was exploring this crazy, complicated configuration of specialized hardware to interpret GPS signals, looking into FPGAs, completely insane. He spent about a year working on that; it didn't work. So I remember Pieter, great guy, him and me, you know, sitting down in my office looking at some of the latest things we had tried that didn't work, and saying, you know, darn it, like, what now? Because we tried so many things and it just didn't work. In the end, what we did, and Adam Coates was crucial to this, was put cameras on the ground and use cameras on the ground to localize the helicopter, and that solved the localization problem so that we could then focus on the reinforcement learning and inverse reinforcement learning techniques to then actually make the helicopter fly. And, you know, I'm reminded, when I was doing this work at Stanford, around that time there were a lot of reinforcement learning theoretical papers but not a lot of practical applications. So the autonomous helicopter work for flying helicopters was one of the few, you know, practical applications of reinforcement learning at the time, which caused it to become pretty well known. I feel like we might have almost come full circle. Today there's so much buzz, so much hype, so much excitement about reinforcement learning, but again we're hunting for more applications of all of these great ideas that the community has come up with.
What was the drive, sort of in the face of the fact that most people were doing theoretical work? What motivated you, in the uncertainty and the challenges, to get the helicopter, sort of, to do the applied work, to get the actual system to work, in the face of fear, uncertainty, sort of the setbacks that you mentioned for localization?
I like stuff that works.
In the physical world. So, like, it's back to the shredder.
And, you know, I like theory, but when I work on theory myself, and this is personal taste, I'm not saying anyone else should do what I do, but when I work on theory I personally enjoy it more if I feel that the work I do will influence people, have positive impact, will help someone.
I remember, many years ago, I was speaking with a mathematics professor, and I kind of just said, hey, why do you do what you do? And, you know, he had stars in his eyes when he answered, and this mathematician, not from Stanford, a different university, he said, I do what I do because it helps me to discover truth and beauty in the universe. He had stars in his eyes when he said it. And I thought, that's great.
I don't want to do that. I think it's great that someone does that, I'm fully supportive of people that do it and have a lot of respect for people that do, but I am more motivated when I can see a line to how the work that my teams and I are doing helps people. The world needs all sorts of people; I'm just one type. I don't think everyone should do things the same way as I do. But when I delve into either theory or practice, if I personally have conviction, you know, that here's a pathway to help people, I find that more satisfying, to have that conviction.
That that's your path. You were a proponent of deep learning before it gained widespread acceptance. What did you see in this field that gave you confidence? What was your thinking process like in that first decade of the, I don't know what that's called, the 2000s, the aughts?
Yeah, I can tell you the thing we got wrong and the thing we got right. The thing we really got wrong was the importance of, the early importance of unsupervised learning. So in the early days of Google Brain we put a lot of effort into unsupervised learning rather than supervised learning. And there was this argument, I think it was around 2005, after NeurIPS, at that time called NIPS but now NeurIPS, had ended, and Geoff Hinton and I were sitting in the cafeteria outside, you know, the conference. We had lunch, we were just chatting, and Geoff pulled up this napkin and started sketching this argument on a napkin.
It was very compelling; I'll repeat it. The human brain has about a hundred trillion, so that's 10 to the 14, synaptic connections. You will live for about 10 to the 9 seconds. That's 30 years; you actually live for two by 10 to the 9, maybe three by 10 to the 9 seconds, so just, let's say 10 to the 9. So if each synaptic connection, each weight in your brain's neural network, has just a one-bit parameter, that's 10 to the 14 bits you need to learn in up to 10 to the 9 seconds of your life. So via this simple argument, which has a lot of problems, it's very simplified, that's 10 to the 5 bits per second you need to learn in your life. And I have a one-year-old daughter; I am not pointing out 10 to the 5 bits per second of labels to her. And I think I'm a very loving parent, but I'm just not gonna do that.
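The arithmetic in the napkin argument can be checked with a few lines of Python. This is only a back-of-the-envelope sketch using the rounded figures quoted in the conversation; the constant and function names are illustrative assumptions, not anything from the original argument.

```python
# Rounded figures quoted in the napkin argument.
SYNAPSES = 10**14          # about a hundred trillion synaptic connections
BITS_PER_SYNAPSE = 1       # simplifying assumption: one bit per connection
LIFETIME_SECONDS = 10**9   # a lifetime rounded down to a power of ten


def required_bits_per_second(synapses=SYNAPSES,
                             bits_per_synapse=BITS_PER_SYNAPSE,
                             seconds=LIFETIME_SECONDS):
    """Bits that must be learned per second to set every connection in a lifetime."""
    return synapses * bits_per_synapse / seconds


rate = required_bits_per_second()
years = LIFETIME_SECONDS / (365.25 * 24 * 3600)

print(f"{rate:.0e} bits per second")            # 1e+05 bits per second
print(f"10^9 seconds is about {years:.1f} years")  # about 31.7 years
```

Even with a lifetime of three by 10 to the 9 seconds, the rate only drops by a factor of three, so the order of magnitude of the argument is unchanged.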
So from this, you know, very crude, definitely problematic argument, there's just no way that most of what we know is through supervised learning. The way you get so many bits of information is from sucking in images, audio, just experiences in the world. And so that argument, and there are a lot of known flaws with this argument, you know, really convinced me that there's a lot of power to unsupervised learning. So that was the part that we actually maybe got wrong. I still think unsupervised learning is really important, but in the early days, you know, 10, 15 years ago, a lot of us thought that was the path forward.
Oh, so you're saying that perhaps was the wrong intuition for the time.
For the time. That was the part we got wrong. The part we got right was the importance of scale. So Adam Coates, another wonderful person, I was fortunate to have worked with him, he was in my group at Stanford at the time, and Adam had run these experiments at Stanford showing that the bigger we train a learning algorithm, the better its performance. And it was based on that; there was a graph that Adam generated, you know, where the x-axis, y-axis, lines going up and to the right, so the bigger you make this thing, the better its performance; accuracy is the vertical axis. So it was really that chart that Adam generated that gave me the conviction that if we could scale these models way bigger than what we could on the few CPUs which we had at Stanford, then we could get even better results. And it was really that one figure that Adam generated that gave me the conviction to go with Sebastian Thrun to pitch, you know, starting a project at Google, which became the Google Brain project.
The Brain, you co-founded Google Brain. And there the intuition was scale will bring performance for the system, so we should chase larger and larger scale. And I think people don't realize how groundbreaking it is; it's simple, but it's a groundbreaking idea that bigger data sets will result in better performance.
It was controversial at the time. Some of my well-meaning friends, you know, senior people in the machine learning community, I won't name, but people, some of whom we know, my well-meaning friends came and were trying to give me friendly advice, like, hey Andrew, why are you doing this? This is crazy. It's in the neural network architecture; look at these architectures we're building. You just want to go for scale? Like, this is a bad career move. So my well-meaning friends, you know, some of them were trying to talk me out of it. But I find that if you want to make a breakthrough, you sometimes have to have conviction and do something before it's popular, since that lets you have a bigger impact.
Let me ask you just a small tangent on that topic. I find myself arguing with people, saying that greater scale, especially in the context of active learning, so very carefully selecting the data set, but growing the scale of the data set is going to lead to even further breakthroughs in deep learning. And there's currently pushback at that idea, that larger data sets are no longer the answer, so you want to increase the efficiency of learning, you want to make better learning mechanisms. And I personally believe that bigger data sets will still, with the same learning methods we have now, result in better performance. What's your intuition at this time on this dual side? Do we need to come up with better architectures for learning, or can we just get bigger, better data sets that will improve performance?
I think both are important, and it's also problem dependent. So for a few data sets, we may be approaching, you know, Bayes error rate, or approaching or surpassing human-level performance, and then there's that theoretical ceiling that we will never surpass, the Bayes error rate. But then I think there are plenty of problems where we're still quite far from either human-level performance or from Bayes error rate, and bigger data sets with neural networks, without further algorithmic innovation, will be sufficient to take us further. But on the flip side, if we look at the recent breakthroughs using, you know, transformer networks for language models, it was a combination of novel architecture, but also scale had a lot to do with it. If we look at what happened with GPT-2 and BERT, I think scale was a large part of the story.
Yeah, that's not often talked about, the scale of the data set it was trained on and the quality of the data set, because there's some, so it was like Reddit threads that were upvoted highly. So there's already some weak supervision on a very large data set that people don't often talk about, right?
I find that today we have maturing processes for managing code, things like Git, right, version control. It took us a long time to evolve the good processes. I remember when my friends and I were emailing each other C++ files in email, you know, but then we had, what was it, CVS, Subversion, Git, maybe something else in the future. We're very immature in terms of tools for managing data and thinking about how to clean data and how to solve very hot, messy data problems. I think there's a lot of innovation there to be had still.
I love the idea that you were versioning through email.
I'll give you one example. When we work with manufacturing companies, it's not at all uncommon for there to be multiple labelers that disagree with each other, right? And so when we were doing the work in visual inspection, we will, you know, take, say, a plastic part and show it to one inspector, and the inspector, sometimes very opinionated, they'll go, clearly that's a defect, this scratch is unacceptable, gotta reject this part. Take the same part to a different inspector, different, very opinionated: clearly the scratch is small, it's fine, don't throw it away, you're gonna make us lose. And then sometimes you take the same plastic part, show it to the same inspector in the afternoon as opposed to in the morning, and, very opinionated, in the morning they say clearly it's okay, and in the afternoon, equally confident, clearly this is a defect. And so what is the AI team supposed to do if sometimes even one person doesn't agree with himself or herself in the span of a day? So I think these are the types of very practical, very messy data problems that, you know, my teams wrestle with. In the case of large consumer Internet companies, where you have a billion users, you have a lot of data, you don't worry about it, just take the average, it kind of works. But in the case of other industry settings, we don't have big data. It's just small data, very small data sets, maybe 100 defective parts, or 100 examples of a defect. If you have only 100 examples, these little labeling errors, you know, if 10 of your hundred labels are wrong, that actually is 10% of your data set, and that has a big impact. So how do you clean this up? What are you supposed to do? This is an example of the types of things that my teams, this is a Landing AI example, are wrestling with, to deal with small data, which comes up all the time once you're outside consumer internet.
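To make the small-data labeling problem concrete, here is a minimal Python simulation. It is a sketch under stated assumptions, not a description of what any team in the conversation actually does: the 10% error rate comes from the example above, while the function names, the two-class "ok"/"defect" setup, and the use of majority voting across several inspectors are all illustrative choices (majority voting is one common consensus technique, swapped in here for illustration).

```python
import random
from collections import Counter


def noisy_label(true_label, error_rate, rng):
    """Return the true label, flipped with probability error_rate (hypothetical inspector)."""
    if rng.random() < error_rate:
        return "ok" if true_label == "defect" else "defect"
    return true_label


def majority_vote(labels):
    """Most common label among the votes (use an odd number of voters to avoid ties)."""
    return Counter(labels).most_common(1)[0][0]


def label_error_fraction(n_parts, n_inspectors, error_rate, seed=0):
    """Fraction of parts whose consensus label disagrees with the ground truth."""
    rng = random.Random(seed)
    wrong = 0
    for _ in range(n_parts):
        truth = rng.choice(["ok", "defect"])
        votes = [noisy_label(truth, error_rate, rng) for _ in range(n_inspectors)]
        if majority_vote(votes) != truth:
            wrong += 1
    return wrong / n_parts


# With only 100 parts, each wrong label is a full percentage point of the data set.
single = label_error_fraction(n_parts=100, n_inspectors=1, error_rate=0.10)
triple = label_error_fraction(n_parts=100, n_inspectors=3, error_rate=0.10)
print(f"one inspector: {single:.0%} wrong, three inspectors: {triple:.0%} wrong")
```

With independent errors of 10% each, three inspectors are all-but-one wrong together roughly 2.8% of the time, which is why consensus labeling tends to help; it does not help when inspectors share the same bias, which is closer to the disagreement described above.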
Yeah, that's fascinating. So then you invest more effort and time in thinking about the actual labeling process: what are the labels, how are disagreements resolved, and all those kinds of, like, pragmatic, real-world problems. That's a fascinating space.
Yeah, I find that actually when I'm teaching at Stanford, I increasingly encourage students at Stanford to try to find their own project for the end-of-term project, rather than just downloading someone else's nicely clean data set. It's actually much harder if you need to go and define your own problem and find your own data set, rather than go to one of the several good websites, very good websites, with clean, scoped data sets that you could just work on.
you're now running three efforts the AI
fund landing AI and deep learning AI as
you've said the AI fund is involved in
creating new companies from scratch
Landing AI is involved in helping
already established companies do AI and
deep learning AI is for education of
everyone else or of individuals
interested in getting into the field and
excelling in it
so let's perhaps talk about each of
these areas first deep learning AI
the basic question how does a person
interested in deep learning get started
in the field deep learning AI is working
to create courses to help people break
into AI so my machine learning course
that I taught through Stanford is one of
the most popular courses on Coursera to
this day it's probably one of the
courses sort of if I ask somebody how
did you get into machine learning or how
did you fall in love with machine
learning or what got you interested
it always goes back to Andrew Ng at
some point you know the amount of
people you've influenced is ridiculous so
for that I'm sure I speak for a lot of
people to say a big thank you no yeah thank
you you know I was once reading a news
article I think it was tech review and
I'm gonna mess up the statistic but I
remember reading an article that said um
something like one-third of all
programmers are self-taught I may have
the number one-third wrong maybe it was
two-thirds but when I read that article I
thought this doesn't make sense
everyone is self-taught because you
teach yourself I don't teach people and
it's no good haha oh yeah so how does
one get started in deep learning and
where does deep learning AI fit
into that
so the deep learning specialization
offered by deep learning AI is I think it
was Coursera's top specialization it might
still be so it's a very popular way for
people to take that specialization to
learn about everything from neural
networks to how to tune a neural network
to what is a convnet to what is an RNN or
sequence model or what is an attention
model and so the deep learning
specialization steps everyone through
those algorithms so you deeply understand
them and can implement them and use them
you know for whatever application from
the very beginning
so what would you say the prerequisites
for somebody to take the deep learning
specialization in terms of maybe math or
programming background you know you need
to understand basic programming since
there are programming exercises in Python
and the math prereq is quite basic
so no calculus is needed if you know
calculus that's great you get better
intuitions but we deliberately try to
teach that specialization without
requiring calculus so I think high school
math would be sufficient if you know how
to multiply two matrices I think
that's great
so a little basic linear algebra is
great
basic linear algebra even very very
basic linear algebra and some
programming I think that people that have
done the machine learning course will
find the deep learning specialization a
bit easier but it's also possible to jump
into the deep learning specialization
directly but it'll be a little bit harder
since we tend to you know go over faster
concepts
like how does gradient descent work and
what is an objective function which
which is covered mostly in the machine
learning course could you briefly
mention some of the key concepts in deep
learning that students should learn that
you envision them learning in the first
few months in the first year or so so if
you take the deep learning specialization
you learn the foundations of what is a
neural network how do you build up a
neural network from a you know single
logistic unit to a stack of layers to
different activation functions you learn
how to train these neural networks one
thing I'm very proud of in that
specialization is we go through a lot of
practical know-how of how to actually make
these things work so what are the
differences between different optimization
algorithms so what do you do if the
algorithm overfits so how do you tell if
the algorithm is overfitting when do you
collect more data when should you not
bother to collect more data
I find that um even today unfortunately
there are engineers that will spend
six months trying to pursue a particular
direction such as collect more data
because we heard more data is valuable
but sometimes you could run some tests
and could have figured out six months
earlier that for this problem
collecting more data isn't going to cut
it so just don't spend six months
collecting more data spend your time
modifying the architecture or trying
something else so we go through a lot of
the practical know-how so that
when you take the
deep learning specialization you have
those skills to be very efficient in how
you build these networks
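One common way to run the kind of quick test mentioned above is to compare training error and held-out (dev) error against a target error level. The sketch below assumes that heuristic; the function name, the tolerance, and the suggested actions are illustrative assumptions, not the specific tests meant in the conversation.

```python
def suggest_next_step(train_error, dev_error, target_error, tolerance=0.02):
    """Suggest where to spend effort, based on a simple error breakdown.

    All errors are fractions between 0 and 1. `target_error` is the error
    level you would be satisfied with (for example, an estimate of human
    performance). `tolerance` is an assumed threshold for "large enough
    to matter".
    """
    # Gap between how well we fit the training set and how well we want to do.
    avoidable_bias = train_error - target_error
    # Gap between training performance and performance on unseen data.
    variance = dev_error - train_error

    if avoidable_bias > tolerance and avoidable_bias >= variance:
        # The model does not even fit the training data well, so more
        # training data is unlikely to be the fix.
        return "high bias: change the architecture or train longer"
    if variance > tolerance:
        # The model fits the training data but does not generalize.
        return "high variance: collect more data or add regularization"
    return "ok: errors are close to target"


# Hypothetical numbers: 15% training error against a 1% target.
advice = suggest_next_step(train_error=0.15, dev_error=0.16, target_error=0.01)
```

With these made-up numbers the training error is already far from the target, so months of data collection would probably not help; a check like this takes minutes to run.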
so dive right in to play with the
network to train it to do the inference
on a particular data set to build an
intuition about it without without
building it up too big to where you
spend like you said six months learning
building up your big project without
building an intuition of a small small
aspect of the data that could already
tell you everything you need to know about
that data yes and also the systematic
frameworks of thinking for how to go
about building practical machine
learning maybe to make an analogy um
when we learn to code we have to learn
the syntax of some programming language
right be it Python or C++ or Octave or
whatever but the equally important or
maybe even more important part of coding
is to
understand how to string together these
lines of code into coherent things so
you know when should you put something
in the function call and when should you
not you know how do you think about
abstraction so those frameworks are what
makes a programmer efficient even more
than understanding the syntax I remember
when I was an undergrad at Carnegie
Mellon um one of my friends would debug
their code by first trying to compile
it and then it was C++ code and
then every line that had a syntax error
they wanted to get rid of the syntax
errors as quickly as possible so how do
you do that well they would delete every
single line of code with a syntax error so
really efficient for getting rid of syntax
errors but a horrible debugging process I
think so we learn how to debug and I
think in machine learning the way you
debug the machine learning program is
very different than the way you you know
like do binary search or whatever or use
the debugger and trace through the code
in traditional software engineering
so it's an evolving discipline but I find
that the people that are really good at
debugging machine learning algorithms
are easily 10x
maybe 100x faster at getting something
to work so and the basic process of
debugging is so the bug in this case is
why isn't this thing learning
improving sort of going into the
questions of overfitting and all those
kinds of things that's the
logical space that the debugging is
happening in with neural networks
yeah often the question is why doesn't
it work yet or can I expect it to
eventually work and what are the things
I could try
change the architecture more data more
regularization different optimization
algorithm you know different types
of data so to answer those questions
systematically so that
you don't spend six months
heading down a blind alley before
someone comes and says why did you spend
six months doing this what concepts in deep
learning do you think students struggle
the most with or sort of what is the
biggest challenge for them where once
they get over that hill
it hooks them and it inspires them
and they really get it similar to
learning mathematics I think one of the
challenges of deep learning is that
there are a lot of concepts that build on
top of each other if you ask me what's
hard about mathematics I have a hard
time pinpointing one thing is it
addition subtraction is it carrying is it
multiplication there's a lot of stuff
I think one of the challenges of
learning math and of learning certain
technical fields is that there are a lot
of concepts and if you miss a concept then
you're kind of missing the prerequisite
for something that comes later so in the
deep learning specialization we try to
break down the concepts to maximize the
odds of each component being
understandable so when you move on to
the more advanced thing when we learn
convnets hopefully you have enough
intuitions from the earlier sections to
then understand why we structure
convnets in a certain way and
then eventually why we build you know
RNNs and LSTMs or attention
models in a certain way building on top
of the earlier concepts I'm curious you
you you do a lot of teaching as well do
you have a do you have a favorite this
is the hard concept moment in your
teaching
well I don't think anyone's ever turned
the interview on me I think that's a
really good question yeah it's it's it's
really hard to capture the moment when
they struggle I think you put a really
eloquently I do think there's moments
that are like aha moments that really
inspire people I think for some reason
reinforcement learning especially deep
reinforcement learning is a really great
way to really inspire people and get
across what the use of neural networks can
do even though neural networks really are
just a part of the deep RL framework but
it's a really nice way to paint
the entirety of the picture of a neural
network being able to learn from scratch
knowing nothing and explore the world
and pick up lessons I find that a lot of
the aha moments happen when you use deep
RL to teach people about neural networks
which is counterintuitive I find like a
lot of the inspired sort of fire in
people's passion people's eyes comes
from the RL world do you find
reinforcement learning to be a useful
part of the teaching process or not I
still teach reinforcement learning in one
of my Stanford classes and my PhD thesis
was on reinforcement learning so I
clearly love the field I find that if
I'm trying to teach students the most
useful techniques for them to use today
I end up shrinking the amount of time
I talk about reinforcement learning
it's not what's working today
now our world changes so fast maybe this
will be totally different in a couple of
years I think we need a couple more
things for reinforcement learning to get
there if you get there yeah one of my
teams is looking into reinforcement
learning for some robotic control tasks
so I see the applications but if you
look at it as a percentage of all of the
impact of you know the types of things
we do it's at least today outside of you
know playing video games right and a few
other games the scope is small at NeurIPS
a bunch of us were standing around saying
hey what's your best example of an
actual deployed reinforcement learning
application and you know among like
senior machine learning researchers
right and again there are some emerging
ones but
there are there are not that many great
examples well I think you're absolutely
right the sad thing is there hasn't been
a big impactful real-world
application of reinforcement learning I
think its biggest impact to me has been
in the toy domain in the game domain in
a small example that's what I mean for
educational purposes it seems to be a fun
thing to explore neural networks with but I
think from your perspective and I think
that might be the best perspective is if
you're trying to educate with a simple
example in order to illustrate how this
can actually be grown to scale and have
a real world impact then perhaps
focusing on the fundamentals of
supervised learning in the context of
you know a simple data set even like an
MNIST data set is the right way is
the right path to take I just the amount
of fun I've seen people have with
reinforcement learning has been great
but not in the applied impact on the
real-world setting so it's a it's a
trade-off how much impact you want to
have versus how much fun you want to
have yeah that's really cool and I feel
like you know the world actually needs
all sorts
even within machine learning I feel like
deep learning is so exciting but the AI
team shouldn't just use deep learning
I find that my teams use a portfolio of
tools and maybe that's not the exciting
thing to say but some days we use a
neural net some days we use you know
PCA the other day I was sitting down with
my team looking at PCA residuals trying
to figure out what's going on with PCA
applied to a manufacturing problem and
sometimes we use a probabilistic graphical
model sometimes we use a knowledge
graph which is one of the things that has
tremendous industry impact but the
amount of chatter about knowledge graphs
in academia is really thin compared to the
actual real-world impact so I think
reinforcement learning should be in that
portfolio and then it's about balancing
how much we teach all of these things
and the world should have
diverse skills it'd be sad if you know
everyone just learned one narrow thing
yeah the diverse skills help you discover
the right tool for the job what is the
most beautiful surprising or inspiring
idea in deep learning to you something
that captivated your imagination
is it the scale that could be the
performance that could be achieved with
scale or are there other ideas I think
that if my only job was being an academic
researcher and I had an unlimited budget
and you know didn't have to worry about
short-term impact and only focus on long
term impact I'd probably spend all my time
doing research on unsupervised learning
I still think unsupervised learning is a
beautiful idea at both this past NeurIPS
and ICML I was attending
workshops or listening to various talks
about self-supervised learning which is
one vertical segment maybe of sort of
unsupervised learning that I'm excited
about maybe just to summarize the idea I
guess you know the idea or should I
describe it briefly no
please so here's an example of self
supervised learning let's say we grab a
lot of unlabeled images off the internet
so with infinite amounts of this type of
data I'm going to take each image and
rotate it by a random multiple of 90
degrees and then I'm going to train a
supervised neural network to predict what
was the original orientation so has it
been rotated 90 degrees 180
degrees 270 degrees or
zero degrees so you can generate an
infinite amount of labeled data because
you rotated the image so you know
what's the ground truth label and so
various researchers have found that by
taking unlabeled data and making up
labeled datasets and training a large
neural network on these tasks you can
then take the hidden layer
representation and transfer it to a
different task very powerfully um
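The rotation task described above can be sketched in a few lines of plain Python. This is a minimal illustration that only builds the made-up labeled data; it does not train a network. Images are assumed to be small 2D lists of pixel values, and the function names, the toy image, and the choice of clockwise rotation are assumptions for the example.

```python
import random


def rotate90_clockwise(image):
    """Rotate a 2D list (rows of pixels) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def make_rotation_example(image, rng):
    """Rotate `image` by a random multiple of 90 degrees.

    Returns (rotated_image, label) where label is the applied rotation in
    degrees: 0, 90, 180 or 270. The label is free because we chose it.
    """
    quarter_turns = rng.randrange(4)
    rotated = [list(row) for row in image]
    for _ in range(quarter_turns):
        rotated = rotate90_clockwise(rotated)
    return rotated, quarter_turns * 90


def make_rotation_dataset(images, copies_per_image, seed=0):
    """Turn unlabeled images into labeled (image, rotation) pairs."""
    rng = random.Random(seed)
    return [
        make_rotation_example(image, rng)
        for image in images
        for _ in range(copies_per_image)
    ]


# Hypothetical 2x2 "image" standing in for an unlabeled picture.
toy_image = [[1, 2], [3, 4]]
dataset = make_rotation_dataset([toy_image], copies_per_image=8)
```

A classifier trained to predict the rotation label from the rotated image would then supply the hidden-layer representation that gets transferred to another task.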
learning word embeddings where we take a
sentence delete a word and predict the
missing word which is how we learn you
know one of the ways we learn word
embeddings is another example and I
think there's now this portfolio of
techniques for generating these made-up
tasks another one called jigsaw would
be if you take an image cut it up into
a you know three by three grid so like
nine puzzle pieces jumble up the nine
pieces and have a neural network predict
which of the nine factorial possible
permutations it came from so there are
many groups including you know OpenAI
Pieter Abbeel has been doing some work on
this too Facebook
Google brain I think deep mind oh Aaron
van den Oord has great work on the CPC
objective so many teams are doing
exciting work and I think this is a way
to generate infinite labeled data and
I find this a very exciting piece of
unsupervised learning so long-term you
think that's going to unlock a lot of
power and in machine learning systems is
this kind of unsupervised learning I
don't think this is the whole enchilada
I think that's just a piece of it and I
think this one piece self
supervised learning is starting to get
traction we're very close to it being
useful well word embeddings are really
really useful I think we're getting
closer and closer to just having a
significant real world impact maybe in
computer vision and video but I think
this concept and then I think there'll
be other concepts around it you know
other unsupervised learning things that
I worked on I've been excited about I
was really excited about sparse coding
and ICA slow feature analysis I
think all of these are ideas that
various of us were working on about a
decade ago before we all got distracted
by how well supervised learning was
working yeah so we would
return to the
fundamentals of representation learning
that that really started this movement
of deep learning I think there's a lot
more work that one could explore around
this theme of ideas and other ideas to
come up with better algorithms so if we
could return to maybe talk quickly about
the specifics of deep learning AI the
deep learning specialization perhaps how
long does it take to complete the course
would you say the official length of the
deep learning specialization is I think 16
weeks so about 4 months but it's go at
your own pace so if you subscribe to the
deep learning specialization there are
people that finish it in less than a month
by working more intensely and studying
more intensely so it really depends on the
individual when we created the deep
learning specialization we wanted to make
it very accessible and very affordable and
with you know Coursera and deep learning
AI's education mission one thing that's
really important to me is that if
there's someone for whom paying anything
is a financial hardship then just
apply for financial aid
and get it for free if you were to
recommend a daily schedule for people in
learning whether it's through the deep
learning AI specialization or just
learning in the world of deep learning
what would you recommend how do they go
about day to day is there specific advice
about learning about their journey in
the world of deep learning machine
learning I think getting into the habit
of learning is key and that means
regularity so for example we send out
our weekly newsletter the batch every
Wednesday so people know it's coming
Wednesday you can spend a little bit of
time on Wednesday catching up on the
latest news through the batch
on Wednesday and for myself I've
picked up a habit of spending some time
every Saturday and every Sunday reading
or studying and so I don't wake up on
the Saturday and have to make a decision
do I feel like reading or studying today
or not it's just it's just what I do and
the fact is a habit makes it easier so I
think if someone can get in that habit
it's like you know just like we brush
our teeth every morning I don't think
about it if I thought about it it's a
little bit annoying to have to spend two
minutes doing that but it's a habit that
takes no cognitive load but this
would be so much harder if we had to
make a decision every morning so and
actually that's the reason why I wear the
same thing every day as well it's just
one less decision I just get up and
wear my blue shirt so I think if you can
get that habit that consistency of
studying then it actually feels
easier so yeah it's kind of amazing in
my own life like I play guitar every day
I force myself to at least for
five minutes play guitar it's just it's
a ridiculously short period of time but
because I've gotten into that habit it's
incredible what you can accomplish in a
period of a year or two years you could
become you know
exceptionally good at certain aspects of
a thing by just doing it every day for a
very short period of time it's kind of a
miracle that that is how it works
it adds up over time yeah and I think
it's often not about the
bursts of effort and the all-nighters
because you can only do that a
limited
number of times it's the sustained effort
over a long time I think you know reading
two research papers is a nice thing to
do but the power is not reading two
research papers it's reading two
research papers a week for a year then
you've read a hundred papers and you
actually learn a lot when you read a
hundred papers so regularity and making
learning a habit do you have do you have
general other study tips for
particularly deep learning that people
should follow in their process of learning
is there some kind of recommendations or
tips you have as they learn one thing I
still do when I'm trying to study
something really deeply is take
handwritten notes it varies I know
there are a lot of people that take the
deep learning courses during their
commutes or something where it may be more
awkward to take notes so I know it
may not work for everyone but when I'm
taking courses on Coursera you know and
I still take some every now and
then the most recent one I took was a
course on clinical trials because I was
interested in that I got out my little
Moleskine notebook and I was sitting at
my desk just taking down notes of
what the instructor was saying and
we know that that act of taking
notes preferably handwritten notes
increases retention so as you're sort of
watching the video just kind of pausing
maybe and then taking the basic insights
down on paper yeah so there have been
a few studies if you search online
you'll find some of these studies that
taking handwritten notes because
handwriting is slower as we were saying
just now um it causes you to recode the
knowledge in your own words more and
that process of recoding promotes
long-term retention this is as opposed
to typing which is fine again typing is
better than nothing
and taking a class and not taking notes is
better than not taking any class at all
but comparing handwritten notes and typing
um you can usually type faster for a lot
of people than you can handwrite notes
and so when people type they're more
likely to transcribe verbatim what they
heard and that reduces the amount of
recoding and that actually results in
less long-term retention I don't know
what the psychological effect there is
but it's so true there's something
fundamentally different about
handwriting I wonder what that is I
wonder if it is as simple as just the
time it takes to write is slower
yeah and because you
can't write as many words you have to
take whatever they said and summarize it
into fewer words and that summarization
process requires deeper processing of
the meaning which then results in better
retention that's fascinating oh and then
I spent I think yeah because of Coursera
I spent so much time studying
pedagogy it's actually one of my passions
I really love learning how to more
efficiently help others learn yeah one
of the things I do both in creating
videos or when we write the batch is um
I try to think is one minute spent with
us going to be a more efficient learning
experience than one minute spent
anywhere else and we really try to you
know make it time efficient for the
learners because you know everyone's
busy so when we're editing I
often tell my teams every word needs
to fight for its life and if we can delete
a word let's just delete it and
not waste the learner's
time wow that's so it's so amazing that
you think that way because there are
millions of people that are impacted by
your teaching and sort of that one
minute spent has a ripple effect right
through years of time which is just
fascinating talk about how does one make
a career out of an interest in deep
learning give advice for people we just
talked about sort of the beginning early
steps but if you want to make it a
entire life's journey or at least a
journey of a decade or two
how do you do it so the most important
thing is to get started right and I think
in the early part of a career coursework
um like the deep learning specialization
is a very efficient way to master this
material so because you know instructors
be it me or someone else or you know
Laurence Moroney who teaches our
TensorFlow specialization and other things
we're working on spend effort to try to
make it time efficient for you to learn
new concepts so coursework is
actually a very efficient way for people
to learn concepts in the beginning
parts of breaking into a new field in fact
one thing I see at Stanford
some of my PhD students want to jump into
research right away and I actually
tend to say look in your first couple of
years as a PhD student spend time
taking courses because it lays the
foundation it's fine if you're less
productive in your first couple of years
you'll be better off in the long term
um beyond a certain point there's
material that doesn't exist in courses
because it's too cutting edge the
course hasn't been created yet there's
some practical experience that we're not
yet that good at teaching in a course
and I think after exhausting the
efficient coursework then most people
need to go on to either ideally work on
projects and then maybe also continue
their learning by reading blog posts and
research papers and things like that
doing practice is really important and
again I think it's important to start
small and just do something today you
read about deep learning and feel like all
these people are doing such exciting
things what if I'm not building a neural
network that changes the world then what's
the point well the point is sometimes
building that tiny neural network you
know be it MNIST or upgrade to
Fashion-MNIST or whatever doing your
own fun hobby project that's how you
gain the skills to let you do bigger and
bigger projects I find this to be true
at the individual level and also at the
organizational level for a company to
become good at machine learning
sometimes the right thing to do is not
to tackle the giant project it's instead
to do the small project that lets the
organization learn and then build up
from there
but this is true both for individuals
and for companies just taking
the first step and then taking small
steps is the key should students
pursue a PhD do you think you can do so
much that's the one of the fascinating
things in machine learning you can have
so much impact without ever getting a
PhD so what are your thoughts should
people go to grad school should people
get a PhD I think that there are
multiple good options of which doing a
PhD could be one of them I think that if
someone's admitted to a top PhD program
you know at MIT Stanford top schools I
think that's a very good experience or
if someone gets a job at a top
organization at a top AI team I think
that's also a good experience there are
some things you still need a PhD to do if
someone's aspiration is to be a professor
you know at a top academic university you
just need a PhD to do that but if the goal
is to you know start a company build a
company do great technical work I think
a PhD is a good experience but I would
look at the different options available
to someone you know what are the places
where you can get a job what are the
places where you can get into a PhD
program and kind of
weigh the pros and cons of those so just
to linger on that for a little bit
longer what final dreams and goals do
you think people should have so what
options should they explore so you can
work in industry so for a large company
like Google Facebook Baidu all these large
companies that already have huge teams of
machine learning engineers you can also
do within industry sort of more
research groups kind of like Google
research Google brain then you can also
do like we said be a professor in
academia and what else oh you can still
build your own company you can do a
start-up is there anything that stands
out between those options or are they
all beautiful different journeys that
people should consider I think the thing
that affects your experience more is
less are you in this company versus that
company or academia versus industry I
think the thing that affects your
experience most is who are the people
you're interacting with you know on a
daily basis so even if you look at some
of the large companies the experience of
individuals and different teams is very
different and what matters most is not
the logo above the door when you walk
into the giant building every day what
matters the most is who are the 10
people who are the 30 people you
interact with every day
so I actually tend to advise people if
you get a job offer from a company ask
who is your manager who are your peers
who are you actually going to talk to
we're all social creatures we tend
to you know become more like the people
around us and if you're working with
great people you will learn faster or if
you get admitted if you get a job at a
great company or a great university
maybe the logo you walk in under you know
is great but you're actually stuck on some
team doing really work that doesn't
excite you and then that's actually a
really bad experience so this is true
both
for universities and for large companies
for small companies you can kind of figure
out who you'd be working with quite
quickly and I tend to advise people if a
company refuses to tell you who you would
work with someone says oh join us the
rotation system will figure it out I think
that's a worrying answer
because it means you may not
actually get sent to a team with
great peers and great people to
work with it's actually a really
profound advice that we kind of
sometimes sweep aside we don't consider it
too rigorously or carefully the people
around you are really often
especially when you accomplish great
things
it seems the great things are
accomplished because of the people
around you so it's not
about whether you learn
this thing or that thing or like you
said the logo that hangs up top it's
the people that's fascinating and it's
such a hard search process of finding
just like finding the right friends and
somebody to get married with and that
kind of thing it's a very hard search
process a people search problem yeah but
I think when someone interviews you know
at a university or the research lab at a
large corporation it's good to insist on
just asking who are the people who is my
manager and if you refuse to tell me I'm
gonna think well maybe that's because
you don't have a good answer it may not
be someone I like
and if you don't particularly connect if
something feels off with the people then
don't stick to it you know that's a
really important signal to consider yeah
and actually in my Stanford
class CS230 as well as an ACM talk I think
I gave like an hour long talk on career
advice including on the job search
process and then some of these so there
are videos you can find online
awesome I'll point people to
them beautiful so the AI fund helps ai
startups get off the ground or perhaps
you can elaborate on all the fun things
it's involved with what's your advice on
how does one build a successful AI
startup you know in Silicon Valley a lot
of startup failures come from building
products that no one wanted
so you know cool technology but
who's gonna use it so I think I tend to
be very outcome driven um and
customer obsessed ultimately we don't get
to vote if we succeed or fail it's only
the customer that's the only one that gets
a thumbs up or thumbs down vote in the
long term in the short term you know
there are various people who get various
votes but in the long term that's what
really matters so as you build the startup
you have to ask the question will the
customer give a
thumbs up on this I think so I think
startups that are very customer focused
customer obsessed deeply understand the
customer and are oriented to serve the
customer are more likely to succeed
with the proviso that I think all of
us should only do things that we think
create social good and move the world
forward so I personally don't
want to build addictive digital products
just to sell ads you know there are things
that could be lucrative that I won't do
but if we can find ways to serve people
in meaningful ways I think those can be
great things to do either in
the academic setting or in a corporate
setting or a startup setting so can you
give me the idea of why you started the
AI fund I remember when I was leaving
the AI group at Baidu I had two jobs two
parts of my job one was to build an AI
engine to support the existing
businesses and that was running you
know it just performed by itself
the second part of my job at the time
which was to try to systematically
initiate new lines of businesses using
the company's AI capabilities so you know
the self-driving car team came out of my
group the smart speaker team
similar to what is Amazon Echo
Alexa in the US but we announced it
before Amazon did so Baidu
wasn't following wasn't following
Amazon that came out of my group
and I found that to be um actually
the most fun part of my job so what I
wanted to do was to build AI Fund as a
startup studio to systematically create
new startup firms
with all the things we can now do with AI
I think the ability to build new teams
to go after this rich space of
opportunities is a very important way to
very important mechanism to get these
projects done that I think will move the
world forward so I've been fortunate to
have built a few teams that had a
meaningful positive impact and I felt
that we might be able to do this in a
more systematic repeatable way so a
start-up studio is a relatively new
concept there there are maybe dozens of
startup studios you know right now but I
feel like all of us many teams are still
trying to figure out how do you
systematically build companies with a
high success rate so I think even though
my you know venture capital friends
seem to be more and more building
companies rather than investing in
companies but I find it a fascinating thing
to do to figure out the mechanisms by
which we could systematically build
successful teams successful businesses
in in areas that we find meaningful so
startup studio is something is is a
place and a mechanism for startups to go
from zero to success so try to develop a
blueprint it's actually a place for us
to build startups from scratch so we
often bring in founders and work with
them or maybe even have existing ideas
that we match founders with and then
this launches you know hopefully into
successful companies so how close are
you to figuring out a way to automate
the process of starting from scratch and
building a successful AI startup yeah I
think we've we've been constantly
improving and iterating on our processes
for how we do that so things like you
know how many customer calls do we need
to make in order to get customer
validation how do we make sure this
technology can be built
well all of our businesses need
cutting-edge machine learning algorithms
so you know kind of algorithms developed in
the last one or two years and even if it
works in a research paper it turns out
taking it to production is really hard there are a
lot of issues for making these things
work in real life that are not widely
addressed in academia so how do you
validate
that this is actually doable how do you build a
team get the specialized domain knowledge
be it in education or healthcare or
whatever sector we are focusing on so I
think we're actually getting we've been
getting much better at giving the
entrepreneurs a high success rate but I
think we're still I think the whole
world is still in the early phases
of figuring this out but do you think there
are some aspects of that process that are
transferable from one startup to another
to another to another yeah very much so
you know starting a company to most
entrepreneurs is is a really lonely
thing and I've seen so many
entrepreneurs not know how to make a
certain decision like when do you need to
how do you do B2B sales right if you
don't know that this is really hard or
how do you market this efficiently other
than you know buying ads which is really
expensive
are there more efficient tactics or
for a machine learning project you know
basic decisions can change the course of
whether the machine learning product works
or not and so there are so many hundreds
of decisions that entrepreneurs need to
make and making a mistake in a couple of
key decisions can have a huge impact on
the fate of the company so I think a
startup studio provides a support
structure that makes starting a company
much less of a lonely experience and
also um when faced with these key
decisions like trying to hire your first
VP of Engineering what's a good
selection criteria how do you source should I
hire this person or not by helping by
having an ecosystem around the
entrepreneurs the founders to help I
think we help them at the key moments
and hopefully significantly
make it more enjoyable and with a higher
success rate there's somebody to
brainstorm with in these very difficult
decision points and also to help them
recognize what they may not even realize
is a key decision point right that's
that's the first probably the most
important part yeah you can say one
other thing um you know I think
building companies is one thing but I
feel like it's really important that we
build companies
that move the world forward for example
within the AI Fund team there was once an idea for a
new company that if it had succeeded would
have resulted in people watching a lot
more videos in a certain narrow vertical
type of video I looked at it the business
case was fine the revenue case was fine
but I looked at it and just said I don't want
to do this that you know I don't
actually just want to have a lot more
people watch this type of video it wasn't
educational and
so I vetoed the idea on the basis
that I didn't think it would actually help
people so whether building companies or
working in enterprises or doing personal
projects I think it's up to each of us
to figure out what's the difference we
want to make in the world with Landing
AI you help already established
companies grow their AI and machine
learning efforts how does a large
company integrate machine learning into
their efforts AI is a general purpose
technology and I think it will transform
every industry our community has already
transformed to a large extent the software
internet sector most software internet
companies outside the top five or
six or three or four already have
reasonable machine learning capabilities
or are getting there there is still room for
improvement but when I look outside the
software internet sector everything from
manufacturing agriculture healthcare
logistics transportation there's so
many opportunities that very few people
are working on so I think the next wave
for AI is for us to also transform all of
those other industries there was a
McKinsey study estimating 13 trillion
dollars of global economic growth the
u.s. GDP is 19 trillion dollars so
thirteen trillion is a big number
or PwC estimated 16 trillion dollars so
whatever the number is it's large but the
interesting thing to me was a lot of
that impact would be outside the
software internet sector so we need more
teams to work with these companies to
help them adopt AI and I think this is
one of the things that will you know help drive
global economic growth and make humanity
more powerful and like you said the
impact is there so what are the best
industries the biggest industries where
AI can
help perhaps outside the software tech sector
um frankly I think is all of them some
of the ones I'm spending a lot of time
on are manufacturing agriculture looking
into healthcare for example in
manufacturing we do a lot of our work in
visual inspection where today there are
people standing around using the eye
the human eye to check if you know this
plastic part or the smartphone or this
thing has a scratch or a dent or something
in it um we can use a camera to take a
picture use an algorithm deep learning
and other things to check if it's
defective or not and thus help factories
improve yield and improve quality and
improve throughput it turns out the
practical problems we run into are very
different than the ones you might read
about in most research papers the data
sets are really small so you face small
data problems the factories keep
on changing the environment so it works
well on your test set but guess what
you know something changes in the
factory the lights go on or off
recently there was a factory in which
a bird flew through the factory and
pooped on something and so that you know
so that changed stuff and so improving
our algorithms' robustness to
all the changes that happen in the factory I
find that we run into a lot of practical
problems that that are not as widely
discussed in in academia and is really
fun kind of being on the cutting edge
solving these problems before you know
maybe before many people are even aware
that there is a problem there and that's
such a fascinating space you're
absolutely right but what is the first
step that a company should take it's
just scary leap into this new world of
going from the human eye inspecting to
digitizing that process having a camera
having an algorithm what's the first
step like what's the early journey that
you recommend that you see these
companies taking I published a document
called the AI transformation playbook
that's online and taught briefly in the
AI for Everyone course on Coursera about the
long term journey that companies should
take but the first step is actually to
start small I've seen a lot more companies
fail by starting too big than by
starting too small um take even Google
you know most people don't
realize how hard it was and how
controversial it was in the early days so
when I started Google Brain um it was
controversial
you know people thought deep learning
neural nets tried it didn't work why would you
want to do deep learning
so my first internal customer within
Google was the Google speech team which
is not the most lucrative project in
Google not the most important it's
not web search or advertising but by
starting small my team helped the
speech team build a more accurate speech
recognition system and this caused their
peers other teams to start to have more faith
in deep learning my second internal
customer was the Google Maps team where
we use computer vision to read house
numbers from basic Street View images
to more accurately locate houses within
Google Maps so improve the quality of the geodata
and it was only after those two
successes that I then started a more
serious conversation with the Google Ads
team and so there's a ripple effect that
you showed that it works in
these cases and then it just propagates
through the entire company that this
this thing has a lot of value and use
for us I think the early small-scale
projects help the teams gain faith
but also help the teams learn what
these technologies do I still remember
when our first GPU server it was a
server under some guys desk and you know
and and then that taught us early
important lessons about how do you have
multiple users share a set of GPUs which
was really non-obvious at the time but
those early lessons were important we
learned a lot from that first GPU server
that later helped the teams think
through how to scale it up to much
larger deployments are there concrete
challenges that companies face that
you see as important for them to solve I
think building and deploying machine
learning systems is hard there's a huge
gulf between something that works in a
Jupyter notebook on your laptop versus
something that runs in a production
deployment setting in a factory or
agriculture plant or whatever um so I see a
lot of people you know get something to
work on your laptop and say wow look
what I've done and that's that's
great that's hard that's a very important
first step but a lot of teams underestimate
the rest of the steps
um so for example I've heard this exact
same conversation between a lot of
machine learning people and
businesspeople
the machine learning person says look my
algorithm does well on the test set and
it's a clean test set I didn't peek and
the business person
says thank you very much but your
algorithm sucks it doesn't work and the
machine learning person says no wait I
did well on the test set um and I think
there is a gulf between what it takes to
do well on a test set on your hard drive
versus what it takes to work well in a
deployment setting some common
problems robustness and generalization you
know you deploy something in the factory
maybe they chopped down a tree outside
the factory so the tree no longer covers
the window and the lighting is different
so the test set changes and in machine
learning and especially in academia we
don't know how to deal with test set
distributions that are dramatically
different than the training set
distribution there's research there's stuff
like domain adaptation transfer learning
you know there are people working on it
but we're really not good at this
so how do you actually get this to work
because your test set distribution is
going to change and I think um also if
you look at the number of lines of code
in a software system the machine
learning model it's maybe five percent
or even fewer relative to the entire
software system we need to build so how
to get all that work done and make it
reliable and systematic so good software
engineering work is fundamental here to
building a successful small machine
learning system yes and and and the
software system needs to interface with
people's workflows so machine
learning is automation on steroids
if we take one task out of the many tasks
that are done in factories so a factory
does lots of things one task is visual
it can be really valuable but you may
need to redesign a lot of other tasks
around that one task for example say the
machine learning algorithm says this is
defective what are you supposed to do do you
throw it away do you get a human to double
check do you want to rework it or fix it
so you need to redesign a lot of tasks
around that thing you've now automated
so
planning for the change management and
making sure that the software you write
is consistent with the new workflow
and you take the time to explain to
people what needs to happen so I think what
Landing AI has become good at and I think
we learned by making mistakes and you
know painful experiences
what we've become good at is working
with our partners to think through all
the things beyond just the machine
learning model not just the Jupyter notebook
but build the entire system manage the
change process and figure out how to
deploy this in a way that has an actual
impact the processes that the large
software tech companies use for
deploying don't work for a lot of other
scenarios for example when I was
leading you know large speech teams um
if the speech recognition system goes down
what happens well alarms go off and
then someone like me will say hey you 20
engineers please fix this
but if you have a system go down
in the factory there are not 20 machine
learning engineers sitting around that you
can page on duty and have them fix it
so how do you deal with the maintenance
or the or the DevOps or the MLOps or
the other aspects of this so these are
concepts that I think Landing AI and a
few other teams are on the cutting edge of
but we don't even have systematic
terminology yet to describe some of the
stuff we do because I think we're we're
inventing it on the fly so you mentioned
some people are interested in
discovering mathematical beauty and
truth in the universe and you're
interested in having big positive impact
in the world so let me ask the two are
not inconsistent no they're all together
I'm only half joking because you're
probably interested a little bit in both
but let me ask a romanticized question
so much of the work your work and our
discussion today has been on the applied
AI maybe you can even call narrow AI
where the goal is to create systems that
automate some specific process that adds
a lot of value to the world but there's
another branch of AI starting with Alan
Turing the kind of dreams of creating
human level or superhuman level
intelligence is this something you dream
of as well do you think we human beings
will ever build a human level
or superhuman level intelligent
system I would love to get to AGI and I
think humanity will but whether it takes
a hundred years or 500 or 5,000 I find
hard to estimate do you have some folks
have worries about the different
trajectories that path would take even
existential threats of an AGI system do
you have such concerns whether in the
short term or the long term I do worry
about the long term fate of humanity um
I do wonder as well I do worry about
overpopulation on the planet Mars just
not today
I think there will be a day when maybe
maybe someday in the future Mars will be
polluted there are these children dying
and someone will look back at this video
and say Andrew how was Andrew so heartless
he didn't care about all these children
dying on the planet Mars and I apologize
to the future viewer I do care about the
children but I just don't know how to
productively work on that today your
picture will be in the dictionary for
the people who are ignorant about the
overpopulation on Mars okay yes so it's
a long term problem is there something
in the short term we should be thinking
about in terms of aligning the values of
our AI systems with the values of us
humans sort of something that Stuart Russell
and other folks are thinking about as
this system develops more and more we
want to make sure that it represents the
better angels of our nature
the ethics the values of our society you
know if you take self-driving cars um the
biggest problem with self-driving cars
is not that there's some trolley dilemma
and you teach this so you know how many
times when you're driving your car did
you face this moral dilemma of who do
I crash into so I think
self-driving cars will run into that problem
roughly as often as we do when we drive
our cars um the biggest problem with self-driving
cars is when there's a big white truck
across the road and what you should do
is brake and not crash into it and the
self-driving car fails and it crashes into
it so I think we need to solve that
problem first I think the problem with
some of these discussions about
AGI you know alignment the paperclip
problem is that it is a huge distraction
from the much harder problems that we
actually need to address today some of the hard
problems we need to address today
I think bias is a huge issue um I
worry about wealth inequality AI and
the Internet are causing an acceleration of
concentration of power because we can
now centralize data use AI to
process it and so industry after
industry we've affected every industry
so the internet industry has a lot of
winner-take-most or
winner-take-all dynamics but we've infected all these
other industries so we're also giving these
other industries winner-take-most or
winner-take-all flavors so look at
what Uber and Lyft did to the taxi
industry so we're doing this type of
thing a lot so we're creating
tremendous wealth but how do we make sure
the wealth is fairly shared I think that
and then how do we help people whose
jobs are displaced you know I think
education is part of it there may be
even more that we need to do than
education I think bias is a serious
issue there are adverse uses of AI like
deep fakes being used for various
nefarious purposes so I worry about some
teams maybe accidentally and I hope not
deliberately making a lot of noise about
problems in the distant
future rather than focusing on some of the
much harder problems
yeah they overshadow the problems that we
have already today they're exceptionally
challenging like those you said and even
the silly ones but the ones that have a
huge impact which is the lighting
variation outside of your factory window
that that ultimately is what makes the
difference between like you said the
Jupyter notebook and something that
actually transforms an entire industry
potentially yeah and I think and then
just to some companies when a regulator
comes to you and says look your product is
messing things up fixing it may have a
revenue impact it's much more fun to talk
to them about how you promise not to
wipe out humanity than to face the
actually really hard problems we
face so your life has been a great
journey from teaching to research to
entrepreneurship
two questions one are there regrets
moments that if you went back you would
do differently and two are there moments
you're especially proud of moments that
made you truly happy you know I've made
so many mistakes it feels like every
time I discover something I go why
didn't I think of this you know five
years earlier or even ten years earlier
and Reese's and then sometimes I read a
book and I go I wish I read this book
ten years ago my life would have been so
different although that happened
recently and then I was thinking if only
I read this book when we were starting up
Coursera
it could have been so much better but I
discovered that book had not yet been
written when we were starting Coursera so that
made me feel better but I find that the process
of discovery we keep on finding out
things that seem so obvious in hindsight
but it always takes us so much longer
than I wish to figure it out so on
the second question are there moments in
your life that if you look back that
you're especially proud of or especially
happy that fill you with happiness
and fulfillment well two answers one
is my daughter Nova yes of course
no matter how much time I spend with her
I just can't spend enough time with her
congratulations by the way thank you and
then second is helping other people I
think to me I think the meaning of life
is um helping others achieve whatever
are their dreams and then also to try to
move the world forward by making
humanity more powerful as a whole so the
times that I felt most happy most proud
was when I felt um someone else
allowed me the good fortune of helping
them a little bit on the path to their
dreams I think there's no better way to
end it than talking about happiness and
the meaning of life so Andrew it's a huge
honor me and millions of people thank
you for all the work you've done thank
you for talking today thank you so much
thanks
thanks for listening to this
conversation with Andrew Ng and thank
you to our presenting sponsor Cash App
download it use code LexPodcast you'll
get ten dollars and $10 will go to FIRST
an organization that inspires and
educates young minds to become science
and technology innovators of tomorrow if
you enjoy this podcast subscribe on
YouTube give it five stars on Apple
Podcasts support it on Patreon or simply
connect with me on Twitter at Lex
Fridman and now let me leave you with
some words of wisdom from Andrew Ng ask
yourself if what you're working on
succeeds beyond your wildest dreams
would you have significantly helped
other people if not then keep searching
for something else to work on otherwise
you're not living up to your full
potential
thank you for listening and hope to see
you next time