the following is a conversation with michael littman a computer science professor at brown university doing research on and teaching machine learning reinforcement learning and artificial intelligence he enjoys being silly and lighthearted in conversation so this was definitely a fun one quick mention of each sponsor followed by some thoughts related to the episode thank you to simplisafe a home security company i use to monitor and protect my apartment expressvpn the vpn i've used for many years to protect my privacy on the internet masterclass online courses that i enjoy from some of the most amazing humans in history and betterhelp online therapy with a licensed professional please check out the sponsors in the description to get a discount and to support this podcast as a side note let me say that i may experiment with doing some solo episodes in the coming month or two the three ideas i have floating in my head currently are to use one a particular moment in history two a particular movie or three a book to drive a conversation about a set of related concepts for example i could use 2001 a space odyssey or ex machina to talk about agi for one two three hours or i could do an episode on the rise and fall of hitler and stalin each in a separate episode using relevant books and historical moments for reference i find the format of a solo episode very uncomfortable and challenging but that just tells me that it's something i definitely need to do and learn from the experience of course i hope you come along for the ride also since we have all this momentum built up on announcements i'm giving a few lectures on machine learning at mit this january in general if you have ideas for the episodes for the lectures or for just short videos on youtube let me know in the comments that i still definitely read despite my better judgment and the wise sage advice of the great joe rogan if you enjoy this thing subscribe on youtube review
it with five stars on apple podcasts follow on spotify support on patreon or connect with me on twitter at lex fridman and now here's my conversation with michael littman i saw a video of you talking to charles isbell about westworld the tv series you guys were doing a kind of thing where you're watching new things together but let's rewind back is there a sci-fi movie or book or show that was profound that had an impact on you philosophically or just like specifically something you enjoyed nerding out about yeah interesting i think a lot of us have been inspired by robots in movies the one that i really like is uh there's a movie called robot and frank which i think is really interesting because it's very near-term future where uh robots are being deployed as uh helpers in people's homes and we don't know how to make robots like that at this point but it seemed very plausible it seemed very realistic or imaginable and i thought that was really cool because they're awkward they do funny things it raised some interesting issues but it seemed like something that would ultimately be helpful and good if we could do it right yeah he was an older cranky gentleman right he was an older cranky uh jewel thief yeah it's kind of a funny little thing which is you know he's a jewel thief and so he pulls the robot into his life which is something you could imagine taking a home robotics thing and pulling it into whatever quirky thing that's involved in your this is meaningful to you exactly so yeah and i think from that perspective i mean not all of us are jewel thieves and so when we bring our robots into it for yourself uh explains a lot about this apartment actually but no the idea that people should have the ability to you know make this technology their own that it becomes part of their lives and i think it's hard for us as technologists to make that kind of technology it's easier to mold people into
what we need them to be and um just that opposite vision i think is really inspiring and then there's a anthropomorphization where we project certain things on them because i think the robot was kind of dumb but i have a bunch of roombas that play with and they you immediately project stuff onto them much greater level of intelligence we'll probably do that with each other too much much greater degree of compass that's right one of the things we're learning from ai is where we are smart and where we are not smart yeah you also enjoy as people can see and i enjoyed myself uh watching you sing and even dance a little bit a little bit a little bit a little bit of dancing a little bit of dancing that's not quite my thing as a as a method of education or just in life you know in general so easy question what's the definitive objectively speaking top three songs of all time maybe something that you know uh to walk that back a little bit maybe something that others might be surprised by the three three songs that you kind of enjoy that is a great question that i cannot answer but instead let me tell you a story so pick a question you do want it that's right i've been watching the presidential debates and vice president debates and turns out yeah it's really you can just answer any question you want so so it's a related question [Laughter] yeah well said i really like pop music i've enjoyed pop music ever since i was very young so 60s music 70s music 80s music this is all awesome and then i had kids and i think i stopped listening to music and i was starting to realize that the like my musical taste had sort of frozen out and so i decided in 2011 i think to start listening to the top 10 billboard songs each week so i'd be on the on the treadmill and i would listen to that week's top 10 songs so i could find out what was popular now and what i discovered is that i have no musical taste whatsoever i like what i'm familiar with and so yeah the first time i'd hear a song it's 
the first week that was on the charts i'd be like and then the second week i was into it a little bit and the third week i was loving it and by the fourth week it's like just part of me and so i'm afraid that i can't tell you my favorite song of all time because it's whatever i heard most recently yeah that's interesting people have told me that um there's an art to listening to music as well if you listen to a song just carefully like explicitly just force yourself to really listen you start to uh i did this when i was part of jazz band and fusion band in college you start to hear the layers of the instruments you start to hear the individual instruments and you can listen to classical music or to orchestra this way you can listen to jazz this way i mean uh it's funny to imagine you now to walk that forward to listening to pop hits now as like a scholar listening to like cardi b or something like that or justin timberlake is he no not timberlake bieber i guess they've both been in the top 10 since i've been listening they're still up there oh my god i'm so clueless if you haven't heard justin timberlake's top 10 in the last few years there was one song that he did where the music video was set at essentially neurips oh wow oh the one with the robotics yeah yeah yeah yeah yeah yeah he's like at an academic conference and he's doing it he was presenting it was sort of a cross between the apple like steve jobs kind of talk and neurips um so you know it's always fun when ai shows up in pop culture i wonder if he consulted somebody for that that's really interesting so maybe on that topic i've seen your um your celebrity in multiple dimensions but one of them is you've done cameos in different places i've seen you in a turbotax commercial as like i guess the brilliant einstein character and the point is that turbotax doesn't need somebody like you doesn't need a brilliant
very few things need someone like me but yes they were specifically emphasizing the idea that you don't need to be a like a computer expert to be able to use their software how did you end up in that world i think it's an interesting story so i was teaching my class it was an intro computer science class for non-concentrators non-majors and sometimes when people would visit campus they would check in to say hey we want to see what a class is like can we sit on your class so a person came to my class who was the daughter of the brother of the hus husband of the best friend of my wife anyway basically a family friend came to campus to to check out brown and asked to come to my class and and came with her dad her dad is uh who i've known from various kinds of family events and so forth but he also does advertising and he said that he was recruiting scientists for this this this ad this this turbotax set of ads and he said we wrote the ad with the idea that we get like the most brilliant researchers um but they all said no so can you help us find the like b level scientists i'm like sure that's that's who i hang out with so that should be fine so i put together a list and i did what some people call the dick cheney so i included myself on the list of possible candidates uh you know with a little blurb about each one and why i thought it would make sense for them to to do it and they reached out to a handful of them but then they ultimately they youtube stalked me a little bit and they thought oh i think he could do this and um they said okay we're gonna offer you the commercial i'm like what so um it was it was such an interesting experience because it's it's they have another world the people who do like nationwide kind of ad campaigns and and television shows and movies and so forth it's quite a a remarkable system that they have going because like a set yeah so i went to uh it was just somebody's house that they rented in new jersey um but it in the in the 
commercial it's just me and this other woman in reality there were 50 people in that room and another i don't know half a dozen kind of spread out around the house in various ways there were people whose job it was to control the sun they were in the backyard on ladders putting filters up to try to make sure that the sun didn't glare off the window in a way that would wreck the shot so there was like six people out there doing that there was three people out there giving snacks the craft table there was another three people giving healthy snacks because that was a separate craft table there was one person whose job it was to keep me from getting lost and the i think the reason for all this is because so many people are in one place at one time they have to be time efficient they have to get it done this the morning they were going to do my commercial in the afternoon they were going to do a commercial of a mathematics professor from princeton they had to get it done no you know no wasted time or energy and so there's just a fleet of people all working as an organism and it was fascinating i was just the whole time just looking around like this is so neat like one person whose job it was to take the camera off of the camera man so that someone else whose job it was to remove the film canister because every couple's takes they had to replace the film because you know film gets used up it was just i don't know i was geeking out the whole time it was so fun how many takes did it take it looked the opposite like there was more than two people there it was very relaxing right yeah the super i mean the person who i was in the scene with um is a professional she's a you know uh she's an actor improv comedian okay in your community and when i got there they had given me a script as such as it was and then i got there and they said we're gonna do this as improv i'm like i don't know how to improv like this is not i don't know what this i don't know what you're telling me to 
do here don't worry she knows okay okay we'll see how this goes i get i guess i got pulled into the story because like where the heck did you come from i guess in the scene like how did you show up in this random person's house i don't know yeah well i mean the reality of it is i stood outside in the blazing sun there was someone whose job it was to keep an umbrella over me because i started to schvitz i started to sweat and so i would wreck the shot because my face was all shiny with sweat so there was one person who would dab me off had an umbrella um but yeah like the reality of it like why is this strange stalkery person hanging around outside somebody's house yeah we're not we're not sure when you have to look in we'll have to wait for the book but are you uh so you make you make like you said youtube you make videos yourself you make awesome parody sort of uh parody songs that kind of focus in on particular aspects of computer science how much those seem really natural how much production value goes into that do you also have a team of 50 people videos almost all the videos except for the ones that people would have actually seen were just me i write the lyrics i sing the song i i generally find a um like a backing track online because i'm unlike you can't really play an instrument and then i do in some cases i'll do visuals using just like powerpoint lots and lots of powerpoint to make it sort of like an animation the the most produced one is the one that people might have seen which is the overfitting video that i did with charles isbell um and that was produced by the georgia tech and udacity people because we were doing a class together it was kind of i usually do parody songs kind of to cap off a class at the end of a class so that one you're wearing so it's a this the thriller yeah you're wearing the michael jackson the red leather jacket the interesting thing with podcasting that you're also uh into is that i really enjoy is that there's not a team of 
people it's kind of more because you know there's something that happens when there's more people involved than just one person that just the way you start acting i don't know there's a censorship you're not given especially for like slow thinkers like me you're not and i think most of us are if we're trying to actually think we're a little bit slow and careful large teams kind of get in the way of that and i don't know what to do with that like that's the to me like if you know this it's very popular to criticize quote unquote mainstream media but there is legitimacy to criticizing them the same i love listening to npr for example but it's clear that there's a team behind it there's constant commercial breaks there's this kind of like rush of like uh okay i have to interrupt you now because we have to go to commercial just this whole it destroys the possibility of nuanced conversation yeah exactly evian uh which charles isbell who i talked to yesterday told me that evian is naive backwards which the fact that his mind thinks this way is just uh it's quite brilliant anyway there's a freedom to this podcast he's dr awkward which by the way is a palindrome that's a palindrome that i happen to know from other parts of my life and i just you just throw it out well you know use it against charles dr awkward so what uh what was the most challenging parody song to make was it the thriller one hmm no that was really fun i wrote the lyrics really quickly um and then i gave it over to the production team they recruited an a cappella group to sing it went really smoothly it's great having a team because then you can just focus on the part that you really love which in my case is writing the lyrics uh for me the most challenging one not challenging in a bad way but challenging in a really fun way was one of the parody songs i did is about the halting problem in computer
science the the fact that you can't create a program that can tell for any other arbitrary program whether it actually going to get stuck in infinite loop or whether it's going to eventually stop and so i i did it to an 80s song because that's i hadn't started my new thing of learning current songs and it was billy joel's the piano man nice which is a great song great song yeah yeah and sing me a song you get the piano man yeah yeah so the lyrics are great because first of all it rhymes uh not all songs rhyme i did i've done rolling stone songs which turn out to have no rhyme scheme whatsoever they're just sort of yelling and having a good time which makes it not fun from a parody perspective because like you can say anything but this you know the lines rhymed and there was a lot of internal rhymes as well and so figuring out how to sing with internal rhymes a proof of the halting problem was really challenging and it was i really enjoyed that process what about uh last question on this topic what about the dancing in the thriller video how many takes that take so i wasn't planning to dance they they had me in the studio and they gave me the jacket and it's like well you can't if you have the jacket and the glove like there's not much you can do yeah so i um i think i just danced around and then they said why don't you dance a little bit we there was a scene with me and charles dancing together they did not use it in the video but we recorded it um yeah yeah no it was it was pretty funny and charles who has this beautiful wonderful voice doesn't really sing he's not really a singer and so that was why i designed the song with him doing a spoken section and me doing things very like barry white yeah it's a smooth baritone yeah yeah it's great that was awesome so one of the other things charles said is that you know everyone knows you as like a super nice guy super passionate about teaching and so on uh what he said i don't know if it's true that despite the fact 
that you're you are cold like okay i will admit this finally for the first time that was me it's the johnny cash song shot a man in reno just to watch him die uh that you actually do have uh some strong opinions on some topics so if this in fact is true what uh strong opinions would you say you have are there ideas you think maybe in artificial intelligence machine learning maybe in life that you believe are true that others might you know some number of people might disagree with you on so i try very hard to see things from multiple perspectives there's this great calvin and hobbes cartoon where do you know okay so calvin's dad is always kind of a bit of a foil and he talked to calvin calvin had done something wrong and the dad talks him into like seeing it from another perspective and this breaks calvin because he's like oh my gosh now i can see the opposite sides of things and so it becomes like a cubist cartoon where there is no front and back everything's just exposed and it really freaks him out and finally he settles back down it's like oh good now i can make that go away but like i live in that world where i'm trying to see everything from every perspective all the time so there are some things that i've formed opinions about that it would be harder i think to disabuse me of one is um the super intelligence argument and the existential threat of ai is one where i feel pretty confident in my feeling about that one like i'm willing to hear other arguments but like i am not particularly moved by the idea that if we're not careful we will accidentally create a super intelligence that will destroy human life let's talk about that let's get you in trouble and record your video it's like bill gates uh i think he said like some quote about the internet that that's just gonna be a small thing it's not gonna really go anywhere and i think uh steve ballmer said uh i don't know why i'm
sticking on microsoft uh that's something that like smartphones are useless there's no reason why microsoft should get into smartphones that kind of so let's talk about agi as agi is destroying the world we'll look back at this video and see no uh i think it's really interesting to actually talk about because nobody really knows the future so you have to use your best intuition it's very difficult to predict it but you have spoken about agi and the existential risks around it and sort of based on your intuition that we're quite far away from that being a serious concern relative to the other concerns we have can you maybe uh unpack that a little bit yeah sure so as i understand it uh for example i read bostrom's book and a bunch of other reading material about this sort of general way of thinking about the world and i think the story goes something like this that we will at some point create computers that are smart enough that they can help design the next version of themselves which itself will be smarter than the previous version of themselves and eventually bootstrap up to being smarter than us at which point we are essentially at the mercy of this sort of more powerful intellect which in principle uh we don't have any control over what its goals are and so if its goals are at all out of sync with our goals like for example the continued existence of humanity we won't be able to stop it it'll be way more powerful than us and we will be toast so there's some i don't know very smart people who have signed on to that story and it's a compelling story i once now i can really get myself in trouble i once wrote an op-ed about this specifically responding to some quotes from elon musk who has been you know on this very podcast uh more than once and well ai is summoning the demon i think he said but then he came to providence rhode island which is where i live and said uh to the governors of all the states uh
you know you're worried about entirely the wrong thing you need to be worried about ai you need to be very very worried about ai so uh journalists kind of reacted to that and they wanted to get people's take and i was like okay my belief is that one of the things that makes elon musk so successful and so remarkable as an individual is that he believes in the power of ideas he believes that if you have a really good idea for getting into space you can get into space if you have a really good idea for a company or for how to change the way that people drive you just have to do it and it can happen it's really natural to apply that same idea to ai you see these systems that are doing some pretty remarkable computational tricks uh demonstrations and then to take that idea and just push it all the way to the limit and think okay where does this go where is this going to take us next and if you're a deep believer in the power of ideas then it's really natural to believe that those ideas could be taken to the extreme and kill us so i think you know his strength is also his undoing because that doesn't mean it's true like it doesn't mean that that has to happen but it's natural for him to think that so another way to phrase the way he thinks and i find it very difficult to argue with that line of thinking uh so sam harris is another person from a neuroscience perspective that thinks like that is saying well is there something fundamental in the physics of the universe that prevents this from eventually happening and nick bostrom thinks in the same way they're kind of zooming out yeah okay we humans now uh are existing in this like time scale of minutes and days and so our intuition is in this time scale of minutes hours and days but if you look at the span of human history is there any reason you can't see this in 100 years and like is there something fundamental about the laws of physics that
prevent this and if it doesn't then it eventually will happen or we will destroy ourselves in some other way it's very difficult i find to actually argue against that yeah me too and not sound like you're just like rolling your eyes uh i'm like i have like science fiction we don't have to think about it but even worse than that which is like i don't know kids but like i gotta pick up my kids now like this okay i see there's more pressing yeah there's more pressing short-term things that like uh stop over this existential crisis where much much shorter things like now especially this year there's covid so like any kind of discussion like that is like there's you know there's pressing things uh today and then so the sam harris argument well like any day the exponential singularity can occur it's very difficult to argue against i mean i don't know but part of his story is also he's not going to put a date on it it could be in a thousand years it could be in 100 years it could be in two years it's just that as long as we keep making this kind of progress it ultimately has to become a concern i kind of am on board with that but the piece that i feel like is missing from that way of extrapolating from the moment that we're in is that i believe that in the process of actually developing technology that can really get around in the world and really process and do things in the world in a sophisticated way we're going to learn a lot about what that means that we don't know now because we don't know how to do this right now if you believe that you can just turn on a deep learning network and eventually give it enough compute and it'll eventually get there well sure that seems really scary because we won't be in the loop at all we won't be helping to design or target these kinds of systems but i don't see that that feels like it is against the
laws of physics because these systems need help right they need they need to surpass the the the difficulty the wall of complexity that happens in arranging something in the form that that will happen in yeah like i believe in evolution like i believe that the that that there's an argument right so there's another argument just to look at it from a different perspective that people say well i don't believe in evolution how could evolution it's it's sort of like a random set of parts assemble themselves into a 747 and that could just never happen yeah so it's like okay that's maybe hard to argue against but clearly 747s do get assembled they get assembled by us basically the idea being that there's a process by which we will get to the par the point of making technology that has that kind of awareness and in that process we're going to learn a lot about that process and we'll have more ability to control it or to shape it or to build it in our own image it's not something that is going to spring into existence like that 747 and we're just gonna have to contend with it completely unprepared it's very possible that in the context of the long arc of human history it will in fact spring into existence but that springing might take like if you look at nuclear weapons like even 20 years is a springing in in the context of human history and it's very possible just like with nuclear weapons that we could have i don't know what percentage you want to put at it but the the possibility could have knocked ourselves out yeah the possibility of human beings destroying themselves in the 20th century with nuclear weapons i don't know you can if you really think through it you could really put it close to like i don't know 30 40 percent given like the certain moments of crisis that happen so like i think one like fear in the shadows that's not being acknowledged is it's not so much the ai will run away is is that as it's running away we won't have enough time to uh think through how 
to stop it right fast takeoff or foom yeah i mean my much bigger concern i wonder what you think about it which is we won't know it's happening so i kind of that argument i think that there is an agi situation already happening with social media that our minds our collective intelligence of human civilization is already being controlled by an algorithm and like we're we're already super like the the level of a collective intelligence thanks to wikipedia people should donate to wikipedia to feed the agi man if we had a super intelligence that that was in line with wikipedia's values that it's a lot better than a lot of other things i can imagine i've i trust wikipedia more than i trust facebook or youtube as far as trying to do the right thing from a rational perspective yeah now that's not where you were going i understand that but it it it does strike me that there's sort of smarter and less smart ways of of exposing ourselves to each other on the internet yeah the interesting thing is that wikipedia and social media have very different forces you're right i mean wikipedia if if agi was wikipedia it'd be just like this cranky overly competent editor of uh articles uh you know there's there's something to that but the social media aspect is is is not so the vision of agis is as a separate system that's super intelligent that's super intelligent that's one key little thing i mean there's the paper clip argument that's super dumb but super powerful systems but with social media you have a relatively like algorithms we may talk about today very simple algorithms that when uh something charles talks a lot about which is interactive ai when they start like having at scale like tiny little interactions with human beings they can start controlling these human beings so a single algorithm can control the minds of human beings slowly to what we might not realize it could start wars it could start it can change the way we think about things it feels like in the long arc of 
history if i were to sort of zoom out from all the outrage and all the tension on social media that it's progressing us towards uh better and better things it feels like chaos and toxic and all that kind of stuff but it's chaos and toxic yeah but it feels like actually the chaos and toxic is similar to the kind of debates we had from the founding of this country you know there was a civil war that happened over that over that period and ultimately it was all about this tension of like something doesn't feel right about our implementation of the core values we hold as human beings and they're constantly struggling with this and that results in people calling each other uh like just just being shitty to each other on twitter but i ultimately the algorithm is managing all that and it feels like there's a possible future in which that algorithm controls us to into the direction of self-destruction whatever that looks like yeah so so all right i do believe in the power of social media to screw us up royally i do believe in the power of social media to benefit us too i do think that we're in a yeah it's sort of almost got dropped on top of us and now we're trying to as a culture figure out how to cope with it there's a sense in which i don't know there's there's some arguments that say that for example i guess college-age students now late college-age students now people who are in middle school when when social media started to really take off maybe maybe really damaged like me this may have really hurt their development in a way that we don't have all the implications of quite yet that's the generation who if and i hate to make it somebody else's responsibility but like they're the ones who can fix it they're the ones who can who can figure out how do we keep the good of this kind of technology without letting it eat us alive and if they're successful we move on to the next phase the next level of the game if they're not successful then yeah then we're going to wreck 
each other we're going to destroy society so you're going to in your old age sit on the porch and watch the world burn because the tiktok generation that uh i believe well so this is my kids' age right and that's certainly my daughter's age and she's very tapped in to social stuff but she's also she's trying to find that balance right of participating in it and then getting the positives of it but without letting it eat her alive um and i think sometimes she ventures hopes just to watch this sometimes i think she ventures a little too far and is consumed by it and other times she gets a little distance um and if you know if there's enough people like her out there they're gonna navigate these choppy waters that's an interesting uh skill actually to develop i talked to my dad about it you know i've uh now somehow this podcast in particular but other reasons has received a little bit of attention and with that apparently in this world even though i don't shut up about love and i'm just all about kindness i have now a little mini army of trolls oh it's kind of hilarious actually but it also doesn't feel good but it's a skill to learn to not look at that like to moderate actually how much you look at that the discussion i have with my dad is similar to uh it doesn't have to be about trolls it could be about checking email which is like if you're anticipating you know there's uh my dad runs a large institute at drexel university and there could be stressful like emails you're waiting like there's drama of some kind and so like there's a temptation to check the email if you send an email you cut it and that pulls you in it doesn't feel good and it's a skill that he actually complains that he hasn't learned i mean he grew up without it so he hasn't learned the skill of how to shut off the internet and walk away and i think young people while they're also being quote-unquote damaged by like uh you know being bullied online all
those stories which are very like horrific you basically can't escape your bullies these days when you're growing up but at the same time they're also learning that skill of how to be able to shut off uh the like disconnect with it be able to laugh at it not take it too seriously it's fascinating like we're all trying to figure this out just like you said it's been dropped on us and we're trying to figure it out yeah i think that's really interesting and i i guess i've become a believer in the human design which i feel like i don't completely understand like how do you make something as robust as us like we're so flawed in so many ways and yet and yet you know we dominate the planet and we do seem to manage to get ourselves out of scrapes eventually not necessarily the most elegant possible way but somehow we get we get to the next step and i don't know how i'd make a machine do that i i i generally speaking like if i train one of my reinforcement learning agents to play a video game and it works really hard on that first stage over and over and over again and it makes it through it succeeds on that first level and then the new level comes and it's just like okay i'm back to the drawing board and somehow humanity we keep leveling up and then somehow managing to put together the skills necessary to achieve success some semblance of success in that next level too and you know i hope we can keep doing that you mentioned reinforcement learning so you've have uh a couple years in the field no quite you know quite a few quite a long career in artificial intelligence broadly but reinforcement learning specifically can you maybe give a hint about your sense of the history of the field and in some ways has changed with the advent of deep learning but has a long roots like how is it weaved in and out of your own life how have you seen the community change or maybe the ideas that it's playing with change i've had the privilege the pleasure of being of having almost a front 
row seat to a lot of this stuff and it's been really really fun and interesting so uh when i was in college in the 80s early 80s uh the neural net thing was starting to happen and i was taking a lot of psychology classes a lot of computer science classes as a college student and i thought you know something that can play tic-tac-toe and just like learn to get better at it that ought to be a really easy thing so i spent almost almost all of my what would have been vacations during college like hacking on my home computer trying to teach it how to play tic-tac-toe in the programming language basic oh yeah that's that's i was i that's my first language that's my native language is that when you first fell in love with computer science just like programming basic on that uh what was the computer do you remember i had i had a trs-80 model one before they were called model ones because there was nothing else uh i got my computer in 1979 uh instead so i was i was i would have been bar mitzvahed but instead of having a big party that my parents threw on my behalf they just got me a computer because that's what i really really really wanted i saw them in the in the in the mall in radio shack and i thought what how are they doing that i would try to stump them i would give them math problems like one plus and then in parentheses two plus one yeah and it would always get it right i'm like how do you know so much like i've had to go to algebra class for the last few years to learn this stuff and you just seem to know so i was i was i was smitten and i got a computer and i think ages 13 to 15 i have no memory of those years i think i just was in my room with the computer listening to billy joel communing possibly listening to the radio listening to billy joel that was the one album i had uh on vinyl at that time and um and then i got it on cassette tape and that was really helpful because then i could play it i didn't have to go down to my parents wi-fi or hi-fi sorry uh and
at age 15 i remember kind of walking out and like okay i'm ready to talk to people again like i've learned what i need to learn here and um so yeah so so that was that was my home computer and so i went to college and i was like oh i'm totally going to study computer science i opted the college i chose specifically had a computer science major the one that i really wanted the college i really wanted to go to didn't so bye-bye to them which college did you go to so i went to yale uh princeton would have been way more convenient and it was just beautiful campus and it was close enough to home and i was really excited about princeton and i visited i said so computer science major like well we have computer engineering i'm like oh i don't like that word engineering i like computer science i really i want to do like you're saying hardware and software they're like yeah like i just want to do software i i couldn't care less about hardware you grew up in philadelphia i grew up outside philly yeah yeah okay uh so the you know local schools were like penn and drexel and uh temple like everyone in my family went to temple at least at one point in their lives except for me so yeah philly philly family yale had a computer science department and that's when you it's kind of interesting you said 80s and neural networks that's when you know that was a hot new thing or a hot thing period uh so was that in college when you first learned about neural networks yeah yeah when you learned like it was in a psychology class not in a cs wow yeah was it psychology or cognitive science or like do you remember like what context it was yeah yeah yeah so so i was a i've always been a bit of a cognitive psychology groupie so like i studied computer science but i like i like to hang around where the cognitive scientists are because i don't know brains man they're like they're wacky cool and they have a bigger picture view of things they're a little less engineery i would say
they're more they're more interested in the nature of cognition and intelligence and perception like how the vision system works they're asking always bigger questions now with the deep learning community there i think more there's a lot of intersections but i do find in that the neuroscience folks actually and uh cognitive psychology cognitive science folks are starting to learn how to program how to use artificial neural networks and they are actually approaching problems in like totally new interesting ways it's fun to watch that grad students from those departments like approach the problem of machine learning right they come in with a different perspective yeah they don't care about like your imagenet data set or whatever they they want like to understand the the like the basic mechanisms at the at the neuronal level and the functional level of intelligence it's kind of it's kind of cool to see them work but yeah okay so you always you were always a groupie of cognitive psychology yeah yeah and so uh so it was in a class by richard gerrig he was kind of my my favorite uh psych professor in college and i took uh like three different classes with him and yeah so that we they were talking specifically the class i think was kind of a there was a big paper that was written by stephen pinker and uh prince i don't i'm blanking on prince's first name but prince and pinker and prince they wrote kind of a they were at that time kind of like ah i'm blanking on the names of the current people um the cognitive scientists who are complaining a lot about deep networks oh uh gary gary marcus sorry marcus and who else i mean there's a few but gary gary's the most feisty sure gary's very feisty and with this with his co-author they they you know they're kind of doing these kind of takedowns where they say okay well yeah it does all these amazing amazing things but here's a shortcoming here's a shortcoming here's a shortcoming and so the pinker prince
paper is kind of like the that generation's version of marcus and davis right where they're they're trained as cognitive scientists but they're looking skeptically at the results in the in the artificial intelligence neural net kind of world and saying yeah it can do this and this and this but like it can't do that and it can't do that and it can't do that maybe in principle or maybe just in practice at this point but but the fact of the matter is you're you've narrowed your focus too far to be impressed you know you're impressed with the things within that circle but you need to broaden that circle a little bit you need to look at a wider set of problems and so um so we have so i was in this seminar in college that was basically a close reading of the pinker prince paper which was like really thick there was a lot going on in there and um and and it talked about the reinforcement learning idea a little bit i'm like oh that sounds really cool because behavior is what is really interesting to me about psychology anyway so making programs that i mean programs are things that behave people are things that behave like i want to make learning that learns to behave in which way was reinforcement learning presented is this uh talking about human and animal behavior or are we talking about actual mathematical constructs ah that's right so that's a good question right so this is i think it wasn't actually talked about as behavior in the paper that i was reading i think that it just talked about learning and to me learning is about learning to behave but really neural nets at that point were about learning like supervised learning so learning to produce outputs from inputs so i kind of tried to invent reinforcement learning i uh when i graduated i joined a research group at bellcore which had spun out of bell labs recently at that time because of the divestiture of the of long distance and local phone service in the 1980s 1984 and i was in a group uh with dave ackley who was
the first author of the boltzmann machine paper so the very first neural net paper that could handle xor right so xor sort of killed neural nets the very first the zeroth the first winter yeah um the the perceptrons paper and hinton along with his student dave ackley and and i think there was other authors as well showed that no no with boltzmann machines we can actually learn non-linear concepts and so everything's back on the table again and that kind of started that second wave of neural networks so dave ackley was he became my mentor at bellcore and we talked a lot about learning and life and computation and how all these things fit together now dave and i have a podcast together so um so i get to kind of enjoy that sort of his his perspective uh once again even even all these years later and so i said so i said i was really interested in learning but in the context of behavior and he's like oh well that's reinforcement learning here and he gave me rich sutton's 1984 td paper so i read that paper i honestly didn't get all of it but i got the idea i got that they were using that he was using ideas that i was familiar with in the context of neural nets and and like sort of backprop uh but with this idea of making predictions over time i'm like this is so interesting but i don't really get all the details i said to dave and dave said oh well why don't we have him come and give a talk and i was like wait what you can do that like these are real people i thought they were just words i thought it was just like ideas that somehow magically seeped into paper he's like no i i i know rich like we'll just have him come down and and he'll give a talk and so i was you know my mind was blown and uh so rich came and he gave a talk at bellcore and he talked about what he was super excited about which was they had just figured out at the time uh q learning so uh watkins had visited rich sutton's lab at umass or it's andy barto's lab that rich was a part of and um he was really
excited about this because it resolved a whole bunch of problems that he didn't know how to resolve in the in the earlier paper and so uh for people who don't know td temporal difference these are all just algorithms for reinforcement learning right and td temporal difference in particular is about making predictions over time and you can try to use it for making decisions right because if you can predict how good a future action and action outcomes will be in the future you can choose one that has better and or but the theory didn't really support changing your behavior like the predictions had to be of a consistent process if you really wanted it to work and one of the things that was really cool about q-learning another algorithm for reinforcement learning is it was off policy which meant that you could actually be learning about the environment and what the value of different actions would be while actually figuring out how to behave optimally yeah so that was a revelation yeah and the proof of that is kind of interesting i mean that's really surprising to me when i first read that and then in rich sutton's book on the matter it's it's kind of beautiful that a single equation can capture an equation one line of code and like you can learn anything yeah like enough time so equation and code you're right like you can the code that you can arguably at least if you like squint your eyes can say this is all of intelligence is that you can implement that in a single well i think i started with lisp which is uh shout out to lisp uh like a single line of code key piece of code maybe a couple that you could do that it's kind of magical it's uh feels too good to be true well and it sort of is yeah it's kind of kind of it seems to require an awful lot of extra stuff supporting it but yeah but nonetheless the ideas the the idea is really good and as far as we know it is it is a very reasonable way of trying to create adaptive behavior behavior that gets better at something
over time did you find the idea of optimal uh at all compelling that you could prove that it's optimal so like one part of computer science that it makes people feel warm and fuzzy inside is when you can prove something like that a sorting algorithm worst case runs in n log n and it makes everybody feel so good even though in reality it doesn't really matter what the worst case is what matters is like does this thing actually work in practice on this particular actual set of data that i that i enjoy did you so here's that here's a place where i have maybe a strong opinion uh-oh which is like you're right of course but no no like so so the what makes worst case so great right if you have a worst case analysis so great is that you get modularity you can take that thing and plug it into another thing and still have some understanding of what's going to happen when you click them together right if it just works well in practice in other words with respect to some distribution that you care about when you go plug it into another thing that distribution can shift it can change and your thing may not work well anymore and you want it to and you wish it does and you hope that it will but it might not and then ah so you're so so you're saying you don't like machine learning but we have some positive theoretical results for these things you know you can come back at me with yeah but they're really weak and yeah they're really weak and and you can even say that you know sorting algorithms like if you do the optimal sorting algorithm it's not really the one that you want and that might be true as well but but it is the modularity is a really powerful statement really as an engineer you can then assemble different things you can count on them to be i mean it's interesting it's it's a balance like with everything else in life you don't want to get too obsessed i mean this is what computer scientists do which they potentially get obsessed they over optimize things or they
start by optimizing them they over optimize yeah so it's it's easy to like get really granular about this thing but like the step from an n squared to an n log n sorting algorithm is a big leap for most real-world systems no matter what the actual behavior of the system is that's a big leap and the same can probably be said for other kind of first leaps that you would take on a particular problem like it's the picking the low hanging fruit or whatever the equivalent of doing the not the dumbest thing but the next to the dumbest thing is picking the most delicious reachable fruit yeah most delicious reachable fruit i don't know why that's not a saying and yeah okay so uh so you then this is the 80s and this kind of idea starts to percolate of uh yeah at that point i got to meet rich sutton so everything was sort of downhill from there and that was that was really the pinnacle of everything um but then i you know then i felt like i was kind of on the inside so then as interesting results were happening i could like check in with with rich or with gerry tesauro who had a huge impact on uh kind of early thinking in in temporal difference learning and reinforcement learning and showed that you could do you could solve problems that we didn't know how to solve any other way and so that was really cool so as good things were happening i would hear about it from either the people who were doing it or the people who were talking to the people who are doing it and so i was able to track things pretty well through through the 90s so what uh wasn't most of the excitement on reinforcement learning in the 90s era with what is it td gammon like what's the role of these kind of little like fun game playing things and breakthroughs about uh get you know exciting the community was that like what were your because uh you've also built or were part of building a crossword puzzle uh solver program yeah solving program uh called proverb so so you were interested in
this as as a problem like in forming using games to understand how to build uh intelligent systems so like what did you think about td gammon like what did you think about that whole thing in the 90s yeah i mean i found the td gammon result really just remarkable so i had known about some of gerry's stuff before he did td gammon he did a system just more vanilla well not not entirely vanilla but a more classical backproppy kind of uh network for playing backgammon where he was training it on expert moves so it was kind of supervised but the way that it worked was not to mimic the actions but to learn internally an evaluation function so to learn well if the expert chose this over this that must mean that the expert values this more than this and so let me adjust my weights to make it so that the network evaluates this as being better than this so it could learn from from human preferences it could learn its own preferences and then when he took the step from that to actually doing it as a full-on reinforcement learning problem where you didn't need a trainer you could just let it play that was that was remarkable right and so i think as as humans often do as we've done in the recent past as well people extrapolate it's like oh well if you can do that which is obviously very hard then obviously you could do all these other problems that we that we want to solve that we know are also really hard and it turned out very few of them ended up being practical partly because i think neural nets certainly at the time were struggling to be consistent and reliable and so training them in a reinforcement learning setting was a bit of a mess i had i don't know generation after generation of like master's students who wanted to do value function approximation basically learn reinforcement learning with neural nets and over and over and over again we were failing we couldn't get the good results that gerry tesauro got i now believe that gerry is a neural net whisperer he has a
particular ability to get neural networks to do things that other people would find impossible and it's not the technology it's the technology and gerry together yeah and which i think speaks to the role of the human expert in the process of machine learning right it's so easy we're so drawn to the idea that that it's the technology that is that is where the power is coming from that i think we lose sight of the fact that sometimes you need a really good just like i mean no one would think hey here's this great piece of software here's like i don't know gnu emacs or whatever um doesn't that prove that computers are super powerful and basically going to take over the world it's like no stallman is a hell of a hacker right so he was able to make the code do these amazing things he couldn't have done it without the computer but the computer couldn't have done it without him and so i think people discount the role of people like gerry who who um who have just a particular particular set of skills on that topic by the way as a small side note i tweeted emacs is greater than vim yesterday and deleted deleted the tweet 10 minutes later when i realized you're you were honest i started a war yeah i was like oh i was just kidding i i was just being um walk so people still feel passionately about that particular piece of uh i don't get that because emacs is clearly so much better i i don't understand but you know why do i say that because i cause like i spent a block of time in the 80s um making my fingers know the emacs keys and now like that's part of the thought process for me like i need to express and if you take that if you take my emacs key bindings away i become literally i can't express myself i'm the same way with the i don't know if you know what what it is but it's a kinesis keyboard which is uh this butt shaped keyboard yes i've seen them yeah and they're very uh i don't know sexy elegant yeah they're just beautiful yeah they're they're gorgeous uh way too expensive
but uh the the problem with them similar with emacs is when once you learn to use it it's harder to use other things it's hard to use other things there's this absurd thing where i have like small elegant lightweight beautiful little laptops and i'm sitting there in a coffee shop with a giant kinesis keyboard and a sexy little laptop it's absurd but it you know like i used to feel bad about it but at the same time you just kind of have to sometimes it's back to the billy joel thing you just have to throw that billy joel record and throw taylor swift and justin bieber to the wind so see but i like them now because i cause again i have no musical taste like like now that i've heard justin bieber enough i'm like i really like his songs and taylor swift not only do i like her songs but my daughter's convinced that she's a genius and so now i basically have i'm signed on to that so so yeah that that speaks to the back to the robustness of the human brain that speaks to the neuroplasticity that you can just you can you can just like a mouse teach yourself or like pavlov's dog teach yourself to enjoy taylor swift i'll try it out i don't know i try you know what it has to do with just like acclimation right just like you said a couple weeks yeah that's an interesting experiment i'll actually try that like i'll listen that wasn't the intent of the experiment just like social media it wasn't intended as an experiment to see what we can take as a society but it turned out that way i don't think i'll be the same person on the other side of the week listening to taylor swift but let's try it it's more compartmental don't be so worried like it's like i get that you can be worried but don't be so worried because we compartmentalize really well and so it won't bleed into other parts of your life you won't start i don't know wearing red lipstick or whatever like it's it's fine it's changed fashion and everything but you know what the the thing you have to watch out for is you'll walk
into a coffee shop once we can do that again and recognize the song and you'll be no you won't know that you're singing along until everybody in the coffee shop is looking at you and then you're like that wasn't me yeah that's the you know people are afraid of agi i'm afraid of the taylor uh the taylor swift takeover yeah and i mean people should know that td gammon was i get would you call it do you like the terminology of self-play by any chance so like systems that learn by playing themselves just i don't know if it's the best word but uh so what's what's the problem with that term okay so it's like the big bang like it's it's like talking to serious physicists do you like the term big bang and when when it was early i feel like it's the early days of self-play i don't know maybe it was used previously but i think it's been used by only a small group of people uh and so like i think we're still deciding is this ridiculously silly name a good name for the concept potentially one of the most important concepts in artificial intelligence okay it depends how broadly you apply the term so i used the term in my 1996 phd dissertation wow the actual terms of yeah because because tesauro's paper was something like um training up an expert backgammon player through self-play so i think it was in the title of his paper okay if not in the title it was definitely a term that he used there's another term that we got from that work is rollout so i don't know if you do you ever hear the term rollout that's a backgammon term that is now applied generally in computers well at least in ai because of td gammon yeah that's fascinating so how is self-play being used now and like why is it does it does it feel like a more general powerful concept sort of the idea of well the machine's just going to teach itself to be smart yeah so that's that's where maybe you can correct me but that's where you know the continuation of the spirit and actually like literally the exact algorithms
of td gammon are applied by deepmind and openai to learn games that are a little bit more complex that when i was learning artificial intelligence go was presented to me in artificial intelligence a modern approach i don't know if they explicitly pointed to go in those books as like unsolvable kind of thing like implying that these approaches hit their limit in this with these particular kind of games so something i don't remember if the book said it or not but something in my head or was the professors instilled in me the idea like this is the limits of artificial intelligence of the field like it instilled in me the idea that if we can create a system that can solve the game of go we've achieved agi that was kind of i didn't explicitly like say this but it that was the feeling and so from i was one of the people that it seemed magical when a learning system was able to to beat a uh a human world champion at the game of go and even more so from that that was alphago even more so with alphago zero then kind of renamed and advanced into alpha zero beating a world champion or world-class player without any supervised learning on expert games doing it only by playing itself so that is i don't know what to make of it i think it would be interesting to hear what your opinions are on just how exciting surprising profound interesting or boring the breakthrough performance of alpha zero was okay so alphago knocked my socks off that was that was so remarkable which aspect of it that they they got it to work that they actually were able to leverage a whole bunch of different ideas integrate them into one giant system just the software engineering aspect of it is mind-blowing i don't i i've never been a part of a program as complicated as the program that they built for that and um and just the you know like like gerry tesauro is a neural net whisperer like you know david silver is a kind of neural net whisperer too he was able to coax these networks and
these new way out there architectures to do these you know solve these problems that um as you said you know when we were learning uh ai no one had an idea how to make it work it was it was remarkable that um these you know these these techniques that were so good at playing chess and they could beat the world champion in chess couldn't beat you know your typical go playing teenager in go so the fact that that you know in a very short number of years we kind of ramped up to uh trouncing people in go just blew me away so you're kind of focusing on the engineering aspect which is also very surprising i mean there's something different about large well-funded companies i mean there's a compute aspect to it too sure like that of course i mean that's similar to deep blue right with uh with ibm like there's something important to be learned and remembered about a large company taking the ideas that are already out there and investing a few million dollars into it or or more and so you're kind of saying the engineering is kind of fascinating both on the with alphago is probably just gathering all the data right of the expert games like organizing everything actually doing distributed supervised learning and to me see the engineering i kind of took for granted to me philosophically being able to persist in the in the face of like long odds because it feels like for me i'll be one of the skeptical people in the room thinking that you can learn your way to to beat go like it sounded like especially with david silver it sounded like david was not confident at all it's like it was like not it's funny how confidence works yeah it's like you're not like cocky about it like but right because if you're cocky about it you kind of stop and stall and don't get anywhere yeah but there's like a hope that's unbreakable maybe that's better than confidence it's a kind of wishful hope and a little dream and you almost don't want to do anything else you kind of keep doing it that's
that seems to be the story and but with enough skepticism that you're looking for where the problems are and fighting through them yeah because you know there's got to be a way out of this thing yeah and for him it was probably there's there's a bunch of little factors that come into play it's funny how these stories just all come together like everything he did in his life came into play which is like a love for video games and also a connection to so the the 90s had to happen with td gammon and so on yeah in some ways it's surprising maybe you can provide some intuition to it that not much more than td gammon was done for quite a long time on the reinforcement learning front yeah is that weird to you i mean like i said the the students who i worked with we tried to get basically apply that architecture to other problems and we consistently failed there were a couple a couple really nice demonstrations that ended up being in the literature there was a paper about controlling elevators right where it's it's like okay can we modify the heuristic that elevators use for deciding like a bank of elevators for deciding which floors we should be stopping on to maximize throughput essentially and you can set that up as a reinforcement learning problem and you can you know have a neural net represent the value function so that it's taking where all the elevators where the button pushes you know this high dimensional well at the time high dimensional input um you know a couple dozen dimensions and turn that into a prediction as to oh is it going to be better if i stop at this floor or not and ultimately it appeared as though for the standard simulation distribution for people trying to leave the building at the end of the day that the neural net learned a better strategy than the standard one that's implemented in elevator controllers so that that was nice there was some work that satinder singh et al did on uh handoffs with cell phones uh you know deciding when when should
you hand off from this cell tower to this cell okay communication networks yeah yeah and so a couple things seemed like they were really promising none of them made it into production that i'm aware of and neural nets as a whole started to kind of implode around then and so there just wasn't a lot of air in the room for people to try to figure out okay how do we get this to work in the rl setting and then they they found their way back in in 10 in 10 plus years so you said alphago was impressive like it's a big spectacle is there right so then alpha zero so i think i may have a slightly different opinion on this than some people so um i talked to satinder singh in particular about this so satinder was uh like rich sutton a student of andy barto so they came out of the same lab very influential machine learning reinforcement learning researcher uh now at deepmind uh as just as is rich though different sites the two of them he's in alberta rich is in alberta and uh satinder would be in england but i think he's in england from michigan at the moment uh but the but he was yes he was much more impressed with uh alphago zero which is didn't didn't get a kind of a bootstrap in the beginning with human trained games yes just was purely self-play though the first one alphago was also a tremendous amount of self-play right they started off they kick-started the the action network that was making decisions but then they trained it for a really long time using more traditional temporal difference methods um so so as a result i didn't it didn't seem that different to me like it seems like yeah why wouldn't that work like once once it works it works so but he he found that that removal of that extra information to be breathtaking like that that's a game changer to me the first thing was more of a game changer but the open question i mean i guess that's the assumption is the expert games might contain with them within them a humongous amount of information but we know that it went
beyond that right we know that it somehow got away from that information because it was learning strategies i don't think it i don't think alphago is just better at implementing human strategies i think it actually developed its own strategies that were that was more effective and so from that perspective okay well so it made at least one quantum leap in terms of strategic knowledge okay so now maybe it makes three like okay but that first one is the doozy right getting it to to to work reliably and and for the networks to to hold on to the value well enough like that was that was a big step well isn't maybe you could speak to this on the reinforcement learning front so the starting from scratch and learning to do something like the first like like random behavior to like crappy behavior to like somewhat okay behavior it's not obvious to me that that's not like impossible to take those steps like if you just think about the intuition like how the heck does random behavior become somewhat basic intelligent behavior not not human level not super human level but just basic but you're saying to you kind of the intuition is like if if you can go from human to superhuman level intelligence on the uh on this particular task of game playing then so you're good at taking leaps so you can take many of them that the system i believe that the system can take that kind of leap yeah no and also i think that that beginner knowledge in go like you can start to get a feel really quickly for the idea that um you know certain parts of the being in certain parts of the board seems to be more associated with winning right because it's not it's not stumbling upon the concept of winning it's told that it wins or that it loses well it's self-play so it both wins and loses it's told which which side won and the information is kind of there to start percolating around to make a difference as to um well these things have a better chance of helping you win and these things have a worse chance 
of helping you win and so you know it can get to basic play i think pretty quickly then once it has basic play well now it's kind of forced to do some search to actually experiment with okay well what gets me that next increment of of improvement how far do you think okay this is where you kind of bring up the the elon musks and the sam harrises right how far is your intuition about these kinds of self-playing mechanisms being able to take us because it feels one of the ominous but stated calmly things that when i talked to david silver he said is that they have not yet discovered a ceiling for alpha zero for example in the game of go or chess it's it keeps no matter how much compute they throw at it it keeps improving so it's possible it's very possible that you if you throw you know some like 10x compute that it will improve by 5x or something like that and when stated calmly it's so like oh yeah i guess so but like and then you think like well can we potentially have like uh continuations of moore's law in a totally different way like broadly defined moore's law right not the constitutional improvement exponential improvement like are we going to have an alpha zero that swallows the world uh but notice it's not getting better at other things it's getting better at go yeah and i think it's a that's a big leap to say okay well therefore it's better at other things well i mean the the question is how much of the game of life can be turned into right so that's of that i think is a really good question and i think that we don't i don't think we as a i don't know community really know the answer to this but um so okay so so i went i went to a talk uh by some experts on computer chess so in particular computer chess is really interesting because for you know for of course for a thousand years humans were the best chess playing things on the planet um and then computers like edged ahead of the best person and they've been ahead ever since it's not like people
have have overtaken computers but um but computers and people together have overtaken computers right so at least last time i checked i don't know what the very latest is but last time i checked that there were teams of people who could work with computer programs to defeat the best computer programs in the game of go in the game of chess in the game of chess right and so using the information about how these things called elo scores this sort of notion of how strong a player are you there's a there's kind of a range of possible scores and the you you increment in score basically if you can beat another player of that lower score 62 percent of the time or something like that like there's some threshold of if you can somewhat consistently beat someone then you are of a higher score than that person and there's a question as to how many times can you do that in chess right and so we know that there's a range of human ability levels that cap out with the best playing humans and the computers went a step beyond that and computers and people together have not gone i think a full step beyond that it feels the estimates that they have is that it's starting to asymptote that we've reached kind of the maximum the best possible chess playing and so that means that there's kind of a finite strategic depth right at some point you just can't get any better at this game yeah i mean i i don't uh so i'd like to check that uh i think it's interesting because if you have somebody like uh magnus carlsen who's using these chess programs to train his mind like to learn to become a better chess player yeah and so like that's a very interesting thing because we're not static creatures we're learning together i mean just like we're talking about social networks those algorithms are teaching us just like we're teaching those algorithms so that's a fascinating thing but i think the best chess playing programs are now better than the pairs like they have competition between pairs but the it's
still even if they weren't it's an interesting question where's the ceiling so the the david the ominous david silver kind of statement is like we have not found the ceiling right but so the question is okay so i don't i don't know his analysis on that my from talking to go experts the depth the strategic depth of go seems to be substantially greater than that of chess that there's more kind of steps of improvement that you can make get getting better and better and better but there's no reason to think that it's infinite yeah and so it could be that it's that the what david is seeing is a kind of asymptoting that you can keep getting better but with diminishing returns and at some point you hit optimal play like in theory all these finite games they're finite they have an optimal strategy there's a strategy that is the minimax optimal strategy and so at that point you can't get any better you can't beat that that strategy now that strategy may be from an information processing perspective intractable right the you need the the all the situations are sufficiently different that you can't compress it at all it's this giant mess of hard-coded rules and we can never achieve that but but that still puts a cap on how many levels of improvement that we can actually make but the the thing about self-play is if you if you put it although i don't like doing that in the broader category of self-supervised learning is that it doesn't require too much or any human human labeling yeah yeah human label or just human effort the human involvement past a certain point and the same thing you could argue is true for the recent breakthroughs in natural language processing with language models oh this is how you get to gpt3 yeah see how that did the uh that was a good good transition yeah yeah i practiced that for days uh leading up to this guy now uh but like that's one of the questions is can we find ways to formulate problems in this world that are important to us humans like more 
important than the game of chess that uh to which self-supervised kinds of approaches could be applied whether it's self-play for example for like maybe you could think of like autonomous vehicles in simulation that kind of stuff or just robotics applications in simulation or in the self-supervised learning where unannotated data or data that's generated by humans naturally without extra cost like the wikipedia or like all of the internet can be used to learn something about to create intelligent systems that do something uh really powerful that pass the turing test or that do some kind of superhuman level performance so what's your intuition like trying to stitch all of it together about our discussion of agi the limits of self-play and your thoughts about maybe the limits of neural networks in the context of language models is there some intuition in there that might be useful to think about yeah yeah yeah so so first of all the the whole transformer network family of things um is really cool it's really really cool i mean for you know if you've ever back in the day you played with i don't know markov models for generating text and you've seen the kind of text that they spit out and you compare it to what's happening now it's it's amazing it's so amazing now it doesn't take very long interacting with one of these systems before you find the holes right it's it's not smart in any kind of general way it's really good at a bunch of things and it does seem to understand a lot of the statistics of language extremely well and that turns out to be very powerful you can answer many questions with that but it doesn't make it a good conversationalist right and doesn't make it a good storyteller it just makes it good at imitating things it has seen in the past the exact same thing could be said by people voting for donald trump about joe biden supporters and people voting for joe biden about donald trump supporters is uh you know that they're not intelligent
they're just following the yeah they're following things they've seen in the past and uh so it's very it doesn't take long to find the flaws in their uh in their like natural language generation abilities yes yeah so we're being very that's interesting critical of us right so so i've had a similar thought which was that the stories that gpt-3 spits out are amazing and very human-like and it doesn't mean that computers are smarter than we realize necessarily it partly means that people are dumber than we realize or that much of what we do day to day is not that deep like we're just we're just kind of going with the flow we're saying whatever feels like the natural thing to say next not a lot of it is is is creative or meaningful or or intentional but enough is that we actually get we get by right we we do come up with new ideas sometimes and we do manage to talk each other into things sometimes and we do sometimes vote for reasonable people sometimes but um but it's really hard to see in the statistics because so much of what we're saying is kind of rote and so our metrics that we use to measure how these systems are doing don't reveal that because it's it's it's in the interstices that that is very hard to detect but is your do you have an intuition that with these language models if they grow in size it's already surprising that when you go from gpt-2 to gpt-3 that there is a noticeable improvement so the question now goes back to the ominous david silver and the ceiling right so maybe there's just no ceiling we just need more compute now i mean okay so now i'm speculating yes as opposed to before when i was completely on firm yeah all right um i don't believe that you can get something that really can do language and use language as a thing that doesn't interact with people like i think that it's not enough to just take everything that we've said written down and just say that's enough you can just learn from that and you can be intelligent i think you really
need to be pushed back at i think that conversations even people who are pretty smart maybe the smartest thing that we know not maybe not the smartest thing we can imagine but we get so much benefit out of talking to each other and interacting that's presumably why you have conversations live with guests is that that there's something in that interaction that would not be exposed by oh i'll just write your story and then you can read it later and i think i think because these systems are just learning from our stories they're not learning from being pushed back at by us that they're fundamentally limited in what they could actually become on this route they have to they have to get you know shut down like we like we have to have an argument that they have to have an argument with us and lose a couple times before they start to realize oh okay wait there's some nuance here that actually matters yeah that's actually subtle sounding but quite profound that the interaction with humans is essential and the limitation within that is profound as well because the time scale like the bandwidth at which you can really interact with humans is very low so it's costly so you can't one of the underlying things about self-play is it has to do you know a very large number of interactions and so you can't really deploy reinforcement learning systems into the real world to interact like you couldn't deploy a language model into the real world to interact with humans because it would just not get enough data relative to the cost it takes to interact like the time of humans is is expensive which is really interesting that's that go that takes us back to reinforcement learning and trying to figure out if there's ways to make algorithms that are more efficient at learning keep the spirit of reinforcement learning and become more efficient in some sense this seems to be the goal i'd love to hear what your thoughts are i don't know if you got a chance to see a blog post called
bitter lesson oh yes by rich sutton that makes an argument hopefully i can summarize it perhaps perhaps you can yeah but okay so i i mean i could try and you can correct me which is uh he makes an argument that it seems if we look at the long arc of the history of the artificial intelligence field call it you know 70 years that the algorithms from which we've seen the biggest improvements in practice are the very simple like dumb algorithms that are able to leverage computation and you just wait for the computation to improve like all the academics and so on have fun by finding little tricks and and congratulate themselves on those tricks and sometimes those tricks can be like big that feel in the moment like big spikes and breakthroughs but in reality over the decades it's still the same dumb algorithm that just waits for the compute to get faster and faster do you find that to be an interesting argument against the entirety of the field of machine learning as an academic discipline that we're really just a subfield of computer architecture yeah we're just kind of waiting around for them to do we really don't want to do hardware work so like that's right i really don't want to we're procrastinating yes that's right just waiting for them to do their job so that we can pretend to have done ours so uh yeah i mean the argument reminds me a lot of i think it was a fred jelinek quote uh early computational linguist who said you know we're building these computational linguistic systems and every time we fire a linguist performance goes up by ten percent something like that and so the idea of us building the knowledge in in that in that case um was much less he was finding to be much less successful than get rid of the people who know about language as a you know from a kind of scholastic academic kind of perspective and replace them with more compute and so i think this is kind of a modern version of that story which is okay we want to do better on machine vision
you could build in all these you know motivated part-based models that you know that just feel like obviously the right thing that you have to have or we can throw a lot of data at it and guess what we're doing better with it with a lot of data so i i hadn't thought about it until this moment in this way but what i believe well i've thought about what i believe what i believe is that you know compositionality and what's the right way to say it the complexity grows rapidly as you consider more and more possibilities like explosively and so far moore's law has also been growing explosively exponentially and so so it really does seem like well we don't have to think really hard about the algorithm design or the way that we build the systems because the best benefit we could get is exponential and the best benefit that we can get from waiting is exponential so we can just wait it's got that's gotta end right and there's hints now that that moore's law is is starting to feel some friction uh starting to the world is pushing back a little bit um one thing i i don't know do lots of people know this i didn't know this i was i was trying to write an essay and yeah moore's law has been amazing and it's been it's enabled all sorts of things but there's a there's also a kind of counter moore's law which is that the development cost for each successive generation of chips also is doubling so it's costing twice as much money so the amount of development money per cycle or whatever is actually sort of constant and at some point we run out of money uh so or we have to come up with an entirely different way of of doing the development process so like i i guess i was always a bit skeptical of the look it's an exponential curve therefore it has no end soon the number of people going to neurips will be greater than the population of the earth that means we're going to discover life on other planets no it doesn't it means that we're in a in a sigmoid curve on the front half which
looks a lot like an exponential the second half is going to look a lot like diminishing returns yeah the i mean but the interesting thing about moore's law if you actually like look at the technologies involved it's hundreds if not thousands of s-curves stacked on top of each other it's not actually an exponential curve it's constant breakthroughs and and then what becomes useful to think about which is exactly what you're saying the cost of development like the size of teams the amount of resources that are invested in continuing to find new s-curves new breakthroughs and yeah it's uh it's an interesting idea you know if we live in the moment if we sit here today it seems to be the reasonable thing to say that exponentials end and yet in the software realm they just keep appearing to be happening anyway and it's so i mean it's so hard to disagree with elon musk on this because it it like i i've you know i used to be one of those folks i'm still one of those folks i've studied autonomous vehicles that's what i worked on and and it's it's like you look what elon musk is saying about autonomous vehicles well obviously in a couple years or in a year or next month we'll have fully autonomous vehicles like there's no reason why we can't driving is pretty simple like it's just a learning problem and you just need to convert uh all the driving that we're doing into data and just have a neural network that trains on that data and uh like we use only our eyes so you can use cameras and you can train on it and it's like yeah that's that what that should work and then you put that hat on like the philosophical hat and but then you put the pragmatic hat and it's like this is what the flaws of computer vision are like this is what it means to train at scale and then you you put the human factors the psychology hat on which is like it's actually driving is a lot the cognitive science or cognitive whatever the heck you call it is it's really hard it's much harder to drive than than we
realize there's a much larger number of edge cases so building up an intuition around this is uh around exponentials is really difficult and on top of that the pandemic is making us think about exponentials making us realize that like we don't understand anything about it we're not able to intuit exponentials we're either that's true ultra terrified some part of the population and some part is like uh the opposite of whatever the carefree and we're not managing everything blase well wow that's that french uh it seems so it's got so it's uh it's fascinating to think what what the limits of this exponential growth of technology not just moore's law it's technology how that rubs up against the bitter lesson and gpt-3 and self-play mechanisms like it's not obvious i used to be much more skeptical about neural networks now i at least give a sliver of possibility that we'll all be very much surprised and also you know uh caught in a way that like we uh are not prepared for like in applications of um social networks for example sure because it feels like really good transformer models that are able to do some kind of like very good uh natural language generation are the same kind of models that could be used to learn human behavior and then manipulate that human behavior to gain advertiser dollars and all those kinds of things sure uh feed the capitalist system and and right so they arguably already are manipulating human behavior yeah yeah so but not for self-preservation which i think is a big that would be a big step like if they were trying to manipulate us to convince us not to shut them off i would be very freaked out but i don't see a path to that from where we are now they they don't have any of those abilities that's not what they're trying to do they're trying to keep people on on the site but see the thing is this this is the thing about life on earth is they might be borrowing our consciousness and sentience like so like in a sense they do because the
creators of the algorithms have like they're not you know if you look at our body okay we're not a single organism we're a huge number of organisms with like tiny little motivations we're built on top of each other in the same sense the ai algorithms that are they're not it's a system that includes human companies and corporations right because corporations are funny organisms in and of themselves that really do seem to have self preservation built in and i think that's at the at the design level i think they're designed to have self-preservation be a focus so you're right in that in that broader system that we're also a part of and can have some influence on uh it's it's it is much more complicated much more powerful yeah i agree with that uh so people really love it when i ask what three books technical philosophical fiction had a big impact in your life maybe you couldn't recommend we went with movies we went uh with uh billy joel and i forgot what you uh what music you recommended but i didn't i just said i have no taste in music i just like pop music that was actually really uh skillful the way you thank you that question i'm going to try to do the same with the books so do you have a skillful way to avoid answering the question about three books you would recommend i'd like to tell you a story so um my first job out of college was at bellcore i mentioned that before where i worked with dave ackley the head of the group was a guy named tom landauer and i don't know how well known he's known now but arguably he's the he's the inventor and the first proselytizer of word embeddings so they they developed a system shortly before i got to the group yeah um that that uh called latent semantic analysis that would take words of english and embed them in you know multi-hundred dimensional space and then used that as a way of uh you know assessing similarity and basically doing reinforcement learning not sorry not reinforcing information retrieval you know sort of 
pre-google information retrieval and he was trained as an anthropologist but then became a cognitive scientist so i was in the cognitive science research group it's you know like i said i'm a cognitive science groupie um at the time i thought i'd become a cognitive scientist but then i realized in that group no i'm a computer scientist but i'm a computer scientist who really loves to hang out with cognitive scientists and he said he studied language acquisition in particular he said you know humans have about this number of words of vocabulary and most of that is learned from reading and i said that can't be true because i have a really big vocabulary and i don't read he's like you must i'm like i don't think i do i mean like stop signs i definitely read stop signs but like reading books is not it's not a thing that i do really though it might be just no i might be the red color do i read stop signs yeah no it's just pattern recognition at this point i don't sound it out um so now i do i wonder what that oh yeah stop the guns so um that's fascinating so you don't uh so i don't read very i mean obviously i read and i've read i've read plenty of books um but like some people like charles my friend charles and and and others like a lot of people in my field a lot of academics like reading was really a central topic to them in development and i'm not that guy in fact i used to joke that um when i got into college that it was on kind of a help out the illiterate kind of program because i got to like in my house i wasn't a particularly bad or good reader but when i got to college i was surrounded by these people that were just voracious in their reading appetite and they were like have you read this have you read this have you read this and i'd be like no i'm clearly not qualified to be at this school like there's no way i should be here now i've discovered books on tape like audiobooks um and so i'm i'm much better uh i'm more caught up i read a lot of books a small 
tangent on that it is a fascinating open question to me on the topic of driving whether you know supervised learning people machine learning people think you have to like drive to learn how to drive to me it's very possible that just by us humans by first of all walking but also by watching other people dr not even being inside cars as a passenger but let's say being inside the car as a passenger but even just like being a pedestrian and crossing the road you learn so much about driving from that it's very possible that you can without ever being inside of a car be okay at driving once you get in it uh or like watching a movie for example yeah i don't know something like that it's have you have you taught anyone to drive no so i have myself i have two children and um i learned a lot about car driving because my wife doesn't want to be the one in the car while they're learning so that's my job yeah so i sit in the passenger seat and it's really scary um you know i have wishes to live um and they're you know they're figuring things out now they start off very very much better than i imagine uh like a neural network would right they get that they're seeing the world they get that there's a road that they're trying to be on they get that there's a relationship between the angle the steering but it takes a while to not be very jerky and so that happens pretty quickly like the ability to stay in lane at speed that happens relatively fast it's not zero shot learning but it's pretty fast the thing that's remarkably hard and this is i think partly why self-driving cars are really hard is the degree to which driving is a social interaction activity yes and that blew me away i was completely unaware of it until i watched my son learning to drive and i was realizing that he was sending signals to all the cars around him and those in his case he's he's always had social communication challenges he was sending very mixed confusing signals to the other cars and that was causing 
the other cars to drive weirdly and erratically and there was no question in my mind that he would he would have an accident because they didn't know how to read him there's things you do with the the speed that you drive the positioning of your car that you're constantly like in the head of the other drivers and seeing him not knowing how to do that and having to be taught explicitly okay you have to be thinking about what the other driver is thinking was a revelation to me yeah i was supposed to be really so so creating kind of uh theories of mind of the other theories of mind of the other cars yeah yeah which i just hadn't heard discussed in the self-driving car talks that i've been to since then there's some people who do do consider those kinds of issues but it's way more subtle than i think there's a little bit of work involved with that when you realize like when you especially focus not on other cars but on pedestrians for example it's it's a literally staring you in the face yeah yeah yeah so that when you're just like how do i interact with pedestrians um yeah like pedestrians you're practically talking to an octopus at that point they've got all these weird degrees of freedom you don't know what they're going to do they can turn around any second but the point is we humans know what they're going to do like we have a good theory of mind we have a good mental model of what they're doing and we have a good model of the model that they have of you and the model of the model of the model like they're we're able to kind of reason about this kind of uh the social like game of it all the hope is that it's quite simple actually that it could be learned that's what i just talked to the waymo i don't know if you know that company it's google self-driving car they i talked to their cto on this podcast and they like i rode in their car and it's quite aggressive and it's quite fast and it's good and it feels great it make it also just like tesla waymo made me change my
mind about like maybe driving is easier than i thought maybe i'm just being speciesist human maybe uh it's a speciesist argument yes i don't know but it it's fascinating to think about like the same as with reading which i think you just said you avoided the question but i still hope you answered in some way we avoided it brilliantly it is there's blind spots there's artificial intelligence that artificial intelligence researchers have about what it actually takes to learn to solve a problem have you had anca dragan on yeah okay one of my favorites so much energy she's right oh she yeah she's amazing fantastic and and in particular she thinks a lot about this kind of i know that you know that i know kind of planning and the last time i spoke with her she was very articulate about the ways in which self-driving cars are not solved like what's still really really hard but even her intuition is limited like we're all like new to this uh so in some sense the elon musk approach of being ultra confident and just like put it out there putting it out there like some people say it's reckless and dangerous and so on but like partly it's like it seems to be one of the only ways to make progress in artificial intelligence so it's uh it's you know these these are difficult things you know democracy is messy uh uh implementation of artificial intelligence systems in the real world is messy so many years ago before self-driving cars were an actual thing you could have a discussion about somebody asked me like what if what if the what if we could use that robotic technology and use it to drive cars around like isn't that aren't people going to be killed and then it's not you know blah blah blah i'm like that's not what's gonna happen i said with confidence incorrectly obviously uh what i think is gonna happen is we're gonna have a lot more like a very gradual kind of rollout where people have these cars in like closed communities right where it's somewhat realistic but it's still
in a box right so that we can really get a sense of what what are the weird things that can happen how do we how do we have to change the way we behave around these vehicles like it obviously requires a kind of co-evolution that you can't just plop them in and see what happens but of course we're basically plopping them in to see what happens so i was wrong but i do think that would have been a better plan so that's but your intuition that's funny just zooming out and looking at the forces of capitalism and it seems that capitalism rewards risk takers and rewards and punishes risk takers like it and like try it out the academic approach to let's try a small thing and try to understand slowly the fundamentals of the problem and let's start with one and do two and then see that and then do the three uh you know uh the the capitalist like startup entrepreneurial dream is let's build a thousand and let's right and 500 of them fail but whatever the other 500 we learned from them but if you're good enough i mean one thing it's like your intuition would say like that's going to be hugely destructive to everything but actually it's kind of the the the forces of capitalism people are quite it's easy to be critical but if you actually look at the data at the way our world has progressed in terms of the quality of life it seems like the competent good people rise to the top this is coming from me from the soviet union and so on it's like it's interesting that somebody like elon musk is the way you uh you push progress in artificial intelligence like it's forcing waymo to step their stuff up uh and waymo is forcing uh elon musk to step up it's fascinating i because i have this tension in in my heart and just being upset by the lack of progress in autonomous vehicles and within academia so there's a huge progress in the early days of the darpa challenges and then it just kind of stopped like at mit but it's true everywhere else with an exception of a few sponsors here
and there is is like it's not seen as a sexy problem uh thomas like the moment artificial intelligence starts approaching the problems of the real world like academics kind of like ah all right let let the couple get really hard in a different way in a different way and that's right i think yeah right some of us are not excited about that other way but i still think there's fundamental problems to be solved in those difficult things it's not it's still publishable i think like we just need to it's the same criticism you could have of all these conferences in neurips cvpr where application papers are often as powerful and as important as a theory paper even like theory just seems much more respectable and so on i mean machine learning community is changing that a little bit i mean at least in statements but it's it's still not seen as the sexiest of uh pursuits which is like how do i actually make this thing work in practice as opposed to on this toy data set all that to say are you still avoiding the three books question is there something on audiobook that you can uh recommend oh i've yeah i mean um i yeah i've read a lot of really fun stuff uh in terms of books that i find myself thinking back on that i read a while ago like that have stood the test of time to some degree i find myself thinking of program or be programmed a lot by douglas rushkoff um which was it basically put out the premise that we all need to become programmers in one form or another and it was an analogy to once upon a time we all had to become readers we had to become literate and there was a time before that when not everybody was literate but once literacy was possible the people who were literate had more of a say in society than the people who weren't and so we made a big effort to get everybody up to speed and now it's it's not 100 percent universal but it's quite widespread like the assumption is generally that people can read the analogy that he makes is that programming is a similar 
kind of thing that uh that we need to have a say in right so being a reader being literate being a reader means you can receive all this information but you don't get to put it out there and programming is the way that we get to put it out there that was the argument he made i think he specifically has now backed away from this idea he doesn't think it's happening quite this way and that might be true that it didn't society didn't sort of play forward quite that way i still believe in the premise i still believe that at some point we have the relationship that we have to these machines and these networks has to be one of each individual has the wherewithal to make the machines help them do do the things that that person wants done and as so you know as software people we know how to do that and we have a problem we're like okay i'll just i'll hack up a perl script or something and make it so if we lived in a world where everybody could do that that would be a better world and computers would have i think less sway over us and other people's software would have less sway over us as a group yeah in some sense software engineering programming's power it's programming is power right it's it's yeah it's like magic it's like magic spells and and it's not out of reach of everyone but at the moment it's just a sliver of the population who can who can commune with machines in this way so i don't know so that book had a big big impact on me currently i'm i'm reading uh the alignment problem actually by brian christian so i don't know if you've seen this out there yet is this similar to stuart russell's work with the control problem it's in in that same general neighborhood i mean they take they have different emphases that they're they're concentrating on i think i think stuart's book did a remarkably good job like a just a celebratory good job at describing ai technology and sort of how it works i thought that was great it was really cool to see that in a book yeah i 
think he has some experience writing some books you know that's probably a possible thing he's maybe thought a thing or two about how to explain ai to people yeah yeah that's a really good point um this book so far has been remarkably good at telling the story of the sort of the history the recent history of some of the things that have happened uh this i'm in the first third he said this book is in three thirds the first third is essentially ai fairness and you know implications of ai on society that we're seeing right now and that's been great i mean he's telling the stories really well he's he went out and talked to the frontline people who whose names are associated with some of these ideas and and it's been terrific he says the second half of the book is on reinforcement learning so maybe that'll be fun um and then the third half third third is on uh this is super intelligence alignment problem and i i suspect that that part will be less fun for me to read yeah it's yeah it's it's an interesting problem to talk about i find it to be the most interesting just like thinking about whether we live in a simulation or not as a as a thought experiment to think about our own existence so in the same way talking about alignment problem with agi is a good way to think similarly like the trolley problem with autonomous vehicles it's a useless thing for engineering but it's a it's a nice little thought experiment for actually thinking about what are like our own human ethical systems our moral systems to to to uh by thinking how we engineer these things you start to understand yourself so sci-fi can be good at that too so one sci-fi book to recommend is exhalation by ted chiang a bunch of short stories um this ted chiang is the guy who wrote the short story that became the movie arrival um and all his stories just from a he's he was a computer scientist actually he studied at brown they all have this sort of really insightful bit of science or computer science that drives 
them and so it's just a romp right to just like he creates these artificial worlds with these by extrapolating on these ideas that that we know about but hadn't really thought through to this kind of conclusion and so his stuff is it's really fun to read it's mind warping so i'm not sure if you're familiar i seem to mention this every other word uh is i'm from the soviet union and i'm russian uh read way too much my roots are russian too but a couple generations back well it's probably in there somewhere so maybe we can uh we can pull up that thread a little bit of the existential dread that we all feel you mentioned that you i think somewhere in the conversation you mentioned that you don't really pretty much like dying i forget in which context it might have been a reinforcement learning perspective i don't know i know you know what it was it was in teaching my kids to drive that's that's how you face your mortality yes uh from a human being's perspective or from a reinforcement learning researcher's perspective let me ask you the most absurd question what's uh what do you think is the meaning of this whole thing the meaning of life on this spinning rock i mean i think reinforcement learning researchers maybe think about this from a science perspective more often than a lot of other people right as a supervised learning person you're probably not thinking about the sweep of a lifetime but reinforcement learning agents are having little lifetimes little weird little lifetimes and it's it's hard not to project yourself into their world sometimes but you know as far as the meaning of life so i when i turned 42 you may know from that's a that is a book i read um the historical hitchhiker's guide to the galaxy that that is the meaning of life so when i turned 42 i had a meaning of life party where i invited people over and um everyone shared their meaning of life we they we had slides made up and so we had we all sat down and did a slide presentation to each other 
about the meaning of life and mine mine was balance i think that life is balance and um so the activity at the party for a 42 year old maybe this is a little bit non-standard but i i found all the little toys and devices that i had that where you had to balance on them you had to like stand on it and balance or pogo stick i brought a ripstik which is like a weird two-wheeled skateboard um i got a unicycle but i didn't know how to do it i didn't know how to do it i now can do it i love watching you try yeah i'll send you a video i'm not great but i put but but i managed um and so uh so balance yeah so so my my wife has a really good one that she sticks to and is probably pretty accurate and it has to do with healthy relationships with people that you love and working hard for good causes but to me yeah balance balance in a word that's that that works for me not too much of anything because too much of anything is iffy that feels like uh a rolling stones song i feel like they must be you can't always get what you want but if you try sometimes you can strike a balance yeah i think that's how it goes uh michael i'll write your parody it's a huge honor to talk to you i've been a big fan of yours so um uh can't uh can't wait to see what you do next in the world of uh education the world of parody in the world of reinforcement learning thanks for talking today my pleasure thank you for listening to this conversation with michael littman and thank you to our sponsors simplisafe a home security company i use to monitor and protect my apartment expressvpn the vpn i've used for many years to protect my privacy on the internet masterclass online courses that i enjoy from some of the most amazing humans in history and betterhelp online therapy with a licensed professional please check out the sponsors in the description to get a discount and to support this podcast if you enjoy this thing subscribe on youtube review five stars on apple podcast follow on spotify support it on 
patreon or connect with me on twitter at lex fridman and now let me leave you some words from groucho marx if you're not having fun you're doing something wrong thank you for listening and hope to see you next time