Tuomas Sandholm: Poker and Game Theory | Lex Fridman Podcast #12
b7bStIQovcY • 2018-12-28
Transcript preview
Open
Kind: captions Language: en the following is a conversation with Thomas sent home he's a professor same you and co-creator of lebra's which is the first AI system to be top human players in the game of heads-up No Limit Texas Hold'em he has published over 450 papers on game theory and machine learning including a best paper in 2017 at nips now renamed to new reps which is where I caught up with him for this conversation his research and companies have had wide reaching impact in the real world especially because he and his group not only proposed new ideas but also build systems to prove that these ideas work in the real world this conversation is part of the MIT course on artificial general intelligence and the artificial intelligence podcast if you enjoy subscribe on youtube itunes or simply connect with me on Twitter at Lex Friedman spelled Fri D and now here's my conversation with Thomas sent home can you describe at the high level the game of poker Texas Hold'em heads-up Texas Hold'em for people who might not be familiar at this card game yeah happy to so heads up No Limit Texas Hold'em has really emerged in the AI community as a main benchmark for testing these application independent algorithms for imperfect information game solving and this is a game that's actually played by humans you don't see that much on TV or casinos because well for obvious reasons but you do see it in some expert level casinos and you see it in the best poker movies of all time it's actually an event in the World Series of Poker but mostly it's played online and typically for pretty big sums of money and this is a game that usually only experts play so if you recall to your home game on a Friday night it probably is not gonna be hits up no Limit Texas Hold'em it might be no let me it takes us Hold'em in some cases but typically for a big group and it's not as competitive well heads up means it's two-player so it's really like me against you Am I you better much like chess or or or go in that sense but an imperfect information game which makes it much harder because I have to deal with issues of you knowing things that I don't know and I know things that you don't know instead of pieces being nicely laid on the board for both of us to see so in Texas Hold'em there's a two cards that you only see the game on to you yeah there is they gradually lay out some cards that add up overall to five cards that everybody can see yeah the imperfect nature of the information is the two cards that you're holding on front yeah so as you said you know you first get two cards in private each and then you this a betting round then you get three clubs in public on the table then there's a betting round then you get the fourth card in public on the table they're spitting around then you get the five fifth card on the table there's a bending drop so there's a total of four betting rounds and four torrontés of information revelation if you will the only the first tranche is private and they omits public from there and this is probably probably by far the most popular game in AI and just the general public in terms of imperfect information so it's probably the most popular spectator game to watch right so which is why it's a super exciting game tackle so it sits on the order of chess I would say in terms of popularity in terms of AI setting it as the bar of what is intelligence so in 2017 labret does how do you pronounce it Liberato lebra das lebra does beats little laughing they're a little bit Latin LeBron is beat a few for expert human players can you describe that event what you learned from it what was it like what was the process in general for people who have not read the papers and study yeah so the event was that we invited four of the top 10 players with these are specialist players in heads-up no Limit Texas Hold'em which is very important because this game is actually quite different than the the multiplayer version we brought me in to Pittsburgh to play at the reverse casino for twenty days we wanted to get a hundred and twenty thousand hands in because we wanted to get statistical significance so it's a lot of hands for humans to play even for this top pros who play fairly quickly normally so we couldn't just have one of them play so many hands twenty days they were playing basically morning to evening and he raised two hundred thousand as a little incentive for them to play and the setting was so that they didn't all get fifty thousand we actually paid them out based on how they did against the AI each so they had an incentive to play as hard as they could whether they're way ahead the way behind or right at the mark of beating the AI and you don't make any money unfortunately right no we can't make any money so so originally a couple of years earlier I actually explored whether we could actually play for money because that would be of course interesting as well to play against the top people for money but the Pennsylvania Gaming Board said no so so if we couldn't so this is much like an exhibit like for a musician or a boxer or something like that nevertheless you're keeping track of the money and brought us one close to two million dollars I think so so if there if it was for real money if you were able to earn money that was a quite impressive and inspiring achievement just a few details what what were the players looking at I mean were they behind the computer what was the interface like yes there they were playing much like they normally do these top players when they play this game they play mostly online so they used to playing through what UI yes and they did the same thing here so there was this layout you could imagine there's a table on the screen this the the human sitting there and then there's the AI sitting there and the the screen source everything is happening the cards coming out and so the bets being made and we also had the betting history for the human so if the human for what what had happened in the ham so far they could actually reference back and and and so forth is there a reason they were given access to the betting history for well we just uh it's a it didn't really matter that they wouldn't have forgotten anyway these are top quality people but we just want to put out there so it's not a question for human for getting and the AI somehow trying to get advantage of better memory so what was that like I mean that was an incredible accomplishment so what did it feel like before the event did you have doubt hope where was your confidence at yeah that's great so a great question so eighteen months earlier I had organized the similar brains versus AI competition with our previous a I call clerical and we couldn't beat the humans so this time around it was only eighteen months later and I knew that this new AI Lovato's was way stronger but it's hard to say how you'll do against the top humans before you try so I thought we had about a 50/50 shot and the international betting sites put us a us as a four to one or five to one underdog so it's kind of interesting that people really believe in people and I get over AI not just people people don't just believe over believing themselves but they have overconfidence in other people as well compared to the performance of AI and yeah so we were afford to 105 to 108 beating the humans in a row we were still 50/50 on the international betting sites do you think there's something special and magical about poker and in the way people think about it in a sense you have I mean even in chess there's no Hollywood movies poker is this the star of many movies and there's this feeling that certain human facial expressions and body language eye movement all these tells are critical to poker you can look into somebody's soul understand their betting strategy and so on there so that's probably why the possibly do you think that is why people have a confidence that humans will outperform because AI systems cannot in construct perceive these kinds of tells they're only looking at betting patterns and and nothing else the betting patterns and and statistics so what's more important to you if you step back and human players human versus human what's the role these tells of these ideas that we romanticize yeah so I split it into two parts so one is why do humans trust he much more than AI and all have overconfidence in humans yes I think that's that's not really related to tell a question it's just that they've seen these top players how good they are and they're really fantastic so it's just hard to believe therefore that the Navy I could beat them yeah so I think that's where that comes from and and that's actually maybe a more general lesson about the AI that until you've seen it over perform a human it's hard to believe it it could but then the tails a lot of these top players they're so good at hiding tails that among the top players it's actually not really worth it for them to invest a lot of effort trying to find tails in each other because there's a so good at hiding them so yes at the kind of Friday evening game tells are gonna be a huge thing you can read other people and if you're a good reader you you'll read them like an open book but at the top levels of poker no details become a list of the much much smaller and smaller aspect of the game as you go to the top levels the the amount of strategies the amount of possible actions is is very large ten to the power of one hundred plus so there has to be some I've read a few the papers related it has it has to form some abstractions of various hands and actions so what kind of abstractions are effective for the game of poker yeah so you're exactly right so when you go from a game tree that's ten to the 161 especially in an imperfect information game it's way too large to solve directly even with our fastest ik finding algorithms so you wanna abstract it first and abstraction in games is much trickier than abstraction in mdps or other single agent settings because you have these abstraction pathologies that if I have a finer grained abstraction the strategy that I can get from that for the real game might actually be worse than the strategy I can get from the coarse-grained abstraction if you have to be very careful now the the kinds of abstractions just to zoom out we're talking about there's the hands abstractions and then there's betting strategies yeah what I think actions yeah baiting access or so there's information obstruction to talk about general games information abstraction which is the abstraction of what chance does and this would be the cards in the case of poker and then there's action abstraction which is abstracting the actions of the actual players which would be bits in the case of poker yourself and the other players yes yourself and other players and for information abstraction we were completely automated so these were these are algorithms but they do what we call potential aware abstraction where we don't just look at the value of the hand but also how it might materialize in the good or bad hands over time and it's a certain kind of bottom-up process with integer programming there and clustering and various aspects how do you build build this abstraction and then in the action abstraction there it's largely based on how humans other and other AIS have played this game in the past but in the beginning we actually use an automated action abstraction technology which is provably convergent that it finds the optimal combination of eight sizes but it's not very scalable so we couldn't use it for the whole game but we used it for the first couple of betting actions so what's more important the strength of the hand so the information retraction or the how you play them the actions does you know the romanticized notion again is that it doesn't matter what hands you have that the actions the betting may be the way you win no matter what hands you have yeah so that's why you have to play a lot of hands so that the role of luck gets smaller so you could otherwise get lucky and get some good hands and then you're gonna win the match even with thousands of hands you can get lucky because there's so much variance in No Limit Texas Hold'em because if we both go all-in it's a huge stack or variant so there are these massive swings in No Limit Texas Hold'em so that's why you have to play not just thousands but over a hundred thousand hands don't get statistical significance let me ask another way this question if you didn't even look at your hands but they didn't know that the your opponents didn't know that how well would you be able to do oh that's a good question there's actually I heard this story that this is Norwegian female poker player goal and at uber stud who's actually won a tournament by doing exactly that but that would be extremely rare so so I cannot really play well the hands do have some role to play oh yes so LeBron is does not use as far as I understand a used learning methods deep learning is there room for learning in you know there's no reason why lab artist doesn't you know combined with an alphago type approach for estimating the quality for function estimator what are your thoughts on this maybe as compared to another algorithm which I'm not that familiar with deep stack the the engine that does use deep learning that it's unclear how well it does but nevertheless uses deep learning so what are your thoughts about learning methods to aid in the way that teller Broadus plays the game of poker yeah so as you said Lee barratto's did not use learning methods and played very well without them since then we have actually actually here we have a couple of papers on things that do use learning technique Saxon so and deep learning in particular and the sort of the way you're talking about where it's learning an evaluation function but in imperfect information games unlike let's say in Co or now now also in chess and shogi it's not some sufficient to learn an evaluation for a state because the value of an information set depends not only on the exact state but it also depends on both players beliefs like if I have a bad hand I'm much better off if the opponent thinks I'm have a good hand and vice versa if I have a good hand I'm much better off if the opponent believes I have a bad hand so the value of a state is not just a function of the cards it depends on if you will the path of play but only to the extent that is captured in the belief distributions so so that's why it's not as simple as as it is imperfect information games another one I'd say it's simple there either it's of course very complicated computationally there too but at least conceptually it's very straightforward there's a state there's an evaluation function you can try to learn it here you have to do something more and what we do is in one of these papers we're looking at allowing where we allow with the opponent to actually take different strategies at the leaf of the search tree as F if you will and and that is a different way of doing it and it doesn't assume therefore a particular way that the opponent plays but it allows opponent to choose from a set of different continuation strategies and that forces us to not be too optimistic in our local head search and that's that's one way you can do sound look ahead search in imperfect information games which is very different difficult and in us you were asking about deep stack what they did it was very different than what we do either in Lee brothers or in this new work they were gender and Umrah generating various situations in the game then they were doing Luca head from there to the end of the game as if that was a start of a different game and then they were using deep learning to learn those values of those states but the states were not just the physical states they include the belief distributions when you talk about look ahead for deep stack or with libertas does it mean considering every possibility that the game can involve is that we're talking about extremely sort of like this exponentially growth of a tree yes so we're talking about exactly that much like you do in Alpha Beta search or want to crawl to research but with different techniques so there's a different search algorithm and then we have to deal with the leaves differently so if you think about what Lee brothers did we didn't have to worry about this because we only did it at the end of the game so we would always terminate into a real situation and we would know what to payout this it didn't do this depth limited loka heads but now in this new paper which is called depth limited I think it's called depth limited research for imperfect information games we can actually do sound depth limited look at it so we can actually started with a look ahead from the beginning of the game on because that's too complicated to do for this whole long game so in Lee brothers we were just doing it for the end so and then the other side this belief distribution so is it explicitly modeled what kind of beliefs that the opponent might have yeah yeah it is explicitly modeled but it's not assumed the beliefs are actually output not input of course the starting beliefs are input but they just fall from the rules of the game because we know that the dealer deals uniformly from the dick so I know that every pair of cards that you might have is equally likely I know that for a fact that's as follows from the rules of the game of course except the two cards that I have I know you don't have those yes you have to take that into account that's called card removal and that's very important is the dealing always coming from a single deck in the heads up so you can assume single deck know that if some if if I have the ace of spades I know you don't have an ace of spades okay so in the beginning your belief is basically the fact that it's a fair dealing of hands but how do you adjust start to adjust that belief well that's a where this beauty of games here it comes so nash equilibrium which john nash introduced in 1950 introduces what rational play is when you have more than one player and these are pairs of strategies where strategies are contingency plans one for each player so neither player wants to deviate to a different strategy given that the other doesn't deviate but as a side effect you get the beliefs from Bayes rule so Nash equilibrium really isn't just deriving in these imperfect information games Nash equilibrium doesn't just define strategies it also defines beliefs for both us and it defines beliefs for each state so at the each state it's if they take all information sets at each information set in the game there's a set of different states that we might be in but I don't know which one we're in Nash equilibrium tells me exactly what is a probability distribution over those real world states in my mind how does naturally give you that distribution so why I'll do a simple example so you know the game rock-paper-scissors so we can draw it as player 1 moves first and then player 2 moves but of course it's important that player 2 doesn't know what player 1 moved otherwise player 2 would win every time so we can draw that as an information set where player 1 makes one of three moves first and then there's an information set for player 2 so player 2 doesn't know which of those nodes the world is it but once we know the strategy for player 1 Nash equilibrium will say that you play 1/3 Rock 1/3 paper 1/3 caesars from that I can derive my beliefs of the information set that they wanted 1/3 wants it though so Bayes gives you that basis you but is that specific to a particular player or is it is there something you quickly update with the game theory isn't really player specific so that's what also why we don't need any data we don't need any history how these particular humans played in the past or how any AI or even had played before it's all about rationality so we just think the AI just thinks about what would a rational opponent do and what would I do if I were right I am rational and that that's that's the idea of game theory so it's really a data free opponent free approach sir comes from the design of the game as opposed to the design of the player exactly if there's no opponent modeling per se I mean we've done some work on combining opponent modeling with game theory so you couldn't exploit weak players even more but that's another strand and in the Lee brothers we didn't turn that on because I decided that these players are too good and when you start to exploit an opponent you'll typically open yourself up self up to exploitation and these guys have so few holes to exploit and they're world's leading experts in counter exploitation so I decided that we're not gonna turn that stuff on actually I saw a few papers exploiting opponents it sound very interesting to explore do you think there's room for exploitation generally outside of LeBron us is is there subject or people differences that could be exploited maybe not just in poker but in general interactions negotiations all these other domains that yours considering yeah I definitely we've done some work on that and I really like their work at hybridize is the two so you figure out what would a rational opponent do and by the way that's safe in these zero-sum games two player zero-sum games because if the opponent does something irrational yes it might show throw off my beliefs but the amount that the player can gain by throwing off my belief is always less than they lose by playing poorly so so it's safe but still if somebody's weak as a player you might want to play differently to exploit them more so that you can think about it this way a game theoretic strategies are unbeatable but it doesn't maximally beat the other opponent so the winnings per hand might be better with a different strategy and the hybrid is that you start from a game theoretic approach and then as you gain data from about the opponent in certain parts of the game tree that in those parts of the game tree you start to tweak your strategy more and more towards exploitation while still staying fairly close to the game theoretic strategy so as to not open yourself up to exploitation too much how do you do that do you try to vary up strategies make it unpredictable it's like what is it tit-for-tat strategies in prisoner's dilemma or well it doesn't that that's a repeated game kind of prisoner's dilemma repeats it games but but even there there's no proof that says that that's the best thing but experimentally it actually does does does well so what kind of games are there first of all I don't know if this is something that you could just summarize there's perfect information games or all the informations on the table there is imperfect information games there's repeated games you play over and over there's zero-sum games there's nonzero-sum games yeah and then there's a really important distinction you're making two-player versus more players so what are what other games out there and what's the difference for example with this two-player game versus more players yeah what are the key differences right here so let me start from the the basic so a repeated game is a game where the same exact game is played over and over in these extensive form games where think about three form maybe with these information says to represent incomplete information you can have kind of repetitive interactions even repeated games are a special case of that by the way but if the game doesn't have to be exactly the same selectively sourcing all trips yes we kind of see it the same supply base year to year but what I'm buying is a little different every time and the supply base is a little different every time and so on so it's not really repeated so to find a purely repeated game is actually very rare in the world so they're really a very coarse model of what's going on then if you move up from repeat just repeated simple repeated matrix games not all the way to extensive form games but in between they're stochastic games where you know this these think about it like these little matrix games and when you take an action and your home takes an action they determine not which next state I'm going to next game I'm going to but the distribution over next games where I might be going to so so that's the stochastic game but it's like matrix games repeated stochastic games extensive form games that is from less to more general and and poker is an example of the last one so it's really the most general setting extensive form games and that's kind of what the AI community has been working on and being benched marked on with this heads-up No Limit Texas Hold'em can you describe extensive form games what was the motto here yeah so if you imagine with the tree form so it's really the tree form like in chess there's a search tree versus a matrix is a matrix yeah and that's the new matrix is called the matrix form or by matrix form or normal form game and here you have the tree form so you can actually do certain types of reasoning there that you'll lose the information when you go to normal form there's a certain form of equivalence like if you go from three form and you say it every possible contingency plan is the strategy then I can actually go back to the normal form but I lose some information from the lack of sequentiality then the multiplayer versus two-player distinction is an important one so two-player games in zero-sum are conceptually easier and computationally easier there's still huge like this one this one but they're conceptually easier and computationally easier in that conceptually you don't have to worry about which equilibrium is the other guy going to play when there are multiple because any equilibrium strategy is a best response to any other equilibrium strategy so I can play a different equilibrium from you and we'll still get the right values of the game that falls apart even with two players when you have general some games even without cooperation just even without cooperation so there's a big gap from two player zero-sum to two-player general sum or even to three player zero-sum that's that's a big gap at least in theory can you maybe not mathematically provide the intuition why it all falls apart with three or more players it seems like you should still be able to have a Nash equilibrium that yeah that's instructive that holds okay so it is true that all finite games have a Nash equilibrium so this is what your Nash actually proved so they do have a Nash equilibrium that's not a problem the problem is that there can be many and then there's a question of which equilibrium to select so and if you select your strategy from a different equilibrium and I select mind then did what does that mean I and in this non zero sum games we may lose some joint benefits we hope by being just simply stupid we could actually both be better off if we did something else yes and in three player you get other problems also like collusion that maybe you and I can get up on a third player and we can do radically better by colluding so that there are lots of issues that come up there so no Brown student you workers on this has mentioned I looked through the AMA and read it he mentioned that the ability of poker players to collaborate will make the game he was asked the question of how would you make the game of poker or both of you were asked the question how would you make the game of poker beyond being solvable by current AI methods and he said that there's not many ways of making poker more difficult but collaboration or cooperation between players would make it extremely difficult so can you provide the intuition behind why that is if you agree with that idea yeah so we've done a lot of work coalitional games and we actually have a paper here with my other student cappella Farina and some other collaborators on after net nips on that actually just came back from the poster session where we present life so when you have a collusion it's a it's a different problem yes and it typically gets even harder then even the game representations some of the game representations don't really allow go to computation so we actually introduced a new game representation for for that is that kind of cooperation part of the model is are you do you have do you have information about the fact that other players are cooperating or is it just this chaos that where nothing is known so there's some something's unknown can you give an example of a collusion type game or Z you select breach that so think about bridge it's like when you and I are on a team our payoffs are the same the problem is that we can't talk so so when I get my cards I can't whisper to you what my cards are that would not be allowed so we have to somehow coordinate our strategies ahead of time and only ahead of time and then there are certain signals we can talk about but they have to be such that the other team also understands them so so that that's that's an example where the coordination is already built into the rules of the game but in many other situations like auctions or negotiations or diplomatic relationships poker it's not really built-in but it still can be very helpful for the coders I've read you right somewhere the negotiations you come to the table with prior like a strategy that like that you're willing to do and not willing to do those kinds of things so how do you start to now moving away from poker movie beyond poker into other applications like negotiations how do you start applying this to other and to other domains yeah even real world domains that you've worked on yeah I actually have two start-up companies doing exactly that one is called strategic machine and that's for kind of build applications gaming sports all sorts of things like that any applications of this to business and to sports and to gaming to various types of things for in finance electricity markets and so on and the other is called strategy robot where we are taking this to military secure the cyber security and intelligence applications I think you worked a little bit in how he put it advertisement sort of suggesting ad kind of thing yeah auction that's another component optimized markets optimized but that's much more about a combinatorial market and optimization based technology that's not using these game theoretic reasoning technologies I think okay so what sort of high level do you think about our ability to use game theoretic concepts to model human behavior do you think do you think human behavior is amenable to this kind of modeling so outside of the poker games and where have you seen it done successfully in your work I'm not sure the goal really is modeling humans like for example if I'm playing a zero-sum game yes I don't really care that the opponent is actually following my model of rational behavior because if they're not that's even better for me right so so they see with the opponents and games there's a the prerequisite is that you've formalized the interaction in some way that can be amenable to analysis and you've done this amazing work with mechanism design designing games that have certain outcomes but so I'll tell you an example for my for my world of autonomous vehicles right we're studying pedestrians and pedestrians and cars negotiating this nonverbal communication there's this weird and game dance of tension where pedestrians are basically saying I trusted you won't kill me and so as a jaywalker I will step onto the road even though I'm breaking the law and there's this tension and the question is we really don't know how to model that well in trying to model intent and so people sometimes bring up ideas of game theory and so on do you think that aspect of human behavior can use these kinds of imperfect information approaches modeling how do we how do you start to attack a problem like that when you don't even know how the game design the game to describe the situation in order to solve it okay so I haven't really thought about jaywalking but one thing that I think could be a good application in an autonomous vehicles is the following so let's say that you have fleets of autonomous cars operated by different companies so maybe here's the way more fleet and here's the uber fleet if you think about the rules of the road they define certain little rules but that still leaves a huge strategy space open like as a simple example when cars merge you know how he must merge you know they slow down and look at each other and try to I try to merge wouldn't it be better if these situations would all repeat pre-negotiated so we can actually merge at full speed and we know that this is the situation this is how we do it and it's all gonna be faster but there are way too many situations to negotiate manually so you could do use automated negotiation this is the idea at least you could use automated negotiation to negotiate all of these situations or many of them in advance and of course it might be that hey maybe you're not gonna always let me go first maybe you said okay well in these situations all let you go first but in exchange you're gonna give me - how much you're gonna let me go first in this situation yes so it's this huge combinatorial negotiation and do you think there's room in that example of merging to model this whole situation is an imperfect information game or do you really want to consider it to be a perfect no that's a good question yeah that's a good question I'm paid the price of assuming that you don't know everything yeah I don't know it's certainly much easier games with perfect information are much easier so if you can get away with it you should but if the real situation is of imperfect information then you're going to have to deal with in for imperfect information great so what lessons have you learned the annual computer poker competition an incredible accomplishment of AI you know you look at the history of deep blue go these kind of moments when I stepped up in an engineering effort and a scientific effort combined to beat the best human players so what do you take away from this whole experience what have you learned about designing it has systems that play these kinds of games and what does that mean for sort of AI in general for the future of IAI development yeah so that's a good question so there's so much to say about it I do like this type of performance oriented research although in my group we go all the way from like idea to theory to experiments to big system fielding the commercialization so we spend that spectrum but I think that in a lot of situations in AI you really have to build the big systems and evaluate them at a scale before you know what works and doesn't and we've seen that in the computational game theory community that there are a lot of techniques that look good in the small but then they cease to look good in the large and we've also seen that there are a lot of techniques that look superior in theory and I really mean in terms of convergence rates better like first-order methods better convergence rates like the CFR based based algorithms yet the CFR pay based algorithms are the fastest in practice so it really tells me that you have to test this in reality the theory isn't tight enough if you will to tell you which algorithms are better than the others and you have to look at these things that in the large because any sort of projections you do from the small and at least in this domain be very misleading so that that's kind of from from a kind of science and engineering perspective from personal perspective it's been just a wild experience in that with the first poker competition the first or first brains versus AI man-machine poker competition that we organized there had been by the way for other poker games there had been previous competitions but this was for heads up No Limit this was the first and I probably became the most hated person in the world of Poker and I didn't mean to III size that they cracked in the game for yeah it was a lot of people felt that it was a real threat to the whole game the whole existence of the game if AI becomes better than humans people would be scared to play poker because there are the superhuman AI is running around taking their money and you know all of that so so I just it's just really aggressive just in the comments were super aggressive I got everything it's just short of death threats do you think the same was true for chess because right now they just completed the World Championships and chess and humans just started ignoring the fact that there's AI systems now that I'll perform humans and they still enjoy the game is still a beautiful game that's what I think yeah and I think the same thing happens in poker and so I didn't think of myself as somebody was gonna kill the game and I don't think I did yeah I've really learned to love this game I wasn't a poker player before but learn so many new ones is about it from these AIS and they've really changed how the game is played by the way so they have these very Martian ways of playing poker and the top humans are now incorporating those types of strategies into their own play so if anything to me our work has made poker a richer more interesting game for humans to play not something that is gonna steer him as away from it entirely just a quick comment and something you said which is if I may say so in academia is a little bit rare sometimes it's pretty brave to put your ideas to the test in the way you described saying that sometimes good ideas don't work when you actually try to apply them at scale and so where does that come from I mean what if you could do a advice for people what what drives you in that sense were you always this way I mean it takes a brave person I guess is what I'm saying to test their ideas and to see if this thing actually works against human top human players and so on yeah I don't know about brave but it takes a lot of work it takes a lot of work and a lot of time to organize do make something big and to organize an event and stuff like that and what drives you in that effort because you could still I would argue get a best paper award and nips as you did in 17 without doing this that's right yes and so so in general I believe it's very important to do things in in the real world and at scale and that's really where the the the pudding if you will proof is in the pudding that's what that's where it is in this particular case it was kind of a competition between different groups and for many years as to who can be the first one to beat the top humans that heads up No Limit Texas Hold'em so it became it became kind of a like a competition who can get there yeah so a little friendly competition could be I can do wonders for progress yes so the topic of mechanism design which is really interesting also kind of new to me except as an observer if I don't know politics and any I'm an observer of mechanisms but you write in your paper an automated mechanism design that I quickly read so mechanism design is designing the rules of the game so you get a certain desirable outcome and you have this work on doing so in an automatic fashion as opposed to fine-tuning it so what have you learned from those efforts if you look say I don't know at complex it's like our political system can we design our political system to have in an automated fashion to have outcomes that we want can we design something like traffic lights to be smart where it gets outcomes that we want so what are the lessons you draw from that work yeah so I still very much believe in the automated mechanism design direction yes but it's not a panacea there are impossibility results in mechanism design saying that there is no mechanism that accomplishes objective X in Class C so so they it's not gonna there's no way using any mechanism design tools manual or automated to do certain things in mechanism design he can't describe that again so meaning there it's impossible to achieve that yeah yes it was likely impossible so so so these are these are not statements about human ingenuity who might come up with something smart these are proofs that if you wanna accomplish properties X in Class C that is not to oppose with any mechanism the good thing about automated mechanism design is that we're not really designing for a class we're designing for specific settings at the time so even if there's an impossibility result for the whole class it just doesn't mean that all of the cases in the class are impossible it just means that some of the cases are impossible so we can actually carve these islands of possibility within these known impossible classes and we've actually done that so what one of the famous results in mechanism design is a Meyer sham set its weight theorem for pi Roger Myerson and Mark Satterthwaite from 1983 so it's an impossibility of efficient trade under imperfect information we show that you can in many settings avoid that and get the efficient trade anyway depending on how they design the game okay so depending how you design the game and of course it's not it doesn't in any way any way contradict to impossibility result or impossibility results is still there but it just finds spots within this impossible class where in those spots you don't have time possibility sorry if I'm going a bit philosophical but what lessons you draw towards like I mentioned politics or human interaction and designing mechanisms for outside of just these kinds of trading or auctioning or purely formal games our human interaction like a political system what how do you think it's applicable to yeah politics or to business to negotiations these kinds of things designing rules that have certain outcomes yeah yeah I do think so have you seen success that successfully done yes and really oh you mean mechanism design or automated make automated mechanism design but so so mechanism design itself has had fairly limited success so far there are certain cases but most of the real-world situations are actually not sound from a mechanism design perspective even in those cases where they've been designed by very knowledgeable mechanism design people the people are typically just taking some insights from the theory and applying those insights into the real world rather than applying the mechanisms directly so one famous example of is the FCC spectrum auctions so I've also had a small role in that and very good economists have been where excellent economists have been working on that with no game theory yet the rules that are designed in practice they're they're such that bidding truthfully is not the best strategy usually mechanism design we try to make things easy for the participants so telling the truth is the best strategy but but even in those very high stakes auctions where you have tens of billions of dollars worth of expect from being auctioned truth-telling is not the best strategy and by the way nobody knows even a single optimal bidding strategy for those auctions what's the challenge of coming up with an optimum because there's a lot of players and there's a lot of players but many items for sale and the these mechanisms are such that even with just two items or one item bidding truthfully wouldn't be the best strategy if you look at the history of AI it's marked by seminal events and alphago being a world champion human go player I would put librettist winning the heads of no-limit hold'em as one of such event thank you and what do you think is the next such event whether it's in your life or in the broadly AI community that you think might be out there that would surprise the world so that's a great question and I don't really know the answer in terms of game solving hits up No Limit Texas Hold'em really was the one remaining widely agreed-upon benchmark so that was the big milestone now are there other things yes certainly there are but there there is not one that the community has kind of focused on so what could be other things there are groups working on StarCraft there are groups working on dota2 these are video games yes or you could have like diplomacy or Hanavi you know things like that these are like recreational games but none of them are really acknowledged that's kind of the main next challenge problem like chess or go or heads-up No Limit Texas Hold'em was so I don't really know in the game solving space what is or what will will be the next benchmark I hope kind of hope that there will be a next benchmark because really the different groups working on the same problem really drove these application independent techniques for put very quickly over ten years do you think there's an open problem that excites you that you start moving away from games into real world games like say the stock market trading yeah that's that's kind of how I am so I am probably not going to work as hard on these recreational benchmarks I'm doing to startups on game solving technology strategic machine and strategy robot and we're really interested in pushing this stuff into practice what do you think would be really you know a powerful result that would be surprising that would be if you can say I mean you know five years ten years from now something that statistically you would say is not very likely but if there's a breakthrough what achieve yeah so I think that overall we're in a very different situation in game theory than we are in let's say machine learning yes so in machine learning it's a fairly mature technology and it's very broadly applied and proven success in the real world in game solving there are almost no applications yet we have just become superhuman which machine learning you could argue happened in the 90s if not earlier and at least some supervised learning at certain complex supervised learning applications now I think a next challenge problem I know you're not asking about this way you're you're asking about the technology breakthrough but I think the big big breakthrough is to be able to show it hey maybe most of let's say military planning or most of business strategy will actually be done strategically using computational game theory that that's what I would like to see as a next five or ten year goal maybe you can explain to me again forgive me if this is an obvious question but you know machine learning methods neural networks are suffer from not being transparent not being explainable a game theoretic methods you know Nash equilibria do they generally when you see the different solutions are they when you talk about military operations are they once you see the strategies do they make sense that they explainable or do they suffer from the same problems as neural networks do so that's that's a good question I would say a little bit yes and no and what I mean by that is that these games are ethic strategies let's say Nash equilibrium it has provable properties so it's unlike let's say deep learning where you kind of cross your fingers hopefully it'll work and then after the fact when you have the weights you're still crossing your fingers and I hope it will work here you know that the solution quality is there this provable or Souls from quality guarantees now that doesn't necessarily mean that the strategies are human understandable that's a whole other problem so that's also I think it deep learning and computational game theory are in the same boat in that sense that both are difficult to understand but at least the game theoretic techniques they have this guarantees of guarantee quality so did you see business operations to achieve your corporations or even military in the future being at least the strong candidates being proposed by automated systems do you see that yeah I do I do but that's more of a really belief than a substantiated fact depending on where you land and optimism or pessimism that's a relief to me that's an exciting future especially if they're provable things in terms of optimality so looking into the future there's a a few folks worried about the especially you look at the game of poker which is probably one of the last benchmarks in terms of games being solved they they worry about the future and the existential threats of artificial intelligence so the negative impact in whatever form on society is that something that concerns you as much are you more optimistic about the positive impacts of AI oh I am much more optimistic about the positive impacts so just in my own work what we've done so far we run the nationwide kidney exchange hundreds of people are walking around alive today who would it be and it's increased employment you had you have a lot of people now running kidney changes and at the transplant centers interacting with the kidney exchange you have extra surgeons nurses anesthesiologists hospitals all of that as so so employment is increasing from that and the world is becoming a better place another example is combinatorial sourcing auctions we did 800 large-scale combinatorial sourcing auctions from 2001 to 2010 in a previous startup of mine called combine it and we increased the supply chain efficiency on that sixty billion dollars of spend by twelve point six percent so that's over six billion dollars of efficiency improvement in the world and this is also like shifting value from somebody to somebody else just efficiency improvement like in trucking less empty driving so there's less waste less carbon footprint and so on it's a huge positive impact in the near term but sort of to stay in it for a little longer because I think game theory is a role to play here well let me actually come back and tell you this is one thing I think Asia is also going to make the world much safer so so so that's another aspect that often gets overlooked well
Resume
Categories