Transcript
naed4C4hfAg • David Patterson: Computer Architecture and Data Storage | Lex Fridman Podcast #104
The following is a conversation with David Patterson, Turing Award winner and professor of computer science at Berkeley. He's known for pioneering contributions to RISC processor architecture, used by 99% of new chips today, and for co-creating RAID storage. The impact that these two lines of research and development have had on our world is immeasurable. He's also one of the great educators of computer science in the world. His book with John Hennessy is how I first learned about and was humbled by the inner workings of machines at the lowest level. Quick summary of the ads: two sponsors, the Jordan Harbinger Show and Cash App. Please consider supporting the podcast by going to jordanharbinger.com/lex and downloading Cash App and using code LEXPODCAST. Click on the links, buy the stuff. It's the best way to support this podcast and, in general, the journey I'm on in my research and startup. This is the Artificial Intelligence podcast. If you enjoy it, subscribe on YouTube, review it with five stars on Apple Podcasts, support it on Patreon, or connect with me on Twitter at Lex Fridman, spelled without the e, just F-R-I-D-M-A-N.
As usual, I'll do a few minutes of ads now and never any ads in the middle that can break the flow of the conversation. This episode is supported by the Jordan Harbinger Show. Go to jordanharbinger.com/lex; it's how he knows I sent you. On that page there are links to subscribe to it on Apple Podcasts, Spotify, and everywhere else. I've been binging on this podcast; it's amazing. Jordan is a great human being. He gets the best out of his guests, dives deep, calls them out when it's needed, and makes the whole thing fun to listen to. He's interviewed Kobe Bryant, Mark Cuban, Neil deGrasse Tyson, Garry Kasparov, and many more. I recently listened to his conversation with Frank Abagnale, author of Catch Me If You Can and one of the world's most famous con men. Perfect podcast length and topic for a recent long-distance run that I did. Go to jordanharbinger.com/lex to give him my love and to support this podcast. Subscribe also on Apple Podcasts, Spotify, and everywhere else. This show is presented by Cash App, the greatest sponsor of this podcast ever and the number one finance app in the App Store. When you get it, use code LEXPODCAST. Cash App lets you send money to friends, buy Bitcoin, and invest in the stock market with as little as one dollar. Since Cash App allows you to buy Bitcoin, let me mention that cryptocurrency, in the context of the history of money, is fascinating. I recommend The Ascent of Money as a great book on this history; the audiobook is also amazing. Debits and credits on ledgers started around 30,000 years ago, the US dollar was created over two hundred years ago, and the first decentralized cryptocurrency was released just over ten years ago. So given that history, cryptocurrency is still very much in its early days of development, but it's still aiming to, and just might, redefine the nature of money. So again, if you get Cash App from the App Store or Google Play and use the code LEXPODCAST, you get ten dollars, and Cash App will also donate ten dollars to FIRST, an organization that is helping to advance
robotics and STEM education for young people around the world. And now, here's my conversation with David Patterson. Let's start with the big historical question: how have computers changed in the past 50 years, at both the fundamental architectural level and in general, in your eyes? Well, the biggest thing that happened was the invention of the microprocessor, so computers that used to fill up several rooms could fit inside your cell phone. And not only did they get smaller, they got a lot faster, so they're a million times faster than they were 50 years ago, and they're much cheaper, and they're ubiquitous. You know, there are 7.8 billion people on this planet; probably half of them have cell phones. It's just remarkable. There are probably more microprocessors than there are people? Sure, I don't know what the ratio is, but I'm sure it's above one. Maybe it's ten to one or some number like that. What is a microprocessor? So a way to say what a microprocessor is, is to tell you what's inside a computer. A computer forever has classically had five pieces. There's input and output, which, kind of naturally as you'd expect: input is like speech or typing, and output is displays. There's a memory, and like the name sounds, it remembers things. It's integrated circuits whose job is: you put information in, and when you ask for it, it comes back out. That's memory. And the third part is the processor, where the term microprocessor comes from, and that has two pieces as well: the control, which is kind of the brain of the processor, and what's called the arithmetic unit, which is kind of the brawn of the computer. So if you think of it as a human body, the arithmetic unit, the thing that does the number crunching, is the body, and the control is the brain. Those five pieces, input, output, memory, arithmetic unit, and control, have been in computers since the very dawn, and the last two are considered the processor. So a microprocessor simply means a processor that fits
on a microchip, and that was invented, you know, about 50 years ago, the first microprocessor. It's interesting that you refer to the arithmetic unit as connected to the body, and the control as the brain. I guess I never thought of it that way; it's a nice way to think of it, because most of the actions the microprocessor does, in terms of literally, sort of, computation? The microprocessor does computation: it processes information, and most of what it does is basically arithmetic operations. What are the operations, by the way? It's a lot like a calculator, you know. So there are add instructions, subtract instructions, multiply and divide. And kind of the brilliance of the invention of the computer, or the processor, is that it performs very trivial operations, but it performs billions of them per second, and what we're capable of doing is writing software that can take these very trivial instructions and have them create tasks that can do things better than human beings can do today. Just looking back through your career, did you anticipate how good we would be able to get at doing these small, basic operations? Or were there many surprises along the way, where you just kind of sat back and said, wow, I didn't expect it to go this fast, this good? Well, the fundamental driving force is what's called Moore's Law, which was named after Gordon Moore, who's a Berkeley alumnus. He made this observation very early in what are called semiconductors, and semiconductors are these ideas that you can build these very simple switches and put them on these microchips. He made his observation over 50 years ago: he looked at a few years of data and said, I think what's going to happen is the number of these little switches, called transistors, is going to double every year for the next decade. He said this in 1965, and in 1975 he said, well, maybe it's going to double every two years, and that, which other people have since named Moore's Law, guided the
industry. And when Gordon Moore made that prediction, he wrote a paper and said not only is this going to happen, he wrote what would be the implications of it, and in this article from 1965 he shows ideas like computers being in cars and computers being in something that you would buy in the grocery store, and stuff like that. So he not only called his shot, he called the implications of it. So if you were in the computing field, and if you believed Moore's prediction, he kind of said what would be happening in the future. So in one sense, this is what was predicted, and you could imagine, if you believed Moore's Law was going to continue, these would be the implications. On the other side, there are these shocking events in your life. Like, I remember driving in Marin, across the bay from San Francisco, and seeing a bulletin board at a local civic center, and it had a URL on it. For the people at the time, these first URLs, you know, the www-slash stuff with the HTTP, it looked like alien writing, right? They'd see these advertisements and commercials or bulletin boards that had this alien writing on them. So for the lay people, it was like, what the hell is going on here? And for those of us in the field, it was, oh my god, this stuff is getting so popular it's actually leaking out of our nerdy world and into the real world. So there were events like that. I think another one was, I remember in the early days of the personal computer, when we started seeing advertisements in magazines for personal computers, it was so popular that it made the newspapers. So on one hand, you know, Gordon Moore predicted it, and you kind of expected it to happen, but when it really hit and you saw it affecting society, it was shocking. So maybe taking a step back and looking at both the engineering and philosophical perspective:
what do you see as the layers of abstraction in a computer? Do you see a computer as a set of layers of abstraction? Yeah, I think that's one of the fundamentals of computer science: these things are really complicated, and the way we cope with complicated software and complicated hardware is these layers of abstraction. That simply means that we, you know, suspend disbelief and pretend that the only thing you know is that layer, and you don't know anything about the layer below it. That's the way we can build very complicated things. It probably started with hardware, that's the way it was done, but it's been proven extremely useful, and, you know, I would think in a modern computer today there might be 10 or 20 layers of abstraction. They're all trying to enforce this contract: all you know is this interface. There's a set of commands that you're allowed to use, and if you stick to those commands, we will faithfully execute them. And it's like peeling the layers of an onion: you get down, there's a new set of layers, and so forth. So for people who want to study computer science, the exciting part about it is you can keep peeling those layers. You take your first course and you might learn to program in Python, and then you can take a follow-on course and get down to a lower-level language like C, and, you know, if you want to, you can start getting into the hardware layers, and you keep getting down all the way to that transistor that I talked about, that Gordon Moore predicted, and you can understand all those layers, all the way up to the highest-level application software. So it's a very kind of magnetic field: if you're interested, you can go into any depth and keep going. In particular, what's happening right now, or what's happened in software in the last twenty years and recently in hardware, is there are getting to be open-source versions of all of these things. What open source means is that what the
engineer, the programmer, designs is not secret, belonging to a company; it's up there on the World Wide Web, so you can see it. For lots of pieces of software that you use, you can see exactly what the programmer does, if you want to get involved. That used to stop at the hardware. Recently there have been efforts to make open-source hardware and those interfaces open, so you can see that. So instead of stopping at the hardware, you can now keep going layer by layer below that and see what's inside. It's a remarkable time: the interested individual can really see, in great depth, what's really going on in the computers that power everything we see around us. When you say open source at the hardware level, is this going to the design, the architecture, the instruction set level, or is it going literally to the manufacture of the actual hardware, of the actual chips, whether that's an ASIC specialized to a particular domain or the general? Yeah, so let's talk about that a little bit. When you get down to the bottom layer of software, the way software talks to hardware is in a vocabulary. The words of that vocabulary are called instructions, and the technical term for the whole vocabulary is the instruction set. So those instructions are like the ones we talked about earlier: there can be instructions like add, subtract, multiply, divide. There are instructions to put data into memory, which is called a store instruction, and to get data back, which is called a load instruction. And those simple instructions go back to the very dawn of computing: in, you know, 1950, the first commercial computer had these instructions. So that's the instruction set we're talking about. Up until, I'd say, ten years ago, these instruction sets were all proprietary. A very popular one is owned by Intel, the one that's in the cloud and in all the PCs in the world. Intel owns that
instruction set; it's referred to as the x86. There have been a sequence of them; the first one was called the 8086, and since then there have been a lot of numbers, but they all end in 86, so there's that kind of family of instruction sets, and it's proprietary. The other one that's very popular is from ARM. That kind of powers all the cell phones in the world, all the iPads in the world, and a lot of the so-called Internet of Things devices. And that one is also proprietary; ARM will license it to people for a fee, but they own it. So the new idea that got started at Berkeley, kind of unintentionally, ten years ago: early in my career, we pioneered a way to do these vocabularies, these instruction sets, that was very controversial at the time. At the time, in the 1980s, the conventional wisdom was that these vocabularies, these instruction sets, should have, you know, powerful instructions, polysyllabic kind of words, you can think of it that way. So instead of just add, subtract, and multiply, they would have "polynomial divide" or "sort a list," and the hope was that those powerful vocabularies would make it easier for software. We thought that didn't make sense for microprocessors. There were people at Berkeley and Stanford and IBM who argued the opposite, and what we called that was a reduced instruction set computer. The abbreviation was RISC, and, typical for computer people, we took the abbreviation and started pronouncing it, so "risk" it was. So we said, for microprocessors, which with Gordon Moore's Law are changing really fast, we think it's better to have a pretty simple set of instructions, a reduced set of instructions. That would be a better way to build microprocessors, since they're going to be changing so fast due to Moore's Law, and then we'll just use standard software to generate more of those simple instructions. One of the pieces of software in the software stack, going between these layers of abstraction, is called a compiler; it's basically a translator between levels.
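The division of labor Patterson describes, simple hardware with the compiler generating more of the simple instructions, can be sketched as a toy "lowering" pass. The mnemonics and the expression format below are invented for illustration, not actual RISC-V or any real compiler's internals:

```python
# Toy compiler pass: flatten a nested source expression into simple
# RISC-style load / add / mul / store instructions.

def lower(dest, expr):
    """Lower a nested (+, *) expression tree into simple instructions.

    A variable is a string like "b"; an operation is a tuple like
    ("+", "b", "c"). Returns the instruction list.
    """
    code = []
    counter = iter(range(1, 100))            # fresh register names r1, r2, ...

    def emit(node):
        reg = f"r{next(counter)}"
        if isinstance(node, str):            # a variable reference: load it
            code.append(f"lw  {reg}, {node}")
        else:                                # (operator, left, right)
            op, left, right = node
            lhs, rhs = emit(left), emit(right)
            mnemonic = {"+": "add", "*": "mul"}[op]
            code.append(f"{mnemonic}  {reg}, {lhs}, {rhs}")
        return reg

    code.append(f"sw  {emit(expr)}, {dest}")
    return code

# a = (b + c) * d  becomes six simple instructions:
for line in lower("a", ("*", ("+", "b", "c"), "d")):
    print(line)
```

The point of the sketch is that the "powerful" source-level operation disappears: the programmer writes the expression once, the translator expands it once, and the hardware only ever needs to execute a handful of trivial instruction types.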
We said, the translator will handle it. So the technical question was: well, since there are these reduced instructions, you have to execute more of them. Yeah, that's right. But maybe you execute them faster? Yeah, that's right, they're simpler, so they could go faster, but you have to do more of them. So what does that trade-off look like? And it ended up that we executed maybe 50 percent more instructions, maybe a third more instructions, but they ran four times faster, so this controversial RISC idea proved to be maybe a factor of three or four better. I love that this idea was controversial, almost kind of rebellious. So the more conventional approach was the complex instruction set computer; how do you pronounce that? CISC. "Sisk." RISC versus CISC. And, believe it or not, this sounds very, you know, "who cares about this," right? It was violently debated at several conferences. It's like, what's the right way to go? And people thought RISC was, you know, a devolution; we're going to make software worse by making the instructions simpler. There were fierce debates at several conferences in the 1980s, and then later in the eighties it kind of settled toward these benefits. It's not completely intuitive to me why RISC has, for the most part, won. So why did that happen? And maybe I can say a bunch of dumb things that could lay the land for further commentary. To me, this is kind of an interesting thing: if you look at C++ versus C, with modern compilers you really could write faster code with C++, relying on the compiler to reduce your complicated code into something simple and fast. So to me, comparing to RISC, maybe this is a dumb question, but why is it that focusing the design of the instruction set on very few simple instructions provides faster execution in the long run, versus coming up with, like I said, a ton of
complicated instructions, and then over time, you know, years, maybe decades, you come up with compilers that can reduce those into simple instructions for you? Yeah, so let's try and split that into two pieces. If the compiler can do that for you, if the compiler can take, you know, a complicated program and produce simpler instructions, then the programmer doesn't care, right? The programmer doesn't care; it's just, how fast is the computer I'm using, how much does it cost? And what happened in the software industry, right around before the 1980s, is that critical pieces of software were still written not in languages like C or C++; they were written in what's called assembly language, where there's a human writing exactly the instructions, at the level that a computer can understand. So they were writing add, subtract, multiply, you know, instructions. It's very tedious. But the belief was that this lowest level of software that people use, which is called an operating system, had to be written in assembly language, because these high-level languages were just too inefficient; they were too slow, or the programs would be too big. That changed with a famous operating system called UNIX, which is kind of the grandfather of all the operating systems today. UNIX demonstrated that you could write something as complicated as an operating system in a language like C. Once that was true, that meant we could hide the instruction set from the programmer, and so it didn't really matter; the programmer didn't have to write lots of these simple instructions; that was up to the compiler. So that was part of our argument for RISC: if you were still writing in assembly language, there's maybe a better case for CISC instructions, but if the compiler can do it, it's, you know, done once; the computer translates it once, and then every time you run the program, it runs these potentially simpler instructions. And so that was
the debate, right? And people would acknowledge that these simpler instructions could lead to a faster computer. You can think of monosyllabic instructions: if you think of reading, you can probably read them or say them faster than long instructions, and that analogy works pretty well for hardware. As long as you didn't have to read a lot more of those instructions, you could win. So that's kind of the basic idea for RISC. But it's interesting, in that discussion of UNIX, to see that there's only one step of levels of abstraction from the code that's really the closest to the machine to the code that's written by a human. At least to me, again, perhaps a dumb intuition, it feels like there might have been more layers, sort of different kinds of humans stacked on top of each other. So what's true and not true about what you said: there are several layers of software. Suppose we just talk about two layers. That would be the operating system, like what you get from Microsoft or from Apple, like iOS or the Windows operating system, and, let's say, applications that run on top of it, like Word or Excel. Both the operating system and the application could be written in C, so you can construct those two layers, and the applications absolutely do call upon the operating system. The change was that both of them could be written in high-level languages. So it's one step of translation, but you can still build many layers of abstraction of software on top of that, and that's how things are done today. Still today, many of the layers that you'll deal with, you may deal with debuggers, you may deal with linkers, there are libraries, many of those today will be written in C++, say, even though that language is pretty ancient. And even the Python interpreter is probably written in C or C++. So lots of layers are probably
written in these somewhat old-fashioned, efficient languages, that still take one step to produce these instructions, produce RISC instructions, but they're composed: each layer of software invokes another through these interfaces, and you can get ten layers of software that way. So, in general, RISC was developed here at Berkeley. It was kind of three places that were these radicals advocating for this against the rest of the community: IBM, Berkeley, and Stanford. You're one of these radicals. How radical did you feel? How confident did you feel? How doubtful were you that RISC might be the right approach? Because you can also intuit that it's kind of taking a step back into simplicity, not forward into simplicity. Yeah, no, it was easy to make the argument against it. Well, this was my colleague John Hennessy at Stanford and I; we were both assistant professors, and for me, I just believed in the power of our ideas. I thought what we were saying made sense: Moore's Law is going to move fast. The other thing that I didn't mention is one of the surprises of these complex instruction sets: you could certainly write these complex instructions if the programmer is writing them themselves; it turned out to be kind of difficult for the compiler to generate those complex instructions. Kind of ironically, you'd have to find the right circumstances that just exactly fit this complex instruction. It was actually easier for the compiler to generate the simple instructions. So not only did these complex instructions make the hardware more difficult to build; often the compiler wouldn't even use them. So it's harder to build, the compiler doesn't use them that much, and the simple instructions go better with Moore's Law. The number of transistors is doubling every two years, so you want to reduce the time to design the microprocessor; that may be more important than the number of instructions. So I think we
believed that we were right, that this was the best idea. Then the question became, in these debates: well, yeah, that's a good technical idea, but in the business world, this doesn't matter; there are other things that matter. It's like arguing that if there's a standard width for railroad tracks, and you've come up with a better width, but the whole world is already covered with railroad tracks, your ideas have no chance of commercial success. It was technically right, but commercially it would be insignificant. Yeah, it's kind of sad that the history of human civilization is full of good ideas that lost because somebody else came along first with a worse idea. And it's good that in the computing world, at least, some of these have won. Well, you could argue, I mean, there are probably still CISC people that say... Yeah, there still are. And what happened was interesting. A bunch of the companies with CISC instruction sets, those vocabularies, gave up, but not Intel. What Intel did, to its credit: Intel's vocabulary was in the personal computer, and that was a very valuable vocabulary, because the way we distribute software is in those actual instructions; it's in the instructions of that instruction set. You don't get the source code that the programmers wrote; you get it after it's been translated into that lowest level. If you were to get a floppy disk or download software, it's in the instructions of that instruction set. So the x86 instruction set was very valuable. What Intel did, cleverly and amazingly, is they had their chips do a translation step in hardware: they would take these complex instructions and translate them into essentially RISC instructions, in hardware, on the fly, you know, at gigahertz clock speeds, and then any good idea that the RISC people had, they could use, and they could still be compatible with this really valuable PC software base, which also had very high volumes, you know, a hundred million
personal computers per year. So the CISC architecture, in the business world, actually won in this PC era. So just going back to the time of designing RISC: when you design an instruction set architecture, do you think like a programmer, do you think like a microprocessor engineer, do you think like an artist, a philosopher? Do you think in software and hardware? I mean, is it art? Is it science? Yeah, I'd say designing a good instruction set is an art, and I think you're trying to balance the simplicity and speed of execution with how easy it will be for compilers to use it. You're trying to create an instruction set where everything in there can be used by compilers, where there aren't things missing that'll make it difficult for a program to run efficiently, but you want it to be easy to build as well. So you're thinking, I'd say, you're trying to find a hardware-software compromise that'll work well, and it's, you know, a matter of taste, right? It's kind of fun to build instruction sets. It's not that hard to build an instruction set, but to build one that catches on and people use, you know, you have to be fortunate, to be in the right place at the right time, or have a design that people really like. Are you using metrics? Is it quantifiable? Because you kind of have to anticipate the kinds of programs that people will write, ahead of time. So can you use numbers, can you use metrics, can you quantify something ahead of time, or is this again the art part, where you kind of go by feel? A big change, kind of, what happened, I think, from Hennessy's and my perspective in the 1980s, was going from, you know, taste and hunches to quantifiable. In fact, he and I wrote a textbook at the end of the 1980s called Computer Architecture: A Quantitative Approach. I've heard of that. And it had a pretty big impact on
the field, because we went from textbooks that kind of listed, here's what this computer does, and here are the pros and cons, and here's what that computer does, pros and cons, to something where there were formulas and equations, where you could measure things. So specifically, for instruction sets, what we do, and some other fields do, is we agree upon a set of programs, which we call benchmarks, a suite of programs, and then you develop both the hardware and the compiler, and you get numbers on how well your computer does, given its instruction set, and how well you implemented it in your microprocessor, and how good your compilers are. In computer architecture, you know, using professors' terms, we grade on a curve rather than on an absolute scale. So when you say these programs run this fast, well, that's kind of interesting, but how do you know it's better? Well, you compare it to other computers at the same time. So the best way we know how to turn it into more of a science, experimental and quantitative, is to compare yourself to other computers of the same era, with access to the same kind of technology, on commonly agreed benchmark programs. So maybe to toss up two possible directions we can go: one is, what are the different trade-offs in designing architectures? We've been talking about CISC and RISC, but maybe a little bit more detail, in terms of specific features that you were thinking about. And the other side is, what are the metrics that you're thinking about when looking at these trade-offs? Yeah, well, let's talk about the metrics. During these debates, we actually had kind of a hard time explaining, convincing people of the ideas, and partly we didn't have a formula to explain it. A few years into it, we hit upon a formula that helped explain what was going on, and let's see if I can do the formula orally. So, fundamentally, the way you measure performance is: how long does it take a program to run? If you have ten programs, and typically these benchmarks were a suite, because you'd want ten programs so they could represent lots of different applications, then for these ten programs, how long did they take to run? Now, when you're trying to explain why it took so long, you can factor how long it takes a program to run into three factors. The first one is: how many instructions did it take to execute? That's what we've been talking about, the instructions, the vocabulary: how many did it take? The next question is: how long did each instruction take to run, on average? So you multiply the number of instructions times how long each took to run, and that gets you how long it took. Okay, but now let's look at this metric of how long an instruction takes to run. Well, it turns out the computers we build today all have a clock. You've seen this: if you buy a microprocessor, it'll say 3.1 gigahertz or 2.5 gigahertz, and more gigahertz is good. What that is, is the speed of the clock: 2.5 gigahertz turns out to be 2.5 billion clock ticks per second, or 0.4 nanoseconds per clock cycle. So that's the clock cycle time. But there's another factor, which is: what's the average number of clock cycles it takes per instruction? So it's the number of instructions, times the average number of clock cycles, times the clock cycle time. In these RISC-CISC debates, they would concentrate on: RISC needs to take more instructions. And we'd argue: well, maybe the clock cycle is faster. But the real big difference was the number of clock cycles per instruction. Fascinating. What about the beautiful mess of parallelism? Where does parallelism fit into the whole picture? Parallelism, which has to do with, say, how many instructions can execute in parallel, and things like that: you can think of that as affecting the clock cycles per instruction, because it's the average clock cycles per instruction.
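The three factors described here are often summarized as the "iron law" of processor performance: time = instruction count × average clock cycles per instruction (CPI) × clock cycle time. A minimal sketch of the arithmetic, using only the illustrative figures from this conversation (a CISC interpreter at ~10 cycles per instruction, RISC at ~2, at the cost of up to 50% more instructions), not real measurements:

```python
# Iron law of performance: time = instructions x CPI x clock cycle time.
def execution_time_ns(instruction_count, cpi, clock_ghz):
    cycle_time_ns = 1.0 / clock_ghz          # 2.5 GHz -> 0.4 ns per cycle
    return instruction_count * cpi * cycle_time_ns

# Illustrative numbers from the conversation, same 2.5 GHz clock for both:
cisc_ns = execution_time_ns(100e9, cpi=10, clock_ghz=2.5)        # 100B instructions
risc_ns = execution_time_ns(100e9 * 1.5, cpi=2, clock_ghz=2.5)   # 50% more instructions

print(f"CISC: {cisc_ns / 1e9:.0f} s, RISC: {risc_ns / 1e9:.0f} s")
print(f"speedup: {cisc_ns / risc_ns:.2f}x")
```

With these assumed numbers, the 5x advantage in cycles per instruction outweighs the 1.5x penalty in instruction count, and the speedup comes out to about 3.3x, matching the "factor of three or four" mentioned earlier.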
So when you're running a program, if it took a hundred billion instructions, and on average it took two clock cycles per instruction, and they were four-tenths of a nanosecond each, you can multiply that out and see how long it took to run. There are all kinds of tricks to try and reduce the number of clock cycles per instruction. But it turned out that the way they would execute these complex instructions is they would actually build what we would call an interpreter, a very simple hardware interpreter, and it turned out that for CISC instructions, if you had to use one of those interpreters, it would be like 10 clock cycles per instruction, where the RISC instructions could be 2. So there'd be this factor-of-five advantage in clock cycles per instruction. We'd have to execute, say, 25 or 50 percent more instructions, but that's where the win would come, and then you could make an argument about whether the clock cycle times are the same or not. But pointing out that we could divide the benchmark result, time per program, into three factors, and that the biggest difference between RISC and CISC was the clock cycles per instruction, you execute a few more instructions, but the clock cycles per instruction is much less: once we made that argument, people said, okay, I get it. And so it went from outrageously controversial in, you know, 1982, to probably by 1984 people saying, oh yeah, technically they've got a good argument. What are the instructions in the RISC instruction set, just to get an intuition? Okay, so in 1995 I was asked by Scientific American to predict the future of the microprocessor, and, well, I'd seen these predictions, and usually people predict something outrageous just to be entertaining, right? My prediction for 2020 was: things are going to be pretty much the same; they're going to look very familiar to what they are. And they are: if you were to read the article, you know, the things I said are pretty much true. The instructions that have been around forever are kind of the same.
of the same and that's the outrageous prediction actually yeah given how fast computers and well you know Moore's law was gonna go on we thought for 25 more years you know who knows but kind of the surprising thing in fact you know Hennessy and I you know won the the ACM a.m. Turing award for both the RISC instruction set contributions and for that textbook I mentioned but you know we are surprised that here we are 35 40 years later after we did our work and the the conventional wisdom of the best way to do instruction sets is still those RISC instruction sets that look very similar to what we look like you know we did in the 1980s so those surprisingly there hasn't some radical new idea even though we have you know a million times as many transistors as we had back then but what are the basic constructions and how did they change over the years so we're talking about addition subtract these are the specific so the the to get so the things that are in a calculator you are in a computer so any of the buttons that are in the calculator in the crater so the little button so if there's a memory function key and like I said those are turns into putting something in memories called a store bring something back Scott load just as a quick tangent when you say memory what does memory mean well I told you there were five pieces of a computer and if you remember in a calculator there's a memory key so you you want to have intermediate calculation and bring it back later so you'd hit the memory plus key M plus maybe and it would put that into memory and then you'd hit an REM like return instruction and it bring it back in the display so you don't have to type it you don't have to write it down bring it back again so that's exactly what memory is if you can put things into it as temporary storage and bring it back when you need it later so that's memory and loads and stores but the big thing the difference between a computer and a calculator is that the computer can make 
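The three-factor breakdown described above (time per program = instructions × clock cycles per instruction × clock cycle time) can be sketched numerically. The figures below are the ones quoted in the conversation, a ten-versus-two CPI gap and roughly 50 percent more RISC instructions; the function itself is just an illustration, not anyone's real performance model:

```python
# Iron law of processor performance: time = instructions * CPI * cycle_time.
def run_time(instructions, cpi, cycle_time_ns):
    """Total execution time in seconds for a program."""
    return instructions * cpi * cycle_time_ns * 1e-9

# The worked example from the conversation: interpreting complex CISC
# instructions costs ~10 cycles each, RISC ~2, while RISC executes
# ~25-50% more instructions overall.
cisc = run_time(100e9, cpi=10, cycle_time_ns=4)
risc = run_time(100e9 * 1.5, cpi=2, cycle_time_ns=4)  # 50% more instructions

print(f"CISC: {cisc:.0f} s, RISC: {risc:.0f} s, speedup: {cisc / risc:.2f}x")
```

Even charging RISC the full 50 percent instruction penalty, the factor-of-five CPI advantage leaves it more than three times faster, which is exactly the shape of the argument that settled the debate.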
Amazingly, the decisions are as simple as: is this value less than zero, or is this value bigger than that value? Those instructions, which are called conditional branch instructions, are what give computers all their power. In the early days of computing, before what's called the general-purpose microprocessor, people would wire these instructions up in hardware, but it couldn't make decisions; it would just do the same thing over and over again. With the power of branch instructions that can look at things and make decisions automatically, and it can make these decisions billions of times per second, amazingly enough, thanks to advances in machine learning, we can create programs that do some things smarter than human beings can. But if you go down to that very basic level, the instructions are the keys on the calculator, plus the ability to make decisions with these conditional branch instructions.

And all decisions, fundamentally, can be reduced down to these simple comparisons.

Yeah. So, in fact, going way back, we did four RISC projects at Berkeley in the 1980s, and they did a couple at Stanford in the 1980s. In 2010 we decided we wanted to do a new instruction set, learning from the mistakes of those RISC architectures of the 1980s, and that was done here at Berkeley almost exactly ten years ago. I participated, but Krste Asanović and others drove it. They called it RISC-V, to honor those four RISC projects of the 1980s.

So what does RISC-V involve?

RISC-V is another instruction set, a vocabulary. It's learned from the mistakes of the past, but if you look at it, there's a core set of instructions that's very similar to the simplest architectures from the 1980s. The big difference about RISC-V is that it's open. I talked earlier about proprietary versus open-source software; this is an instruction set, so it's a vocabulary, it's not hardware. But by having an open instruction set, we can have open-source implementations, open-source processors, that people can use.

Where do you see that going? It's a really exciting possibility. If you were to predict, like in the Scientific American article, 10, 20, 30 years from now, what kind of possibilities might the ability to utilize open-source instruction set architectures like RISC-V unlock?

Yeah, and just to make it clear, because this is confusing: the specification of RISC-V is something like what's in a textbook, there are books about it, so that's defining an interface. There's also the way you build hardware: you write it in languages that are kind of like C but specialized for hardware, and that gets translated into hardware. These implementations of the specification are what's open source. They're written in something called Verilog or VHDL, and they're put up on the web, just like you can see the C++ code for Linux on the web. So the open instruction set enables open-source implementations of RISC-V.

So you can literally build a processor using this instruction set?

People are. What happened with us was, this was developed here for our own use, to do our research, and we licensed it under the Berkeley Software Distribution license, like a lot of things get licensed here, so other academics could use it and wouldn't be afraid to use it. And then about 2014, while we were using it in our research and our courses, we started getting complaints from people in industry: why did you change your instruction set between the fall and the spring semester? Well, we'd get these complaints and think, why the hell do you care what we do with our instruction set? And when we talked to them, we found out there was this thirst for the idea of an open instruction set architecture, and they
had been looking for one. They stumbled upon ours at Berkeley and thought, boy, this looks great, we should use this one. Once we realized there was this need for an open instruction set architecture, we thought that was a great idea, and we started supporting it and trying to make it happen. So we kind of accidentally stumbled into this need, our timing was good, and it's really taking off.

Universities are good at starting things, but they're not good at sustaining things. So, like Linux has the Linux Foundation, there's a RISC-V Foundation that we started. There are annual conferences; the first one was done, I think, in January 2015 and had about 50 people at it, and the last one, last December, had about 1,700 people at it, and companies excited, all over the world. So if I'm predicting 25 years into the future, I would predict that RISC-V will possibly be the most popular instruction set architecture out there, because it's a pretty good instruction set architecture, and it's open and free, and there's no reason lots of people shouldn't use it. There are benefits, just like Linux is so popular today compared to 20 years ago: the fact that you can get access to it for free, you can modify it, you can improve it, all those same arguments. People collaborate to make it a better system for everybody to use, and that works in software, and I expect the same thing will happen in hardware.

So if you look at ARM, Intel, MIPS, if you look at the lay of the land, and just for me, because I'm not familiar, how much of a challenge would this kind of transition entail? Let me ask my dumb question another way.

No, I know where you're headed. Well, there's a bunch. I think the thing you point out is that there are these popular proprietary instruction sets, like the x86.

And so how do we move to RISC-V, potentially, in the span of 5, 10, 20 years, a kind of unification, given that the way we use devices, IoT, mobile devices, and the cloud, keeps changing?

Well, a big piece of it is the software stack. Right now, looking forward, there seem to be three important markets. There's the cloud, and the cloud is simply companies like Alibaba and Amazon and Google and Microsoft having these giant datacenters with tens of thousands of servers each, in maybe a hundred of these datacenters all over the world; that's what the cloud is. The instruction set that dominates the cloud is the x86; almost 100 percent of it today is x86.

The other big things are cell phones and laptops; those are the big things today. I mean, the PC is also dominated by the x86 instruction set, but those sales are dwindling. There are maybe 200 million PCs a year, and there are one and a half billion phones a year, numbers like that. The phones are dominated by ARM. And, a reason I talked about the software stacks, the third category is the Internet of Things, which is basically embedded devices, things in your cars and your microwaves, everywhere.

What's different about those three categories is this. For the cloud, the software that runs in the cloud is determined by these companies, Alibaba, Amazon, Google, Microsoft, so they control that software stack. For the cell phones, both Android and Apple supply the software, but both of them have marketplaces where anybody in the world can build software, and that software is compiled down and shipped in the vocabulary of ARM. That's what's referred to as binary compatible, because the instructions are turned into binary numbers and shipped around the
world.

Just a quick interruption: what is ARM?

ARM is an instruction set, a RISC-based instruction set, a proprietary one. ARM stands for Advanced RISC Machine; ARM is the name of the company. So it's a proprietary RISC architecture, and it's been around for a while, and it's surely the most popular instruction set in the world right now; every year, billions of chips use the ARM design, in this post-PC era.

Was it one of the early adopters of RISC?

Yeah, the first ARM goes back to, I don't know, '86 or so. Berkeley did their work in the early '80s; the ARM guys needed an instruction set, and they read our papers, and it heavily influenced them.

So, getting back to my story, what about the Internet of Things? Well, software isn't shipped in the Internet of Things; the embedded-device people control that software stack. So the opportunity for RISC-V, everybody thinks, is in the Internet of Things, in embedded things, because there's no dominant player like there is in the cloud or in smartphones. It doesn't have a lot of licenses associated with it, you can enhance the instruction set if you want, and people have looked at the instruction set and think it's a very good one. So it appears to be very popular there. In the cloud, those companies control their own software stacks, so it's possible that they would decide to use RISC-V, if we're talking about 10 and 20 years into the future. The harder one would be the cell phones: since people ship software in the ARM instruction set, you'd think that would be the more difficult one. But if RISC-V really catches on, over a period of a decade you can imagine that changing over too.

To give a sense of why RISC-V or ARM has dominated, you mentioned these three categories: why did ARM come to dominate the mobile device space? Maybe my naive intuition is that there are some aspects of power efficiency that are important, that somehow come along with RISC?

Well, part of it is that the old CISC instruction sets, like the x86, were designed forty years ago, so they have disadvantages baked in. But also, they have to translate in hardware from CISC instructions to RISC instructions on the fly, and that costs both silicon area, so the chips are bigger, and power. ARM, which has followed this RISC philosophy, is seen to be much more energy-efficient. And in today's computer world, in the cloud, in cell phones, and in embedded things, the limiting resource isn't the number of transistors you can fit on the chip; it's how much power you can dissipate for your application. By having a reduced instruction set, it's possible to have simpler hardware, which is more energy-efficient. Energy efficiency is incredibly important in the cloud: when you have tens of thousands of computers in a datacenter, you want the most energy-efficient ones there. And of course, for embedded things running off batteries, you want those to be energy-efficient, and in the cell phones too. So I think it's believed that there's an energy disadvantage to using these more complex instruction set architectures.

So, the other aspect of this: if we look at Apple, Qualcomm, Samsung, Huawei, they all use the ARM architecture, and yet the performance of the systems varies. I don't know whose opinion you take, but Apple, for some reason, seems to perform better in their implementations of the architecture. So where's the magic? How does that happen?

Yeah, so what ARM pioneered was a new business model. They said, here's our proprietary instruction set, and we'll give you two ways to use it. We'll give you one of these implementations written in
things like C, called Verilog, and you can just use ours; you'll have to pay money for that, we'll license it to you. Or you could design your own, and we're talking about numbers like tens of millions of dollars for the right to design your own, since the instruction set belongs to them. Apple got one of those, the right to build their own. Most of the other people who build, say, Android phones just get one of the designs from ARM and use it. Apple developed a really good microprocessor design team: they acquired a very good team that had been building other microprocessors and brought them into the company to build their designs. So the instruction sets are the same, the specifications are the same, but their hardware design is much more efficient than, I think, everybody else's, and that's given Apple an advantage in the marketplace, in that the iPhones tend to be faster than most everybody else's phones.

It'd be nice to be able to jump around and explore different sides of this, but let me ask one sort of romanticized question: what, to you, is the most beautiful aspect or idea of RISC instruction sets, or instruction sets in general?

You know, I was always attracted to the idea that small is beautiful. The temptation in engineering is that it's easy to make things more complicated; it's harder, it's more difficult and surprising, to come up with a simple, elegant solution. And I think there are a bunch of small features of RISC in general where you can see examples of keeping it simpler making it more elegant. Specifically, in RISC-V, I was kind of the mentor in the program, but it was really driven by Krste Asanović and two grad students, Andrew Waterman and Yunsup Lee. They hit upon this idea of having a nice simple subset of instructions, like 40-ish instructions, such that all software, the whole RISC-V software stack, can run on just those forty instructions. And then they provide optional features that can accelerate performance, instructions that could be very helpful if you need them, but that you don't need to have. That's really a new idea. So RISC-V right now has maybe five optional subsets that you can pull in, but the software runs without them; if you just want to build the core forty instructions, that's fine, you can do that.

This is fantastic educationally: you can explain computers and you only have to explain forty instructions, not thousands of them. Also, if you invent some wild and crazy new technology, like biological computing, you'd like a nice simple instruction set, and with RISC-V, if you implement those core instructions, you can run really interesting programs on top of that. So this idea of a core set of instructions that the software stack runs on, plus optional features that the compilers will use if you turn them on, but that you don't have to include, I think is a powerful idea.

What's happened in the past with the proprietary instruction sets is that when they add new instructions, those become a required piece, and so all microprocessors in the future have to implement those instructions. So you can see these instruction sets getting bigger and bigger as they get older, kind of like how, for a lot of people, weight and age are correlated. RISC-V lets you stay as slim as you were as a teenager, and you only have to add these extra features if you're really going to use them, rather than having no choice but to keep growing with the instruction set.

I don't know if the analogy holds up, but that's a beautiful notion: it's almost like a nudge toward, here's the simple core that's the essential.
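The core-plus-optional-extensions idea lends itself to a small sketch. This is a toy model, not real RISC-V encodings or semantics, although add, sub, blt, and the mul of the M extension are real RISC-V mnemonics:

```python
# Toy illustration of a base ISA plus optional extensions: every
# implementation ships the core; extensions are strictly opt-in.
CORE = {
    "add": lambda a, b: a + b,
    "sub": lambda a, b: a - b,
    "blt": lambda a, b: a < b,   # conditional branch test: is a less than b?
}

MUL_EXTENSION = {                 # analogous to the RISC-V "M" extension
    "mul": lambda a, b: a * b,
}

def build_isa(extensions=()):
    """Core instructions always present; extensions merged in if chosen."""
    isa = dict(CORE)
    for ext in extensions:
        isa.update(ext)
    return isa

minimal = build_isa()                    # still runs all core software
fancy = build_isa([MUL_EXTENSION])       # same core, plus multiply
print("mul" in minimal, "mul" in fancy)  # → False True
```

Software targeting only the core runs on both builds; only code that opts into the extension needs the fancier hardware, which is the educational point being made.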
Yeah. I think the surprising thing is that if we brought back the pioneers from the 1950s and showed them today's instruction set architectures, they'd understand them; they don't look that different. I'm surprised. Maybe, to talk about philosophical things, there may be something powerful about those forty or fifty instructions: all you need is these commands, these instructions that we talked about, and that is sufficient to bring about artificial intelligence. It's remarkably surprising to me that in these complicated microprocessors, where the line widths are narrower than the wavelength of light, these amazing technologies, at some fundamental level the commands the software executes are really pretty straightforward and haven't changed that much in decades. What a surprising outcome.

So underlying all computation, all Turing machines, all artificial intelligence systems, perhaps, might be a very simple instruction set, like RISC-V?

Yeah, that's kind of what I said; I was interested to see. I had another, more senior, faculty colleague, and he had written his own 25-years-in-the-future piece in Scientific American, and his came due about when I was a young professor, and he said, yep, I checked it. I was interested to see how mine was going to turn out, and it held up pretty well. But yes, there must be something fundamental about those instructions, that we're capable of creating intelligence from pretty primitive operations, just done really fast.

You mentioned a different, maybe radical, computational medium, like biological, and there are other ideas: there's a lot of space in ASICs, domain-specific chips, and there could be quantum computers. So if we think of all those different mediums and types of computation, what's the connection between swapping out different hardware systems and the instruction set? Do you see those as disjoint, or are they fundamentally coupled?

Yeah, so, if we go back to the history: when Moore's law was in full effect and you were getting twice as many transistors every couple of years, the challenge for computer designers was, how can we take advantage of that, how can we turn those transistors into better computers, faster typically? So there was an era, I guess in the '80s and '90s, where computers were doubling performance every 18 months. And if you weren't around then, what would happen is, your friend's computer, which was a year, year and a half, newer than yours, was much faster than your computer, and he or she could get their work done much faster than you. So people took their perfectly good computers and threw them away to buy a newer computer, because the computer one or two years later was so much faster. That's what the world was like in the '80s and '90s.

Well, with the slowing down of Moore's law, that's no longer true. Now, with desktop computers and laptops, I only get a new laptop when it breaks, right? The disk broke, or the display broke, so I've got to buy a new computer; but before, you would throw them away because they were just so sluggish compared to the latest computers. So that's a huge change in what's gone on. But since this lasted for decades, programmers, and maybe all of society, got used to computers getting faster regularly.

We now believe, those of us who are in computer design, it's called computer architecture, that the path forward is instead to add accelerators that only work well for certain applications. Since Moore's law is slowing down, we don't think general-purpose computers are going to get a lot faster; the Intel processors of the world haven't been getting a lot faster. They've been
barely improving, like a few percent a year. It used to be doubling every 18 months, and now it's doubling every 20 years; it's just shocking. So, to be able to deliver on what Moore's law used to do, we think what's happening right now is people are adding accelerators to their microprocessors that only work well for some domains. And by sheer coincidence, at the same time this is happening, there's been this revolution in artificial intelligence called machine learning. As I'm sure your other guests have said, AI had these two competing schools of thought: either we could figure out artificial intelligence by just writing the rules top-down, or that was wrong and you had to look at data and infer what the rules are, which is machine learning. What's happened in the last decade or so is that machine learning has won.

And it turns out that, for machine learning, the computation you build hardware for is pretty much matrix multiply; matrix multiply is the key kernel in the way machine learning is done. That's a godsend for computer designers: we know how to make matrix multiply run really fast. So general-purpose microprocessors are slowing down, and we're adding accelerators for machine learning that fundamentally do matrix multiplies much more efficiently than general-purpose computers have done. We had to come up with a new way to accelerate things, and the danger of only accelerating one application is, how important is that application? It turns out machine learning gets used for all kinds of things, so serendipitously we found something to accelerate that's widely applicable. And we're in the middle of this revolution of machine learning; we're not sure what the limits of machine learning are. So this has been kind of a godsend: if you're going to deliver improved performance, as long as people are moving their programs to embrace more machine learning, we know how to give them more performance, even as Moore's law is slowing down.
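The kernel in question is easy to write down. Here is a minimal pure-Python matrix multiply, the loop nest that accelerators such as TPUs implement directly in silicon; real accelerators tile, pipeline, and parallelize this heavily, so the code below is only the mathematical shape of the work, not how any chip executes it:

```python
# Plain matrix multiply: C[i][j] = sum over p of A[i][p] * B[p][j].
# This triple loop is the computation machine-learning accelerators
# are built to do with far less energy per operation than a CPU.
def matmul(A, B):
    n, k, m = len(A), len(B), len(B[0])
    assert all(len(row) == k for row in A), "inner dimensions must match"
    C = [[0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            for p in range(k):
                C[i][j] += A[i][p] * B[p][j]
    return C

print(matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]]))  # → [[19, 22], [43, 50]]
```

Because neural-network inference and training spend most of their cycles inside exactly this kernel, speeding it up speeds up the whole workload, which is why one narrow accelerator turns out to be so broadly useful.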
And counterintuitively, the machine learning mechanism, you could say, is domain-specific, but because it's leveraging data, it can actually be very broad in terms of the domains it applies to.

Yeah, that's exactly right. People sometimes talk about the idea of Software 2.0: we're almost taking another step up the abstraction layer in designing machine learning systems, because now you're programming in the space of data, in the space of hyperparameters; it's fundamentally changing the nature of programming. And so the specialized devices that accelerate performance, especially of neural-network-based machine learning systems, might become the new general-purpose computing.

Yes. The interesting thing to point out is that these two trends are not tied together. The enthusiasm about machine learning, about creating programs driven from data, where we figure out the answers from data rather than top-down, which is classically the way most programming is done and the way artificial intelligence used to be done, that's a movement going on at the same time, coincidentally. And the first word of "machine learning" is machines, right? So that's going to increase the demand for computing, because instead of programmers being smart and writing those things down, we're going to use computers to examine a lot of data to create the programs; that's the idea. And remarkably, this gets used for all kinds of things very successfully: image recognition, language translation, game playing, and it gets into pieces of the software stack, like databases and things like that, where we're not quite sure how general-purpose it is. But that's going on independent of the hardware story.

What's happening on the hardware side is that Moore's law is slowing down right when we need a lot more cycles. It's failing us right when we need it, because there's going to be a greater increase in demand for computing. And then there's this idea that we're going to do so-called domain-specific accelerators, and the greatest fear with a domain-specific design is that you'll make this one thing work and it'll only help five percent of the people in the world. Well, this looks like a very general-purpose thing, so the timing is fortuitous. If we can keep building hardware that accelerates machine learning, the neural networks, the timing will be right: that neural network revolution will transform the software, the so-called Software 2.0, and the software of the future will be very different from the software of the past. And just as in our microprocessors, even though we're still going to have that same basic RISC instruction set to run big pieces of the software stack, like user interfaces and things like that, we can accelerate the small piece that's computationally intensive. It's not lots of lines of code, but it takes a lot of cycles to run that code; that's going to be the accelerator piece.

That's what makes this, from a computer designer's perspective, a really interesting decade. Hennessy and I talked about this: the title of our Turing Award speech is "A New Golden Age." We see this as a very exciting decade, much like when we were assistant professors and the RISC stuff was going on; that was a very exciting time, we were changing what was going on. We see it happening again: tremendous opportunities for people, because we're fundamentally changing how software is built and how we're running it.

So which layer of the abstraction stack do you think most of the acceleration will happen in, if you look at the next ten years? Google is working on a lot of exciting stuff with the TPU, which is closer to the hardware. There could be optimizations around the instruction set, there could be optimizations at the compiler level, it could even be at the higher-level software stack.

It's going to be, I mean, if you
think about the old RISC-CISC debate, it was both: it was software and hardware, it was the compilers improving as well as the architecture improving. That's likely to be the way things are now with machine learning. They're using domain-specific languages; languages like TensorFlow and PyTorch are very popular with the machine learning people, and those are raising the level of abstraction. It's easier for people to write machine learning in these domain-specific languages like PyTorch and TensorFlow.

So that's where most of the optimization might be?

Yeah, and there'll be both the compiler piece and the hardware piece underneath it. The fatal flaw for hardware people is to create really great hardware but not bring along the compilers. What we're seeing right now in the marketplace, because of this enthusiasm around hardware for machine learning, is probably billions of dollars invested in startup companies, and we're seeing startup companies go belly-up because they focused on the hardware but didn't bring the software stack along.

We talked about benchmarks earlier. Machine learning didn't really have a set of benchmarks; I think just two years ago it didn't have a set of benchmarks. And we've created something called MLPerf, which is a machine learning benchmark suite. Pretty much, the companies who didn't invest in the software stack couldn't run MLPerf very well, and the ones who did invest in the software stack could. And we're seeing, like in computer architecture, this is what happens: you have these arguments about RISC versus CISC, and people spend billions of dollars in the marketplace to see who wins. It's not a perfect comparison, but it kind of sorts things out, and we're seeing companies go out of business. And then companies like, there's a company in Israel called Habana: they came up with machine learning accelerators, and they had good MLPerf scores.
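The benchmarking discipline being described can be sketched in miniature: fix one agreed workload and measure every contender against it, rather than letting each vendor pick numbers that flatter its own design. The harness below is a hypothetical toy, not how MLPerf is actually implemented; MLPerf measures agreed training and inference tasks on real models:

```python
import time

def manual_sum(values):
    """A deliberately naive contender for the shared workload."""
    total = 0
    for v in values:
        total += v
    return total

def benchmark(impl, workload, repeats=5):
    """Best-of-N wall-clock seconds for one implementation on the workload."""
    best = float("inf")
    for _ in range(repeats):
        start = time.perf_counter()
        impl(workload)
        best = min(best, time.perf_counter() - start)
    return best

# Everyone runs the SAME workload, so the numbers are directly comparable.
workload = list(range(100_000))
for name, impl in [("builtin_sum", sum), ("manual_loop", manual_sum)]:
    print(f"{name}: {benchmark(impl, workload) * 1e3:.3f} ms")
```

The point of the shared-workload rule is exactly the SPEC story told here: once the workload is fixed, "my chip is faster" becomes a checkable claim rather than marketing.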
Intel had acquired a company earlier called nirvana a couple years ago they didn't reveal the amount of Perth's cores which was suspicious but month ago Intel announced that they're cancelling the Nirvana product line and they've bought Habana for two billion dollars and Intel's going to be shipping Habano chips which have hardware and software and run the ml perf programs pretty well and that's going to be their product line in the future brilliant so maybe just a linker briefly I'm a love metrics I love standards that everyone can gather around what are some interesting aspects of that portfolio of metrics well one of the interesting metrics is you know what we thought it was you know we I was involved in the start you know we that Peter Matson is leading the effort from Google Google got it off the ground but we had to reach out to competitors and say there's no benchmarks here this we didn't we think this is bad for the field it'll be much better if we look at examples like in the wrist days there was an effort to create a for the the people in the risk community got together competitors got together a building risk microprocessors to agree on a set of benchmarks that we called spec and that was good for the industry is rather before the different risk architectures were arguing well you can believe my performance others but those other guys are liars and that didn't do any good so we agreed on a set of benchmarks and then we could figure out who is faster between the various risk architectures but it was a little bit faster but that drew the market rather than you know people were afraid to buy anything so we argued the same thing would happen with him helper you know companies like Nvidia were you know maybe worried that it was some kind of trap but eventually we all got together to create a set of benchmarks and do the right thing right and we agree on the results and so we can see whether TP use or GPUs or CPUs are really faster than how much the faster and 
I think from an engineer's perspective as long as the results are fair Europe you can live with it okay you know you have a tip your hat to to your colleagues at another institution boy they did a better job than this what you what you hate is if it's it's false right they're making claims and it's just marketing and you know in that's affecting sales so you from an engineer's perspective as long as it's a fair comparison and we don't come in first place that's too bad but it's fair so we wanted to create that environment frame all perf and so now there's ten companies I mean ten universities and fifty companies involved so pretty much AML perf has is the is the way you measure machine learning performance and and it didn't exist even two years ago one of the cool things that I enjoy about the Internet has a few downsides but one of the nice things is people can see through BS a little better with the presence yes has a metrics it's so it's really nice a companies like Google and Facebook and Twitter now it's the cool thing to do is to put your engineers forward and to actually show off how well you do on these metrics there's not sort of it well there's a less of a desire to do marketing a less so in my in my sort of naive no I don't think well I was trying to understand that you know what's changed from the 80s in this era I think because of things like social networking Twitter and stuff like that if you if you put up you know stuff right that's just you know miss purposely misleading you know that you you can get a violent reaction in social media pointing out the flaws in your arguments right and so from a marketing perspective you have to be careful today that you didn't have to be careful that there'll be people who put off the flaw you can get the word out the flaws and what you're saying much more easily today than in the past you used to be it was used to be easier to get away with it and the other thing that's been happening in terms of starting off 
The other thing that's been happening, in terms of showing off engineers, is that on the software side people have largely embraced open-source software. Twenty years ago it was a dirty word at Microsoft; today Microsoft is one of the big proponents of open-source software, and that's the standard way most software gets built. It really shows off your engineers, because if you look at the source code you can see who's making the commits, who's making the improvements, who the engineers at all these companies are who are really great programmers making really solid contributions, which enhances their reputations and the reputations of the companies. But that's of course not everywhere. In the space I work in more, autonomous vehicles, the machinery of hype and marketing is still very strong, and there's less willingness to be open in this open-source way and to benchmark. So MLPerf represents the machine learning world being much better at being open and holding itself to standards, with an incredible set of benchmarks across the different computer vision and natural language processing tasks. Historically, though, it wasn't always that way. I had a graduate student working with me, David Martin. In some fields benchmarking has been around forever: computer architecture, databases, maybe operating systems; benchmarks are the way you measure progress. But he was working with me and then started working with Jitendra Malik, in the computer vision space, who I guess you've interviewed. And David Martin told me they didn't have benchmarks: everybody had their own vision algorithm and their own image, and the attitude was, here's my image, look at how well I do. So David Martin, back when he did his dissertation, figured out a way to do benchmarks. He had a bunch of graduate students identify images, and then ran benchmarks to see which algorithms did well. And that was,
as far as I know, kind of the first time people did benchmarks in computer vision, which predated all the things that eventually led to ImageNet and the like. But then the vision community got religion, and once we got as far as ImageNet, that let the guys in Toronto win the ImageNet competition, and then that changed the whole world. It's a scary step, actually, because when you enter the world of benchmarks, you actually have to be good to participate, as opposed to just believing you're the best in the world. And I don't think people were purposely misleading. If you don't have benchmarks, how do you know? You have your intuition. It's kind of like the way we used to do computer architecture: your intuition is that this is the right instruction set to do this job; I believe it, in my experience, my hunch is that it's true. We had to make things more quantitative to make progress, and in fields that don't have benchmarks, I just don't understand how they figure out whether they're making progress. We're kind of in the vacuum-tube days of quantum computing. What are your thoughts on this wholly different kind of space of architectures? You know, the idea of quantum computing has been around for a while, and I actually thought: well, I sure hope I retire before I have to start teaching this. I give these talks about the slowing of Moore's law and the need to change by doing domain-specific accelerators, and a common question is: what about quantum computing? The reason it comes up is that it's in the news all the time. The thing to keep in mind is that quantum computing is not right around the corner. There have been two national reports, one by the National Academy of Engineering and another by the Computing Community Consortium, where they did a frank assessment of quantum computing, and both
of those reports said that, as far as we can tell, error-corrected quantum computing is a decade away. So I think of it like nuclear fusion. There have been people excited about nuclear fusion for a long time, and if we ever get nuclear fusion it's going to be fantastic for the world; I'm glad people are working on it, but it's not right around the corner. Those two reports, to me, say it'll probably be 2030 before quantum computing is something that could happen. And when it does happen, it's going to be big-science stuff: microkelvin, almost-absolute-zero things that won't work if they vibrate, if a truck goes by. So this will be data-center stuff; we're not going to have a quantum cell phone. And it's probably a 2030 kind of thing. I'm happy that other people are working on it, but with all the news about it, it's hard not to think it's right around the corner, and that's why we need to do something, as Moore's law is slowing down, to keep computing improving for this next decade. We shouldn't be betting on quantum computing, or expecting quantum computing to deliver, in the next few years; it's probably further off. I'd be happy to be wrong; it'd be great if quantum computing becomes commercially viable, but it will be a set of applications, not general-purpose computation. It's going to do some amazing things, but there will be a lot of things that the old-fashioned computers are going to keep doing better for quite a while. And there'll be a teenager 50 years from now watching this video and saying, look how silly David Patterson was. Well, what did I say? I'll stand by it: we're not going to have quantum cell phones. So he's going to be watching. Well, given that we've had Moore's law, I just feel comfortable trying to do projects that are thinking about the next decade. I
admire people who are trying to do things that are 30 years out, but it's such a fast-moving field that I'm not good enough to figure out what the problems are going to be in 30 years; ten years is hard enough for me. So maybe, if it's possible, to untangle your intuition a little bit: I spoke with Jim Keller, I don't know if you're familiar with Jim, and he is trying to be a little bit rebellious; he quotes you as being wrong. Yeah. So, for the record: Jim says he has an intuition that Moore's law is not in fact dead yet and that it may continue for some time to come. What are your thoughts about Jim's ideas in this space? Yeah, this is just marketing. What Gordon Moore said is a quantitative prediction, and we can check the facts: doubling the number of transistors every two years. So we can look back at the last six years, which would be three two-year periods. Do our DRAM chips have eight times as many transistors as they did six years ago? We can look at Intel microprocessors from six years ago; if Moore's law is continuing, today's should have eight times as many transistors. The answer in both those cases is no. The problem has been that, because Moore's law was so embraced by the semiconductor industry (they would make investments in equipment to make Moore's law come true), semiconductor improvement and Moore's law are the same thing in many people's minds. So when I say, and I'm factually correct, that Moore's law no longer holds, that we are not doubling transistors every two years, the downside for a company like Intel is that people think that means technology has stopped improving. And so Jim is trying to counteract the impression that semiconductors are frozen in 2019 and are never going to get better. I never said that. What I said was:
Moore's law is no more, and I'm strictly looking at the number of transistors, because that's what Moore's law is. There's been this aura associated with Moore's law that the industry has enjoyed for fifty years: look at the field we're in, we're doubling transistors every two years, what an amazing field. Which is an amazing thing that they were able to pull off. But even Gordon Moore said no exponential can last forever. It lasted for 50 years, which is amazing, and this has a huge impact on the industry because of these changes we've been talking about. So, because Jim is trying to counteract that impression, he claims: Patterson says Moore's law is no more, but look, things are still improving, and TSMC says it's not over. But there's quantitative evidence that Moore's law is not continuing. So here's what I say now, since I understand the perception problem when I say Moore's law has stopped: I now say Moore's law is slowing down, and I think Jim wouldn't disagree with that. If the prediction is doubling every two years and I say it's slowing down, that's another way of saying it doesn't hold anymore. Slowing down sounds like things are still getting better, just not as fast, which is another way of saying Moore's law isn't working anymore; but it's still good for marketing. So you don't like expanding the definition of Moore's law? Well, yeah. As an educator: is this like modern politics, where everybody gets their own facts? Moore's law was a crisp statement. Carver Mead looked at Moore's observations, drawn on a log scale as a straight line, and that's the definition of Moore's law. There's this other thing Intel did for a while, interestingly before Jim joined them.
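The quantitative check described here is simple arithmetic: doubling every two years means a growth factor of 2^(years/2), so six years should give a factor of eight. A small sketch (the 2013 transistor count below is an illustrative assumption, not a quoted figure):

```python
def moores_law_factor(years, doubling_period=2.0):
    """Predicted growth factor if transistor counts double
    every `doubling_period` years (Moore's original claim)."""
    return 2.0 ** (years / doubling_period)

# Three two-year periods: chips "should" have 8x the transistors.
print(moores_law_factor(6))          # 8.0

# Hypothetical chip with 1.4 billion transistors in 2013:
# under Moore's law it would have ~11.2 billion by 2019.
print(1.4e9 * moores_law_factor(2019 - 2013))
```

Comparing that predicted factor against actual shipped transistor counts is exactly the "check the facts" test Patterson describes.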
They said: no, Moore's law isn't really the number of transistors doubling every two years; Moore's law is the cost of the individual transistor going down, cutting in half every two years. Now, that's not what he said, but they reinterpreted it, because they believed the cost of transistors was continuing to drop even if they couldn't get twice as many. People in industry have told me that's not true anymore: the more recent technologies got more complicated, and the actual cost per transistor went up, so even that corollary might not be true. The beauty of Moore's law was that it was very simple. It's like E = mc^2: wow, what an amazing prediction, so easy to understand, and the implications are amazing. That's why it was so famous as a prediction. And this reinterpretation of what it meant, this changing, is revisionist history. And they're not claiming there's a new Moore's law; they're not saying, by the way, that instead of every two years it's every three years. I don't think they want to say that. I think what's going to happen is that the new technology generations each get a little bit slower, so it is slowing down; the improvements won't be as great, and that's why we need to do new things. Yeah, I don't like that the idea of Moore's law is tied up with marketing. It would be nice if, whether it's marketing or not... Well, it could be affecting business, but it could also be affecting the imagination of engineers. If Intel employees actually believed that we're frozen in 2019, that would be bad, not just for Intel but for everybody. Moore's law is inspiring. Yeah, it inspires everybody. But what's happening right now, talking to people working in national offices and so on, is that a lot of the computer science community is unaware that this is going on: that we are in an era that's going to need radical change at lower
levels, change that could affect the whole software stack. If you're using cloud servers, and the servers you get next year are basically only a little bit faster than the servers you got this year, you need to know that, and we need to start innovating to deliver on it. If you're counting on your software adding a lot more features on the assumption that the computers will get faster, that's not true. So you're either going to have to start making your software stack more efficient, or you're going to have to start learning about machine learning. So it's kind of a warning, a call to arms, that the world is changing right now, and a lot of people, a lot of computer science PhDs, are unaware of that. A way to try to get their attention is to say that Moore's law is slowing down and that's going to affect your assumptions. We're trying to get the word out, and when companies like TSMC and Intel say, oh no, Moore's law is fine, then people think, okay, I don't have to change my behavior, I'll just get the next servers. If they start doing measurements, though, they'll realize what's going on. It'd be nice to have some transparency, metrics for the layperson to be able to know if computers are getting faster. And there are a bunch. Most people use clock rate as a measure of performance. It's not a perfect one, but if you've noticed, clock rates are more or less the same as they were five years ago. Computers are a little better; they haven't made zero progress, but they've made small progress. So there are some indications out there. And it's in our behavior: nobody buys the next laptop because it's so much faster than their last laptop. As for cell phones, I don't know why people buy new cell phones. When the new ones are announced the cameras are better, but that's kind of domain-specific: they're putting special-purpose hardware in to
make the processing of images go much better. So that's the way they're doing it. It's not that the ARM processor is twice as fast; it's that they've added accelerators to improve the experience of the phone. Can we talk a little bit about one other exciting space, arguably with the same level of impact as your work on RISC: RAID. In 1988 you co-authored the paper 'A Case for Redundant Arrays of Inexpensive Disks', hence RAID, and that's where you introduced the idea. It's incredible that that paper had this ripple effect, a really revolutionary effect. So first: what is RAID? This is work I did with my colleague Randy Katz and a star graduate student, Garth Gibson. We had just done the fourth-generation RISC project, and Randy Katz had an early Apple Macintosh computer. At this time everything was done with floppy disks, an old technology that didn't have much capacity; to get any work done you were always sticking your little floppy disk in and out. But they had started building what are called hard disk drives, magnetic material that can store information, for the Mac. And Randy asked a question when he saw this disk next to his Mac: jeez, it's a brand-new small thing. Before that, for the big computers, the disks would be the size of washing machines, and here's something about the size of a book. I wonder what we could do with that? Well, Randy was involved in the fourth-generation RISC project here at Berkeley in the 80s, so we had figured out a way to make the computation part, the processor part, go a lot faster. But what about the storage part? Can we do something to make it faster? So we hit upon the idea of taking a lot of these disks developed for personal computers and Macintoshes and putting many of them together
instead of one of these washing-machine-sized things. In the first draft of the paper we'd have 40 of these little PC disks instead of one washing-machine-sized thing, and they would be much cheaper, because they're made for PCs, and they could actually be faster, because there were 40 of them rather than one. So we wrote a paper like that and sent it to a former Berkeley student at IBM, and he said: well, this is all great and good, but what about the reliability of these things? Now you have 40 of these devices, each of which is kind of PC quality, not as good as the IBM washing machines (IBM dominated the storage business), so your reliability is going to be awful. And when we calculated it out, instead of breaking on average once a year, the array would break every two weeks. So we thought about it and said, well, we've got to address the reliability. We did it originally for performance, but we had to do reliability too. So the name, redundant array of inexpensive disks: the array of inexpensive disks is the collection of these PC-class disks, and the R, the redundant part, is that we have extra copies. If one breaks, we won't lose the information; we'll have enough redundancy that we can let some break and still preserve the information. And it turns out that if you put a modest number of extra disks in one of these arrays, it can not only be faster and cheaper than one of these washing-machine disks, it can actually be more reliable, because you can have a couple of failures even with these cheap disks, whereas one failure with the washing-machine thing would knock it out. Did you have a sense, just like with RISC, that in the 30 years that followed, RAID would take over? I'd say that I think I'm naturally an optimist, but I thought our ideas were right.
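The redundancy idea can be illustrated with XOR parity, one mechanism RAID-style schemes use; this is a toy byte-level sketch rather than any particular RAID level from the paper. A parity block is the XOR of the data blocks, so any single lost block can be rebuilt from the survivors:

```python
from functools import reduce

def parity(blocks):
    """Parity block = byte-wise XOR of all data blocks (equal lengths)."""
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

def rebuild(surviving_blocks, parity_block):
    """Any one missing block is the XOR of the parity with the survivors."""
    return parity(surviving_blocks + [parity_block])

disks = [b"DATA0001", b"DATA0002", b"DATA0003"]   # three "cheap" data disks
p = parity(disks)                                  # one extra redundancy disk

lost = disks[1]                                    # disk 1 fails...
recovered = rebuild([disks[0], disks[2]], p)       # ...rebuild from the rest
print(recovered == lost)   # True
```

The design point is the trade the paper describes: one extra disk of parity buys survival of any single failure across the whole array of cheap disks.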
I thought, kind of like Moore's law: if you looked at the history of disk drives, they went from washing-machine-sized things and were getting smaller and smaller, and the volume was with the smaller disk drives, because that's where the PCs were. So we thought the technological trend was that the volume disk drives would keep getting smaller, which was true: they went from, I don't know, eight inches in diameter, to five inches, to three inches. So it made sense to figure out how to do things with an array of disks. It was one of those things where, logically, we thought the technological forces were on our side, so we expected it to catch on. But there was that same kind of business question: IBM was the big pusher of these disk drives, and in the real world, would the technical advantage turn into a business advantage or not? It proved to be true; it did. We thought we were sound technically, and it was unclear on the business side, but as academics we believed the technology should win, and it did. And if you look at those thirty years, just from your perspective, are there interesting developments in the space of storage that have happened in that time? Yeah, a couple of things happened. What we did had a modest amount of redundancy; as people built bigger and bigger storage systems, they added more redundancy so they could handle more failures. But the biggest thing that happened in storage is this: for decades it was based on things physically spinning, called hard disk drives. You used to turn on your computer and it would make a noise; that noise was the disk drive spinning, rotating at like 60 revolutions per second. If you remember vinyl records, if you've ever seen those, that's what it looked like, and there was like a
needle, like on a vinyl record, that was reading it. So the big change is switching that over to a semiconductor technology called flash. Within the last decade or so, an increasing fraction of all the computers in the world use semiconductors for storage, flash drives, instead of magnetics: semiconductors writing information very densely. And that's been a huge difference. All the cell phones in the world use flash, most of the laptops use flash, all the embedded devices use flash instead of disk storage. In the cloud, magnetic disks are still more economical than flash, but they use both in the cloud. So it's been a huge change in the storage industry, this switch from primarily disk to primarily semiconductor for the individual storage devices. But the RAID mechanism still applies to those different kinds of storage? Yes, people still use RAID ideas, because, and this is kind of interesting, kind of psychological if you think about it, people have always worried about the reliability of computing since the earliest days. But if we're talking about computation: if your computer makes a mistake, the computer can check and say, we screwed up, we made a mistake, and what happens is you have to redo the program that was running, which is a hassle. For storage, if you've sent important information away and it loses that information, you go nuts. Yeah, this is the worst, oh my god. If you have a laptop and you're not backing it up on the cloud or something like that, and your disk drive breaks, which it can do, you'll lose all that information and just go crazy. So the importance of reliability for storage is tremendously higher than the importance of reliability for computation, because of the consequences. So yes, RAID ideas are still very popular even with the switch of technology, although flash drives are more reliable.
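The back-of-the-envelope reliability argument from earlier (40 cheap disks without redundancy fail far more often than one disk) is easy to reproduce; the per-disk MTTF below is an illustrative assumption, not the paper's exact figure:

```python
def mttf_array(mttf_disk_years, n_disks):
    """With n independent disks and no redundancy, the array is down
    whenever ANY disk fails, so its MTTF is the single-disk MTTF / n."""
    return mttf_disk_years / n_disks

# Hypothetical numbers: 40 PC-class disks, each with a ~1.5-year MTTF.
weeks = mttf_array(1.5, 40) * 52
print(f"array MTTF ~ {weeks:.1f} weeks")  # roughly two weeks: hence the R in RAID
```

This division by n is what turned "breaks once a year" into "breaks every two weeks" in the story above, and it is why the redundancy had to be added.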
Still, if you're not doing anything like backing it up to get some redundancy, you're taking great risks. You've said that, for you and possibly for many others, teaching and research don't conflict with each other, as one might suspect, and in fact they kind of complement each other. So maybe the question I have is: how has teaching helped you in your research, or just in your entirety as a person who both teaches and does research and thinks and creates new ideas in this world? I think what happens is this. There's this kind of tenure system around doing research, this model that's popular in America (I think America really made it happen), where we can attract really great faculty to research universities because they get to do research as well as teach. Especially in fast-moving fields, this means people are up to date and they're teaching those kinds of things. But when you run into a really bad professor, a really bad teacher, I think the students think: well, this guy must be a great researcher, because why else would he be here? After 40 years at Berkeley we had a retirement party, and I got a chance to reflect, and I looked back at some things, and that is not my experience. I saw a photograph of five of us in the department who had won the Distinguished Teaching Award from campus, a very high honor; I've got one of those, one of the highest honors. The five of us in that picture are Manuel Blum, Richard Karp, me, Randy Katz, and John Ousterhout, contemporaries of mine; I mentioned Randy already. All of us are in the National Academy of Engineering, we've all won the Distinguished Teaching Award, and Blum, Karp, and I all have Turing Awards, the highest award in computing. So it's the opposite, right: they're highly correlated. Probably the other way to think of it is that if
you're very successful, people may be successful at everything they do; it's not an either/or. But it's an interesting question whether, specifically for teaching (that's probably true in general, but specifically for teaching), there's the Richard Feynman thing, right? Is there something about teaching that actually helps your research, makes you think deeper and more outside the box? Yeah, absolutely. I was going to bring up Feynman. He criticized the Institute for Advanced Study. The Institute for Advanced Study was this thing created in Princeton, where Einstein and all these smart people went, and when he was invited, he said he thought it was a terrible idea. It was supposed to be heaven, right, a university without any teaching, but he thought it was a mistake. Getting up in the classroom and having to explain things to students, and having them ask questions like, well, why is that true, makes you stop and think. That's what he thought, and I agree. I think that interaction at a research university, having students with bright young minds asking hard questions the whole time, is synergistic. A university without teaching wouldn't be as vital and exciting a place, and I think it helps stimulate the research. Another romanticized question: what's your favorite concept or idea to teach, something that inspires you or that you see inspire the students? Is there something that pops to mind, or that puts the fear of God in them, whichever is most effective? In general, I think people are surprised. I've seen a lot of people who don't think they like teaching come give guest lectures or teach a course and get hooked on seeing the lights turn on. You can explain something to people that they don't understand, and suddenly they get something that's important and difficult, and just seeing the lights turn on is a real satisfaction
there. I don't think there's any one specific example of that; it's just the general joy of seeing them understand. I have to talk about this, because I wrestled too; I do martial arts. Yes, I love wrestling. I'm Russian, so of course. I've talked to Dan Gable. Oh yeah, Gable is my era kind of guy. So you wrestled at UCLA, among the many other things you've done in your life, competitively in sports and in science. What have you learned about life, and maybe even science, from wrestling? In fact I wrestled at UCLA, but also at El Camino Community College, and we were state champions of California at El Camino. I got into UCLA, but I decided to go to the community college, even though it's much harder to get into UCLA than community college. I was talking to my mom and asked why I made that decision; I thought it was because of my girlfriend. She said: well, it was the girlfriend, and you also thought the wrestling team was really good. And we were right; we had a great wrestling team. We actually wrestled against UCLA at a tournament, and we beat UCLA, a community college with just freshmen and sophomores. The reason I bring this up is that they've invited me back to El Camino to give a lecture next month, so my friend from the wrestling team (we're still close) and I are right now reaching out to other members of the team for a reunion. In terms of what it meant to me, it was a huge difference. I was right at the age cutoff, with my birthday on December 1st, so I was almost always the youngest person in my class, and our family matured late, so I was almost always the smallest guy. I took kind of nerdy courses, but I was wrestling, and wrestling was huge for my self-confidence in high school. And then, you
know, I kind of got bigger at El Camino and in college, so I had this physical self-confidence, and it translated into research self-confidence. And I've had this feeling even today, in my 70s: if something bad is going on in the streets, physically, I'm not going to ignore it. I'm going to stand up and try to straighten it out. And that kind of confidence carries through the entirety of your life. Yeah, and the same thing happens intellectually. If there's something going on where people are saying something that's not true, I feel it's my job to stand up, just like I would in the street if somebody were attacking some woman or something; I'm not standing by and letting that happen. So, kind of ironically, it translates. The other thing that worked out: I had really great high school and college coaches, and they believed that, even though wrestling is an individual sport, we would be more successful as a team if we bonded together and supported each other. In wrestling it's one-on-one, and everybody could be on their own, but they felt that if we bonded as a team we'd succeed. So I picked up those skills of how to form successful teams from wrestling, and I think most people would say one of my strengths is that I can create teams of faculty and grad students that all pull together for a common goal, and often be successful at it. I got both of those things from wrestling. Also, I've heard this line that people in collision sports, sports with physical contact like wrestling or football, are a little bit more assertive or something. I think that also comes through: I didn't shy away from the RISC debates. I enjoyed taking on the
arguments and stuff like that. So I'm really glad I did wrestling. I think it was really good for my self-image, and I learned a lot from it. Sports done well have lots of positives you can take away: leadership, how to form teams, how to be successful. We've talked about metrics a lot. There's a really cool metric you developed for bench press and weightlifting that we don't have time to talk about, but people should look into it; it rethinks the way we think about metrics in weightlifting. But let me ask about metrics more broadly, since that appeals to you in all its forms. Let's take the most ridiculous, the biggest question: the meaning of life. If you were to try to put metrics on a life well lived, what would those metrics be? A friend, Randy Katz, said this. He said: when it's time to sign off, the measure isn't the number of zeros in your bank account; it's the number of inches in the obituary in The New York Times. And, this is a cliché, people don't die wishing they'd spent more time in the office. As I reflect on my career, there have been half a dozen or a dozen things I've been proud of, and a lot of them aren't papers or scientific results. Certainly my family: my wife, we've been married more than 50 years, kids and grandkids. That's really precious. There are education things I've done that I'm very proud of, books and courses, and I did some work helping underrepresented groups that was effective. So it was interesting to see, when I reflected, what those things were. I had hundreds of papers, but it wasn't the papers; I mean, I was proud of the papers like the RISC and RAID work, but a lot of what I valued was not those things. So for people who just spend their lives going after the dollars, or going after all the papers in the world, those are probably not the
things that, afterwards, you're going to care about. When I got the offer from Berkeley, but before I showed up, I read a book in which they interviewed a lot of people in all walks of life, and what I got out of that book was that the people who felt good about what they did were the people who affected people, as opposed to things that were more transitory. So I came into this job assuming that it wasn't going to be the papers; it was going to be the relationships with the people over time that I would value. And that was a correct assessment. It's the people you work with, the people you can influence, the people you can help; those are the things that you feel good about toward the end of your career, not the stuff that's more transitory. I don't think there's a better way to end it than talking about your family: over 50 years of being married to your childhood sweetheart. How? When you tell people you've been married 50 years, they want to know why, how. I can tell you the nine magic words that you need to say to your partner to keep a good relationship. The nine magic words are: I was wrong; you were right; I love you. And you've got to say all nine. You can't say: I was wrong, you were right, you're a jerk. Freely acknowledging that you made a mistake, that the other person was right, and that you love them really gets you over a lot of bumps in the road. So that's what I pass along. Beautifully put. David, it's a huge honor. Thank you so much for the books you've written, for the research you've done, for changing the world. Thank you for talking today. Oh, thanks for the interview. Thanks for listening to this conversation with David Patterson, and thank you to our sponsors, the Jordan Harbinger Show and Cash App. Please consider supporting this podcast by going to jordanharbinger.com/lex and downloading Cash App and using code LEXPODCAST. Click the links, buy the stuff; it's the best way to support this podcast and the journey I'm on. If
you enjoy this thing, subscribe on YouTube, review it with five stars on Apple Podcasts, support it on Patreon, or connect with me on Twitter at lexfridman, spelled without the e (try to figure out how to do that: it's just F-R-I-D-M-A-N). And now, let me leave you with some words from Henry David Thoreau: 'Our life is frittered away by detail. Simplify, simplify.' Thank you for listening, and hope to see you next time.