Welcome to the Making Sense Podcast. This is Sam Harris. Just a note to say that if you are hearing this, you are not currently on our subscriber feed and will only be hearing the first part of this conversation. In order to access full episodes of the Making Sense Podcast, you will need to subscribe at samharris.org.
There you will find our private RSS feed to add to your favorite podcatcher, along with other subscriber-only content. We don't run ads on the podcast, and therefore it is made possible entirely through the support of our subscribers. So if you enjoy what we are doing here, please consider becoming one. Today I am speaking with Jeff Hawkins.
Jeff is the co-founder of Numenta, a neuroscience research company, and also the founder of the Redwood Neuroscience Institute. And before that, he was one of the founders of the field of handheld computing, starting Palm and Handspring. He is also a member of the National Academy of Engineering. And he is the author of two books.
The first is On Intelligence, and the second and most recent is A Thousand Brains: A New Theory of Intelligence. And Jeff and I talk about intelligence from a few different sides here. We start with the brain. We talk about how the cortex creates models of the world, the role of prediction in experience.
We discuss the idea that thought is analogous to movement in conceptual space. But for the bulk of the conversation, we have a debate about the future of artificial intelligence, and in particular, the alignment problem, and the prospect that AI could pose some kind of existential risk to us. As you'll hear, Jeff and I have very different takes on that problem. Our intuitions divide fairly sharply.
And as a consequence, we have a very spirited exchange. Anyway, it was a lot of fun. I hope you enjoy it. And I bring you Jeff Hawkins.
I'm here with Jeff Hawkins. Jeff, thanks for joining me. Thanks for having me, Sam. It's a pleasure.
I think we met probably just once, but I feel like we met about 15 years ago at one of those Beyond Belief conferences at the Salk Institute. Does that ring a bell? You know, I was at one of the Beyond Belief conferences, and I don't recall meeting you there, but it's totally possible. It's possible we didn't meet, but I remember, I think we had an exchange where, you know, one of us was in the audience, and the other was on stage, an exchange over 50 feet.
Yeah, oh, that makes sense. Yeah, I was in the audience, and I was speaking up. Yeah, okay. I was probably on stage defending some cockamamie conviction.
Well, anyway, nice to almost meet you once again. And you have a new book, which we'll cover part of, by no means exhausting its topics of interest, but the new book is A Thousand Brains, and it's a work of neuroscience and also a discussion about the frontiers of AI and where all this is heading. But maybe we should start with the brain part of it and start with the really novel and circuitous and entrepreneurial route you've taken to get into neuroscience. This is the non-standard course to becoming a neuroscientist.
Give us your brief biography here. How did you get into these topics? Well, I fell in love with brains when I just got out of college. So I studied electrical engineering in college, and right after I started my first job, I read an article by Francis Crick about brains and how we don't understand how they work.
And I just became enamored. I said, oh my God, we should understand this. This is me. I am my brain.
No one seems to know how this thing is working. And I just couldn't accept that. And so I decided to dedicate my life to figuring out what's going on when I'm thinking and who we are basically as a species. And it was a difficult path.
So I quit my job. I essentially applied to become a graduate student, first at MIT in AI, but then I settled at Berkeley in neuroscience. And I said, okay, I'm going to spend my life figuring out how the neocortex works. And I found out very quickly that that was a very difficult thing to do, not just scientifically, but from the practical aspects of science: you couldn't get funding for that.
It was considered too ambitious. It was theoretical work and people didn't fund theoretical work. So after a couple of years as a graduate student at Berkeley, I set a different path. I said, okay, I'm going to go back to work in industry for a few years to mature, to figure out how to effect institutional change, because I was up against an institutional problem, not just a scientific problem.
And that turned into a series of successful businesses that I was involved with and started, including Palm and Handspring. These are some of the early handheld computing companies. And we were having a tremendous amount of success with that. But it was never my mission to stay in the handheld computing industry.
I wanted to get back to neuroscience. And everybody who worked for me knew this. In fact, I told the investors, I'm only going to do this for four years. And they said, what?
Yeah, that's it. But it turned out to be a lot longer than that because of all the success we had. But eventually, I just extracted myself from it. And I said, I've got to go; I only have so many years left in my life.
So after having all that success in the mobile computing space, I started a neuroscience institute. This was at the recommendation of neuroscience friends of mine. So they helped me do that. And I ran that for three years and now I've been running sort of a private lab just doing current neuroscience for the last 17 years.
That's Numenta, right? That's Numenta, yeah. And we've made some really significant progress in our goals. And the book documents some of the recent really significant discoveries we've made.
So am I right in thinking that you made enough money at Palm and Handspring that you could self-fund your first neuroscience institute? Or is that not the case? Did you have to go raise money? It was, well, it was a bit of both.
Certainly, I was a major contributor. I wasn't the only one. But I didn't want the funding to be the driver of what we did and how we spent all our time. So at the institute, we had collaborations with both Berkeley and Stanford.
We didn't get funds from them, but we did work with them on various things. And that was mostly funded by myself. At Numenta, I'm still a major contributor. But there are other people who've invested in Numenta.
We have one outside venture capitalist and several people, but I'm still a major contributor to it. I just view that as a sort of a necessary thing to get on with the science and not have to worry about it. Because when I was at Berkeley, what I was told over and over again, I really can't understand this. In fact, I went to Washington to talk to the National Science Foundation and the National Institutes of Health and also to DARPA, who were the funders of neuroscience.
And everyone thought what we were doing, which is sort of big theory, large-scale theories of cortical function, that this was like the most important problem to work on. But everyone said they can't fund it for various reasons. And so over the years, I've come to appreciate it's very difficult to be a scientist doing what we do with traditional funding sources. But we don't work outside of science.
We partner with labs and we go to conferences and we publish papers and we do all the regular stuff. Right, right. It's amazing how much comes down to funding or lack of funding and the incentives that would dictate whether something gets funded in the first place. It's really, it's far from a perfect system.
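It's a kind of intellectual market failure. Yeah, it is fascinating. We can have a whole conversation about this sometime, perhaps. Because I guess why is it so hard?
Why can't people fund this? And there are reasons for it. And it's a complex, strange thing, the world I'm living in. I'm going to get one chance here.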
Why people can't fund this? And there's reasons for it. And it's a complex, strange thing the world I'm living in. I'm going to get one chance here.
If I can't do this through like, you know, working as a graduate student to getting a position in university, how am I going to do it? And I said, okay, it's not what I thought, but this is what it's going to be. Nice. Well, let's jump into the neuroscience side of it.
Generally speaking, we're going to be talking about intelligence and how it's accomplished in physical systems. So let's start with a definition, however loose. What is intelligence in your view? So I didn't know and didn't have any preconceived ideas about what this would be.
It was a mystery to me. But we've learned what a good portion of your brain is doing. And so we studied the neocortex, which is about 70% of the volume of the human brain. And I now know what that does.
And so I'm going to take that as my definition of our intelligence here. What's going on in your neocortex is it's learning a model of the world, an internal recreation of all the things in the world that you know of. And how it does that is the key, and that's what we've discovered. But it's this internal model.
And intelligence requires having an internal model of the world in your head. And it allows you to recognize where you are. It allows you to act on things. It allows you to plan and think about the future.
So if I'm going to say, what happens when I do this? The model tells you that. So to me, that's intelligence. Many people would reach for an external definition or a functional one, which has to take in the environment and something about being able to flexibly meet your goals under a range of conditions. You know, more flexibly than rigidly.
I guess there's rigid forms of intelligence. But when we're talking about anything like general intelligence, we're talking about something that is not merely hardwired and reflexive. But if you have an internal model of the world, you have to learn it. I mean, at least from a human point of view.
There are some things we have built in when we're born. But the vast majority of what you and I know, Sam, is learned. You know, you didn't know what a computer was when you were born. You don't know what a coffee cup is.
You don't know what a building is. You don't know what doors are. You don't know what computer codes are. None of this stuff.
Almost everything we interact with in the world today is learned, even language. We don't know any particular language when we're born. We don't know mathematics. So we have to learn all these things.
So if you want to say there might be an internal model that wasn't learned, well, that's pretty trivial. But I'm talking about models that are learned and you have to interact with the world to learn it. You can't learn it without being present in the world, without having an embodiment, without moving about, touching and seeing and hearing things. So a large part of what people think about, like you brought up, is, okay, you know, we're able to solve a goal.
But that's what a model lets you do. That is not what intelligence itself is. Intelligence is having this ability to solve any goal, right? Because if your model covers that part of the world, you can figure out how to manipulate that part of the world and achieve what you want.
So I'll give you a little further analogy. It's a little bit like computers. When we talk about a universal Turing machine or what a computer is, it's not defined by what the computer is applied to do. It's like a computer isn't something that solves a particular problem.
A computer is something that works on a set of principles. And that's how I think about intelligence. It's a modeling system that works on a set of principles. Those principles can exist in a mouse and a dog and a cat and a human and probably birds.
But don't focus on what those animals are doing. Yeah, it's important to point out that a model need not be a conscious model. In fact, most of our models are not conscious and might not even be in principle available to consciousness. Although I think the boundary, something that you'd say is happening entirely in the dark does have a kind of or can have a kind of liminal conscious aspect.
So I'm going to take the coffee cup example. This leads us into a more granular discussion of what it means to have a model of anything at the level of the cortex. But if I reach for my coffee cup and grasp it, the ordinary experience of doing that is something I'm conscious of. I'm not conscious of all of the prediction that is built into my accomplishing that and experiencing what I experience when I touch a coffee cup.
And yet it's prediction that is required, having some ongoing expectation of what's going to happen when each finger touches the surface of the cup, that allows me to detect any error there or to be surprised by something truly anomalous. So if I reach for a coffee cup and it turns out that it's a hologram of a coffee cup and my hand passes right through it, the element of surprise there seems predicated on some ongoing predictive processing to which the results of my behavior are being compared. So maybe you can talk about what you mean by having a model and how prediction is built into that. Yeah, well, my first book, which I published like 14 years ago, called On Intelligence, was just about that topic.
It was about how the brain is making all these predictions all the time, in all your sensory modalities, and you're not aware of it. That's sort of the foundation, and you can't make a prediction without a model. To make a prediction, you have to have some expectation, whether you're aware of it or not, and that expectation has to be driven by some internal representation of the world that says, hey, you're about to touch this thing, I know what it is, it's supposed to feel this way, even if you're not aware that you're doing it. One of the key discoveries we made, and this was maybe about eight years ago: we had to get to the bottom of how neurons make predictions.
What is the physical manifestation of a prediction in the brain? Most of these predictions, as you point out, are not conscious. You're not aware of them; they're just happening, and if something is wrong, then your attention is drawn to it. So if you felt a coffee cup and there was a little burr on the side, or a crack, that you didn't expect, you'd say, oh, there's a crack. What was the brain doing when it was making that prediction? We have a theory about this, and I wrote about it in the book a bit. I think it's a beautiful theory. Basically, most of the predictions that are going on in your brain, not all of them, but most of them, happen inside individual neurons. It is internal to the individual neurons. Now, a single neuron can't predict something on its own; an ensemble of neurons does this. But it's an internal state, and we wrote a paper that came out in 2016, it's called Why Do Neurons Have So Many Synapses?
And what we posited in that paper, and I'm pretty sure this is correct, is that neurons have these thousands of synapses, and most of those synapses are being used for prediction. When a neuron recognizes a pattern, it says, okay, I'm supposed to be active soon; everything is according to our model here, I should be becoming active soon. And it goes into this internal state. The neuron itself is saying, okay, I'm expecting to become active. You can't detect that consciously; it's internal to the neuron. It's essentially just a depolarization, a change of the voltage of the neuron. But we show how, in a network of these neurons, if your prediction is correct, then a small subset of the neurons become active, but if the prediction is incorrect, a whole bunch of neurons become active at the same time, and that draws your attention to the problem. So it's a fascinating problem, but most of the predictions going on in your brain are not accessible outside of individual neurons, so there's no way you can be conscious of them.
I guess most people are familiar with the general anatomy of a neuron, where you have this spindly-looking thing: there's a cell body, and there's a long process, the axon, leading away, which carries the action potential, if that neuron fires, to the synapse and communicates via neurotransmitters to other neurons. But in the standard case, on the other side of the cell body, there's this often really profuse arborization of dendrites, which is kind of a mad tangle of processes that receive information from other neurons to which this neuron is connected. And it's the integration of information on that side, before the neuron fires, that changes the probability of its firing. That's the place where you are locating the full set of changes that constitute prediction in the case of a system of neurons.
Yeah, essentially. For many years, people looked at the connections on the dendrites, on the bushy part, called synapses, and when they activated a synapse, most of the synapses were so far from the cell body that they didn't really have much of an effect. There are thousands and thousands of them out there, but they didn't seem powerful enough to make anything occur. What was discovered, basically over the last 20 years, is that there's a second type of spike. You mentioned the one that goes down the axon; that's the action potential. But there are also spikes that travel along the dendrites. So basically what happens is that the individual sections of the dendrites, like little branches of this tree, can each recognize patterns on their own. They can recognize hundreds of separate patterns on these different branches, and they can cause this spike to travel along the dendrite, and that changes the voltage of the cell body a little bit. That is what we call the predictive state. The cell is, like, primed; it says, oh, I'm ready to fire. And it's not actually a probability change; it's the timing. A cell that's in this predictive state, that says, I think I should be firing very shortly, if it does generate the regular spike, the action potential, does it a little bit sooner than it would have otherwise. And it's the timing that is the key to making the whole circuit work. We're getting pretty down in the weeds here.
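The mechanism just described can be illustrated with a very small Python sketch. This is only a toy under stated assumptions, not Numenta's implementation: the class name `MiniColumn`, its methods, and the four-cell size are hypothetical, and the step where predicted cells suppress their neighbors is an assumed reading of how earlier firing leads to sparse activity. It shows the two outcomes from the conversation: a correct prediction activates a small subset of cells, and an unpredicted input activates all of them at once.

```python
from dataclasses import dataclass, field


@dataclass
class MiniColumn:
    """Toy group of cells that respond to the same input (hypothetical model)."""

    n_cells: int = 4
    predicted: set = field(default_factory=set)  # cells currently depolarized

    def predict(self, cells):
        """Put the given cells into the predictive (depolarized) state."""
        self.predicted = set(cells)

    def activate(self):
        """Input arrives. Return (active_cells, surprised)."""
        if self.predicted:
            # Assumption: predicted cells fire slightly sooner and suppress
            # the rest, so only the predicted subset becomes active.
            active, surprised = set(self.predicted), False
        else:
            # Nothing was predicted: every cell fires at the same time.
            active, surprised = set(range(self.n_cells)), True
        self.predicted = set()  # the predictive state is consumed by the input
        return active, surprised


column = MiniColumn()
column.predict({2})
print(column.activate())  # ({2}, False): prediction confirmed, sparse activity
print(column.activate())  # ({0, 1, 2, 3}, True): unexpected input, all cells fire
```

The point of the design is that the prediction itself is invisible from outside (it lives only in `predicted`); what the rest of the system can observe is whether the resulting activity is sparse or not.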
I don't know if all your listeners will appreciate that. Yeah, no, I think it's useful, though. More weeds here. One of the novel things about your argument is that it was inspired by some much earlier theorizing.
You mark your debt to Vernon Mountcastle. But the idea is that there's a common algorithm operating more or less everywhere at the level of the cortex. That is, the cortex is doing essentially the same thing, whether it's producing language or vision or any other sensory channel or motor behavior. So talk about the general principle that you spend a lot of time on in the book, the organization of the neocortex into cortical columns and the implications this has for how we view what the brain is doing in terms of sensory and motor learning and all of its consequences.
This is, Vernon Mountcastle made this proposal back in the 70s. And it's just a dramatic idea. And it's an incredible idea. It's so incredible that some people just refuse to believe it, but other people really think it's a tremendous discovery.
But what he noticed was, if you look at the neocortex, if you could take one out of a human's head, unfold it, and lay it flat, it's like a sheet. It's about two and a half millimeters thick. It is about the size of a large dinner napkin, or 1,500 square centimeters.
And the different parts of it, like they do different things. There's parts that do vision, there's parts that do language, and parts that do hearing, and so on. But if you cut into it, and you look at the structure in any one of these areas, it's very complicated. There are dozens of different cell types, but they're very prototypically connected, and they're arranged in certain patterns and layers and different types of things.
So it's a very complex structure, but it's almost the same everywhere. It's not the same everywhere, but almost the same everywhere. And so this is not just true in a human neocortex, but if you look at a rat's neocortex, or a dog's neocortex, or a cat, or a monkey, the same basic structure is there. And what Vernon Mountcastle said is that all the parts of the neocortex are actually, we think of them as doing different things, but they're actually all doing some fundamental algorithm, which is the same.
So hearing and touch and vision are really the same thing. He says, if you took part of the cortex and you hook it up to your eyes, you'll get vision. If you hook it up to your ears, you'll get hearing. If you hook it up to other parts of the neocortex, you'll get language.
And so he spent many years giving the evidence for this. He proposed further that this algorithm was contained in what's called a column. And so if you take a small area of this neocortex, remember it's about two and a half millimeters thick, and you take a very skinny little one-millimeter column out of it, that is the processing element. And so in our human neocortex, we have about 150,000 of these columns.
Other animals have more or less. People should picture something resembling a grain of rice in terms of scale here. Yeah, yeah. I sometimes take a piece of skinny spaghetti, like, you know, angel hair pasta or something like that, and cut it into little two-and-a-half-millimeter lengths and stack them side by side.
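The figures quoted in this stretch of the conversation are consistent with each other, which a short calculation makes visible. The sketch below uses only the numbers stated above (1,500 square centimeters, 150,000 columns, 2.5 millimeters thick); the variable names are illustrative, and the result is a back-of-the-envelope average, not an anatomical measurement.

```python
SHEET_AREA_CM2 = 1_500   # neocortex area quoted in the conversation
N_COLUMNS = 150_000      # column count quoted in the conversation
THICKNESS_MM = 2.5       # sheet thickness quoted in the conversation

MM2_PER_CM2 = 100        # 1 cm = 10 mm, so 1 cm^2 = 100 mm^2

sheet_area_mm2 = SHEET_AREA_CM2 * MM2_PER_CM2                # 150,000 mm^2
area_per_column_mm2 = sheet_area_mm2 / N_COLUMNS             # average footprint
volume_per_column_mm3 = area_per_column_mm2 * THICKNESS_MM   # average volume

print(area_per_column_mm2, volume_per_column_mm3)  # 1.0 2.5
```

An average footprint of about one square millimeter per column matches the "skinny little one-millimeter column" and the grain-of-rice picture.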
Now, the funny thing about columns is you can't see them. They're not visible things. You can look in a microscope and you won't see them. But he pointed out why they're there.
It has to do with how they're connected. So all the cells in one of these little millimeter pieces of rice or spaghetti, if you will, are all processing the same thing. And the next piece of rice over is processing something different, and the next piece over is processing something different again. And so he didn't know what was going on in the cortical column.
He articulated the architecture. He talked about the evidence that this exists. He said, here's the evidence why these things are all doing the same thing. But he didn't know what it was.
It's kind of hard to imagine what it is that this algorithm could be doing. But that was essentially the core of our research. That's what we've been focused on for close to 20 years. So it's also hard to imagine the microanatomy here, because in each one of these little columns, there's something like 150,000 neurons on average.
And if you could just unravel all of the connections there, you know, the tiny filaments of nerve endings, what you'd have there is on the order of kilometers in length, you know, all wound up into that tiny structure. So it's a strange juxtaposition of simplicity and complexity, but it's certainly a mad tangle of processes in there. Yeah, this is why brains are so hard to study. You know, if you look at another organ in the body, whether it's the heart or the liver or something like that, and you take a little section of it, it's pretty uniform, you know what I'm saying?
But here, you take a teeny, teeny piece of the cortex, and it's got this incredible complexity in it, which is not random, it's very specific. And so, yeah, it's hard to wrap your head around how complex it is. But we need to be complex, because what we do as humans is extremely complex. And, you know, we shouldn't be fooled that we're just a bunch of neurons that are doing some mass action.
No, there's very complex processing going on in your brain. It's not just a blob of neurons that are pulsating; there are very detailed mechanisms underlying it. And we figured out what some of those are. So describe to me what you mean by this phrase, a reference frame. What does that mean at the level of the cortex and cortical columns?
Yeah. So we're jumping to the end point, because that's not where we started. We were trying to figure out how cortical columns work. And what we realized is that they're little modeling engines.
Each one of these cortical columns is able to build a model of its input. And that model is what we would call a sensory motor model. That is, it's getting input. Let's assume it's getting input from your finger, right?
A tip of your finger, one of the columns is getting input from the tip of your finger. And as your finger moves and touches something, the input changes. But it's not sufficient to know just how the input changes for you to build a model of the object you're touching. I use the coffee cup example quite a bit, because that's how we use it.
If you move your finger over the coffee cup, and you're not even looking at the coffee cup, you can learn a model of the coffee cup. You can feel like, just with one finger, you can feel like, oh, this is what its shape is. But to do that, your brain, that cortical column, your brain as a whole, but that cortical column individually has to know something about where your finger is relative to the cup. It's not just a changing pattern that's coming in.
It has to know how your finger's moving and where your finger is as it touches it. So the idea of a reference frame is a way of noting a location. You have to have a location signal. You have to have some knowledge about where things are in the world relative to other things.
In this case, where's your finger relative to the object you're trying to touch, the coffee cup? And we realized that for your brain to make a prediction of what you're going to feel when you touch the edge of the cup, and again, as you mentioned earlier, you're not conscious of this, you reach for the cup, and your brain's predicting what all your fingers are going to feel. It needs to know where the finger's going to be. And it has to know what the object is, it's a cup, and it needs to know where it's going to be.
And that requires a reference frame. A reference frame is just a way of noting a location. It's saying, relative to this cup, your finger's over here, not over there, not on the handle, up at the top, whatever it is. And this is a deduced property.
We can say with certainty that this has to exist. If your finger's going to make a prediction when it reaches the coffee cup, it needs to know where the finger is, and that location has to be relative to the cup. So you can just say with certainty that there need to be reference frames in the brain; this is not a controversial idea. What is perhaps novel is that we realized that these reference frames exist in every cortical column.
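The argument above, that predicting what a finger will feel requires knowing where the finger is relative to the object, can be sketched as a tiny data structure. This is a minimal illustration under assumptions, not the model from the book: `ObjectModel`, its methods, the integer grid of locations, and the feature strings are all hypothetical. Features are stored at object-relative locations, a movement updates the location signal, and a prediction is a lookup at the current location.

```python
from typing import Dict, Optional, Tuple

Location = Tuple[int, int, int]  # position in the object's own reference frame


class ObjectModel:
    """Toy sensorimotor model: features stored at object-relative locations."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.features: Dict[Location, str] = {}
        self.location: Location = (0, 0, 0)  # finger position relative to the object

    def move(self, delta: Location) -> None:
        """Update the location signal from a known finger movement."""
        x, y, z = self.location
        dx, dy, dz = delta
        self.location = (x + dx, y + dy, z + dz)

    def learn(self, sensed: str) -> None:
        """Store what was sensed at the current location."""
        self.features[self.location] = sensed

    def predict(self) -> Optional[str]:
        """What should the finger feel here? None if nothing is stored."""
        return self.features.get(self.location)

    def surprised_by(self, sensed: str) -> bool:
        """True when the sensation contradicts a stored prediction."""
        expected = self.predict()
        return expected is not None and expected != sensed


cup = ObjectModel("coffee cup")
cup.learn("smooth rim")           # learned at (0, 0, 0)
cup.move((1, 0, 0))
cup.learn("curved handle")        # learned at (1, 0, 0)

cup.move((-1, 0, 0))              # move back to the rim
print(cup.predict())              # smooth rim
print(cup.surprised_by("crack"))  # True
```

Without the `location` field the model would see only a changing stream of sensations and could not say which one to expect next, which is the deduction made in the conversation.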
And it's the structure of knowledge. It applies not just to what your finger feels on a coffee cup and what you see when you look at it, but also to how you arrange all your knowledge in the world; it is stored in these reference frames. And so we're jumping ahead here in many steps, but when we think, when we posit, when we try to reason in our head, even with my language right now, the neurons are walking through locations in reference frames, recalling the information stored there, and that's what comes into your head, that's what you say. So the reference frame becomes the core structure for everything you do; the knowledge about the world is in these reference frames.
Yeah, you make a strong claim about the primacy of motion, right? Everyone knows that there's part of the cortex devoted to motor action, we refer to it as the motor cortex, and distinguish it from sensory cortex in that way. But it's also true that other regions of the cortex, and perhaps every region of the cortex, does have some connection to lower structures that can affect motion, right? So it's not that it's just motor cortex that's in the motion game.
And by analogy, or by direct implication, you think of thought as itself being a kind of movement in conceptual space, right? So it's a mapping of the sensory world that can really only be accomplished by acting on it, you know, and therefore moving, right? So the way to map the cup, you know, is to touch it with your fingers, in the end, there's an analogous kind of motion in conceptual space, and even abstract ideas, like, I think some of the examples you've talked about, like, you know, democracy, right? You know, or money, or how we understand these things.
So let's go back to the first thing you said there. The idea that there's motor cortex and sensory cortex is sort of no longer considered right. As you mentioned, in these cortical columns, there are certain neurons that are the motor output neurons. These are in a particular layer, layer 5, as it's called.
And so in the motor cortex, they're really big, and they project to the spinal cord, and people say, oh, that's how you move your fingers. But if you look at the columns in the visual cortex, the parts that get input from the eyes, they have the same layer 5 cells, and these cells project to a part of the brain called the superior colliculus, which is what controls eye motion. So this goes against the original idea of, oh, there's sensory cortex and motor cortex. No one believes that.
Well, I don't know about nobody, but very few people believe that anymore. As far as we know, every part of the cortex has a motor output. And so every part of the cortex is getting some sort of input, and it has some motor output. And so the basic algorithm of the cortex is a sensory motor system.
It's not divided. It's not like we have sensory areas and motor areas. As far as we know, it's been seen, there's these motor cells everywhere. So we can put that aside.
Now, I can very clearly walk you through, in some sense prove from logic, that when you're learning what a coffee cup feels like, and I can even do this for vision, you have to have this idea of a reference frame, that you have to know where your finger is relative to the cup, and that's how you build a model of it. And so we can build out this cortical column that explains how it does that. How is the part of your cortex that represents your fingers able to learn the structure of a coffee cup? Now, Mountcastle, I'll go back to him.
He said, look, it's the same algorithm everywhere. And he says, it looks the same everywhere. So it's the same algorithm everywhere. So that's just to say: if I'm thinking about something that doesn't seem like a sensory motor system, where I'm not touching something, I'm just thinking about something,
then if Mountcastle was right, the same basic algorithm would be applied there. So that was one constraint. Well, and the evidence is that Mountcastle is right. The physical evidence suggests he's right.
It just becomes a little odd to think, well, how is language like this? And how is mathematics like touching a coffee cup? But then we realize that reference frames are a way of storing everything. And the way we move through a reference frame, it's like, how do you move from one location?
How do the neurons activate one location after another location after another location? We do that through this idea of movement. So if I want to access the locations on a coffee cup, I move my finger. But the same concept could apply to mathematics or to politics: you're not actually physically moving something, but you're still walking through a structure.
A good bridge example is if I say to you, imagine your house and I ask you to walk, you know, tell me about your house. What you'll do is you'll mentally imagine walking through your house. It won't be random, you just won't have random thoughts come to your head, but you will mentally imagine walking through your house. And as you walk through your house, you'll recall what is supposed to be seen in different directions.
You can say, oh, I'll walk in the front door and I'll look to the right, what do I see? I'll look to the left, what do I see? This is sort of an example you could relate to something physically you could move to, but that's pretty much what's going on when you're thinking about anything. If you're thinking about your podcast and how you get more subscribers, you have a model of that in your head.
And you're trying out thinking about different aspects by literally invoking these different locations and reference frames. And so this is sort of the core of all knowledge. Yeah, it's interesting. I guess back to Mountcastle for a second.
One piece of evidence in favor of this view of a common cortical algorithm is the fact that adjacent areas of cortex can be appropriated by various functions. You know, if you lose your vision, say, you know, classical visual cortex can be appropriated by other senses. And there's this plasticity that can ignore some of the previous boundaries between separate senses in the cortex. Yeah, that's right.
There's this tremendous plasticity and you can also recover from various sorts of trauma and so on. I mean, there's some rewiring that has to occur, but it does show that whatever the circuit in the visual cortex would do if you were a sighted person, if you're not a sighted person, well, it'll just do something else. And so that is a very, very strong argument for that.
There's a famous scientist, Bach-y-Rita, who did an experiment where he, I'm trying to remember the animal he used, maybe you recall it. But anyway, it'll come to me. A ferret, I think it was a ferret. He took it before the animal was born.
He took the optic nerve and ran it to a different part of the neocortex. Basically, rewired the animal. I'm not sure we do these experiments today.
And, you know, and the argument was that the animals, you know, still saw and still heard and so on. Maybe not as well as an unaltered one, but the evidence was that, yeah, that really works. So what is genetically determined and what is learned here? It seems that the genetics at minimum are determining what is hooked up to what initially, right?
Yeah, roughly, roughly, that's right. I think, you know, like where are the eyes, the optic nerve from the eyes, where do they project? And the regions that get the input from the eyes, where do they project? And so this rough sort of overall architecture is specified.
And as we just talked about, through trauma and other reasons, sometimes that architecture can get rewired. I think also the basic algorithm that goes on in each of these cortical columns, the circuitry inside the neocortex, is pretty well determined by genetics. In fact, one of my arguments was that the human neocortex got large, and we have a very large one relative to our body size, just because all evolution has to do is make more copies of these columns.
You know, you don't have to do anything new, just make more copies. And that's something easy for genes to specify. And so human brains got large quickly in evolutionary time. But it's just a replicate-more-of-it type of thing.
Okay, so let's go beyond the human now and talk about artificial intelligence. And before we talk about the risks or the imagined risks, tell me what you think the path looks like going forward. What are we doing now and what do you think we need to do to have our dreams of true artificial general intelligence realized? Well, you know, today's AI, as powerful as it is and successful as it is, I think most senior AI practitioners will admit, and many of them have, that they don't really think these systems are intelligent.
You know, they're really wonderful pattern classifiers and they can do all kinds of clever things. But there are very few practitioners who would say, hey, this AI system that's recognizing faces is really intelligent. And there's also a lack of understanding what intelligence is and how to go forward and how do you make a system that could solve general problems, could do more than one thing, right? And so in the second part of my book, I lay out what I believe are the requirements to do that.
And my approach has always been, for 40 years, well, I think we need to first figure out what brains do and how they do them. And then we'll know how to build intelligent machines because we just don't seem able to intuit what an intelligent machine is. So the way I look at this problem, if you ask what's the recipe for making an intelligent machine, is you have to say, what are the principles by which the brain works that we need to replicate, and which principles don't we need to replicate? And so I made a list of these in the book, but if you think at a very high level, they have to have some sort of embodiment.
They have to have the ability to move their sensors somehow in the world. You know, you can't really learn how to use tools and how to, you know, run factories and how to do things unless you can move in the world. And it requires these reference frames I was talking about because movement requires reference frames. That's not a controversial statement.
It's just a fact. You're going to have to know where things are in the world. And then the final, there's a set of things, but one of the other big ones, which we haven't talked about yet, which is where the title of the book comes from, A Thousand Brains, is that the way to think about our neocortex, it has 150,000 of these columns. We have essentially 150,000 separate modeling systems going on in our brain and they work together by voting.
And so that concept of a distributed intelligence system is important. We're not just one thing. It feels like we're one thing, but we're really 150,000 of these things. And we're only conscious of being one thing, but that's not really what's happening under the covers.
So those are some of the key ideas. I'm just thinking very, very high-level ideas. It has to have an embodiment. It has to be able to move.
It has to be able to organize information in reference frames and it has to be distributed. And that's how we do multiple sensors and sensory integration, things like that.
I guess I question the criteria of embodiment and movement, right? I understand that practically speaking, that's how useful intelligence can get trained up in our world to do things, you know, physically in our world. But it seems like you could have a perfectly intelligent system, i.e., a mind, that is turned loose on, you know, simulated worlds and is capable of solving problems that don't require effectors of any kind. You know, chess is obviously a very low-level analogy, but just imagine a thousand things like chess that represent real, you know, theory building or cognition, you know, in a box.
Yeah, I think you're right. And so when I use the words movement and embodiment, I'm careful to define them in the book, because it doesn't have to be physical. You know, an example I gave: you can imagine an intelligent agent that lives in the internet, and the movement is following links, right? It's not a physical thing, but there's still this conceptual, mathematical idea of what it means to move. Yeah. And so you're changing the location of some representation, and that could be virtual. It doesn't have to have a physical embodiment. But in the end, you can't learn about the world just by looking at a set of pictures. That's not going to happen. You can learn to classify pictures.
So some AI systems will have to be physically embodied, like a robot, I guess, if you want. Many will not; many will be virtual. But they all have this internal process, where I could point to the thing that says, here's where the reference frame is, here's where your location is, here's how it's moving to a new location based on some movement vector. You know, like a verb, a word, you can think of that as an action. And so you can have an action that's not physical, but it's still an action. It moves to a new location in this internal
representation. Right, right.
Okay, let's talk about risk, because this is the place where I think you and I have very different intuitions. As far as I can tell from your book, you seem very sanguine about AI risk. And really, you seem to think that the only real risk, the serious risk of things going very badly for us, is that bad people will do bad things with much more powerful tools. So the heuristic here would be, you know, don't give your superintelligent AI to the next Hitler, because that would be bad.
But other than that, there's the generic problem of self-replication, which you talk about briefly, and you point out we face that on other fronts, like with, you know, the pandemic, where we've been dealing with natural viruses and bacteria, or computer viruses, and anything that can self-replicate can be dangerous. But that aside, you seem quite confident that AI will not get away from us, that there won't be an intelligence explosion, and we don't have to worry too much about the so-called alignment problem. And at one point you even question whether it makes sense to expect that we'll produce something that can be appropriately called superhuman intelligence. So perhaps you can explain the basis for your optimism here.
So I think what most people, and perhaps yourself, have fears about is, they use humans as an example of how things can go wrong. And so we think about the alignment problem, we think about, you know, motivations of an AI system. Well, okay, does the AI system have motivation or not? Does it have a desire to do anything? Now, as humans and animals, we all have desires, right? But if you take apart what different parts of the human brain are doing, there are some parts that are just building this model of the world, and this is the core of our intelligence. This is what it means to be intelligent. That part itself is benign. It has no motivation on its own. It doesn't desire to do anything. I use an example of a map. You know, a map is a model of the
world, and a map can be a very powerful tool for someone to do good or to do bad. But on its own, the map doesn't do anything. So if you think about the neocortex on its own, it sits on top of the rest of your brain, and the rest of your brain is really what makes us motivated. You know, we have our good sides and our bad sides, our desire to maintain our life and have sex, and aggression, and all this stuff. Then the cortex is sitting there. It's like a map. It says, you know, I understand the world and you can use me as you want.
So when we build intelligent machines, we have the option, and I think almost the imperative, not to build the old parts of the brain too. You know, why do that? We just have this thing which is inherently smart, but on its own it doesn't really want to do anything. And so some of the risks that come about from people's fears about the alignment problem specifically are that the intelligent agent will decide on its own, or decide for some reason, to do things that are in its best interest, not in our best interest. Or maybe it'll listen to us, but then not listen to us, or something like this. I just don't see how that can physically happen. And most people don't understand the separation. They just assume that this intelligence is wrapped up in all the things that make us human. The intelligence explosion problem is a separate issue. I'm not sure which one of those you're more worried about.
Yeah, let's deal with the alignment issue first. I do think that's more critical. But let's see if I can capture what troubles me about this picture you painted here. It seems that, to my mind, you're being strangely anthropomorphic on one side, but not anthropomorphic enough on the other. So you think that to understand intelligence and actually truly implement it in machines, we really have to be focused on ourselves first, and we have to understand how the human brain works and
then emulate those principles pretty directly in machines. That strikes me as possibly true but possibly not true, and if I had to bet, I think I would probably bet against it. Although even here, you seem to be not taking full account of what the human brain is doing. Like, we can't partition reason and emotion as clearly as we thought we could hundreds of years ago. In fact, you know, certain emotions and certain drives are built into our being able to reason effectively.
I think that's, you know, I'll take exception to that. I know this is an opinion that you had on this program recently. Yeah, Antonio Damasio is the person who's... separate these two. And I can say this because I understand actually what's going on in the neocortex, and I have a very good sense of what the actual neurons are doing when it's modeling the world and so on. And it does not require this emotional component. A human does. Now, you say, you know, on one hand... I don't argue we should replicate the brain. I think we should replicate the structures in the neocortex, right? Which is not replicating the brain. It's just one part of the brain. And so I'm specifically saying, yeah, I don't really care too much about how the spinal cord works or how, you know, the brainstem does this or that. It's interesting, maybe I know a little bit about it, but that's not important. The cortex sits on top of another structure, and the cortex does its own thing. And they interact, of course they interact, and our emotions affect what we learn and what we don't learn. But it doesn't have to be that way in another system that we build. That's how humans are, you know.
Because I would agree with that, except the boundary between what is an emotion or drive or a motivation or a goal, and what is a value-neutral mapping of reality, you know, I think that boundary is perhaps harder to specify than you think it is. And certainly these things are connected, right? I mean, here's an example. This is
probably not a perfect analogy, but this gets at some of the surprising features of cognition that may await us. So we think intuitively that understanding a proposition is cognitively quite distinct from believing it, right? So I can give you a statement that you can believe or disbelieve or be uncertain about. You know, I can say two plus two equals four, two plus two equals five, and I can give you some gigantic number and say this number is prime. And presumably in the first condition you'll say, yes, I believe that. In the second you'll say, no, that's false. And in the third you won't know whether or not it's prime. So those are distinct states that we can intuitively differentiate.
But there's also evidence to suggest that merely comprehending a statement, if I give you a statement and you parse it successfully, the parsing itself contains an actual default acceptance of it as true, and rejecting it as false is a separate operation added to that. I mean, there's not a ton of evidence for this, but there's certainly some behavioral evidence. So if I put you in a paradigm where we gave you statements that were true and false, and all you had to do was to judge them true or false, and they were all matched for complexity, so, you know, two plus two equals four is no more or less complex than two plus two equals five, it'll take you systematically longer to judge very simple statements to be false than to judge them to be true, suggesting you're doing a further operation.
Now, we can remain agnostic as to whether or not that's actually true, but if true, it's counterintuitive that merely understanding something entails, you know, some epistemic credence given to it by default, and that to reject it as false represents a subsequent act. But that's the kind of thing where, you know, already we're on territory that is not coldly rational. Some of the all-too-apish appetites have kind of crept into cognition here in ways that we didn't really budget for. And so the question is
just how much of that is avoidable in building a new type of mind.
Well, you know, I'm not familiar with that specific research. I've heard of it, but to me none of these things are surprising in any way. If you start thinking about it, the brain is basically trying to build models. It's constantly trying to build models. In fact, as you walk around in your life, day to day, moment to moment, and you see things, there's a model being constructed, even like, where are things in the refrigerator right now? Your brain will update. You open the fridge: oh, the milk's on the left today, whatever. And so someone gives you a proposition like two plus two equals five. You know, I don't know what the evidence is that you believe it and then falsify it, but I can certainly imagine you trying to see if it's right. It'd be like me saying to you, hey, the milk is on the right in your refrigerator, and you have to think about it for a second. You say, well, let me think. Oh, no, last time I saw it, it was on the left. No, that's wrong. But you would walk through the process of trying to imagine it and trying to see, does that fit my model, yes or no? And it's not surprising to me that you'd have to process it as if it was true. It's just a matter of saying, can you imagine this, and do you think it's right? It's not like, I believe that, now I've falsified it. It's more like...
Well, actually, I'll just give you one other datum here, because it's just intellectually interesting and socially all too consequential. This effect goes by several names, I think, but one is the illusory truth effect, which is that even the act of disconfirming something you know to be false, you know, some specious rumor or conspiracy theory, merely having to invoke it and have people entertain the concept again, even in the context of debunking it, ramifies a belief in it in many, many people. It's just... Oh, yeah. It's harder to discredit things because you have to talk about them in the
first place.
Yeah. I mean, so look, we're talking about language here, right? And so much of what we humans know comes through language, and we have no idea if it's true. When someone says something to you, right, how do you know? I mean, I gave an example: I've never been to the city of Havana. Well, I believe it's there. I believe it's true. I don't know. I've never been there. I never actually touched it or smelled it or saw it, so maybe it's false. I mean, this is one of the issues we have. I have a whole chapter on false beliefs, because so much of our knowledge of the world is built up on language, and the default assumption under language is that if someone says something, it's true. It's like it's a pattern in the world, and you're going to accept it. If I touch a coffee cup, I accept that that's what it feels like, right? And if I look at something, I accept that that's what it looks like. Well, someone says something, and my initial acceptance is, okay, that's what it is. And then if they say something that's false, of course, well, that's a problem, because just by the fact that I've experienced it, it is now part of my world model. And that's what you're referring to. I can see this is really a problem of language we face, and this is the root cause of almost all of our false beliefs: someone just says something enough times, and that's good enough. And you have to seek out contrary evidence for it.
Yeah, sometimes it's good enough when you're the one saying it. You just overhear the voice of your own mind saying it. No, I know. That's been proven, that everyone is susceptible to that kind of distortion of our beliefs, especially memories. Just remembering something over again changes it, you know.
Okay, so let's get back to AI risk here, because here's where I think you and I have very different intuitions. And the intuition that many of us have, the people who have informed my views here are people like Stuart Russell, who you probably know at Berkeley, and
Nick Bostrom and Eliezer Yudkowsky, and just lots of people in this space worrying about the same thing to one degree or another. The intuition is that you don't get a second chance to create a truly autonomous superintelligence, right? Like, it seems that in principle this is the kind of thing you have to get right on the first try. And having to get anything right on the first try just seems extraordinarily dangerous, because we rarely ever do that when doing something complicated.
Another way of putting this is that it seems like in the space of all possible superintelligent minds, there are more ways to build one that isn't perfectly aligned with our long-term well-being than there are ways to build one that is perfectly aligned with our long-term well-being. And, you know, from my point of view, what your optimism, and the optimism of many other people who take your side of this debate, is based on is not really taking the prospect of intelligence seriously enough, and the autonomy that is intrinsic to it. If we actually built a true general intelligence, what that means is that we would suddenly find ourselves in relationship to something that we actually can't perfectly understand. It will be analogous to a strange person walking into the room. You know, you're in relationship, and if this person can think a thousand times or a million times faster than you can and has goals that are less than perfectly aligned with your own, that's going to be a problem eventually. We can't find ourselves in a state of perpetual negotiation with systems that are more competent and powerful and intelligent.
I think there's two mistakes in your argument. The first one is you say my intuition, your intuition. I think most people who have this fear have an intuition about what...
If you'd like to continue listening to this conversation, you'll need to subscribe at samharris.org. Once you do, you'll get access to all full-length episodes of the Making Sense podcast, along with other subscriber-only content, including bonus episodes and AMAs and conversations I've been having on the Waking Up app. The Making Sense podcast is ad-free and relies entirely on listener support. And you can subscribe now at samharris.org.