Transcript
F3Jd9GI6XqE • Edward Gibson: Human Language, Psycholinguistics, Syntax, Grammar & LLMs | Lex Fridman Podcast #426
Kind: captions
Language: en
naively I certainly thought that all
humans would have words for exact
counting uh and the Pirahã don't okay so
they don't have any words for even one
there's not a word for one in their
language and so there's certainly not a
word for two three or four so that kind
of blows people's minds often yeah that
blowing my mind that's pretty weird how
are you how are you going to ask I want
two of those you just don't and so
that's just not a thing you can possibly
ask in Pirahã it's not possible that is
there are no words for that
the following is a conversation with
Edward Gibson or Ted as everybody calls
him he is a psycholinguistics professor
at MIT he heads the MIT Language Lab
that investigates why human languages
look the way they do the relationship
between culture and language and how
people represent process and learn
language also he has a book
titled Syntax: A Cognitive Approach
published by MIT Press coming out this
fall so look out for that this is the Lex
Fridman Podcast to support it please check
out our sponsors in the description and
now dear friends here's Edward
Gibson when did you first become
fascinated with human language as a kid
in school when we had to structure
sentences in English grammar I I I found
that process interesting I found it
confusing as to what it was I was told
to do I didn't didn't didn't understand
what the theory was behind it but I
found it very interesting so when you
look at grammar you're almost thinking
about like a puzzle like almost like a
mathematical puzzle yeah I think that's
right I didn't know I was going to work
on this at all at that point I was
really just I was kind of a math geek
person computer scientist I really liked
computer science and then I found
language as a a neat puzzle to work on
from an engineering perspective actually
that's what I as a I I sort of
accidentally well I decided after I
finished my undergraduate degree which
was computer science and math in Canada
at Queen's University I decided to go to
grad school it's like that's what I
always thought I would do and I went to
Cambridge where they had a master's in a
master's program in computational
linguistics and I hadn't taken a single
language class before all I had taken
was CS computer science math classes
pretty much mostly as an undergrad and I
just oh this was an interesting thing to
do for a year
because it was a single year program and
um then I ended up spending my whole
life doing it so fundamentally your
journey through life was one of a
mathematician and a computer scientist
and then you kind of discovered the
puzzle the problem of language and
approached it from that angle uh to try
to understand it from that angle almost
like a mathematician or maybe even an
engineer as an engineer I'd say I mean
to be frank I had taken an AI class I
guess it was 83 or 84 5 somewhere 84 in
there a long time ago and there was a
natural language section in there and it
didn't impress me I thought there must
be more interesting things we can do
didn't it didn't seem very it seemed
just a bunch of uh hacks to me it didn't
seem like a real theory of things in any
way and so I just thought this was this
seemed like an interesting area where
there wasn't enough good work did you
ever come across like the the philosophy
angle of logic so if you think about the
80s with AI the expert systems where you
try to kind
of uh maybe sidestep the The Poetry of
language and some of the syntax and the
grammar and all that kind of stuff and
go to the underlying meaning that
language is trying to communicate and
try to somehow compress that in a
computer representable way did you ever
come across that in your studies I mean
I probably did but I wasn't as
interested in it I was I was trying to
do the easier problems first the ones I
could thought maybe were handleable
which is seems like the syntax is easier
like which is just the forms as opposed
to the meaning like you're talking when
you're starting talking about the
meaning that's very hard problem and
it's still is a really really hard
problem but the forms is is easier and
so I thought at least figuring out the
forms of human language which sounds
really hard but is actually maybe more
tractable so it's interesting you
think there is a big divide there's a
gap there's a distance between form and
meaning because that's a question you
have discussed a lot with LLMs mhm
because they're damn good at form yeah I
think that's what they're good at is
form exactly and that's that's why
they're good because they can do form
meaning's hard do you think there's oh
wow and I mean it's an open question
right yeah how close form and meaning
are we'll discuss it but I to me
studying form maybe it's a romantic
notion gives you form is like the
shadow of the the bigger meaning thing
underlying language cuz I it form is is
language is how we communicate ideas we
communicate with each other using
language so in understanding the
structure of that communication I think
you start to understand the structure of
thought and the structure of meaning
behind those thoughts and communication
to me but to you big gap yeah what do
you find most beautiful about human
language maybe the form of human
language the expression of human
language what I find beautiful about
human language is the uh some of the
generalizations that um happen across
the human languages within and across a
language so let me give you an example
of something which I find kind of
remarkable that is if like a language if
it has um a word order such that the
verbs tend to come before their
objects and so that's like English does
that so we have the the first the
subject comes first in a in a simple
sentence so I say uh you know the the
dog chased the cat or or Mary kicked the
ball so the subject's first the and then
after the subject there's the verb and
then we have objects all these things
come after in English so it's a it's
generally a verb and most of the stuff
that we want to say comes after the
subject it's comes it's the it's the
objects there's a lot of things we want
to say that come after and and and
there's a lot of languages like that
about 40% of the languages of the world
are look like that they're um sub
subject verb object languages and then
um these languages tend to have um
prepositions these little markers on the
nouns that that connect nouns to other
nouns or nouns to verbs so I when I so
verb like sorry preposition like in or
on or of or about I say I talk about
something the something is the object of
that preposition that we have these
little markers come also just like verbs
they come before their their nouns okay
and then so now we look at other
languages that like Japanese or or Hindi
or some these are these are so-called
verb final languages
those are about maybe a little more than
40% maybe 45% of the world's languages
or more I mean 50% of the world's
languages are verb final those tend to
be um postpositions those markers the
same they have they have the same
kinds of markers as we do in English but
they put them after so uh uh sorry they
put them uh first the markers come first
so you say instead of um you know talk
about a book you say book about the
opposite order there in Japanese or in
Hindi you do the opposite and and the
talk comes at the end so the verb will
come at the end as well so instead of um
Mary kicked the ball it's Mary uh Ball
kicked and then uh says Mary kicked the
ball to John it's John to the to
little the marker there uh the
preposition it's a postposition in these
languages and so the interesting thing
fascinating thing to me is that within a
language this order
aligns it's
harmonic and so if it's one or the other
it's either verb initial or verb final
but then you then you'll have
prepositions prepositions or
postpositions and so that and that's
across the languages that we we can look
at we've got around a thousand languages
for for there's around 7,000 languages
around on on the Earth right now uh but
we have information about say word order
on around a thousand of those pretty
decent amount of information and for
those thousand which we know about um
about 95% fit that pattern so they will
have either verb so about it's about
half and half half are verb initial like
English and half are verb final like um
like Japanese so just to clarify verb
initial is subject verb object that's
correct verb final is still subject
object verb that's correct yeah the
subject is generally first that's so
fascinating I ate an apple or I apple ate
yes okay and it's fascinating that
there's a pretty even division in the
world amongst those 40 45% yeah it's
pretty it's pretty even and and those
two are the most common by far those two
word orders the subject tends to be first
there's so many interesting things but
these things are the thing I find so
fascinating is there are these
generalizations within and across a
language and and not only those are the
and there's actually a simple
explanation I think for a lot of that
and that is um you're trying to like
minimize dependencies between words
that's basically the story I think
behind a lot of why word order looks the
way it is is you we're always connecting
what is it what is the thing I'm telling
you I'm I'm talking to you in sentences
you're talking to me in sentences these
are sequences of Words which are
connected and the connections are
dependencies between the words and and
it turns out that what we what we're
trying to do in a language is actually
minimize those dependency links it's
easier for me to say things if the words
that are connecting for their meaning
are close together it's easier for you
in understanding if that's also true if
they're far away it's it's harder to
produce that and it's hard for
you to understand and the languages of
the world within a language and across
languages you know fit that
generalization which is you know so I
you know it turns out that having verbs
initial and then having prepositions
ends up making dependencies shorter and
and having verbs final and having
postpositions ends up making dependencies
shorter then if you cross them if if you
cross them it ends up you just end up
it's possible you can do it I mean
within a language within a language you
can do it it just ends up with longer
dependencies than if you didn't so
languages tend to go that way they tend
to minim they say they call it harmonic
so it was observed a long time ago by uh
without the explanation by a guy called
Joseph Greenberg who's a um famous
typologist from Stanford he observed a
lot of generalizations about how word
order works and these are some of the
harmonic generalizations that he
observed
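The dependency-length idea described here can be sketched in a few lines of Python. This is only a toy illustration, not a method from the conversation: it assumes a three-word phrase (talked, about, books) in which the adposition depends on the verb and the noun depends on the adposition, and it measures each dependency as the distance in word positions. The names HEAD_OF, ORDERS and total_dependency_length are hypothetical.

```python
# Same three words and the same dependencies in every ordering (an assumed toy
# analysis): the adposition "about" depends on the verb "talked", and the noun
# "books" depends on the adposition.
HEAD_OF = {"talked": None, "about": "talked", "books": "about"}


def total_dependency_length(order):
    """Sum the linear distance between each word and its head for one word order."""
    position = {word: index for index, word in enumerate(order)}
    return sum(
        abs(position[word] - position[head])
        for word, head in HEAD_OF.items()
        if head is not None  # the root has no head, so it adds no length
    )


# Two harmonic orders and the two "crossed" orders.
ORDERS = {
    "verb-initial + preposition": ["talked", "about", "books"],
    "verb-final + postposition": ["books", "about", "talked"],
    "verb-initial + postposition": ["talked", "books", "about"],
    "verb-final + preposition": ["about", "books", "talked"],
}

for label, order in ORDERS.items():
    length = total_dependency_length(order)
    print(f"{label:28} {' '.join(order):20} total length = {length}")
```

Under these assumptions the two harmonic orders come out with a total length of 2 and the two crossed orders with 3, which is the direction of the effect described above. Real corpora and real dependency analyses are of course much richer than this three-word case.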
harmonic generalizations about word word
order there's so many things I want to
ask you okay let me uh just some
basics you you mentioned dependencies a
few times yeah what do you mean by
dependencies well what I mean is in um
in language there's kind of three
structures to three components to the
structure of language one is the sounds
so cat is k a and t in English I'm not
talking about that part I'm talking then
there's two meaning parts and those are
the words and and you were talking about
meaning earlier so words have a form and
they have a meaning associated with them
and so cat is a full form in English and
it has a meaning associated with
whatever a cat is and then the
combinations of words uh that's what
I'll call grammar or syntax and uh
that's like when I have a combination
like the cat or two cats okay so uh
where I take two different words there
and put them together and I get a
compositional meaning from putting those
two different words together and and so
that's the syntax and
in any sentence or utterance whatever
I'm talking to you you're talking to me
we have a bunch of words and we're
putting together in a sequence they it
turns out they
are connected so that every word is
connected to just one other word in that
in that sentence and so you end up with
what's what's called technically a tree
it's a tree structure so there where
there's a root of that of that utterance
of that sentence and then there's a
bunch of dependents like branches from
that root that go down to the words the
words are the leaves in this metaphor
for a tree so a tree is also sort of a
mathematical construct a graph
theoretical thing graph Theory thing uh
so in the it's fascinating that you can
break down a sentence into a tree and
then one every word is hanging on to
another this depending on right and and
everyone agrees on that so all linguists
will agree with that no one not
controversial that is not controversial
there's nobody sitting here mad at you I
don't think so okay there's no linguist
sitting there mad at this I think every
language I think everyone agrees that
all sentences are trees at some level
can I pause on that cuz it it's to me
just as a Layman it uh it's surprising
yeah that you can break down sentences
in many most all languages all languages
I into a tree I think so that's weird I
I've never heard of anyone disagreeing
with that that's weird the details of
the trees are what people disagree about
well okay so what's uh what's at the
root of a how do you construct how
hard is it what is the process of
constructing a tree from a sentence uh
well this is where you know depending on
what you're there's different
theoretical notions I'm going to say the
simplest thing dependency grammar it's
like a bunch of people invented this
Tesnière was the first French guy back in
I mean the paper was published in 1959
but he was working on it in the 30s and stuff
so and and it goes back to uh you know
philologist Pāṇini was doing this in
ancient uh India okay and so you know
doing something like this the simplest
thing we can think of is that there's
just connections between the words to
make the the utterance and so just say I
have like two dogs entered a room okay
here's a sentence and so uh we're
connecting two and dogs together that's
like there's some dependency between
those words to make some bigger meaning
and then we're connecting dogs now to uh
entered right and we connect a room
somehow to entered and so I'm going to
connect uh a to room and then room back to
entered that's the tree the
root is entered that's the the thing is
like an entering event that's what we're
saying here and the the subject which is
whatever that dog is is two dogs it was
and and the connection goes back to dogs
which goes back to then that that goes
back to two I'm just that that's my tree
it it starts at entered goes to dogs
down to two and on the other side after
the verb the object it goes to room and
then that goes back to the the
determiner or article whatever you want
to call that word uh so there's a bunch
of categories of words here we're
noticing so there are verbs those are
these things that typically Mark uh they
refer to events and states in the world
and they're nouns which typically refer
to people places and things is what
people say but they can refer to other
more they can refer to events themselves
as well they're they're they're marked
by you know how they how they get you
what the category the part of speech of
a word is how it gets used in language
it's like that's how you decide what the
what the category of a word is not not
by the meaning but how it's how it gets
used how it's used what's usually the
root is it going to be the verb that
defines the event usually usually yes
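The tree described for "two dogs entered a room" can be written down as a small Python sketch. It is an illustration only: it assumes the head assignments given in the conversation (two to dogs, dogs to entered, a to room, room to entered), it keys words by their spelling, which only works because no word repeats, and the names HEADS, find_root, children_of and render are hypothetical.

```python
# Head of each word in "two dogs entered a room"; None marks the root.
# Keying by spelling is a simplification that only works when no word repeats.
HEADS = {
    "two": "dogs",
    "dogs": "entered",
    "entered": None,
    "a": "room",
    "room": "entered",
}


def find_root(heads):
    """Return the single word with no head; a tree must have exactly one."""
    roots = [word for word, head in heads.items() if head is None]
    if len(roots) != 1:
        raise ValueError(f"expected exactly one root, found {roots}")
    return roots[0]


def children_of(heads):
    """Invert the word -> head map into a head -> list of dependents map."""
    children = {word: [] for word in heads}
    for word, head in heads.items():
        if head is not None:
            children[head].append(word)
    return children


def render(heads):
    """Render the tree as indented text, root first."""
    children = children_of(heads)
    lines = []

    def walk(word, depth):
        lines.append("  " * depth + word)
        for child in children[word]:
            walk(child, depth + 1)

    walk(find_root(heads), 0)
    return "\n".join(lines)


print(render(HEADS))
```

Printing the tree puts entered at the top, with dogs (and two beneath it) on one branch and room (and a beneath it) on the other, so the verb ends up as the root as described.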
yes okay yeah I mean if I don't say a
verb then there won't be a verb and so
it'll be something else what if you're
messing are we talking about language
that's like correct language what if
you're doing poetry and messing with
stuff is it then then rules go out the
window right then it's no you're still
no no no you're constrained by whatever
language you're dealing with probably
you have other constraints in poetry
such that you're like usually in poetry
there's multiple constraints that you
want to like you want to usually convey
multiple meanings is the idea and maybe
you have like a rhythm or a rhyming
structure as well and depending on so
but you usually are constrained by your
the rules of your language for the most
part and so you don't violate those too
much you can violate them somewhat but
not too much so it has to be
recognizable as your language like in
English I can't say dogs two entered room
a I mean I meant the you know two dogs
entered a room and I I I can't mess with
the order of the the Articles the
Articles and the nouns you just can't do
that in some languages you can you can
mess around with the order of words much
more I mean you speak Russian Russian
has a much Freer word order than English
and so in fact you can move around words
in you know I told you that English has
the subject verb object word order so
does Russian but Russian is much Freer
than English and so you can actually
mess around with the word order so
probably Russian
poetry is going to be quite different
from English poetry because the word
order is much less constrained yeah
there's a much more extensive uh culture
of poetry throughout the history of the
last 100 years in Russia and I I always
wondered why that is but it seems that
there's more
flexibility in the way the language is
used there's more you're morphing the
language Easier by altering the words
altering the order of the words messing
with it well you can just mess with
different things in each language and so
Russian you have case markers right on
the end which is there these endings on
the nouns which tell you how it connects
each noun connects to the verb right we
don't have that in English and so when I
say um Mary kissed John I don't know who
the agent or the patient is except by
the order of the words right in in
Russian you actually have a marker on
the end if you're using a Russian name
and each of those names you'll also say
is it you know agent it'll be the uh you
know nominative which is marking the
subject or an accusative will Mark the
object and you could put them in the
reverse order you could put accusative
first as you could put subject you could
put um the patient first and then the
verb and then the the the subject and
that would be a perfectly good Russian
sentence and it would still mean Mary I
could say John kissed Mary meaning Mary
kissed John with as long as I use the
case markers in the right way you can't
do that in English and so uh I love the
terminology of agent and patient and uh
and the other ones you used those are
sort of linguistic terms correct those
are those are for like kind of meaning
those are meaning and and subject and
object are generally used for position
so subject is just like the thing that
comes before the the verb and the object
is one that comes after the verb the
agent is kind of like the thing doing it
that's kind of what that means right the
subject is often the person doing the
action right the thing so yeah okay this
is fascinating so how hard is it to form
a tree in general is there um is there a
procedure to it like if you look at
different languages is it supposed to be
a very natural like is it automatable or is
there some human genius involved in I
think it's pretty automatable at this
point people can figure out what the words
are they can figure out the morphemes
which are the technically morphemes are
the the minimal meaning units within a
language okay and so when you say eats
or drinks it actually has two morphemes
and in English there's there's the
there's the root which is the verb and
then there's some ending on it which
tells you you know that's this third
person uh third person singular say what
morphemes are morphemes are just the minimal
meaning units within a language and a
word is just kind of the things we put
spaces between in English and they have a
little bit more they have the morphology
as well they have the endings this
inflectional morphology on the endings on
the roots they modify something about
the word that adds additional meaning
they tell you yeah yeah yeah and so we
have a little bit of that in English
very little much more in Russian for
instance and and uh but we have a little
bit in English and so we have a little
on the on the nouns you can say it's
either singular or plural and and you
can say uh same thing for um for for
verbs like simple past tense for example
like you know notice in English we say
drink drinks uh you know he drinks but
everyone else is I drink you drink we
drink it's unmarked in a way and then
but in the past tense it's just drank
there for everyone there's no morphology
at all for past tense it's there is
morphology it's marking past tense but
it's kind of it's an irregular now so we
don't even you know drink to drank you
know it's not even a regular word so in
most verbs many verbs there's an ed we
kind of add so walk to walked we add
that to say it's the past tense that I
just happen to choose an irregular
because it's a high frequency word and
the high frequency words tend to have
Irregulars in English for what's an
irregular irregular it's just there's
there isn't a rule so drink to drank is
an is an irregular drink drank okay as opposed
to walk walked talk talked and there's a
lot of irregulars in English there's a
lot of Irregulars in English the the the
frequent ones the common words tend to
be irregular the there's many many
more um low frequency words and those
tend to be those are regular ones the
evolution of the Irregulars are
fascinating it's essentially slang
that's sticky mhm cuz you're breaking
the rules and then everybody use it and
doesn't follow the rules yeah and they
they say screw it to the rules it's
fascinating so you said morphemes lots of
questions so morphology is what the
study of morphemes morphology is the is
the connections between the morphemes
onto the Roots the Roots so in English
we mostly have suffixes we have endings
on the words not very much but a little
bit and uh as opposed to prefixes some
words depending on your language can
have you know mostly prefixes mostly
suffixes or mostly or or both and then
even languages several languages have
things called infixes where you have
some kind of a uh
General uh form for the for the root and
you put stuff in the middle you change
the vowels that's fascinating that is
fascinating so wait so in general
there's what two morphemes per word
usually one or two or three well in
English it's it's it's one or two in
English it tends to be one or two there
can be more you know in in other
languages you know a lang language like
uh like Finnish which has a very uh
elaborate morphology there may be 10
morphemes on the end of a root okay and
so there may be Mill there be millions
of forms of a given word okay okay I I
will ask the same question over and over
but
uh how does a just sometimes to
understand things like morphemes it's
nice to just ask the question how does
these kinds of things evolve so you uh
have a great book studying sort of
the how how the cognitive processing how
language used for communication so the
the mathematical notion of how effective
language is for communication what role
that plays in the evolution of language
but just high level like how do we how
does a language evolve with where
English is two morphemes or one or two
morphemes per word and then Finnish has
infinity per word so what how does that
how does that happen is it just
that's a really good question yeah
that's a very good question is like why
do languages have more morphology versus
less morphology and and I don't think we
know the answer to this I don't I think
there's just like a lot of good
solutions to the problem of
communication so I like I believe as you
hinted that language is an invented
system by humans for communicating their
ideas and I think we it comes down to we
label things we want to talk about those
are the the the morphemes and words
those are the things we want to talk
about in the world and we invent those
things and then uh we put them together
in ways that are um easy for us to
convey to process but that that that's
like a naive View and I don't I mean I I
think it's probably right right it's
naive and probably right well I don't
know if it's naive I think it's simple
simple yeah I think naive is naive is an
indication that it's an incorrect
somehow it's trivial too simple I
think it could very well be correct but
it's interesting how sticky it feels
like two people got
together it just it just feels like once
you figure out certain aspects of a
language that just becomes sticky and
the tribe forms around that language
maybe the language maybe the tribe forms
first and then the language evolves and
then you just kind of agree and that you
stick to whatever that is I mean these
are very interesting questions we don't
know really about how words even words
get invented very much about you know we
don't really I mean assuming they get
invented they we don't really know how
that process works and how these things
evolve what we have is kind of a a
current picture a current picture of few
thousand languages a few thousand
instances we don't have any pictures of
really how these things are evolving
really and and then the evolution is
massively con you know uh confused by
contact right so as soon as one language
group one group runs into
another we are smart humans are smart and
they take on whatever is useful in the
other group and so any kind of contrast
which you're talking about which I find
useful I'm going to I'm going to start
using as well so I I worked a little bit
in um in in specific areas of words in
in number words and in in color words
and in color words that so we have in
English we have around 11 words that
everyone knows for
colors and uh and many more if you
happen to uh be interested in color for
some reason or other if you're a fashion
designer or an artist or something you
may have many many more words but we can
see Millions like if you have normal
color vision normal trichromatic color
vision you can see millions of
distinctions in colors so we don't have
millions of words you know the most
efficient no the most you know detailed
color vocabulary would have over a
million terms to distinguish all the
different colors that we can see but of
course we don't have that so it's
somehow it's been it's kind of useful
for English to have evolved in some way
to there's 11 terms that people find
useful to talk about you know black
white red uh blue green yellow purple uh
gray pink and I probably missed
something there anyway uh there there's
11 that everyone knows yeah and um and
depending on your and but you go to
different cultures um especially the
non-industrialized cultures and there'll
be many fewer so some cultures will have
only two believe it or not the Dani
in Papua New Guinea have only two
labels that the that the group uses for
color those are roughly black and white
they are okay very very dark and very
very light which are roughly black and
white and you might think oh they're
dividing the whole color space into you
know light and dark or something and
that's not really true they mostly just
only label the light the black and the
white things they just don't talk about
the colors for the other ones and so and
and then there's other groups I've
worked with a group called the Tsimane'
down in um in Bolivia in South America
and they have three words that everyone
knows but there's a few others that are
that that several people like that many
people know and so they have kind of
depending at how you count between three
and seven words that the group knows
okay and uh and again they're they're
black and white everyone knows those and
red red is you like that tends to be the
third word that everyone that that
cultures bring in if there's a word it's
always red the third one and then after
that it's kind of all bets are off about
what they bring in and so after that
they they bring in a sort of a big blue
green space grue they have one for that
and then they have uh and then you know
different people have different words
that they'll use for other parts of the
space and so anyway it's probably
related to what they want to talk what
they not what they not what they see
because they see the same colors as we
see so it's not like they have they
don't they have a a weak a low color
palette in the things they're looking
at they're looking at a lot of beautiful
scenery okay a lot of different colored
uh flowers and berries and things and
you know and so there's lots of things
of very bright colors but they just
don't label the color in those cases and
the reason probably we we don't know
this but we think probably what's going
on here is that what you do why you
label something is you need to talk to
someone else about it and and why do I
need to talk about a color well if I
have two things which are identical and
I want you to give me the one that's
different and and the only way it varies
is color
then I invent a word which tells you uh
you know this is the one I want so I
want the red sweater off the rack not
the not the green sweater right there's
two and and so those those things will
be identical because these are things
we made and they're dyed and there
there's nothing different about them and
so in in industrialized Society we have
you know everything everything we've got
is pretty much arbitrarily colored uh
but you go to a non-industrialized group
that's not true and so they don't really
they're not interested in color you you
bring bright colored things to them they
like them just like we like them bright
colors are great they're beautiful they
are but they just don't need to don't
need to talk about them they don't have
so probably color words is a good
example of how language evolves from
sort of function when you need to
communicate the use of something I think
so then then you kind of invent
different variations and uh and
basically you can imagine that the
evolution of a language has to do with
what the early tribe is doing like what
what they want what what kind of
problems they're facing them and they're
quickly figuring out how to efficiently
communicate uh the solution to those
problems whether it's aesthetic or
functional all that kind of stuff
running away from a mammoth or whatever
um but you know it's so so I think what
you're pointing to is that we don't have
data on the evolution of language
because many languages have formed a
long time ago so you don't get the
chatter we have a little bit of like Old
English to Modern English because there
was a writing system and we can see how
how old English looked so the word order
changed for instance in Old English to
Middle English to Modern English and so
it you know we can see things like that
but most languages don't even have a
writing system so of the 7,000 only you
know a small subset of those have a
writing system and even if they have a
writing system they it's not a very
modern writing system and so they don't
have it so we just basically have for
Mandarin for Chinese we have a lot of a
lot of evidence from from for a long
time and for English and not for much
else not for in German a little bit but
not for a whole lot of like long-term um
language Evolution we don't have a lot
we just have snapshots is what we've got
of current languages yeah I you get an
inkling of that from the rapid
communication on certain platforms like
on Reddit there's different communities
and they'll come up with different slang
usually from my perspective driven by a
little bit of humor um or maybe mockery
or whatever it's you know just talking
and different kinds of ways and uh
you could see the
evolution of language there
because um I think a lot of things on
the internet you don't want to be the
boring mainstream so you like want to
deviate from the proper way of talking
MH and so you get a lot of deviation
like rapid deviation then when
communities Collide you get like uh just
like you said humans adapt to it and you
can see it through the lens of humor I mean
it's very difficult to study but you can
imagine like 100 years from now well if
there's a new language born for example
we'll get really high resolution data on
I mean English is changing English
changes all the time all languages
change all the time so you know there's
the famous um result about the Queen's
English so the que if you look at the
Queen's vowels the queen's English is
supposed to be you know originally the
proper way to talk was sort of
defined by however the queen talked or
the king whoever was in charge and uh
and and so if you look at the how her
vowels changed uh from when she be first
became Queen in 1952 or 53 when she was
coronated I mean that's Queen
Elizabeth who's got who died recently of
course uh until you know 50 years later
her vowels changed her vowels shifted a
lot and so that you know even in the
sounds of British English in her the way
she was talking was changing the vowels
were changing slightly so that's just in
the sounds there's change I don't know
what's you know we're we're I'm
interested we're all interested in
what's driving any of these changes the
the word order of English changed a lot
over a thousand years right so it used to
look like German you know it looks it
used to be a verb final language with
case marking and it shifted to a verb
medial language a lot of contact so a
lot of contact with French and it became
a verb medial language with no case
marking and so it became this you know
verb medial thing so and so
that's evolving we it totally evolved
and so it may very well I mean you know
it doesn't evolve maybe very much in 20
years is maybe what you're talking about
but over 50 and 100 years things change
a lot I I think we'll now have good data
on it which is great that's for sure um
can you talk to what is syntax and what
is grammar so you wrote a book on syntax
I did you were asking me before about
what you know how do I figure out what a
dependency structure is I'd say the
dependency structures aren't that hard
to generally I think there's a lot of
agreement of what they of what they are
for almost any sentence in in most
languages I think people will agree on a
lot of
that there are other parameters in the
mix such that some people think there's
a more complicated grammar than just a
dependency structure and so you know
like Noam Chomsky he's the most famous
linguist ever uh and he he is famous for
proposing a a a slightly more
complicated syntax and so he he invented
phrase structure grammar so he's um well
known for many many things but in the
50s in early 60s like but late 50s he
was basically figuring out what's called
formal language Theory so and he uh
figured out sort of a framework for
figuring out how complicated langu you
know a certain type of language might be
so-called phrase structured grammars of
language might be and so he his his idea
was that maybe we can we can think about
the complexity of a language by how
complicated the rules are okay and the
rules will look like this they will have
a left hand side and will have a right
right hand side something on the left
hand side will expand to the thing on
the right hand side so we'll say we'll
start with an a an S which is like the
root which is an a sentence okay and
then we're going to expand to things uh
like a noun phrase and a verb phrase is
what he would say for instance okay an S
goes to an NP and a VP is a kind of a
phrase structure Rule and then and we
figure out what an NP is an NP is a a a
determiner and a noun for instance and a
verb verb phrase is something else is a
verb and another noun phrase and another
NP for instance those are the rules of
a very simple phrase structure okay and
and so he he proposed phrase structure
grammar as a way to sort of cover human
languages and then he actually figured
out that well depending on the
formalization of those grammars you
might get more complicated or less
complicated languages so you could he
could he said well you these are these
are things called you know um context
free languages that rule that he thought
you know human languages tend to be what
he calls context free languages um and
but there are simpler languages which
are so-called regular languages and they
have a more a more constrained form to
the rules of the of the phrase structure
of of these particular rules so he he
basically discovered and kind of
invented ways to describe the language
and and those are phrase those are
phrase structure a human language and he
was mostly interested in English
initially in his his work in the 50s so
a quick questions around all this so
formal language theory is the big field
of just studying language formally yes
and it doesn't have to be human language
there we have computer languages any
kind of system which is generating a uh
a um
some set of um expressions in a language
and those could be like the the um you
know the statements in a in a computer
language for example so formal it could
be that or it could be human language so
technically you can study programming
languages and they have been heavily
studied using this formalism there
there's a big field of programming
languages within the formal language
okay and then phrase structure grammar
is this idea that you can break down
language into this S NP VP
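
The rewrite rules described above (S goes to NP VP, NP goes to determiner plus noun, VP goes to verb plus NP) can be sketched in a few lines of Python. This is only an illustrative toy, not anything from the conversation or the book: the word lists and the names `RULES` and `expand` are assumptions made for the example.

```python
import itertools

# Phrase structure rules from the conversation: each left-hand side expands
# to the symbols on the right-hand side.
# The words under Det, N and V are assumptions chosen for illustration.
RULES = {
    "S": [["NP", "VP"]],
    "NP": [["Det", "N"]],
    "VP": [["V", "NP"]],
    "Det": [["the"], ["a"]],
    "N": [["dog"], ["room"]],
    "V": [["entered"]],
}


def expand(symbol):
    """Yield every word sequence that the given symbol can expand to."""
    if symbol not in RULES:
        # Anything without a rule is treated as a terminal, i.e. an actual word.
        yield [symbol]
        return
    for right_hand_side in RULES[symbol]:
        # Expand each symbol on the right, then combine all the alternatives.
        parts = [list(expand(s)) for s in right_hand_side]
        for combo in itertools.product(*parts):
            yield [word for part in combo for word in part]


sentences = [" ".join(words) for words in expand("S")]
print(len(sentences))
print("the dog entered the room" in sentences)
```

Because none of these toy rules refers back to itself, the set of generated sentences is finite. A rule such as NP expanding to something that again contains an NP would make the set unbounded, and this simple enumerator would then not terminate.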
it's a particular formalism for
describing language okay so and Chomsky
was the first one he's the one who
figured that stuff out back in the 50s
and and and but he and and that's
equivalent actually the this the context
free grammar is actually is kind of
equivalent in the sense that it
generates the same sentences as a
dependency grammar would you know as the
dependency grammar is a little simpler
in some way you just have a root and it
goes like we don't have any of these the
the rules are implicit I guess in and we
just have connections between words the
phrase structure grammar is a kind of a
different way to think about the the
dependency grammar it's slightly more
complicated but it's kind of the same in
some ways so to clarify dependency
grammar is the framework under which you
see language and you make the case that
this is a good way to describe language
that's correct and uh Noam Chomsky is
watching this is very upset right now so
let's uh I'm just kidding but uh what's
the difference between uh where's the
the place of disagreement um between
phrase structure grammar and dependency
grammar they're they're very close so
phrase structure grammar and dependency
grammar aren't that aren't that far
apart I I I like dependency grammar
because it's more perspicuous it's more
transparent about representing the
connections between the words it's just
a little harder to see in phrase
structure grammar you know the the place
where Chomsky sort of devolved or went
off from from from this is he also
thought there was um something called movement
okay and so so and that's where we
disagree okay that's the place where I
would say we disagree and and and I mean
we maybe we'll get into that later but
the idea is if you want to do you want
me to explain that now I would love can
you to explain movement movement okay so
you're saying so many interesting things
yeah yeah yeah okay so here's the
movement is Chomsky basically sees
English and he says okay I said um you
know we had that sentence earlier like
it was like two dogs enter the room it's
changed a little bit say two dogs will
enter the room and he notices that hey
English if I want to make a question a
yes no question from that same sentence
I I say instead of two dogs will enter
the room I say will two dogs enter the
room okay there's a different way to to
say the same idea and it's like well the
auxiliary verb that will thing it's at
the front as opposed to in the middle
okay and so and he looked you know if
you look at English you see that that's
true for all those modal verbs and for
other kinds of auxiliary verbs in
English you always do that you always
put an auxiliary verb at the front and
and what he when he saw that so you know
if I say um I can win this bet can I win
this bet right so I move a can to the
front so actually that's a theory I just
gave you a theory there I he he talks
about it as movement that word in the
thinks the declarative is the root is is
the sort of default way to think about
the sentence and you move the auxiliary
verb to the front that's a movement
Theory okay and he he just thought that
was just so obvious that it must be true
that that that there's nothing more to
say about that that this is how
auxiliary verbs work in English there's
a movement rule such that you're move
like to get from the declarative to the
interrogative you're moving the
auxiliary to the front and it's a little
more complicated as soon as you go to
simple simple present and simple past
because you know if I say you know John
slept you have to say did John sleep not
slept John right and so it's you have to
somehow get an auxiliary verb and I
guess underlyingly it's like slept is
it's a little more complicated than that
but his that's his idea there's a
movement okay and and and so a different
way to think about that that isn't I
mean the then then he ended up showing
later so he proposed this theory of
grammar which has movement there's other
places where he thought there's movement
not just auxiliary verbs but things like
the passive in English and things like
um questions wh questions a bunch of
places where he thought there's also
movement going on and and in each each
one of those these things there's words
well phrases and words are moving around
from one structure to another what you
call Deep structure to surface structure
I mean there's like two different
structures in his in his theory okay um
there's a different way to think about
this um which is there's no movement at
all there's a lexical copying rule such
that the word will or the word can these
these auxiliary verbs they just have two
forms and and and one of them is the
declarative and one of them is
interrogative and you basically have the
declarative one and oh I form the
interrogative or I can form one from the
other it doesn't matter which direction
you go and and I just have a new entry
which has the same meaning which has a
slightly different argument structure
argument structure just a fancy word for
The Ordering of the words and so if I
say you it was um the the dogs two dogs
can or will enter the room the the
there's two forms of will one is Will
declarative and and then okay I've got
my subject to the left it comes before
me and the verb comes after me in that
one and then the will interrogative it's
like oh I go first interrogative will is
first and then have the subject
immediately after and then the verb
after that and so you just you can just
generate from one of those words another
word with a slightly different argument
structure with different ordering and
these are just lexical copies they
they're not necessarily moving from one
to another there's no movement there's a
romantic notion that you have like one
main way to use a word and then you
could move it around right right which
is essentially what movement is implying
yeah but that's that's the lexical
copying is similar so then so then then
we we do lexical copying for that same idea
that maybe the declarative is the source
and then we can copy it and so an
advantage uh for there's multiple
advantages of the lexical copying story
it's not my story this is like um Ivan
Sag linguists a bunch of linguists have
been proposing these stories as well you
know in tandem with the movement story
okay you know he's he Ivan Sag died a
while ago but he was a one of the
proponents of the non-movement of the
lexical copying story and so that is
that um a great Advantage is well
Chomsky really famously in 1971 showed
that the movement story leads to
learnability problems it leads it leads
to problems for for how language is
learned it's really really hard to
figure out what the underlying structure
of a language is if you have both phrase
structure and movement it's like really
hard to figure out what came from what
there's like a lot of possibilities
there if you don't have that problem
learning that learning problem gets a
lot easier say there's lexical copies
and when we say the learning problem do
you mean like humans learning a new
language yeah just learning English so
baby is lying around listening to the
crib listening to me talk and is you
know how are they learning English or or
you know maybe it's a 2-year-old who's
learning you know interrogatives and
stuff or one you know there you how are
they doing that are they doing it from
like are they figuring out or like know
so Chomsky said it's impossible to
figure it out actually he said it's
actually impossible not not hard but
impossible MH and therefore that's that
that's where Universal grammar comes
from is that it has to be built in and
so what they're learning is uh that
there there's some built-in movement is
built in in his story is absolutely part
of your language module and uh and then
you are you're just setting parameters
you're you're said depending on English
is just sort of a variant of the
universal grammar and you're figuring
out oh which orders do does English do
these things that's the the non-movement
story doesn't have this it's like much
more
bottom up uh you're you're learning
rules you're learning rules one by one
and oh there's this this word is
connected to that word a great advant
another Advantage it's learnable another
advantage of it is that it predicts that
not all auxiliaries might move like it
it might depend on the word depending on
whether you and and and that turns out
to be true so there's words that um that
don't really work as auxiliary you they
work in declarative and not in in
interrogative so I can say um I'll give
you the opposite first if so I can say
aren't I invited to the party okay and
that's an that's an interrogative form
but it's not from I aren't invited to
the party there is no I aren't right so
that's that's interrogative only and and
then we also have forms like um ought uh
I I ought to do this and and I guess
some British old British people can say
exactly it doesn't sound right does it
for me it sounds ridiculous I don't even
think a is great but I mean I totally
recognize I ought to I is not too bad
actually I can say I ought to do this
that sounds if I'm trying to sound
sophisticated maybe I don't know it just
sounds completely out to me I yeah
anyway it's so there are variance here
uh and a lot of these words just work in
one versus is the other and and that's
like fine under the lexical copying
story it's like well you just learn the
usage whatever the usage is is what you
is what you do with this with with this
word but um it doesn't it's a little bit
harder in the movement story The
Movement story like that's an advantage
I think of lexical copying in all these
different places there's there's all
these usage variants which make the
movement story um a little bit harder to
work so one of the main divisions here
is the movement story versus the copy story
that has to do with the auxiliary words
and so on but you if rewind to the
phrase structured grammar yeah versus
dependency grammar those are equivalent
in some sense in that for any dependency
grammar I can generate a dependence a
phrase structure grammar which generates
exactly the same sentences I just I just
like the dependency grammar uh formalism
because it makes something really
Salient which is the depend the the
lengths of dependencies between Words
which isn't so obvious in in the phrase
in the phrase structure it's just kind
of hard to see it's in there it's just
very very it's opaque uh technically I
think phrase structure grammar is
mappable to dependency grammar and vice
versa and vice versa yeah there's like
these like little labels S NP VP yeah for
a particular dependency grammar you can
make a phrase structure grammar which
generates exactly those same sentences
and vice versa but there are many phrase
structure grammars which you can't
really make a dependency grammar I mean
there you can do a lot more in a phrase
structure grammar you get many more of
these extra nodes basically you you can
have more structure in there uh and and
some people like that and and maybe
there's value to that I I I don't like
it well for you so we should clarify so
so dependency grammar it's just uh well
one word depends on only one other word
and you form these trees and that makes
it really puts priority on those
dependencies just like as a as a tree
that you can then measure the distance
of the dependency from one word to the
other they can then map to uh the
cognitive processing of the of these
sentences how well how easy it is to
understand and all that kind of stuff so
it just puts the focus on just like the
mathematical
um uh distance of dependence between
words so like it's just a different
Focus absolutely Ju Just continue on a
thread of Chomsky because it's really
interesting because it as you're
discussing
disagreement to the degree there's
disagreement you're also telling the
history of the study of language which
is really awesome so you mention context
free versus regular does that
distinction come into play for dependency
grammar no okay not at all I mean the
regular regular languages are too simple
for human languages they they are uh
they it's a part of the hierarchy but
human languages are in in the phrase
structure world are definite they
they're at least context free maybe a
little bit more a little bit harder than
that but uh so there's something called
context sensitive as well where you can
have like this is the just the formal
language description in in a context
free grammar you have one this is like a
bunch of like formal language Theory
we're doing here but I love it okay so
you have you have a left- hand side
category and you're expanding to
anything on the right is is a uh that's
a context free so like the idea is that
that category on the left expands in
independent of context to those things
whatever they are on the right doesn't
matter what and and a context sensitive
says Okay I I actually have more than
one thing on the left I can tell you
only in this context you know I maybe
have like a left and a right context or
just a left context or a right context I
have two or more stuff on the left tells
you how to expand that th those things
in that way okay so it's context
sensitive a a regular language is just
more constrained and so it it doesn't
allow anything on the right it it allows
very it allow basically it's a one very
complicated rule is kind of what a a a
regular language is and so it doesn't
have any um it say long distance
dependencies it doesn't allow recursion for
instance there's no recursion yeah
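
The distinction drawn above is about the shape of the rules, and that shape can be checked mechanically. The sketch below uses the standard textbook definitions rather than the speaker's informal wording: a rule set counts as regular here if it is right-linear (one category on the left, and at most one category on the right, at the very end), as context-free if every rule has exactly one category on the left, and otherwise as needing context. The function name `classify` and the toy symbols are assumptions for illustration.

```python
def classify(rules, nonterminals):
    """Classify a rule set by the shape of its rules.

    rules is a list of (left, right) pairs, each side a tuple of symbols.
    nonterminals is the set of category symbols; all others are words.
    """

    def single_category(left):
        return len(left) == 1 and left[0] in nonterminals

    def right_linear(left, right):
        if not single_category(left):
            return False
        # Allow words followed by at most one category in final position.
        body = right[:-1] if right and right[-1] in nonterminals else right
        return all(symbol not in nonterminals for symbol in body)

    if all(right_linear(left, right) for left, right in rules):
        return "regular"
    if all(single_category(left) for left, _ in rules):
        return "context-free"
    return "context-sensitive or beyond"


categories = {"S", "A", "B"}

# Center embedding: S -> noun S verb | noun verb (the category sits in the middle).
center_embedded = [(("S",), ("noun", "S", "verb")), (("S",), ("noun", "verb"))]

# Only extends at the right edge: S -> noun S | noun.
right_edge_only = [(("S",), ("noun", "S")), (("S",), ("noun",))]

# More than one symbol on the left: the expansion depends on context.
needs_context = [(("A", "B"), ("A", "word"))]

print(classify(center_embedded, categories))
print(classify(right_edge_only, categories))
print(classify(needs_context, categories))
```

This classifies the grammar as written, not the language it generates: a rule set with context-free shape can still happen to generate a language that some regular grammar also describes.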
recursion is where you which human
languages have recursion they have
embedding and you can't well it doesn't
allow Center embedded recursion which
human languages have which is what
Center embedded recursion within a
sentence within a sentence yeah within a
sentence so here we're going to get to
that but I you know the formal language
stuff is a little aside Chomsky wasn't
proposing it for human languages even he
was just pointing out that human
languages are context free and then he
was most in for for human because that
was kind of stuff we did for formal
languages and what he was most
interested in was human language and
that's like the the movement is where we
we we where where he sort of set off in
on I would say a very interesting but
wrong foot it was kind of interesting
it's a very I agree it's a very
interesting history so there's this so
he proposed this multiple theories in 57
and then 65 they're they all have this
framework though was phrase structure
plus movement different versions of the
of the phrase structure and the movement
in the 57 these are the most famous
original bits of Chomsky's work and then
in 71 is when he figured out that those
lead to learning problems that that
there's cases where a kid could never
figure out which rule um which set of
rules was intended and and so and then
he said well that means it's innate it's
kind of interesting he just really
thought the movement was just so
obviously true that he couldn't he
didn't even entertain giving it up it's
just obvious that's obviously right and
um it was later where people figured out
that there's all these like subtle ways
in which things which which look like
generalizations aren't generalizations
and they you across the category they're
they're word specific and and they have
and they they kind of work but they
don't work across various other words in
the category and so it's easier to just
think of these things as lexical copies
and and I think he was very obsessed I
don't know I'm like guessing that he he
just he really wanted this story to be
simple in some sense and language is a
little more complicated in some sense
you know he didn't like Words uh he
never talks about words he likes to talk
about combinations of words and words
are you know look up a dictionary
there's 50 senses for a common word
right the word take will have 30 or 40
senses in it so uh there'll be many
different senses for common words and he
just doesn't think about that it's or
doesn't think that's language I think he
doesn't think that's language he thinks
that words are distinct from
combinations of words I think they're
the same if you look at my brain in the
scanner while I'm listening to a
language I understand and you compare I
can localize my language Network in a
few minutes in like 15 minutes and what
you do is I listen to a language I know
I listen to you know maybe some language
I don't know or I listen to muffled
speech or I I read sentences and I read
non-words like I do anything like this
anything that's sort of really like
English and anything that's not very
like English so I've got to something
like it and not and I got a control and
and the voxel which is just you know the
um 3D pixels in my in my brain that are
responding most are are is a language
area and and that's this left
lateralized um area in my head and and
wherever I look in that network if you
look for the combinations versus the
words it's there it's it's everywhere
it's the same that's fascinating and so
it's like hard to find there are no
areas that we know I mean
that's it's a little overstated right
now at this at this point the the
technology isn't great it's not bad but
we have the best the best way to figure
out what's going on in my brain when I'm
listening or reading language is to use
fMRI functional magnetic resonance
imaging and that's a very good
localization method so I can figure out
where exactly these signals are coming
from pretty you know down to you know
millimeters you know cubic millimeters
or smaller okay very small we can figure
those out very well the problem is the
when okay uh it's it's measuring um
oxygen okay and oxygen takes a little
while to get to those cells so it takes
on the order of seconds so I talk fast I
I probably listen fast and I can
probably understand things really fast
so a lot of stuff happens in two seconds
and so to say that we know what's going
on that the words right now in that
Network our best guess is that whole
network is doing something similar but
maybe different parts of that Network
are doing different things and and
that's probably the case we just don't
have very good methods to figure that
out right at this moment and so since
we're kind of talking about the history
of the study of language what other
interesting disagreements and you're
both at MIT or were for a long time what
kind of interesting disagreements there
tension of ideas are there between you
and Noam Chomsky and we should say that
Noam was in the linguistics department
and
you're uh I guess for a time were
affiliated there but primarily a brain
and cognitive science department it's
just another way of studying language
and you've been talking about fMRI so
like what is there something else
interesting to bring to the surface
about the disagreement between the two
you or other people in the this yeah I I
mean I've been at MIT for 31 years since
1993 and he Chomsky's been there much
longer so I I met him I knew him I I met
when I first got there I guess and I and
we would interact every now and then I'd
say that so I'd say our our biggest
difference is our methods and so um
that's the biggest difference between me
and Noam uh is that I gather data from
people I uh do experiments with people
and I gather Corpus data whatever
whatever Corpus data is available and we
do quantitative methods to evaluate any
kind of hypothesis we have he just
doesn't do that so you know you you know
he has never once been associated with
any experiment or Corpus work ever and
so it's all thought experiments it's his
own intuitions so I I just don't think
that's the way to do things um that's a
that's a you know across the street
they're across the street from us kind
of difference between brain and cog sci
and Linguistics I mean not all lingu
some of the linguists depending on what
you do more speech oriented they do more
quantitative stuff but in the in the
meaning um words and well it's
combinations of words syntax semantics
they tend not to do experiments and uh
and corpus analyses so linguistics side
probably well the but the method is a
symptom of a bigger approach which is
sort of a psychology philosophy side on
Noam and for you it's more sort of
data-driven sort of almost like
mathematical approach yeah I mean I'm
psychologist so I would say we're in
psychology you know I I mean brain and
cognitive science is is MIT's old
psychology department it was the
psychology department up until 1985 and
it became the brain and Cog cognitive
science department and so I I mean my
training is in Psych I mean my training
is math and computer science but I'm a
psychologist I mean I'm I mean I don't
know what I am so data driven
psychologist yeah you are I am what I am
but but I'm happy to be called a
linguist I'm happy to be called a
computer scientist I'm happy to be
called a psychologist any of those
things in the actual uh like how that
manifests itself outside the methodology
is like these differences these subtle
differences about the movement Story
versus the lexical copy story those are
theories so the the like the theories
are I but I think the the reason we
differ in part is because of how we
evaluate the theories and so I evaluate
theories quantitatively and Noam
doesn't got it okay well let's let's
explore the theories that you explore in
your book Let's return to this
dependency grammar framework of looking
at language what's uh a good
justification why the dependency grammar
framework is a good way to explain
language what's your intuition so the
reason I like dependency grammar as I've
said before is that it's very
transparent about its representation of
distance between words so it's like it
all it is is you've got a bunch of words
you're connecting them together to make a
sentence
and a really neat
Insight which turns out to be true is
that the further apart the pair of words
are that you're connecting the harder it
is to do the production the harder it is
to do the comprehension it's it's harder
to produce it's hard to understand when
the words are far apart when they're
close together it's easy to produce and
it's easy to comprehend let me give you
an example okay so we have in any
language
we have mostly local connections between
words but they're abstract the the
connections are abstract they're between
categories of words and so you can
always make things further apart if you
put your if you add modification for
example after a noun so a noun in
English comes before a verb the subject
noun comes before a verb and then
there's an an object after for example
so I can say what I said before you know
the dog entered the room or something
like that so I can modify dog if I say
something more about dog after it then
what I'm doing is in indirectly I'm
lengthening the dependence the
dependence between dog and entered by
adding more stuff to it so I just make
just make it explicit here if I say um
uh the the boy who the cat scratched
cried we're going to have a mean cat
here and uh and so what I've got here is
I the boy cried it would be a very short
simple sentence and I just told you
something about
the the boy and I told you it was it was
the boy who the cat scratched okay so
the CED is connected to the boy C at the
end it's connected to the boy in the
beginning right and so I can do that I
can say that that's a perfectly fine
English sentence and I can say the cat
which the dog chased ran away or
something okay I can do that but I it's
really so I but it's really hard now I
I've got whatever I have here I have the
boy who the cat now let's say I try to
modify cat cat Okay the boy who the cat
which the dog chased scratched ran away
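
The effect of nesting on dependency length can be made concrete by counting words between connected pairs. The sketch below is an illustration under stated assumptions: distance is measured in words, the dependency arcs are one plausible hand-made analysis rather than one taken from the conversation, and the nested sentences are written to end in "cried" so that the three versions differ only in the added clauses. The names `dependency_lengths` and `summarize` are hypothetical.

```python
def dependency_lengths(arcs):
    """Return the distance in words for each (head, dependent) position pair."""
    return [abs(head - dependent) for head, dependent in arcs]


def summarize(arcs):
    """Return the total and the longest dependency length of a sentence."""
    lengths = dependency_lengths(arcs)
    return sum(lengths), max(lengths)


# Word positions start at 0. Arcs are (head position, dependent position).

# "the boy cried"
plain = [(1, 0), (2, 1)]

# "the boy who the cat scratched cried"
#   0   1   2   3   4      5       6
nested_once = [(1, 0), (6, 1), (1, 5), (5, 2), (4, 3), (5, 4)]

# "the boy who the cat which the dog chased scratched cried"
#   0   1   2   3   4    5    6   7    8        9       10
nested_twice = [
    (1, 0), (10, 1), (1, 9), (9, 2), (4, 3),
    (9, 4), (4, 8), (8, 5), (7, 6), (8, 7),
]

for name, arcs in [("plain", plain), ("once", nested_once), ("twice", nested_twice)]:
    total, longest = summarize(arcs)
    print(name, "total:", total, "longest:", longest)
```

Under these assumed arcs the total length goes from 2 to 15 to 40 and the longest single dependency from 1 to 5 to 9, so each extra level of nesting stretches several dependencies at once and not only the outermost one.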
oh my God that's hard right I I can I'm
sort of just working that through in my
head how to produce and how to and it's
really very just horrendous to
understand it's not so bad at least I've
got intonation there to sort of Mark the
boundaries and stuff but it's that's
really complicated that's sort of
English in a way I mean that follows the
rules of English but uh so what's
interesting about that is is that what
I'm doing is nesting dependencies there
I'm putting one con I've got a subject
connected to a verb there and and then
I'm modifying that with a a a clause
another clause which happens to have a
subject in a verb relation I'm trying to
do that again on that second one and
what that does is it lengthens out the
dependence multiple dependence actually
get lengthened out there the
dependencies get get longer longer on
the outside ones get long and even the
ones in between get kind of long and and
and you just so what's fascinating is
that that's bad that's really horrendous
in English but that's horrendous in any
language and so in in no matter what
language you look at if you do just
figure out some structure where I'm
going to have some modification
following some head which is connected
some later head and I do it again it
won't be good guaranteed like 100% that
will be uninterpretable in that language
in the same way that was uninterpretable
in English just clarify the distance of
the dependencies is whenever the uh boy
cried this um there's a dependence
between words and then you counting the
number of what morphemes between them
that's a good question I I just say
words your words are morphs between
don't we don't know that actually that's
a very good question what is the
distance metric but let's just say it's
words sure okay so that and you're
saying the longer the distance of that
dependence the more no matter the
language except legalese uh even legalese okay
we'll talk about it okay okay okay uh
but
that the people will be very upset that
speak that language not upset but
they'll either not understand it they'll
be like this is they'll uh their brain
will be working in overtime yeah they
they'll have a hard time either
producing or comprehending it they might
tell you that's not their language you
know it's sort of the language I mean it
it following their like they'll agree
with each of those pieces as part of
their language but somehow that
combination will be very very difficult
to produce and understand is that a
chicken or the egg issue here so like is
well I'm giving you an explanation so
the I well I mean I didn't there's I'm
giving you two kinds of explanations I'm
telling you that center embedding that's
nesting those are the same those are
synonyms for the same concept here and
I'm the explanation for why those are
always hard center embedding and nesting
are always hard and I give you an
explanation for why they might be hard
which is long-distance connections
there's a when you do center embedding when
you do nesting you always have
long-distance connections between the
dependents you just so that's not
necessarily the right explanation it
just ha I can go through reasons why
that's probably a good explanation and
it's not really just about one of them
it so probably it's a a pair of them or
something of these dependents that you
get get long that drives you to like be
really confused in that case and so what
the the behavioral consequence there if
you I mean we this is kind of methods
like how do we get at this you could try
to do experiments to get people to
produce these things they're going to
have a hard time producing them you can
try to do experiments to get them to
understand them get and see how well
they understand them can they understand
them uh another method you can do is
give people partial materials and ask
them to complete them you know those
those center-embedded materials and and
they they'll fail so I've done that I've
done all these kinds of things wait a
minute so so center embedding meaning
like you take a normal sentence like boy
cried and inject a bunch of crap in the
middle yes that separates the boy and
the cried yes okay that's center
embedding and nesting is on top of that no
no nesting is the same thing center embedding
those are totally equivalent terms I'm
sorry I sometimes use one and sometimes
got it got it different got it and then
uh what you're saying is there's a bunch
of different kinds of experiments you
can do I mean I like the understanding
one is like have more embedding more
center embedding is it easier or harder
to understand but then you have to
measure the level of understanding I
guess yeah yeah you could I mean there's
multiple ways to do that I mean there's
there's the simplest way is just ask
people how good does it sound how natural
does it sound that's a very blunt but very
good measure it's very very reliable
people will do the same thing and so
it's like I don't know what it means
exactly but it's doing something such
that we're measuring something about the
confusion the difficulty associated with
those and those like those are giving
you a signal that's why you can say that
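The long-distance explanation for why center embedding is hard can be made concrete with a small sketch. It counts distance in words (the metric settled on above) for a doubly nested sentence of the kind used in this conversation, "the book which the author who I met wrote was good", and for a paraphrase without nesting. The dependency arcs are hand-annotated assumptions for illustration, not parser output, and the paraphrase is a hypothetical comparison sentence that is not from the conversation.

```python
# Hand-annotated dependency arcs as (head_position, dependent_position),
# with words numbered from 1. These annotations are illustrative
# assumptions, not the output of a parser.

# the(1) book(2) which(3) the(4) author(5) who(6) I(7) met(8)
# wrote(9) was(10) good(11)
NESTED = [
    (2, 1),    # book -> the
    (10, 2),   # was -> book (spans both embedded clauses)
    (2, 9),    # book -> wrote (relative clause)
    (9, 3),    # wrote -> which
    (9, 5),    # wrote -> author
    (5, 4),    # author -> the
    (5, 8),    # author -> met (relative clause)
    (8, 6),    # met -> who
    (8, 7),    # met -> I
    (10, 11),  # was -> good
]

# Hypothetical paraphrase without nesting:
# I(1) met(2) the(3) author(4) who(5) wrote(6) the(7) book(8)
# which(9) was(10) good(11)
UNNESTED = [
    (2, 1),    # met -> I
    (2, 4),    # met -> author
    (4, 3),    # author -> the
    (4, 6),    # author -> wrote (relative clause)
    (6, 5),    # wrote -> who
    (6, 8),    # wrote -> book
    (8, 7),    # book -> the
    (8, 10),   # book -> was (relative clause)
    (10, 9),   # was -> which
    (10, 11),  # was -> good
]


def dependency_lengths(arcs):
    """Distance in words between each head and its dependent."""
    return [abs(head - dep) for head, dep in arcs]


for name, arcs in [("nested", NESTED), ("unnested", UNNESTED)]:
    lengths = dependency_lengths(arcs)
    print(name, "total:", sum(lengths), "longest:", max(lengths))
```

Under these assumed annotations the nested version has a total dependency length of 34 with a longest arc of 8 words, against a total of 14 and a longest arc of 2 for the paraphrase. A different annotation scheme would change the exact numbers but not the direction of the difference.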
okay what about the the completion of
this with the central so if you give
them a partial sentence say I say um the
book which the author who and I ask you
to now finish that off for me I mean
either say yeah yeah but you can just
but it say it's written in front of you
and you can just type and have as much
time as you want they will even though
that one's not too hard right so if I
say it's like the book it's like oh the
the book which the author who I met
wrote was good you know that's a very
simple completion for that you know if I
give that completion online somewhere
to a uh you know um a crowdsourcing
platform and ask people to complete that
they will miss off a verb very
regularly like half the time maybe
two-thirds of the time they'll say
they'll just leave off one of those verb
phrases even with that simple so say the
book uh which the author who and they'll
say was um um they won't you need three
verbs right I need three verbs here who
I met wrote was good and they'll give me
two they'll say who was who was famous
was good or something like that they'll
just give me two and and and that
that'll happen about 60% of the time so
40% maybe 30 they'll do it correctly
correctly meaning they'll do with three
verb phrases I don't know what's correct
or not you know it this is hard it's a
hard task yeah I actually I'm struggling
with it in my head well it's it's easier
when you when you stare at it if you a
little easier than listen listening is
pretty tough because you have to because
there's no trace of it you have to
remember the words that I'm saying which
is very hard auditorily we wouldn't do
it this way we you do it written you can
look at it and figure it out it's easier
in many dimensions in some ways
depending on the the person it's easier
to gather written data for I mean most
sort of psycho I work in
psycholinguistics right psychology of language
and stuff and so a lot of our work is
based on written stuff because it's so
easy to gather data from people doing
written kind of tasks spoken tasks are
just more complicated to administer and
analyze because people do weird things
when they speak and it's harder to
analyze what they do but they um they
they generally point to the same kinds
of things so okay so the the universal
theory of language Yeah by Ted Gibson is
uh that you can form
dependency you can form trees from many
sentences and right you can measure the
distance in some way of those
dependencies and then you can say that
uh most languages have very short
dependencies all languages all languages
all languages have short dependencies
you can actually measure that so uh an
ex-student of mine this guy's at um
University of California Irvine Richard
Futrell did a thing a bunch of years
ago now where he looked at all the
languages we could look at which was
about 40 initially and now I think
there's about 60 for which there are
dependency structures like you so
meaning there's got to be like a
big text a bunch of texts which have been
parsed for the dependency structures and
there's about 60 of those which have
been parsed that way and for all of
those um you can what what he did was
take any um any sentence in one of those
languages and uh and you can do the
dependency structure and then start at
the root we we're talking about
dependency structures that's pretty easy
now and and he's trying to figure out
what a control way you might say the
same sentence is in that language and so
he's just like all right there's a root
and it has let's say as a sentence is um
let's go back to you know two dogs
entered the room so entered is the root
and entered has um two dependents it's
got dogs and it has room okay and what
he does is like let's scramble that
order that's three things the root and
the head and the two dependents and
in just some random order just random
and then just do that for all the
dependents down so now look do
it for two in dogs
and for uh the in room and and that's you
know that's not it's a very short
sentence when sentences get longer and
you have more dependents there's more
scrambling that's possible and what he
found was so that so so that that's one
you could figure out one scrambling for
that sentence he did that like 100 times for
every sentence in every one
of these texts every corpus and and then
he just compared the dependency lengths
in those random scramblings to what
actually happened what what the the
English or the French or the German was
in the original language or Chinese
or what are all these like 80 no 60
languages okay and and the dependency
lengths are always shorter in the real
language compared to this kind of a
control and there's another it's a
little more rigid his control so um the
the way I described it you could have
crossed dependencies like by by
scrambling that way you could scramble
in any way at all um languages don't do
that they tend not to cross dependencies
very much like so the dependency
structure they just they tend to keep
things uh non-crossed and there's a you
know there's a technical term they call
that projective but it's just non-crossed
is all that is projective and so if you
just
constrain the the scrambling so that it
only gives you projective sort of
non-crossed the same thing holds so
still human languages have dependencies that
are much shorter uh than this kind
of a control so there's like it what it
means is that that that we're in every
language we're trying to put things
close in relative to this kind of a
control like it doesn't matter about the
word order some of these are verb final
some of them these are verb medial like
English and some are even verb initial
there are a few languages of the world
which have VSO word order verb
subject object languages we haven't talked
about those it's like 10% of the and
even even in those languages it's still
short dependencies short dependencies is
the rule okay so how what what are some
possible explanations for that uh for
why why languages have evolved that way
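The comparison described above (real word order against randomly reordered versions of the same dependency tree) can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure from the actual study: the tree for "two dogs entered the room" is hand-annotated, the function names are hypothetical, and the reordering shown is the constrained projective variant, where each head is shuffled together with its dependents and every subtree stays contiguous so no dependencies cross.

```python
import random

# Hand-annotated tree for "two dogs entered the room" (an assumption
# for illustration). HEAD maps each word index to the index of its head.
WORDS = ["two", "dogs", "entered", "the", "room"]
HEAD = {0: 1, 1: 2, 3: 4, 4: 2}
ROOT = 2  # "entered"


def children(head_of):
    """Invert the head map into head -> list of dependents."""
    kids = {}
    for dep, head in head_of.items():
        kids.setdefault(head, []).append(dep)
    return kids


def random_projective_order(node, kids, rng):
    """Shuffle a head with its dependents, then recurse into each
    dependent's subtree. Subtrees stay contiguous, so no arcs cross."""
    units = [node] + kids.get(node, [])
    rng.shuffle(units)
    order = []
    for unit in units:
        if unit == node:
            order.append(node)
        else:
            order.extend(random_projective_order(unit, kids, rng))
    return order


def total_dependency_length(order, head_of):
    """Sum of head-to-dependent distances, counted in words."""
    position = {word: i for i, word in enumerate(order)}
    return sum(abs(position[dep] - position[head])
               for dep, head in head_of.items())


def baseline_mean(head_of, root, samples=100, seed=0):
    """Average total dependency length over random reorderings."""
    rng = random.Random(seed)
    kids = children(head_of)
    totals = [
        total_dependency_length(
            random_projective_order(root, kids, rng), head_of)
        for _ in range(samples)
    ]
    return sum(totals) / len(totals)


observed = total_dependency_length(list(range(len(WORDS))), HEAD)
print("observed order:", observed)
print("random baseline mean:", baseline_mean(HEAD, ROOT))
```

For a sentence this short the gap between the observed order and the baseline is small. As noted in the conversation, longer sentences with more dependents leave more room for scrambling, which is where the difference becomes clearer.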
so that that's one of the um I suppose
disagreements you might have with
Chomsky so you consider the evolution of
language in um in terms of information
theory yeah and uh for you the purpose
of language is ease of communication
right and processing that's right that's
right so I mean the the story here is
just about communication it is just
about production really it's about ease
of production is the story when you say
production can you can oh I just mean
ease of language production it's easier
for me to say things when the what I'm
doing whenever I'm talking to you is
somehow I'm formulating some idea in my
head and I'm putting these words
together and it's easier for me to do
that uh to put to say something where
the words are closely connected in
a dependency as opposed to separated
like by putting something in between and
over and over again I it's just hard for
me to keep that in my head it like
that's that's the whole story like the
story it's basically I like that the
dependency grammar sort of gives that to
you like just like long is bad
short is good it's like easier to keep
in mind because you have to keep it in
mind for probably for production probably you
know matters in comprehension as well
like also matters in comprehension it's
on both sides of the production and the
but I would guess it's probably evolved
for production it's about producing it's
right what's easier for me to say that
ends up being easier for you also I
that's very hard to disentangle this
idea of who's it for is it for me the
speaker or is it for you the listener I
mean part of my language is for you like
the way I talk to you is going to be
different from how I talk to different
people so I'm I'm definitely angling
what I'm saying to who I'm saying right
it's not like I'm talking the same
way to every single person and so I am
sensitive to my audience but how does
that does that you know work itself out
in the in the dependency length
differences I don't know maybe that's
about just the words that part you know
which words I select my initial
intuition is that you optimize language
for the audience yeah but it's both it's
just kind of like messing with my head a
little bit to say that some of the
optimization might be it may be the
primary objective of the optimization
might be the ease of production we have
different senses I guess I'm I'm like
very
selfish and you're I'm like I think it's
like it's all about me I'm like I'm just
doing what's easiest for me at all I
don't want to I'm like I'll I mean but I
have to of course choose the words that
I think you're going to know I'm not
going to choose words you don't know in
fact I'm going to fix that when I you
know so there it's about but but maybe
for for the Syntax for the combinations
it's just about me I feel like it's I
don't know though it's very wait wait
purpose of communication is to be
understood is to convince others and so
on so like the selfish thing is to be
understood it's a bit circular there too
then okay right I mean like the ease of
production helps me be understood
then I don't think it's circular so I I
think the primary I think the primary
objective is to be understood is about the
listener because otherwise if you're
optimizing for the ease of production
then you're you're not going to have any
of the interesting complexity of
language like you're trying
let's control for what it is I want to
say like I I'm saying let's control for
the thing the the message control for
the message the message needs to be
understood that's the goal oh but that's
the meaning so I'm still talking about
the form just the form of the meaning
how do I frame the form of the meaning
is all I'm talking about you're talking
about a harder thing I think is like how
am I like trying to change the let's
let's keep the meaning constant like
which it if you keep the meaning
constant how can I phrase whatever it is
I need to say
like I got to pick the right words and
I'm going to pick the order so that it's
so it's easy for me that's that's that's
that's what I think it is uh I think I'm
still tying meaning and
form together in my head but you're
saying if you keep the meaning of what
you're saying constant yeah what the
optimization yeah it could be the
primary objective of that optimization
is the ease of production that's interesting
I'm I'm struggling to keep constant
meaning it's just so I mean I'm I'm such
a I'm a human right so for me the form
without having
introspected on this the form and the
meaning are tied together like
deeply because I'm a human like for for
me what I'm speaking cuz I haven't
thought about language like in a
rigorous way about the form of language
look for any
event there's there's an an unbounded I
I don't want to say infinite but sort of
unbounded you know ways that I might
communicate that same event this two
dogs entered a room I can say in many
many different ways I can say hey
there's two dogs they entered the room
hey the room was entered by something
the thing that was entered was two dogs
I mean there's I mean it's kind of
awkward and weird and stuff but those
are all similar
messages with different forms but
different ways I might frame and of
course I use the same words there all
the time I could have referred to the
dogs as you know a Dalmatian and a
poodle or something you know I I could
have been more specific or less specific
about what they are and I could have
been more abstract about
the number there's like so I like
I'm trying to keep the meaning which is
this event constant and then how am I
going to describe that to get that to
you it kind of depends on what you need
to know right and what I think you need
to know but I'm like trying let's
control for all that stuff and not and
it's like I'm just like choosing about
I'm doing something simpler than you're
doing which is just forms yes just words
to You specifying the species the the
breed of dog and whether they're cute or
not is changing the meaning that might
be yeah yeah that would be changing well
that would be changing the meaning for
sure right so you're just yeah well yeah
yeah that's changing the meaning but say
even if we keep that constant we can
still talk about what's easy or hard for
me right The Listener and the and the
right which phrase structures I use
which combinations which you know this
is so fascinating and just like a uh a
really powerful window into human
language
but I wonder still throughout
this how vast the gap between meaning
and form I just I just have this like
maybe romanticized notion that they're
close together that they evolve close
like hand in hand that you can't just
simply optimize for one without the
other being in the room with
us like it's well it's kind of like an
iceberg form is the tip of the iceberg
and the rest the the meaning is the
iceberg but you can't like separate but I
think that's why these large language
models are so successful is because
they're good at form and form isn't that
hard in some sense and meaning is
tough still and that's why they're not
they're you know they don't understand
what they're doing we're going to talk
about that later maybe but uh like we
can
distinguish in our forget about large
language models like humans maybe you'll
talk about that later too is like the
difference between language which is a
communication system and thinking which
is meaning so language is a communication
system for the meaning it's not the
meaning and so that's why I mean that
and there's a lot of interesting
evidence we can talk about relevant
to that well I mean that's a really
interesting question what is the
difference between uh
language written
communicated versus
thought what do you see as the difference
between them well you or anyone has to
think of a task which they think is is a
good thinking task and there's lots and
lots of tasks which should be good
thinking tasks and whatever those tasks
are let's say it's you know playing
chess or that's a good thinking task or
playing some game or doing some complex
puzzles uh maybe maybe remembering some
digits that's thinking remembering some
a lot of different tasks we might think
maybe just listening to music is
thinking or there's a lot of different
tasks we might think of as thinking
there's this woman in my department Ev
Fedorenko and she's done a lot of work
on this question about what's the
connection between language and thought
and and so she uses I was referring
earlier to MRI fMRI that's her primary
method and so she has been really
fascinated by this question about
what language is okay and so as
I mentioned earlier you can localize my
language area your language area in a
few minutes okay like 15 minutes I can
listen to language listen to non-
language or backward speech or something
and and and we'll find a left lateralized
network in my head which is
specially which is very sensitive to
language as opposed to whatever that
control was okay can you specify what you
mean by language like communicated
language like what is just sentences you
know I'm listening to English of any
kind story or I can read sentences
anything at all that I understand if I
understand it then it'll activate my
language Network so right now my
language network is going like crazy
when I'm talking and when I'm listening
to you because we're both we're
communicating and that's pretty stable
yeah it's incredibly stable so I've I I
happen to be married to this woman Ev
Fedorenko so I've been scanned by her over
and over and over since 2007 or six or
something and so my language network is
exactly the same you know like a month
ago as it was back in 2007 it's
amazingly stable it's astounding within
it's it's a really fundamentally cool
thing and so my language network is it's
like my face okay it's not changing much
over time inside my head can I ask a quick
question sorry it's a small tangent uh at
which point in the as you grow up from
baby to adult does it stabilize we don't
know like that's that's a very hard
question they're working on that right
now because of the problem scanning
little kids like trying to do
the localization
on little children in this scanner
you're lying in the fMRI scan that's the
best way to figure out where something's
going on inside our brains and the
scanner is loud and you're in this tiny
little com you know area you're
claustrophobic and it doesn't bother me
at all I can go to sleep in there but
some people are bothered by it and
little kids don't really like it and
they don't like to lie still and you have to
be really still because if you move
around that's that messes up the
coordinates of where where everything is
and so you know try to get you know your
question is how and when is language
developing you know how when how does
this left lateralized system come to
play where's it you know and it's really
hard to get a two-year-old to do this
task but you can maybe where they're
starting to get three and four and
five-year-olds to do this task for short
periods and it looks like it's there
pretty early so clearly when you lead up
to like a baby's first
words before that there's a lot of
fascinating turmoil going on about like
figuring out like what is what are these
people saying and you're trying to like
make sense how does that connect to the
world and all that kind of stuff yeah
that that that might be just fascinating
development that's happening there
that's yeah hard to introspect but
anyway you anyway we're back to the
scanner and I can find my network in 15
minutes and now we can ask a we can ask
find my network find yours find you know
20 other people do this task and and we
can do some other tasks anything
else you think is thinking of some other
thing I can do a spatial memory task I
can do a music perception task I can do
programming task if I program okay I can
do uh where I can like understand
computer programs and none of those
tasks tap the language Network at all
like at all there's no overlap they do
they're they're highly activated in
other parts of the brain there's a
there's a bilateral Network which I
think she she tends to call the multiple
demand network which does anything kind
of hard and so anything that's kind of
difficult in some ways will um activate
that multiple demand network I mean
music will be in some music area you
know there's music specific kinds of
areas and so uh but they but but none of
them are activating the language area at
all unless there's words like so if you
have music and there's a song and you
can hear the words then then then you
get the language area we're talking
about speaking and listening but are
we also talking about reading this is
all comprehension of any kind and so so
that is fascinating so what this this this
network doesn't make any difference if
it's written or spoken so the
the thing that she calls Fedorenko calls
the the language network is this high
level language so it's not about the
spoken the spoken language and it's not
about the written language it's about
either one of them and so we're so when
you do speech you're sort of listen you
either you listen to speech and you and
you subtract away some language you
don't understand and so and or you
subtract away backward speech which
sounds like speech but it isn't
and and then so you you take away the
sound part
and so and then if you do written you
get exactly the same network so for just
reading the language versus reading sort
of nonsense words or something like that
you you'll find exactly the same network
so this is about high level um the
comprehension of language yeah
in this case and and the same thing
happens production is a little harder to
run the scanner but the same thing
happens in production you get the same
network so production is a little harder
right you have to figure out how do you
run a task you know in the scanner such
that you're doing some kind of
production and I can't remember what
they've done a bunch of different kinds
of tasks there where you get people to
produce things yeah figure out how to
produce and the same network goes on
there exactly the same place so if wait
wait so if you read random words yeah if
you read things like um like gibberish yeah
yeah Lewis Carroll's 'twas brillig Jabberwocky
right they call that Jabberwocky speech
the network doesn't get activated not as
much there are words in there there's
function words and stuff so it's lower
activation fascinating yeah yeah so
there's like basically the more language
like it is the higher it goes in the
language Network and that network is
there from when you speak from as soon
as you learn language and and and it's
it's there like you you speak multiple
languages the same network is going for
your multiple languages so you speak
English you speak Russian then both of
them are hitting that same network if
you if you're fluent in those languages
so programming not at all isn't that
amazing even if you're a really good
programmer that is not a human language
it's just not conveying the same
information and so it is not in the
language Network and so is that as
mind-blowing as I think that's that's
weird it's amazing so that's like one
set of data this is hers that shows that
what you might think is thinking is
not language language is
just this conventionalized system
that we've worked out in in human
languages oh another fascinating little
tidbit is that even if they're these
constructed languages like Klingon or um
I don't know the languages from Game of
Thrones I'm sorry I don't remember those
languages maybe a lot of people offended
right now there's people that speak
those languages they they they really
speak those languages because the people
that wrote the languages for the shows
um they did an amazing job of
constructing something like a human
language and those that that lights up
the language area that's like because
they can speak you know pretty much
arbitrary thoughts in a human language
it's not a it's a constructed human
language and probably it's related to
human languages because the people that
were constructing them were making
them like human languages in various ways
but it it also activates the same
network which is pretty pretty cool
anyway sorry to go into a place where
you may be a little bit philosophical
but is it possible that this area of the
brain is doing some kind of translation
into a deeper set of almost like
concepts that's it has to be doing so
it's doing in communication right it is
translating from thought whatever that
is it's more abstract and it's doing
that that's what it's doing like it is
that that is kind of what it is doing
it's kind of a meaning Network I guess
yeah like a translation network but I
wonder what is at the core at the bottom
of it like what are thoughts are they
are thoughts to me like I don't know
thoughts and words are they neighbors or
are is it one turtle sitting on top of
the other meaning like is there a
deep set of Concepts that we well
there's connections right between the
what what these things mean and then
there's probably other other parts of
the brain what these things mean and so
you know when I'm talking about whatever
it is I want to talk about if it's some
it'll be represented somewhere else that
that knowledge of whatever that is will
be represented somewhere else well I
wonder if there's like some stable
nicely compressed encoding of meanings I
don't know that's separate from language
that you
know I guess I guess the implication
here is
that that we don't think in language
that's correct isn't that cool and and
and that's so interesting so people I
mean this is like hard to do experiments on
but there is this idea of an inner voice
and a lot of people have an inner voice
and so if you do a poll on the internet
and ask if you hear yourself talking
when you're just thinking or whatever
you know about 70 or 80% of people will
say yes uh most people have an inner
voice I I I don't and so I always find
this strange when so when people talk
about an inner voice I always thought
this was a metaphor and and uh
I know most of you whoever is
listening to this thinks I'm crazy now
because I don't have an inner
voice and I I just don't know what
you're listening to I I just it sounds
so kind of annoying to me but that to
have this voice going on while you're
while you're thinking but I guess most
people have that and I don't have that
and uh I don't we don't really know what
that that connects to I wonder if the
inner voice activates that same network
I I wonder I don't I don't know I don't
know I mean this could be speechy right
so that's like that you hear do you have
an inner voice I don't think so oh a lot
of people have this sense that they hear
other people they hear themselves and say
they read someone's email I've heard
people tell me that they hear that other
other person's voice when they read
other
people's emails and I'm like wow that
sounds so disruptive I do think I like
vocalize what I'm reading but I don't
think I hear a voice mhm well that's you
probably don't have an inner voice yeah
I don't think I have people have an inner
voice people have this strong percept of
hearing sound in their heads when
they're just thinking I refuse to
believe that's the majority of people
majority absolutely it's it's like 2/3
or 3/4 it's a lot whenever I ask a class
and on the internet they always
say that so you're you're in the minority it
could be a self-report flaw it could be
you know when I'm reading yeah inside my
head I'm kind of like saying the words
which is probably the wrong way to read
but I don't hear a voice there's no
percept of a voice I refuse to believe
the majority of people have anyway it's
fascinating the human brain is fascinating
but it still blew my mind that the that
language does appear comprehension does
appear to be separate from thinking MH
so that's one set one set of data from
Fedorenko's group is that um no matter
what task you do if it doesn't have
words and combinations of words in it
then it won't light up the language
Network and you know you could it'll be
active somewhere else but not there so
that's one and then this other um piece
of evidence relevant to that question is
it turns out there are these this group
of people who've had a massive stroke on
the left side and wiped out their
language Network and it as long as they
didn't wipe out everything on the right
as well in that case they wouldn't be
you know cognitively functional but if
they just wiped out language which is
pretty tough to do because it's it's
very expansive on the left but if they
have then there are these there's
patients like this uh so-called
global aphasics who um can do any task
just fine but not language
you can't talk to them I mean they
don't understand you they can't speak
they can't write they can't read but
they can do all they can play chess they
can drive their cars they can do all
kinds of other stuff you know do math
they can do all like so math is not in
the language area for instance you do
arithmetic and stuff that's not language
area it's got symbols so people sort of
confuse some kind of symbolic
processing with language and symbolic
processing is not the same so there are
symbols and they have meaning but it's
not language it's not a you know
conventionalized language system and so
so math isn't there so they can do
math they they do just as well as their
age-matched controls on all these
tasks this is Rosemary Varley over in
University College London who has a
bunch of patients who who she's shown
this that they're just um so that that
sort of combination suggests that
language isn't necessary for thinking it
it it doesn't mean you can't think in
language you could think in language
because language allows a lot of
expression but it's just you don't need
it for thinking it's it suggests that
language is separate is a separate
system this is kind of blowing my mind
right now I'm trying to load that in
because it has implications for large
language models it sure does and they've
been working on that well let's take a
stroll there you wrote that the best
current theories of human language are
arguably large language models so this
has to do with form it's a kind of a big
Theory and uh but the reason it's
arguably the best is that it it does the
best at predicting what's English for
instance it's it's like incredibly good
you know better than any other Theory
it's so you know but you know we don't
you know there's it's not sort of
there's not enough detail well it's opaque
like there's not you don't know what's
going on you what's going on it's
another black box but I think it's you
know it is a theory what's your
definition of a theory because it's a
gigantic it's a gigantic black box with
you know a very large number of
parameters controlling to me Theory
usually requires uh a Simplicity right I
don't know maybe I'm just being loose
there I I think it's a it's not not it's
not a great Theory but it's a theory
it's a good theory in one sense in
that it covers all the data like
anything you want to say in English it
does and so that's why it's that's how
it's arguably the best is that no other
theory is as good as a large language
model in in predicting exactly what's
good and what's bad in English you know
you now you're saying is it a good
theory well probably not you know
because I I want a smaller Theory than
that it's too big I agree you could
probably construct mechanism by which it
can generate a simple explanation of a
particular particular language like a
set of rules something like a i it could
generate a a dependency grammar for a
language right yes you could
probably uh you could probably just ask
it about itself well you know that's I
mean that
presumes and there's some evidence for
this that some large
language models are implementing
something like dependency grammar inside
them and so there's work from a guy
called Chris Manning and colleagues over
at um Stanford in natural language and
they looked at I don't know how many
large language model types but certainly
Bert and some others where and and where
where you do some kind of fancy math to
figure out exactly what the sort of what
kind of abstractions of representations
are going on and they and they were
saying it does look like dependency
structure is is what they're
constructing it doesn't like so it's
actually a very very good map so kind of
a they are constructing something like
that
um does it mean that you know that
they're using that for meaning I mean
probably but we don't know you write
that the kinds of theories of language
that llms are closest to are called
construction based theories can you
explain what construction based theories
are it's just a general theory of
language such that uh there's a form and
a meaning pair for um for lots of pieces
of the language and so it's it's it's
primarily usage based is a construction
grammar it's just it's trying to deal
with the things that people actually say
actually say and actually write and so
that's it's a usage based idea and
what's a construction a construction is
either a simple word so like a
morpheme plus its meaning or a
combination of words it's basically
combinations of words like the the rules
so but it's it's um it's uh un
unspecified as to what the form of the
grammar is under underlyingly and so I I
would I I would argue that dependency
grammar is maybe the the right form to
use for the types of construction
grammar construction grammar typically
um isn't kind of formalized quite and so
maybe a formalization
of that might be in dependency grammar
uh I mean I I would think so but I mean
it's up to people other researchers in
that area if they agree or not so do you
think that large language models
understand language are they mimicking
language I guess the deeper question
there is are they just understanding the
surface form or do they understand
something deeper about the meaning that
then generates the form I mean I would
argue they're doing the form they're
doing the form doing it really really
well and uh are they doing the meaning
no probably not I mean there's lots of
these examples from various groups
showing that they can be tricked in all
kinds of ways they really don't
understand the the meaning of what's
going on and so there's a lot of
examples that he and other groups have
given
which just which show they don't really
understand what's going on so you you
know the Monty Hall problem is this
silly problem right where you know if uh
you have three door it's it's Let's Make
a Deal is this old game show and there's
three doors and there's a prize behind
one and there's some junk prizes behind
the other two and you're trying to
select one and if you you know he knows
Monty he knows where the target item is
the good thing he knows everything is
back there and you're supposed to he he
gives you a choice you choose one of the
three and then he opens one of the doors
and it's some junk prize and then the
question is should you trade to get the
other one and and the answer is yes you
should trade because he knew which ones
he could turn around and so now the
odds are two-thirds okay um and then if you
just change that a little bit to the
large language model large language
models have seen that
explanation so many times that it just
if you change the story it's a little
bit but it make it sound like it's the
Monty Hall problem but it's not you just
say oh um there's three doors and one
behind there is a good prize and there's
two bad doors I happen to know it's
behind door number one the good prize
the car is behind door number one so I'm
going to choose door number one Monty
Hall opens door number three and shows
me nothing there should I trade for door
number two even though I know the good
prize is in door number one and
then the large language model says yes you
should trade because it's a it's it just
goes through the the the the the forms
that it's seen before so many times on
these cases where it yes you should
trade because you know your odds have
shifted from one in three now to two
out of three to being that thing it
doesn't have any way to remember that
actually you have 100% probability
behind that door number one you know
that that's not part of the of the the
scheme that it's seen hundreds and
hundreds of times before and so it can't
you can't even if you tried to explain
to it that it's wrong that they can't do
that it'll just keep giving you back the
the problem but it's also possible that
a larger language model would be aware
of the fact that there's sometimes over
representation of a of a particular kind
of formulation MH and it's easy to get
tricked by that and so you could see if
they get larger and larger models be a
little bit more skeptical so you see a
over representation so like you it just
feels like form
can training on form can go really far
in terms of being able to generate uh
things that look like the thing
understands deeply the underlying
world model mhm of the kind of
mathematical
World physical world psychological world
that would generate these kinds of
sentences it just feels like you're
creeping close to the meaning part
easily fooled all this kind of stuff but
that's humans too so it just seems
really impressive how often it seems
like it understands Concepts
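The Monty Hall arithmetic from this exchange can be checked with a short simulation. This is a minimal sketch, not anything the speakers ran: the function names (`play_standard`, `win_rate`, `play_known_prize`) and the trial count are assumptions made for illustration. It covers both the standard game, where switching wins about two times in three, and the altered story, where the contestant already knows the prize is behind door number one.

```python
import random

def play_standard(switch, rng):
    """One round of the standard game; returns True if the contestant wins."""
    doors = [1, 2, 3]
    prize = rng.choice(doors)
    pick = rng.choice(doors)
    # The host knows where the prize is and opens a junk door
    # that the contestant did not pick.
    opened = rng.choice([d for d in doors if d != pick and d != prize])
    if switch:
        # Trade for the one remaining closed door.
        pick = next(d for d in doors if d != pick and d != opened)
    return pick == prize

def win_rate(switch, trials=100_000, seed=0):
    """Estimate the probability of winning with or without switching."""
    rng = random.Random(seed)
    wins = sum(play_standard(switch, rng) for _ in range(trials))
    return wins / trials

def play_known_prize(switch):
    """The altered story: the contestant knows the prize is behind door 1,
    picks door 1, and the host opens door 3."""
    prize, pick, opened = 1, 1, 3
    if switch:
        pick = next(d for d in (1, 2, 3) if d != pick and d != opened)
    return pick == prize

if __name__ == "__main__":
    print("standard game, switch:", win_rate(True))
    print("standard game, stay:  ", win_rate(False))
    print("known prize, switch wins:", play_known_prize(True))
    print("known prize, stay wins:  ", play_known_prize(False))
```

In the altered story there is nothing random left: staying always wins and trading always loses, which is why answering "yes you should trade" there reflects the familiar form of the puzzle and not its content.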
I I mean you don't have to convince me
of that I'm I am very very impressed but
does it does do I mean you're you're
giving a possible world where maybe
someone's going to train some other
version such that it'll be somehow
abstracting away from types of forms I I
mean I don't think that's happened and
so well no no no I I'm not saying that I
I I think when you just look at
anecdotal examples and just showing a
large number of them where it doesn't
seem to understand yeah it's easily
fooled yes does not seem like a
scientific um data driven like analysis
of like how many places is a damn
impressive in terms of meaning and
understanding and how many places is
easily fooled and like that's not the
inference yeah so I don't want to make
that the inference I don't I wouldn't
want to make was that in the inference
I'm trying to push is just that is it is
it like humans here it's probably not
like humans here it's different so
humans don't make that error if you
explain that to them they're not going
to make that error you know they don't
make that error and so that's something
it's doing something different from
humans are that they're doing in that
case what what what's the mechanism by
which humans figure out that it's an
error I'm just saying the error there is
like if I explain to you there's 100%
chance that the car is behind this case
this door will do you want to trade
people say no but this thing will say
yes because it's so trick it's that that
trick it's so wound up on the form that
it's that that's an error that a human
doesn't make which is kind of
interesting less likely to make I should
say yeah less likely because like humans
are very oh yeah you're asking you know
you're asking humans to you're asking a
system to understand 100% like you're
asking some mathematical Concepts and so
like look the places where large
language models are the form is amazing
so let's go back to nested structure
Center embedded structures okay if you
ask a human to complete those they can't
do it neither can a l large language
model they're just like humans in that
if you ask if I ask a large language
model that's fascinating by the way that
the central embedding the central
embedding is is it struggles with just
like exactly like humans exactly the
same way as humans they and that's not
trained so they do exactly so there so
that is a similarity so but then it's
it's that's not meaning right this is
form but when we get into meaning this
is where they get kind of messed up when
you start to saying oh what's behind
this door oh it's you know this is the
thing I want humans don't mess that that
up as much you know here they they the
form is it's just like the form of the
match is amazing is similar without
being trained to do that I mean it's
trained in the sense that it's getting
lots of data which is just like human
data but it's not being trained on uh
you know bad sentences and being told
what's bad it just can't do those it'll
actually say things like those are too
hard for me to complete or something
which is kind of interesting actually
kind of how does it know that I don't
know oh but it really often doesn't just
complete sentences it very often says
stuff that's true
MH and sometimes says stuff that's not
true and almost always the form is great
yeah but it's still very surprising that
with really great form it's able to
generate a lot of things that are
true based on what it's trained on and
so on so it's not just it's not just
form that is generating it's mimicking
true statements that's right that's
right from the internet I guess I guess
the underlying idea there is that on the
internet truth is over represented
versus falsehoods I think that's
probably right yeah so but the the the
fundamental thing is trained on you're
saying is just form it's I think so yeah
yeah I think so uh well that's a sad if
that's to me that's still a little bit
of an open question I probably lean
agreeing with you uh especially now you
just blown my mind that there's a
separate module in the brain for
language versus
thinking maybe there's a fundamental
part missing from the large language
model
approach that lacks the thinking the
reasoning
capability yeah that's what this group
argues so the same group Fedorenko's
group has a recent paper arguing exactly
that there's a guy called Kyle Mahowald who's
here in Austin Texas actually he's an
old student of mine but he's a in
linguistics at Texas and he was the
first author on that that's fascinating
still to me an open question yeah what
do you are the interesting limits of
llms you know I I don't see any limits
to their form their form is impressive
perfect yeah yeah yeah it's pretty I mean
it's close to being well you said
ability to complete Central embeddings
yeah it's just the same as humans it
seems the same as but that's not perfect
right it should be that's good no but I
want to be like humans I I'm trying to I
want a model of humans but but but oh
wait wait wait oh so perfect use is uh
as close to humans as possible I got it
yeah but you should be able to if you're
not human you're like you're super human
you should be able to complete Central
embedded sentences right I mean that's
the the mechanism is if it's modeling
some I think it's kind of really
interesting that it it's really
interesting it's more like like I think
it's potentially underlyingly modeling
something like what the the way the form
is processed the form of human language
the way how and how humans process the
language yes I think that's plausible
and how they generate language process
language and generate language that's
fascinating yeah so in that sense they're
perfect if we can just Linger on the
center embedding thing that's hard for
LLMs to produce and that seems really
impressive cuz that's hard for humans to
produce and how does that connect to the
thing we've been talking about before
which is the dependency grammar
framework in which you view language and
the finding that uh short dependencies
seem to be a universal part of language
so why is it hard to complete Center
embeddings so what I like about
dependency grammar is it
makes the cognitive cost associated with
long longer distance connections very
transparent basically it turns
out there is a cost associated with
producing and comprehending connections
between words which are not beside each
other the further apart they are the
worse it is according to well we
can measure that and there is a cost
associated with that can you just Linger
on what do you mean by cognitive cost
and how do you measure oh well you can
measure it in a lot of ways uh the
simplest is just asking people to say
whether you know how good a sentence
sounds just ask that's one way to
measure and you you try to like
triangulate then across sentences and
across structures to try to figure out
what the source of that is you can look
at um reading times in controlled
materials you know so in certain kinds
of materials when they and then we can
like measure the dependency distances
there we can there's a recent study
which looked at we're we're talking
about uh the the brain here we could
look at the language Network okay we can
look at the language Network and we
could look at the activation in the
language Network and how big the
activation is depending on the length of
the dependencies and turns out in just
random sentences that you're listening
to if you're listening to turns out
there are people listening to stories
here uh and the longer the
dependency is the
stronger the activation in the
language network and so there's
some measure there's a different there's
a bunch of different measures we could
do that's a kind of a neat measure
actually of actual activations
activation in the brain so that you can
somehow in different ways convert it to
a number I wonder if there's a beautiful
equation connecting cognitive cost and
length of dependency E = mc² kind of
thing yeah it's it's complicated but
probably it's doable I would I would
guess it's doable I you know I tried to
do that a while ago and I was reasonably
successful but some for some reason I
stopped working on that I do I agree
with you that it would be nice to figure
out so there's like some way to figure
out the the the cost I mean it's
complicated another issue you raised
before was like how do you measure
distance is it words is it it probably
isn't part of the problem is that
some words matter more than others
and probably you know meaning like nouns
might matter depend depending and then
maybe depends on which kind of noun is
it a noun we've already introduced or a
noun that's already been mentioned is it
a pronoun versus a name like like all
these things probably matter so probably
the simplest thing to do is just like
let's forget about all that and just
think about words or morphemes for sure
but there might be a kind like there
might be some insight in the kind of
function yeah that fits the data meaning
like a quadratic like what I I think
it's an exponential we think it's
exponential such that the longer the
distance the less it matters and so then
then it's the sum of those is my that
that was our best guess a while ago so
that you've got a bunch of dependencies
if you've got a bunch of them that are
being connected at some point that's at
at the ends of those the the cost is the
is some exponential function of those is
my guess but because the reason it's
probably an exponential is like it's
it's not just the distance between two
words because I can make a very very
long subject-verb dependency by adding
Lots and lots of noun phrases and
prepositional phrases and it doesn't
matter too much it's when you do nested
when I have multiple of these then then
things get go really bad go south
probably somehow connected to working
memory or something like yeah that's
probably a function of the memory here
is is the access is trying to find those
earlier things it's kind of hard to
figure out what was referred to earlier
those are those connections that's
that's the sort of notion of working memory as
opposed to a storage-y thing but trying to
connect uh retrieve those earlier
words depending on what was in between
and then then we're talking about
interference of similar things in
between that's the Right theory probably
has that kind of notion in it is
interference of similar and so I I'm I'm
dealing with an abstraction over the
Right theory which is just you know it's
count Words it's not right but it's
close and then may maybe you're right
though there's some sort of um an
exponential or something on on the on to
figure out the total so we can figure
out a function for any given for every
any given sentence in any given language
but you know it's funny you know people
haven't done that too much which I do
think is I I'm I'm interested that you
find that interesting I really find that
interesting and a lot of people haven't
found it interesting and I don't know
why I haven't got people to want to work
on that I really like that too no that's
that's a beautiful idea and the
underlying idea is beautiful that
there's a cognitive cost that correlates
with the length of dependency it just it
feels like it's a deep I mean language
is so fundamental to The Human
Experience and this is a nice clean
theory of language
MH where yeah it's like wow okay so like
we like our words close together the
dependent words close together yeah
that's why I like it too it's so simple
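The "just count words" abstraction described above can be sketched in a few lines. This is an illustrative assumption, not the formula from the research being discussed: the dependency heads below are hand-annotated for two made-up example sentences, the function name `dependency_lengths` is hypothetical, and the cost is a plain sum of word distances with no exponential weighting or interference term.

```python
def dependency_lengths(heads):
    """Return the word distance of every dependency in a sentence.

    heads[i] is the 1-based position of the head of word i+1,
    or 0 if that word is the root of the sentence.
    """
    return [abs(pos - head)
            for pos, head in enumerate(heads, start=1)
            if head != 0]

# Hand-annotated example (an assumption for illustration):
# "the reporter who the senator attacked admitted the error"
# The subject "reporter" is separated from its verb "admitted"
# by a nested relative clause.
nested_words = "the reporter who the senator attacked admitted the error".split()
nested_heads = [2, 7, 6, 5, 6, 2, 0, 9, 7]

# "the senator attacked the reporter who admitted the error"
# Same words, but no clause sits between a subject and its verb.
local_words = "the senator attacked the reporter who admitted the error".split()
local_heads = [2, 3, 0, 5, 3, 7, 5, 9, 7]

if __name__ == "__main__":
    for words, heads in [(nested_words, nested_heads), (local_words, local_heads)]:
        lengths = dependency_lengths(heads)
        print(" ".join(words))
        print("  total dependency length:", sum(lengths),
              " longest:", max(lengths))
```

The two sentences do not mean the same thing; they only share their words, so that the difference in total length comes from the arrangement and not from the vocabulary.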
yeah the Simplicity of and yet it
explains some very complicated phenomena
if you if I write these very complicated
sentences it's kind of hard to know why
they're so hard and you can like oh nail
it down I can do like give you a math
formula for why each one of them is bad
and where and that's kind of cool I
think that's very neat have you gone
through the process is there like a you
you take a piece of text and then
simplify sort of like there's an average
uh length of dependency and then you
like uh you know uh reduce it and see
comprehension on the entire not just a
single sentence but like you know you go
from James Joyce to Hemingway or
something no no simple answer is no that
does there's probably things you can do
in that in that kind of Direction that's
fun we might you know we're going to
talk about legalese at some point and
so maybe we'll talk about that kind
of
thinking applied to legalese but
let's talk about legalese because you
mentioned that as an exception we just
taking tangent upon tangent that's an
interesting one you given as an
exception it's an exception uh that you
say that most natural languages as we've
been talking about have local
dependencies with one exception legalese
that's right so what is legalese first of
all oh well legalese is what you think it
is it's just any legal language I mean
like I actually know know very little
about the kind of language that lawyers
use so I'm just talking about language
in laws and language in contracts got it
so the stuff that you have to run into
we have to run into every other day or
every day uh and you skip over because
it reads poorly and or you know partly
it's just long right there's a lot of
texts there that we don't really want to
know about and so but the the thing I'm
interested in so I I've been working
with um this guy called Eric Martinez
who is a um he was a lawyer who was
taking my class I was teaching a
psycholinguistics lab class and I have
been teaching it for a long time at MIT
and he's a he was a law student at
Harvard and he took the class because he
had done some Linguistics as an
undergrad and he was interested in the
problem of why legalese sounds hard to
understand you know why and and so why
is it hard to understand and why do they
write that way if it is so hard to
understand it seems apparent that it's
hard to understand the question is why
is it and so we we didn't know and uh we
did uh an evaluation of a bunch of
contracts actually we just took a bunch
of sort random contracts because I don't
know you know there's contracts in laws
might not be exactly the same but uh
contracts are kind of the things that
most people have to deal with most of
the time and so that's kind of the most
common thing that humans have like
humans that that adults in our
industrialized Society have to deal with
a lot and so so that's what we we pulled
and we we didn't know what was hard
about them but it turns out that the way
they're written is very center-embedded
has nested structures in them so it has
low frequency words as well that's not
surprising lots of texts have low it
does have surprisingly slightly lower
frequency words than other kinds of
control texts even sort of academic text
legalese is even worse it is the worst
that that we were able to find you just
you just reveal the game that lawyers are
playing they're optimizing a different
well you know it's interesting that's a
that's a now you're getting at why and
so and I don't think so now you're
saying they're doing it intentionally
I don't think they're doing it
intentionally but let's um
it's an emergent phenomenon okay yeah
yeah we'll get to that we'll get to that
and so but but we wanted to see why see
what first as oppos so like because it
turns out that we're not the first to
observe that legalese is weird like
back to uh Nixon had a plain language
act in 1970 and Obama had one and
uh boy a lot of these you know a lot of
presidents have said oh we've got to
simplify legal language must simplify it
but if you don't know how it's
complicated it's not easy to simplify it
you need to know what it is you're supposed
to do before you can fix it right and so
you need to like you need a
psycholinguist to analyze the text and see
what's wrong with it before you can like
fix it you don't know how to fix it how
am I supposed to fix something I don't
know what's wrong with it and so what we
did was just that's what we did we
figured out what's okay we just took a
bunch of contracts had people uh and we
encoded them for a bunch of
features and so
one of them was center embedding and
so uh that is like basically how often a
clause would
intervene between a subject and a verb
for example that's one kind of center
embedding of a clause okay and um turns out
they're massively center-embedded like so
I think in random contracts and in
random laws I think you get about 70% or
80 something like 70% of sentences have
a center embedded clause which is
insanely high if you go to any other
text it's down to 20% or something it's
it's so much higher than any
control you can think of including you
think oh people think oh technical
um academic text no people don't write
center-embedded sentences in technical
academic text I mean they do a little
bit but it's in the 20% 30%
realm as opposed to 70 and so
there's that and and there's low
frequency words and then people oh maybe
it's passive people don't like the
passive passive for some reason the
passive voice in English has a bad rap
and I'm not really sure where that comes
from um and there is a lot of passive in
uh there's much more passive voice
in legalese than there is in
passive voice accounts for some of the
low frequency words no no those are
separate those are separate oh so
passive voice sucks low frequency words
suck is different so these are different
judgments yeah yeah yeah drop the
judgment it's just like these are
frequent these are things which happen
in legalese texts then we can ask the
dependent measure is like how well you
understand those things with those
features okay and so then and it turns
out the passive makes no difference so
it has a Zero Effect on your
comprehension ability on your recall
ability no nothing at all that has no
effect you're the the words matter a
little bit they do low frequency words
are going to hurt you in recall and
understanding but what really what
really hurts is the center embedding that kills
you that is like that slows people down
that makes them very
poor at understanding that makes them uh
they can't recall what was
said nearly as well and we
did this not only on lay people we did it
on a lot of lay people we ran it on a
100 lawyers we recruited lawyers from a
from a wide range of of um um sort of
different levels of law firms and stuff
and they have the same pattern so they
also like why when when when they did
this I did not know it would happen I
thought maybe they could process they're
used to legalese they process it just as
well as if it was normal no no they
they're much better than lay people so
they can much better
recall much better understanding but
they have the same main effects as
lay people exactly the
same so they also much prefer the
non-center-embedded so we oh we constructed
non-center-embedded versions of each of
these we constructed versions which have
um higher frequency words in those
places and we
un-passivized we turned them into active
versions the passive active made no
difference the words made a little
difference and the un-center-embedding
makes big differences in all the
populations un-center-embedding how hard
is that process by the way dumb
question but how hard is it to detect
center embedding oh easy easy to detect
just looking at long dependencies or
you can just you can so there's
automatic parsers for English which are
pretty good very and they can detect
center embedding or I guess nesting perfectly yeah
yeah pretty much so you're not just
looking for long dependencies you're
just literally looking for center embedding
yeah we are in this case in these cases
but long dependencies are they're highly
correlated these two things so like a
center embedding is a big bomb you
throw inside inside of a sentence that
just blows up the that that makes can I
read a sentence for for you from these
things I I see I I mean this is just
like one of the things said this is just
my eyes might glaze over in the middle
mid-sentence
no I understand that I mean
legalese is hard this is it goes in
the event that any payment or benefit by
the company all such payments and
benefits including the payments and
benefits under section 3(a) hereof being
hereinafter referred to as the
total payments would be subject to the
excise tax then the cash severance
payments shall be reduced so that's
something we pulled from a regular text
from a from a contract wow and and and
the center-embedded bit there is just for
some reason there's a definition they
throw the definition of what payments
and benefits are in between the subject
and the verb let's how about don't do
that how about put the definition
somewhere else as opposed to in the
middle of the sentence and so that's
that's very very common by the way
that's that's what happens you just
throw your definitions you use a word a
couple words and then you define it and
then you continue the sentence like just
don't write like that and and you ask so
then we ask lawyers we thought oh maybe
lawyers like this lawyers don't like
this
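The feature described above, a clause intervening between a subject and its verb, can be checked mechanically once a sentence has a dependency parse. The following is a rough sketch under stated assumptions, not the coding scheme used in the study: it expects a parse that is already available (for example from an automatic parser), the relation labels follow Universal Dependencies style, and the names `CLAUSE_RELATIONS` and `has_center_embedded_clause` are hypothetical.

```python
# Relation labels that mark the head of an embedded clause
# (Universal Dependencies style; the exact set is an assumption).
CLAUSE_RELATIONS = {"acl", "acl:relcl", "advcl", "ccomp", "csubj"}

def has_center_embedded_clause(heads, relations):
    """Return True if some subject is separated from its verb by a clause.

    heads[i] is the 1-based position of the head of word i+1 (0 for the root);
    relations[i] is the dependency label of word i+1.
    """
    for pos, (head, rel) in enumerate(zip(heads, relations), start=1):
        if not rel.startswith("nsubj") or head == 0:
            continue
        low, high = sorted((pos, head))
        # Look at the words strictly between the subject and its verb.
        for middle in range(low + 1, high):
            if relations[middle - 1] in CLAUSE_RELATIONS:
                return True
    return False

# Hand-annotated parses (assumptions for illustration).
# "the reporter who the senator attacked admitted the error"
nested_heads = [2, 7, 6, 5, 6, 2, 0, 9, 7]
nested_rels = ["det", "nsubj", "obj", "det", "nsubj",
               "acl:relcl", "root", "det", "obj"]

# "the senator attacked the reporter who admitted the error"
local_heads = [2, 3, 0, 5, 3, 7, 5, 9, 7]
local_rels = ["det", "nsubj", "root", "det", "obj",
              "nsubj", "acl:relcl", "det", "obj"]

if __name__ == "__main__":
    print("nested:", has_center_embedded_clause(nested_heads, nested_rels))
    print("local: ", has_center_embedded_clause(local_heads, local_rels))
```

A parenthetical definition dropped between subject and verb, as in the contract sentence read out above, would need its own relation labels added to the set; which labels a given parser assigns to such material is not something this sketch settles.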
they don't like this
they don't want to write like this
we asked them to rate materials
with the same meaning
un-center-embedded and center-embedded and they
much preferred the un-center-embedded versions
on the comprehension on the reading side
yeah well and we asked them we asked
them would you hire someone who writes
like this or this we asked them all
kinds of questions and they always
preferred the less complicated version
all of them so I don't even think they
want it this way yeah but how did it
happen how did it happen that's a very
good question and and and the answer is
I still don't know but I have some
theories well our our best theory at the
moment is that there's there's actually
some kind of a performative meaning in
the center embedding in the style which
tells you it's legalese we think that
that's the kind of a style which tells
you it's legalese like that's a it's a
reasonable guess and maybe it's just so
for instance if you're like it's like a
magic spell so we kind of call this a
magic spell hypothesis so when you give
them when you tell someone to put a
magic spell on someone what what do you
do they you know people know what a
magic spell is and they they do a lot of
rhyming you know that's that's kind of
what people will tend to do they'll do
rhyming and they'll do sort of like some
kind of poetry kind of thing Abracadabra
type of thing exactly yeah and um maybe
that's there's a syntactic sort of
reflex here of a of a magic spell which
is center embedding and so that's like oh
it's trying to like tell you this is
like this is something which is true
which is what the goal of law law is
right is telling you something that we
want you to believe as certainly true
right that's that's what legal contracts
are trying to enforce on you right and
so maybe that's like a a form which has
this is like an abstract very abstract
form center embedding which has
a meaning associated with it well don't
you think there's an incentive for
lawyers to generate things that are hard
to understand that was one of our
working hypotheses we just couldn't find
any evidence of that no lawyers also
don't understand it but you're creating
space why
you I mean you ask in a communist Soviet
Union the individual members uh their
self-report is not going
to correctly reflect what is broken
about the gigantic bureaucracy that
leads to Chernobyl or something like
this um I think the incentives under
which you operate are not always
transparent to the members within
that system so like it just feels like a
strange coincidence that like there is
benefit if you just zoom out and look at
the system as opposed to ask individual
lawyers that making something hard to
understand is going to make a lot of
people money yeah like there's going to
you're going to need a
lawyer uh to figure that out I guess
from the perspective of the individual
but then that could be the performative
as it could be as opposed to the
incentive driven to be complicated could
be performative to where we lawyers
speak in this sophisticated way and you
regular humans don't understand it so
you need to hire a lawyer yeah I don't
know which one it is but it's
suspicious suspicious that it's hard to
understand and everybody's eyes glaze
over and they don't read I I'm
suspicious as well I'm still suspicious
and I I hear what you're saying it could
be kind of you know no individual and
even average of individuals it could
just be a few bad apples in a way which
are driving the effect in some way
influential bad apples at the sort of uh
that everybody looks up to or whatever
they like Central figures and and how
you know but it turns but it is it is
kind of interesting that among our 100
lawyers they did not share they didn't
want this they really didn't like it so
it they weren't better at than regular
people at comprehending it or they were
on average better but they had the same
difference the same same difference
exact same difference so they but I they
wanted it fixed so they they also and so
that that gave us hope that because it
actually isn't very hard to construct
a a material which is uncenter embedded
and has the same meaning it's not very
hard to do just basically in that
situation you're just putting
definitions outside of the subject verb
relation in that particular example and
that's kind of that's pretty general
what they're doing is just throwing
stuff in there which you didn't have to
put in there it there's extra words
involved typically um you may need a few
extra words sort of to refer to the
things that you're defining outside in
some way or as a because if you only use
it in that one sentence then there's no
reason to introduce extra extra terms
but so we might have a few more words
but it'll be easier to understand um so
I mean I I have hope that now that may
maybe we can make legalese less uh
less convoluted in this so maybe the the
next president of the United States can
instead of saying generic things say
exactly I ban uh cent center
embeddings and make Ted the uh the
language czar of the Eric Martinez is the
guy you should really put in there yeah
yeah
I mean but center embeddings are the the the
bad thing to have that's right so if you
get rid of that that'll do a lot of it
that fixes a lot it's fascinating that is
so fascinating yeah um and it just
really fascinating on many fronts that
humans are just not able to deal with
this kind of thing and that language
because of that evolved in the way it
did it's fascinating so one of the
mathematical formulations you have when
talking about language as communication is
uh this idea of noisy channels
what's a noisy channel so that's about
communication and so this is going back
to Shannon so Shannon Claude Shannon was
a uh a student at MIT in the 40s and so
he wrote this very influential piece of
work about communication Theory or
information Theory and uh he was
interested in human language actually he
was trying to he was interested in this
problem of communication of getting a a
a message from my head to to your head
and and so and he he was concerned or
interested in um what was a robust way
to do that and so that assuming we both
speak the same language we both already
speak English whatever you know whatever
the language is we we speak that what is
a way that I can say the language so
that it's most likely to get the signal
that I want uh to you and so and then
the the problem there in the
communication is the noisy channel is
that there's I make there's a lot of
noise no in the system I don't speak
perfectly I make errors that's noise um
there's background noise you know you
know that as like a literal literal
background noise there is like white
noise in the background or some other
kind of noise or some speaking going on
that you or just you're at a party
that's background noise you're trying to
hear someone it's hard to understand
them because there's all this other
stuff going on in the background um and
and then there's noise on the
communication on the commun on the uh
receiver side so that you have some
problem maybe understanding me for stuff
that's just internal to you in some way
so you've got some other problems
whatever with uh understanding for
whatever reasons maybe you're maybe you
had too much to drink you know who knows
why you're not able to pay attention to
the signal so that's the noisy Channel
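
[A minimal sketch of the noisy channel idea just described, under assumptions that are not from the conversation: words are dropped independently, the three noise sources (speaker, background, listener) are independent, and the listener has a small hypothetical prior over what the speaker might have meant. The function names, the prior and the probabilities are all illustrative.]

```python
def survive_prob(p_speaker, p_background, p_listener):
    """Chance one word gets past all three (assumed independent) noise sources."""
    return (1 - p_speaker) * (1 - p_background) * (1 - p_listener)


def count_embeddings(heard, intended):
    """Number of ways `heard` can arise from `intended` by deleting words."""
    ways = [1] + [0] * len(heard)
    for word in intended:
        # Go right to left so each intended word is used at most once per path.
        for j in range(len(heard), 0, -1):
            if heard[j - 1] == word:
                ways[j] += ways[j - 1]
    return ways[len(heard)]


def likelihood(heard, intended, keep):
    """P(heard | intended) under a deletion-only channel (an assumption)."""
    dropped = len(intended) - len(heard)
    if dropped < 0:
        return 0.0
    paths = count_embeddings(heard, intended)
    return paths * keep ** len(heard) * (1 - keep) ** dropped


def posterior(heard, prior, keep):
    """Bayes rule: P(intended | heard) is proportional to prior times likelihood."""
    words = heard.split()
    scores = {s: p * likelihood(words, s.split(), keep) for s, p in prior.items()}
    total = sum(scores.values())
    return {s: v / total for s, v in scores.items()} if total else scores


def decode(heard, prior, keep):
    """Listener's best guess at what the speaker meant."""
    post = posterior(heard, prior, keep)
    return max(post, key=post.get)


# Hypothetical prior over intended messages (made up for illustration).
prior = {
    "i want two of those": 0.6,
    "i want those": 0.3,
    "you want two of those": 0.1,
}
keep = survive_prob(p_speaker=0.05, p_background=0.2, p_listener=0.1)
print(decode("want two those", prior, keep))
```

[In this toy setup the listener recovers a full sentence from a fragment by weighing how plausible each intended message is against how likely the noise was to produce what was heard. A fuller model would also allow insertions and substitutions and would treat the three noise sources differently.]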
and so so that language if it's
Communication System we are trying to
optimize in some sense the the passing
of the message from one side to the
other and um so it turn I mean one idea
is that maybe you know aspects of like
word order for example might have
optimized in some way to to make
language a little more easy to be passed
from speaker to listener so Shannon's
the guy that did this stuff way back in
the 40s he you know it's very
interesting you know historically he was
interested in working in linguistics he
was in at MIT and he did this as his
master's thesis of all things you know
it's crazy how much how much he did for
his master's thesis in 1948 I think or
49 something and and he wanted to keep
working in language and it it just
wasn't a popular communication
as a as a reason a source for what
language was wasn't popular at the time
so Chomsky was becoming was moving in
there he was and he just wasn't able to
get a a handle there I think and so uh
and so he moved to Bell Labs and worked on
communication uh from a mathematical
point point of view and was you know uh
did all kinds of amazing work and so
he's just more on the signal side versus
like the language side yeah yeah it
would have been interesting to see if he
pursued the language side that's really
interesting he was interested in that his
examples in the 40s are are are kind of
like they're very language like like
things yeah we can kind of show that
there's a noisy Channel process going on
in when you're listening to me you know
you you can often sort of guess what I
meant by what I you know uh what you
think I meant given what I said um and I
I mean with respect to sort of why
language looks the way it does we might
there might be sort of as I I alluded to
there might be ways in which word order
is is somewhat optimized for for because
of the noisy channel in some way I mean
that's really cool to sort of model if
you don't hear certain parts of a
sentence or have a some probability of
missing that part like how do you
construct a language that's resilient to
that that's somewhat robust to that yeah
that's the idea and then you're kind of
saying like the word order and the
syntax of language the dependency length
are all helpful yeah well the dependency
length is is really about memory really
I think that's like about sort of what's
easier or harder to produce in some way
and these other ideas are about sort of
robustness to communication so the
problem of potential loss of loss of
signal due to noise it's so that there
there may be aspects of word order which
is somewhat optimized for that and and
you know we have this one guess in that
direction and these are kind of just-so
stories I have to be you know pretty
frank they're not like I can't show this
is true all we can do is like look at
the current languages of the world this
is a like we can't sort of see how
languages change or anything because
we've got these snapshots of a few you
know hundred or a few thousand languages
we don't have we don't really we we
can't do the right kinds of
modifications to test these things
experimentally and so you know so just
take that this with a grain of salt okay
from here this this stuff the dependency
stuff I can I'm much more solid on and
like here's what the lengths are and
here's and here's what's hard here's
what easy and this is a reasonable
structure I think I'm pretty reasonable
here's like why you know why does the
word order look the way it does is we're
now into shaky territory but it's kind
of cool but we're talking about just to
be clear we're talking about maybe just
actually the sounds of communic like you
and I are sitting in the bar it's very
loud and you you uh model with a noisy
channel the loudness the noise and we
have the signal that's coming across
that and you're saying word order might
have something to do with optimizing
that when there's a presence of noise
yes I it's really interesting I mean to
me it's interesting how much you can
load into the noisy Channel like how
much can you bake in you said like you
know cognitive load on the receiver end
we think that those are there's three at
least three different kinds of things
going on there and we probably don't
want to treat them all as the same and
so I think you you know the right Model
A Better model of a noisy channel would
treat would have three different Source
sources of noise which because which are
background noise you know speaker
speaker um inherent noise and listener
inherent noise and those are not the
those are all different things sure but
then underneath it there's a million
other subsets like what that's true on
on the receiving I mean I just mentioned
cognitive load on both sides then
there's like uh speaking uh Speech
impediments or just everything uh world
view I mean the meaning will start to
creep into the meaning realm of like we
have different World Views well how
about just form still though like just
just what language you know like so how
well you know the language and so if
it's second language for you versus
first language and and how maybe what
other languages you know these are still
just form stuff and that's like
potentially very informative and and you
know how old you are these things
probably matter right so like a child
learning language is is a you know has a
noisy representation of English grammar
uh you know depending on how old they
are so maybe when they're six they're
perfectly formed but uh you mentioned
one of the things is like a way to
measure the the a language is learning
problems so like what's the correlation
between everything we've been talking
about and how easy it is to learn a
language so is is uh like a short
dependencies correlated to ability to
learn language is there some kind of or
like the dependency grammar is there
some kind of connection there how easy
it is to learn yeah well all the
languages in the world's language none
is right now we know is any better than
any other with respect to sort of
optimizing dependency lengths for
example they're all kind of do it do it
well they all keep low it's so I I think
of every human language is some kind of
an opt sort of an optimization problem a
complex optimization problem to this
communication problem and so they're like
they've solved it and you know they're
just sort of noisy solutions to this
problem of communication there's just so
many ways you can do this so they're not
optimized for learning they're probably
optimized for communication and and learning so yes
one of the factors which yeah so
learning is messing this up a bit and so
so for example if it were just about
minimizing dependency lengths and and
that was all that matters you know then
we you know so then then we might find
grammars which didn't have regularity in
their rules like but languages always
have regularity in their rules so so
what I mean by that is that if if I
wanted to say something to you and the
in the optimal way to say it was what
really mattered to me all that mattered
was keeping the
dependencies as close together as
possible Then I then I would have a very
lax set of phrase structure rules or or
dependency rules I wouldn't have very
many of those I would have very little
of that and I would just put the words
as close the things that refer to the
things that are connected right beside
each other but we don't do that like
there like there are word order rules
right so they're very and depending on
the language they're more and less
strict right so you speak Russian
they're less strict than English English
is very rigid word order rules we order
things in a very particular way and and
so why do we do that like that's
probably not about um communication
that's probably about learning I mean
then we're talking about learning it's
like probably easier to learn regular
regular things things which are very
predictable and easy to so so that's
that's probably about learning is my is
our guess because that can't be about
communication just noise can it be just
the the messiness of the development of
a language well if it were just a
communication then we we should have
languages which have very very free word
order and we don't have that we have
freer but not free like there's always
well no but what I mean by noise is like
cultural like sticky cultural things
like the way the way you communicate
just there there's a stickiness to it
that it's it's an imperfect it's a noisy
op it's
stochastic yeah the the function over
which you're optimizing is very noisy
yeah so uh because I don't it feels
weird to say that learning is part of
the objective function cuz some languages
are way harder to learn than others
right or is that that's not true that's
interesting I mean that's the public
sort of perception right yes that's true
for a second language for second
language but that depends on what you
started with right so so it's it really
depends on how close that second
language is to the first language you've
got and so yes it's very very hard to
learn Arabic if you started with English
or it's hard to you hard to learn
Japanese or if you started with Chinese
I think is the worst in the there's like
Defense Language Institute in the in the
United States has like a a list of of
how hard it is to learn what language
from English I think Chinese is the this
is a second Lang you're saying babies
don't care no there's no evidence that
there's anything harder or easier about
any baby any language learned like it's
by three or four they speak that
language and so there's no evidence of
any anything harder or easier about any
human language they're all kind of equal
so to what degree is language this is
returning to Chomsky a little bit is is
innate you said that for Chomsky he used
the idea that language is some aspects
of language are innate to explain away
certain things that are observed but do
how much are we
born with language at the core of our
mind
brain I mean I I you know the answer is
I don't know of course but uh the uh I
mean I I like to I'm an engineer at
heart I guess and I sort of think it's
fine to postulate that a lot of it's
learned and so I I'm guessing that a lot
of it's learned so I think the reason
Chomsky went with
innateness is because he he he
hypothesized movement in his grammar he
was interested in grammar and movement's
hard to learn I think he's right
movement is a hard it's a hard thing to
learn to learn these two things together
and how they interact and there's like a
lot of ways in which you might generate
exactly the same sentences and it's like
really hard and so he's like oh I guess
it's learned sorry I guess it's not
learned it's innate and um if you just
throw out the movement and just think
about that in a different
way you know then you you get some
messiness but the messiness is human
language which it it actually fits
better it's that messiness isn't a
problem it's actually a a a a a it's a
valuable asset of of uh of the theory
and so so I think I don't really see a
reason
to postulate much much innate structure
and that's kind of why I think these
large language models are learning so
well is because I think you can learn
the form the forms of human language
from the input I think that's like it's
likely to be true so that part of the
brain that lights up when you're doing
all the comprehension that could be
learned that could be just you don't
need you don't need doesn't have to be
innate so like lots of stuff is um
modular in the brain that's learned it
doesn't have to you know so there's
something called the visual word form
area in the back and so it's in the back
of your head near the you know the
visual cortex okay and that is very
specialized language sorry very
specialized brain area which
does um visual word processing if you
read if you're a reader okay if you
don't read you don't have it okay guess
what you spend some time learning to
read and you develop that that brain
area which does exactly that and so
these the modularization is not
evidence for innateness so the
modularization of a language area
doesn't mean we're born with it it we
could have easily learned that I I we
might have been born with it I I we we
just don't know at this point we might
very well have been born with this left
lateralized area I mean that there's
like a lot of other interesting
components here features of this kind of
argument so some people get a stroke or
something goes really wrong on the left
side where the left where the language
area would be and that and that isn't
there it's not not available and it
develops just fine in the right so it's
no long so it's not about the left it
goes to the left like this is a very
interesting question it's like
why is the why are any of the brain
areas the way that they are and how how
how did they come to be that way and uh
you know there's these natural
experiments which happen where people
get these you know strange events in
their brains at very young Ages which
wipe out sections of their brain and and
they behave totally normally and no one
knows anything was wrong and we find out
later because they happened to be
accidentally scanned for some reason
it's like what what's happened to your
left hemisphere it's missing there's not
many people who miss their whole left
hemisphere but they'll be missing some
other section of their left or their
right and they behave absolutely
normally would never know so that's like
a very interesting you know current
research you know this is another
project that this person Fedorenko is working
on she's got all these people contacting
her because she's scanned some people
who have been missing sections one person
missing missed a section of her brain
and was scanned in her lab and and she
and she happened to be a writer for the
New York Times and there was a article
in New York Times about about uh the uh
just about the scanning procedure and
and about what might be learned about by
sort of the general process of MRI and
language and necess language and and
because she's writing for the New York
Times then all these people started
writing to her who who also have similar
similar kinds of deficits because
they've been you know accidentally you
know scanned for some reason and uh and
found out they're missing some section
they they say they volunteer to be
scanned these are natural experiments
natural experiments they're kind of
messy but natural experiments kind of
cool she calls them interesting brains
the first few hours days months of human
life are fascinating like uh well inside
the womb actually like that
development that Machinery whatever that
is seems to create powerful humans that
are able to speak comprehend think all
that kind of stuff no matter what happen
not no matter what but robust to the
different ways that uh um the the the
brain might be damaged and so on that's
that's really that's really interesting
but yeah what what would Chomsky say
about the fact the thing you're saying
now that the
language is seems to be happening
separate from thought because as far as
I understand maybe you can correct me he
thought that language underpins yeah he
thinks so I don't know what he'd say he
would be surprised cuz for him the idea
is that language is a sort of the
foundation of thought that's right
absolutely and it's
pretty uh mind-blowing to think that it
could be completely separate from
thought that's right but so you know
he's basically a philosopher philosopher
of language in a way thinking about
these things it's a fine thought you
can't test it in his methods you can't
do do a thought experiment to figure
that out you need a scanner you need
brain-damaged people you need something
you need ways to to measure that and
that's what you know fMRI offers as a
and and you know patients are a little
messier fMRI is pretty unambiguous I'd
say it's like very unambiguous there is
no way to say that the language network
is doing any of these tasks there's like
you you should look at those data it's
like there's no chance that you can say
that there those networks are
overlapping they're not overlapping
they're just like completely different
and so uh you know so the you know you
can always make you know it's only two
people it's four people or something for
the the patients and there's something
special about them we don't know but
these are just random people and and
with lots of them and you find always
the same effects and it's very robust
I'd say what's a fascinating effect uh
what's the you mentioned Bolivia M uh
what's the connection between culture
and
language
uh you've uh you've also mentioned that
you know much of our study of language
comes from uh W-E-I-R-D WEIRD people
Western educated industrialized rich and
democratic so when you study like remote
cultures such as around the Amazon
jungle what can you learn about language
so uh that term WEIRD is from Joe
Henrich he's at uh Harvard he's a
Harvard evolutionary biologist and so he
works on lots of different topics and uh
he basically was pushing that
observation that we should be careful
about the inferences we want to make
when we're talk in Psychology or soci
yeah mostly in Psychology I guess about
humans if we're talking about you know
undergrads at MIT and Harvard those
aren't the same right these aren't the
same things and so if you want to make
inferences about language for instance
you there's a lot of VAR a lot of other
kinds of languages in the world than
English and French and Chinese you know
and so maybe in for for for language we
care about how culture because cultures
can be very I mean of course English and
Chinese cultures are very different but
in you know hunter gatherers are much
more different in in some ways and so
you know if culture has an effect in
what language is then we kind of want to
look there as well as looking it's not
like the industrialized cultures aren't
interesting of course they are but we
want to look at non-industrialized
cultures as well and so I worked with
two I work with the Tsimane' which are in
um Bolivia and and they're Amazon both in
the Amazon these cases and they are
so-called farmer foragers which is not
hunter gatherers um it's sort of one up
from hunter gatherers and that they do a
little bit of farming as well a lot of
hunting as well but a little bit of
farming and the kind of farming they do
is the kind of farming that I might do
if I ever were to grow like tomatoes or
something in my backyard it's it's that
it's not like so it's not like Big Field
farming it's just a farming for a family
a few things you do that and so that's
what that's the kind of farming they do
um and the other group I've worked with
are the Pirahã which are in uh also in the
Amazon and happen to be in Brazil and
that's with um a guy called Dan Everett
who is a um linguist Anthropologist who
actually lived and worked in the I mean
he was a missionary actually initially
back in the 70s working with trying to
translate languages so they could teach
them the Bible teach them Christianity
what what can you say about that yeah so
the two groups I've worked with the
Tsimane' and the Pirahã are both
isolate languages meaning there's no
known connected languages at all they're
just like on their own yeah there's a
lot of those and and most of the
isolates occur in in the in the Amazon
or in Papua New Guinea and these these places
where the world has sort of stayed still
for long enough and they're have like so
there there aren't earthquakes there
aren't um ah well certainly no
earthquakes in the Amazon jungle and and
and uh the climate isn't bad so you
don't have droughts and so you know in
Africa you've got a lot of moving of
people because there's drought problems
and so so they get a lot of language
contact when you have when people have
to if you got to move because you're you
got no water then you got to get going
and then uh then you run into contact
with other other tribes other groups in
in the Amazon that's not the case and so
people can stay there for hundreds and
hundreds and probably thousands of years
I guess and so these groups
the the Tsimane' and the Pirahã are both
isolates in that and they just I guess
they've just lived there for ages and
ages with minimal contact with other
outside groups um and so I I mean I'm
interested in them because they are I
mean I you know in these case I'm
interested in their words I I would love
to study their syntax their orders of
words but I'm mostly just interested in
how languages you know are connected to
um their their cultures in this way and
so with the Pirahã so most interesting I
was working I was working on number
their number information and so the the
basic idea is I think language is
invented this is what I get from the words
here is that I think language is
invented we talked about color earlier
it's the same idea so that what you need
to talk about with someone else is what
you're going to invent words for okay
and so we invent labels for colors that
I need not that I that I can see but
that but the things I need to tell you
about so that I can get objects from you
or get you to give me the right objects
and I just don't need a word for teal or
or a word for aquamarine in in the in
the Amazon jungle for the most part
because I don't have two things which
differ on those colors I just don't have
that and so and so numbers are really
another fascinating in um source of
information here where um you might you
know naively I certainly thought that
all humans would have words for exact
counting uh and the Pirahã don't okay so
they don't have any words for even one
there's not a word for one in their in
their language and so there's certainly
not a word for two three or four so so
so that kind of blows people's minds
often yeah that blowing my mind that's
pretty weird how are you how are you
going to ask I want two of those you
just don't and so that's just not a
thing you can possibly ask in Pirahã it's
not possible that is there is no words
for that so here's how we found this out
okay so so it was thought to be a one
two many language there are three words
for quantifiers for for for sets but um
and and people had thought that those
meant one two and many uh but what they
really mean is few some and many many is
correct it's few some and many and so
and so the way we figured this out and
uh this is kind of cool is that um we
gave
people uh we had a set of objects okay
these happened to be spools of thread
doesn't really matter what they are
identical objects and and and I sort of
start off here I just give you know give
you one of those and say what's that
okay say you're a Pirahã speaker and you tell
me what it is and and then I give you
two and say what's that and and
nothing's changing in the set except for
the number okay and then I just ask you
to label these things we just do this
for a bunch of different people and and
frankly it's a I I did this task this is
f and it's a weird it's a little bit
weird so you they say the word that they
thought that we thought was one it's few
but for the first one and then maybe
they say few or maybe they say some for
the second and then for the third or the
fourth they start using the word many
for the set and then five six seven
eight I go all the way to 10 and it's
always the same word and they look at me
like I'm stupid because they told me
what the word was for six seven eight and I'm
going to continue asking them at nine
and 10 I'm like I'm sorry I just I just
they understand that I want to know
their language that's the point of the
task is like I'm trying to learn their
language and so that's okay but it does
seem like I'm a little slow what because
I they already told me what the word for
many was five six seven and I keep asking so
it's a little funny to do this task over
and over we did this with the guy called
Dan was the our translator he's the only
one who really speaks Pirahã fluently he's
a good you know bilingual um for a bunch
of languages but also English and
Pirahã and then a guy called Mike Frank
was a also a student with me down there
he he and I did these things and um so
you do that okay and everyone does the
same thing all all all you know we asked
like 10 people and they all do exactly
the same labeling for one up and then we
just do the same thing down on like
random order actually we do some of them
up some of them down first okay and so
we do in instead of one to 10 we do 10
down to one and so so I give them 10
nine at eight they start saying the word
for some and then down when you get to
four everyone is saying the word for few
which we thought was one so it's like
it's the context determined what word
what what what that quantifier they used
was so it's not a count word they're not
they're not count Words they're they're
just approximate words and they're going
to be noisy when you interview a bunch of
people the what the definition of few
and there's going to be a threshold in
the context yeah yeah I don't know what
that means that's that's going to be
depending on context I think that's true
in English too right if you ask an
English person what a few is I mean
that's depend completely on the context
and it might actually be at first hard
to discover yeah cuz for a lot of people
the jump from one to two will be few
right so it's a jump yeah it might be
still be there yeah like it's I mean
that's fascinating that's fascinating
that numbers don't present themselves
yeah so the words aren't there and then
and so then we do these other things
well if if they don't have the words can
they do exact matching kinds of tasks
can they even do those tasks and and and
the answer is sort of yes and no and so
yes they can do them so here's the tasks
that we did we we put out those spools
of thread again okay so put like three
out here and then um that we gave them
some objects and those happen to be
uninflated red balloons it doesn't
really matter what they are it's just a
bunch of exactly the same thing and it
was easy to put down right next to these
um um spools of thread okay and so then
I put out three of these and your task
was to just put one against each of my
three things and they could do that
perfectly so I mean I would actually do
that it was a very easy task to explain
to them because I have I did this with
this guy Mike Frank and he would be my I
I'd be the experimenter telling him to
do this and showing him to do this and
then we just like just do what he did
you know copy him all we had to I didn't
have to speak Pirahã except to know what
copy him like do what he did is like all
we had to be able to say and and then
they would do that just perfectly and
and so we'd move it up we'd do some sort
of random number of items up to 10 and
they basically do perfectly on that they
never get that wrong I mean that's not a
counting task right that is just a match
you just put one against doesn't matter
how I don't need to know how many there
are there to do that correctly and and
they would make mistakes but very very
few and no more than MIT undergrads just
gonna say like there there's no these are
low stakes so you know you make mistakes
counting is not required to complete the
matching test that's right no not at all
okay and so and and so that's our
control and this guy a guy had gone down
there before and said that they couldn't
do this task but I just don't know what
he did wrong there cuz they can do this
task perfectly well and you know I can
can train my dog to do this task so of
course they can do this task and so you
know it's not a hard task but the other
task that was sort of more interesting
is like so then we do a bunch of tasks
where you need um some way to encode the
set so like one one of them is just a I
I just put a a um uh opaque sheet in
front of the of the things I put down a
bunch a set of these things and I put an
opaque sheet down and so you can't see them
anymore and I tell you do the same thing
you were doing before right you know
it's easy if it's two or three it's very
easy but if I don't have the words for
eight it's a little harder like maybe
you know with practice wa well
no because you have to count for us for
us it's easy because we just we just
count them it's just so easy to count
them but but they don't they can't count
them because they don't count they don't
have words for this thing and so they
would do approximate it's totally
fascinating so they would get them
approximately right uh you know uh you
know after four or five you know because
you can basically you always get four
right three or four that looks that's
something we can visually see but but
after that you kind of have it's
approximate number and so then and
there's a bunch of tasks we did and they
all failed as I mean failed they did
approximate after five on all those
tasks and it kind of shows that the
words uh you kind of need the words you
know to be able to do these these kinds
of tasks there's a little bit of a
chicken and egg thing there because if
you don't have the
words then maybe they'll limit you in
the kind of like a little baby Einstein
there won't be able to come up with a
counting task you know what I mean like
uh the ability to count enables you to
come up with interesting things probably
mhm so yes you develop counting because
you need it but then once you have
counting you can probably come up with a
bunch of different inventions MH like
how to I don't
know what kind of thing they do matching
really well for building purposes
building some kind of Hut or something
like this mhm so it's interesting that
language is um a limiter on what you're
able to do yeah here language is just is
the words here is the words like the
words for exact count is the limiting
factor here they just don't have them
yeah in this yeah well that that's what
I mean the LI that limit is also a limit
on the Society of what they're able to
build that's going to be true yeah so
it's proba I mean we don't know this is
one of those problems with the snapshot
of just current languages is that we
don't know what causes a culture to
discover/invent a counting system but
the hypothesis is the guess out there is
something to do with farming so if you
have a bunch of goats and uh you want to
keep track of them and you say have 17
goats and you go to bed at night and you
get up in the morning boy it's easier to
have a count system to do that you know
if I have that's an abstract abstraction
over sets so I don't have like people
often ask me when I talk to tell them
about this kind of work they say well
don't these Pirahã don't they have kids
don't they have a lot of children I'm
like yeah they have a lot of children
and they do they they often have
families of three or four or five kids
and like well don't they need the
numbers to keep track of their kids and
and I always ask the person who says
this like do you have children and the
answer is always no because that's not
how you keep track of your kids you you
care about their identities it's very
important to me when I go I think I have
five children it's it's doesn't matter
which yeah doesn't it matters which five
it's like if you replaced one with
someone else I would I would care a goat
maybe not right that's the kind of point
it's an abstraction something that looks
very similar to the one wouldn't matter
to me probably but if you care about
goats you're going to know them actually
individually also yeah you will I mean
cows goats if it's a source of food and
milk and all that kind of stuff you're
absolutely right but but I'm saying it
is an abstraction such that you don't
have to care about their identities to
do this thing fast that's that's the
hypothesis not mine
from anthropologists who are guessing
about where words for counting came from
is from farming maybe any yeah uh do you
have a sense why universal languages
like Esperanto have not taken
off um like why do we have all these
different languages yeah yeah well my
guess
is the the function of a language is to
do something in a community and and uh I
mean unless there's some function to
that language in the community it's it's
not going to survive it's not going to
be used so here's a great example so what
like language death is super common okay
languages are dying all around the world
and here's how here's why they're dying
and it's like yeah I see this in you
know in it's not happening right now in
either the Tsimane' or the or the Pirahã but
it probably will and so there's a
neighboring group called Mosetén which is
I I I I said that it's an isolate actually
there's a dual there's two of them okay
so it's actually there's two languages
which are really close which are Mosetén
and um and Tsimane' which are
unrelated to anything else and Mosetén is
unlike Tsimane' in that it has a lot of
contact with Spanish and it's dying so
that language is dying the reason it's
dying is there's not a lot of value for
the local people in their native
language so there's much more value in
knowing Spanish like because they want
to feed their families and how do you
feed your family you learn Spanish so
you can make money so you can get a job
and do these things and then you can and
then you make money and so they want
Spanish things they want and so so Mosetén
is in danger and is dying and
that's normal and so basically the
problem is that
people the reason we learn language is
to
communicate and we need to we use it to
to make money and to do whatever it is
to to feed our families and if that's
not
happening uh then it won't take off it's
not like a game or something this is
like something we you like why is
English so popular it's it's not because
it's an easy language to learn maybe it
is I don't really know it's but that's
not why it's popular but because it's a
gig uh the United States is a gigantic
economy therefore it's big economies
that do this it's all it is it's all
about money and that's what and so you
know there's a motivation to learn
Mandarin there's a motivation to learn
Spanish there's a motivation to learn
English these languages are very
valuable to know because there's so so
many speakers all over the world that's
fascinating there's less of a value uh
economically it's like kind of what
drives this it's not about it's not a
you know it's not just for fun I mean
there are these groups that do want to
learn language just for language's sake
and they want and then and there's
something you know to that but th th
those are rare those are Rarities in
general those are few small groups that
do that not most people don't do that
well if that was a primary driver then
everybody was speaking English or speaking
one language there's also a tension
that's happening that well we're moving
towards fewer and fewer languages
exactly I wonder if you're right maybe
maybe on a you know this is slow but
maybe that's where we're moving but
there is a tension you're saying uh
language at the fringes but if you
look at geopolitics and superpowers it
does seem that there's another thing in
tension which is a language is a
national identity sometimes oh yeah for
certain Nation I mean that's the the war
in Ukraine language Ukrainian language
is a symbol of that war in many ways
like a country fighting for its own
identity so it's not merely the
convenience I mean those two things are
in tension
the the convenience of trade and the
economics and be able to uh communicate
with neighboring countries and uh trade
more efficiently with neighboring
countries all that kind of stuff but
also identity of the group I completely
agree because language is the way for
every Community like uh dialects that
emerge are a kind of identity for people
and sometimes a way for people to say f
you to the more powerful yeah people
that's interesting so in that way
language can be used as that tool yeah
it it I completely agree and there's a
lot of work to try to create that
identity so people want to do that speak
you know as a cognitive scientist and
language expert I I I hope that
continues because I don't want languages
to die I want languages to survive
because I um because they're so
interesting for for so many reasons but
I mean I I I find them fascinating just
for the language part but I think they
you know there's a lot of connections to
culture as well which is also very
important do you have
hope uh for machine translation that can
break down the barriers of language so
while all these different diverse
languages exist I guess there's many
ways of asking this question but
basically how hard is it to translate
in an automated way from one language to
another there's there's going to be
cases where it's going to be really hard
right so there are Concepts that are in
one language and not in another like the
most extreme kinds of cases are these
cases of number information so exact
like good luck translating a lot of
English into Pirahã it's just impossible
there's no way to do it because there
are no words for these Concepts that
we're talking about there's probably the
flip side right there's probably stuff
in Pirahã which is going to be hard to
translate into English on the other side
and so I just don't know what those
concepts are I mean you know the the
space the world space is a little is
different from my world space and so I
don't know what like so that the things
they talk about things are you know it's
going to have to do with their life as
opposed to you know my industrial life
which is going to be different and and
so there's going to be problems like
that always um you know there's like
it's not maybe it's not so bad in the
case of some of these spaces and maybe
it's going to be harder in others and uh
so it's pretty bad in number it's like
you know extreme I'd say in the number
space you know exact number space but in
the the color Dimension right so that's
not so bad there's I mean but it's a
problem that that you don't have ways to
talk about the concept
and there might be entire Concepts that
are missing so to you it's more about
the space of concept versus the space of
form like form you can probably map yes
yeah but so you were talking earlier
about
translation and about how
translations there's good and bad
translations I mean now we talking about
translations of form right so what makes
a writing good right you know it's not
music form right it's it's not just the
content it's you know it's how it's
written and translating that I you know
I you know that's that sounds
difficult we should we should say that
there is like I don't I hesitate to say
meaning but there's a music and a rhythm
to the form when you look at the broad
picture like difference between Dostoevsky and
Tolstoy uh or Heming you know Hemingway
Bukowski James Joyce like I mentioned
there's a beat to it there's an edge to
it that like is in the
form we can probably get measures of
those yeah I I I don't know I'm
optimistic that we could get measures of
those things and so maybe that's M I
don't know I don't know though I I have
not worked on that I would love
fascinating translation to he I mean
Hemingway is probably the lowest I would
love to do see different authors but the
average per
sentence uh dependency length for
Hemingway is probably the shortest
that's your sense huh it's simple
sentences short short yeah yeah yeah
yeah I mean that's when if you have
really long sentences even if they don't
have center embedding like they can have
longer connections yeah they can have
longer connections they don't have to
right you can have a long long sentence
with a bunch of local words yeah but
it's but it is much more likely to have
the possibility of long dependencies
with long sentences yeah uh I met a guy
named Aza Raskin who uh who does a lot
of cool stuff really brilliant works
with Tristan Harris and a bunch of stuff
but he was talking to me about
communicating with animals he co-founded
Earth Species Project where you're
trying to find the common language
between whales crows and humans mhm and
he was saying that there is a there's a
lot of promising work that even though
the signals are very different right
like the actual like um if you have
embeddings of the languages they're
actually trying to communicate similar
type things and um is there something
you can comment on that like where is
there promise to that in everything
you've seen in different cultures
especially like remote cultures that
this is a possibility or no that we can
talk to
whales I I I would say yes I I I think
it's not crazy at all I think it's quite
reasonable there's this sort of weird
view well odd view I think that to think
that human language is somehow special I
mean it
is maybe it is uh we can certainly do
more than any of the other species you
know we you know and so and maybe maybe
our language system
is part of that it's possible but but
that but people do have often talked
about how human like Chomsky in fact has
talked about how human only only human
language has you know this you know this
this compositionality thing that he
thinks is sort of key in in language and
it's the the problem with that argument
is he doesn't speak
whale and he doesn't speak uh Crow and
he doesn't speak monkey you know he's
like they they say things like well
they're making a bunch of Grunts and
squeaks and and and that the reasoning
is like that's bad reasoning like you
know I'm pretty sure if you asked a
whale what we're saying they'd say well
I'm making a bunch of weird noises
exactly and so it's like this is a very
odd reasoning to to be making that human
language is special because we're the
only ones who have human language I'm
like well we don't know what those other
we just don't we can't talk to them yet
and so there's probably a signal in there
and it might very well be something
complicated like human language I mean
sure with a small brain in in in lower
in lower species there's probably not a
very good communication system but in
these higher higher species where you
have you know what seems to be you know
abilities to communicate something uh
there might very well be a lot more
signal there than we're uh than we might
have otherwise thought but but also if
we have a lot of intellectual humility
here there's somebody formerly from MIT Neri
Oxman who I admire very much has talked
a lot about has worked
on communicating with plants mhm so like
yes the signal there is even less than
but like it's not out of the realm of
possibility that all
Nature has a way of communicating and
it's a very different language but they
do develop a kind of language through
the chemistry uh through some way of
communicating with each other and if you
have enough humility about that
possibility I think you can um I think
it would be very interesting in a few
decades maybe centuries hopefully not um
a humbling possibility of being able to
communicate not just between humans
effectively but between all of living
things on
earth well I mean I think some of them
are not going to have much interesting
to say but some of them will we don't
know we certainly don't know I think I I
think if we were humble there could be
some interesting trees out
there well they're probably talking to
other trees right they're not talking to
us and so to the extent they're talking
they're saying something interesting to
some other you know you know
conspecific as opposed to us right and so
there probably is there may be some
signal there I I you know so there are
people out there actually it's pretty
common to say that Lang that human
language is special and different from
any other animal communication system
and I I I just I just don't think the
evidence is there for that claim I think
it's not obvious uh
we just don't know what what because we
we don't speak these other communication
systems until we get uh better you know
I do think there's there are people
working on that as you pointed out the
people working on whale speak for
instance like that's really fascinating
let me ask you a wild out-there sci-fi
question if we make contact with an
intelligent alien
civilization and um you get to meet them
mhm how hard do you think it like how
surprised would you be about their way
of
communicating do you think it would be
recognizable maybe there's some
parallels here to when you go to the
remote tribes I mean I I would want Dan
Everett with me he is like amazing at
learning uh foreign languages and so he
like this is an amazing feat right to be
able to go this is a language Pirahã which
had no translators before him I mean
there were he was a missionary who went there well
there was a guy that had been there
before but he wasn't very good and so he
learned the language far better than
anyone else had learned before him he's
like good at he's just a he's a very
social person I think that's a big part
of it is being able to interact so I
don't know it kind of depends on these
these uh this this species from Outer
Outer Space how how much they want to
talk to us is there something you can
say about the process he follows like
what how do you show up to a tribe and
socialize I mean I guess colors and
Counting is is one of the most basic
things to figure out yeah you start that
you actually start with like objects yes
and just say you you just throw a stick
down and say stick and then you say what
do you call this and do this few and
then they'll say the word whatever and
he says a standard thing to do is to throw
two sticks and say two sticks and then you
know he learned pretty quick that there
weren't any count words in this
language because they didn't know this
wasn't interesting it was kind of weird
they'd say some or something the same
word over and over again and so but that
is a standard thing you just like try to
but you have to be pretty out there
socially like willing to talk to random
people which these are you know really
very different people from you and he
was and he's he's very social and so I
think that's a big part of this is like
that's how you know a lot of people know
a lot of languages is they're willing to
talk to other people that's a tough one
where you just show up knowing nothing
yeah oh God it's a it's beautiful that
humans are able to connect in that way
yeah yeah uh you've had an incredible
career exploring this fascinating topic
what advice would you give to young
people um about how to have a
career like that or a life that that
that they can be proud of when you see
something interesting just go and do it
like I do I do that like I that's
something I do which is kind of unusual
for most people so like when I saw the
Pirahã was available to go and visit
I was like yes yes I'll go and then when
we couldn't go back we had some trouble
with the uh Brazilian government there's
some corrupt people there it was very
difficult to get go back in there and so
I was like all right I got to find
another group and so we searched around
and were able to find the Tsimane' because
I wanted to keep working on this kind of
problem and so we found the Tsimane' and
just go there I didn't really have we
didn't have contact we had a little bit
of contact and brought someone and and
that was you know we just just kind of
just try things I I say it's like a lot
of that's just like ambition just try to
do something that other people haven't
done just give it a shot is what I I
mean I I do that all the time I don't
know I love it but and I love the fact
that your pursuit of fun has landed you
here talking to me this was an
incredible conversation that you're uh
you're uh you're just a fascinating
human being thank you for taking a
journey through human language with me
today this is awesome thank you very
much Lex it's been a pleasure thanks for
listening to this conversation with
Edward Gibson to support this podcast
please check out our sponsors in the
description and now let me leave you
with some words from Wittgenstein
the limits of my language mean the
limits of my
world thank you for listening and hope
to see you next time