How Complex is Natural Language? The Chomsky Hierarchy
How can we describe the complexity of linguistic systems? Where does natural language fit in? In this week's episode, we talk about the
Chomsky hierarchy: what it captures, what characterizes different kinds of grammars on the hierarchy, and whether we can find grammars that sit higher on the scale than human language.
This is
Topic #63!
This week's tag language:
Croatian!
Related topics:
Happy Little Trees: X'
Theory - https://youtu.be/7UOcoQr0hvg
Trace Evidence: Syntactic
Movement - https://youtu.be/x5iBbSkp8rk
Last episode:
Quantifying Sets and Toasters: Generalized Quantifiers - https://youtu.be/U1l3C_hmjqM
Other of our syntax and morphology videos:
Raising the Bar:
Raising and
Control Verbs - https://youtu.be/SYoYNeaSYrU
Organizing Meanings: Morphological
Typology - https://youtu.be/Ts2DS0ZsTyo
Referential
Treatment:
Binding Theory - https://youtu.be/9sqm_cex4kA
Also, if you'd like to know more about the
Chomsky Hierarchy and its impact on computer science, Computerphile's had a few videos about them:
- Their episode on the hierarchy: https://www.youtube.com/watch?v=224plb3bCog.
- Their episode about finite-state machines: https://www.youtube.com/watch?v=vhiiia1_hC4.
- And their episode about how finite-state machines relate to grammar: https://www.youtube.com/watch?v=RjOCRYdg8BY.
Find us on all the social media worlds:
Tumblr:
http://thelingspace.tumblr.com/
Twitter: http://twitter.com/TheLingSpace
Facebook: http://www.facebook.com/thelingspace/
And at our website, http://www
.thelingspace.com/ !
You can also find our store at the website, https://thelingspace.storenvy.com/
Our website also has extra content about this week's topic at http://www.thelingspace.com/episode-63/
(Or it will by Thursday morning.)
We also have forums to discuss this episode, and linguistics more generally.
Sources:
(1) I-Language (
1st Edition,
Daniela isac &
Charles Reiss)
(2)
Introduction to the
Theory of Computation (
3rd Edition,
Michael Sipser)
(3)
Mathematical Logic for
Computer Science (3rd Edition,
Mordechai Ben-Ari)
(4) Evidence
Against the Context-Freeness of
Natural Language (
Stuart Shieber - http://www.eecs.harvard.edu/~shieber/Biblio/
Papers/shieber85
.pdf)
(5) https://en.wikipedia.org/wiki/Chomsky_hierarchy
(6)
Syntactic Structures (
Noam Chomsky)
(7) Mathematical Methods in Linguistics (
Barbara Partee,
Alice G. ter Meulen,
Robert Wall)
Proof regarding crossing dependencies (adapted from the first edition of Introduction to
Automata Theory, Languages, and Computation, by
John Hopcroft and
Jeffrey Ullman.
Note where carets appear that the following character should be taken as superscript):
We first capture the general pattern of embedded clauses in
Swiss German with the language a^nb^mc^nd^m . We then treat this as the result of intersecting Swiss German with the regular language a*b*c*d*.
Now, let L = {a^nb^mc^nd^m | n ≥ 1 and m ≥ 1}.
Suppose L is a context-free language, and let p be the pumping length referred to in the pumping lemma for context-free languages (https://en.wikipedia.org/wiki/Pumping_lemma_for_context-free_languages).
Consider the string z = a^pb^pc^pd^p. Let z = uvwxy satisfy the conditions of the pumping lemma. Then as |vwx| ≤ p, vx can contain at most two different symbols. Furthermore, if vx contains two different symbols, they must be consecutive, for example, a and b. If vx has only a’s, then uwy has fewer a’s than c’s and is not in L, a contradiction. We proceed similarly if vx consists of only b’s, only c’s, or only d’s. Now suppose that vx has a’s and b’s. Then uwy still has fewer a’s than c’s. A similar contradiction occurs if vx consist of b’s and c’s or c’s and d’s. Since these are the only possibilities, we conclude that L is not context-free.
Since context-free languages are closed under intersection with regular languages, and the above intersection is not context-free, Swiss German must be non-context-free.
Q.E.D.
A proof of the pumping lemma itself can be found in Introduction to the Theory of Computation (among other places). For a discussion of the closure properties of context-free languages, see Mathematical Methods in Linguistics (among other places).
Looking forward to next week!