It looks like Mark Liberman is going to give the full Language Log treatment to yesterday's assertion by the AP that songbirds can learn grammar; nevertheless, I'm still going to take a whack at summarizing and then criticizing the article in Nature that prompted the news. I'm writing this up partly as an exercise in puzzling out the details of a technical paper, and partly because I think the researchers have made claims that are too strong. (You can find the article by going to the table of contents for the current issue and scrolling down to "Recursive syntactic pattern learning by songbirds", but you need to be a subscriber or access it through a library with a site license.)
The central claim of the article is that European starlings (Sturnus vulgaris) can be taught grammar. The definition of grammar the authors are using is a very precise one. Up to now, no animal communication system has been shown to require anything more complex than a finite state machine to recognize or generate it; in other words, such systems have all been regular languages. What the researchers have done is to train starlings to recognize strings of birdsong that are beyond the capacity of finite state machines, and they claim this shows the birds must be doing recursion.
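To make "finite state machine" concrete, here's a minimal sketch of my own (nothing like this appears in the paper) of a machine that recognizes the regular language (AB)^n, written in Python:

    # A sketch of a deterministic finite automaton for the regular
    # language (AB)^n, n >= 1.  States: "start" (expecting an A),
    # "sawA" (expecting a B); "sawB" doubles as the accepting state.
    TRANSITIONS = {
        ("start", "A"): "sawA",
        ("sawA", "B"): "sawB",
        ("sawB", "A"): "sawA",
    }

    def accepts_ABn(song):
        state = "start"
        for motif in song:
            state = TRANSITIONS.get((state, motif))
            if state is None:          # no legal move: reject
                return False
        return state == "sawB"         # must end right after a B

    assert accepts_ABn("ABAB")
    assert not accepts_ABn("AABB")

The whole machine is those three transitions; it never remembers anything beyond which of the three states it's in. That's the poverty the paper's argument turns on.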
Here's what they did. (Warning: I'm simplifying, and it's always possible I'm misreading, too.) First, they recorded a single starling singing eight versions of two "motifs" they call "rattle" and "warble"; let's call them A and B, like the authors do. They then used various combinations of those recordings to make two training sets: one containing songs of the form A^2B^2 (i.e. AABB), and the other songs of the form (AB)^2 (i.e. ABAB). They then tried to train the starlings to differentiate between the two sets by rewarding them with food when they responded correctly, and punishing them by dimming the lights when they responded incorrectly. Next, they played back songs with the same patterns, A^2B^2 and (AB)^2, but made up of new combinations of recorded A's and B's. Lo and behold, the starlings still chose the right "sentences" even though the particular "words" were different. They also played the birds some possibly confusing new songs with patterns like AAAA, BBBB, ABBA, and BAAB, which they reacted to differently from A^2B^2 and (AB)^2.
Next, the researchers constructed and played back songs of the forms A^nB^n and (AB)^n with n=3 and n=4, that is to say, the songs AAABBB, AAAABBBB, ABABAB, and ABABABAB. Even though the birds hadn't been trained on sequences like that, they still differentiated between them correctly. The researchers conclude from this that the birds have acquired a grammar for the songs that can handle recursive center-embedding.
I think that claim is too strong. The first round of training and results with the A^2B^2 and (AB)^2 songs strikes me as unremarkable, because fixed-length songs like that could be learned by rote. In effect, all they show is that starlings can do categorical perception on their own songs, distinguishing between "rattle" and "warble". This isn't surprising, and I don't think the researchers are claiming it is. The interesting (and troublesome) claims are those based on the tests with songs of the form A^nB^n. It's true that finite state machines can't generate or recognize languages of the form A^nB^n for unbounded n, so the fact that the birds can apparently distinguish songs of that form implies that they can process song "languages" that are outside of the regular languages.
However, this does not imply that the birds are recognizing context-free languages when they're processing the songs. It's true that CFGs are the next step up the Chomsky hierarchy from regular languages, but that's a fact about the hierarchy, not about the universe: there are other classes of languages that properly contain the regular languages while still being properly contained in the context-free languages. The researchers propose a few of these and show why they're not sufficient to account for the birds' success, but I suspect the birds may be processing one they didn't consider. In particular, the starlings may just be using memory.
One way of describing the computational limitations of finite state machines is to say that they don't have any memory: they only know their current state, and nothing more, so they can't, for example, keep an unbounded count of the A's that have gone by in order to recognize the same number of B's later on. However, simply adding the capacity to count (a single counter bolted onto the FSM) would allow it to handle A^nB^n languages. Maybe that's all the starlings are doing: simply counting the A's and B's. That would account for their managing A^nB^n languages, and wouldn't imply they're capable of processing full CFGs. Before I'd be convinced of that, I'd want to see a significantly more complex language being recognized: say, one with four symbols A, B, C, and D, where the grammatical strings were things like AA, BB, BCCB, CDCCDC, DCBAABCD (i.e. every character is matched in a nested center-embedded structure). That would clearly require a CFG; no simple counting strategy would suffice.
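Here's a sketch of what I have in mind, in Python; this is my own illustration, not anything from the paper. A recognizer for A^nB^n needs nothing more than a single counter:

    # A sketch of a one-counter recognizer for A^nB^n (n >= 1).
    # It remembers exactly one number: how many A's have not yet
    # been matched by a B.  No stack, no full pushdown machinery.
    def accepts_AnBn(song):
        count = 0
        i = 0
        while i < len(song) and song[i] == "A":   # count the A's
            count += 1
            i += 1
        if count == 0:
            return False
        while i < len(song) and song[i] == "B":   # cancel with B's
            count -= 1
            i += 1
        return i == len(song) and count == 0      # all matched up

    assert accepts_AnBn("AAABBB")
    assert not accepts_AnBn("AABBB")
    assert not accepts_AnBn("ABAB")

A language like the four-symbol one above is a different story: the machine would have to remember the whole first half of the string in order, which is exactly what a pushdown stack buys you and what a lone counter doesn't.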
In fact, there may be an even simpler explanation than counting. Perhaps the starlings have simply learned that correct songs consist of several A's that go on for a while, then several B's that go on for the same amount of time. Note that most of the "rattle" and "warble" recordings, whose spectrograms are supplied in the supplementary material, appear to be very nearly the same length, with the exception of one double-length "warble". I'd be surprised if starlings don't have at least a rudimentary sense of time: when one flies off in some direction and then needs to return to its starting point, it must know to fly back in the opposite direction for about the same duration. Maybe they've just learned a little timing.
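To show how little machinery that strategy would take, here's a sketch with invented motif durations (the numbers are just for illustration, not taken from the stimuli):

    # A sketch of a pure timing strategy: accept anything that is one
    # unbroken stretch of A-sound followed by one unbroken stretch of
    # B-sound of roughly equal duration.  Songs are lists of
    # (symbol, duration-in-seconds) pairs with made-up durations.
    def sounds_grammatical(song, tolerance=0.25):
        a_time = 0.0
        i = 0
        while i < len(song) and song[i][0] == "A":
            a_time += song[i][1]
            i += 1
        b_time = 0.0
        while i < len(song) and song[i][0] == "B":
            b_time += song[i][1]
            i += 1
        if i < len(song) or a_time == 0 or b_time == 0:
            return False               # interleaved, or missing a run
        return abs(a_time - b_time) <= tolerance * a_time

    # With near-uniform motif lengths, this tracks A^nB^n exactly:
    assert sounds_grammatical([("A", 1.0)] * 3 + [("B", 1.0)] * 3)
    assert not sounds_grammatical([("A", 1.0), ("B", 1.0)] * 2)

Notice there's no counting anywhere in it. The one double-length "warble" is the interesting case: a pure timer like this could accept a song with one fewer B if that motif were included, even though the string isn't A^nB^n.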
There's a final aspect of the research that I don't understand clearly from the article, and that's the precise statistics involved. It's clear that the starlings were able to recognize grammatical songs with better than chance frequency, but it's not clear to me exactly how well they were doing. Did they get them right 55% of the time, or was it more like 95%? I'll have to go read up on this d' measure the researchers used. Maybe Mark will explain it...
Chomsky's comments here. (Via the Linguist List.)
Posted by: AJ | April 27, 2006 at 07:06 AM
d' is a measure of sensitivity as used in signal detection theory. If you haven't already, google "d prime" to get what you need.
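To flesh that out a bit: d' is the difference between the z-transformed hit rate and false-alarm rate. A quick sketch in Python (scipy's norm.ppf is the inverse normal CDF, i.e. the z-transform):

    # d-prime from signal detection theory: the separation between
    # the "signal" and "noise" distributions, in standard deviations.
    from scipy.stats import norm

    def d_prime(hit_rate, false_alarm_rate):
        # z-transform (inverse of the normal CDF) of each rate
        return norm.ppf(hit_rate) - norm.ppf(false_alarm_rate)

    print(d_prime(0.95, 0.05))   # about 3.29: very sensitive
    print(d_prime(0.55, 0.45))   # about 0.25: barely above chance

So a d' near zero is the 55% picture and a d' above 3 is the 95% picture, roughly speaking.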
I'm also confused as to why a finite state machine wouldn't be able to distinguish between A^nB^n and (AB)^n, for n > 1. I can draw one up on a napkin that seems to do the trick.
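For what it's worth, with n capped at four both sets are finite, hence regular; here's the napkin machine rendered as Python regular expressions (my own sketch, not from the paper):

    import re

    # With n bounded at 4, A^nB^n is just a finite union of strings,
    # so an FSM (here, a regular expression) handles it fine.
    AnBn_BOUNDED = re.compile(r"AABB|AAABBB|AAAABBBB")
    ABn_BOUNDED  = re.compile(r"(?:AB){2,4}")

    assert AnBn_BOUNDED.fullmatch("AAABBB")
    assert ABn_BOUNDED.fullmatch("ABABAB")
    assert not AnBn_BOUNDED.fullmatch("ABABAB")

Only unbounded n would force something beyond finite state.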
Also, wouldn't adding the ability to count basically turn a finite automaton into a pushdown automaton, thereby giving it the ability to recognize CFGs?
Maybe I should take a look into the actual paper...
Posted by: Hao | April 27, 2006 at 01:35 PM
A push-down automaton has a stack that can remember an arbitrary number of different items and their order. The starlings could get by with the equivalent of a single, very special-purpose "register" that only keeps track of the number of A's. It wouldn't have to count above four, either.
Posted by: The Tensor | April 27, 2006 at 02:42 PM