Thursday March 10, 2005

Using the Same Examples

I've been working on linguistic typology recently (beyond my thesis, I mean).  One of the things that a good cross-linguistic survey ought to do is select as random a sample of languages as possible.  However, selecting languages is very hard to do without biasing the sample—for example, in practice you're often limited to written sources, which restricts you to descriptive grammars written in a language you can read and that your library has a copy of.  On top of that, a linguist sometimes has to ignore all the little tidbits of information he or she has picked up over the years about various languages in order to avoid inadvertently choosing the same languages every other linguist would use.

Allow me to illustrate what I mean with a short quiz.  For each question, your answer should be the first example that pops into your head.  I predict that, although our answers won't agree every time, with much greater than chance frequency, you'll pick the same language I did.  My answer follows each question on the same line in the background color—select the line to see it.

[Update: If you're reading this post through an aggregator, the formatting has likely been stripped off, making my answers visible.  Try clicking through to the actual blog to take the quiz.]

  1. Name a non-Indo-European language.  Basque
  2. Name an Australian language.  Dyirbal
  3. Name a group of non-mutually intelligible languages that are often called "dialects".  Chinese
  4. Name a group of several mutually intelligible dialects that are often called "languages".  Scandinavian
  5. Name a head-initial language.  English
  6. Name a head-final language.  Japanese
  7. Name a language with free word order (scrambling).  Classical Latin
  8. Name an isolating language.  Mandarin Chinese
  9. Name a language that permits null subjects.  Spanish or Italian
  10. Name a language with basic word order SVO.  English
  11. Name a language with basic word order SOV.  Japanese
  12. Name a language with basic word order VSO.  Irish
  13. Name a language with many noun cases.  Finnish
  14. Name a language with many forms for each verb.  Turkish
  15. Name a language with many noun classes (genders).  Swahili
  16. Name a language with noun classifiers (counters).  Mandarin Chinese
  17. Name a language with productive reduplication.  Tagalog
  18. Name an ergative-absolutive language.  Georgian
  19. Name a language with dual number.  Classical Greek
  20. Name a language with complex consonant clusters.  Berber or Polish
  21. Name a language with tones.  Mandarin Chinese
  22. Name a language with vowel harmony.  Turkish
  23. Name a language with a small phoneme inventory.  Hawaiian

See what I mean?  I stayed away from questions where there are only one or two good answers (e.g. "Name a language that contains grammatical structures that are beyond the power of context-free grammars"), but I'll bet we chose the same language more than once.  If we have the same answers for such wide-ranging phenomena as these, that means linguistics as a field has, by a sort of unspoken agreement, settled on a set of standard examples.  Now, there's probably more than one common answer for some of these questions.  Number 1 was just a stab in the dark, although I bet Basque is the answer more than once per (N_languages – N_Indo-European) respondents.  Still, there's often a very small group of languages that turns up in discussions of a particular phenomenon over and over.
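To make the "chance" baseline for question 1 concrete, here is a minimal sketch in Python. It assumes a respondent picking uniformly at random among all non-Indo-European languages. The two language counts are rough round figures chosen purely for illustration (they are assumptions, not numbers from this post), and the function names are made up for the example.

```python
from fractions import Fraction

# Assumed round figures, for illustration only; real counts depend on
# how one draws the line between languages and dialects.
N_LANGUAGES = 7000       # assumption: languages worldwide
N_INDO_EUROPEAN = 450    # assumption: Indo-European languages


def chance_match_probability(n_candidates: int) -> Fraction:
    """Probability that one uniform random pick equals a fixed answer."""
    if n_candidates <= 0:
        raise ValueError("need at least one candidate language")
    return Fraction(1, n_candidates)


def expected_matches(n_respondents: int, n_candidates: int) -> Fraction:
    """Expected number of respondents who hit the fixed answer by chance."""
    return n_respondents * chance_match_probability(n_candidates)


non_ie = N_LANGUAGES - N_INDO_EUROPEAN
print(f"candidate answers for question 1: {non_ie}")
print(f"chance of naming Basque: {float(chance_match_probability(non_ie)):.6f}")
print(f"expected Basque answers per 100 respondents: "
      f"{float(expected_matches(100, non_ie)):.4f}")
```

Under these assumed counts, pure chance would produce about one "Basque" per several thousand respondents, so even a handful of matching answers in a small group of readers is far above that baseline.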

This is, I suppose, inevitable, and not entirely negative.  It's handy if everyone in the field shares a common working set of examples so we can easily have discussions about various interesting phenomena.  However, if we're really interested in exploring the full range of possible human languages, playing in the same small sandbox of well-known languages (and occasionally adding a new example when somebody strays off the beaten path) simply won't cut it as a method of searching the human language space.

I can think of a solution, but it's hard: learn more "exotic" languages, specialize in language families beyond the familiar (I think we've got Indo-European covered at this point), and fer chrissake stop using English as a source of examples.  Did I say "hard"?  Maybe I should have said "unrealistic"—I have to admit that I'm not ready to abandon the use of examples from my native language—but a real effort to stay away from the standard example languages can only lead us to a broader perspective and a better basis for cross-linguistic generalizations.

[Now playing: "Los Angeles" by X]

[Update: In the comments, Heidi Harley mentions a similar list she's been collecting: examples upon which well-known bits of linguistic theory are based.]

I am The Tensor, and I approve this post.
04:48 AM in Linguistics


Listed below are links to weblogs that reference Using the Same Examples:

» Tenser, said the Tensor: Using the Same Examples from Semantic Compositions
If you've ever taken a linguistics class, you need to take The Tensor's quiz. The only thing I have to add is: Name a language with an unusually large phoneme inventory. !Xóõ I've hidden my answer in white text after [Read More]

Tracked on Mar 10, 2005 2:37:27 PM

The Tensor has a very interesting post illustrating one of the occupational hazards of linguistics: the limited pool of standard examples used to demonstrate linguistic phenomena. If you've had any exposure to this sort of thing, think of a language... [Read More]

Tracked on Mar 12, 2005 4:41:33 PM


11/23. Which is probably much greater than chance; I'm actually surprised we differed that much. I was most shocked to see you choose Dyirbal instead of Lardil. (I differed from you about tones, too; I'm surprised you didn't choose one of the African languages.)

Posted by: wolfangel at Mar 10, 2005 6:18:35 AM

13 of my answers were the same as yours. And on at least two of the ones that were different, I seem to have been thinking along the same lines as Wolfangel; I had Lardil for #2 and Shona for #21.

Posted by: Q. Pheevr at Mar 10, 2005 6:43:52 AM

You may want to not have the full text of this post in the RSS feed, as it doesn't put it in the background colour (at least not in my aggregator).

For what it's worth I got about 9 the same as you. But I'm not a real linguist.

I'm surprised you chose Georgian for Ergative/Absolutive - I mean, it is, more or less (the one linguist I know with an interest in Georgian doesn't think that's a very good description) but there're several languages I'd have chosen ahead of it.

How come you get two picks for #20? (Where, incidentally, I'm surprised you didn't choose Georgian.)

Posted by: Tim May at Mar 10, 2005 9:00:24 AM

Unfortunately my linguistic education was "poisoned" early by exposure to Warlpiri, so I tend to use it as an example wherever possible. Thus, "Warlpiri" was my answer to 1, 2, 7, 9, 18, 19, and 22.

I would have agreed with you more often if it had been part of the instructions to try to agree, before seeing your responses.

I am told that Danish and Norwegian are mutually intelligible in writing, but not in speech, because Danish has undergone some fairly severe sound changes that aren't reflected in its standard orthography.

I had Georgian for #20. You can't beat vprckvni.

Posted by: ACW at Mar 10, 2005 11:13:08 AM

15/23. How did you plug into my brain without me knowing?

Oddly, Dyirbal was my answer to 2, but Warlpiri was my answer to 7. Warlpiri is stuck in my head from being introduced in a class on LFG as an example of scrambling. But that's about the only place I know anything about it from.

I'd like to think that generally, I have some idea of the sources of these examples, but you've got me in a few places. Aside from the original OT manuscript, I can't remember coming across Berber as an example of complex consonant clusters. Also, the only place I remember coming across Classical Greek as an example had to do with extrasyllabic material, not duals. But maybe this means I'm just losing my memories of grad school.

Posted by: Semantic Compositions at Mar 10, 2005 2:22:37 PM

Summary: Score of 6/13;
proposed explanations for the use of common "stock examples,"
with some justifications.

My exposure to linguistics is casual
recreational reading, a 101 class ca. 15 years ago,
and formally studying two foreign languages
(and others informally to a minor or trivial extent).

I had answers (maybe three of them only
questionable guesses) for 13 questions.

6/13 were the same as yours. Wow.

I think there are understandable
and even some justifiable reasons
for some of the similarities in "stock
examples", though.

A person can choose a language to use as
an example for either or both of two reasons:
1. The person recognizes the example's appropriateness
because of personal knowledge of the language.
2. The person is using the language as an
example because s/he learned from someone else
that the language is a good example.
(The example was originally chosen for reason #1.)

Possible supplementary considerations might be
3. A preference to choose an example that the audience
is more likely to speak/read/write, because a
language understood by an audience member might
be easier for the audience member to more thoroughly
appreciate and understand the applicability of.
4. Preference for an example that your 101 class / casual reader /
fellow linguist has maybe at least heard of. :-)
Then you might be inclined to re-use this example on other

With the above considerations, it's not surprising
if there's a tendency for languages that are more commonly
known (and known of?) to be chosen as examples.

As for myself, some of the common examples that I chose,
I used primarily because I personally knew the languages
well enough to pick them as an easy example from my own

Basque as "a non-Indo-European language"
probably IMHO does have something to do with
the writers' methods of mental categorizing
and a Euro-centric history, though.
(Linguistics as a field of that name
probably derives in semi-recent history
more from certain areas of Europe than elsewhere, I suppose.)
Isn't being an unusually geographically placed non-Indo-European
orphan language the main thing Basque is famous for, anyway? :-)


Posted by: Anonymous Non-Linguist at Mar 10, 2005 6:59:23 PM

Stereotypes aside, I have taken allegedly "general" linguistics classes where every other example seemed to come from languages spoken in Africa, Asia, or the South Pacific. I remember breathing a sigh of relief when I heard ONE Scandinavian-language citation being discussed in class... *shrug*

Posted by: Ingeborg S. Nordén at Mar 10, 2005 9:49:40 PM

There are certain language phenomena that are so remarkable that beginning linguistics students are likely to be skeptical about them. It's useful, then, for instructors to have examples lined up. I think that phenomena like ergativity and scrambling fall into this category. This list of examples becomes part of linguistic culture through the usual process of academic mentoring.

It also works the other way around. Most linguists can reel off a list of salient properties for particular languages, and I suspect the salience is largely determined by pedagogical tradition. So, for example, somebody says, "Japanese", and I think, "Simple syllable structure; head-final; elaborate morphologized respect system; systematic diglossia with Chinese as the source of the 'high' vocabulary; quirky hybrid orthography; moraic prosody ..." For Georgian: "Few phonotactic constraints; elaborate verb morphology with extensive argument incorporation; productive glottalization of stops; examples of metathesis; unintuitive "mother" and "father" words ..."

I think it would be fun to have a page that collected these little factoids about the world's languages.

Posted by: ACW at Mar 11, 2005 9:35:17 AM

Hmmm. Trying to get together a list of serious "factoids" for the languages I actually speak fluently is harder than it appears, at first glance. For instance, almost no one thinks of Swedish as a "tone language" (it's a borderline example, but minimal pairs contrasted by tone still exist).

Posted by: Ingeborg S. Nordén at Mar 11, 2005 8:54:22 PM

15 for me, and I too was surprised at #20. I think Georgian would be much more likely to collect concurring responses. Interesting topic.

Posted by: language hat at Mar 12, 2005 4:31:42 PM

"Name a language that contains grammtical structures that are beyond the power of context free grammars"

Curiosity is killing me. I'm not a linguist.

Nivkh? Burushaski? Yukagir?


Posted by: John Emerson at Mar 13, 2005 11:18:16 AM

My answer would be "Swiss German", because that's the example I seem to encounter most often. I gather that Dutch and Bambara also have non-context-free phenomena in them.

Posted by: The Tensor at Mar 13, 2005 6:10:53 PM

You can't beat vprckvni.

Well, xłp̓x̣ʷłtłpłłs kʷc̓, but Georgian's a better-known language than Bella Coola.

Posted by: Tim May at Mar 15, 2005 5:51:27 PM

Tim May: Holy shit, that looks scary.

Ingeborg S. Nordén: Phonemic tone accents in (some varieties of) Swedish is definitely one of the items I (and I suspect most linguists) know about Swedish. Along with the unique initial consonant of sjö.

Some varieties of Serbocroatian also have phonemic tone accents.

Also in the scrapbook it should say that Danish preserves the same contrast as Swedish, not as a tone accent but in the form of an unwritten postvocalic glottal stop called stød.

Posted by: ACW at Mar 16, 2005 8:35:56 AM

Somewhat along the same lines, I recently started collecting example sentences that, within my corner of the linguistic world, all by themselves immediately evoke an entire line of analysis and literature of argumentation: say any one of these to a linguistic graduate student and they will more than likely be able to describe the basic problem and at least one attempt at an account, and probably cite someone who's talked about them. So far they're nearly all English, again an artifact of the way undergraduate linguistics tends to be taught. I take the bold step of posting them here (in absolutely no order, though subgrouped) without identifying the phenomenon they're supposed to be illustrating; see how many you can name. (Ok, some aren't sentences, and some aren't connected to a literature of argumentation but are just famous examples.) I haven't gotten too far with it, really. Suggested additions, alternative examples and similar examples from some other corner of the linguistic universe very welcome -- email me at

John is easy to please.
John is eager to please.
The cat seems to be out of the bag.
*The cat wants to be out of the bag.
Tabs were kept on Jane Fonda.
John persuaded Mary to leave.
John promised Mary to leave.
Who(i) does his(i) mother love?
His(i) mother loves every boy(i).
Every horse didn't jump over the fence.
There seems to be a man in the room.
The horse raced past the barn fell.
Colorless green ideas sleep furiously.
More people have been to Russia than I have.
The rat the cat the dog chased bit hid.
Don’t giggle me!
Hesperus is Phosphorus.
Snow is white.
"water" refers to XYZ
Someone loves everyone.
Two languages are spoken by everyone in this room.
Who did you say bought what?
*What did you say who bought?
Navratifuckinlova, *Nafuckinvratilova
writer, rider
pit, spit
Buffalo buffalo Buffalo buffalo buffalo buffalo Buffalo buffalo Buffalo buffalo buffalo
Mary saw the man with the telescope.
It’s cold in here!
John kicked the bucket.
Into the room whistled the bullet.
John loves his father and so does Bill.

Posted by: Heidi Harley at Mar 16, 2005 4:22:40 PM

Off the top of my head:

Donkey sentences: Every farmer who owns a donkey beats it.
Parasitic gap: Which report did you file without reading?
productive morphology: wug/wugs

Is there a standard "going to"/"gonna" example?

Posted by: The Tensor at Mar 16, 2005 5:31:59 PM

Ah! don't know about gonna, but can't believe I forgot

Who do you wanna leave?

Posted by: Heidi Harley at Mar 16, 2005 5:41:46 PM

1) I had a lot of the languages, and I am a beginner linguist!
2) I am Norwegian and I understand Danish very well, most of the time! I actually have an American friend who speaks Danish and we can have a perfectly intelligent and long conversation in Danish and Norwegian. The fact that he understands me stands to prove that they are dialects, I would argue!
3) Norwegian has tone accents, too, as does Japanese.
4) I am taking a class on tone now, and my answer for a tone language would be Chumburung. When Keith Snider and his Register Tier Theory get the recognition they deserve, I bet that will be the standard answer! :)

I love meeting other linguists, we are such dorks! hehe

Posted by: Vera at Mar 16, 2005 5:45:52 PM

Oh, and going to an SIL and working with them might help broaden the scope, although they have their own set of standard languages to pick examples from. But how many out there have done a term paper on Northern Tepehuan, or Pokomchi?? :) Or have a professor who speaks the language in question fluently? (Keith Snider, Chumburung). It is fun fun fun!

Posted by: Vera at Mar 16, 2005 6:13:44 PM

"Also in the scrapbook it should say that Danish preserves the same contrast as Swedish"

Mmmmm, danish preserves... [Homer drool]

Posted by: Terminal Student at Mar 16, 2005 10:58:41 PM

"I am taking a class on tone now, and my answer for a tone language would be Chumburung."

Chumburung? Ok, now you people are just makin' up words.

Posted by: Terminal Student at Mar 16, 2005 11:00:08 PM

It is a Kwa language from Ghana. Honest! Lots of fun, with automatic downstep and floating tones all over the place :)

Posted by: Vera at Mar 17, 2005 5:26:49 PM

Posting those ex. sentences here made me realize that what I really want is my very own blog to put this kind of musing up in... so I've joined the linguablogosphere, here:

come and visit sometime...and thanks for the good reading!

Posted by: Heidi Harley at Mar 18, 2005 5:33:36 PM

Once a fashionable concept, the head-initial or head-final parameter lost its theoretic status and descriptive utility years ago. Data from such languages never supported such parametrisation, especially for typological ends, and since Kayne's anti-symmetry it is no longer part of syntactic theory.

Posted by: Tony Marmo at Mar 21, 2005 2:25:39 PM

oh - to add to that list above:
John is a bachelor.
married/single as "complementary pairs". how out of date can you get - i'm neither!!

Posted by: tania at Jun 10, 2005 2:58:05 AM