Penn Research Shows That Young Children Have Grammar and Chimpanzees Don’t

A new study from the University of Pennsylvania has shown that children as young as 2 understand basic grammar rules when they first learn to speak and are not simply imitating adults.

The study also applied the same statistical analysis to data from one of the most famous animal language-acquisition experiments, Project Nim, and showed that Nim Chimpsky, a chimpanzee who was taught sign language over the course of many years, never grasped rules like those in a 2-year-old’s grammar.

The study was conducted by Charles Yang, a professor of linguistics in the School of Arts and Sciences and of computer science in the School of Engineering and Applied Science. It was published in the Proceedings of the National Academy of Sciences.

Linguists have long debated whether young children actually understand the grammar they are using or are simply memorizing and imitating adults. One difficulty in resolving this debate is the inherent limitation of the data; 2-year-old children have very small vocabularies and thus don’t provide many different examples of grammar usage.

“While a child may not say very much, that doesn’t mean that they don’t know anything about language,” Yang said. “Despite the superficial lack of diversity of speech patterns, if you study it carefully and formulate what having a grammar would entail within those limitations, even young children seem very much on target.”

Yang’s approach was to look at one area of grammar that young children do regularly display: article usage, or whether to put “a” or “the” before a noun. He found a sufficient number of examples of article usage in the nine data sets of child speech he analyzed, but there was another challenge in determining if these children understood the grammar rules they were using.

“When children use articles, they’re pretty much error free from day one,” Yang said. “But being error free could mean that they’ve learned the grammar of article usage in English, or they have memorized and are imitating adults who wouldn’t make mistakes either.”

To get around this problem, Yang took advantage of the fact that most nouns can be paired with either the definite or indefinite article to produce a grammatically correct phrase, but the resulting phrases have different meanings and usages. This makes the combinations vary in frequency.

For example, “the bathroom” is a more common phrase than “a bathroom,” while “a bath” is more common than “the bath.” This difference has nothing to do with grammar; it reflects how often phrases containing those combinations are used. There are simply more opportunities to use phrases like “I need to go to the bathroom” or “the dog needs a bath” than there are phrases like “there’s a bathroom on the second floor” or “the bath was too cold.”

This means that the likelihood of using a particular article with a given noun is not 50/50; it is weighted toward either “the” or “a.” Such lopsided combination tendencies can be characterized by general statistical laws of language, which Yang used to develop a mathematical model for predicting the expected diversity of noun phrases in a sample of speech.

This model was able to differentiate between the diversity expected if children were using grammar and the diversity expected if they were simply imitating adults. Because of these differences in frequency, an adult might say only “the bathroom,” and never “a bathroom,” to a child, but that child would still be able to say “a bathroom” if he or she understood the underlying grammar.
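The grammar side of that comparison can be illustrated with a short Python sketch. It is not Yang's published model, only a minimal illustration built on assumptions that go beyond this article: noun frequencies are taken to follow Zipf's law (the noun of rank r has probability proportional to 1/r), each noun is assumed to take its favored article two-thirds of the time, and every noun phrase is treated as an independent draw. The function names and parameters are hypothetical.

```python
def zipf_probabilities(n_types):
    """Assumed Zipf's law: the noun of rank r has probability proportional to 1/r."""
    weights = [1.0 / rank for rank in range(1, n_types + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def prob_both_articles(p_noun, sample_size, p_favored=2.0 / 3.0):
    """Chance that a noun appears with both "a" and "the" in a sample.

    p_noun is the chance that any one noun phrase uses this noun.
    p_favored is the assumed chance that the noun takes its favored article.
    Inclusion-exclusion: 1 - P(favored never seen) - P(other never seen)
    + P(noun never seen at all).
    """
    p_other = 1.0 - p_favored
    no_favored = (1.0 - p_noun * p_favored) ** sample_size
    no_other = (1.0 - p_noun * p_other) ** sample_size
    never_used = (1.0 - p_noun) ** sample_size
    return 1.0 - no_favored - no_other + never_used


def expected_diversity(n_types, sample_size, p_favored=2.0 / 3.0):
    """Approximate share of the nouns used at all that show up with both articles.

    This is a ratio of two expected counts, which is a simplification of
    averaging the share over many samples.
    """
    probs = zipf_probabilities(n_types)
    both = sum(prob_both_articles(p, sample_size, p_favored) for p in probs)
    seen = sum(1.0 - (1.0 - p) ** sample_size for p in probs)
    return both / seen


if __name__ == "__main__":
    # Hypothetical sizes: 300 noun types, samples of 500 and 5,000 noun phrases.
    for size in (500, 5000):
        print(size, round(expected_diversity(300, size), 3))
```

Under these assumptions, even a speaker who freely combines articles and nouns is expected to use both articles with only a fraction of the nouns in a small sample, and that fraction grows as the sample grows. That is why low diversity in a small sample does not by itself show that a speaker lacks grammar.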

“When you compare what children should say if they follow grammar against what children do say, you find it to be almost indistinguishable,” Yang said. “If you simulate the expected diversity when a child is only repeating what adults say, it produces a diversity much lower than what children actually say.”

As a comparison, Yang applied the same predictive models to the set of Nim Chimpsky’s signed phrases, the only data set of spontaneous animal language usage publicly available. He found further evidence for what many scientists, including Nim’s own trainers, have contended about Nim: that the sequences of signs Nim put together did not follow from rules like those in human language.

Nim’s signs showed significantly lower diversity than would be expected under a systematic grammar; their diversity was similar to the level expected with memorization.

This suggests that true language learning is — so far — a uniquely human trait, and that it is present very early in development.

“The idea that children are only imitating adults’ language is very intuitive, so it’s seen a revival over the last few years,” Yang said. “But this is strong statistical evidence in favor of the idea that children actually know a lot about abstract grammar from an early age.”