Creuset of Ideas

The minimum

2007-03-01 @ 4:22

I was discussing machine translation (MT) with a friend of mine, a fellow linguist, the other day, trying to see how a computer could acquire enough information to do a fairly accurate job. The big thing, of course, is meaning. But from an MT point of view, this pretty much amounts to mapping one language onto another (I’m simplifying, of course). This brought up the big question: what is needed to learn a language?

What does a child need, a priori, to be able to start learning, to pick up its first language? In other words, what innate knowledge or ability is required, as a bare minimum (besides, of course, the ability to process inputs from our senses)?

First of all, I’d say we need the notion of communication, that the sounds mean something. Or does observation tell us that? Do we only need to see (hear) that specific sound patterns are related to interactions?

Pattern recognition is a pretty obvious choice for a bare requirement: the ability to extract patterns from the flow of sounds. Even before birth, the foetus can distinguish language from noise, and even recognize the sounds of its mother tongue. Studies have shown that the foetus is equipped to recognize patterns, be they (for hearing) in music or language. By three and a half months, babies can separate the words in a sentence; the phonological word comes before meaning. The brain sees the physical patterns (sound waves) before tackling conceptual ones (meanings).
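To make the segmentation idea concrete: one way a learner (human or machine) might pull word boundaries out of an unbroken stream is by tracking how predictably one syllable follows another, in the spirit of the infant segmentation studies. The syllables and mini-corpus below are invented purely for illustration:

```python
# Toy sketch of statistical word segmentation: track how predictably one
# syllable follows another; dips in predictability mark likely word
# boundaries. Syllables and the tiny corpus are invented for illustration.
from collections import Counter

# An unsegmented syllable stream (imagine "preti" and "gola" are words).
stream = "pre ti go la pre ti be bi pre ti go la be bi pre ti".split()

syllables = Counter(stream)               # how often each syllable occurs
pairs = Counter(zip(stream, stream[1:]))  # how often each adjacent pair occurs

def tp(a, b):
    """Transitional probability that syllable b follows syllable a."""
    return pairs[(a, b)] / syllables[a]

# Within a word, the transition is fully reliable...
print(tp("pre", "ti"))  # 1.0
# ...across a word boundary, it is not, so the probability dips.
print(tp("ti", "go"))   # 0.5
```

Nothing here requires knowing what any "word" means; the boundaries fall out of the sound patterns alone, which fits the observation that the phonological word comes before meaning.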

Generalization and particularization: the ability to arrive at a general concept from observation of particular instances and, conversely, to extract an instance from a general idea. This is closely related to pattern recognition.

Am I missing something? Do we need anything else?

2 comments to “The minimum”

  1. Context, context, context. Language is more than a string of defined words with rules for how they are used together. Every sentence is related to its surroundings, both to the other words around it and to the bigger picture. For example, this comment would be much harder to parse if it were read in isolation from the blog post above. Mechanical (computer) translation is light years away from being able to analyze the bigger picture and produce elegant, accurate translations.

  2. Of course, context is key; it is how we learn language, by being bathed in the context of its use (I actually should have added that to the notion of communication); thanks for reminding me.

    I agree, it is a big problem in machine translation. The same word can have widely different senses depending on context. A researcher I know used to give the example of the word “Dallas” which, in a sentence like “Since Dallas, no American president has been assassinated,” refers to a specific historical event that cannot be inferred from a dictionary meaning. This sort of metaphorical use, which is more frequent than we might think, will always be one of the downfalls of MT.
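    The simpler half of the problem, picking a word’s sense from the words around it, can at least be sketched in a few lines. The senses and cue lists below are invented for illustration, and of course nothing this crude would handle the “Dallas” case, which needs world knowledge rather than nearby words:

```python
# Toy word-sense chooser: score each candidate sense of "bank" by how many
# of its cue words appear in the sentence. Senses and cue words are
# invented for illustration; real disambiguation is far harder.
CUES = {
    "financial": {"deposit", "loan", "money", "account"},
    "river":     {"water", "fishing", "shore", "mud"},
}

def sense_of_bank(sentence):
    words = set(sentence.lower().split())
    # Count the overlap between the sentence and each sense's cue words.
    scores = {sense: len(words & cues) for sense, cues in CUES.items()}
    return max(scores, key=scores.get)

print(sense_of_bank("She opened an account at the bank"))   # financial
print(sense_of_bank("We sat on the bank fishing all day"))  # river
```

    A plain dictionary gives both senses with no way to choose; even this crude overlap count does better, which is exactly why context matters so much for MT.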
