Zipf's Mysterious Law

Around 1935, the American linguist George Zipf noted that when words are listed in descending order of how often they are used across different contexts, the frequency of the first word on the list tends to be (approximately) 2 times that of the second word, 3 times that of the third, and so on.

For example, the three most used words in English are the article “the”, the preposition “of” and the conjunction “and”, where “the” appears 1.42 times more often than “of” and 2.42 times more often than “and”.
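The rank-and-frequency pattern described above can be checked on any text. Here is a minimal sketch in Python, assuming a naive tokenizer (runs of lowercase letters and apostrophes) and a hypothetical helper name, `rank_frequency_ratios`; neither comes from the original studies. The toy text is built by hand so that it follows the ideal pattern exactly.

```python
import re
from collections import Counter


def rank_frequency_ratios(text, top=3):
    """Return (word, count, ratio) for the `top` most frequent words.

    ratio = (count of the most frequent word) / (count of this word).
    Under an ideal Zipf distribution the ratio at rank k is close to k.
    """
    # Naive tokenization (an assumption): lowercase letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    ranked = Counter(words).most_common(top)
    if not ranked:
        return []
    first_count = ranked[0][1]
    return [(word, count, first_count / count) for word, count in ranked]


# Toy text constructed so that the counts are 6, 3 and 2.
sample = "the " * 6 + "of " * 3 + "and " * 2
for rank, (word, count, ratio) in enumerate(rank_frequency_ratios(sample), start=1):
    print(rank, word, count, round(ratio, 2))
```

On a real corpus the ratios will only approximate 1, 2, 3, and so on, which is what the figures quoted for English illustrate.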

In fact, this curious behavior had already been pointed out before, by the French stenographer Jean-Baptiste Estoup and by the German physicist Felix Auerbach, and it is not a privilege of the English language either: it applies to all known languages, including artificial languages such as Esperanto.

What is more, it is not restricted to the domain of linguistics: the same type of distribution occurs in lists of data from quite different sources. One of the most studied cases, pointed out by Auerbach in 1913, concerns the sizes of cities.

For example, when we list Brazilian cities in descending order of population, we observe that the largest (São Paulo) is 1.92 times larger than the second (Rio de Janeiro) and 2.92 times larger than the third (Brasília).
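To see how close these figures come to the ideal pattern, one can compare each quoted ratio with the rank it belongs to. The sketch below assumes the article's figures are read as the ratio between the first item and the item at rank k; the function name `relative_deviation` is hypothetical and chosen only for illustration.

```python
def relative_deviation(rank, observed_ratio):
    """Relative gap between an observed first-to-rank-k ratio and the
    ideal Zipf value, which is k itself (0.0 means a perfect match)."""
    return (observed_ratio - rank) / rank


# Ratios quoted in the article, keyed by rank.
# Assumption: each figure means (value at rank 1) / (value at rank k).
examples = {
    "English words": {2: 1.42, 3: 2.42},
    "Brazilian cities": {2: 1.92, 3: 2.92},
}

for name, ratios in examples.items():
    for rank, observed in ratios.items():
        deviation = relative_deviation(rank, observed)
        print(f"{name}, rank {rank}: observed {observed}, "
              f"ideal {rank}, deviation {deviation:+.0%}")
```

Read this way, the city figures sit much closer to the ideal values of 2 and 3 than the word figures do.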

The first attempt to explain this very curious phenomenon mathematically was due to Zipf himself. He started from the principle that both speaker and listener want to make as little effort as possible in communication, and used statistical arguments to conclude that this would lead to the type of frequency distribution predicted by the law. However, it is not clear how this idea could be extended to other instances of Zipf’s law, outside of linguistics.

Other possible scientific explanations have been proposed over the years, but the validity of Zipf’s law remains a mystery. In part, this is because, unlike most mathematical statements, this law is only approximately correct: the frequencies of words in a language, the populations of cities and other similar data have a complex behavior, which Zipf’s law reflects only in a rough way.
