Distribution of Words on the World-Wide Web
WEI Fang-Ping, LI Sheng, MA Hong-Ru
Institute of Theoretical Physics, Shanghai Jiao Tong University, Shanghai 200240

Cite this article: |
WEI Fang-Ping, LI Sheng, MA Hong-Ru 2005 Chin. Phys. Lett. 22 762-764 |
Abstract: Words from all kinds of languages are added to the World-Wide Web in an extremely complex and seemingly arbitrary manner. Behind this apparently arbitrary topology, as we show here, there is a hidden order in the word network. Using the Google search engine, we find that the distributions of basic English words and Chinese characters on the Web follow a universal power law. The power-law exponent of the rank-ordered frequency distribution is α ~ 0.99 for basic English words and α ~ 0.98 for Chinese characters. Zipf's law and the page-size distribution on the Web are used to explain this phenomenon.
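
As an illustration of what a rank-ordered power-law exponent means, the following minimal sketch estimates α in f(r) ∝ r^(-α) by ordinary least squares on log-log axes. It is not the authors' procedure: the function name `fit_zipf_exponent` is hypothetical, the fitting method is an assumption, and the counts are synthetic values generated from an exact power law rather than search-engine data.

```python
import math


def fit_zipf_exponent(counts):
    """Estimate alpha in f(r) ~ r**(-alpha) via least squares on log-log axes.

    `counts` holds positive occurrence counts, one per word; they are ranked
    here in decreasing order (rank 1 = most frequent).
    """
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    if len(ranked) < 2:
        raise ValueError("need at least two positive counts")

    xs = [math.log(r) for r in range(1, len(ranked) + 1)]  # log rank
    ys = [math.log(c) for c in ranked]                      # log frequency

    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))

    slope = sxy / sxx   # slope of log f against log r
    return -slope       # the exponent alpha is the negated slope


# Synthetic counts (an assumption for illustration, not measured data):
# an exact power law with exponent 0.99 over 1000 ranks.
synthetic_counts = [1.0e9 * r ** -0.99 for r in range(1, 1001)]
alpha = fit_zipf_exponent(synthetic_counts)
print(f"estimated alpha = {alpha:.4f}")
```

On real page-count data the points scatter around the line, so the fitted value depends on the range of ranks included; a straight-line fit on log-log axes is only the simplest of several possible estimators.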

Keywords: 89.20.Hb, 89.75.Hc, 89.75.Da

Published: 01 March 2005