Winners don't take all: Characterizing the competition for links on the web
David Pennock, Gary Flake, Steve Lawrence, Eric Glover, C. Lee Giles

New NEC study appears in Proceedings of the National Academy of Sciences, 99(8): 5207-5211, April 2002. Download the paper directly from the PNAS website in HTML or PDF formats. Or download our cached PDF or PostScript. Contact: Dr. David Pennock,

The web is changing communication

As more people use the web for more tasks, changes in how people find information can have substantial effects on competition and diversity in society.

"Winners take all" or "rich get richer"

Earlier research has shown that the distribution of links to web sites approximates a "power law" (Science, 286; Nature, 401), where a small number of sites receive the majority of links, and most sites receive very few links. The distribution has been attributed to a process called "preferential attachment", wherein new links on the web are more likely to go to sites that already have many links.

Competition varies in different categories

NEC researchers discovered that the degree of "rich get richer" or "winners take all" behavior varies across categories and may be significantly less than previously thought. They developed a new model that can be used to predict and analyze competition and diversity in different communities on the web.

Predicting e-commerce competition

NEC's new model can be used to predict the degree of "winners take all" or "rich get richer" behavior in different categories. The following are predictions for various e-commerce categories (from the Open Directory):

Category   Degree of preferential link growth
(Top of list: a small number of sites receive most of the links and traffic; harder for new entrants to compete)
Publications   0.99
Entertainment   0.96
Consumer electronics   0.94
Toys & games   0.92
Antiques & collectibles   0.86
Sports   0.76
Health   0.75
Home & garden   0.72
Jewelry   0.70
Weddings   0.65
Photographers   0.38
(Bottom of list: links and traffic more evenly distributed; easier for new entrants to compete)

The number represents the degree to which link growth is preferential (new links are created to already popular sites). Numbers close to one indicate that link growth is mostly preferential, making it more difficult for new sites to compete with already popular sites. Lower numbers indicate more balanced link growth and a greater probability of new sites gaining attention. The relative differences among the categories can be seen on the following plot. By "more competitive", we mean that competition is tougher: it is harder to compete with existing popular sites (as opposed to the economic definition, under which a category with fewer major competitors is considered less competitive). Note that greater difficulty competing with existing popular sites does not mean that substantially better newcomers cannot become popular quickly (cf. Google).
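To make the meaning of the number concrete, here is a minimal sketch. It assumes a simplified mixture rule that is not taken verbatim from the study: with probability alpha a new link picks its target in proportion to the target's current links, and otherwise it picks uniformly among all sites. The function name, the category size, and the link counts are hypothetical; only the two alpha values come from the table above.

```python
def link_probability(alpha, site_links, total_links, num_sites):
    """Chance that the next new link points to one particular site.

    Assumed mixture (a simplification, not the study's exact formula):
    with probability alpha the target is chosen in proportion to its
    current links; otherwise it is chosen uniformly among all sites.
    """
    preferential = site_links / total_links
    uniform = 1.0 / num_sites
    return alpha * preferential + (1.0 - alpha) * uniform


# Hypothetical category: 1,000 sites sharing 10,000 links.
# Compare an established site (500 links) with a newcomer (1 link).
for name, alpha in [("Publications", 0.99), ("Photographers", 0.38)]:
    established = link_probability(alpha, 500, 10_000, 1_000)
    newcomer = link_probability(alpha, 1, 10_000, 1_000)
    print(f"{name}: established/newcomer ratio = {established / newcomer:.1f}")
```

Under these assumed numbers, the established site's advantage over the newcomer is more than ten times larger at 0.99 than at 0.38, which is the sense in which a number close to one makes it harder for new sites to gain attention.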

Dynamics of information dissemination

Consider e-commerce sites, for example. The web represents a new medium for locating stores, and the dynamics of locating online stores differ from those of locating physical stores. For example, a consumer's decision to visit a physical store may be based on having driven past the store, receiving a recommendation from a local friend, or seeing local advertising. In contrast, the decision to visit an online store is strongly influenced by links to the store on the web (see below). The nature of link growth on the web can result in a substantial "first mover" advantage, where it is more difficult for new entrants to gain links and traffic when competing with established and well-linked stores. This can vary greatly between categories, depending for example on how people typically find sites in each category.

Links increasingly used in search

The number of links to web sites is closely related to the traffic that sites receive. The more links there are to a site, the more likely people surfing the web will encounter such a link. Many search engines use links when selecting and ranking sites. Search engines are often more likely to index sites with many links, and more likely to return highly-linked sites among the top ranked results for queries.

The new model

NEC's new model combines preferential and uniform attachment: each new link attaches preferentially to a highly-linked site with some probability, and to a uniformly selected site otherwise. The model parameters quantify the degree to which rich (highly-linked) sites grow richer, and how well new sites can compete. The researchers found that the new model accurately accounts for the observed link distributions of community-specific web sites, the web as a whole, and other social networks.
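The mixture of preferential and uniform attachment described above can be illustrated with a small simulation. This is a sketch under stated assumptions, not the authors' implementation: the function names, the growth schedule (one new site per step, two links per new site, two seed sites), and the "top 1% share" summary are all hypothetical choices made for illustration. Only the idea of mixing the two attachment rules with a probability alpha comes from the text.

```python
import random


def simulate_links(alpha, num_sites=2000, links_per_site=2, seed=0):
    """Grow a link network one site at a time; return inbound link counts.

    Sketch only: each new link goes to a target chosen preferentially
    (in proportion to inbound links) with probability alpha, and to a
    uniformly chosen existing site otherwise.
    """
    rng = random.Random(seed)
    inbound = [1, 1]  # two seed sites that link to each other
    # One entry per inbound link, so a uniform draw from this list
    # is a draw proportional to inbound link count.
    link_targets = [0, 1]
    for new_site in range(2, num_sites):
        inbound.append(0)
        for _ in range(links_per_site):
            if rng.random() < alpha:
                target = rng.choice(link_targets)  # preferential
            else:
                target = rng.randrange(new_site)   # uniform over older sites
            inbound[target] += 1
            link_targets.append(target)
    return inbound


def top_share(inbound, fraction=0.01):
    """Share of all links held by the most-linked `fraction` of sites."""
    ranked = sorted(inbound, reverse=True)
    top_n = max(1, int(len(ranked) * fraction))
    return sum(ranked[:top_n]) / sum(ranked)


for alpha in (0.99, 0.38):
    share = top_share(simulate_links(alpha))
    print(f"alpha={alpha}: top 1% of sites hold {share:.1%} of links")
```

In this toy setup a site with no inbound links can only be reached through the uniform branch, so lowering alpha is what gives newcomers a chance to be noticed; the high-alpha run should concentrate far more of the links in the oldest sites.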

Beyond the web, the new research can also be used to model and analyze other networks, such as the network of research paper citations or the network of US power grid connections.

Detailed examples

The example page shows detailed examples of link distributions and the new model.

