CIO en VO: Google's index passes the trillion mark (US version)


Edition of 29/07/2008 - by IDG News Service

The index of the Google search engine has passed one trillion URLs on the short scale (used in English-speaking countries), that is, 10 to the power of 12, or 1,000,000,000,000, or 1,000 billion pages! In Europe, on the long scale, we would call it a billion...

Google Counts More Than 1 Trillion Unique Web URLs
By Juan Carlos Perez

In a discovery that would probably send the Dr. Evil character of the "Austin Powers" movies into cardiac arrest, Google recently detected more than a trillion unique URLs on the Web.

This milestone awed Google search engineers, who are seeing the Web growing by several billion individual pages every day, company officials wrote in a blog post Friday.

In addition to announcing this finding, Google took the opportunity to promote the scope and magnitude of its index.

"We don't index every one of those trillion pages -- many of them are similar to each other, or represent auto-generated content ... that isn't very useful to searchers. But we're proud to have the most comprehensive index of any search engine, and our goal always has been to index all the world's data," wrote Jesse Alpert and Nissan Hajaj, software engineers in Google's Web Search Infrastructure Team.

It had been a while since Google had made public pronouncements about the size of its index, a topic that routinely generated controversy and counterclaims among the major search engine players years ago.

Those days of index-size envy ended when it became clear that most people rarely scan more than two pages of Web results. In other words, what matters is delivering 10 or 20 really relevant Web links, or, even better, a direct factual answer, because few people will wade through 5,000 results to find the desired information.

It will be interesting to see if this announcement from Google, posted on its main official blog, will trigger a round of reactions from rivals like Yahoo, Microsoft and Ask.com.

In the meantime, Google also disclosed interesting information about how, and how often, it analyzes these links.

"Today, Google downloads the web continuously, collecting updated page information and re-processing the entire web-link graph several times per day. This graph of one trillion URLs is similar to a map made up of one trillion intersections. So multiple times every day, we do the computational equivalent of fully exploring every intersection of every road in the United States. Except it'd be a map about 50,000 times as big as the U.S., with 50,000 times as many roads and intersections," the officials wrote.

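The map analogy in the quote can be checked with simple arithmetic. The sketch below divides the one trillion URLs by the quoted factor of 50,000 to see how many intersections that would imply for a U.S. road map. The resulting figure is only implied by the quote, not stated by Google, and the function and constant names are hypothetical.

```python
# Figures taken from the quoted blog post.
TOTAL_URLS = 10**12      # one trillion on the short scale
SCALE_FACTOR = 50_000    # "about 50,000 times as big as the U.S."


def implied_us_intersections(total_urls: int = TOTAL_URLS,
                             scale_factor: int = SCALE_FACTOR) -> int:
    """Intersections a U.S. map would need for the analogy to hold.

    This is an inference from the quote, not a number Google published.
    """
    return total_urls // scale_factor


if __name__ == "__main__":
    print(f"{implied_us_intersections():,}")  # prints 20,000,000
```

In other words, the analogy assumes a U.S. road map with roughly 20 million intersections.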