Feed: Boy Genius Report
Posted on: Wednesday, June 09, 2010 10:11 AM
Author: Andrew Munchbach
Subject: Caffeine: Google's new search index
Today, Google announced the completion of its new web indexing system, titled Caffeine. Google boasts that Caffeine "provides 50 percent fresher results for web searches than our last index, and it's the largest collection of web content we've offered." Google goes on to explain just how its old index and Caffeine differ:

"Our old index had several layers, some of which were refreshed at a faster rate than others; the main layer would update every couple of weeks. To refresh a layer of the old index, we would analyze the entire web, which meant there was a significant delay between when we found a page and made it available to you."

It all sounds good to us. After all, we like to keep it fresh from time to time. Google also indulges us with some fairly mind-numbing statistics about just how fast Caffeine actually indexes the web:

"Caffeine lets us index web pages on an enormous scale. In fact, every second Caffeine processes hundreds of thousands of pages in parallel. If this were a pile of paper it would grow three miles taller every second. Caffeine takes up nearly 100 million gigabytes of storage in one database and adds new information at a rate of hundreds of thousands of gigabytes per day. You would need 625,000 of the largest iPods to store that much information; if these were stacked end-to-end they would go for more than 40 miles."
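
For anyone who wants to sanity-check the iPod comparison, here is a rough back-of-the-envelope sketch in Python. It uses only the figures quoted above (100 million gigabytes, 625,000 iPods, 40 miles); the function names are our own, and the rounding of "nearly" and "more than" to exact values is an assumption, so treat the results as approximate.

```python
# Back-of-the-envelope check of the quoted figures.
# Assumption: "nearly 100 million gigabytes" and "more than 40 miles"
# are treated as exact values, so the results are only approximate.

TOTAL_STORAGE_GB = 100_000_000   # storage quoted for the Caffeine index
DEVICE_COUNT = 625_000           # number of "largest iPods" quoted
STACK_MILES = 40                 # quoted length of the end-to-end stack
METERS_PER_MILE = 1609.344       # international mile, by definition


def implied_capacity_gb(total_gb, devices):
    """Storage each device would need to hold for the comparison to work."""
    return total_gb / devices


def implied_length_cm(miles, devices):
    """Length of each device if the stack spans the given distance."""
    return miles * METERS_PER_MILE * 100 / devices


capacity = implied_capacity_gb(TOTAL_STORAGE_GB, DEVICE_COUNT)
length = implied_length_cm(STACK_MILES, DEVICE_COUNT)

print(f"Implied capacity per iPod: {capacity:.0f} GB")
print(f"Implied length per iPod: {length:.1f} cm")
```

The numbers hang together: the comparison implies a device holding 160 GB that is a little over 10 cm long.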