How Google works

Paul Makepeace bookmarks@paulm.com
Fri, 1 Feb 2002 18:58:39 -0800

A bunch of papers providing some info on the theoretical background
behind Google, and related topics. However, it's not known (at least to
me) how much of this is still in effect, or how the various methods are
used relative to one another. Fascinating stuff though!

The Anatomy of a Large-Scale Hypertextual Web Search Engine
Sergey Brin and Lawrence Page

Efficient Crawling Through URL Ordering
Junghoo Cho, Hector Garcia-Molina, Lawrence Page

WSQ: Web-Supported (Database) Queries
Roy Goldman, Jennifer Widom

Finding near-replicas of documents on the web
Narayanan Shivakumar, Hector Garcia-Molina
http://www-db.stanford.edu/~shiva/Pubs/web.ps (PS)
http://dbpubs.stanford.edu:8090/cgi-bin/makehtml.cgi?document=1998/31 (HTML)

Copy Detection Mechanisms for Digital Documents
Sergey Brin, James Davis, Hector Garcia-Molina

Et cetera:

The PageRank Citation Ranking: Bringing Order to the Web
Lawrence Page, Sergey Brin, Rajeev Motwani, Terry Winograd

Seeking search engine perfection
Neil McIntosh, Guardian (UK) (Jan 2002)

[Thanks to Chris Devers, Chris Carline and Robin Houston for original
pointers to most of these.]