[Surfside] How Google works

Paul Makepeace bookmarks@paulm.com
Fri, 1 Feb 2002 18:58:39 -0800


A bunch of papers giving some of the theoretical background behind
Google, plus related topics. However, it's not known (at least to me)
how much of this is still in effect, or how the various methods are
used relative to one another. Fascinating stuff though!

The Anatomy of a Large-Scale Hypertextual Web Search Engine
Sergey Brin and Lawrence Page
http://www-db.stanford.edu/~backrub/google.html
http://www7.scu.edu.au/programme/fullpapers/1921/com1921.htm

Efficient Crawling Through URL Ordering
Junghoo Cho, Hector Garcia-Molina, Lawrence Page
http://www-db.stanford.edu/~cho/crawler-paper/

WSQ: Web-Supported (Database) Queries
Roy Goldman, Jennifer Widom
http://www-db.stanford.edu/wsq/

Finding near-replicas of documents on the web
Narayanan Shivakumar, Hector Garcia-Molina
http://www-db.stanford.edu/~shiva/Pubs/web.ps (PS)
http://dbpubs.stanford.edu:8090/cgi-bin/makehtml.cgi?document=1998/31 (HTML)

Copy Detection Mechanisms for Digital Documents
Sergey Brin, James Davis, Hector Garcia-Molina
http://www-db.stanford.edu/~sergey/copy.html

Et cetera:
http://www.google.com/search?q=+site:www-db.stanford.edu+google+stanford+paper

The PageRank Citation Ranking: Bringing Order to the Web
Lawrence Page, Sergey Brin, Rajeev Motwani, Terry Winograd
http://citeseer.nj.nec.com/page98pagerank.html

Seeking search engine perfection
Neil McIntosh, The Guardian (UK), January 2002
http://www.guardian.co.uk/Archive/Article/0,4273,4336874,00.html

[Thanks to Chris Devers, Chris Carline and Robin Houston for the
original pointers to most of these.]