The Internet Archive

Web Archive

The Internet Archive is a nonprofit digital library in San Francisco, California, dedicated to providing universal access to human knowledge. This is the home of the Internet Archive's Web Archive.

Through the Wayback Machine, we offer free access to our archived collection of World Wide Web content: over 80 billion URL captures from 1996 to the present.

Collaborative Archiving

We work with national and state libraries, archives, and educational or heritage institutions to create more focused collections of web content through Archive-It and our other archiving services. Our partners include the Library of Congress and the National Archives and Records Administration (NARA).

Community Archiving

We work with libraries and cultural institutions to preserve important and spontaneous events from around the world. These collections give historians, researchers, scholars, and the general public permanent access to the record of those events. Examples include the 2004 Asian Tsunami Archive and the 2005 Hurricane Katrina Collection.

Open Source

Much of the software we use for collecting web archives and offering access to them is available as open source. See the project pages for the Heritrix archive crawler and archive-access.

News and Commentary

For the latest in our world of web archiving, follow our Web Archive Blog.

Around the World in 2 Billion Pages

With a grant from the Mellon Foundation, the Internet Archive completed a 2 billion page web crawl in 2007, the largest we've ever done in-house or with open source tools. The 2 Billion Page Collection is now accessible.