The Internet Archive is the closest thing we have to a backup of the Internet. Founded by Brewster Kahle in 1996, the Internet Archive is a non-profit organization with the bold mission statement of providing “universal access to all knowledge.” It provides free access to collections of archived websites, software, games, music, images and public-domain books. The most popular part of the Internet Archive is probably the “Wayback Machine” service, which lets users enter a web address and see whether the Internet Archive holds earlier versions of that webpage. There are currently more than 445 billion webpage captures in the archive.
It was recently announced that the Internet Archive has received a $1.9 million grant from the Laura and John Arnold Foundation that will allow for a rebuild of the Wayback Machine’s code. According to press announcements, this grant will allow a very substantial upgrade of the service, making it even more useful for the researcher. Some of the planned improvements from this grant are:
- Keyword searching for websites. While this will not be keyword searching of the Internet Archive itself, it will free researchers from having to know the exact URL of a website they wish to find.
- Optimization of the crawling capabilities of the Wayback Machine. Currently about one billion webpages are captured each week, but the new code will permit an even greater number to be archived.
- Improving the playback capability of files found on media-rich and interactive websites.
- Battling “link rot” by partnering with other services to identify broken links on their sites and replace them with links to archived pages in the Internet Archive.
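For readers curious about what replacing a broken link with an archived one might look like in practice, here is a minimal Python sketch. It assumes the publicly visible Wayback Machine address pattern (`https://web.archive.org/web/<timestamp>/<original URL>`) and the Archive's availability lookup at `https://archive.org/wayback/available`. The function names and the `example.com` address are hypothetical, chosen only for illustration, and nothing here reflects the rebuilt code that the grant will fund. The sketch only builds the addresses and makes no network requests.

```python
from datetime import date
from urllib.parse import urlencode

# Assumed public address patterns; not taken from the grant announcement.
WAYBACK_REPLAY = "https://web.archive.org/web"
WAYBACK_AVAILABILITY = "https://archive.org/wayback/available"


def wayback_timestamp(day: date) -> str:
    """Format a date as the YYYYMMDD prefix used in Wayback Machine addresses."""
    return day.strftime("%Y%m%d")


def replay_url(original_url: str, day: date) -> str:
    """Build an address asking the Wayback Machine for a capture near `day`."""
    return f"{WAYBACK_REPLAY}/{wayback_timestamp(day)}/{original_url}"


def availability_query(original_url: str, day: date) -> str:
    """Build a lookup address asking whether any capture exists near `day`."""
    params = urlencode({"url": original_url, "timestamp": wayback_timestamp(day)})
    return f"{WAYBACK_AVAILABILITY}?{params}"


if __name__ == "__main__":
    broken_link = "http://example.com/page"  # hypothetical dead link
    print(replay_url(broken_link, date(2006, 1, 1)))
    print(availability_query(broken_link, date(2006, 1, 1)))
```

A site fighting link rot could check the second address first and swap in the first only when a capture is reported. Which capture is actually served depends on what the archive holds.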
Lawyers have made use of the Internet Archive in interesting ways, most notably as a method of proving how a website appeared on a certain date. There are numerous reported decisions in which attorneys have attempted to introduce print-outs of pages produced through the Wayback Machine. This has led to interesting evidentiary battles centered upon hearsay objections, proper authentication and the “best evidence” rule. Not all courts have accepted Wayback Machine results into evidence, but for certain categories of use, such as proving “prior art” in patent cases, courts have been inclined to accept this form of evidence.