Brewster Kahle founded The Internet Archive in his San Francisco attic in 1996. By 2009, the project needed more space.
He tracked down a real estate listing on Funston Avenue in the outer Richmond and went to visit. The defunct Christian Science church made immediate sense. As Kahle describes it:
We bought this building because it matched our logo.
The Archive moved into its new home, where it currently holds (at last count according to Wikipedia) over 20 million books, 3 million videos, 400,000 software programs, 7 million audio files, and 400 billion web pages. Web pages are collected and accessed through The Wayback Machine, which is a kind of Google-plus that lets a user not only search a specific website *now* but also access *previous* versions. It is an impossible and surely Sisyphean task, yet The Internet Archive crawls the ever-changing World Wide Web, making backups of the digital material it collects en route. Much of this data lives physically in the former church on custom-designed storage clusters called PetaBox.
This all reminds me of the story about painting the Golden Gate Bridge (the entrance to which is not far from The Internet Archive). Because the Golden Gate has such intense fog and weather, repainting the bridge is a continuous task. When a painting crew has reached the far side, they start again moving in the opposite direction. Painting the Golden Gate its distinctive International Orange is a never-ending job.
This story sounds apocryphal, but a quick Google search confirms it.
Anyway, what The Internet Archive does is, in fact, impossible. The internet is massive and constantly changing. Any attempt to collect a complete picture of it, like completely repainting the Golden Gate Bridge, is not possible at any one time. However, a partial picture is still massively valuable. Brewster Kahle describes what he imagines like this:
The idea was to try to build the Library of Alexandria, version two.
The Library of Alexandria was burned and much of the written material of the ancient Mediterranean world was lost forever. The Internet Archive was conceived to avoid this fate and has more than 40 petabytes of digital data stored across redundant data centers on what amounts to simply a lot of hard drives.
November 3, 2020
A Live Archive
The-Cobweb.pdf (Jill Lepore)