
Internet Archive’s Wayback Machine under threat

A company whose trademark claims were thrown into question by the Internet …

If you didn't already know about the Wayback Machine, let me give you a quick introduction. Unlike a traditional search engine, the Wayback Machine takes "snapshots" of pages over time, allowing you to go back and, oh, see what Ars Technica looked like on April 22, 1999, HardOCP just a few months later, or CNET back in '96. It's fun, cool, and dare I say, a bit of history. The Wayback Machine is run by the Internet Archive project, a "non-profit that was founded to build an 'Internet library,' with the purpose of offering permanent access for researchers, historians, and scholars to historical collections that exist in digital format."

Would it surprise you that they're embroiled in a lawsuit?

Harding Earley Follmer & Frailey, a Philly-based law firm, defended a client in a trademark dispute by firing up the Wayback Machine and demonstrating that the plaintiff's claims were undone by merely looking at their web page from several years ago. Doh! As you can imagine, the plaintiff wasn't pleased at all, and now they're suing both Harding Earley and the Internet Archive for copyright infringement, and for violations of the DMCA and the Computer Fraud and Abuse Act.

On a certain level, you can see their concerns. For example, the Wayback Machine has scores of pages from Ars Technica, and we've never given them permission to reproduce an entire page of content. Then again, the Internet Archive is a non-profit providing materials for fair use purposes. Indeed, the irony of the situation is that lawyers use the Wayback Machine all the time to investigate intellectual property disputes.

But it gets more complex. Website operators can use a robots.txt file to tell the Wayback Machine not only to stop indexing the site, but also to stop showing previous versions of the site that had been indexed. The plaintiffs claim that they did this, but that someone at Harding Earley supposedly defeated it by issuing hundreds of requests to the Archive. This is how we get to the DMCA, by way of a supposed technical circumvention of measures meant to protect copyrighted material. Just how exactly this happened is a bit of a mystery to me, but one thing is for certain: robots.txt is a voluntary deal. It's not a legal notice, or a legally binding method of blocking bots and the like. It is not an access control mechanism in the slightest. No one, whether Google, Microsoft, or the Internet Archive, is legally bound to obey the contents of a robots.txt file.
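
To make the "voluntary" point concrete, here is a minimal sketch in Python of how a well-behaved crawler might consult robots.txt before fetching a page, using the standard library's `urllib.robotparser`. The rules, the example.com URLs, and the helper name `is_allowed` are made up for illustration, and `ia_archiver` is assumed here as the user-agent name the Archive's crawler answers to. This is not the Archive's actual code.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: shut out one named crawler entirely,
# and keep every other bot out of /private/ only.
ROBOTS_TXT = """\
User-agent: ia_archiver
Disallow: /

User-agent: *
Disallow: /private/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("ia_archiver", "SomeOtherBot"):
        for url in ("http://example.com/index.html",
                    "http://example.com/private/notes.html"):
            print(agent, url, is_allowed(ROBOTS_TXT, agent, url))
```

Note where the check lives: in the crawler. The file is just a polite request, and the server does nothing to enforce it. A client that never calls something like `is_allowed` can fetch the page all the same, which is why it is hard to see robots.txt as an access control.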

The case is extremely complex and plenty of salient details are missing, but one thing strikes me strongly: this is one heck of a case of sour grapes.
