The recent brouhaha (wow, I think that’s the first time I’ve ever written that word down!) over Google Print has me thinking about copyright law. I am not a lawyer. I have no legal training or education. I have, however, picked up a bit about copyright law while working in the area of digital libraries for the past five years. I think what I think I know is accurate, but hey, I’m wrong a reasonable amount of the time.
The publishers who have objected to the Google Print project say that the project violates copyright law by scanning the books in question (copying, which is the first exclusive right granted to copyright holders by section 106 of U.S. copyright law) to index them. So how is this different from Google’s Web index? Well, in creating the Web index Google caches Web pages too. Caching may not actually be the right word there – Google probably creates a copy more actively, intentionally, or permanently than Random J. User’s Web browser does. One could argue there’s some sort of difference between Google’s caching of Web pages and its scanning of page images from printed books, but it seems to me this difference is a matter of degree rather than of real substance. So if the digitization for Google Print is a copyright violation, does that mean all Web search engines are copyright violations?
Let’s take this exercise one step further. Indexes have been around for a very long time: the Readers’ Guide to Periodical Literature, Academic Search Premier, the MLA International Bibliography, and so on ad infinitum. I admit to being ignorant as to whether these more traditional indexes tend to operate with the blessing of the copyright holders (although many of them are actually produced by publishers to cover their own content), but surely not all of them do, and the library world isn’t exactly abuzz with these copyright holders crying foul. One difference is that the processing that creates these more traditional indexes is entirely an intellectual exercise (although this may no longer be true today!). Any “copying” of the work done to create the index is purely in a person’s head. Is this difference one of degree or of substance?
To go yet another step further, library catalogs use a copyrighted item to create a new representation – is there an argument there that catalog records are derivative works? Obviously we’re in danger of descending into the ridiculous here, but the need for some sort of balance is clear. The concept of balance between the rights of the creator of a work and the benefit to the public good from its use is inherent in copyright law. Too bad the specifics of maintaining this balance are in language that languishes far behind current technologies.
I think it will take a copyright challenge to a large for-profit company like Google (rather than to even the most resource-rich library) to overhaul copyright law and bring it up to date. Google seems to me to have the desire and the resources to present a reasonable defense, and to persist through a legal battle rather than settling the short-term problem through an agreement with publishers. But, as I’ve said, I’m wrong a reasonable amount of the time.