Sunday, August 28, 2005

Musings on the state of copyright

The recent brou-ha-ha (wow, I think that’s the first time I’ve ever written that word down!) over Google Print has me thinking about copyright law. I am not a lawyer. I have no legal training or education. I have picked up a bit about copyright law while working in the area of digital libraries for the past five years, however. I think what I think I know is accurate, but hey, I'm wrong a reasonable amount of the time.

The publishers who have objected to the Google Print project say that the project violates copyright law by scanning the books in question (copying, which is the first exclusive right granted to copyright holders by section 106 of U.S. copyright law) to index them. So how is this different than Google’s Web index? Well, in creating the Web index Google caches Web pages too. Caching may not actually be the right word there – Google probably more actively, intentionally, or permanently creates a copy than Random J. User’s Web browser does. One could argue there’s some sort of difference between the caching done by Google of Web pages and scanning page images of printed books, but it seems to me this difference is a matter of degree rather than of real substance. So if the digitization for Google Print is a copyright violation, does that mean all Web search engines are copyright violations?

Let’s take this exercise one step further. Indexes have been around for a very long time: the Readers’ Guide to Periodical Literature, Academic Search Premier, the MLA International Bibliography, and so on ad infinitum. I admit to being ignorant as to whether these more traditional indexes tend to operate with the blessing of the copyright holders (although many of them are actually produced by publishers to cover their content), but surely not all of them do, and the library world isn’t exactly abuzz with these copyright holders crying foul. One difference is that the processing that creates these more traditional indexes is entirely an intellectual exercise (although this may no longer be true today!). Any “copying” of the work done to create the index is purely in a person’s head. Is this difference one of degree or of substance?

To go yet another step further, library catalogs use a copyrighted item to create a new representation – is there an argument there that catalog records are derivative works? Obviously we’re in danger of descending into the ridiculous here, but the need for some sort of balance is clear. The concept of balance between the rights of the creator of a work and the benefit to the public good from its use is inherent in copyright law. Too bad the specifics of maintaining this balance are in language that languishes far behind current technologies.

I think it will take a copyright challenge to a large for-profit like Google (rather than to even the most resource-rich library) to overhaul copyright law, to bring it up to the times. Google seems to me to have the desire and the resources to present a reasonable defense, and persist through a legal battle rather than settling the short-term problem through an agreement with publishers. But, as I’ve said, I’m wrong a reasonable amount of the time.


Thom said...

Lots of things are a matter of degrees when it comes to law. (Ok, this piece of wisdom came from an episode of the West Wing) ;-)

I see Google's original scanning of the copyrighted work as more of a copyright violation than the caching/indexing. Even if the copy is stored on their servers and is never made public, their status as a "for-profit" company along with other reasons will not give them the fair use protection that a university might get. Furthermore, I could see the universities getting in trouble for allowing/encouraging this use of their materials, which goes far beyond the guidelines for fair use, for educational institutions, or for libraries and archives.

That's just my opinion--and as always, I remain a non-lawyer.


walt said...

Also not a lawyer, but Thom gets it exactly right as far as I can see: They're creating a complete copy of each copyrighted work, they're a for-profit company, and there's no fair-use defense that I know of in that case.

Indexers aren't copying items; they're creating new works that reference the items. Same with catalogers.

When you copy someone else's cataloging, you're doing so through a series of licenses and agreements inherent in your membership or agreement with OCLC, RLG, or another agency--or, in the case of LC, taking advantage of the fact that works created by the government are not subject to copyright within the U.S.

Simon Chamberlain said...

Also not a lawyer, but it was my understanding that abstracts are covered by fair use, and so can be reproduced. Other bibliographic information (author, title, etc.) is surely simply factual, and therefore non-copyrightable (like the phone book).

Thom's comments make sense to me, too.

Anonymous said...

Abstracts are not covered by fair use. Either the indexer has to get permission from the publisher, or produce their own. Google "abstracts" & "copyright" for examples.