Thursday, October 06, 2005

User-contributed metadata

OCLC recently announced the Wiki WorldCat pilot service. What a fantastic idea! Too bad I'm having trouble trying it out. I looked at a few books in Open WorldCat (via Google), including this one that I read recently and the book shown on the Wiki WorldCat page (The Da Vinci Code), and I didn't see the reviews tab or the links to add a table of contents or a note shown on the project page. Hmm. I wonder what I'm missing.

But, anyhoo... incorporating user-contributed metadata into library systems is something I've been thinking about for a while. Librarians tend to be pretty wedded to the notion of authority: that as curators of knowledge we're the folks best qualified to document bibliographic information. Assuming for a moment that this is true for some data elements, there are still several classes of data that could easily benefit from end-user involvement.

The first is detailed information from specialized domains. I work on a number of projects related to music. Information such as exactly which members of a jazz combo play on any given piece on a CD or the date of composition of a relatively obscure work is the sort of thing our catalogs could be providing to serve as research systems instead of just finding systems. But this sort of metadata is expensive to create; it requires research and domain expertise on the part of the cataloger. Many of our users, however, do have this specialized knowledge and love to share it.

Other information that end-users might appropriately supply includes tables of contents, the instrumentation of a musical work, the language of a text, and other "objective" information of this type. Before you say, "But what about standard terminology, spelling, capitalization?!?" in a panicked voice, consider basic interface capabilities in 21st-century systems such as picking values from provided lists rather than typing them in.
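To make the pick-from-a-list idea a bit more concrete, here's a minimal Python sketch of how a system might check a user's instrumentation choices against a controlled list. Everything in it is made up for illustration: the INSTRUMENTS vocabulary, the function names, and the normalization rule are assumptions, not part of any real catalog system.

```python
# Hypothetical controlled vocabulary; a real system would load an
# established term list rather than hard-coding five values.
INSTRUMENTS = {"piano", "double bass", "drums", "tenor saxophone", "trumpet"}


def normalize(value):
    """Collapse runs of whitespace and lowercase the value."""
    return " ".join(value.split()).lower()


def accept_instrumentation(selected):
    """Split a user's selections into accepted terms and rejected entries.

    Accepted terms are returned in normalized form without duplicates;
    rejected entries are returned exactly as the user supplied them.
    """
    accepted, rejected = [], []
    for raw in selected:
        term = normalize(raw)
        if term in INSTRUMENTS:
            if term not in accepted:
                accepted.append(term)
        else:
            rejected.append(raw)
    return accepted, rejected
```

With a list like this behind a drop-down, spelling and capitalization stop being the contributor's problem: anything that isn't in the vocabulary is set aside rather than stored.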

But should we restrict ourselves to these more obvious elements? I've been hoping for some time to be able to test various degrees of vetting of user-contributed metadata in a digital library system. I have in mind a completely open Wiki-type system, one that simply sends a suggestion to a cataloger, and a number of options in between. I suspect the quality of the user-contributed metadata will be overall much higher than critics assume. Yet even if it isn't, what sort of trade-off between quality and quantity are we willing to make? Traditional cataloging operations don't have extensive quality control processes, perhaps because QC is expensive work. And catalogers make mistakes, every day, just like the rest of us. Assuming a system where users can correct errors, how quickly will errors (made by a cataloger or by another end-user) be found and corrected? Will the "correct" data win out in the end? Surely these issues are worth a serious look.
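For what it's worth, the range of vetting options could be sketched in a few lines of Python. This is only an illustration under my own assumptions: the three policy names, the Record structure, and the notion of a "trusted" user list as one possible in-between option are all hypothetical, not a description of any existing system.

```python
from dataclasses import dataclass, field
from enum import Enum


class Vetting(Enum):
    OPEN = "open"        # wiki-style: every edit goes live immediately
    TRUSTED = "trusted"  # in between: only listed users publish directly
    SUGGEST = "suggest"  # every edit waits in a queue for a cataloger


@dataclass
class Record:
    fields: dict = field(default_factory=dict)   # live metadata
    pending: list = field(default_factory=list)  # suggestions awaiting review
    history: list = field(default_factory=list)  # prior values, for reverting


def contribute(record, policy, name, value, user, trusted=frozenset()):
    """Apply or queue a user's edit according to the vetting policy."""
    goes_live = policy is Vetting.OPEN or (
        policy is Vetting.TRUSTED and user in trusted
    )
    if goes_live:
        # Remember the value being replaced so an error can be undone later.
        record.history.append((name, record.fields.get(name), user))
        record.fields[name] = value
        return "published"
    record.pending.append((name, value, user))
    return "queued"
```

Keeping the replaced value in a history list is what would let a later user or cataloger undo a bad edit, and it would also make it possible to measure how quickly errors actually get corrected.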


Anonymous said...

Yes, the page says September, but we didn't make it. I understand it will be 'real soon'.

--Thom Hickey

Jenn Riley said...

Aha, thanks for the clarification, Thom! I'm anxious to see this in action, and I'm sure I'm not the only one. Just from the explanation, though, it really looks like a fantastic resource.


Lorcan Dempsey said...

Check out the review of On Beauty.