Saturday, September 02, 2006

On "authority"

I recently got around to reading the response from Encyclopedia Britannica to the comparison of the “accuracy” of articles in Britannica and Wikipedia by Nature. It’s got me thinking about the nature of authority, accuracy, and truth.

Britannica’s objections to the Nature article arise from a different interpretation of the words “accuracy” and “error.” The refutations by Britannica fall into two general categories. The first is the disputation of certain factual statements, mostly when such facts were established by research. Here, these facts aren’t truly objective; rather, they’re a product of what a human is willing to believe based on the evidence. Different humans will draw different conclusions from the same evidence. And then there’s the other human element: mistakes. We make them, both those of us who work for Britannica and those who work for Nature. The “error” rates Nature reported for both sources are astonishingly high. Certainly not all of these are true mistakes, maybe not even very many of them, but true mistakes do exist in every resource humans create, whatever the level of editorial oversight.

Second, and more prevalent, are differing opinions among reasonable people, even experts in a given domain, about what is and what isn’t appropriate to include in text written for a given audience. Anything but the most detailed, comprehensive coverage of a subject requires some degree of oversimplification (and maybe even the most comprehensive coverage does as well). By some definition, all such oversimplifications are “wrong”; whether they’re useful to make in any given set of circumstances is a matter of perspective and interpretation. Truth is circumstantial, much as we hate to admit it.

I’d say the same principles apply to library catalog records. First, think about factual statements. At first glance, something like a publication date would seem to be an objective bit of data that’s either wrong or right. But it’s not that simple. There are multitudes of rules in library cataloging governing how to determine a publication date and how to format it. Those rules have to be interpreted, so two different reasonable decisions about what the publication date is are often possible. In cases where a true mistake has been made, our copy cataloging workflows require huge amounts of effort to distribute corrections among all libraries that have used the record with that mistake. Only sometimes is a library correcting a mistake able to reflect this correction in a shared version of a record, and no reasonable system exists to propagate that correction to libraries that have already made their own copy of that record. The very idea of hundreds of copies of these records, each slightly different, floating around out there is ridiculous in today’s information environment. We’re currently stuck in this mode for historical reasons, and a major cooperative cataloging infrastructure upgrade is in order.

More subjective decisions are not frequently recognized as such when librarians talk about cataloging. We talk as if, were one only to follow the rules, the perfect catalog record would be produced, and as if two people following the same rules would produce identical records. But of course that’s not true. There will always be individual variation, no matter how well-written, well-organized, or complete the instructions. Librarians complain about “poor” records when subject headings don’t match their ideas of what a work is about. But catalogers don’t (and of course can’t) read every book, watch every video, or listen to every musical composition they describe. Why have we set up a system whereby we spend a great deal of duplicate effort overriding one subjective decision with another, based on only the most cursory understanding of the resources we’re describing, and keeping multiple but different copies of these records in hundreds of locations? How, exactly, does this promote “quality” in cataloging?

An underlying assumption here is that there is one single perfect cataloging record that is the best description of an item. But of course this isn’t true either. All metadata is an interpretation. The choices we make about vocabularies, level of description, and areas of focus all favor certain uses over others. I’m fond of citing Carl Lagoze’s statement that "it is helpful to think of metadata as multiple views that can be projected from a single information object." Few would argue with this statement taken alone, yet our descriptive practices don’t reflect it. It’s high time we stopped pretending that the rules are all we need, changed our cooperative cataloging models to make them truly cooperative, and used content experts rather than syntax experts to describe our valuable resources.

2 comments:

ralph papakhian said...

hi,

i don't think it's really just a matter of one intelligent interpretation versus another intelligent interpretation. there really is a serious "dumbing down" of cataloging by reducing the value of catalogers (by hiring non-specialists to catalog specialized materials, by hiring clerical staff to "catalog" because the cataloging process is regarded by some as unprofessional). so the subjective decisions can vary considerably depending on individuals involved. it is definitely possible to have "poor" records. while it is obvious to everyone that two catalogers are not necessarily going to come up with identical catalog records, the idea that anyone's subjective decision is as good as anyone else's needs to be proved (this idea, of course, is pushed in the wiki world, and one could argue, i suppose, for post-modern cataloging!)--ralph p.

quoting jenn:
Librarians complain about “poor” records when subject headings don’t match their ideas of what a work is about. But catalogers don’t (and of course can’t) read every book, watch every video, or listen to every musical composition they describe. Why have we set up a system whereby we spend a great deal of duplicate effort overriding one subjective decision with another, based on only the most cursory understanding of the resources we’re describing, and keeping multiple but different copies of these records in hundreds of locations? How, exactly, does this promote “quality” in cataloging?

Jenn Riley said...

Ralph, you raise an excellent point when you say that the subjectivity issue has been used to justify decisions to deprofessionalize cataloging. I actually believe a better approach based on the reality of subjectivity in cataloging is to make it more professional. Let's have content experts trained a bit in using well-designed cataloging systems that take care of syntactic details and prevent simple mistakes before they happen, rather than syntax experts trying to guess at content decisions in all manner of fields. Let's use that professional expertise by implementing a shared cataloging system that gives everyone access to the expertise of one and shares the burden. Catalog records are currently falling victim to the "not invented here" syndrome - nobody else can do it as well as we can, goes the thought process. By admitting there's subjectivity involved, and re-professionalizing the intellectual, substantive parts of resource description (note: I'm not talking x p. of music vs. 1 score (x p.) here...), we can embrace workflows that allow us to accept that cataloging records created somewhere else can be as good as those we create or edit ourselves.