I recently got around to reading the response from Encyclopedia Britannica to the comparison of the “accuracy” of articles in Britannica and Wikipedia by Nature. It’s got me thinking about the nature of authority, accuracy, and truth.
Britannica’s objections to the Nature article arise from a different interpretation of the words “accuracy” and “error.” The refutations by Britannica fall into two general categories. The first is the disputation of certain factual statements, mostly when such facts were established by research. Here, these facts aren’t truly objective, rather, they’re a product of what a human is willing to believe based on the evidence. Different humans will draw different conclusions based on the same evidence. And then there’s the other human element: mistakes. We make them, both those of us who work for Britannica and those who work for Nature. The “error” rates Nature reported for both sources are astonishingly high. Certainly not all of these are true mistakes, maybe not even very many of them, but they exist, in every resource humans create, despite any level of editorial oversight.
Second, and more prevalent, are differing opinions among reasonable people, even experts in a given domain, about what is appropriate at what isn’t to include in text written for a given audience. Anything but the most detailed, comprehensive coverage of a subject requires some degree of oversimplification (and maybe even those as well). By some definition, all such oversimplifications are “wrong” – it’s a matter of perspective and interpretation whether or not they’re useful to make in any given set of circumstances. Truth is circumstantial, much as we hate to admit it.
I’d say the same principles apply to library catalog records. First, think about factual statements. At first glance, something like a publication date would seem to be an objective bit of data that’s either wrong or right. But it’s not that simple. There are multitudes of rules in library cataloging governing how to determine a publication date and how to format it. Interpretation of those rules is necessary, therefore often two different reasonable decisions based on them as to what the publication date is are possible. In cases where a true mistake has been made, our copy cataloging workflows require huge amounts of effort to distribute corrections among all libraries that have used the record with that mistake. Only sometimes is a library correcting a mistake able to reflect this correction in a shared version of a record, and no reasonable system exists to populate that correction to libraries that have already made their own copy of that record. The very idea of hundreds of copies of these records, each slightly different, floating around out there is ridiculous in today’s information environment. We’re currently stuck in this mode for historical reasons, and a major cooperative cataloging infrastructure upgrade is in order.
More subjective decisions are not frequently recognized as such when librarians talk about cataloging. We talk as if one would only follow the rules, the perfect catalog record would be produced, and that if two people were to just follow the same rules, they would produce identical records. But of course that’s not true. There will always be individual variation, no matter how well-written, well-organized, or complete the instructions. Librarians complain about “poor” records when subject headings don’t match their ideas of what a work is about. But catalogers don’t (and of course can’t) read every book, watch every video, or listen to every musical composition they describe. Why have we set up a system whereby we spend a great deal of duplicate effort overriding one subjective decision with another, based on only the most cursory understanding of the resources we’re describing, and keeping multiple but different copies of these records in hundreds of locations? How, exactly, does this promote “quality” in cataloging?
An underlying assumption here is that there is one single perfect cataloging record that is the best description of an item. But of course this isn’t true either. All metadata is an interpretation. The choices we make about vocabularies, level of description, and areas of focus all preference certain uses over others. I’m fond of citing Carl Lagoze’s statement that "it is helpful to think of metadata as multiple views that can be projected from a single information object." Few would argue with this statement taken alone, yet our descriptive practices don’t reflect it. It’s high time we stopped pretending that the rules are all we need, changed our cooperative cataloging models to do it truly cooperatively, and use content experts rather than syntax experts to describe our valuable resources.