Sunday, March 27, 2005

Random thoughts on XOBIS

Kevin Clarke, one of the authors of XOBIS, kindly left a comment on my recent blog post on the topic. It shamed me into returning to the XOBIS general overview document I peeked at briefly when originally writing about it. I've now given the entire document a quick read. I can't claim to have an in-depth understanding of it at this point; it certainly took me several readings of the FRBR report and a decent amount of time thinking about modeling different things in FRBR before I felt I could really say anything intelligent about it. Nevertheless, I have a few initial impressions on XOBIS.

The most obvious difference I see between XOBIS and FRBR is that XOBIS attempts to be a model that can describe all of knowledge, while FRBR limits itself to modelling bibliographic relationships. In a practical sense, for recording bibliographic data (and this certainly isn't the only possible use of XOBIS!) this means that XOBIS explicitly handles entities that represent in a bibliographic environment creators or subjects of bibliographic items (and, in FRBR, other Group 1 entities), currently residing in a relatively unstructured way in name and subject authority files. FRBR, on the other hand, considers only briefly its Group 2 ("person" and "corporate body") and Group 3 ("concept," "object," "event," and "place") entities, focusing instead on Group 1 entities.

Relationships between entities are a key feature of XOBIS; they are also a bit confusing to me on my first read. My initial impression is that the relationships as specified focus more on subject-type relationships than on relationships among bibliographic items. My reading is that the XOBIS definition of work is much closer to what we currently consider a bibliographic item than to FRBR's work. The discussion and examples in the overview document talk about versions of works and how they are related, but I saw much less about the "accidental" sort of relationship a FRBR-ish work (as it's expressed in a specific manifestation) would have to another expressed work on the same manifestation, for example, two symphonies appearing on the same CD. It would be an interesting exercise to map out how the XOBIS model would handle this sort of situation, where the symphony itself is the entity of primary interest to a majority of end-users rather than the specific performance or the title of the CD on which it appears.
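To make the two-symphonies case concrete, here is a rough sketch of how it looks on the FRBR side, using three of the Group 1 entities (work, expression, manifestation). The class names follow FRBR's vocabulary, but the fields, the helper functions, and the sample titles are all my own hypothetical simplifications, not anything taken from the FRBR report or from XOBIS.

```python
from dataclasses import dataclass, field


@dataclass
class Work:
    """The abstract creation, e.g. a symphony."""
    title: str


@dataclass
class Expression:
    """One realization of a work, e.g. a particular performance."""
    work: Work
    description: str


@dataclass
class Manifestation:
    """A published carrier, e.g. a CD, embodying one or more expressions."""
    title: str
    expressions: list = field(default_factory=list)


def works_on(manifestation):
    """List the titles of the works that happen to share this manifestation."""
    return [e.work.title for e in manifestation.expressions]


def manifestations_of(work, manifestations):
    """Find every manifestation carrying some expression of the given work."""
    return [
        m.title
        for m in manifestations
        if any(e.work is work for e in m.expressions)
    ]


# Hypothetical sample data: two symphonies issued together on one CD.
symphony_a = Work("Symphony A")
symphony_b = Work("Symphony B")
cd = Manifestation(
    "Hypothetical CD",
    [
        Expression(symphony_a, "live recording"),
        Expression(symphony_b, "studio recording"),
    ],
)

print(works_on(cd))
print(manifestations_of(symphony_a, [cd]))
```

The point of the sketch is that the user's entry point is the symphony, and the CD is reached from it; the relationship between the two symphonies exists only because they share a carrier.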

XOBIS comes out of the Medlane project of the Lane Medical Library at Stanford. I wonder what effect medical materials have had on the development of the XOBIS model. I know my focus on musical materials in various projects, most notably Variations2, certainly strongly affects my thinking about FRBR and related efforts. I'm sure that's obvious from my earlier question wondering how XOBIS would handle a situation that the Variations2 model is designed around.

There are also some very interesting items in the report's bibliography, including a project mailing list (renamed since the version listed here, and looks low-traffic). Time for citation chasing!

Wednesday, March 23, 2005

Postcoordinated subject headings

There has been an interesting discussion on the Autocat mailing list over the last two weeks (well it's died down now, but I haven't gotten around to writing about it yet...) with the subject "The inadequacies of subject headings." The discussion has centered on a few posters questioning whether the LCSH-style focus on precoordinated headings is really a good idea. Several posters proposed (not all by name) postcoordinated headings as more useful, both for end-users and for catalogers. More than one person mentioned the large amount of training required for catalogers to effectively apply headings from a precoordinated system.

I was struck in the discussion by the widespread lack of big-picture thinking about the issue, and the corresponding lack of awareness of the many initiatives going on in this area. There were certainly some members contributing to the discussion who have spent some time thinking about this issue, but many seemed afraid of the idea. I got the sense that many folks were trained on LCSH, that's what they use, and why in the world would they want to use anything else? When posts mentioned specific postcoordinated schemes (FAST, AAT, etc.) they tended to be mentioned as something the person had heard of but never used and didn't fully understand. I'm generalizing a bit here, but that tone was definitely present.

I don't know that I have anything concrete to say other than that I've noticed a trend of resistance to non-LCSH subject systems, but I do think that as catalogers are increasingly being asked to be metadata experts (and by that I mean metadata in a broad sense, not just traditional cataloging practice!) they'll more and more need to know about what vocabularies are out there. A huge part of my job as a Metadata Librarian is choosing among the various data structure and data content standards available for a given implementation. We're definitely past the days when one size (MARC/AACR/LCSH) fits all. The more all sorts of librarians learn about alternatives and can make good decisions about when they're appropriate to use, the better off our whole profession will be.

Sunday, March 20, 2005

"We're not competing with Google"

I was in a meeting recently that had as an agenda item Google Print and its effect on our current library services. (I seem to be having this meeting a lot lately.) I was by far the youngest in the room and by far the attendee working most frequently in areas outside of "traditional" librarianship (whatever that means). I intentionally spent most of the meeting listening rather than talking. One statement in particular made by someone in the room struck me and started me thinking a great deal: "We're not competing with Google."

I didn't respond to it at the time, but the statement has been churning around in my head ever since. Whether or not it's true depends, of course, on what one means by "competing." If we mean, "attempting to do exactly the same thing," then that's pretty much true. While we're both in the information business, the way in which we approach it is fundamentally different. And that's OK. But if we mean "fighting for the attention of users" or "fighting for the perception that we provide valuable services worth funding," then maybe we are competing with Google. The differences in our missions and in the ways we go about achieving them are important to us, but perhaps they're too subtle for a large proportion of the population. Certainly there are lots of folks out there who think Google can and will replace libraries, even if we think they're wrong.

So what does this mean? Well, I think it means that libraries need to continue to promote what we do and why. Not in the preachy Michael Gorman style proclaiming from on high to the masses that libraries are the cornerstone of high civilization and those who disagree aren't worth thinking about, but rather by building and delivering services that meet our users' needs. In the rapidly changing information environment, this means we do need to be rethinking how we do a lot of what we do. Let's remember our core principles of preservation, collocation, and free access, and find new ways to implement these in today's environment and for today's diverse users.

Wednesday, March 16, 2005

A DC frustration

I had another one of those <sigh> moments about Dublin Core today. I've got some really amazingly simple bibliographic data I need to put within a METS document. At first I said, "Hey, let's just use DC. It will be easy." (Note to self: anytime you say "It will be easy," you're asking for trouble.) Everything was going along swimmingly until I was thinking about boilerplate text to put in all the records. One of these pieces of text would be to indicate the department at my institution that housed the materials in question. Ding, ding, ding! Alarm bells! There's no good place for this in simple Dublin Core! (Or qualified Dublin Core for that matter.)

I've dealt with this exact situation before; I guess I was blocking it out because it's SO annoying. Some folks would put this information in <dc:contributor>, and in fact several of my OAI sets do just this in their DC records. I suppose that might be OK, but the DC Contributor definition is "An entity responsible for making contributions to the content of the resource" and I don't know if I'm so comfortable calling "paying somebody to digitize this stuff and then asking another department to 'put it up on the Web'" "making contributions to the content of the resource." Some folks would put this information in <dc:publisher>, but again I'm skeptical. "An entity responsible for making the resource available" (DC Publisher definition) does apply to the digital resource. However, we're dealing with published materials here whose publisher for the print item can be an important access point. And we don't (nor does pretty much anybody) have a sophisticated mechanism in place for making good 1:1 principle records and linking them all together in a way that allows users to search on things meaningful to them and get meaningful results back. Putting our holding institution in Publisher in this environment would not serve our users' needs.

I started out using a hack I'd used before: put the holding info at the beginning of a <dc:source> field and add to the end the local call number so it fits the Source definition. But then I got annoyed at using what I consider a hack. So I started digging around. The Western States Dublin Core Metadata Best Practices made up their own element (currently called "Contributing Institution") and don't map it to DC. This is one of the very few elements they go completely out of DC for. The DC Libraries Working Group made a proposal in 2002 for a new DC element called holdingLocation, but by the time this proposal was reviewed by the Usage Board, MODS had gotten off the ground, so the UB decision said to use the MODS <location> element instead.

So the DC solution to this problem is to use an Application Profile that borrows an element from another schema. But once you start doing this, the draw of DC (simplicity!) is lost. I'm probably just going to use MODS instead. Sigh.
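For comparison, here is a minimal sketch of the two approaches: the <dc:source> hack, and a MODS record carrying the holding department in <location>, wrapped in a METS descriptive metadata section. It uses Python's standard xml.etree.ElementTree. The namespace URIs are the published ones for DC, MODS, and METS; the function names, the choice of the <physicalLocation> subelement for the department, the separator in the Source string, and all the sample values are my own assumptions for illustration, not a recommended profile.

```python
import xml.etree.ElementTree as ET

DC = "http://purl.org/dc/elements/1.1/"
MODS = "http://www.loc.gov/mods/v3"
METS = "http://www.loc.gov/METS/"

# Register prefixes so the serialized XML uses readable names.
ET.register_namespace("dc", DC)
ET.register_namespace("mods", MODS)
ET.register_namespace("mets", METS)


def dc_source_hack(holder, call_number):
    """The workaround: holding info first, then the local call number."""
    source = ET.Element(f"{{{DC}}}source")
    source.text = f"{holder}, {call_number}"  # separator is an arbitrary choice
    return source


def mods_dmdsec(title, holder, dmd_id="DMD1"):
    """A METS dmdSec wrapping a tiny MODS record with a holding location."""
    dmd = ET.Element(f"{{{METS}}}dmdSec", {"ID": dmd_id})
    wrap = ET.SubElement(dmd, f"{{{METS}}}mdWrap", {"MDTYPE": "MODS"})
    xml_data = ET.SubElement(wrap, f"{{{METS}}}xmlData")
    mods = ET.SubElement(xml_data, f"{{{MODS}}}mods")
    title_info = ET.SubElement(mods, f"{{{MODS}}}titleInfo")
    ET.SubElement(title_info, f"{{{MODS}}}title").text = title
    # Assumption: the holding department goes in location/physicalLocation.
    location = ET.SubElement(mods, f"{{{MODS}}}location")
    ET.SubElement(location, f"{{{MODS}}}physicalLocation").text = holder
    return dmd


# Hypothetical sample values.
hack = dc_source_hack("Example Department, Example University", "XX 123")
dmd = mods_dmdsec("An Example Title", "Example Department, Example University")

print(ET.tostring(hack, encoding="unicode"))
print(ET.tostring(dmd, encoding="unicode"))
```

In the MODS version the holding department has a field of its own, so nothing has to be parsed back out of a Source string later.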

Monday, March 14, 2005

Defining "librarian"

I've seen a few articles and discussions recently converging around the idea of defining what a "librarian" actually is. The March 2005 issue of American Libraries has a cover story about paraprofessionals working in libraries and the perceptions of them by patrons and "librarians" at their place of work, there has been an ongoing thread with the subject "End of Librarianship" over the Autocat mailing list weeks 1 and 2 of March 2005 (browse the archives), and a posting today at reporting a library director job indicating an MLS was optional for applicants. These all touch in some way on whether an MLS, a job title including the word "librarian," or a job in a library makes one a "librarian."

Certainly the definition of "librarian" is contextual. The American Libraries article asks some library paraprofessionals what their answer is to the question "Are you the librarian?" A patron asking that question almost certainly means "Can you help me?" rather than "Do you have an MLS?" or "Does your job title say you're a librarian?" so the answer there in my opinion should be an emphatic YES.

But many librarians are extremely protective of this label. It represents a significant investment of time, money, and intellect into earning a professional degree. And that's certainly nothing to sneeze at. (Even if some MLS programs in this country today can't reasonably be described as "rigorous.") However, I certainly know a number of people in jobs with titles including "librarian" who were hired under the rationale "MLS or equivalent experience" who do excellent jobs. Shouldn't one's ability to perform the duties of a position be the primary criterion for hiring them? I tend to think that a piece of paper bearing the designation MLS doesn't necessarily tell an employer whether or not an applicant is qualified.

I guess the argument comes down to whether the term librarian should refer to "what you do" or "who you are." And I can see how each would be appropriate in different circumstances. I tend to believe people should demonstrate their skill and professionalism in their interactions with others and in their work performance, rather than assuming an acronym and a diploma are an accurate indication.

Saturday, March 12, 2005

"Google at the Gate"

In the March 2005 issue of American Libraries, there is an article entitled "Google at the Gate" containing questions about the recently-announced Google digitization project, with answers from Michael Gorman of Cal State-Fresno and ALA president-elect, Deanna Marcum of LC, Susan McGlamery of OCLC, and Ann Wolpert of MIT. The article appears at an interesting time, just as the buzz is dying down from what some have called "Gormangate" - a huge reaction, especially among the blog community, to comments Michael Gorman made recently lampooning the value of bloggers.

In this article, Gorman continues the dismissive style of rhetoric that has incensed so many in his previous comments on the Google project and on blogging. The tone is very much one of a person who is certain he is right and need not consider any other arguments put to him. Two quotes in particular caught my eye:

"Any user of Google knows it is pathetic as an information retrieval system..."

This quote, of course, depends heavily on the definition of "information retrieval system." The remainder of the sentence references the traditional IR research metrics of recall and precision, so it's probably reasonable to assume that Gorman is measuring the effectiveness of Google along those lines. And that's one fair way to measure. However, your random Google user is probably unlikely to measure Google according to those terms. Most information needs are for something on a topic rather than everything on a topic. We in libraries are used to (and should be!) focusing on the latter. But that doesn't mean it's the only way to design a search engine.
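For anyone who hasn't run into the two metrics, here is a small sketch of how they are computed. Precision is the share of retrieved items that are relevant; recall is the share of relevant items that were retrieved. The function name and the toy document IDs are hypothetical, made up purely for illustration.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one search, given document IDs."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    precision = hits / len(retrieved) if retrieved else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Toy case: ten relevant documents exist, and the search returns
# just two of them and nothing else.
relevant_docs = range(10)
retrieved_docs = [0, 1]

print(precision_recall(retrieved_docs, relevant_docs))  # (1.0, 0.2)
```

A searcher who wanted "something on a topic" would be perfectly happy with this result, even though recall is only 0.2.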

"I cannot see the threat to small libraries [from the Google digitization project], nor can I see much of an advantage."

Gorman's answer to this question stands in stark contrast to those of the other participants in the interview. The others give multi-sentence responses, addressing at least some possibilities for advantages and disadvantages to small libraries from the Google digitization project. But the style of Gorman's answer is, again, dismissive, giving the impression he's made up his mind that the Google project is "bad" and that there is no need to consider its impact on libraries, small or otherwise. Perhaps he's carefully thought through all the issues and this quote is the result of a great deal of reflection. But there's no explanation presented, so the reader cannot know. I suspect this style of rhetoric, passing down from on high a conclusion without any explanation or support, will not prove effective for libraries as we increasingly need to talk about our services and expertise to those outside the profession.

Wednesday, March 09, 2005

PCC reaction to AACR3 Part 1 draft

I've posted before about the divisive reaction occurring in the library world to the forthcoming AACR3 drafts. A new development came about this week when the Program for Cooperative Cataloging (PCC) publicly released their official comments sent to CC:DA (who will then send them on to the JSC) on the AACR3 part 1 draft.

I find the comments strange, to say the least. They're not consistent in tone, which I suppose can be expected for a document compiled from comments from a number of people. But for the most part, the document keeps a reasonable message, connecting specific comments to large-scale goals. One major complaint of the document is that evaluation is happening before the introduction outlining the principles underlying the rules has been written. While this is true, it's not as if there is no information available on what these principles are. They are discussed in virtually every presentation on AACR3 (some presentation slides are online) and are given in a more formal discussion paper on the AACR3 site. More information on AACR3, its goals, and its development process is available from the JSC site.

The PCC comments spend several pages lampooning the process by which AACR3 is being developed. No news there, and I understand they do this to get on the record objecting, but I wonder if it doesn't do them more harm than good. Certainly at this point the process isn't going to change, and I can envision a reader at the JSC giving less weight to the wealth of very reasonable feedback present in the bulk of the document as a result of the opening diatribe. But then again I could definitely be wrong about that.

There is one other major item of interest to me in this document. The paragraph beginning on page 9 and continuing on page 10 (those are the page #s in the PDF, the printed page #s are 8 and 9), contains what I can only characterize as a threat:

"If COP/JSC continue AACR3 along its current path, AACR3 simply will not be worth our while...AACR3 will become an even bigger laughing-stock in the broader information community. As Chair of the Program for Cooperative Cataloging's Standing Committee on Standards, I feel that I will need to recommend to PCC to look for or develop alternate metadata standards to govern its records."

If this is a threat, it rings as an empty one. While I back to some extent the community asking for more open access to the AACR3 revision process, I don't believe this sort of statement will cause anything to change. Maybe over the next few decades we'll see a large-scale shift in library descriptive practices away from AACR in favor of something else. I'm a bit skeptical of this, but it could happen, albeit slowly. The rhetoric here sounds to me like a child holding her breath and threatening to run away (and believe me, I know how those children feel!). It doesn't read as effective in meeting its goal.

But I don't want these few statements to take away from a great deal of very useful comments in this document. This is the chance to let voices be heard in AACR3 development; here's hoping they don't fall on deaf ears.

Monday, March 07, 2005


I just received this month's copy of American Libraries in the mail, and the printed preliminary program for ALA Annual was included with the mailing. I noticed at first glance that the first session listed under the "Collection Management and Technical Services" track is "Cataloging Cultural Objects: Toward a Metadata Content Standard for Libraries, Archives and Museums." I'm thrilled that CCO is getting some air time in the "mainstream" librarian community - this sort of widening of perspective is sorely needed.

There's a lot of confusion out there, even among experts, about what CCO is supposed to be for, and what its connection is to VRA Core and CDWA. As I read it, CCO is to VRA Core roughly as AACR is to MARC. But then again, many folks don't fully understand what the relationship between AACR and MARC is, either, so I suppose the confusion here isn't all that surprising. I'm looking forward to more discussion of CCO to help get the word out.

There's a whole lot more in the program that looks very exciting. I can't wait for June!

Wednesday, March 02, 2005


An interesting development! The DCMI has been approached by the AACR3 folks for input into connecting AACR3 development with other metadata initiatives. Four representatives from the DC Libraries community will be providing official feedback, and discussion will take place on the DC Libraries list. This is certainly encouraging news!

I'm going to be out of town again until next Monday and won't be posting from the road. Happy weekend!