Saturday, October 22, 2005

Separating data entry from data structure

I believe we've fallen extremely short in at least one area of potential for improving our cataloging and metadata creation systems -- user interfaces. We're still stuck in a mindset developed in the early days of the MARC format, whereby data is entered in the exact form in which it needs to be stored. When Web-based OPACs and cataloging modules emerged, cursory attempts to "improve" the interface appeared, but the changes were almost exclusively surface changes (labeling, etc.), and not implemented with community involvement.

But of course current technology provides many possibilities for a design layer in between the data entry interface and the data storage format. Metadata creation by humans is expensive. We need to do everything we can to design data entry interfaces that speed this process along and help the cataloger create high-quality data quickly. Visual cues, tab completion, and keyboard shortcuts are just a few simple tricks that could help. More fundamental approaches like automatic inclusion of boilerplate text and integration of controlled vocabularies could be enormous strides forward.
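To make the idea of a layer between entry and storage concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the form keys, the `form_to_fields` function, the dictionary representation of a field, and the tiny stand-in vocabulary are all hypothetical, and no real cataloging module is being described. The cataloger types a title, an author, and some subject terms; the layer works out the indicators and swaps in the authorized heading on its own.

```python
# Hypothetical stand-in for a real controlled vocabulary lookup.
SUBJECT_VOCABULARY = {
    "cats": "Cats",
    "dogs": "Dogs",
}

# Leading articles to skip when filing (English only, for illustration).
LEADING_ARTICLES = ("The ", "An ", "A ")


def nonfiling_count(title):
    """Count the characters to skip when filing: leading article plus space."""
    for article in LEADING_ARTICLES:
        if title.startswith(article):
            return len(article)
    return 0


def form_to_fields(form):
    """Turn what the cataloger typed into MARC-like field dictionaries."""
    fields = []
    has_author = bool(form.get("author"))
    if has_author:
        # First indicator 1: the name is entered surname first.
        fields.append({"tag": "100", "ind": "1 ",
                       "subfields": [("a", form["author"])]})

    title = form["title"]
    fields.append({
        "tag": "245",
        # First indicator depends on whether a main entry exists;
        # second indicator is the nonfiling character count.
        "ind": ("1" if has_author else "0") + str(nonfiling_count(title)),
        "subfields": [("a", title)],
    })

    for term in form.get("subjects", []):
        heading = SUBJECT_VOCABULARY.get(term.strip().lower())
        if heading is None:
            raise ValueError(f"Subject not in vocabulary: {term!r}")
        # Second indicator 4: source of the heading not specified.
        fields.append({"tag": "650", "ind": " 4",
                       "subfields": [("a", heading)]})
    return fields


form = {"title": "The Cat Book", "author": "Doe, Jane", "subjects": ["cats"]}
for field in form_to_fields(form):
    print(field["tag"], repr(field["ind"]), field["subfields"])
```

The cataloger never counts nonfiling characters or types an indicator here; the stored form is derived from what was entered, not dictated to the person entering it.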

Yet with all of this potential, I frequently (WAAAAAY too frequently) have conversations with librarians where it becomes clear they're focused exclusively on the data output format. It never even occurred to them that a system could do something with entered data that doesn't require cataloger involvement. (Man, I knew we librarians were control freaks, but this really takes the cake.) Of course, librarians aren't, on the whole, system designers. That's OK. But all librarians still need to be able to think creatively about possibilities. I'm convinced that the way forward here is to take the initiative to develop systems that demonstrate this potential, that show everyone what is possible with today's technology. Everyone has vision, yet that vision always has limits. By explicitly demonstrating a few steps forward from where we are, we can help that vision expand that much further.


Dorothea said...

Wordy McWord. Add a few statements about error checking (which computers can do a lot of), and I'm completely there.

Jenn Riley said...

Yes! Great call, Dorothea. Error checking is something we don't do enough of, but most people are familiar enough with it that it could be used as a launching pad for discussing further options.

I'm working with a set of MARC records now doing some FRBRization experiments. A developer on the project was just astounded at the number of records repeating non-repeatable MARC fields and subfields, and the fact that an ILS exists that doesn't perform even this basic level of validation. Starting with "what if the cataloging module checked to make sure..." is a fantastic way to open this conversation. Bravo.
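
For the curious, here is a minimal sketch of what such a check might look like in Python. It is not taken from any ILS or from the project mentioned above. The record representation (a list of dictionaries with a `tag` and a list of `subfields`), the `find_repeat_errors` function, and the two lookup sets are assumptions made for illustration, and the sets cover only a small subset of the fields and subfields that MARC 21 defines as non-repeatable.

```python
from collections import Counter

# Illustrative subset only; the full lists live in the MARC 21 documentation.
NON_REPEATABLE_FIELDS = {"001", "005", "008", "010", "100", "245"}
NON_REPEATABLE_SUBFIELDS = {("100", "a"), ("245", "a"), ("245", "b"), ("245", "c")}


def find_repeat_errors(record):
    """Return messages for non-repeatable fields or subfields that repeat."""
    errors = []

    # Field level: count how many times each tag appears in the record.
    tag_counts = Counter(field["tag"] for field in record)
    for tag in sorted(tag_counts):
        if tag in NON_REPEATABLE_FIELDS and tag_counts[tag] > 1:
            errors.append(
                f"Field {tag} occurs {tag_counts[tag]} times but is not repeatable"
            )

    # Subfield level: count subfield codes within each individual field.
    for field in record:
        code_counts = Counter(code for code, _ in field.get("subfields", []))
        for code in sorted(code_counts):
            if (field["tag"], code) in NON_REPEATABLE_SUBFIELDS and code_counts[code] > 1:
                errors.append(
                    f"Subfield ${code} occurs {code_counts[code]} times "
                    f"in field {field['tag']} but is not repeatable"
                )
    return errors


record = [
    {"tag": "001"},
    {"tag": "100", "subfields": [("a", "Doe, Jane"), ("a", "Doe, John")]},
    {"tag": "245", "subfields": [("a", "The Cat Book")]},
    {"tag": "245", "subfields": [("a", "Another title")]},
]
for message in find_repeat_errors(record):
    print(message)
```

A cataloging module could run something like this when a record is saved and show the messages before the record is stored.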