Thursday, March 05, 2009

Must Watch! Michael Edson: "Web Tech Guy and Angry Staff Person"

I heard Michael Edson (Director of Web and New Media Strategy for the Smithsonian) speak at the IMLS WebWise conference last week. He delivered an astonishingly good talk centered on an animation entitled "Web Tech Guy and Angry Staff Person." It's a riot, and the animation sets a lighthearted tone that reinforces his disclaimer that he's not poking fun at or diminishing the very real tensions cultural heritage institutions face as our communication, collection, and even the dreaded B-word (business!) models change underneath us. Instead, I believe it effectively uses exaggeration to highlight some underlying issues and to get us thinking intelligently about what it takes to say we CAN do something rather than taking the easy road and saying no. We can't just dismiss the challenges - understanding them will help us address them.

Sunday, March 01, 2009

Google vs. Semantic Web

On a number of fronts recently I've been thinking a bunch about RDF, the DCMI Abstract Model, and the Semantic Web, all with an eye towards understanding these things more than I have in the past. I think I've made some progress, although I can't claim to fully grok any of these yet. One thing does occur to me, although it's probably a gross oversimplification. The difference between the Semantic Web/RDF approach and, say, the Google approach is this: is the robustness in the data or is it in the system?

The Semantic Web (et al) would like the data to be self-explanatory: to state explicitly what it is describing, with explicit reference to all the properties used in the description. At the opposite end of the spectrum are systems like Google, which assume some kind of intelligence went into the creation of the data but don't expect the data itself to explicitly manifest it. The approach of these systems is to reverse engineer that data, getting at the human intelligence that created it in the first place.
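To make the contrast a little more concrete, here is a rough sketch in Python. It is only an illustration of the two ends of the spectrum, not how RDF tooling or Google actually works. The Dublin Core terms namespace is real, but the example.org identifier, the sample record, and the helper functions (lookup, guess_year, guess_creator) are all made up for this example.

```python
import re

# Robustness in the data: every statement names its subject and uses a
# fully qualified property, so the consumer never has to guess.
DCTERMS = "http://purl.org/dc/terms/"
BOOK = "http://example.org/book/1"  # hypothetical identifier

triples = [
    (BOOK, DCTERMS + "title", "Moby-Dick"),
    (BOOK, DCTERMS + "creator", "Herman Melville"),
    (BOOK, DCTERMS + "date", "1851"),
]

def lookup(triples, subject, prop):
    """Return every value recorded for a subject/property pair."""
    return [o for s, p, o in triples if s == subject and p == prop]

# Robustness in the system: the data is just text a person wrote, and the
# system tries to recover what the author meant.
page_text = "Moby-Dick, by Herman Melville. First published in 1851."

def guess_year(text):
    """Heuristic: the first four-digit number starting with 1 or 2 is a year."""
    m = re.search(r"\b[12]\d{3}\b", text)
    return m.group(0) if m else None

def guess_creator(text):
    """Heuristic: the capitalized words that follow 'by' are a name."""
    m = re.search(r"\bby ((?:[A-Z][\w'-]*\s?)+)", text)
    return m.group(1).strip() if m else None

print(lookup(triples, BOOK, DCTERMS + "date"))  # explicit answer
print(guess_year(page_text), guess_creator(page_text))  # inferred answers
```

The lookup never guesses, but somebody had to do the encoding up front. The heuristics need no encoding, but they can be fooled: guess_year("Room 1204 opened in 1998") picks up the room number rather than the year.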

The difference is one of who is expected to do the work - the system encoding the data in the first place (Semantic Web approach) or the system decoding the data for use in a specific application (Google approach). Both obviously present challenges, and it's not clear to me at this point which will "win." Maybe the "good enough and a person can go the last bit" approach really is appropriate - no system can be perfect! Or maybe as information systems evolve our standards for the performance of these systems will be raised to a degree where self-describing data is demanded. As a moderate, I guess I think both will probably be necessary for different uses. But which way will the library community go? Can we afford to have feet in both camps into the future?