Our Cataloging Data: The Future of Sharing and Creating

Chris Catalfo, Monday, 3:30 – 5:00

Suddenly our catalog records are hot commodities. The Library of Congress wants out of its “alpha role” in providing cataloging. OCLC wants to clarify who may share contributed records with whom. Biblios.net and Open Library offer open-access alternatives. RDS disseminates catalog records into web-based entities. Where is all this leading? How will our data be created and shared in the future? Chris Catalfo, programmer at LibraryThing.com, shares his thoughts on the future of cataloging data. The NETSL business meeting and reception are included.

Available tools: OCLC, Open Library (good for sharing with web users), Biblios.net (good for sharing between libraries), HathiTrust (online digital hosting)

How are we doing?

  • Sharing: ineffectual. Mechanisms are outdated, and not everyone can participate. We need good software to support sharing and finding records (OCLC does this, but not all libraries can afford it)
  • Z39.50 – permits standardized sharing, but it dates to the 1970s, so it is showing its age and is a barrier to non-librarian programmers who could help make our data more available
  • New/better protocols: OAI Protocol, SRW/U
  • Another issue is who owns the data and records? OCLC? The libraries? Can they be owned?
  • OPACs: need to embed metadata into the HTML catalog page, using OpenURL COinS, Zotero (Firefox plugin), LibraryThing for Libraries catalog plugin
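
To make the COinS idea concrete, here is a minimal sketch in Python (my own illustration, not something shown in the talk). COinS puts an OpenURL ContextObject, written as key/value pairs, into the title attribute of an empty span with the class Z3988, so that tools like Zotero can pick up the citation from a catalog page. The function name coins_span and the choice of fields are assumptions for illustration; a real OPAC template would likely include more fields.

```python
from html import escape
from urllib.parse import urlencode


def coins_span(title, author, isbn):
    """Return a COinS <span> describing a book (hypothetical helper)."""
    fields = [
        ("ctx_ver", "Z39.88-2004"),                    # OpenURL version
        ("rft_val_fmt", "info:ofi/fmt:kev:mtx:book"),  # "book" metadata format
        ("rft.btitle", title),
        ("rft.au", author),
        ("rft.isbn", isbn),
    ]
    # URL-encode the key/value pairs, then HTML-escape the result so the
    # "&" separators are safe inside an attribute value.
    kev = urlencode(fields)
    return '<span class="Z3988" title="{}"></span>'.format(escape(kev, quote=True))


if __name__ == "__main__":
    print(coins_span("Moby Dick", "Melville, Herman", "9780142437247"))
```

The span is invisible to patrons, so it can be dropped into an existing record display without changing how the page looks.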

Looking to the future, none of these tools quite meets all needs: sharing between libraries and ease of use for non-librarian web searchers. So how should we share in the future?

Sharing is important:

  • The more we share with each other, the cheaper it is for libraries
  • The easier the data is to find, the better for our patrons (and libraries, since we’re easier to find)

What do we need?

  • A more modern protocol: XML over HTTP?
  • Clear up the ownership question
  • A platform to share to
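
As one illustration of what "XML over HTTP" can look like, here is a rough Python sketch using SRU (the "U" in SRW/U mentioned earlier): a search is just a URL, and the answer is an XML document any programmer can parse. This is my own example, not something from the talk. The endpoint catalog.example.org, the helper names, and the sample response are all made up; a real server's response would carry more detail.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

DC_NS = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace


def sru_search_url(base_url, cql_query, max_records=10):
    """Build an SRU searchRetrieve URL (hypothetical helper)."""
    params = {
        "operation": "searchRetrieve",
        "version": "1.2",
        "query": cql_query,              # the search, written in CQL
        "maximumRecords": str(max_records),
        "recordSchema": "dc",            # ask for Dublin Core records
    }
    return base_url + "?" + urlencode(params)


def titles_from_response(xml_text):
    """Pull every Dublin Core title out of an SRU response."""
    root = ET.fromstring(xml_text)
    return [el.text for el in root.iter("{%s}title" % DC_NS)]


# A trimmed, invented response, so the sketch runs without a network call.
SAMPLE_RESPONSE = """<srw:searchRetrieveResponse
    xmlns:srw="http://www.loc.gov/zing/srw/"
    xmlns:dc="http://purl.org/dc/elements/1.1/">
  <srw:numberOfRecords>1</srw:numberOfRecords>
  <srw:records>
    <srw:record>
      <srw:recordData>
        <dc:title>Moby Dick</dc:title>
        <dc:creator>Melville, Herman</dc:creator>
      </srw:recordData>
    </srw:record>
  </srw:records>
</srw:searchRetrieveResponse>"""

if __name__ == "__main__":
    print(sru_search_url("https://catalog.example.org/sru",
                         'dc.title = "moby dick"'))
    print(titles_from_response(SAMPLE_RESPONSE))
```

The point of the sketch is the low barrier: nothing here needs a library-specific toolkit, only a URL builder and an XML parser.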

How do we create data, and how can we improve?

  • Copy catalog or original cataloging (then keep internally or share back with OCLC)
  • Non-libraries: Google Books data comes from publishers, libraries, OCR scans (this is not perfect); Amazon data mostly comes from publishers; Flickr and LibraryThing data (the wider web world) mostly comes from users
  • Libraries can learn a little from each of these alternatives: user-contributed data is not always accurate, but it is high-volume, powerful and popular
  • Can cataloging rules (AACR, Dublin Core) be streamlined to give catalogers more time to focus on other things?
  • Need to get past political arguments of today and work towards the betterment of the data

Questions and Answers

Where do publishers get their data?
They type it in, so we shouldn’t need to duplicate that effort.

Are there copyright issues if they are creating it?
-Not sure…
It is part of their marketing effort, so they want it out there. So you’d think they’d want it accurate, but that isn’t always the case – so we also need a shared maintenance system
-Yes, it’d be like open source software, where everyone has access to various versions

Does LibraryThing do data cleanup of contributed data?
LT staff doesn’t, but dedicated users do authority-control work such as author cleanup and cross-references

Is it in our long-term best interest to consider record sharing by itself? OCLC isn’t just a source for records, it provides a service

Does LT have tag guidelines?
I don’t think LT does much (I work more with LTfL) – there was a tag combining feature, but it was turned off – so it’s all user-generated.

Tags have the great benefit of connecting not just users to books but also users to users, but they could benefit from standardization (“my sister” is subjective, not objective).
Right, tags should complement a structured language, not be used exclusively.
