Tuesday, November 24, 2009

DRM & Me Part III: DOIs, Metadata and Long Tails

In Part II of this retrospective I discussed the NetRights years and our novel approach to binding static and dynamic metadata to objects in the early days of the Web. In this installment I'll cover my years at Yankee Rights Management (YRM), a division of YBP, Inc., especially the development of Copyright Direct(tm) and my personal realization of the potential of content identifiers and their associated metadata. Note: It was actually during my YRM years that I coined my now-infamous expression (referenced in Part II of this series), "Metadata is the lifeblood of e-commerce!"

YBP, originally known as Yankee Book Peddler and now a division of Baker & Taylor, has been a leader in using information technology to provide books and other materials, including bibliographic data (metadata!), to university and research libraries for more than 35 years. YBP executive Glen M. Secor also happened to be a professor of law at the Franklin Pierce Law Center specializing in copyright law, with a particular interest in the unique challenges of copyright in the emerging digital, networked environment. Glen and I first met when I presented my early Ph.D. work at DAGS'95 in Boston (prior to the founding of NetRights), and from that point on he took an interest in this metadata-oriented, iconoclastic approach to copyright. Glen spearheaded YBP's investment in NetRights in 1996, and after the sale of NetRights in 1997 I joined Glen to launch Yankee Rights Management (YRM) in mid-1997.

One of YRM's goals was to build a business solving rights management problems for stakeholders in YBP's ecosystem, especially scientific/technical/medical (STM) publishers and their university and research customers. With the help of Kelly Frey, then VP of Business Development for the Copyright Clearance Center (CCC), we conceived of Copyright Direct(tm), which soon became the first web-based, real-time, pay-as-you-go copyright permissions service for a wide variety of multimedia types. As with LicensIt(tm), the usage model for Copyright Direct(tm) would be simple:

  1. From a web page or PDF document, the user would click on a distinctive green "Copyright Direct" icon
  2. A mini-window would pop up clearly identifying the work and presenting available options for that item
  3. The user would step through a short series of menus to specify their use and, if available, transact their request (via credit card!) and receive their permissions
  4. If the usage they needed was not available, the system would collect the user's plain-text request and begin a managed workflow between the user and the rightsholder
  5. When all parties agreed, the agreement would become a "template" and be added as an available option: the system learned and adapted
  6. At the end of each month, rightsholders would receive royalty payments.
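
The six steps above can be sketched in a few lines of Python. This is only an illustration of the flow as described, not the actual Copyright Direct implementation: the class names, the method names, the prices and the identifier "10.9999/example" are all hypothetical, and payment handling is reduced to a running ledger.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Template:
    """A pre-approved licensing option: a kind of use and its price."""
    use: str
    price: float


@dataclass
class Work:
    """A work known to the service, keyed by a content identifier."""
    identifier: str
    title: str
    templates: dict = field(default_factory=dict)  # use -> Template


class PermissionsService:
    """Hypothetical stand-in for the permissions flow described above."""

    def __init__(self):
        self.works = {}    # identifier -> Work
        self.pending = []  # (identifier, use, note) awaiting negotiation
        self.ledger = {}   # identifier -> royalties accrued this month

    def register(self, work):
        self.works[work.identifier] = work

    def request(self, identifier, use, note=""):
        """Steps 2-4: grant from a template, or queue a plain-text request."""
        work = self.works[identifier]
        template = work.templates.get(use)
        if template is None:
            # No matching option yet: hand off to the managed workflow.
            self.pending.append((identifier, use, note))
            return None
        # Matching option: record the sale for the monthly royalty run.
        self.ledger[identifier] = self.ledger.get(identifier, 0.0) + template.price
        return template

    def agree(self, identifier, use, price):
        """Step 5: a negotiated agreement becomes a reusable template."""
        self.pending = [p for p in self.pending if (p[0], p[1]) != (identifier, use)]
        template = Template(use, price)
        self.works[identifier].templates[use] = template
        return template
```

The point of the sketch is the last method: once a negotiated use is stored as a template, the next request for the same use is answered immediately, which is the "learning" behavior described in step 5.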

Glen Secor, Jennifer Goodrich and I demonstrated my Copyright Direct prototype to a variety of stakeholders and thought leaders at the Frankfurt Book Fair in October 1997 and collected critical feedback. We returned "triumphantly" in October 1998 with a booth in the main hall, a live Copyright Direct demo (now powered by the fledgling DOI standard) and a major "beta" rightsholder: the IEEE!

But throughout 1998-1999 we also came to realize a fundamental problem with the Copyright Direct model: it depended not only on a ready supply of clean descriptive metadata from rightsholders, but also upon a rich set of rightsholder-generated rights metadata, including pricing and other licensing templates, none of which existed! Our goal was to use lightweight, easily accessible permissions transactions to provide "found money" to rightsholders, but it cost too much to generate the metadata required to fuel the system! In the September 2006 issue of D-Lib Magazine I elaborated on this problem in my article, Handle Records, Rights and Long Tail Economies.

Chris Anderson's "long tail" argument (see also his Long Tail blog) asserts that modern systems based entirely on metadata make "unlimited selection" economically viable. I argue that yes, metadata really is the lifeblood of e-commerce and is the enabler of phenomena like the seemingly-unlimited selection of products through Amazon.com ("make everything available, help anyone find it!"), but all metadata must somehow still be generated, verified and published, and the cost of creating and supporting the necessary metadata supply chains must not exceed the anticipated value that can be redeemed. Since the demand for a given "unit" may be exceptionally low, the "per unit" cost of creating or aggregating each unit's metadata halo must be near-zero!
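
The arithmetic behind that constraint is simple enough to write down. The following is a rough sketch only: the function names and every number in it are made up for illustration, and it assumes the value redeemed from an item is just expected transactions times revenue per transaction.

```python
def max_metadata_cost(expected_transactions, revenue_per_transaction):
    """Most that can be spent on one item's metadata before it loses money.

    Assumes (for illustration) that redeemable value is simply
    expected transactions multiplied by revenue per transaction.
    """
    return expected_transactions * revenue_per_transaction


def is_viable(metadata_cost, expected_transactions, revenue_per_transaction):
    """True if the metadata cost does not exceed the value it can unlock."""
    return metadata_cost <= max_metadata_cost(
        expected_transactions, revenue_per_transaction
    )


# Hypothetical figures: the same metadata cost per item, applied to a
# popular "head" item and to a rarely requested "tail" item.
METADATA_COST = 25.0
head_ok = is_viable(METADATA_COST, expected_transactions=1000, revenue_per_transaction=5.0)
tail_ok = is_viable(METADATA_COST, expected_transactions=2, revenue_per_transaction=5.0)
```

With these invented figures the head item easily covers its metadata cost while the tail item does not, and as expected transactions shrink toward zero the affordable cost shrinks with them.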

These principles can be extended to the "Web of Data"; indeed, by coupling Linked Data principles with a low-overhead infrastructure for authenticating metadata assertions, the cost of metadata may approach zero. I'll talk about that in a future blog entry...
