Showing posts with label metadata. Show all posts

Wednesday, December 16, 2009

Recent Efforts toward Linked Multimedia Metadata

Recently I've been "having a think" on issues ranging from rights expression for datasets to realising the value of linked data, but frankly I've felt that something is missing; even with scientific and government linked datasets going online, a voice inside me wonders if the stakes are still (arguably) too low to really shake things up. I've been wondering what kind of data we haven't been hearing about --- the kind of data that, if it were published according to linked data principles, would surely lead to the emergence of outrageously cool applications, demonstrate the inherent value of the linked data approach, and perhaps even test some interesting new monetisation models. The area that immediately came to mind was multimedia metadata, especially semantic metadata for video and audio content.

Several recent venues have focused on the general topic of generating, publishing and using semantic multimedia metadata, including the Oct-Dec 2009 IEEE Multimedia Magazine special issue on Multimedia Metadata and Semantic Management, and SAMT2009: The 4th International Conference on Semantic and Digital Media Technologies (3-4 Dec 2009; Graz, Austria). Both of these are "powered" by members of the Multimedia Metadata Community, an outgrowth of the MPEG-7 and MPEG-21 worlds that "brings together experts from research and industry in the area of multimedia meta data interoperability for collaborative working environments." Finally, since 2008 the W3C has been host to its Video in the Web activity; within this the Media Annotations Working Group is developing an ontology and API to facilitate cross-community sharing and use of multimedia metadata in the Web.

IEEE Multimedia (Oct-Dec 2009): This special issue features six research articles focused on different facets of the "semantic management of multimedia and multimedia metadata" ranging from retrieval and processing to consumption and presentation. Of the six, perhaps the first two are most relevant in today's linked data environment:

  • "Managing and Querying Distributed, Multimedia Metadata." This article advocates the use of a centralized metadata résumé --- a condensed, automatically-constructed version of the larger metadata set --- for locating content on remote servers. The authors demonstrate the advantages of their approach using conventional semweb technologies to represent and query semantic metadata.
  • "Semantic MPEG Query Format Validation and Processing." The authors present their semantic validation of MPEG Query Format (MPQF) queries and their implementation of a practical MPQF query engine over an Oracle RDBMS. The article introduces methods for evaluating MPQF semantic-validation rules not expressed by syntactic means within the XML schema. The authors highlight their prototype implementation of an MPQF-capable processing engine using several query types on a set of MPEG-7 based image annotations.
  • "Diversifying Image Retrieval with Affinity-Propagation Clustering on Visual Manifolds." The authors describe a post-processing subsystem for retrieval systems that improves the diversity of results presented to users. Image retrieval systems typically focus on the similarity between the retrieval and sample images, where the relevance of the retrieval results is considered but the diversity is neglected. Ideally, retrieval results should contain a diverse array of items representing a variety of subtopics. This article presents a method for removing duplicate images from a "top 20" list, replacing them with images representing new subtopics.
  • "A Media Value Chain Ontology for MPEG-21." The authors have created a semantic representation of intellectual property derived from MPEG-21 Part 19. Their model defines the minimal set of types of intellectual property, the roles of users interacting with them, and the relevant actions regarding intellectual property law. The article is a helpful guide to the standardization efforts, with its many examples and useful insight into the multimedia value chain.
  • "Using Social Networking and Collections to Enable Video Semantics Acquisition." The authors consider media production, acquisition, and metadata gathering, the first elements of the multimedia value chain. Methods from video annotation and social networking are brought together to solve problems associated with gathering metadata that describes user interaction, usage, and opinions of video content. Individual user-interaction metadata is aggregated to provide semantic metadata for a given video. Coolness alert: The authors have successfully implemented their model in a Flex-based Facebook application!
  • "A Web-Based Music Lecture Database Framework." This article describes semantic audio authoring and presentation for Web-published music lectures. The authors propose a dynamic programming-based algorithm for MIDI-to-Wave alignment to explore the temporal relations between MIDI and the corresponding performance recording. The synchronized MIDI and wave can be attached to many kinds of teaching materials where synchronized presentations can add value.

SAMT'09: Nearly 15 years ago I had the good fortune to present my early rights metadata research at EDMEDIA'95 in Graz (Austria); visiting the conference web site this weekend, especially seeing the real-time image of the historic "Uhrturm" on the hill high above the city, brought back a flood of fond memories! The topics of the three tutorials offered at SAMT'09 demonstrate that current research has definitely taken a turn toward getting multimedia into the Web. (Unfortunately, only the slides from the first are currently available.)

  • "Web of Data in the Context of Multimedia (WoDMM)." How multimedia content can be integrated into the Web of Data and how users and developers can consume and benefit from linked data. (slides)
  • "MPEG Metadata for Context-Aware Multimedia Applications (MPEG)." Overview of MPEG metadata formats that enable the development and deployment of content- and context-aware multimedia applications.
  • "A Semantic Multimedia Web: Create, Annotate, Present and Share your Media (SemMMW)." How multimedia metadata can be represented and attached to the content it describes within the context of established media workflow practices, and how users can benefit from a Web of Data containing more formalized knowledge.

For much more information, see the Proceedings from the 20th International Workshop of the Multimedia Metadata Community on Semantic Multimedia Database Technologies (SeMuDaTe'09).

Metadata Standards for the Web of Data: Finally, research such as that described above has led to progress on the standards front. As the IEEE Multimedia guest editors note in their foreword, since 2008 there has been quiet but steady progress within the W3C's Video in the Web activity, which was chartered to make video a first class citizen of the Web by creating an architectural foundation that by taking full advantage of the Web's underlying principles will enable people to create, navigate, search, link and distribute video... Of its three working groups, the editors highlight the Media Annotations Working Group as being motivated by progress in RDF and topic maps; it appears most aligned with emerging linked data activities.

In their foreword, the IEEE Multimedia editors provide a very nice summary of the core problem with multimedia metadata and thus the motivation for the W3C efforts:

Most of the standards are tailored to specific application domains. Examples include European Broadcasting Union P/Meta 2.0 for broadcasting; TV-Anytime and SMPTE Metadata Dictionary for TV; and MPEG-21 for the delivery chain of multimedia and technical aspects (such as EXIF). These standards exhibit a different semantic level of detail in their descriptions (from simple keywords to regulated taxonomies and ontologies). Only some of the standards are general purpose, for instance MPEG-7...

Coolness is on the Horizon: This rather lengthy posting is merely a sampling of works-in-progress, not only to put multimedia metadata on the Web but more importantly to establish such metadata as a useful and valuable part of the Web. Combined with such visionary efforts as the revamped, linked data-driven BBC web site, these make me increasingly confident that a new generation of linked data applications is around the corner, fueled this time by datasets that add video and audio to the semantic mix. Bring it on!

Tuesday, November 24, 2009

DRM & Me Part III: DOIs, Metadata and Long Tails

In Part II of this retrospective I discussed the NetRights years and our novel approach to binding static and dynamic metadata to objects in the early days of the Web. In this installment I'll cover my years at Yankee Rights Management (YRM) (a division of YBP, Inc.), especially the development of Copyright Direct(tm) and my personal realization of the potential of content identifiers and their associated metadata. Note: It was actually during my YRM years that I coined my now-infamous expression (referenced in Part II of this series), Metadata is the lifeblood of e-commerce!

YBP, originally known as Yankee Book Peddler and now a division of Baker & Taylor, has been a leader in using information technology to provide books and other materials, including bibliographic data --- metadata! --- to university and research libraries for more than 35 years. YBP executive Glen M. Secor also happened to be a professor of law at the Franklin Pierce Law Center specializing in copyright law, with a particular interest in the unique challenges of copyright in the emerging digital, networked environment. Glen and I first met when I presented my early Ph.D. work at DAGS'95 in Boston (prior to the founding of NetRights), and from that point on he took an interest in this metadata-oriented, iconoclastic approach to copyright. Glen spearheaded YBP's investment in NetRights in 1996, and with the sale of NetRights in 1997 I joined with Glen to launch Yankee Rights Management (YRM) in mid-1997.

One of YRM's goals was to build a business solving rights management problems for stakeholders in YBP's ecosystem, especially scientific/technical/medical (STM) publishers and their university and research customers. With the help of Kelly Frey, then VP of Business Development for the Copyright Clearance Center (CCC), we conceived of Copyright Direct(tm), which soon became the first web-based, real-time, pay-as-you-go copyright permissions service for a wide variety of multimedia types. As with LicensIt(tm), the usage model for Copyright Direct(tm) would be simple:

  1. From a web page or PDF document, the user would click on a distinctive green "Copyright Direct" icon
  2. A mini-window would pop up clearly identifying the work and presenting available options for that item
  3. The user would step through a short series of menus to specify their use and, if available, transact their request (via credit card!) and receive their permissions
  4. If the usage they needed was not available, the system collected the user's plain-text request and began a managed workflow between the user and the rightsholder
  5. When all parties agreed, the agreement became a "template" and was added as an available option --- the system learned and adapted
  6. At the end of each month, rightsholders would receive royalty payments.
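
The steps above amount to a small workflow: look for a pre-agreed option, transact if one exists, otherwise negotiate and remember the result. Here is a minimal Python sketch of that loop. Every name in it (PermissionsService, Template, the sample identifier and fee) is hypothetical and chosen only for illustration; it is not the actual Copyright Direct design.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Template:
    """A pre-agreed licensing option for one work (hypothetical structure)."""
    work_id: str   # identifier for the work, e.g. a DOI-style string
    usage: str     # the use being licensed
    fee: float     # price agreed with the rightsholder


@dataclass
class PermissionsService:
    """Illustrative only; assumes one template per (work, usage) pair."""
    templates: dict = field(default_factory=dict)  # (work_id, usage) -> Template
    royalties: dict = field(default_factory=dict)  # work_id -> fees collected
    pending: list = field(default_factory=list)    # requests under negotiation

    def request_permission(self, work_id: str, usage: str) -> str:
        template = self.templates.get((work_id, usage))
        if template is None:
            # Step 4: no matching option, so start a managed negotiation.
            self.pending.append((work_id, usage))
            return "negotiating"
        # Step 3: a matching option exists, so transact right away.
        self.royalties[work_id] = self.royalties.get(work_id, 0.0) + template.fee
        return "granted"

    def agree(self, work_id: str, usage: str, fee: float) -> None:
        # Step 5: the agreed terms become a reusable template.
        if (work_id, usage) in self.pending:
            self.pending.remove((work_id, usage))
        self.templates[(work_id, usage)] = Template(work_id, usage, fee)


service = PermissionsService()
first = service.request_permission("10.1000/example", "course pack")
service.agree("10.1000/example", "course pack", fee=25.0)
second = service.request_permission("10.1000/example", "course pack")
print(first, second, service.royalties)
```

The first request falls through to negotiation because no template exists yet; once the parties agree, the same request is granted immediately and the fee accrues toward the rightsholder's monthly royalty total (step 6).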

Glen Secor, Jennifer Goodrich and I demonstrated my Copyright Direct prototype to a variety of stakeholders and thought leaders at the Frankfurt Book Fair in October 1997 and collected critical feedback. We returned "triumphantly" in October 1998 with a booth in the main hall, a live Copyright Direct demo (now powered by the fledgling DOI standard) and a major "beta" rightsholder: the IEEE!

But throughout 1998-1999 we also came to realize a fundamental problem with the Copyright Direct model: it depended not only on a ready supply of clean descriptive metadata from rightsholders, but also upon a rich set of rightsholder-generated rights metadata, including pricing and other licensing templates, none of which existed! Our goal was to use lightweight, easily accessible permissions transactions to provide "found money" to rightsholders, but it cost too much to generate the metadata required to fuel the system! In the September 2006 issue of D-Lib Magazine I elaborated on this problem in my article, Handle Records, Rights and Long Tail Economies.

Chris Anderson's "long tail" argument (see also his Long Tail blog) asserts that modern systems based entirely on metadata make "unlimited selection" economically viable. I argue that yes, metadata really is the lifeblood of e-commerce and is the enabler of phenomena like the seemingly-unlimited selection of products through Amazon.com ("make everything available, help anyone find it!"), but all metadata must somehow still be generated, verified and published, and the cost of creating and supporting the necessary metadata supply chains must not exceed the anticipated value that can be redeemed. Since the demand for a given "unit" may be exceptionally low, the "per unit" cost of creating or aggregating each unit's metadata halo must be near-zero!
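
To make the constraint concrete, here is a small Python sketch. The function name and all of the figures are hypothetical, picked only to show how quickly a fixed metadata cost swamps a low-demand item; none of them come from the D-Lib article.

```python
def metadata_pays_off(expected_sales: float, value_per_sale: float,
                      metadata_cost: float) -> bool:
    """True when the value expected from a unit covers its metadata cost."""
    return expected_sales * value_per_sale >= metadata_cost


# Hypothetical figures, for illustration only.
head_item = metadata_pays_off(expected_sales=10_000, value_per_sale=2.0,
                              metadata_cost=50.0)
tail_item = metadata_pays_off(expected_sales=3, value_per_sale=2.0,
                              metadata_cost=50.0)
cheap_tail = metadata_pays_off(expected_sales=3, value_per_sale=2.0,
                               metadata_cost=0.05)
print(head_item, tail_item, cheap_tail)
```

With the same metadata cost, the popular item clears the bar easily while the tail item does not; the tail item only becomes viable once the cost of its metadata drops to a few cents.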

These principles can be extrapolated to the "Web of Data"; indeed, by coupling Linked Data principles with a low-overhead infrastructure for authenticating metadata assertions, the cost of metadata may indeed approach zero. I'll talk about that in a future blog entry...

Monday, November 23, 2009

DRM & Me Part II: "Copyright for the rest of us!"

In Part I of this retrospective I covered the raw beginnings of my interest and research in enabling copyright in the digital, networked environment. In this second part I'll discuss work my colleagues and I did to take these ideas commercial, and I'll continue to focus on core principles of my work in content identification and metadata architecture, summed up by this quote (attributed to me!): Metadata is the lifeblood of e-commerce!

As the spring of 1995 approached it became clear that there was an opportunity to make a unique contribution to improving the world of copyright in the digital, networked environment. As I prepared to present a paper at ED-MEDIA 95 in Graz, Austria, I was approached by local businessmen who had been principals in a successful software company, Corporate Microsystems, Inc., that had just been acquired by a global enterprise software company. As the story goes, they were looking for an original idea upon which to base their next start-up, and I was looking for a strategy for implementing my research ideas that would scale well beyond what I was capable of doing part-time as a researcher at IML. Over the summer of 1995 my future partners Gerry Hunt, Theo Pozzy, Henry Adams, Hal Franklin and I held numerous planning meetings, and on 1 November 1995 NetRights, LLC was born!

We started NetRights at a time when other players, in particular InterTrust (then still called EPR) and IBM InfoMarket, were starting to draw attention to their robust, encryption-based "envelope" strategies for "protecting copyright" --- quotes intentional! --- and the term digital rights management wasn't yet in standard use. Taking a cue from my prototype work at Dartmouth, the core idea behind LicensIt(tm) (later @attribute) was to "objectify" flat multimedia objects using secure wrappers whose primary objective was to provide structured metadata about the object in hand. Our goal was to provide rich static and Internet-served dynamic metadata to facilitate "conversations" between creators and users of content. Our motif for "experiencing" copyright was simple and elegant: A user sees a photo, audio clip, video, even an embedded text snippet; they "right-click" on it and a tabbed set of property pages is displayed; they use those various pages to view descriptions of the content, to start emails with the creator or other contributors, to view default terms of use, even to initiate live rights transactions, all while staying within the context of use.

From a technical standpoint we were using OLE structured storage in very much the same way as XML (and especially RDF) is used today. Our development team, including Mark Schlageter, Norm Tiedemann, Mark Markus and Dan O'Connor (our sole Mac-head!), created amazing tools that let us not only design these metadata structures but also create "soft" property-page layout templates (think CSS!) that were packaged with the metadata, enabling customized content-specific views. Considerable infrastructure was required to make all of this work, from OLE services installed on the user machine, to the tools for design and packaging, to back-end services for object registration. We also faced major, bet-the-company decisions about PC vs Mac, "networked COM" (which became ActiveX), Spyglass/IE vs Mosaic/Netscape support, etc. To a startup company, Bill Gates' commitment of Microsoft to "embracing and extending" the Internet in late 1995 was helpful!

Trade journals like Seybold took notice and wondered whether our "kinder, gentler" approach to copyright, which by that time (June 1996) we were calling "enhanced attribution," might actually be a better option than so-called "opaque packages." Publishers were torn; they liked the obvious value our approach was bringing to the user and the fact that we were actually facilitating the copyright process, but they also couldn't get over their perceived need for "strong protection."

Today we see echoes all over the Internet of infrastructure and technology that make "copyright for the rest of us" radically easier than it was at NetRights' birth in 1995. First and foremost are systems of globally unique, persistent object identifiers, in particular the Digital Object Identifier (DOI), implemented on CNRI's Handle System. (As it happens, that same 1996 issue of Seybold also carried an article about the birth of the DOI!) RDF provides a universal information model for conveying metadata assertions (local and remote) about objects; RDFa provides a way to do this within (esp.) web documents. The recent massive and growing interest in publishing Linked Data by organizations, including governments, has fortified distributed metadata as a means of conveying object information from a variety of sources. And special mention must be made of The Creative Commons, which has applied most of these techniques to not only make the process of copyright readily accessible to creators and users all over the world, but also to make content use safe through the explicit and unambiguous communication of terms of its use.

Providing immediate, unambiguous expression of copyright information and connections to processes for any piece of content was my mantra starting in the lab at Dartmouth, then at NetRights, and following our acquisition in 1997 by Digimarc, with the creation of Copyright Direct(tm) at Yankee Rights Management (YRM) and my subsequent involvement with the content identification and metadata communities. More on that in our next installment...

Wednesday, November 18, 2009

DRM & Me: A 15-year retrospective (Part 1)

Fifteen years ago, in November 1994, I was two years into a Ph.D. program at the Thayer School of Engineering at Dartmouth College. I had entered Dartmouth with a background in computer engineering and an interest in "special-purpose systems," a narrow field that focuses on creating computing systems that are exceptionally good at a very narrow range of operations, such as particle-in-cell simulation or gene sequence processing. This interest led me across campus to become a research assistant in Dr. Joseph V. Henderson's pioneering Interactive Media Lab at Dartmouth Medical School --- at first to consider the infrastructural problems of delivering IML's high-value multimedia training programs across the Internet, and by mid-1994 over a novel set of technologies known as the "World Wide Web."

As the story goes, the IML team was preparing a major set of demos for a visit by Dr. C. Everett Koop, a Dartmouth alumnus and area resident who had recently retired as one of the more influential Surgeons General the United States has ever had. My task was creating an interactive web site for IML, focusing in particular on the delivery of several key video sequences via the web. Several of us worked long into the night to migrate a few select videos into tolerable QuickTime format and suitable "thumbnails," then onto the lab's server, then linked (for downloading) from web pages, and finally viewable on the demo Mac.

When Joe arrived on the morning of our demo, I greeted him with (something like), "Joe, I got the 'Binding Sequence' up on the Web!" His incredibly insightful response was:

John, that's great!...John, that's terrible!

Joe proceeded to express his concerns about two fundamental implications of my "success":

  • The copyright implications, especially as many IML programs were funded by private entities that retained certain rights to the works;
  • The implications of dis-aggregating medical and other training programs and delivering their content out-of-context, possibly doing harm to their message due to loss of design integrity.

Joe framed the challenge for me: to study the question of rights management from the perspective of multimedia production. In 24 hours, I learned that this was an important and rising issue that was not going away; that very little research had been done on the question from a practical standpoint; that the few proposed solutions at the time were overly simplistic, equating "copyright management" with "security" and in fact did neither; and no one appeared to be considering the issues from the perspective of the creator. In 24 hours, my Ph.D. topic was born!

This leads us to 1 November 1994 when I presented my dissertation proposal, which included as an example research artifact my Mr. Copyright(tm) prototype --- quickly re-named at the urging of my committee and others to LicensIt(tm). LicensIt demonstrated in the form of an easy-to-use, desktop "appliance" the key ideas of (a) binding actionable copyright metadata to multimedia objects, and (b) user-friendly, real-time, networked copyright registration. The LicensIt desktop icon said it all: modeled after the famous Stuffit(tm) coffee grinder, users dragged and dropped their content (initially GIF files) onto LicensIt; a dialog popped up to collect (and display) their descriptive and other metadata and to enable them to select their "registration server" from a menu of choices; their work was registered. By way of both the static metadata and the registry, users would be able to contact the principals involved in the creation of the item. I envisioned several other options, including registering digital signatures to allow users to authenticate a work in hand, as well as enveloping the work in an encrypted envelope.
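
The registration idea can be sketched in a few lines of modern Python. This is an illustration only, not a reconstruction of LicensIt: the in-memory dictionary stands in for a networked registration server, the function names and sample metadata are hypothetical, and a SHA-256 content digest is used as a simple stand-in for the digital signatures mentioned above.

```python
import hashlib

# Hypothetical stand-in for a networked "registration server".
registry = {}


def register_work(content: bytes, metadata: dict) -> str:
    """Bind descriptive metadata to a work by registering its content digest."""
    digest = hashlib.sha256(content).hexdigest()
    registry[digest] = dict(metadata)  # copy so later edits don't leak in
    return digest


def lookup_work(content: bytes):
    """Return the registered metadata for a work in hand, or None if unknown."""
    return registry.get(hashlib.sha256(content).hexdigest())


# Made-up sample data for illustration.
image = b"GIF89a...sample image bytes..."
register_work(image, {"creator": "A. Photographer", "contact": "a@example.org"})
print(lookup_work(image))
print(lookup_work(image + b"tampered"))
```

A user holding an unmodified copy can recover the creator's contact details from the registry, while an altered copy no longer matches any registered digest.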

It is important to note that the focus of my work at that time was on enabling copyright by binding static and dynamic metadata to content and especially to make it as accessible as possible within the context of use; content security was only a secondary concern. "Enablement" means that although a desktop client is interesting, plugins for creation tools like Photoshop, Acrobat and Macromedia Director, and enjoyment tools like Mosaic --- this was 1994!! --- would be infinitely more interesting and useful! I assumed that one day, creators would be mixing and matching content found around the web, and at least commercial and other highly visible producers would want/need to "do the right thing" w.r.t. copyright and thus would benefit from instantly accessible attribution, bound to the item. Note that I was heavily influenced at that time by the writings of Prof. Henry H. Perritt, Jr. whose concept of permissions headers was not only an inspiration for me, but I believe anticipated Creative Commons licensing templates.

Fifteen years later, we can at least say the world is different! The world we imagined 15 years ago of rampant "re-mixing" of content has arrived; licensing models such as Creative Commons have improved awareness; but still the infrastructure does not accommodate the discovery and transmission of rights information as readily as it should. With the rise of new data-centric models such as Linked Data (a practical outcome of Semantic Web research) and the acceptance of persistent identifier systems including the Handle System and the Digital Object Identifier, we're getting there...

Next installment: The NetRights and YRM years...