In a recent project, we were asked to help create a worldwide catalog of all the resources available online about a given domain (the actual domain is not relevant to this discussion). These resources were of various types (data sets, videos, scientific papers, simulations, etc.) and were typically available in multiple languages and formats. A traditional approach would have been to use a specialized metadata model for each category of resource (for example, IEEE LOM for learning objects, MARC 21 for publications, etc.). With heterogeneous resource types already included and a strong probability of new types being added down the road, we faced rising costs from constantly adapting applications to new data models. To overcome this problem, we decided to use a generic model that encompasses all resource types. This kind of data model relies on a facet mechanism that makes it possible to integrate new requirements and resource types without compromising the existing structure.
The approach proposes a generic metadata container that can be used to describe, in one metadata record, all the aspects of all types of digital resources.
Introducing a Generic Metadata Container Structure
In this container, metadata records are structured according to a FRBR hierarchy (i.e., work, expression, manifestation, and item) where each FRBR level is described using a limited set of structural metadata elements (i.e., fixed metadata elements that belong to the container structure and are therefore common to all records) complemented by faceted descriptions that can vary depending on the type of resource being described, its context, and the relevant FRBR level.
Structural Metadata Elements
The following categories of elements are present at each FRBR level:
- Structural elements specific to a given level;
- An optional element “description” that, when present, holds the facets for this level; and
- A sub-container for the next FRBR level (i.e., an element “expressions” at the work level, an element “manifestations” at the expression level, and an element “items” at the manifestation level).
The work level is used to describe the information content of the resource as a distinct intellectual creation (e.g., its title, its abstract, its authors). In addition to the elements “description” and “expressions”, each work is described by two structural elements:
- Resource identifier (in this approach, every metadata record must be uniquely identified) and
- Resource type (to distinguish among the different resource types).
The expression level is used to describe information specific to a particular version of the work (e.g., a particular language version).
In addition to elements “description” and “manifestations”, each expression is described by two structural elements:
- Language and
These two elements are used to distinguish between the different expressions of a given work. They are called the expression dimensions.
The manifestation level is used to describe the different ways an expression can be experienced by a user, for example, whether a manifestation of a resource is intended to be printed or can be streamed.
In addition to elements “description” and “items”, each manifestation is described by two structural elements:
- Name (the name of the manifestation: “printable”, “streamable”, “web feed”, etc.) and
- Parameters (an optional parameter that further describes the manifestation; for example, possible parameters for web feeds would include “atom” and “rss”).
In addition to the element “description”, each item is described by one structural element:
- Location (where the item can be accessed, e.g., its URL).
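The nesting described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's actual schema: the class and field names, the sample identifier, and the example URL are hypothetical, the facets held by “description” are represented as a plain dictionary, and the expression carries only the language dimension because it is the only one named in the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Item:
    location: str                       # where the item can be accessed, e.g. a URL
    description: Optional[dict] = None  # facets for the item level, if any


@dataclass
class Manifestation:
    name: str                           # e.g. "printable", "streamable", "web feed"
    parameters: Optional[str] = None    # e.g. "atom" or "rss" for a web feed
    description: Optional[dict] = None  # facets for the manifestation level
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    language: str                       # the only expression dimension modeled here
    description: Optional[dict] = None  # facets for the expression level
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    identifier: str                     # unique identifier of the record
    resource_type: str                  # e.g. "video", "data set"
    description: Optional[dict] = None  # facets for the work level
    expressions: List[Expression] = field(default_factory=list)


def locations(work: Work) -> List[str]:
    """Walk work -> expression -> manifestation -> item and collect locations."""
    return [
        item.location
        for expression in work.expressions
        for manifestation in expression.manifestations
        for item in manifestation.items
    ]


# Hypothetical record: one English, streamable version of a video.
record = Work(
    identifier="example:record:1",
    resource_type="video",
    expressions=[
        Expression(
            language="en",
            manifestations=[
                Manifestation(
                    name="streamable",
                    items=[Item(location="https://example.org/video-en")],
                )
            ],
        )
    ],
)
```

Because each level owns the sub-container for the next one, a consumer that only needs access points can walk the tree with `locations(record)` and ignore every “description” it passes on the way.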
Facets are the key mechanism that makes the model generic and extensible. Typically, applications are specialized. For example, a digital rights management system is only interested in information about the digital rights associated with a resource and does not need to access other metadata. By dividing metadata into specialized facets, this generic model allows specialized applications to easily access only the information relevant to their services while ignoring the metadata encapsulated in other facets.
Facets are used to describe specific aspects of a FRBR level within the element “description”. A facet consists of a container named after the facet that holds three kinds of sub-elements:
- An element “schema” used to identify the facet;
- A controlled block that contains all the controlled vocabulary elements of the facet; and
- Language blocks that contain all the free-text elements (e.g., title, description) of the facet grouped by languages.
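As a rough sketch of this facet layout, the Python below models a facet as a schema identifier, a controlled block, and language blocks, and shows how a specialized application might read only the facet it cares about. The names `Facet` and `get_facet`, the schema URLs, and the sample values are assumptions made for illustration; they do not come from the project.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Facet:
    schema: str  # identifies the facet
    # Controlled block: controlled-vocabulary elements of the facet.
    controlled: Dict[str, str] = field(default_factory=dict)
    # Language blocks: free-text elements grouped by language code.
    languages: Dict[str, Dict[str, str]] = field(default_factory=dict)


def get_facet(description: Dict[str, Facet], name: str) -> Optional[Facet]:
    """Return the named facet of a "description", or None if it is absent."""
    return description.get(name)


# Hypothetical "description" element holding two facets, keyed by facet name.
description = {
    "common": Facet(
        schema="https://example.org/facets/common",
        languages={
            "en": {"title": "Introduction to tides"},
            "fr": {"title": "Introduction aux marées"},
        },
    ),
    "rights": Facet(
        schema="https://example.org/facets/rights",
        controlled={"license": "cc-by"},
    ),
}

# A rights-management application looks only at the "rights" facet.
rights = get_facet(description, "rights")
```

Keying the facets by name lets an application fetch its own facet directly, without parsing or even knowing about the others.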
Modeling with Facets
The idea is to have each facet encapsulate, in a precisely defined manner, all the information necessary to describe a specific aspect of a resource so that this information can easily be consumed and processed by specialized applications or functionalities. Examples of such facets include:
- A “rights” facet that describes the intellectual property rights associated with a resource;
- A “pedagogical” facet that describes the pedagogical content of a resource;
- An “accessibility” facet that indicates the presence of features in a resource providing access to users with special needs; and
- A “meta-metadata” facet that describes the metadata record itself.
In addition to these specialized facets, we have introduced a “common” facet that, as its name suggests, is common to all resource types and provides general information about a resource such as title, description, subjects covered, etc.
Using the generic metadata container structure presented above, these facets can be easily combined by adding them at the appropriate FRBR level to model any type of resource.
This process is facilitated by building a catalog in which all these predefined facets are uniquely identified, classified and described.
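One possible shape for such a catalog is sketched below in Python. It is a minimal illustration, not the catalog used in the project: the `FacetDefinition` fields, the identifiers, the categories, and in particular the FRBR levels assigned to each facet are assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Tuple


@dataclass(frozen=True)
class FacetDefinition:
    identifier: str          # unique identifier of the facet
    category: str            # classification of the facet
    summary: str             # human-readable description
    levels: Tuple[str, ...]  # FRBR levels where it may appear (assumption)


# Hypothetical catalog entries; the level assignments are illustrative only.
CATALOG: Dict[str, FacetDefinition] = {
    "common": FacetDefinition(
        "https://example.org/facets/common",
        "general",
        "Title, description, subjects covered",
        ("work", "expression"),
    ),
    "rights": FacetDefinition(
        "https://example.org/facets/rights",
        "legal",
        "Intellectual property rights associated with a resource",
        ("work", "manifestation"),
    ),
    "accessibility": FacetDefinition(
        "https://example.org/facets/accessibility",
        "access",
        "Features providing access to users with special needs",
        ("manifestation",),
    ),
}


def misplaced_facets(names: Iterable[str], level: str) -> List[str]:
    """Return facet names that are unknown or not expected at this level."""
    return [
        name
        for name in names
        if name not in CATALOG or level not in CATALOG[name].levels
    ]
```

With a catalog like this, a cataloging tool could call `misplaced_facets(["common", "rights"], "work")` before saving a record and flag any facet that is unregistered or attached at an unexpected level.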
In a future post, we will describe in detail how to implement this kind of model to maximize the benefits of the faceted approach.