The case for reference models

From filmstandards.org

From the TC 372 Workshop Compendium

Do we talk about the same thing?

Naming a data element may appear sufficient to give it a meaning. While this may hold for personal databases, it clearly does not once people from different backgrounds want to share the information.


Country-vs-country.png

Both statements on the left appear to be correct when viewed independently from each other.

Let's assume that the first statement is from a database of documentaries for educational purposes. Most users of this database will be more interested in where the film was shot, rather than in where it was produced.

The second statement is what we would expect from a general filmographic database where films are usually associated with a production country.

Penguins-2.png

Apparently we have to consider two locations: one for shooting and one for production.

Then, "let's add another field".

In fact, having fifty or more columns (fields) in a table is not uncommon in do-it-yourself databases. Most of these columns have accumulated over time through repeated decisions to "add another field".

Penguins-3.png

We now learn that the film was originally released in France as La marche de l'empereur. Apparently, March of the Penguins is a distribution title.

Unfortunately, we soon come across another distribution title, Marsz pingwinów.

"Add another field?"

Penguins-4.png

One data element per country of distribution clearly isn't workable.

After all, what does "Polish distribution title" refer to: the French original version, or a version adapted to the Polish market?

As long as we are cataloguing single copies in an archive, this question may be irrelevant. Once we start exchanging catalogue records with others, it can become an issue that needs to be resolved.
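
One way to avoid a field per country is to record each title and each location as a separate statement that carries its own qualifier. The following Python sketch is purely illustrative: the class names, the `title_type` and `role` vocabularies, and the assignment of France to production and Antarctica to shooting are assumptions made for this example and are not taken from any standard.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Title:
    text: str
    title_type: str               # hypothetical vocabulary: "original", "distribution"
    region: Optional[str] = None  # where the title was used, if known


@dataclass(frozen=True)
class Location:
    place: str
    role: str                     # hypothetical vocabulary: "production", "shooting"


@dataclass
class FilmRecord:
    titles: List[Title] = field(default_factory=list)
    locations: List[Location] = field(default_factory=list)

    def titles_of_type(self, title_type: str) -> List[str]:
        """Return the text of every title with the given type."""
        return [t.text for t in self.titles if t.title_type == title_type]

    def places_with_role(self, role: str) -> List[str]:
        """Return every place that plays the given role."""
        return [loc.place for loc in self.locations if loc.role == role]


# Values below are assumed for illustration only.
record = FilmRecord(
    titles=[
        Title("La marche de l'empereur", "original", "France"),
        Title("March of the Penguins", "distribution"),
        Title("Marsz pingwinów", "distribution", "Poland"),
    ],
    locations=[
        Location("France", "production"),
        Location("Antarctica", "shooting"),
    ],
)

print(record.titles_of_type("distribution"))
print(record.places_with_role("shooting"))
```

With this layout, a further distribution title is one more entry in the list rather than one more column. The sketch still does not say which version of the film a given title belongs to; answering that requires deciding what kinds of things the record describes.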

What-could-this-be.png

Asking what something is can easily lead us back to Adam and Eve (or to the beginning of the universe).

And, indeed, modern information science does relate things back to universal categories.

• • •

Clarifying the things to talk about

Metadata consists of statements about something. In this way, it is no different from ordinary discourse. Fruitful discourse requires a clear understanding of at least some basic concepts behind our words.

Porphyry.jpg

Defining classes of things by looking at how to distinguish them (finding the "differentiae") has been an intellectual exercise since ancient times.

The diagram on the left shows a part of the Tree of Porphyry, derived from an introduction to Aristotelian logic written by Porphyry of Tyre in the 3rd century. This particular rendering was transcribed from a drawing attributed to Peter of Spain (1329).

Oct-toplevel.png

From: Aldo Gangemi: Project Proposal for a Fishery Ontology Service. Gainesville FL, 2002.

Here, we have a more recent example of defining top-level distinctions, including some relationships in addition to the class hierarchy.

Clarifying the philosophical perspective can be highly useful when defining models for representing domain knowledge. In this particular example, the purpose was to enunciate some basic assumptions before developing a model for information about fishing.

Since they define a basic frame of reference, these specifications are often called reference models or reference ontologies.


Indecs-model-1.png

From: The <indecs> metadata framework, WP1a-006-2.0. Indecs Framework Ltd., 2000

This reference model brings us a little closer to our universe of discourse.

While still containing very abstract notions such as concept and percept, this model also defines more specialised entities such as manifestation. It attempts to identify the circumstances under which intellectual and artistic creations can become subject to transactions among people.

Frbr-model-1.png

From: Functional Requirements for Bibliographic Records. IFLA UBCIM Publications – New Series Vol 19. München: K.G. Saur, 1998

Even closer to our task of defining metadata concepts for audiovisual media is this reference model from the library community.

This model, known as FRBR, had a profound influence on the discussion about the future of cataloguing. The diagram shows the four Group 1 entities (work, expression, manifestation and item) that define description levels for intellectual or artistic creations.

FRBR was also influential during the development of EN 15907.
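
The layering of work, expression, manifestation and item in FRBR can be pictured as a chain of containment. The Python sketch below is only an illustration of that idea: the attribute names, the `all_items` helper and all sample labels are hypothetical, and the sketch leaves out everything else FRBR defines (attributes, other entity groups, further relationships).

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Item:
    """A single exemplar of a manifestation."""
    identifier: str


@dataclass
class Manifestation:
    """The physical embodiment of an expression."""
    label: str
    items: List[Item] = field(default_factory=list)


@dataclass
class Expression:
    """A realization of a work."""
    label: str
    manifestations: List[Manifestation] = field(default_factory=list)


@dataclass
class Work:
    """A distinct intellectual or artistic creation."""
    label: str
    expressions: List[Expression] = field(default_factory=list)

    def all_items(self) -> List[Item]:
        """Collect every item reachable from this work (hypothetical helper)."""
        return [
            item
            for expression in self.expressions
            for manifestation in expression.manifestations
            for item in manifestation.items
        ]


# All labels and identifiers below are invented for illustration.
work = Work(
    "Example work",
    expressions=[
        Expression(
            "Original version",
            manifestations=[
                Manifestation("Release A", items=[Item("copy-001"), Item("copy-002")]),
            ],
        ),
        Expression(
            "Adapted version",
            manifestations=[
                Manifestation("Release B", items=[Item("copy-003")]),
            ],
        ),
    ],
)

print([item.identifier for item in work.all_items()])
```

A statement such as a title can then be attached at the level it actually describes, for example to one expression rather than to the work as a whole.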

En15907-entities.png

From: EN 15907:2010 - Film identification - Enhancing interoperability of metadata - Element sets and structures.

Finally, this is what CEN Technical Committee 372 found to be useful for modelling metadata about cinematographic works.

Some of these entities are specific to the standard. Others can be mapped 1:1 to corresponding definitions in reference models such as FRBR.
