Metadata specifications in context


From the TC 372 Workshop Compendium

Metadata, as the name implies, is data about data. In current usage, however, "data" is not restricted to digitally encoded information: it can be almost anything.



The concept of metadata became popular about 15 years ago when it was realised that the emerging World Wide Web with all of its digital objects would need some equivalent to catalogue records.

The ensuing development took a different course, however, with free text search engines becoming the major catalogues to the Web.

Because many digital objects are non-textual, and even a text document is often a poor description of itself, interest in metadata continued and has increased considerably over the past few years.


From: Hinrichs' Halbjahreskatalog, 204. Fortsetzung, erstes Halbjahr 1900, Leipzig: Hinrichs, 1900. p.215.

Metadata has been produced for centuries. Usually referred to as catalogues, some collections of metadata had become huge and complex works long before the advent of computers.

Cataloguing rules had to reflect this complexity, leading to an ever increasing number of clauses and directives.


Frame from: Alle Kennis van de wereld. Directed by Ijsbrand van Veelen. VPRO (Hilversum), 1998.

Today's metadata, particularly that in the cultural heritage domain, still often resembles the paper-based catalogue.

One card for each item in the collection.

This legacy lives on, even in some of the most recent metadata specifications.


by Detlev Balzer, 2004.

Plenty of metadata is produced by machines.

Embedding metadata in the medium ensures that it does not get lost (as long as the medium remains intact).

Philosophical question:
To what extent is embedded metadata part of the work? Is a painter's signature a part of the image? If it is, what if the image is signed on the back of the canvas?


Photo by Detlev Balzer, 2010.

Some would also subsume this under the concept of metadata.

This label is clearly about something. Its use and its content schema are even mandated by law.

Assuming that fashion is not a primary business of film archives, we will henceforth narrow our focus to metadata about cultural heritage items in general, and audiovisual artefacts in particular.


From: Hinrichs' Halbjahreskatalog, 204. Fortsetzung, erstes Halbjahr 1900, Leipzig: Hinrichs, 1900. p.215.

Defining metadata means defining structure. Basically, a metadata schema defines an artificial language, consisting of a vocabulary and a grammar.

Marking up the artificial grammar elements in the example on the left will easily exhaust your stock of felt-tip pens.
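
The idea of a schema as vocabulary plus grammar can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the element names, the occurrence rules and the sample records are hypothetical and are not taken from any real metadata specification.

```python
# Hypothetical toy schema: names and rules are illustrative only.
VOCABULARY = {"title", "creator", "year"}   # the allowed element names

# The "grammar": (minimum, maximum) occurrences per element; None = unbounded.
GRAMMAR = {
    "title": (1, 1),
    "creator": (0, None),
    "year": (0, 1),
}

def validate(record):
    """Return a list of problems found in a list of (element, value) pairs."""
    problems = []
    counts = {}
    for name, _value in record:
        if name not in VOCABULARY:
            problems.append(f"unknown element: {name}")
            continue
        counts[name] = counts.get(name, 0) + 1
    for name, (low, high) in GRAMMAR.items():
        n = counts.get(name, 0)
        if n < low:
            problems.append(f"{name}: expected at least {low}, found {n}")
        if high is not None and n > high:
            problems.append(f"{name}: expected at most {high}, found {n}")
    return problems

# Invented sample records
good = [("title", "An example film"), ("creator", "A. Person"), ("year", "1962")]
bad = [("creator", "A. Person"), ("colour", "b/w")]

print(validate(good))
print(validate(bad))
```

A real schema language expresses far richer rules (ordering, nesting, data types), but the division of labour is the same: one part names the elements, the other says how they may be combined.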


Catalogue Record from Biblioteca Nacional de España

Librarians have devised a metadata standard in which elements are identified through a numbering scheme. Known as MARC (or variants thereof), this scheme has developed from modest beginnings in the 1960s into a family of complex specifications. Variants of MARC have been adopted by the majority of libraries worldwide.

In recent years, MARC has increasingly been criticised for its inconsistent syntax and semantics.
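
To show what identifying elements through a numbering scheme looks like in practice, here is a heavily simplified sketch. The tags 100, 245 and 260 and their meanings follow MARC 21, but the record structure is an assumption: a real MARC record also carries a leader, indicators and control fields, which are omitted here, and the values are invented. The helper `subfield` is hypothetical and not part of any MARC library.

```python
# Simplified, hypothetical record: (tag, {subfield code: value}) pairs.
# The values are invented for illustration.
record = [
    ("100", {"a": "Doe, Jane"}),
    ("245", {"a": "An example title", "c": "Jane Doe"}),
    ("260", {"a": "Leipzig", "b": "Example Verlag", "c": "1900"}),
]

# Human-readable meaning of a few MARC 21 tags
TAG_LABELS = {
    "100": "Main entry, personal name",
    "245": "Title statement",
    "260": "Publication, distribution, etc.",
}

def subfield(record, tag, code):
    """Return subfield `code` from the first field tagged `tag` that has it, else None."""
    for field_tag, subfields in record:
        if field_tag == tag and code in subfields:
            return subfields[code]
    return None

for tag, subfields in record:
    print(tag, TAG_LABELS.get(tag, "?"), subfields)

print(subfield(record, "245", "a"))
```

The numbers carry no meaning by themselves: a reader or a program needs the specification, here reduced to `TAG_LABELS`, to know that 245 holds the title.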


Part of a catalogue record from the Moving Image Collections portal, retrieved October 2010

The image on the left shows metadata encoded in XML. This encoding uses human-readable names for its elements, and nesting (i.e. elements enclosed by elements) as a way of expressing structure.

XML has become the most widely used encoding for data and metadata exchange. It is largely neutral with respect to the semantics of data elements. Therefore, it can be used as an encoding for arbitrary data structures.
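
The following sketch shows how nesting expresses structure, using Python's standard `xml.etree.ElementTree` module. The element names and values are hypothetical and are not taken from the catalogue record shown in the image.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: element names and values are invented.
xml_text = """
<record>
  <title>An example film</title>
  <production>
    <year>1962</year>
    <country>IT</country>
  </production>
</record>
"""

root = ET.fromstring(xml_text)

def paths(element, prefix=""):
    """Yield (path, text) for every leaf element under `element`."""
    path = f"{prefix}/{element.tag}"
    children = list(element)
    if not children:
        yield path, (element.text or "").strip()
    for child in children:
        yield from paths(child, path)

leaves = list(paths(root))
for path, text in leaves:
    print(path, "=", text)
```

The parser neither knows nor cares what `production` or `year` mean. That is the sense in which XML is neutral with respect to semantics: the same code walks any well-formed document.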


Selected statements from Retrieved March, 2011

This is filmographic metadata represented in RDF/N3.

RDF (short for Resource Description Framework) is not actually an encoding, but a data model that can itself be encoded in different ways. Among other things, it is the recommended representation for metadata using the Dublin Core element set, and for controlled vocabularies expressed using the SKOS model.

Adoption of RDF has been slower than that of many other technologies, perhaps because of its more radical departure from established methods of representing data.

One particular strength of RDF is that it facilitates integration of data from different models without the need for finding a least common denominator. RDF has been chosen as the core for several activities collectively known as the Semantic Web.
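
A rough sketch of why this integration is straightforward: in the RDF data model, every statement is a triple of subject, predicate and object, so data from two sources can be combined by taking the union of their statements. The code below uses plain Python tuples rather than an RDF library, and it is an illustration under stated assumptions. The `http://example.org/` identifiers and the `director` property are hypothetical, the values are invented, and objects are simplified to plain strings. Only the Dublin Core terms namespace is real.

```python
DCT = "http://purl.org/dc/terms/"       # Dublin Core terms namespace
EX = "http://example.org/film#"         # hypothetical vocabulary
FILM = "http://example.org/id/film/1"   # hypothetical resource identifier

# Source A describes the film using Dublin Core terms only.
source_a = {
    (FILM, DCT + "title", "An example film"),
    (FILM, DCT + "date", "1962"),
}

# Source B mixes Dublin Core with a property from its own model.
source_b = {
    (FILM, EX + "director", "A. Person"),
    (FILM, DCT + "title", "An example film"),
}

# Merging is a set union: identical statements collapse, the rest accumulate.
merged = source_a | source_b

def describe(graph, subject):
    """Return the sorted (predicate, object) pairs stated about `subject`."""
    return sorted((p, o) for s, p, o in graph if s == subject)

for predicate, obj in describe(merged, FILM):
    print(predicate, "->", obj)
```

Neither source had to give up its own properties or be mapped to a shared minimal schema before the merge. What makes this work is that both use the same identifier for the film and globally unique names for their properties.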
