No entity without identity

From filmstandards.org

Latest revision as of 20:12, 1 July 2011

From the TC 372 Workshop Compendium

Where philosophers can help

Any statement about something must identify the thing in question. Introducing description levels into metadata also means introducing different concepts of identity.

Jazzgossen-rels-id.png

Graph based on information from the Swedish Film Database

In the relationship graph from the preceding session we identified instances of Cinematographic Work by a title, and instances of Agent by a name.

As databases grow, titles and names can quickly become ambiguous, while machines require unambiguous identifiers for operating on relationships. Moreover, in this example we have an instance of an Event entity, for which no "natural" identifier exists.

Imdb-agentname.png

From: Internet Movie Database, http://www.imdb.com/ accessed 13-Oct-2010

For many years, the creators of the Internet Movie Database (and also those of Wikipedia) believed that all things of interest could be identified uniquely and persistently by a name, a title, or similar.

This identifier scheme turned out to be difficult to manage as the databases grew in size.

In recent years, both databases have introduced non-semantic identifiers in the form of numbering schemes that remain hidden from the user interface.

Zensurkarte.jpg

No bureaucracy without identifiers.

An identifier such as "58981" can only be unique within a particular scope. In this case, the scope is the set of censorship records from the German Filmprüfstelle in Berlin.

In databases, the scope of an identifier is usually limited to an entity from the data model. In this way, different entities can share the same set of identifiers, e.g. a Cinematographic Work 12345 can be distinguished from an Agent 12345.
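
A minimal sketch of entity-scoped identifiers, assuming an in-memory store keyed by the pair (entity, local identifier). The entity names follow the text; the record contents and the `lookup` helper are hypothetical and only serve the illustration.

```python
# Hypothetical records: the same local number 12345 is used in two scopes.
records = {
    ("CinematographicWork", 12345): {"title": "Example Film"},
    ("Agent", 12345): {"name": "Example Person"},
}


def lookup(entity: str, local_id: int) -> dict:
    """Resolve a local identifier within the scope of one entity."""
    return records[(entity, local_id)]


work = lookup("CinematographicWork", 12345)
agent = lookup("Agent", 12345)
print(work)
print(agent)
```

The number alone is ambiguous; only the combination of entity and number identifies a record.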

In the Linked Open Data paradigm, the scope of an identifier is determined by a namespace. All Uniform Resource Identifiers (URIs) must contain a component that identifies a namespace.
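
As a rough illustration, a URI can be assembled from a namespace and a local identifier. The `example.org` namespaces and both helper functions below are assumptions made for this sketch; they are not taken from any actual film database. The split only handles namespaces that end in a slash.

```python
# Hypothetical namespaces, one per entity.
WORK_NS = "http://example.org/work/"
AGENT_NS = "http://example.org/agent/"


def make_uri(namespace: str, local_id: str) -> str:
    """Combine a namespace with a local identifier."""
    return namespace + local_id


def split_uri(uri: str) -> tuple[str, str]:
    """Split a slash-terminated namespace from the local identifier."""
    namespace, _, local_id = uri.rpartition("/")
    return namespace + "/", local_id


work_uri = make_uri(WORK_NS, "12345")
agent_uri = make_uri(AGENT_NS, "12345")
print(work_uri)
print(agent_uri)
print(split_uri(work_uri))
```

The two URIs differ although the local identifier is the same, because the namespace carries the scope.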

Dif-guid-1.png

From the in-house database of Deutsches Filminstitut, Frankfurt am Main

This is an identifier from the DIF database which uses globally unique identifiers (GUIDs) for all instances of all entities.

GUIDs are 128-bit numbers (up to 39 decimal digits) which are constructed in such a way that the chance of two computers generating the same ID is extremely small, even over long periods of time.

GUIDs are particularly useful in environments where data is created without centralised control.
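
The following sketch generates such an identifier with Python's standard `uuid` module. How the DIF database actually creates its GUIDs is not stated here, so the use of random (version 4) UUIDs is an assumption for illustration only.

```python
import uuid

# A random (version 4) UUID; no central registry is consulted.
guid = uuid.uuid4()
print(guid)

# The value always fits into 128 bits.
print(guid.int.bit_length() <= 128)

# Number of decimal digits of the largest 128-bit value.
max_value = 2**128 - 1
print(len(str(max_value)))
```

Because each machine can generate identifiers on its own, no coordination step is needed before a new record is created.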

Frbr-model-1.png

From: Functional Requirements for Bibliographic Records. IFLA UBCIM Publications – New Series Vol 19. München: K.G. Saur, 1998

Identity (expressed through identifiers) plays a major role in interpreting the FRBR description layers.

Two distinct items can share an identity only at the manifestation level or above. Likewise, two distinct manifestations can share an identity only at the expression level or above, and so on.
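
One way to picture this is to give every item an identifier at each FRBR level and to ask for the lowest level at which two items coincide. The sketch below is a simplification under stated assumptions: the `Item` class, its flattened fields, and the sample identifiers are hypothetical and not part of FRBR itself.

```python
from dataclasses import dataclass
from typing import Optional

# FRBR levels, ordered from the most specific to the most general.
LEVELS = ("item", "manifestation", "expression", "work")


@dataclass(frozen=True)
class Item:
    # Flattened for brevity: each item carries the identifiers of all levels.
    item_id: str
    manifestation_id: str
    expression_id: str
    work_id: str


def lowest_shared_level(a: Item, b: Item) -> Optional[str]:
    """Return the most specific level at which a and b are identical."""
    for level in LEVELS:
        if getattr(a, f"{level}_id") == getattr(b, f"{level}_id"):
            return level
    return None


copy_1 = Item("i1", "m1", "e1", "w1")
copy_2 = Item("i2", "m1", "e1", "w1")  # another copy of the same manifestation
other = Item("i3", "m2", "e1", "w1")   # a different manifestation
print(lowest_shared_level(copy_1, copy_1))
print(lowest_shared_level(copy_1, copy_2))
print(lowest_shared_level(copy_1, other))
```

An item is identical with itself at the item level, while two different copies can at best coincide from the manifestation level upwards.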

Levels or modes of identity have been (and continue to be) a major topic of analytical philosophy.

Gabeln.jpg

Photos and montage: Detlev Balzer, 2010

Levels of identity can be distinguished in every kitchen.

On the left, we have an item which, by definition, can only be identical with itself. In the centre, we have two items that are identical at the manifestation level, i.e. these forks were apparently manufactured from the same mould. On the right, we have two items from different manifestations that are identical at a generic concept level, i.e. both can be identified as forks.

We could easily extend this example with a fourth level, e.g. pieces of cutlery belonging to a named series from a particular designer.

Quine-citation.png

Questions of identity have often been the subject of drama, one of the best-known examples being Dr. Jekyll and Mr. Hyde.

In the context of filmography, similar questions arise when film works are modified to such an extent that the level of identity needs to be assessed.

Such assessments of identity cannot be made on the basis of a metadata standard alone.

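References and Materials: Identity and identifiers

* Persistent Identifiers (PIDs): Recommendations for Institutions. A useful guideline from the ATHENA project, April 2011. http://www.athenaeurope.org/getFile.php?id=779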
