Identification for DDI 4
This version: 2014-03-27
Prepared By: Dan Smith-Colectica, reviewed and modified at NADDI Sprint
This document outlines the DDI Technical Committee discussion on identification in DDI 4 from February 2014.
- Global Uniqueness
- Alignment with DDI 3.2 Lifecycle
- Alignment with ISO/IEC 11179-6
- Semantics work in both XML and RDF serializations
- Any URI pattern may be used in the RDF serialization
Requirement 6 Note:
Many organizations may have preexisting URI schemes, or have URI patterns imposed on them by other organizations or governments. DDI will not require any specific information or pattern to be contained in the URIs of described resources.
To align with both DDI 3.x and ISO/IEC 11179-6, the identifier system will continue to be based on a combination of:
- Agency Identifier
- Item Identifier
- Item Version
These parts correspond to the agency, id, and version used in DDI 3.x, and to the registration authority identifier (RAI), data identifier (DI), and version identifier (VI) that make up the international registration data identifier (IRDI) in ISO/IEC 11179-6.
In DDI 3.x, all items had an agency, item id, and version. However, some types of items could inherit a parent item’s agency. Some items would inherit a parent item’s version. In DDI 4, all items will have their own Agency, Item Identifier, and Item Version specified.
In DDI 3.x, regular expressions restricted the Agency Identifier, Item Identifier, and Item Version. It is suggested that these types of restrictions be removed for DDI 4.
A suggested format is a string without colons or whitespace.
Unrestricted identifiers allow systems to use internal identifiers that the earlier restrictions may have ruled out. An unrestricted version field also lets users and systems adopt other versioning schemes, such as the hash-based identifiers used by Git and Mercurial.
Note that these are restrictions on the content of each field, not on the structure of a DDI URN. Excluding the colon from field content allows it to serve as the URN separator.
This complies with ISO/IEC 11179-6 as it imposes no limitations on the contents of the IRDI fields.
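The suggested content rule (no colons, no whitespace) can be sketched as a small check. This is a minimal illustration, not part of the specification: the class and method names are hypothetical, and the `urn:ddi:<agency>:<id>:<version>` layout is assumed from the DDI 3.x URN convention.

```python
import re
from dataclasses import dataclass

# Suggested content rule: a non-empty string with no colons and no whitespace.
_FIELD = re.compile(r"[^:\s]+")


@dataclass(frozen=True)
class DdiIdentifier:
    """Illustrative identifier triple; names here are hypothetical."""
    agency: str
    item_id: str
    version: str

    def __post_init__(self):
        for name in ("agency", "item_id", "version"):
            value = getattr(self, name)
            if not _FIELD.fullmatch(value):
                raise ValueError(
                    f"{name} must be non-empty with no colons or whitespace: {value!r}"
                )

    def to_urn(self) -> str:
        # Assumed layout, colon-separated: urn:ddi:<agency>:<id>:<version>
        return f"urn:ddi:{self.agency}:{self.item_id}:{self.version}"

    @classmethod
    def from_urn(cls, urn: str) -> "DdiIdentifier":
        # Splitting on ":" is unambiguous because fields cannot contain colons.
        parts = urn.split(":")
        if len(parts) != 5 or parts[0] != "urn" or parts[1] != "ddi":
            raise ValueError(f"not a DDI URN: {urn!r}")
        return cls(*parts[2:])
```

With this sketch, `DdiIdentifier("int.example", "var-42", "9fceb02").to_urn()` yields `urn:ddi:int.example:var-42:9fceb02`. The version here is a hash-style string, which the unrestricted version field would permit.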
In DDI 3.2, versions are restricted to integers that may be separated by periods. This forces implementers to use a specific versioning system. A more flexible system would use a “based on” reference to determine version history. In addition, a “based on” system adds the ability to branch and merge. A “based on” system would be backwards compatible with DDI 3.x versioning systems.
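As a rough sketch of how a “based on” reference could recover version history, the example below stores, for each version label, the versions it was based on. The labels and data are made up for illustration, and a real reference would presumably carry the full agency, identifier, and version rather than a bare label.

```python
from collections import deque

# Map each version label to the versions it is "based on" (its parents).
based_on = {
    "1.0": [],
    "1.1": ["1.0"],                  # linear successor, as in DDI 3.x numbering
    "draft-a": ["1.1"],              # a branch
    "draft-b": ["1.1"],              # a second branch from the same version
    "2.0": ["draft-a", "draft-b"],   # a merge: two parents
}


def history(version, based_on):
    """Return all ancestors of `version`, nearest first (breadth-first)."""
    seen, order = set(), []
    queue = deque(based_on[version])
    while queue:
        parent = queue.popleft()
        if parent in seen:
            continue  # already reached through another branch
        seen.add(parent)
        order.append(parent)
        queue.extend(based_on[parent])
    return order
```

Here `history("2.0", based_on)` returns `["draft-a", "draft-b", "1.1", "1.0"]`. The history comes from the references alone, so the version labels need not be ordered or even numeric.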
There will be no correspondence between identification and containership. While items may be placed into container-like or group-like relationships, these relationships shall not affect their identity.
All identified items shall be identified in the same manner, regardless of the structure of the domain model. This will simplify DDI identification and ease integration of the XML and RDF serializations.
In DDI 3.x, some items were classified as Maintainable, Versionable, and Identifiable. The way their identity was constructed may have varied based on another item. There will be no such distinction in DDI 4; all identified items will have the same identification requirements.
The XML Serialization will integrate with the identification scheme in the same manner as the current DDI Lifecycle (3.2). Each identified item shall contain an Agency Identifier, Item Identifier, and Item Version.
When referencing an identified item to create a semantic relationship, the Agency Identifier, Item Identifier, and Item Version shall be specified to create the relationship reference. The Type of Item may be included in the reference.
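The following sketch shows the shape of an identified item and a reference to another item in an XML serialization. All element names (`Variable`, `ConceptReference`, `Agency`, `ID`, `Version`, `TypeOfObject`) are placeholders chosen for illustration; the actual DDI 4 element names are not defined in this document.

```python
import xml.etree.ElementTree as ET


def identification(parent, agency, item_id, version):
    """Append the three identification fields to `parent`."""
    for tag, text in (("Agency", agency), ("ID", item_id), ("Version", version)):
        ET.SubElement(parent, tag).text = text


# An identified item carrying its own full identification.
variable = ET.Element("Variable")
identification(variable, "int.example", "var-42", "3")

# A reference to another item: the same three fields, plus an optional type.
ref = ET.SubElement(variable, "ConceptReference")
identification(ref, "int.example", "concept-7", "1")
ET.SubElement(ref, "TypeOfObject").text = "Concept"

xml_text = ET.tostring(variable, encoding="unicode")
```

The reference repeats all three fields, so it can be resolved without inspecting any enclosing element.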
DDI will not impose any requirements on the construction of resource URIs in the RDF serialization. The URI may be constructed in any manner. This will support users who are mandated by governing organizations or law to construct URIs in specific ways. The resource shall have an Agency Identifier, Item Identifier, and Item Version as properties, specified in the DDI namespace. These properties will be required in the OWL/RDFS schema for DDI objects. This will support interoperability between RDF and XML expressions of the same content. These properties may also be mapped to additional vocabularies such as Dublin Core using sameAs.
When referencing an identified item to create a semantic relationship, the URI of the item will be used. To enable interoperability between the RDF and XML representations, additional practices will be needed when creating relationships in RDF to determine the identity of an item.
DDI will be publishing a practice for creating concise bounded descriptions (CBD) and named graphs of an identified item or set of items. This document could include the following additional requirement:
When referencing an identified item via a URI that will not be included in the CBD or named graph, still include the Agency Identifier, Item Identifier, and Item Version as properties of the URI in the CBD. Additionally, the RDF type information about the referenced item may be included.
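A minimal sketch of this practice, using plain tuples as triples rather than an RDF library. The namespace, the property names (`agency`, `itemId`, `itemVersion`, `hasConcept`), and the URIs are all placeholders, not the published DDI vocabulary.

```python
DDI = "http://example.org/ddi#"  # placeholder namespace, not the real one
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def identification_triples(uri, agency, item_id, version):
    """Triples giving a resource its three identification properties."""
    return [
        (uri, DDI + "agency", agency),
        (uri, DDI + "itemId", item_id),
        (uri, DDI + "itemVersion", version),
    ]


# The described item: its URI follows a local scheme unrelated to the identifier.
variable = "http://stats.example.gov/r/8842"
# The referenced item, whose own description lies outside this CBD.
concept = "http://other.example.org/concepts/income"

cbd = identification_triples(variable, "int.example", "var-42", "3")
cbd.append((variable, DDI + "hasConcept", concept))
# The referenced URI still carries its identification inside this CBD...
cbd += identification_triples(concept, "int.other", "concept-7", "1")
# ...and, optionally, its RDF type.
cbd.append((concept, RDF_TYPE, DDI + "Concept"))
```

A consumer holding only this description can map the `concept` URI back to its agency, identifier, and version, which is what an XML reference to the same item would contain.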
“In OWL 2 a collection of properties can be assigned as a key to a class expression. This means that each named instance of the class expression is uniquely identified by the set of values which these properties attain in relation to the instance.” - OWL 2 Primer
For additional formality, the Agency Identifier, Item Identifier, and Item Version can be added to the class expression of the identified item with the OWL 2 HasKey axiom to create a composite key. This would uniquely identify the named instances.
See OWL 2 Keys.
HasKey( :DDIItem () ( :AgencyId :ItemId :ItemVersion ) )
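The effect of such a key can be illustrated without a reasoner. Under a HasKey axiom, named individuals that share all key values are inferred to be the same individual; the sketch below groups URIs by the composite key to show which ones would be identified with each other. The function name and sample data are hypothetical.

```python
def same_by_key(items):
    """Group URIs that share one (agency, item id, version) key.

    `items` is an iterable of (uri, agency, item_id, version) tuples.
    Returns only the keys shared by more than one distinct URI.
    """
    by_key = {}
    for uri, agency, item_id, version in items:
        by_key.setdefault((agency, item_id, version), set()).add(uri)
    return {key: uris for key, uris in by_key.items() if len(uris) > 1}


items = [
    ("http://stats.example.gov/r/8842", "int.example", "var-42", "3"),
    ("http://mirror.example.org/v/42", "int.example", "var-42", "3"),
    ("http://stats.example.gov/r/8843", "int.example", "var-42", "4"),
]
merged = same_by_key(items)
```

In this sample the first two URIs share a key and would be treated as the same item, while the third differs in version and remains distinct.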
W3C, "CBD - Concise Bounded Description," [Online]. Available: http://www.w3.org/Submission/CBD/.
W3C, "OWL 2 Web Ontology Language - 9.5 Keys," [Online]. Available: http://www.w3.org/TR/owl2-syntax/#Keys.