3.1 Metadata Classification

25. Metadata are classified according to their usage and their role in the statistical production process.

The main types of metadata according to these criteria are as follows:

  • Definitional metadata - Definitional metadata act as identifiers and descriptors of the data. They exist prior to the data, are created and maintained independently of the data, and are used to define the data structure. Examples are country names and codes, currency names and codes and their relation to the countries, definitions of the indicators, and classifications such as ISIC Rev. 2 and ISIC Rev. 3. These core data also define some basic metadata elements such as metadata classes, stages, sources and methods. Historically, this metadata type was the first to be established in ISDE (ported from the Mainframe, re-factored and formalized). Definitional metadata are maintained by the statistical staff using the Nomenclature Explorer (NE) tool, in strict accordance with user authorisation and ownership.
  • Implicit metadata - Implicit metadata are a special class of metadata that arise from the specific usage of other metadata. A typical example is the ISIC combinations: several industry categories can be combined and reported together by a given country for a given indicator and set of years. In the questionnaire returned by the NSOs, such a combination is expressed in the following way (see Figure 6):

The codes 1511, 1512 and 1513 are combined and reported as a single number, '1234'. The combined industries are linked by the footnote a/. The system resolves this with a dummy ISIC code, 1511A, defined as "1511 includes 1512 and 1513", which is used throughout the production process and appears accordingly in the publications as well as in the pre-filled Questionnaire. Other country-specific classification discrepancies can be resolved in a similar way, such as industry codes at the 3-digit level that exclude one or more specific 4-digit industry codes. Implicit metadata can also be used to define synonyms: for example, '040' is the country code of Austria, which is the same as (i.e. substituted by) the ISO code 'AUT'. They can also specify aggregations, e.g. the aggregation code 'EU' is composed of the codes of the individual countries. The keywords substitute, included and excluded, as used in this context, are called operators.
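
The operators described above can be illustrated with a minimal Python sketch. It is not the ISDE implementation: the class ImplicitCode, the function covered_codes, the dummy code "151A" and the children lookup table are all hypothetical names introduced here for illustration only.

```python
from dataclasses import dataclass

# Spoken form of each operator named in the text.
VERBS = {"included": "includes", "excluded": "excludes",
         "substitute": "is substituted by"}


@dataclass(frozen=True)
class ImplicitCode:
    """Hypothetical record for one implicit metadata item."""
    code: str       # dummy code used in production, e.g. "1511A"
    base: str       # code the operator is applied to
    operator: str   # "included", "excluded" or "substitute"
    members: tuple  # codes named by the operator

    def describe(self) -> str:
        """Render the definition, e.g. '1511 includes 1512 and 1513'."""
        return f"{self.base} {VERBS[self.operator]} {' and '.join(self.members)}"


def covered_codes(item, children=None):
    """Return the set of original codes that an implicit code stands for."""
    if item.operator == "included":
        # The base code absorbs the values of the listed codes.
        return {item.base, *item.members}
    if item.operator == "excluded":
        # A higher-level code minus some of its sub-codes; the hierarchy
        # is passed in as an assumed lookup table.
        return set(children[item.base]) - set(item.members)
    if item.operator == "substitute":
        # A synonym: the base code is replaced by the listed code.
        return set(item.members)
    raise ValueError(f"unknown operator: {item.operator}")


# The combination from the text: 1511, 1512 and 1513 reported together.
combo = ImplicitCode("1511A", "1511", "included", ("1512", "1513"))

# Illustrative 3-digit code that excludes one 4-digit code.
hierarchy = {"151": ["1511", "1512", "1513", "1514"]}
partial = ImplicitCode("151A", "151", "excluded", ("1514",))

print(combo.describe())
print(sorted(covered_codes(partial, hierarchy)))
```

Keeping the operator as data, rather than hard-coding each combination, lets the same record drive both the footnote text and the set of codes that a reported value covers.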

  • Operational metadata - Operational metadata are generated by the data transformation process and attributed to the respective data items. As described in the presentation of the Data Transformation phase, each data item is stored in the database with a stage indicator reflecting its credibility. The transformation process also generates "Source" and "Methods" metadata, which describe the source of the data item and the methods applied to generate it.
  • System metadata - These metadata are used to drive automated processing throughout the phases of the life cycle. Examples are layout definitions for the yearbook (for each country and each edition of the yearbook) and country lists used in the automatic generation of the PDF output, as well as installation and packaging lists, directories and templates for creating the CD product. These metadata are specific to the application in which they are used and do not relate to the data. Therefore, although stored in the centralized repository, they are maintained by each application separately and are called "Properties" of the respective process, e.g. Yearbook properties, Questionnaire properties, etc.
  • Descriptive and Methodological metadata - These form the bulk of the metadata. They are received from the primary data reporters through the UNIDO Questionnaire and are then processed further together with the data. During this processing, the UNIDO statistical staff can add further metadata. Descriptive or methodological metadata can be attached at any level, ranging from the complete data set down to individual data items. This is done by assigning to the metadata the same dimensions as those of the data.
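
Attaching metadata through shared dimensions can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual repository design: the dimension names (country, indicator, isic, year), the indicator name "OUTPUT", the note texts and the functions applies_to and notes_for are all hypothetical. A dimension that a note leaves unset is treated as "applies to every value".

```python
# Assumed dimensions of a data item; the real set is not given in the text.
DIMENSIONS = ("country", "indicator", "isic", "year")


def applies_to(note_key, item_key):
    """A note applies when every dimension it specifies matches the item."""
    return all(
        note_key.get(dim) is None or note_key[dim] == item_key[dim]
        for dim in DIMENSIONS
    )


def notes_for(item_key, notes):
    """Collect the texts of all notes attached at any level above the item."""
    return [text for key, text in notes if applies_to(key, item_key)]


# Notes attached at three different levels (illustrative content only).
notes = [
    ({}, "note on the complete data set"),
    ({"country": "040"}, "note on all data for one country"),
    ({"country": "040", "indicator": "OUTPUT", "isic": "1511", "year": 2005},
     "note on a single data item"),
]

item = {"country": "040", "indicator": "OUTPUT", "isic": "1511", "year": 2005}
other = {"country": "040", "indicator": "OUTPUT", "isic": "1512", "year": 2005}

print(notes_for(item, notes))   # all three notes apply
print(notes_for(other, notes))  # only the data set and country notes apply
```

Because the metadata key uses the same dimensions as the data, the level of attachment follows from how many dimensions are filled in, and no separate "level" field is needed.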

3.2 Metadata used/created at each phase

