3.1 Metadata Classification

The ABS does not have a formal "taxonomy" of metadata. One was proposed early in the development of the 2003 metadata strategy, but it was not included in the final document. Discussions about how to "class" particular instances of metadata (in borderline cases rather than all cases) were found to become very protracted without generating any real value.

In general the ABS concurs with the findings of Bo Sundgren in regard to Classification of Statistical Metadata, namely that multiple valid approaches exist and that the optimum depends on why classification is being attempted.

One form of categorisation sometimes used within the ABS relates to the purpose/use of metadata. Under this approach a particular "piece" of metadata may (and often should) support more than one type of use. The categories are:

  1. (Search and) Discovery - Help users find data (or a metadata object in its own right, such as a classification) of relevance to their needs and interests
  2. Definition - Help users understand data (or a metadata object in its own right, such as the definition of a data element)
  3. Quality - Help users assess the fitness of associated data for their specific purpose
  4. Process - Apply metadata to run processes, such as using a classification to drive an aggregation process or to provide a list of valid encoding values for editing purposes. It also includes defining other parameters that drive a process as metadata, such as the choice of which imputation method to use for which data element.
  5. Operational - Metrics on the results of running processes, such as edit rates, imputation rates etc. These can feed into internal decisions on managing and improving survey processes and into external "quality" decisions. This metadata is sometimes termed "paradata".
  6. System - Low level information about files, servers etc that allows the physical IT environment to be updated without end user processes needing to be respecified.
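
As an illustration only, the purpose categories above and the idea that one "piece" of metadata can support several uses might be sketched as follows. All names here (Purpose, MetadataItem, the region classification and its codes) are hypothetical and are not drawn from any ABS system. The sketch also shows the Process use described in item 4: the same classification supplies the list of valid codes for editing and drives a simple aggregation.

```python
from dataclasses import dataclass
from enum import Flag, auto


class Purpose(Flag):
    """The six purpose/use categories listed above."""
    DISCOVERY = auto()
    DEFINITION = auto()
    QUALITY = auto()
    PROCESS = auto()
    OPERATIONAL = auto()
    SYSTEM = auto()


@dataclass
class MetadataItem:
    name: str            # hypothetical identifier
    purposes: Purpose    # one or more categories combined with |

    def supports(self, purpose: Purpose) -> bool:
        """True if this item is tagged with the given purpose."""
        return bool(self.purposes & purpose)


# Hypothetical classification tagged with three purposes at once.
region_classification = MetadataItem(
    name="Region classification (illustrative)",
    purposes=Purpose.DISCOVERY | Purpose.DEFINITION | Purpose.PROCESS,
)

# Invented codes: each detailed code maps to a broader group.
CODE_TO_GROUP = {"A1": "A", "A2": "A", "B1": "B"}


def aggregate(records, code_to_group):
    """Sum values by group; reject codes not in the classification."""
    totals = {}
    for code, value in records:
        if code not in code_to_group:
            # The classification doubles as the list of valid codes.
            raise ValueError(f"invalid code: {code}")
        group = code_to_group[code]
        totals[group] = totals.get(group, 0) + value
    return totals


totals = aggregate([("A1", 10), ("A2", 5), ("B1", 7)], CODE_TO_GROUP)
```

A flag-style enumeration is used here because the categories are not mutually exclusive; a plain set of category names would serve equally well.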

The ABS also recognises "objects" in regard to which metadata can be assembled and registered. These include:

  • high level end to end statistical activities ("collections")
  • individual datasets
  • data elements
  • classifications
  • individual processes
  • terms
  • questions
  • question modules
  • collection instruments

These "objects" can be further broken down (eg data elements into properties, object classes, value domains etc).
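
A minimal sketch of such a breakdown, assuming a data element is described by an object class, a property and a value domain, is given below. The class names, field names and codes are invented for illustration and do not represent an ABS or GSIM structure.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueDomain:
    """Hypothetical value domain: a named set of permitted codes."""
    name: str
    permitted_values: frozenset

    def is_valid(self, value: str) -> bool:
        return value in self.permitted_values


@dataclass(frozen=True)
class DataElement:
    """Hypothetical data element built from its component parts."""
    object_class: str        # the kind of thing described, eg "Person"
    property_name: str       # the characteristic being measured
    value_domain: ValueDomain

    def label(self) -> str:
        return f"{self.object_class} - {self.property_name}"


# Invented codes, for illustration only.
labour_status = ValueDomain(
    name="Labour force status codes (illustrative)",
    permitted_values=frozenset({"E", "U", "N"}),
)
element = DataElement("Person", "Labour force status", labour_status)
```

Registering the components separately means the same value domain or object class can be reused across several data elements.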

The main way forward from the ABS perspective at this time is to work toward GSIM (Generic Statistical Information Model). Among other things, this should provide a reference classification (or taxonomy) of "information objects" (including "metadata objects") that is shared beyond the ABS.

Work on the Metadata Census within the ABS is also providing a "bottom up" approach to classifying/grouping "information objects" based on the requirements of existing systems and processes (including seeking to harmonise the sets of requirements and align them with constructs described within SDMX and DDI). As described in Section 2.2, this work (together with GSIM) will feed into classing the objects supported by the MRR and will also provide a use based checklist for testing the GSIM "taxonomy".

3.2 Metadata used/created at each phase

The ABS actively aspires to manage metadata consistently throughout the statistical business process. The MRR to be delivered under IMT will have a crucial role in realising this objective in the medium and longer term.

As documented in the supporting page for Section 2.2, consistent metadata related to the identification of collections and cycles and the definition of classifications is already used widely (although not universally) throughout the statistical business process.

As the ABS does not yet have a definitive taxonomy of "statistical information objects" (including "metadata objects"), it does not yet have a definitive mapping of metadata used/created at each phase of the statistical business process. Nevertheless, indicative representations were created in 2006 and 2010.

The 2006 representation predates the ABS adoption of the GSBPM. The most legible version of the diagram is an earlier draft rather than the final version that was ultimately included in a briefing paper for senior management. The source file for the final version has not yet been located, and the reproduction in the briefing paper itself is hard to read.

The 2010 working document has been loaded to the METIS wiki.

3.3 Metadata relevant to other business processes

IMT has a focus on statistical information management rather than, eg, financial and human resource information within the ABS.

That said, information about costs, organisational structures, and people and their roles are examples of information that can be relevant to managing and performing statistical business processes. More generally, Statistical Information, Corporate Information and Business Information can be visualised as intersecting circles, and the intersections with Statistical Information are certainly in scope for IMT.

Also, as long as it does not distract from the focus on the statistical information used and/or produced in the course of various sub processes within a statistical business process, there is no reason why elements of the Statistical Information Management Framework associated with IMT could not be applied to these other information domains.

A further connection is that a range of high level key performance indicators relating to the core business of the ABS are expected to be sourced via the MRR in future. These will assist in high level corporate monitoring and reporting, on a more consistent and informed basis, of efficiency, productivity, return on investments etc.

One reason IMT does not have a primary focus on corporate information and business information is that well recognised standards and frameworks (not specific to producers of official statistics) already exist for these domains of information. A more natural alignment in this case might be with Australian Government Architecture rather than the common reference architecture for producers of official statistics internationally.

These other domains of information are recognised within ABS Enterprise Architecture (eg in terms of data/information architecture). The redevelopment of information systems related to human resource management is an example of an "architecturally significant" project in this regard.

In focusing on statistical information, the IMT approach parallels the 2003 metadata strategy, which defined its scope as relating to "statistical" metadata (rather than all the metadata potentially relevant to any aspect of ABS operations).

