3.1 Metadata Classification
1) Statistical metadata.
Statistical metadata consist of:
- descriptions and definitions of statistical data and variables
- variable formulas and units of measurement
2) Statistical data quality.
Statistical data quality reports consist of:
- statistical method descriptions
- relevance of data
- validity, reliability and accuracy of data
The above elements of the report are evaluated using quality indicators that are based on international recommendations.
3) Metadata of statistical documents or products.
Document and product metadata consist of:
- publication information
- identifying information for the publications or products
- field or subject area glossary
4) Process metadata.
Process metadata are divided into technical and conceptual metadata:
a) technical metadata
- technical metadata guide the process of data production: data collection, data management and data dissemination. For instance, they make it possible to follow data production phase by phase. They also document the process.
b) conceptual process metadata
- conceptual process metadata consist of technical information about the data and variables that are used in producing data. For example, they can be minimum or maximum values, various calculation rules or the use of certain classification values.
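The kind of conceptual process metadata mentioned above (minimum and maximum values, permitted classification values) can be illustrated with a minimal sketch. It is not taken from any actual production system: the class `VariableRule`, the function `check_record`, and the example variables `age` and `region` with their limits and codes are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VariableRule:
    """Hypothetical conceptual process metadata for one variable."""
    name: str
    minimum: Optional[float] = None            # lowest accepted value, if any
    maximum: Optional[float] = None            # highest accepted value, if any
    allowed_codes: Optional[frozenset] = None  # permitted classification values


def check_record(record: dict, rules: list) -> list:
    """Return a list of rule violations found in one data record."""
    problems = []
    for rule in rules:
        value = record.get(rule.name)
        if value is None:
            problems.append(f"{rule.name}: missing value")
            continue
        if rule.allowed_codes is not None:
            # Classified variable: only check membership in the classification.
            if value not in rule.allowed_codes:
                problems.append(f"{rule.name}: code {value!r} not in classification")
            continue
        if rule.minimum is not None and value < rule.minimum:
            problems.append(f"{rule.name}: {value} below minimum {rule.minimum}")
        if rule.maximum is not None and value > rule.maximum:
            problems.append(f"{rule.name}: {value} above maximum {rule.maximum}")
    return problems


# Illustrative rules only; the limits and codes are assumptions.
RULES = [
    VariableRule("age", minimum=0, maximum=120),
    VariableRule("region", allowed_codes=frozenset({"01", "02", "03"})),
]

if __name__ == "__main__":
    for problem in check_record({"age": 134, "region": "07"}, RULES):
        print(problem)
```

Because the rules are held as data rather than written into the editing program, the same rule set could in principle be reused in several production phases and documented alongside the other metadata.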
3.2 Metadata used/created at each phase
Metadata tell the users of data what the statistics describe and how they were produced. Data descriptions, classifications and definitions of variables ensure that statistics are mutually comparable. Quality and methodological descriptions help the user to assess the reliability of the statistics and their suitability for various purposes.
At the same time, metadata are needed in order to organise the production of statistics efficiently. Timetables, data concerning the state of the data set and supplementary notes to the data set are necessary for interaction between those involved in statistics production. Process indicators, such as data concerning the effectiveness of editing, help to target measures where they are needed.
In the production processes, the same metadata have to be used again in different process stages. For example, variable descriptions needed in table headings can be produced by inheriting them from those descriptions that were made in data planning and that have possibly been updated in the course of the production process.
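The inheritance of variable descriptions into table headings could be sketched as follows. This is only an illustration under stated assumptions: the dictionaries `PLANNING` and `UPDATES`, the variable codes `inc` and `hh`, and both function names are hypothetical and do not describe an existing system.

```python
def current_description(variable: str, planning: dict, updates: dict) -> str:
    """Return the latest description of a variable.

    A description updated during production takes precedence over the
    one written in data planning.
    """
    return updates.get(variable, planning[variable])


def table_headings(columns: list, planning: dict, updates: dict) -> list:
    """Build table headings by inheriting the variable descriptions."""
    return [current_description(col, planning, updates) for col in columns]


# Descriptions written in the data planning phase (illustrative).
PLANNING = {"inc": "Income", "hh": "Household size"}
# Descriptions updated later in the production process (illustrative).
UPDATES = {"inc": "Disposable income, EUR per year"}

if __name__ == "__main__":
    print(table_headings(["inc", "hh"], PLANNING, UPDATES))
```

The point of the sketch is that the heading text is never retyped at the dissemination stage; it is looked up from the descriptions that already exist.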
Persons involved in the production of statistics at the practical level often do not even realise that they are working with metadata. In the course of work, metadata are processed in the same way as other information. Metadata exist, but they are not necessarily managed and described in the best possible way, so they cannot be exploited to the full.
In future development the importance of metadata and their application will be stressed in every phase of statistical processes. At present, there are information systems which use both statistical and process metadata in all phases of data production, but this is by no means standard practice. For example, annual census data are processed using metadata in every production phase from data collection to data dissemination. On the whole, metadata are used mostly in the data dissemination phase. The development of metadata systems has concentrated on that area, partly because of the wide use of the Internet as a data dissemination medium. In future years the main focus in metadata use will be on the middle phases of data production, namely data processing. This also places emphasis on the use of process metadata.
As stated above, the future development of metadata and metadata systems will be carried out by applying the CoSSI model, the metadata model developed by Statistics Finland. The metadata model consists of modules for different categories of metadata. The challenge in the near future will be to create a module for process metadata and to use the modules for document, statistical, classification and table metadata in new metadata tools.
3.3 Metadata relevant to other business processes
There are, of course, business processes which support budgeting and the follow-up of the yearly action plans of the statistical units, including the action plan of the statistical office itself.
In personnel administration there are various business processes for managing human resources and their optimal allocation to various projects. For instance, time allocation systems are in place which cover the allocation of working time to different tasks or projects and the monitoring of absences (sick leave, holidays, etc.).
These business systems include a large amount of metadata about projects and personnel. At present, they are not as well integrated into statistical process management as they could be, but that could be remedied in the future. For example, document-specific producer information could be obtained directly from the personnel data systems.