4.1 Metadata system(s)
Datadok - File descriptions (implemented)
We document all permanent archive data files in our file documentation database, Datadok. The database was built in 1998, but its use was not mandatory until 2002.
Vardok - Variables documentation system (implemented)
The overall purpose of the variables documentation system is to document variables in a central location, accessible by all, and to function as a tool for harmonising names and definitions.
There is a two-way link between Vardok and Datadok (file descriptions database), a one-way link from Vardok to Stabas (standard classifications database), a two-way link between Vardok and StatBank (dissemination database), a two-way link between Vardok and Metadb (system for documentation of event history data) and a one-way link from About the statistics, About the data collections and the statistical metadata portal to Vardok, via web services.
2006 was the last year in the development phase for the Vardok project.
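The links listed above can be pictured as a small directed graph. The following is only an illustrative sketch: the system names come from the text, while the data structure and the helper names (`build_links`, `links_from`, `links_to`) are assumptions made for this example and do not describe how the links are actually implemented.

```python
# Illustrative model of the links described above. System names are taken
# from the text; the structure and function names are hypothetical.
TWO_WAY = [
    ("Vardok", "Datadok"),
    ("Vardok", "StatBank"),
    ("Vardok", "Metadb"),
]
ONE_WAY = [
    ("Vardok", "Stabas"),
    ("About the statistics", "Vardok"),
    ("About the data collections", "Vardok"),
    ("Metadata portal", "Vardok"),
]


def build_links(two_way, one_way):
    """Return a dict mapping each system to the set of systems it links to."""
    links = {}
    for a, b in two_way:
        # A two-way link is stored as one directed link in each direction.
        links.setdefault(a, set()).add(b)
        links.setdefault(b, set()).add(a)
    for src, dst in one_way:
        links.setdefault(src, set()).add(dst)
        links.setdefault(dst, set())
    return links


LINKS = build_links(TWO_WAY, ONE_WAY)


def links_from(system):
    """Systems that the given system links to, in alphabetical order."""
    return sorted(LINKS.get(system, set()))


def links_to(system):
    """Systems that link to the given system, in alphabetical order."""
    return sorted(src for src, dsts in LINKS.items() if system in dsts)
```

In this sketch, `links_from("Stabas")` is empty because the link from Vardok to Stabas is one-way, whereas `links_to("Vardok")` includes the three systems that reach Vardok via web services.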
Stabas - Standard classifications database (implemented, but an upgrade is planned from 2015)
The overall aim of Stabas is:
• To make work with and the use of standards simpler and more efficient
• To ensure systematic use of standards across different statistical areas
One main task is to make approved versions of the central statistical classifications available in a database system from which they can be extracted at different aggregation levels, together with texts in different languages and relevant documentation, and exported to other IT tools.
2004 was the last year in the development phase for the Stabas project.
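As a rough illustration of what extracting a classification at a given aggregation level, in a given language, and exporting it could look like, consider the sketch below. The codes, texts, field layout and function names are all hypothetical and are not taken from Stabas; CSV is used only as an example of an export format.

```python
import csv
import io

# Hypothetical classification items: (code, aggregation level, texts by language).
ITEMS = [
    ("A", 1, {"en": "Group A", "no": "Gruppe A"}),
    ("A1", 2, {"en": "Subgroup A1", "no": "Undergruppe A1"}),
    ("A2", 2, {"en": "Subgroup A2", "no": "Undergruppe A2"}),
    ("B", 1, {"en": "Group B", "no": "Gruppe B"}),
    ("B1", 2, {"en": "Subgroup B1", "no": "Undergruppe B1"}),
]


def extract(items, level, language):
    """Return (code, text) pairs for one aggregation level and one language."""
    return [(code, texts[language]) for code, lvl, texts in items if lvl == level]


def to_csv(rows):
    """Serialise (code, text) pairs as CSV so other tools can read them."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["code", "name"])
    writer.writerows(rows)
    return buffer.getvalue()
```

For example, `to_csv(extract(ITEMS, 1, "en"))` would give the top level of the hypothetical classification with English texts.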
Service library for metadata systems (implemented)
The purpose of this project was to:
• Create a library of services for the master systems Vardok, Datadok, Metadb and Stabas.
• Define a framework for the description and formulation of SSB's metadata based on international metadata models (e.g. Neuchâtel) and standards (e.g. ISO/IEC 11179).
The project began in 2005 and ended in 2008.
Metadata portal (implemented)
The overall purpose of the metadata web page is to make Statistics Norway's metadata systems more accessible and easier to use. Displaying the contents of these systems on a common web page gives both internal and external users easier access to the metadata. The project began in 2005 and ended in 2009.
Metadata portal: http://www.ssb.no/english/metadata/
Metadb - metadatabase for event history data (implemented)
Metadata for FD-Trygd (Social security database) and NUDB (Norwegian national Education Database).
FD-Trygd: details on demography, social conditions, social security, employment, search for employment, government employees, income and wealth. Data from 1992 to the present. Continuous regulatory and technical changes.
NUDB: All individually based statistics on education from completed lower secondary education to tertiary education from 1970 to the present.
System for questionnaires
Systems exist but are being replaced.
Administrative system for projects, products and processes (implemented)
This administrative system can be used to generate reports that combine man-hours and other administrative information. It includes important information on all products in Statistics Norway, such as financing, response burden, responsible division and person, response rates, frequency, laws, EEA requirements, subject field etc. This system contains both metadata and data.
About the data collections (implemented)
Researchers frequently use data collections from Statistics Norway for their research. However, the process from finding out what you need to actually getting the data may be long and troublesome, especially for inexperienced researchers. Statistics Norway has therefore (with support from the Research Council of Norway) developed a website to make information about this process more easily available. Among other things, this page provides users with documentation of several data collections. Each data collection has a general description (e.g. of data quality) and a list of relevant variables, including variable documentation from Vardok. A new system is being scoped, hopefully with even more automated solutions.
About the statistics (implemented)
About the statistics is metadata that describes each set of statistics published by Statistics Norway. It contains administrative information and information about statistics production, variables, concepts, sources of error and uncertainty, comparability, coherence and availability. About the statistics now runs on a CMS (content management system) platform, which makes it possible to link About the statistics to Vardok and Stabas.
StatBank - dissemination database (implemented)
StatBank Norway is a service where users can select the scope and content of each table and then export the result in various formats to their own PC. This system contains both metadata and data.
4.2 Costs and Benefits
Examples of costs:
A total of 1420 man-hours have been used in preparing the metadata strategy with ca. 35% of resources from IT.
A total of 12690 man-hours have been used in development with ca. 70% of resources from IT. A total of 476 man-hours from standards were used in 2007 for continued harmonisation of names and definitions, and training of personnel in the six new divisions. 294 IT man-hours were used in 2007 for maintenance and minor changes to the system.
A total of 7200 man-hours were used in development in 2002-2004, with ca. 75% of resources from IT. However, these man-hours do not include the development performed by Statistics Denmark on the editing application. A rough guess for this would be 2500 man-hours. The system required approximately 1000 man-hours in production each year from 2005 to 2007, with ca. 70% from IT. We are now planning a new version of the editing application that we hope will be more flexible and less costly in production.
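The percentages quoted above can be turned into approximate IT man-hours with simple arithmetic. The sketch below uses only the figures given in the text. The variable names are neutral placeholders (an assumption), since the text does not name the system behind each figure, and the results are approximate because the percentages themselves are given as "ca.".

```python
def it_share(total_hours, it_percent):
    """Approximate IT man-hours from a total and an approximate IT percentage."""
    return round(total_hours * it_percent / 100)


# Metadata strategy: 1420 man-hours, ca. 35% from IT.
strategy_it = it_share(1420, 35)

# First development figure: 12690 man-hours, ca. 70% from IT.
first_dev_it = it_share(12690, 70)

# Second development figure (2002-2004): 7200 man-hours, ca. 75% from IT.
second_dev_it = it_share(7200, 75)

# Second figure plus the rough guess of 2500 man-hours for the work
# performed by Statistics Denmark on the editing application.
second_dev_total = 7200 + 2500

# Production: approximately 1000 man-hours per year for 2005, 2006 and 2007,
# ca. 70% from IT.
production_total = 1000 * 3
production_it = it_share(production_total, 70)
```

Under these assumptions, the IT shares come to roughly 497, 8883 and 5400 man-hours respectively. The second development effort totals about 9700 man-hours once the Statistics Denmark estimate is included, and production in 2005 to 2007 used about 3000 man-hours, of which roughly 2100 came from IT.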
Metadata portal (man-hours used):
Statistics Norway shall have easy access to data sources
IT shall develop better and more integrated metadata systems so that data can be collected and reused across different sources and collection channels to a greater extent.
Statistics Norway shall be an effective and knowledge-based organization
IT shall contribute to the development of common solutions that ensure standardisation and automation of work processes, use the best statistical methods, create consistent quality indicators, and link metadata and data in the production of statistics in order to ensure good storage and reuse of data.
Data and metadata
Good descriptions of Statistics Norway’s data, methods and processes are fundamental to the understanding of statistics and reuse of data. These metadata shall be systematically stored during the production of statistics, and be well integrated with Statistics Norway’s data and statistical products. Further development of an effective and comprehensive system that ensures this shall be prioritised. Statistics Norway’s description of data shall be based on national and international standards and models. This will make it easier for Statistics Norway to apply solutions developed by other producers of statistics or generally available software based on the same standards. Statistics Norway’s statistical definitions and classifications shall be easily available for external use. Further development of the metadata systems will help Statistics Norway to disseminate open data with adequate documentation for reuse. Good metadata systems are also necessary for Statistics Norway’s data archive to be harmonised and structured. This in turn will help Statistics Norway to effectively provide data for research and analysis.
4.3 Implementation strategy
All our metadata projects are based on a step-wise approach.