4.1 Metadata system(s)
The SMS architecture is modular. It is composed of relatively self-contained, mutually interlinked subsystems, as presented in the diagram below.
Statistical Classification (CLASS) – maintenance and update of statistical classifications/code‑lists; the module has been in full operation. It contains about 1000 active code-lists and all international statistical classifications.
Statistical Variables (VAR) - maintenance and update of the catalogue of statistical variables.
The description of VAR is based on the metadata model used for VAR in all stages of SBP; the module is in full operation. It comprises descriptions of over 4000 variables.
Statistical Tasks (ST) - maintenance of metadata related to the design and processing of ST (basic characteristics, statistical questionnaires, statistical surveys, other input data, decree on annual programme of statistical surveys, data validation, definition of statistical samples, imputation methods, quality requirements, aggregations, specification of users, time-tables for data collection, applied code-lists, legislation, provider of ICT services, specification of ICT services, etc.); the module is in semi-production operation. At present it contains descriptions of building blocks of statistical questionnaires, validation rules for selected tasks of economic statistics, and validation rules, automated corrections, derivations and transformations of variables for the 2011 population census, etc.
Statistical Quality (QUALITY) - maintenance and update of qualitative characteristics and methods for statistical data assessment; the module has been implemented under the umbrella of the RSIS Project.
Statistical Time Series (T-SERIES) - maintenance and update of metadata on current statistical time series; the module has been implemented under the umbrella of the RSIS Project.
Dissemination (DISSEM) - maintenance and update of metadata linked to dissemination of statistical information (statistical publications, electronic outputs, web site, data security, etc.); the module was first designed and implemented for the formal specification of population census outputs (the first phase of implementation). Full implementation of the module has been carried out under the umbrella of the RSIS Project.
Respondents (RESP) - maintenance and update of metadata on respondents (respondent burden, respondent opinions, reporting duty, links to statistical surveys, etc.); this module has been partially implemented, mainly for the registration of respondents and for monitoring progress in the data collection of economic statistics questionnaires.
Users (USERS) - maintenance and update of metadata on the SIS external users (users’ opinions, FAQ, etc.); the module has been implemented under the FB DISSEMINATION as the Register of Users.
Data Fund (D-FUND) - maintenance and update of metadata on the contents and structure of data files included in SIS. A data warehouse (DWH) has been implemented as the central storage of approved micro and aggregated data; the metadata on data stored in the DWH are an integral part of the DWH application.
iSMS - Internet presentation of SMS; a new application for the presentation of statistical classifications and statistical variables has been developed for external users. It is intended to extend this application to present selected parts of the TASKS module.
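The interlinking of modules described above can be illustrated with a minimal sketch. It assumes, purely for illustration, that a validation rule held in the ST module refers to a variable catalogued in VAR, which in turn refers to a code-list maintained in CLASS. All class names, field names and codes below are hypothetical and are not taken from the actual SMS data model.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CodeList:
    """Hypothetical stand-in for a code-list kept in the CLASS module."""
    code: str
    items: frozenset


@dataclass(frozen=True)
class Variable:
    """Hypothetical stand-in for a catalogue entry in the VAR module."""
    code: str
    name: str
    code_list: Optional[CodeList] = None  # link VAR -> CLASS


@dataclass(frozen=True)
class ValidationRule:
    """Hypothetical stand-in for a validation rule described in the ST module."""
    task: str
    variable: Variable  # link ST -> VAR

    def check(self, record: dict) -> bool:
        # A record passes if the variable is present and, when the variable
        # is coded, its value belongs to the linked code-list.
        value = record.get(self.variable.code)
        if value is None:
            return False
        if self.variable.code_list is None:
            return True
        return value in self.variable.code_list.items


# Illustrative metadata only; the codes are invented for this sketch.
sex_codes = CodeList("SEX", frozenset({"1", "2"}))
var_sex = Variable("V_SEX", "Sex of person", sex_codes)
rule = ValidationRule("CENSUS", var_sex)
```

Because the rule only points to the variable, and the variable only points to the code-list, a change to the code-list is made once in CLASS and is picked up by every rule that depends on it.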
SMS is interlinked with the system of Statistical Registers. The main registers in this system are the following:
- Business Register,
- Register of Census Districts and Buildings, and
- Population Register.
Core principles for SMS implementation
- unified internal users' interface (search, update, administration),
- unified external users' interface (navigation, selection, interpretation),
- unified data interfaces between SMS subsystems,
- preserving history of SMS objects,
- update of metadata elements in one place only,
- single authoritative source (registration authority) for each metadata element,
- registration process associated with each metadata element, so that there is a clear identification of ownership, approval status, date of operation, etc.,
- reuse of metadata where possible for statistical integration as well as efficiency reasons,
- unique storage and update of metadata,
- unified user documentation,
- unified technical documentation,
- standard data protection model,
- consistency of metadata inside the SMS subsystem and between subsystems,
- unified technological tools for implementation.
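Several of these principles (preserving history, a single registration authority per element, and a registration process recording ownership, approval status and date of operation) can be sketched as a small in-memory registry. This is only an illustration under stated assumptions: the names `Registry`, `Version`, `register`, `current` and `history`, as well as the status values, are hypothetical and do not describe the actual SMS implementation.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Version:
    """One registered state of a metadata element (hypothetical structure)."""
    definition: str
    owner: str        # registration authority responsible for the element
    status: str       # assumed values, e.g. "draft" or "approved"
    valid_from: date  # date of operation


@dataclass
class Registry:
    """Keeps every version of every element; nothing is overwritten."""
    _items: Dict[str, List[Version]] = field(default_factory=dict)

    def register(self, element_id: str, definition: str, owner: str,
                 status: str, valid_from: date) -> None:
        versions = self._items.setdefault(element_id, [])
        # Single authoritative source: only the original owner may update.
        if versions and versions[0].owner != owner:
            raise PermissionError(
                f"{element_id} is owned by {versions[0].owner}")
        versions.append(Version(definition, owner, status, valid_from))

    def current(self, element_id: str) -> Version:
        # The most recently registered version is the current one.
        return self._items[element_id][-1]

    def history(self, element_id: str) -> Tuple[Version, ...]:
        # Earlier versions remain available (history is preserved).
        return tuple(self._items[element_id])


# Illustrative usage with invented identifiers and dates.
reg = Registry()
reg.register("VAR_0001", "Number of employees", "Methodology",
             "draft", date(2009, 1, 1))
reg.register("VAR_0001", "Average number of employees", "Methodology",
             "approved", date(2010, 1, 1))
```

Appending versions instead of updating in place is one simple way to satisfy both "update in one place only" and "preserving history" at the same time.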
Steps in implementation of SMS subsystems and responsibility for them
- business system options (BSO) by CZSO,
- technical system options (TSO) by external supplier,
- programming by external supplier,
- testing by CZSO and external supplier,
- pilot processing using selected ST by CZSO and external supplier,
- operational running by CZSO.
4.2 Costs and Benefits
a) SMS financing
Principles of SMS financing:
- BSO are prepared by the CZSO and financed from the CZSO budget.
- TSO are prepared by external suppliers and financed partly from the CZSO budget and partly from resources provided by the EU (Transition Facility programmes and Integrated Operational Programme).
b) SMS benefits
- interlinking of statistical data and metadata from the beginning to the end of SBP allows unified and clear data interpretation,
- strengthening the role of methodology throughout SBP,
- systematic data quality assessment,
- upgrading of data dissemination and interpretation to users,
- integration with other ISs of public administration,
- integration with ISs of international organisations (Eurostat, OECD, UN, IMF, etc.),
- tool for defining phases of SBP,
- tool for management of ST processing.
The current progress of the project makes it obvious that SMS strengthens the role of methodology in defining the content, size and coordination of statistical surveys.
The introduction of project management and organisation of work during SMS implementation increased the CZSO research potential without any increase in staff. More staff from different professional groups (management, methodologists, statisticians, IT specialists) became involved. Training courses improved the knowledge of SMS subsystems in all professional groups. Communication barriers were reduced between the departments involved in SBP (subject-matter departments, methodology, IT).
4.3 Implementation strategy
The SMS implementation is, in fact, a 'big-bang' approach. At this time, only statistical classifications are maintained and updated electronically. Statistical tasks are defined without using metadata. Current application processing tools use one identification of statistical variables for central processing and a different meta-identification for variables stored in the output database.
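The problem of two identification schemes can be made concrete with a small sketch. It assumes, hypothetically, that a single catalogue entry records both the identifier used in central processing and the one used in the output database, so that either resolves to the same variable. The class name, method names and all identifiers below are invented for illustration.

```python
from typing import Dict, Optional


class VariableCatalogue:
    """Hypothetical catalogue mapping legacy identifiers to one variable code."""

    def __init__(self) -> None:
        self._by_processing: Dict[str, str] = {}
        self._by_output: Dict[str, str] = {}

    def add(self, var_code: str, processing_id: str, output_id: str) -> None:
        # Both legacy identifiers point to the same catalogue code.
        self._by_processing[processing_id] = var_code
        self._by_output[output_id] = var_code

    def resolve(self, identifier: str) -> Optional[str]:
        # Try the central-processing scheme first, then the output scheme.
        if identifier in self._by_processing:
            return self._by_processing[identifier]
        return self._by_output.get(identifier)


# Invented identifiers, for illustration only.
catalogue = VariableCatalogue()
catalogue.add("VAR_0001", processing_id="P123", output_id="OUT_A")
```

With such a mapping, processing and dissemination tools could keep their existing identifiers during a transition period while sharing one description of each variable.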
Introduction of the SMS into practice implies a change in the process of preparing and designing STs by statistical departments. These activities will rely on work with metadata and hence on using SMS tools. A prerequisite for the use of SMS functions is availability of an updated metadata base. What has to be done further is to bring into being all functions and organisational measures related to metadata administration. Adequate training of all participating actors should precede the SMS implementation.
The main condition for the introduction of SMS into SIS operational running is its functionality in all stages of SBP. An effective and viable interlinking of SMS subsystems interpreted in a unified metadata base is a necessary precondition for that. This requirement determines the priorities in the design and implementation strategy of SMS subsystems.
In view of the project comprehensiveness and complexity, SMS should be developed step by step. The step-wise approach, however, has a clearly defined framework.
The first stage of the SMS introduction into practice (2008-2009)
Subsystems CLASS, VAR, ST and QUALITY have been tested on the Annual Labour Costs Survey.
The aim of the pilot project
The functionality of SMS is to be tested, namely for the following activities:
- definition of ST,
- design of statistical questionnaires,
- data validation (logical control specification),
- design of samples (response duty specifications),
- aggregate specifications,
- output specifications,
- preparation of timetables,
- specification of quality attributes of a statistical task.
The pilot project has the following prerequisites:
- to complete a database of statistical classifications (SMS-CLASS),
- to methodologically unify the content of the statistical survey(s) for the pilot project,
- to complete a description of statistical variables relevant to the pilot and to ensure their storage in the database (SMS-VAR),
- to create a database for definition of statistical tasks (SMS-ST),
- to develop and test an SMS application program package,
- to develop and make operational a statistical data warehouse,
- to establish and make operational an SMS administration,
- to accomplish training of personnel for all professions needed for the pilot project (methodology, subject-matter departments, SMS administration, project preparation, IT applications).
Building up and loading an SMS database has been an entirely new task for the CZSO. In the newly established SMS-CLASS database, the links to the existing (old) electronic system of statistical classifications should be maintained until a complete transition of statistical tasks into the new SIS is accomplished.
The second stage of the SMS introduction into practice (from 2010 to 2012)
During the period 2010-2012, statisticians in cooperation with methodologists elaborated descriptions of about 10 thousand variables (VAR module) and 120 statistical tasks (TASKS module) for the 2012 and 2013 survey years. This metadata was prepared for the second stage of the Project and used in practice for testing purposes in the Redesign of the SIS (RSIS) Project in 2012-2014.
The third stage of the SMS introduction into practice (from 2012 to 2015)
The third stage has focused on the design, development, implementation and step-by-step introduction into practice of SMS subsystems for monitoring of quality, time series, dissemination, respondents and users of statistical information. This stage comprised new SMS modules and the design, programming and implementation of application tools for the main phases of the SBP (see the picture and explanation in the attachment).
The public tender for the implementation of the RSIS Project was opened in autumn 2012, the contract was signed in January 2013, and the work started by the end of January 2013 and finished in September 2014. During this period, new SMS modules and new applications for data collection, central processing and dissemination of statistical data and information were designed and developed. The year 2015 has been the year of introducing these new applications into everyday statistical practice.