4.1 Metadata system(s)
The SMS architecture is modular. It is composed of relatively self-contained, mutually interlinked subsystems, as presented in the diagram below.
Statistical Classification (CLASS) - maintenance and update of statistical classifications/code-lists; the module is in full operation. It contains about 1000 active code-lists and all international statistical classifications.
Statistical Variables (VAR) - maintenance and update of the catalogue of statistical variables. The description of variables is based on the metadata model used for VAR in all stages of SBP; the module is in full operation. It comprises descriptions of over 4000 variables.
Statistical Tasks (ST) - maintenance of metadata related to the design and processing of ST (basic characteristics, statistical questionnaires, statistical surveys, other input data, decree on annual programme of statistical surveys, data validation, definition of statistical samples, imputation methods, quality requirements, aggregations, specification of users, time-tables for data collection, applied code-lists, legislation, provider of ICT services, specification of ICT services, etc.); the module is in a semi-production run. At present it contains descriptions of the building blocks of statistical questionnaires, validation rules for selected tasks of economic statistics, and validation rules, automated corrections, derivations and transformations of variables for the 2011 population census, etc.
Statistical Quality (QUALITY) - maintenance and update of qualitative characteristics and methods for statistical data assessment; the module is in the design phase.
Statistical Time Series (T-SERIES) - maintenance and update of metadata on current statistical time series; there is an intention to launch a project for this module.
Dissemination (DISSEM) - maintenance and update of metadata linked to dissemination of statistical information (statistical publications, electronic outputs, web site, data security etc.); the module has been designed and implemented for formal specification of population census outputs (the first phase of implementation).
Respondents (RESP) - maintenance and update of metadata on respondents (respondent burden, respondent opinions, reporting duty, links to statistical surveys, etc.); this module has been partially implemented, mainly for the registration of respondents and for monitoring progress in the collection of economic statistics questionnaires.
Users (USERS) - maintenance and update of metadata on the SIS external users (users' opinions, FAQ, etc.); implementation of the module is still open.
Data Fund (D-FUND) - maintenance and update of metadata on the contents and structure of data files included in SIS. A data warehouse has been implemented as the central storage of approved micro and aggregated data; this implementation constitutes the first phase of the module.
iSMS - Internet presentation of SMS; a new application for presenting statistical classifications and statistical variables has been developed for external users. There is an intention to extend this application to present selected parts of the TASKS module.
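The module overview above can be summarised as a small registry. The following is only an illustrative sketch: the module codes, names and implementation states are taken from the descriptions above, while the `Status` categories and the `modules_with_status` helper are assumptions introduced here for illustration and are not part of the SMS itself.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    # Status categories are an assumed simplification of the text above.
    FULL_OPERATION = "full operation"
    SEMI_PRODUCTION = "semi-production run"
    PARTIAL = "partially implemented"
    FIRST_PHASE = "first phase implemented"
    DESIGN = "design phase"
    PLANNED = "planned or still open"


@dataclass(frozen=True)
class Module:
    code: str
    name: str
    status: Status


SMS_MODULES = [
    Module("CLASS", "Statistical Classification", Status.FULL_OPERATION),
    Module("VAR", "Statistical Variables", Status.FULL_OPERATION),
    Module("ST", "Statistical Tasks", Status.SEMI_PRODUCTION),
    Module("QUALITY", "Statistical Quality", Status.DESIGN),
    Module("T-SERIES", "Statistical Time Series", Status.PLANNED),
    Module("DISSEM", "Dissemination", Status.FIRST_PHASE),
    Module("RESP", "Respondents", Status.PARTIAL),
    Module("USERS", "Users", Status.PLANNED),
    Module("D-FUND", "Data Fund", Status.FIRST_PHASE),
]


def modules_with_status(status):
    """Return the codes of all modules in the given implementation state."""
    return [m.code for m in SMS_MODULES if m.status is status]


print(modules_with_status(Status.FULL_OPERATION))
```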
SMS is interlinked with the system of Statistical Registers. The main registers in this system are the following:
- Business Register,
- Register of Census Districts and Buildings, and
- Population Register.
Core principles for SMS implementation
- unified internal users' interface (search, update, administration),
- unified external users' interface (navigation, selection, interpretation),
- unified data interfaces between SMS subsystems,
- preserving history of SMS objects,
- update of metadata elements in one place only,
- single authoritative source (registration authority) for each metadata element,
- a registration process associated with each metadata element, so that ownership, approval status, date of operation, etc. are clearly identified,
- reuse of metadata where possible for statistical integration as well as efficiency reasons,
- unique storage and update of metadata,
- unified user documentation,
- unified technical documentation,
- standard data protection model,
- consistency of metadata inside the SMS subsystem and between subsystems,
- unified technological tools for implementation.
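Several of these principles (a single registration authority per element, a registration record with approval status and date of operation, and preservation of history) can be illustrated with a minimal sketch. The class names, field names, status values and sample data below are hypothetical and do not reflect the actual SMS data model.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Version:
    definition: str
    approval_status: str  # hypothetical values, e.g. "draft" or "approved"
    valid_from: date      # date of operation of this version


@dataclass
class MetadataElement:
    element_id: str
    registration_authority: str  # the single owner allowed to update
    versions: list = field(default_factory=list)

    def update(self, requested_by, definition, approval_status, valid_from):
        """Add a new version; earlier versions are kept, never overwritten."""
        if requested_by != self.registration_authority:
            raise PermissionError(
                f"{requested_by} is not the registration authority "
                f"for {self.element_id}"
            )
        self.versions.append(Version(definition, approval_status, valid_from))

    def as_of(self, day):
        """Return the version in operation on the given day, or None."""
        candidates = [v for v in self.versions if v.valid_from <= day]
        if not candidates:
            return None
        return max(candidates, key=lambda v: v.valid_from)


# Invented sample element and owner, for illustration only.
element = MetadataElement("VAR-0001", "Methodology")
element.update("Methodology", "Labour costs, first definition",
               "approved", date(2008, 1, 1))
element.update("Methodology", "Labour costs, revised definition",
               "approved", date(2010, 1, 1))

print(element.as_of(date(2009, 6, 30)).definition)
print(len(element.versions))
```

Because updates only append new versions, a query for any past date still returns the definition that was in operation at that time.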
Steps in implementation of SMS subsystems and responsibility for them
- business system options (BSO) by CZSO,
- technical system options (TSO) by external supplier,
- programming by external supplier,
- testing by CZSO and external supplier,
- pilot processing using selected ST by CZSO and external supplier,
- operational running by CZSO.
4.2 Costs and Benefits
a) SMS financing
Principles of SMS financing:
- BSO are prepared by the CZSO and financed from the CZSO budget.
- TSO are prepared by external suppliers and financed partly from the CZSO budget and partly from resources provided by the EU (Transition Facility programmes and Integrated Operational Programme).
b) SMS benefits
- interlinking of statistical data and metadata from the beginning to the end of SBP, which allows unified and clear data interpretation,
- strengthening the role of methodology throughout SBP,
- systematic data quality assessment,
- upgrading of data dissemination and interpretation to users,
- integration with other ISs of public administration,
- integration with ISs of international organisations (Eurostat, OECD, UN, IMF, etc.),
- tool for defining phases of SBP,
- tool for management of ST processing.
The current progress of the project makes it obvious that SMS strengthens the role of methodology in defining the content, size and coordination of statistical surveys.
The introduction of project management and organisation of work during SMS implementation increased the CZSO research potential without an increase in staff numbers. More staff from different professional groups (management, methodologists, statisticians, IT specialists) became involved. Training courses improved the knowledge of SMS subsystems in all professional groups. Communication barriers were reduced between the departments involved in SBP (subject-matter departments, methodology, IT).
4.3 Implementation strategy
The SMS implementation is, in fact, a 'big-bang' approach. At this time, only statistical classifications are maintained and updated electronically. Statistical tasks are defined without using metadata. Current processing applications identify statistical variables in one way for central processing and use a different meta-identification for variables stored in the output database.
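The problem of two unrelated identification schemes can be illustrated with a short sketch in which both local identifiers are resolved through a single catalogue identifier. All identifiers and names below are invented for illustration; the source does not describe the actual identifier formats or how SMS-VAR links them.

```python
# Hypothetical catalogue: one shared identifier per variable, with the
# local identifiers used by each existing system (all values invented).
VAR_CATALOGUE = {
    "VAR-0001": {"processing": "R101", "output": "AVG_WAGE"},
    "VAR-0002": {"processing": "R102", "output": "EMP_COUNT"},
}


def build_lookup(catalogue, system):
    """Map a system's local identifier back to the shared catalogue id."""
    lookup = {}
    for var_id, local_ids in catalogue.items():
        local_id = local_ids[system]
        if local_id in lookup:
            raise ValueError(
                f"{system} id {local_id!r} is used by two variables"
            )
        lookup[local_id] = var_id
    return lookup


PROCESSING_TO_VAR = build_lookup(VAR_CATALOGUE, "processing")
OUTPUT_TO_VAR = build_lookup(VAR_CATALOGUE, "output")


def same_variable(processing_id, output_id):
    """Check whether two local identifiers refer to the same variable."""
    var_id = PROCESSING_TO_VAR.get(processing_id)
    return var_id is not None and var_id == OUTPUT_TO_VAR.get(output_id)


print(same_variable("R101", "AVG_WAGE"))
print(same_variable("R101", "EMP_COUNT"))
```

Without such a shared key, matching a processed variable to its counterpart in the output database depends on knowledge held outside the systems.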
Introduction of the SMS into practice implies a change in the process of preparing and designing STs by statistical departments. These activities will rely on work with metadata and hence on using SMS tools. A prerequisite for the use of SMS functions is the availability of an up-to-date metadata base. What remains to be done is to put in place all functions and organisational measures related to metadata administration. Adequate training of all participating actors should precede the SMS implementation.
The main condition for introducing SMS into the operational running of SIS is its functionality in all stages of SBP. An effective and viable interlinking of SMS subsystems through a unified metadata base is a necessary precondition for that. This requirement sets the priorities for the design and implementation of SMS subsystems.
In view of the project's comprehensiveness and complexity, SMS should be developed step by step. The step-wise approach, however, has a clearly defined framework.
The first stage of introducing the SMS into practice (2008-2009)
Subsystems CLASS, VAR, ST and QUALITY have been tested on the Annual Labour Costs Survey.
The aim of the pilot project
The pilot project is to test the functionality of SMS, namely for the following activities:
- definition of ST,
- design of statistical questionnaires,
- data validation (logical control specification),
- design of samples (response duty specifications),
- aggregate specifications,
- output specifications,
- preparation of timetables,
- specification of quality attributes of a statistical task.
The pilot project has the following prerequisites:
- to complete a database of statistical classifications (SMS-CLASS),
- to unify methodologically the content of the statistical survey(s) for the pilot project,
- to complete a description of statistical variables relevant to the pilot and to ensure their storage in the database (SMS-VAR),
- to create a database for definition of statistical tasks (SMS-ST),
- to develop and test an SMS application program package,
- to develop and make operational a statistical data warehouse,
- to establish and make operational an SMS administration,
- to complete training of personnel in all professions needed for the pilot project (methodology, subject-matter departments, SMS administration, project preparation, IT applications).
Building and loading an SMS database has been an entirely new task for the CZSO. In the newly established SMS-CLASS database, the links to the existing (old) electronic system of statistical classifications should be maintained until the complete transition of statistical tasks into the new SIS is accomplished.
The second stage of introducing the SMS into practice (from 2010 on) will focus on the development, implementation and gradual introduction into practice of SMS subsystems for monitoring of quality, time series, dissemination, respondents and users of statistical information. The second stage will also comprise the completion of SMS-CLASS, VAR, TASKS and D-FUND, namely in terms of their links to the newly prepared SMS subsystems.