1.1 Metadata strategy
The common strategy of the CSB is available at: http://www.csb.gov.lv/csp/content/?cat=4417
In 1992 the Latvian government launched, with the assistance of the Commission of the European Communities, a programme to modernise the Central Statistical Bureau of Latvia (CSB). The existing system of statistical indicators was analysed and harmonised with the EUROSTAT Compendium. The analysis of existing processes and data flows was started at the same time as the preparation of a data processing model that could help to define the requirements for the new IT system. From 1997 to 1999, CSB experts, in cooperation with PHARE experts, prepared the technical specification for the project "Modernisation of CSB - Data Management System", which described all technical and functional requirements for the new system and established statistical metadata as the key element in statistical data processing.
Considering the complexity of the project, it was decided to delegate the development and implementation of ISDMS to an external company with long experience in implementing complex, large-scale, large-budget development projects.
The idea of metadata emerged at the CSB in 1999, and metadata have been collected and analysed since then. In 2002, after careful analysis of data and metadata flows, the Integrated Metadata Driven Statistical Data Management System (hereafter IMD SDMS) was created.
The metadata strategy, defined several years earlier, was developed to cover the full cycle of statistical data processing using a process-oriented approach instead of the stovepipe approach to statistical data production.
Currently the IMD SDMS is based on the following principles:
- metadata must be created/processed/maintained in a standardized environment;
- metadata must be created/processed/maintained in an integrated environment;
- metadata must be created/processed/maintained in a centralized system;
- metadata must be created/processed/maintained in a meta-driven system;
- metadata must be created/processed/maintained in a transparent system;
- metadata must be created/processed/maintained in a system that allows automated generation of user application forms;
- metadata must be created/processed/maintained in a system that has a modular structure;
- metadata must be processed in a system that allows closer connection to respondents.
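To illustrate what a meta-driven system with automated form generation can mean in practice, the following minimal Python sketch derives both an entry form and its validation from a metadata description rather than from survey-specific code. The names used (Variable, render_form, validate) and the sample survey variables are hypothetical and are not taken from the actual IMD SDMS design.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Variable:
    """Metadata describing one survey variable (hypothetical structure)."""
    code: str
    label: str
    kind: str  # "int" or "text"
    minimum: Optional[int] = None
    maximum: Optional[int] = None


def render_form(title, variables):
    """Build a plain-text entry form purely from the metadata."""
    lines = [title]
    for v in variables:
        hint = v.kind
        if v.kind == "int" and v.minimum is not None and v.maximum is not None:
            hint = f"int {v.minimum}..{v.maximum}"
        lines.append(f"{v.code}: {v.label} [{hint}]")
    return "\n".join(lines)


def validate(record, variables):
    """Check a record against the same metadata; return error messages."""
    errors = []
    for v in variables:
        value = record.get(v.code)
        if value is None:
            errors.append(f"{v.code}: missing")
        elif v.kind == "int":
            if not isinstance(value, int):
                errors.append(f"{v.code}: not an integer")
            elif v.minimum is not None and value < v.minimum:
                errors.append(f"{v.code}: below {v.minimum}")
            elif v.maximum is not None and value > v.maximum:
                errors.append(f"{v.code}: above {v.maximum}")
        elif v.kind == "text" and not isinstance(value, str):
            errors.append(f"{v.code}: not text")
    return errors


# Illustrative metadata for an invented survey (assumption, not CSB data).
SURVEY = [
    Variable("EMP", "Number of employees", "int", 0, 100000),
    Variable("NAME", "Enterprise name", "text"),
]

form = render_form("Sample survey", SURVEY)
errors = validate({"EMP": -5, "NAME": "Example Ltd"}, SURVEY)
```

In such a design, adding a new survey would mean adding metadata records only; the form generator and the validation routine stay unchanged, which is the sense in which the system is independent of individual programming.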
In summary, the following improvement goals have been achieved by implementing the system:
- Increased quality of data, processes and output;
- Integration instead of fragmentation on organizational and IT level;
- Reduced redundant activities, structures and technical solutions wherever integration yields more effective results;
- More efficient use and availability of statistical data by using common data warehouse (concerning IMD SDMS, see section "Current situation");
- Users (statistics users, statistics producers, statistics designers, statistics managers) provided with adequate, flexible applications at their specific workplaces;
- Tedious and time-consuming tasks replaced by value-added activities through a more effective use of the IT infrastructure;
- Metadata used as the general principle of data processing;
- Electronic data distribution and dissemination used;
- Extensive use of flexible database management provides users with high performance, confidentiality and security.
Separate storages of data and metadata in the CSB should be handled by a corporate repository; the strategy for the coming years is therefore to focus on creating, developing and implementing a corporate data and metadata repository.
One of the main aims of the repository is to provide a common location for data and metadata storage, ensuring the safety and preservation of data and metadata.
In the future, the NSI of Latvia is considering a project to create a reference metadata base.
1.2 Current situation
In 2010-2011 the IMD SDMS was modified to cover statistical data collection and processing for social statistics as well. It received a new name: MetaData Driven Integrated Statistical Data Management System - Computer Assisted Survey Information System (hereafter MDD ISDMS - CASIS).
Today MDD ISDMS - CASIS covers both business and social statistics and is capable of completely replacing "BLAISE" (a survey processing system). The Population Census 2011 of Latvia has been started in MDD ISDMS - CASIS. It is further planned to start other social statistical surveys, such as the Labour Force Survey and EU-SILC.
The statistical data collection, processing and dissemination processes presented in Scheme 1 form a successfully working system, but some elements, such as the common dissemination database (time series database), the reference metadata base (SDMX) and the links between metadata bases, are currently under construction or planned.
The statistical data collection, processing and dissemination processes of the CSB of Latvia are managed by two systems:
1. MDD ISDMS - CASIS
2. Data and metadata dissemination system.
1. MDD ISDMS - CASIS. This is the Integrated Statistical Data Management System - Computer Assisted Survey Information System for statistical data management, collection and processing, which covers Computer Assisted Personal Interviewing (CAPI), Computer Assisted Telephone Interviewing (CATI) and Computer Assisted WEB Interviewing (CAWI).
Although the majority of statistical surveys (both business and social) pass through their full operation cycle in MDD ISDMS - CASIS, at present some surveys are processed (including data collection) only partially in the system or not at all. In such cases, depending on the processing stage of the individual survey, tools such as MS Access, BLAISE, SPSS and MS Excel are used.
Main principles of MDD ISDMS - CASIS:
2. Independence of individual programming
6. Closer connection to respondents
7. Automatic applications generation
2. Data and metadata dissemination system.
This system is intended both to maintain the common dissemination database and to store and load reference metadata. For the time being it does not hold data and metadata descriptions in a completely integrated way, but the two are connected: data and metadata descriptions are linked together, so a data user can see the metadata available for a particular data table.
The main problem of the current situation is that a common repository for all storages is missing.
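The kind of loose link described above, where data tables and metadata descriptions are held separately but connected by a reference, can be sketched in Python as follows. The store names, identifiers and record contents are invented for illustration and do not describe the actual dissemination system.

```python
# Hypothetical stores: data tables and reference metadata are kept
# separately and joined only through a metadata identifier.
DATA_TABLES = {
    "T001": {"title": "Example table A", "metadata_id": "M10"},
    "T002": {"title": "Example table B", "metadata_id": None},
}
REFERENCE_METADATA = {
    "M10": {"description": "Example methodological note"},
}


def metadata_for(table_id):
    """Return the metadata description linked to a table, or None."""
    table = DATA_TABLES.get(table_id)
    if table is None:
        raise KeyError(f"unknown table: {table_id}")
    # A table without a link (metadata_id is None) yields None.
    return REFERENCE_METADATA.get(table["metadata_id"])


linked = metadata_for("T001")
unlinked = metadata_for("T002")
```

Because the link is only a reference between two separate stores, nothing guarantees that every table has a description, which is one reason a common repository would be preferable.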
On 1 January 2009 the CSB of Latvia introduced a reference metadata repository that describes the contents and quality of statistical data. The Project's Documentation System (ADS) includes the following information, in Latvian, on the surveys and calculations of the CSB, beginning with 2008 annual statistics and 2009 short-term statistics:
1) structured descriptive information according to the production process of statistics (identification of data demand, project preparation, data collection, data processing, data analysis, data dissemination);
2) ESS quality and performance indicators;
3) Thesaurus (definitions of statistical indicators).
Currently ADS is accessible only to internal users. Selected information is planned to become accessible to external users in mid-2011.