4.1 Metadata system(s)
The Data Service Center (DSC), the system that supports the office-wide storage and retrieval of data and metadata, is under development and currently in a pilot phase. A first version is scheduled to be operational by the end of 2009.
The DSC consists of two main components. The first component is a tailor-made classification server that stores and maintains classifications and code lists. The second component is based on a commercial documentation software package. It holds the metadata, structured according to the SN Metadata model. The SN Metadata model is inspired by both the Swedish and the Neuchâtel models and is meant to describe steady state data. To support a gradual development of the DSC and to guarantee a close connection with statistical processing tools, the SN Metadata model is based upon a separate metadata architecture. This metadata architecture also covers the transformation of data and metadata during statistical processing.
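The division of labour between the two components can be illustrated with a minimal sketch, in which a dataset description refers to a code list by identifier and the values are checked against that list. The names `CODE_LISTS` and `invalid_codes`, and the example code list, are hypothetical and do not reflect the actual DSC interfaces.

```python
# Hypothetical stand-in for the classification server: code list id -> codes.
CODE_LISTS = {
    "SEX": {"M": "Male", "F": "Female"},  # illustrative code list only
}


def invalid_codes(code_list_id, values):
    """Return the sorted distinct values that are not in the named code list."""
    allowed = set(CODE_LISTS[code_list_id])
    return sorted(set(values) - allowed)


# A variable described in the documentation component would carry the
# identifier "SEX"; its data values can then be validated against the list.
print(invalid_codes("SEX", ["M", "F", "X", "M"]))
```

Keeping code lists in one place and referring to them by identifier is one way to let several datasets share the same classification.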
At this moment SN is able to use the commercial software package without any tailor-made software. The configuration options of the package suffice to store and maintain the SN Metadata model as well as the metadata itself. Relying on configuration alone is important because new versions of the software can then be adopted without additional programming. It is not yet certain whether SN will need tailor-made programming in the future; the aim is to avoid it.
The BA requires a distinction between ex ante and ex post metadata. In the design phase ex ante metadata are formulated: they prescribe the statistical data required, including their required quality. During the production phase ex post metadata describe the statistical data actually realized, including their realized quality. Differences between ex ante and ex post metadata are used to derive indicators of the quality of the statistical process and the statistical product. These indicators are meant to trigger possible future redesign phases.
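The comparison of ex ante and ex post metadata can be sketched as follows. This is a minimal illustration, assuming two hypothetical quality measures (`response_rate` and `max_delay_days`) and a simple redesign rule; the source does not specify which indicators SN actually derives.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QualitySpec:
    """Quality figures for one dataset; the field names are hypothetical."""
    response_rate: float   # fraction between 0 and 1
    max_delay_days: int    # days between reference period end and delivery


def quality_gaps(ex_ante, ex_post):
    """Return realized minus required for each quality measure."""
    return {
        "response_rate": round(ex_post.response_rate - ex_ante.response_rate, 4),
        "max_delay_days": ex_post.max_delay_days - ex_ante.max_delay_days,
    }


def needs_redesign(gaps):
    """Assumed rule: flag a redesign when any requirement is missed."""
    return gaps["response_rate"] < 0 or gaps["max_delay_days"] > 0


required = QualitySpec(response_rate=0.80, max_delay_days=30)  # ex ante
realized = QualitySpec(response_rate=0.72, max_delay_days=35)  # ex post
gaps = quality_gaps(required, realized)
print(gaps, needs_redesign(gaps))
```

A comparison of this kind is only possible when both sides are stored in a structured form, so that the required and the realized values can be matched measure by measure.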
The DSC is able to store conceptual, process and quality metadata; the SN Metadata model, however, covers conceptual metadata only. Process and quality metadata are therefore stored as free text.
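The difference between modelled and free-text metadata can be made concrete with a small sketch. The class and field names below are hypothetical and are not taken from the SN Metadata model; they only show that structured fields can be queried exactly, while free text can only be searched as text.

```python
from dataclasses import dataclass


@dataclass
class ConceptualMetadata:
    """Structured part; these three fields are illustrative assumptions."""
    variable: str
    population: str
    classification: str


@dataclass
class DatasetMetadata:
    conceptual: ConceptualMetadata
    process_notes: str = ""   # free text, not covered by the model
    quality_notes: str = ""   # free text, not covered by the model


def find_by_classification(records, classification):
    """Exact lookup on a structured field."""
    return [r for r in records if r.conceptual.classification == classification]


records = [
    DatasetMetadata(
        ConceptualMetadata("turnover", "enterprises", "ACTIVITY"),
        process_notes="Edited and imputed.",
        quality_notes="Response was lower than planned.",
    ),
    DatasetMetadata(ConceptualMetadata("age", "persons", "AGE_GROUP")),
]
matches = find_by_classification(records, "ACTIVITY")
print([m.conceptual.variable for m in matches])
```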
The first version of the DSC will contain ex post quality metadata. Ex ante quality metadata will be added in a later version. A further wish is a closer relation between the two components of the DSC.
4.2 Costs and Benefits
Though SN has had a history of trial and error, the present pilot did not cost a lot of resources. At the beginning of the pilot phase the SN Metadata model was implemented by four software engineers in less than one week. The tailor-made classification server is a remnant of earlier attempts.
4.3 Implementation strategy
The implementation strategy unfolds along three lines. In the first place, all new development projects should conform to the new BA and should take the DSC as the point of departure for the storage of their steady state data. In the second place, existing datasets should be added to the DSC if there is a need for reuse. In the third place, all data arriving from outside the office (the so-called pre-input data) will be stored in the DSC.