
2.1 Statistical business process


Figure 1 shows where the metadata in the IMDB support the statistical business process. Although the metadata layer extends across all phases of the statistical business process, metadata in the IMDB currently support, or will support, analysis, disseminated data, archived datafiles, and the planning and design of surveys. The metadata themselves are derived from the different phases of the survey life cycle and stored in the IMDB. Metadata in the IMDB are also linked to the Agency's various data products, such as datawarehouses, which hold both micro- and macrodata and may be used for data analysis (i.e., data benchmarking and data confrontation). The operational datastores hold the raw data collected from questionnaires (operational data); the registers used for survey frames (e.g., business register, address register, farm register and geographies); imputed and estimated data (survey data); and administrative data. The relationship between the IMDB and the operational datastores has not been fully established.
 

Figure 1. The role of the IMDB in the survey life cycle

 

IBSP
The following figure shows how the IBSP aligns with the STC Corporate Business Architecture (CBA) and the GSBPM.

 

The figures below illustrate the IBSP as a metadata-driven system. The second figure shows the IMDB feeding the IBSP metadata repository.
 

2.2 Current systems


The business model describes the survey design, questionnaires, processing, data sets and products. It contains the metadata for the different phases of the survey. The IMDB has adopted a modified version of the Corporate Metadata Repository (CMR) business dimension model to store metadata describing a survey and its documentation (Figure 2).
 
Figure 2. Business model in the IMDB.

 

The IMDB model defines the entities for describing Statistics Canada's surveys and statistical programs, their content and their methodology, and the relationships between them. Separate metainformation systems, which are not part of the IMDB, support the data collection and data processing phases of the statistical business process.

The basic structure of the metadata in the IMDB is illustrated in Figure 3. Each entity is referred to as an administered item. Each administered item in the IMDB represents a part of the statistical business process and the Data Dimension of the CMR (i.e., data elements and value domains). Administered items are defined, and may be reused or shared; they are also managed, tracked and organized. To support this management, each administered item has the following "regions", outlined in red in Figure 3. The stewardship region (e.g., organization, contact and documentation) supports the administrative aspects of the administered item, such as the responsible division, the information needed for registration, and supporting documentation. The identification region (e.g., identification and time frame) manages the name of the administered item and its time context. The classification region (e.g., keyword and themes) manages the classifications and keywords to which administered items are assigned. At Statistics Canada, some administered items (e.g., surveys and questionnaires), data tables, data releases and publications are organized around themes and sub-themes.
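The region structure described above can be sketched as a small data model. This is an illustrative sketch only, not the IMDB schema: the class and field names (`Stewardship`, `Identification`, `Classification`, `AdministeredItem`, `valid_from`, `valid_to`, `items_for_theme`) are assumptions chosen to mirror the three regions named in the text.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Stewardship:
    """Administration aspects: responsible division, contact, documents."""
    organization: str
    contact: str
    documentation: List[str] = field(default_factory=list)


@dataclass
class Identification:
    """Name of the item and the time frame in which it applies."""
    name: str
    valid_from: str                  # hypothetical field: ISO date string
    valid_to: Optional[str] = None   # None means the item is still in effect


@dataclass
class Classification:
    """Keywords and themes to which the item is assigned."""
    keywords: List[str] = field(default_factory=list)
    themes: List[str] = field(default_factory=list)


@dataclass
class AdministeredItem:
    """Any IMDB entity (survey, instrument, ...) plus its three regions."""
    item_type: str
    stewardship: Stewardship
    identification: Identification
    classification: Classification = field(default_factory=Classification)


def items_for_theme(items: List[AdministeredItem], theme: str) -> List[AdministeredItem]:
    """Return the administered items assigned to the given theme."""
    return [item for item in items if theme in item.classification.themes]
```

Keeping the regions as separate objects reflects the idea that every administered item, whatever its type, is managed through the same stewardship, identification and classification information.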

In Figure 3, the administered items have been grouped into items that support information about the survey and its "umbrella" statistical activity; the survey methodology; and data elements. The green arrows show some of the relationships between these administered items. In the model, all the administered items describing data sources and methodology (i.e., methodology box) are attached to the survey instance; survey instances are linked to the survey; and data elements (variables) and value domains (classifications) are linked to the data file.

The administered items in the current version of the IMDB are: 1. Statistical activity; 2. Survey; 3. Instance; 4. Universe; 5. Instrument (questionnaires); 6. Methodology; 7. Documentation; 8. Data Files; and 9. Questions.
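The relationships described above (methodology items attached to the survey instance, instances linked to the survey, and data elements and value domains linked to the data file) might be modelled roughly as follows. This is a hedged sketch: the class names, the `reference_period` field and the `methodology_for` helper are hypothetical, and the text does not say how data files attach to the other items, so `DataFile` is kept separate here.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DataFile:
    """A data file with its linked variables and classifications."""
    name: str
    data_elements: List[str] = field(default_factory=list)   # variables
    value_domains: List[str] = field(default_factory=list)   # classifications


@dataclass
class Instance:
    """One survey instance; methodology items attach at this level."""
    reference_period: str   # hypothetical identifier for the instance
    # Methodology-box items keyed by an illustrative label,
    # e.g. "universe" or "instrument".
    methodology: Dict[str, str] = field(default_factory=dict)


@dataclass
class Survey:
    """A survey, linked to each of its instances."""
    name: str
    instances: List[Instance] = field(default_factory=list)

    def methodology_for(self, reference_period: str) -> Dict[str, str]:
        """Look up the methodology items attached to one instance."""
        for instance in self.instances:
            if instance.reference_period == reference_period:
                return instance.methodology
        raise KeyError(reference_period)
```

Attaching methodology to the instance rather than to the survey allows the description to differ from one instance of a survey to the next.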
 
Figure 3: Administered items in the IMDB.

 
 

2.3 Costs and benefits


The IMDB was initiated in 1998 with a supplementary budget for systems developers, which has since been converted to permanent systems development resources of 2.5 FTEs per year. The resources, in full-time equivalents (FTEs), from fiscal year 2004/2005 to 2007/2008 are shown below. They include support for the system architecture, systems development and a database administrator, as well as 7 to 9 FTEs for maintaining the IMDB and for working with survey and program areas to educate them on the IMDB model and requirements.

IMDB Resources (full-time equivalents)

                           2004/2005   2005/2006   2006/2007   2007/2008
Software development             2.5         2.2         2.9         3.0
Maintenance of metadata         11.2         9.5         7.0         9.9
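As a quick check on the table, the two rows can be summed per fiscal year. The figures below are copied from the table; the variable names are illustrative only.

```python
# FTE figures copied from the IMDB Resources table.
fiscal_years = ["2004/2005", "2005/2006", "2006/2007", "2007/2008"]
software_development = [2.5, 2.2, 2.9, 3.0]
metadata_maintenance = [11.2, 9.5, 7.0, 9.9]

# Total FTEs per fiscal year, rounded to one decimal to avoid
# floating-point noise in the sums.
total_fte = {
    year: round(dev + maint, 1)
    for year, dev, maint in zip(fiscal_years, software_development, metadata_maintenance)
}

for year, total in total_fte.items():
    print(f"{year}: {total} FTEs")
```

On these figures the combined resources come to 13.7, 11.7, 9.9 and 12.9 FTEs for the four fiscal years.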


2.4 Implementation strategy


The IMDB has been implemented using a "step-wise" approach, beginning with three development phases, followed by identifying opportunities to re-use metadata and to expand the IMDB metadata model to link to other information systems in the Agency.

The first phase (Phase 1) was a set of static web pages describing the data sources and methods for each of approximately 400 active statistical programs and surveys, and a similar number of inactive ones. These were accessible only through hyperlinks from data tables and publications on the Statistics Canada website, while internal users could in addition browse the full inventory of these documents on the Standards Division Intranet site.

In 2000, Phase 2 of the IMDB project was implemented. Metadata were collected from a variety of pre-existing sources, reformatted, submitted to author divisions for validation and loaded into a new metadatabase. A public version was made available on the Statistics Canada website and its internal mirror site, while an internal version is posted on the Standards Division Intranet site. The public version can be accessed through hyperlinks from the Daily (i.e., Statistics Canada's vehicle for data releases) and other Statistics Canada products. Updates are triggered by new releases in the Daily so that up-to-date metadata are made available for every new data release. The content of the Phase 2 web pages is based on the requirements of the Policy on Informing Users of Data Quality and Methodology.

Since November 2000, the major efforts have been directed towards improving the quality of the Phase 2 content and developing Phase 3, which adds definitions of concepts, variables and classifications to the database. The Agency continues to work on completing Phase 3 of the IMDB, the most challenging phase to implement because it is the most complex and the least documented.
