
3. Statistical Metadata in each phase of the Statistical Business Process (Statistics Canada)


3.1 Metadata Classification

GSIM is being adopted to specify, design, and implement components that will easily integrate into “plug’n’play” solution architectures and seamlessly link to standard exchange formats (e.g. DDI, SDMX). It is important to note that GSIM does not make assumptions about the standards or technologies used to implement the model, which leaves the Agency room to determine its own implementation strategy.

Statistics Canada is beginning to use GSIM’s Concepts and Structures Groups as the main classifiers of metadata. These groups contain the conceptual and structural metadata objects, respectively, that are used as inputs and outputs in a statistical business process. The Structures group defines the terms used in relation to data and their structure. The Concepts group defines the meaning of data, providing an understanding of what the data are measuring.

Work focuses on aligning the new GSIM-based classification with other internal metadata classification models currently in use. For instance, IBSP identifies the following types of metadata:

- Reference metadata: Describes statistical datasets and processes.
- Definitional metadata: Describes statistical data in terms meaningful to the business user community, e.g. concepts, definitions, variables, classifications, value meanings and domains.
- Quality metadata: Quality evaluation of a dataset or of individual records; helps users assess the fitness of the associated data for their specific purposes, e.g. CV, rolling estimates, analysts' comments about the quality of a set of records.
- Operational metadata: Links between the concepts and the physical data.
- Systems metadata: Low-level information about files, servers and infrastructure that allows the physical IT environment to be updated without re-specification by the end user.
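As an illustration only, the five IBSP metadata types listed above could be represented as an enumeration and used to tag individual metadata items. This is a minimal sketch: the names `MetadataType`, `MetadataItem` and `group_by_type`, and the sample items, are hypothetical and are not taken from IBSP or GSIM.

```python
from dataclasses import dataclass
from enum import Enum


class MetadataType(Enum):
    """The five IBSP metadata types, with a short paraphrased description."""
    REFERENCE = "Describes statistical datasets and processes"
    DEFINITIONAL = "Describes statistical data in business terms"
    QUALITY = "Evaluates the quality of a dataset or of individual records"
    OPERATIONAL = "Links concepts to the physical data"
    SYSTEMS = "Low-level information about files, servers and infrastructure"


@dataclass(frozen=True)
class MetadataItem:
    name: str           # hypothetical label for a piece of metadata
    kind: MetadataType  # which of the five types it belongs to


def group_by_type(items):
    """Return a dict mapping every metadata type to the names tagged with it."""
    grouped = {kind: [] for kind in MetadataType}
    for item in items:
        grouped[item.kind].append(item.name)
    return grouped


# Hypothetical sample items, loosely based on the examples in the list.
items = [
    MetadataItem("coefficient of variation (CV)", MetadataType.QUALITY),
    MetadataItem("variable definition", MetadataType.DEFINITIONAL),
    MetadataItem("analyst comment on record quality", MetadataType.QUALITY),
]
grouped = group_by_type(items)
```

Every type appears as a key in the result, even when no item carries it, which makes gaps in coverage easy to spot.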

[1] For example: analyst comments about their analysis, output of statistical processes; respondent comments, interviewer comments or additional information about the respondent obtained during collection.

3.2 Metadata used/created at each phase

Metadata use is not uniform across all GSBPM phases. IMDB metadata, consisting mostly of GSIM Concepts and Business objects, is used for survey design (phase 2) and dissemination (phase 7). Survey managers use the IMDB to identify existing variables for reuse. New variables and related questions will soon be documented and stored during questionnaire design as well, and will then be used by collection processes across the Agency. In the dissemination phase, the IMDB is the primary source of summary texts describing surveys, definitions of variables, related methodology, and data quality and questionnaire images. Most products on the Statistics Canada website offer a link to the related IMDB survey records.

The System of National Accounts (SNA) creates and uses metadata (classifications) for the data integration sub-process (5.1) of the GSBPM. In particular, the SNA creates the Input-Output Industry Codes (IOIC), Commodity Codes (IOCC) and Institutional Sectors classifications. They are mainly used for the GDP surveys (annual, quarterly and monthly). SNA classifications are being exported to the IMDB and integrated into data warehouses for analysis via a classification web service[1]. In addition, concordances to NAICS, NAPCS and other international classifications will be maintained in a Classification Management and Coding System (CMCS).
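To make the idea of a concordance concrete, the sketch below indexes (source code, target code) pairs so that one code can be translated into its counterparts in another classification. It is only an illustration: the codes are placeholders, not real IOIC or NAICS codes, and the function names are hypothetical rather than part of the CMCS.

```python
from collections import defaultdict

# Placeholder codes only: these are NOT real IOIC or NAICS codes.
CONCORDANCE_ROWS = [
    ("IOIC-X1", "NAICS-A"),
    ("IOIC-X1", "NAICS-B"),  # one source code may map to several targets
    ("IOIC-X2", "NAICS-C"),
]


def build_concordance(rows):
    """Index (source, target) pairs by source code, preserving row order."""
    index = defaultdict(list)
    for source, target in rows:
        if target not in index[source]:  # skip duplicate pairs
            index[source].append(target)
    return dict(index)


def translate(code, concordance):
    """Return the target codes for `code`, or an empty list if unmapped."""
    return list(concordance.get(code, []))


concordance = build_concordance(CONCORDANCE_ROWS)
```

Returning a list rather than a single code reflects the fact that concordances between classifications are often one-to-many.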


The Social Survey Processing Environment (SSPE) has its own metadata repository (see Section IV-B) which is used from design through dissemination, including survey and questionnaire metadata, codesets and codebooks. The SSPE repository does not use the same metadata objects that are in the GSIM Business or Structures groups.

Underlying the GSBPM is the implicit need for common semantics, which require some degree of harmonization and maintenance across all phases. Even common concepts like questionnaire, survey, or classification mean different things across the Agency. Enterprise Architecture Services (EAS), Methodology and Subject Matter areas have worked collaboratively to make progress in semantics work for the IBSP on a number of topics, including:

Statistics Canada’s involvement in the development of the GSIM has influenced both GSIM and the Agency’s internal semantic work. A case in point is the work done by EAS with IBSP and the Integrated Collection and Operation System (ICOS) on survey instruments and questionnaires, which helped identify the need for a flow decision object separated from flow action that was included in version 1.0 (submitted to the GSIM group for review). This semantic work has been the starting point for developing a canonical model for survey instrument and questionnaire for the SOA[3].

IBSP has also developed a conceptual framework and naming convention for harmonized content. SSPE has developed standardized questionnaire modules for cross-cutting household survey variables. These modules contain standard concepts, definitions, classification and wording for multiple collection modes.

[1] See Section IV-F-(a) for more information on this service.

[2] It indicates the provenance of the value for data quality purposes.

[3] See Section IV-F for more information on SOA canonical models.

3.3 Metadata relevant to other business processes

Several projects have been identified as potential content providers or consumers of the IMDB. The IMDB now stores documentation for public use microdata files as part of the requirements for the Data Liberation Initiative (DLI) - an initiative between Statistics Canada and Canadian universities to share data for social science research. Statistics Canada makes available to universities and colleges, by subscription, all of its statistical products including microdata files using Data Documentation Initiative (DDI) specifications.

Another initiative under development is the Research Data Centre (RDC) Metadata Project. Rather than integrating with the IMDB on a case-by-case basis (point-to-point integration), authorized applications can gain access to content through IT industry standard web services in a standards-based format (DDI). This approach is expected to reduce development costs, allow for code and component reuse across projects and foster the adoption of global standards across the Agency. It will also support the future establishment of a standards-based data and metadata management framework.
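The benefit of a standards-based format is that a consuming application can read content with generic tooling instead of a custom integration per source. The sketch below shows the idea using only the Python standard library. It assumes a heavily simplified fragment modelled loosely on the `var` and `labl` elements of DDI Codebook; real DDI documents are namespaced and far richer, and the variable names and the `variable_labels` helper are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, non-namespaced fragment for illustration only.
SAMPLE = """
<codeBook>
  <dataDscr>
    <var name="AGE">
      <labl>Age of respondent</labl>
    </var>
    <var name="PROV">
      <labl>Province of residence</labl>
    </var>
  </dataDscr>
</codeBook>
"""


def variable_labels(ddi_xml):
    """Map each variable name to its label text (empty string if missing)."""
    root = ET.fromstring(ddi_xml.strip())
    labels = {}
    for var in root.iter("var"):
        labl = var.find("labl")
        has_text = labl is not None and labl.text
        labels[var.get("name")] = labl.text.strip() if has_text else ""
    return labels


labels = variable_labels(SAMPLE)
```

In a service-based setup, the XML string would come from the web service response rather than from an inline constant; how that service is called is outside the scope of this sketch.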

 

 

 
