GSIM Communication Paper v1.0 in Norwegian

(kindly provided by Statistics Norway)


Generic Statistical Information Model (GSIM):
Communication Paper for a General Statistical Audience
(Version 1.1, December 2013)





About this document
This document provides an overview of the information represented in GSIM, and summarizes how the model will benefit statistical organizations and how it relates to other models and standards.


This work is licensed under the Creative Commons Attribution 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by/3.0/. If you re-use all or part of this work, please attribute it to the United Nations Economic Commission for Europe (UNECE), on behalf of the international statistical community.




Table of Contents
Introduction
Scope
What is GSIM?
Benefits of GSIM for the organization as a whole
GSIM and GSBPM
What does it mean for me?
The Business view
The Information Technology view
SDMX, DDI and other standards
Summary

Introduction


1. Across the world, statistical organizations undertake similar activities, albeit with variations in the processes each uses. Each of these activities uses and produces similar information (for example, all organizations use classifications, create data sets and disseminate information). Although the information used by statistical organizations is at its core the same, all organizations tend to describe this information slightly differently (and often in different ways within each organization). In the past, there was no common way to describe the information that is used. This made it difficult to communicate clearly within and between statistical organizations, and without clear communication there was no foundation for in-depth collaboration, standardization, or the sharing of tools and methods.

...

4. GSIM is one of the cornerstones for modernizing official statistics and moving away from subject matter silos. It is a key element of the strategic vision prepared by the High-Level Group for the Modernization of Statistical Production and Services (HLG), and endorsed by the Conference of European Statisticians (see: www1.unece.org/stat/platform/display/hlgbas).

5. The modernization of statistical production is needed in order for statistical organizations to remain relevant and flexible in a dynamic and competitive information environment. It is hoped that statistical organizations will adopt and implement GSIM and the common language it provides. However, a model alone cannot transform an organization or its processes. In order to meet the future needs of statistical organizations, GSIM is designed to allow for innovative approaches to statistical production to the greatest extent possible; for example, in the area of dissemination, where demands for agility and innovation are increasing. It is one of the main foundations of the Common Statistical Production Architecture (see: http://www1.unece.org/stat/platform/display/CSPA), a collaborative initiative to design common and interchangeable services with standard interfaces to support standardisation and modernisation. At the same time, GSIM supports current ways of producing statistics.

6. This paper provides an introduction to GSIM, summarizing the key points for a relatively general statistical audience. For more technical detail, please see the specification document and related material, available on the UNECE web site (see: http://www1.unece.org/stat/platform/display/metis/Generic+Statistical+Information+Model+(GSIM)).

Scope

 

7. GSIM provides the information object framework supporting all statistical production processes, such as those described in the Generic Statistical Business Process Model (GSBPM) (see: www.unece.org/stats/gsbpm), giving the information objects agreed names, defining them, specifying their essential properties, and indicating their relationships with other information objects. It does not, however, make assumptions about the standards or technologies used to implement the model.

...

What is GSIM?


9. GSIM contains objects which specify information about the real world – 'information objects'. Examples include data and metadata (such as classifications) as well as the rules and parameters needed for production processes to run (for example, data editing rules). GSIM identifies around 110 information objects, which are grouped into four top-level groups, and are explained in more detail in the specification documentation.

...

Figure 1. GSIM Top-level information object Groups

10. The four top-level groups are described below:

The Business group is used to capture the designs and plans of statistical programs, and the processes undertaken to deliver those programs. This includes the identification of a Statistical Need, the Business Processes that comprise the Statistical Program and the evaluations of them.

The Exchange group is used to catalogue the information that comes in and out of a statistical organization via Exchange Channels. It includes objects that describe the collection and dissemination of information.

The Concepts group is used to define the meaning of data, providing an understanding of what the data are measuring.

The Structures group is used to describe and define the terms used in relation to information and its structure.

11. Figure 2 shows a simplified view of the information objects identified in GSIM. It gives users examples of the objects that are in each of the four top-level groups.

Figure 2. Simplified view of GSIM information objects

12. Figure 3 shows another view of one part of GSIM. This is a slightly more technical view, but still intended to be accessible by a relatively wide audience. Both figures 2 and 3 can be used as a means for communication with users who are interested in examples of the objects and relationships in GSIM.

...

Figure 3. Alternate simplified view of GSIM information objects

13. Figure 3 gives an example of GSIM information objects that tell a story about some of the information that is important in a statistical organisation. Information objects in the GSIM model are given in italics.

"A statistical organization initiates a Statistical Program. The Statistical Program corresponds to an ongoing activity such as a survey or an output series and has a Statistical Program Cycle (for example it repeats quarterly or annually).

The Statistical Program Cycle will include a set of Business Processes. The Business Processes consist of a number of Process Steps which are specified by a Process Design. These Process Designs have Process Input Specifications and Process Output Specifications. The specifications will often be pieces of information that refer to Concepts and Structures (for example, Statistical Classification, Variable, Population, Data Structure, and Data Set).

...

If, for example, the Business Process is related to the collection of data, there will be an Information Provider who agrees to provide the statistical organisation with data (via a Provision Agreement). This Provision Agreement specifies an agreed Data Structure and governs the Exchange Channel used for the incoming information. The Exchange Channel could be a Questionnaire or an Administrative Register. It will receive the information via a particular mechanism (Protocol) such as an interview or a data file exchange.
The Data Set produced by the Exchange Channel will be stored in a Data Resource and structured by a Data Structure.

"

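The story above can be read as a small data model. The following is a minimal sketch, assuming that each information object named in the story becomes a plain Python class; the class attributes, the `outline` helper and all example values are illustrative assumptions and are not part of the GSIM specification.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ProcessDesign:
    input_specification: str   # e.g. refers to a Data Structure
    output_specification: str  # e.g. refers to a Data Set


@dataclass
class ProcessStep:
    name: str
    design: ProcessDesign      # a Process Step is specified by a Process Design


@dataclass
class BusinessProcess:
    name: str
    process_steps: List[ProcessStep] = field(default_factory=list)


@dataclass
class StatisticalProgramCycle:
    frequency: str             # e.g. "quarterly" or "annually"
    business_processes: List[BusinessProcess] = field(default_factory=list)


@dataclass
class StatisticalProgram:
    name: str
    cycle: StatisticalProgramCycle


@dataclass
class ProvisionAgreement:
    information_provider: str
    data_structure: str        # the agreed Data Structure
    exchange_channel: str      # e.g. a Questionnaire or an Administrative Register
    protocol: str              # e.g. an interview or a data file exchange


def outline(program: StatisticalProgram) -> List[str]:
    """Return an indented, human-readable outline of a program (hypothetical helper)."""
    lines = [f"{program.name} ({program.cycle.frequency})"]
    for process in program.cycle.business_processes:
        lines.append(f"  {process.name}")
        for step in process.process_steps:
            design = step.design
            lines.append(
                f"    {step.name}: {design.input_specification} -> {design.output_specification}"
            )
    return lines


# Illustrative values only.
collect = BusinessProcess(
    "Collect data",
    [ProcessStep("Run collection", ProcessDesign("Questionnaire responses", "Data Set"))],
)
program = StatisticalProgram("Example survey", StatisticalProgramCycle("quarterly", [collect]))
agreement = ProvisionAgreement("Example provider", "Example Data Structure", "Questionnaire", "interview")

print("\n".join(outline(program)))
```

The nesting mirrors the story: a program has a cycle, the cycle includes business processes, and each process step carries the design that specifies its input and output.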
Benefits of GSIM for the organization as a whole

...

  • Between the different roles in statistical production (business and information technology experts);
  • Between the different statistical subject matter domains;
  • Between statistical organizations at national and international levels.

 


19. Improving communication will result in a more efficient exchange of data and metadata within and between statistical organizations, and also with external users and suppliers.

...

  • Build capability among staff by using GSIM as a teaching aid that provides a simple, easy-to-understand view of complex information and clear definitions;
  • Validate existing information systems and compare them with emerging international best practice and, where appropriate, draw on international expertise;
  • Guide development or updating of international or local standards to ensure they meet the broadest needs of the international statistical community.

 

GSIM and GSBPM


21. GSIM and GSBPM are complementary models for the production and management of statistical information. GSBPM models the statistical production process and identifies the activities undertaken by producers of official statistics that result in information outputs. These activities are broken down into sub-processes, such as "Impute" and "Calculate aggregates". As shown in Figure 4, GSIM helps describe GSBPM sub-processes by defining the information objects that flow between them, that are created in them, and that are used by them to produce official statistics.

...


Figure 4. GSIM and GSBPM

22. Greater value will be obtained from GSIM if it is applied in conjunction with GSBPM. Likewise, greater value will be obtained from GSBPM if it is applied in conjunction with GSIM. Nevertheless, it is possible (although not ideal) to apply one without the other. In the same way that individual statistical business processes do not use all of the sub-processes described within GSBPM, it is very unlikely that all information objects in GSIM will be needed in any specific statistical business process.

...

  • Facilitate the building of efficient metadata driven collection, processing, and dissemination systems;
  • Help harmonize statistical computing infrastructures.


25. GSIM supports a consistent approach to metadata, facilitating the primary role for metadata envisaged in Part A of the Common Metadata Framework "Statistical Metadata in a Corporate Context" (see: http://www1.unece.org/stat/platform/display/metis/The+Common+Metadata+Framework), that is, that metadata should uniquely and formally define the content and links between objects and processes in the statistical information system.

...

  • Subject matter specialists, methodologists and information technologists;
  • Statisticians in different domains of a statistical organization;
  • Statisticians in different organizations.

28. GSIM will help you design and understand your processes (and their inputs and outputs) better. 

29. For a production cycle, a statistician can design the input and the output, and the process in-between. In GSIM terms, the output and the input can be designed in terms of structures and concepts information objects, and the process in-between can be designed using the business information objects. The structures and concepts objects are provided by subject matter specialists.

30. As seen in Figure 5, if the GSBPM is considered as a frame of reference for statistical production processes, the first level can be considered as equivalent to the statistical production process as a whole. The next level corresponds to a phase of the statistical production process (for example the "Process" phase 5 of the GSBPM). The third level corresponds to a sub-process (for example sub-process 5.3 of the GSBPM – Review and validate). The fourth level consists of the individual building blocks within the sub-process, such as detecting financial values that might be expressed in thousands rather than units.

...

Figure 5. GSIM information objects in context of GSBPM

31. An important issue for statisticians is the problem of single-use design components, which are often recreated or at least modified for each production cycle. GSIM facilitates the description of inputs and outputs at each level of the GSBPM, following the same pattern, thus providing a consistent structure for designing statistical processes. It supports the design, specification and implementation of harmonized methods and standard technology to create a generalized statistical production system.
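The idea of describing inputs and outputs in the same pattern at every level can be sketched in code. The example below is a minimal illustration, assuming a single hypothetical `ProcessNode` class reused for the whole process, a phase, a sub-process and a building block. The `looks_like_thousands` check, its 1000-fold ratio test and its tolerance are assumptions made for this sketch, not a method prescribed by GSIM or GSBPM.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Tuple


@dataclass
class ProcessNode:
    """One level of the hierarchy, always described by its inputs and outputs."""
    level: str
    name: str
    inputs: List[str]
    outputs: List[str]
    children: List["ProcessNode"] = field(default_factory=list)


def walk(node: ProcessNode, depth: int = 0) -> Iterator[Tuple[int, str, str]]:
    """Yield (depth, level, name) for a node and all of its descendants."""
    yield depth, node.level, node.name
    for child in node.children:
        yield from walk(child, depth + 1)


def looks_like_thousands(reported: float, reference: float, tolerance: float = 0.5) -> bool:
    """Flag a value that is roughly 1000 times smaller than a reference value.

    The reference could be, for example, the previous period's value (an assumption).
    """
    if reported <= 0 or reference <= 0:
        return False
    ratio = reference / reported
    return 1000 * (1 - tolerance) <= ratio <= 1000 * (1 + tolerance)


# Illustrative hierarchy; the input and output labels are placeholders.
block = ProcessNode("building block", "Detect values in thousands", ["Data Set"], ["Flagged Data Set"])
sub_process = ProcessNode("sub-process", "GSBPM sub-process 5.3", ["Data Set"], ["Validated Data Set"], [block])
phase = ProcessNode("phase", "GSBPM phase 5 (Process)", ["Data Set"], ["Processed Data Set"], [sub_process])
whole = ProcessNode("process", "Statistical production process", ["Statistical Need"], ["Products"], [phase])

for depth, level, name in walk(whole):
    print("  " * depth + f"{level}: {name}")
```

Because every level uses the same structure, a design made for one production cycle can be inspected and reused in the next without being rewritten.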

...

40. At the national level, statisticians will become more self-supporting in the design (see Figure 6) and production of their statistics by reusing and repurposing harmonized components. GSIM will enable more flexible and modular production systems. Production will be based upon more standardized applications that are more robust to change and less vulnerable to changes in IT personnel. An increase in the use of standardized applications, which can easily be shared across domains, will enable the IT specialists to more easily work in different domains.

41. The use of GSIM will reduce the workload as many components can be repurposed and reused. This means less repetitive work and more time for innovation.

42. This will free the IT staff to make more robust applications and explore new ways to better meet the changing needs of the statistical organization and their clients at large. This will include more time for creation of robust, modular, harmonized, well documented processes that comply with the requirements of the Common Statistical Production Architecture.

Figure 6. Design your own imputation process

43. At the international level there will be increased possibilities for co-design and co-development of common components based upon more robust user-requirements from a wider user-community. The IT developers will also have access to a larger development community that all speak the same language to describe their statistical information.

...

45. The information objects within GSIM are conceptual; no specific physical representation of the information is prescribed. As a simplified illustration, the name of an organization can be defined as the same concept regardless of whether the information is recorded in a database, in a spreadsheet, in a CSV file, in an XML file or handwritten on a piece of paper.
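As a small illustration of this point, the sketch below reads an organization name from a CSV record and from an XML record and maps both to the same conceptual object. The `OrganizationName` class, the column name `organization_name`, the XML element names and the sample value are all assumptions made for this example; GSIM itself does not prescribe any of them.

```python
import csv
import io
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass(frozen=True)
class OrganizationName:
    """Conceptual object: the name of an organization, independent of storage format."""
    value: str


def from_csv(text: str) -> OrganizationName:
    # Read the first data row and pick the (assumed) name column.
    row = next(csv.DictReader(io.StringIO(text)))
    return OrganizationName(row["organization_name"].strip())


def from_xml(text: str) -> OrganizationName:
    # Read the text of the (assumed) <name> child element.
    root = ET.fromstring(text)
    return OrganizationName(root.findtext("name").strip())


csv_record = "organization_name\nStatistics Norway\n"
xml_record = "<organization><name>Statistics Norway</name></organization>"

# Two physical representations, one concept.
print(from_csv(csv_record) == from_xml(xml_record))
```

The two parsers differ only in technical detail; the object they produce is the same, which is the sense in which the concept is independent of its physical representation.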

46. GSIM allows organizations to start with a common language related to the data and metadata used throughout the statistical production process. In this context, GSIM information objects have been mapped to relevant representations in SDMX and DDI.

...

48. While GSIM information objects can be mapped to SDMX and DDI (and substantial business benefit can be obtained from harnessing these standards), GSIM does not require these standards to be used. Some producers and some users of statistics may decide to use alternative standards for particular purposes. In other cases, producers of statistics may be open to using SDMX and/or DDI but have legacy information systems which are not economical to update for use with these standards.

49. Describing statistical information using GSIM as the common point of reference helps users identify the relationship between two sets of statistical information which are represented differently from a technical perspective.

50. For example, a statistician may receive some data described in DDI and some described in a locally created format. The statistician can relate both of these to GSIM. The statistician will be able to identify which differences are purely technical and which reflect underlying conceptual differences.

51. Once the nature and extent of the differences can be understood, it often proves straightforward to transform the information into a common technical representation (for example, SDMX or DDI) which allows the content to be integrated and explored. This approach ensures that the results of the technical conversion to a common standard are accurately understood, and are sound, from a conceptual perspective.
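The comparison described in the two paragraphs above can be sketched as follows, assuming each source has already been related to a shared set of concept labels. The field names, the concept labels and the `compare_sources` helper are hypothetical and serve only to illustrate the distinction between technical and conceptual differences.

```python
from typing import Dict, Set, Tuple

# Hypothetical mappings from source-specific field names to shared concept labels.
ddi_fields: Dict[str, str] = {"var_age": "Age", "var_sex": "Sex", "var_inc": "Gross income"}
local_fields: Dict[str, str] = {"AGE_YRS": "Age", "GENDER": "Sex", "NET_INC": "Net income"}


def compare_sources(a: Dict[str, str], b: Dict[str, str]) -> Tuple[Set[str], Set[str]]:
    """Split concepts into those shared by both sources and those found in only one.

    Shared concepts differ only technically (field names, file format).
    Concepts found in one source only point to a conceptual difference.
    """
    concepts_a, concepts_b = set(a.values()), set(b.values())
    return concepts_a & concepts_b, concepts_a ^ concepts_b


shared, conceptual = compare_sources(ddi_fields, local_fields)
print("Technical differences only:", sorted(shared))
print("Conceptual differences:", sorted(conceptual))
```

In this illustration the age and sex fields can be converted to a common representation directly, while the two income fields measure different things and need a decision from a subject matter specialist before integration.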

...

53. For example, when determining the set of definitions to be used for information objects within GSIM, existing standards and models were harnessed as key reference sources. While none of these existing sources had the same purpose and scope as GSIM – that is, a reference framework of information objects spanning the full statistical production process – the development of each entailed analysing and supporting particular needs and scenarios related to particular types of statistical data and metadata.

54. In this way GSIM benefited from the investment of time in analysis, modelling, testing and refinement when developing these standards and models to their current level of maturity. It also means GSIM does not vary "for no reason" from terms and definitions which are used in existing standards and models. Where it does vary, it is for reasons such as existing relevant standards and models being inconsistent internally or with one another, and/or statisticians reporting that alternative terms or definitions are more relevant to their business needs. A direct consequence of this was the revision of the Neuchâtel Model for Classifications, to fully align and integrate it with GSIM.

Summary


55. This paper introduces GSIM to people working in statistical organisations. It outlines the benefits of the model as well as how the adoption of the model might impact staff in statistical organisations. The paper also discusses the interaction of GSIM and other frameworks and standards such as GSBPM, DDI and SDMX.

56. For more information on how a statistical agency might implement GSIM, the GSIM User Guide introduces the steps that need to be undertaken.

57. For more detailed information on the information objects in GSIM, their definitions, attributes and relations, the GSIM Specification document provides a fine level of detail and also discusses the relationship between GSIM and other standards and models. The GSIM wiki page (http://www1.unece.org/stat/platform/display/metis/Generic+Statistical+Information+Model) also includes links to information about practical implementations, and other resources that might be useful to organisations adopting GSIM as a corporate standard.

...
