Generic Statistical Information Model

An innovative collaboration which facilitates international collaboration

Primary authors: Aurito Rivera, Alistair Hamilton & Steven Vale

I.               Introduction

1.         Version 1 of the Generic Statistical Information Model (GSIM), released in December 2012, was the product of an innovative process of collaboration between national and international statistical agencies.

2.         GSIM is a reference framework of internationally agreed definitions, attributes and relationships that describe the pieces of information that are used in the production of official statistics (information objects).

3.         Examples of information objects in scope for GSIM include data (held in datasets) and metadata (eg classifications, variables, questions, populations).

4.         While there are some differences in nature and purpose, statistical agencies have a long history of working together to establish conceptual frameworks for statistical domains.  One well-known and broadly influential example is the System of National Accounts (SNA).

5.         The program of work to develop SNA 2008, as an update to SNA 1993, was initiated in 2003 [1].  The final version was released in English in 2009 [2].  Several member countries in the OECD will not implement SNA 2008 until 2014 or later [3].

6.         Using innovative techniques, the GSIM reference framework was developed in less than two years.  The bulk of the development, in fact, occurred over a period of ten months.

7.         This paper focuses on how (and why) GSIM was developed.  It identifies learnings which might be relevant to future collaborative developments.  It also discusses strategic benefits associated with implementing GSIM, including a range of “next step” activities which are making use of GSIM V1.0 now that it has been delivered.

8.         The paper does not introduce, in detail, GSIM as a product.  Readily accessible materials for understanding GSIM as a product, including brochures, posters, presentations, a readers’ guide and a user guide, are available through the GSIM Home Page [4].

II.            Why development of GSIM was initiated by the Statistical Network (SN)

9.         The inaugural meeting of the Informal CSTAT [5] workgroup on stronger collaboration on Statistical Information Management Systems was held in Paris in June 2010.  The meeting agreed the informal workgroup should, in future, be known as the Statistical Network (SN).

10.     The National Statistical Offices (NSOs) which took part in the initial meeting were from Australia, Canada, New Zealand, Norway, Sweden and the United Kingdom.  Italy joined the SN in 2012.

11.     The agreed purpose for the SN was

Working together with pace and passion to better meet our societies’ information needs while driving down costs.

12.     The critical goal was

Harmonising statistical methods, systems and capabilities across statistical agencies.

13.     The objective was collaboration in practical small steps to industrialise methods and processes to quickly and effectively benefit all participating NSOs. [6]

14.     The inaugural meeting selected an initial set of five collaboration opportunities to improve statistical information management.

15.     Four of these opportunities focused on specific processes, each of which was located within a particular phase of the Generic Statistical Business Process Model (GSBPM). [7]

16.     The SN also identified an essential role for a consistent reference model which would be used when defining the information required to drive any statistical production process and to describe the outputs from that process.

17.     For example, if the Statistical Network were to collaborate to develop a tool to support a business process such as estimation

    how would the structure of the input and output datasets be defined?

    how would variables be defined, and linked to the data structure?

    how would different levels of aggregation for particular variables be represented?

    how would the relevant weights be identified?

18.     If different sub-processes, and the tools to support them, were to fit together coherently – supporting a consistent flow of data and metadata through the statistical business process overall – the modelling of statistical information should be addressed on a consistent basis within each “small step” collaboration undertaken by the SN.
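The questions raised in paragraph 17 can be made concrete with a small sketch. The Python below is illustrative only: the class names, field names and the estimation function are hypothetical and are not taken from the GSIM specification. It shows how a shared description of a dataset's structure (its variables, aggregation levels and weight) lets an estimation tool work without knowing in advance which agency produced the data.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Variable:
    """A named variable with a plain-language definition (hypothetical class)."""
    name: str
    definition: str


@dataclass
class DataStructure:
    """Describes the columns of a dataset independently of any tool (hypothetical)."""
    identifiers: List[Variable]
    measures: List[Variable]
    weight: Optional[Variable] = None
    # Variable name -> aggregation levels, ordered from finest to coarsest.
    aggregation_levels: Dict[str, List[str]] = field(default_factory=dict)

    def column_names(self) -> List[str]:
        cols = [v.name for v in self.identifiers + self.measures]
        if self.weight is not None:
            cols.append(self.weight.name)
        return cols


def weighted_total(rows: List[dict], structure: DataStructure, measure: str) -> float:
    """Estimate a total by summing measure * weight over all rows."""
    if structure.weight is None:
        raise ValueError("structure does not identify a weight variable")
    if measure not in [v.name for v in structure.measures]:
        raise KeyError(f"{measure!r} is not a measure in this structure")
    weight_name = structure.weight.name
    return sum(row[measure] * row[weight_name] for row in rows)


# Assumed example data, invented for illustration.
region = Variable("region", "Region of the responding unit")
turnover = Variable("turnover", "Annual turnover reported by the unit")
design_weight = Variable("design_weight", "Sampling weight of the unit")

structure = DataStructure(
    identifiers=[region],
    measures=[turnover],
    weight=design_weight,
    aggregation_levels={"region": ["region", "country"]},
)

rows = [
    {"region": "North", "turnover": 10.0, "design_weight": 2.0},
    {"region": "South", "turnover": 5.0, "design_weight": 4.0},
]
estimate = weighted_total(rows, structure, "turnover")
```

Because the estimation function reads the weight and measure names from the structure rather than hard-coding them, the same function could in principle be reused by any agency that describes its datasets in the same way.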

19.     Establishing a common reference framework for modelling statistical information was, therefore, seen as the highest priority strategic enabler for ensuring coherence across the various practical collaborations the SN would undertake.  The SN initiated development of GSIM as the fifth collaboration opportunity.

20.     It was recognised at this time that GSIM would also facilitate efficient and effective collaboration in the development and sharing of statistical information systems and statistical information management frameworks beyond the SN.  While initiated by the SN, development of GSIM was not considered to be exclusively of interest to members of the SN.

21.     GSIM was envisaged as a common reference for modelling statistical information at the conceptual and logical semantic levels.  GSIM would then be operationalized (or implemented in practice) on a consistent basis in order to define the information required to drive a particular statistical production process, including to define the outputs (eg transformed statistical data) and outcomes (eg process metrics) from the process.

22.     It was anticipated that existing statistical information exchange standards (primarily SDMX [8] and DDI-L [9]), with further evolution in some regards, would provide a sound and consistent basis for operationalizing GSIM rather than needing to develop completely new technical standards to support operationalization.

23.     The role of GSIM when defining and exchanging statistical information can be seen as in some ways analogous to the role of SNA when compiling and comparing national accounts data.  Exact implementation practices for collecting and compiling data to produce National Accounts need to be decided at a national level – SNA does not provide a complete “ready to implement” blueprint.  SNA does, however, provide a common reference framework to promote consistency of definitions and practices and to promote comparability of outputs.

24.     After the inaugural meeting in June 2010, it took some time for top level project co-ordination and oversight processes to be agreed for the SN.  The project team responsible for the GSIM collaboration, which included members from each of the six NSOs within the SN, first met (by teleconference) in late November 2010.  The initial project plan for developing GSIM was agreed just before Christmas.

25.     GSIM V0.1 [10] was released for external review in June 2011.

26.     Members of the collaboration team had not met in person up to this time.  All work had been transacted via teleconferences, via use of an online collaboration space and via email.

III.          Establishment of HLG for Modernisation of Statistical Production and Services with GSIM as a cornerstone of its strategic vision

27.     The body now known as the High Level Group for Modernisation of Statistical Production and Services (HLG) [11] was established by the Bureau of the Conference of European Statisticians late in 2010 [12].

28.     The mission of the HLG is to oversee development of frameworks, and sharing of information, tools and methods, which support the modernisation of statistical organisations. The aim is to improve the efficiency of the statistical production process, and the ability to produce outputs that better meet user needs. [13]

29.     There are currently ten members of HLG.  Each is the head of an NSO or the chief statistician of an international organisation (UNECE, OECD, Eurostat).  There is geographic diversity in the membership of the HLG (eg Europe, Latin America, Asia).

30.     In March 2011 a Strategic Vision [14] was released by HLG as its first output.

31.     The Strategic Vision outlines the fundamental challenges facing producers of official statistics in the 21st century.  The Vision identifies that producers of official statistics need to re-invent our products and processes, and adapt to a changed world.  Addressing these challenges is essential if producers of official statistics are to remain as relevant to the information needs of our governments and societies as we were in the 20th century.

32.     The Vision identifies that the challenges are too big for each organisation to tackle on its own.  We need to work together!

33.     The Vision highlights that producers of official statistics should be able to work together as an “industry”, including defining and applying shared “industry standards and frameworks” which will facilitate collaborative development and sharing of processes, methods, IT components and statistical information.

34.     The Strategic Vision referenced GSIM as a cornerstone for industrialising statistics.  This brought GSIM to the attention of a much wider audience and highlighted its important role in the global agenda for pursuing modernisation.

IV.         Accelerating development of GSIM

35.     Twelve sets of comments were received from external reviewers in regard to GSIM V0.1, including reviewers from major international organisations.  Three sets of comments “re-envisaged”, each in a different way, the approach to the top level structure and presentation of GSIM.

36.     The approach to the top level of GSIM had received protracted discussion within the SN project team previously, with several different approaches being trialled and none proving entirely satisfactory from the team’s perspective.   The team therefore started by engaging the three reviewers further.

37.     The inaugural Workshop on Strategic Developments in Business Architecture [15] was convened by HLG in October/November 2011.  The Workshop brought together HLG members and representatives from the twenty or so partnerships/collaborations (including the Statistical Network) which are recognised by HLG in its inventory of international groups whose work is related to the enterprise architecture of statistical organisations. [16]

38.     While there was uniform agreement at the Workshop that development of GSIM should be a top priority, particularly as a number of other initiatives depended on it, participants did not have a consistent view of the expected scope, purpose and nature of GSIM.

39.     HLG met immediately after the Workshop and requested that its Secretariat team prepare proposals for a “sprint” session to accelerate the development of the GSIM.

40.     A “sprint” was defined as an intensive collaboration exercise, bringing together appropriate experts in a single location for a fixed period of time to achieve a given set of outputs.

a.   Planning the Sprints

41.     The proposal which was prepared by the HLG Secretariat [17] refined the original concept of a single sprint of four weeks’ duration to a two phase approach consisting of two sprints, each of two weeks’ duration.  A gap of six weeks would be provided between the sprints to allow the international statistical community to react to the outcomes of the first sprint, before the second sprint started.

42.     This approach would also allow some changes in participants between the two sprints, to make best use of different competences.

43.     It was proposed that the first sprint would focus on what the GSIM is, and what it is expected to achieve, the result being an overview at a similar level of abstraction to the GSBPM diagram, together with a sound business case to continue the work. The second sprint would focus more on practical implementation issues.

44.     While the majority of participants would have a background in fields such as metadata management, informatics and enterprise architecture, such participants would be required to have demonstrated a strategic, business oriented focus which allowed them to deliver frameworks that were understood, valued and applied by the business in practice.  The teams for the two sprints would also include members who were experts in designing and managing statistical business and statistical methodology.

45.     It was proposed that a professional facilitator, with relevant skills and experience, would be engaged to support both Sprints.  The facilitator would provide professional planning and leadership for “process” aspects of each sprint, leaving all other participants free to focus on “content”.

46.     A design feature that was common across all phases of both Sprints was working in smaller syndicate groups for prolonged periods to focus on different issues. Every now and then all participants (around a dozen people in total) would convene as a whole to review the progress on various issues and examine proposals from syndicate groups. This effectively coordinated efforts and harmonised ideas across the Sprint teams as a whole.

47.     Another key feature of the Sprints and the wider GSIM development work, strongly encouraged by the HLG, was openness. This took several forms, including inviting a representative of the data archives community to join the second sprint and subsequent activities, using social media to publish updates on progress during the sprints, and generally encouraging input and feedback from as many people and organisations as possible, not just those who were physically present. 

48.     HLG approved the sprint proposal in December 2011 and detailed organisation of Sprint 1 commenced.

b.   GSIM Sprint 1

49.     GSIM Sprint 1 took place from 20 February to 2 March 2012.  It was hosted by the Statistical Office of the Republic of Slovenia.

50.     HLG gave participants the following objectives:

    come up with a compelling business case for GSIM

    ensure strong agreement on the fundamental scope, and purpose for GSIM

    ensure strong agreement on the value and use of GSIM

    design a high level view of GSIM that is readily understood

    come up with a plan for the further development of GSIM

51.     The broad structure of Sprint 1 was as follows:

Phase 1 (Days 1 to 3).  Objective: Ensure a common understanding of the context, background and parameters for the task of developing GSIM. Having achieved this, come to an agreed definition of the problem, the requirements and the outputs Sprint 1 will deliver.

Phase 2 (Days 4 to 8).  Objective: Develop the agreed outputs.

Phase 3 (Days 9 to 10).  Objective: ‘Package and present’ the outputs to facilitate input from a wider and more diverse audience over the subsequent weeks.

52.     The main published output from Sprint 1 was known as GSIM V0.3.

53.     While the rationale, scope and purpose for GSIM V0.3 were highly consistent with those of GSIM V0.1, the characteristics were more clearly expressed and more broadly agreed on in GSIM V0.3.

54.     GSIM V0.3 simplified the top level to only 4 Groups, compared to 14 elements in GSIM V0.1. Each group was clearly differentiated from the others in terms of its fundamental character.

55.     The fundamental agreements reached during Sprint 1 proved effective and robust through to the delivery of GSIM V1.0.

56.     On the last day of Sprint 1, the team held a teleconference to report their findings and outputs to members of HLG.  HLG congratulated the team on its outputs and confirmed

    GSIM V0.3 should be circulated widely to statistical agencies for input, and

    Sprint 2 should proceed as planned.

c.    GSIM Sprint 2

57.     GSIM Sprint 2 took place from 16 – 27 April 2012.  It was hosted by Statistics Korea.

58.     HLG gave participants the following objectives:

    review and incorporate stakeholder feedback on GSIM V0.3

    identify defining characteristics of GSIM objects (i.e. definitions, attributes and relationships)

    formulate detailed and demonstrable business-driven use cases to illustrate GSIM application

    identify relationships with GSBPM

    map GSIM to DDI and SDMX and identify gaps in the Standards

59.     The broad structure of Sprint 2 was as follows:

Phase 1 (Days 1 to 2).  Objectives: Ensure a common understanding of the context for Sprint 2. Having achieved this, review the feedback received on GSIM V0.3.

Phase 2 (Days 3 to 7).  Objectives: Define the identifying characteristics of GSIM objects. Formulate detailed use cases. Test and enhance the applicability of GSIM V0.3. Map GSIM V0.3 to the SDMX and DDI implementation Standards. Demonstrate potential advantages of GSIM.

Phase 3 (Days 8 to 10).  Objectives: ‘Package and present’ the outputs for presentation to HLG. Develop a Business Plan for developing GSIM V1.0.

60.     The main published output from Sprint 2 was known as GSIM V0.4.

61.     GSIM V0.4 established, for the first time, a clear perspective on what would be entailed in completing the detailed specification for GSIM V1.0.

62.     While GSIM V0.3 had established a top level perspective for GSIM, external feedback on V0.3, together with improved understanding of the detailed structure of GSIM developed during Sprint 2, led to refinement of the top level presentation of GSIM.

63.     While not yet a full enumeration, the number of GSIM information objects identified in V0.4 was much greater than in V0.3.

64.     Having established a clear perspective on what would be entailed in completing the detailed specification for GSIM V1.0, Sprint 2 was able to prepare a “Roadmap” for HLG of the work required to progress from GSIM V0.4 to V1.0.  HLG members had indicated that GSIM V1.0 should be released by the end of 2012 so, to some extent, the Roadmap was created by “working backwards” from that target.

65.     On the last day of Sprint 2, the team held a teleconference to report their findings and introduce their main outputs (including the Roadmap) to members of HLG.

66.     HLG congratulated the team on its outputs and confirmed

    GSIM V0.4 should be circulated widely to statistical agencies for input, and

    The Roadmap should be used as the basis for preparing a Business Plan for completing development of GSIM V1.0

d.   GSIM Development Project

67.     HLG’s request for a Business Plan reflected the fact that the program of two sprints had achieved its objectives.  There was now a clear and agreed path to complete GSIM V1.0.  This permitted, and required, definition of a more detailed project governance and management plan.

68.     The Business Plan was required to identify

    what teams would be required to complete development of GSIM V1.0,

    how these teams would be resourced, and

    how these teams would work together

69.     The Business Plan [18] was prepared during May 2012.  It was reviewed and approved by HLG on 8 June 2012.

70.     The Business Plan entailed

    appointment of a full time project manager for the duration of the remaining work,

    formation of Specification Layer Task Teams (SLTTs) to develop detailed specifications for objects within each of the four top level Groups identified by Sprint 1,

    support for the SLTTs from professional information modellers and standards experts,

    establishment of a GSIM Integration Team to help ensure the work by the 4 SLTTs remained consistent (across the 4 teams and with the overall design objectives and principles for GSIM), and

    establishment - in October 2012 after the SLTTs had completed their work - of Task Teams to develop a Communications Plan and User Guide for GSIM V1.0.

71.     While the SLTTs required staff additional to those who participated in the two Sprints, a cadre of staff who had worked together in person during the Sprints played a key role in moving the detailed work forward in a timely and co-ordinated manner.

72.     The work of the SLTTs required around 50 web conferences over the space of 10 weeks.  Many members around the world participated outside their business hours - eg very late at night or very early in the morning.  Members of SLTTs also undertook extensive analysis of various modelling issues and options between web conferences.

73.     Collaboration within SLTTs became a significant part of members’ “day jobs” during the ten week period.  Availability of time for staff to commit themselves to this work needed to be negotiated and confirmed with senior managers before the work commenced.

74.     At the completion of the SLTTs’ work, an Integration Workshop was convened by Statistics Netherlands from 17-21 September 2012.

75.     The Integration Workshop allowed representatives from the SLTTs to finalise and integrate their detailed work on the specification of GSIM.  It allowed the upper level presentation of GSIM to be updated based on the fully defined lowest level and based on external review feedback on GSIM V0.4.

76.     The output from the Integration Workshop was GSIM V0.8, the first version of GSIM which was specified in full at its most detailed level.

77.     GSIM V0.8, and early feedback received on it, was presented to the second annual Workshop on Strategic Developments in Business Architecture in Statistics which was held from 7 to 8 November 2012 [19].

78.     As described in paragraph 37, discussion at the equivalent workshop in 2011 led HLG to accelerate development of GSIM.  One year later a “near final” version of GSIM was presented to the representatives from the twenty or so partnerships/collaborations recognised in HLG’s Inventory of International Groups.

79.     Over the subsequent weeks feedback received from HLG members, from Workshop participants and from other reviewers of GSIM V0.8 was incorporated to produce a draft of GSIM V1.0.

80.     The draft of GSIM V1.0 was then reviewed, signed off and released on the web on 21 December 2012.

V.           Communicating GSIM V1.0

81.     As a new reference framework, with multiple intended uses and with extensive content, it was considered essential to make the content readily accessible and understandable to different audiences with different needs and interests.  The aim was that each audience could see the relevance of GSIM and the steps they might take to apply it in practice.

82.     The release of GSIM V1.0, therefore, entailed making a suite of information available.

83.     The formal detailed specification of GSIM V1.0 forms the core of the release.  The formal specification defines more than 100 information objects, including their core attributes and relationships.

84.     The formal specification is available in two main forms.  One form is designed to be read directly by people.  The other form is designed to be used systematically by modelling tools.

85.     In Microsoft Word format or PDF, the specification (excluding “Annex D”) runs to 101 pages.  This includes overviews of different parts of the model and diagrams which summarise how individual information objects relate to each other.  It also includes a glossary and summary information on how the GSIM model relates to existing frameworks and standards that the reader may already know.

86.     In its most detailed form, the specification is presented formally using Unified Modelling Language (UML) [20], a standard format which is very widely used for modelling.  The specification can readily be loaded into modelling tools and used in practice [21].

87.     The detailed UML specification is also presented, in a more human readable but less software “actionable” form, in the 190 pages of Annex D.

88.     The formal specification is supported by

    Two brochures introducing GSIM, how it can be used and the benefits which can be realised

    A communication paper (13 pages) which provides an overview of GSIM and its use

    A user guide which provides more detailed practical information about how GSIM can be used.

o       It was considered very important to clearly distinguish the specification, which defines GSIM, from this guide which provides ideas (not rules) about how GSIM can be used

    A readers’ guide suggesting which material is likely to be of interest and relevance to which audiences, and

    Other supporting material (eg electronic copies of presentations and posters)

89.     Managers and subject matter statisticians, for example, may only be interested in the brochures, communication paper and user guide.  They may rely on specialists to understand and apply the specification itself.

90.     This is another novelty of the GSIM development process compared to that for other statistical standards. Using input from communication specialists, as well as subject-matter experts, to produce a range of documents targeting different audiences seems to be an approach that could be considered as a good practice, and used more widely.

VI.         Overview of GSIM

91.     Information objects within GSIM V1.0 are grouped under four main headings.

Figure 1 : GSIM Top-level Groups

92.     The Business group is used to capture the designs and plans of statistical programs. This includes the identification of a Statistical Need, the Acquisition, Production and Dissemination Activities that comprise the statistical program, and the evaluations of them.

93.     The Production group is used to describe each step in the statistical process, with a particular focus on describing the inputs and outputs of these steps.

94.     The Concepts group is used to define the meaning of data, providing an understanding of what the data are measuring.

95.     The Structures group is used to describe and define the terms used in relation to data and its structure.
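As a rough illustration of the four-group structure, the sketch below records which group an information object belongs to and lists the objects in a given group. Apart from Statistical Need, which paragraph 92 places in the Business group, the object names and their assignments are assumptions made for illustration; the formal specification remains the authoritative source.

```python
from enum import Enum


class Group(Enum):
    """The four top-level groups named in GSIM V1.0."""
    BUSINESS = "Business"
    PRODUCTION = "Production"
    CONCEPTS = "Concepts"
    STRUCTURES = "Structures"


# Illustrative assignments only (assumed, except for Statistical Need).
OBJECT_GROUPS = {
    "Statistical Need": Group.BUSINESS,
    "Process Step": Group.PRODUCTION,
    "Variable": Group.CONCEPTS,
    "Data Set": Group.STRUCTURES,
}


def objects_in(group):
    """Return the names of the recorded objects that belong to one group."""
    return sorted(name for name, g in OBJECT_GROUPS.items() if g is group)


business_objects = objects_in(Group.BUSINESS)
```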

VII.      Benefits from GSIM

96.     As described in Section III, GSIM is one cornerstone of the HLG Vision, and its subsequent Implementation Strategy, for modernization of statistical production.

97.     Benefits which can be realised through applying GSIM therefore comprise both

    benefits which arise directly from harnessing GSIM (potentially in isolation), and

    benefits which arise from modernization of statistical production using GSIM in conjunction with other enablers

98.     Benefits are realised when GSIM is applied to:

       Improve communication between different disciplines involved in statistical production, within and between statistical organizations; and between users and producers of official statistics.

       Generate economies of scale by enabling greater collaboration within and between organizations, especially through reuse of information, methods or technology.

       Enable greater automation of the statistical production process, thus increasing efficiency and reducing costs.

       Provide a basis for flexibility and innovation, including support for the easy deployment of new statistical products and the adoption of new types of statistical data sources

       Build staff capability by using GSIM as a teaching aid that provides a simple, easy to understand view of complex information, with clear definitions

       Validate existing information systems and compare with best practice in other organizations.

VIII.   Next Steps

99.     Having delivered GSIM V1.0 as a cornerstone of the HLG Vision, a number of the next steps involve applying GSIM to underpin subsequent steps in the HLG Implementation Strategy for Modernization of Statistical Production.

100.      At the same time, HLG prioritised timely delivery of GSIM V1.0 over more extensive trialling and refinement prior to initial release.  This enabled timelier realisation of initial benefits and progress on subsequent steps.  It also allowed GSIM to be refined further based on experience with practical application rather than based on more theoretical forms of evaluation.  This strategy implies, however, that another key set of next steps is to gather, analyse and act on experiences with implementation of GSIM V1.0 so an enhanced version of GSIM can be designed, agreed and released for use in a timely manner.

a.   Outcome realisation at a national level

101.      In regard to application of GSIM, and realisation of benefits, a number of NSOs have already developed, and are currently implementing, plans for corporate adoption of GSIM.

102.      These implementers are working together through the GSIM Implementation Group to share plans, experiences, issues encountered, presentation material and other supporting documentation.

103.      The GSIM Implementation Group meets via web conference every two weeks and also has a community wiki space.

104.      The GSIM Implementation Group will be instrumental in gathering and prioritising feedback which allows the next version of GSIM to more fully and more consistently meet the needs of implementers.

b.   CSPA Project

105.      Having accelerated and completed development of GSIM V1.0 during 2012, HLG initiated the Common Statistical Production Architecture (CSPA) Project [22] for 2013.

106.      The aims of the CSPA Project are

i.          To create a standardised architecture for statistical production solutions, including processes, information and systems, and to allow specifications and ultimately applications to be re-used easily within and between statistical organisations.

ii.        To enable and advance the sharing of production processes or components, thus reducing costs.

iii.      To provide the basis for a central inventory or repository with life cycle management of sharable production processes and components.

107.      CSPA is sometimes referred to as “Plug & Play” Architecture because its aim is to enable a solution developed by one NSO, which is compliant with the standardised architecture, to be “plugged in” (and “play correctly”) to support relevant business processes undertaken by another NSO.
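The “Plug & Play” idea can be sketched in code as components that all honour one agreed interface, so that a pipeline can combine them regardless of who built them. The interface, class names and data shape below are hypothetical and are not drawn from the CSPA documentation; the sketch only illustrates the general principle.

```python
from typing import Dict, List, Protocol

Dataset = List[Dict[str, float]]


class StatisticalService(Protocol):
    """Hypothetical shared contract: take a dataset, return a dataset."""

    def run(self, dataset: Dataset) -> Dataset:
        ...


class ScalingService:
    """Hypothetical component built by one agency: multiplies every value."""

    def __init__(self, factor: float) -> None:
        self.factor = factor

    def run(self, dataset: Dataset) -> Dataset:
        return [{k: v * self.factor for k, v in row.items()} for row in dataset]


class RoundingService:
    """Hypothetical component built by another agency: rounds every value."""

    def run(self, dataset: Dataset) -> Dataset:
        return [{k: float(round(v)) for k, v in row.items()} for row in dataset]


def run_pipeline(services: List[StatisticalService], dataset: Dataset) -> Dataset:
    """Apply each service in turn; any component honouring the contract fits."""
    for service in services:
        dataset = service.run(dataset)
    return dataset


result = run_pipeline([ScalingService(2.0), RoundingService()], [{"turnover": 1.2}])
```

Neither component knows about the other; the pipeline relies only on the shared contract, which is the property that a standardised architecture aims to guarantee between organisations.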

108.      CSPA includes defining, and undertaking a practical “proof of concept” application of, the standardised architecture.

109.      Alignment with the information model specified in GSIM is one of the defining characteristics for the standardised architecture.

110.      The CSPA Project is also reusing the practice of holding “sprints” which worked effectively during development of GSIM V1.0 during 2012.

c.    Informal taskforce on metadata flows

111.      This taskforce is mapping the typical flow of metadata through the nine phases of the GSBPM, using GSIM V1.0 to characterise the metadata.  The work recognises that while GSIM is not dependent on GSBPM, or vice versa, much of the value for NSOs comes from how the two frameworks, one describing statistical information and the other statistical business processes, fit together to characterise the business activity of producing official statistics.

112.      The output of this work will be a generic characterisation of where in the statistical business processes various types of metadata are typically created, reused and updated.

113.      The mapping may:

    help designers of new metadata understand how their output is likely to be used throughout the statistical process – helping them ensure their design is fit for purpose

    provide a checklist of likely inputs and outputs for designers of new business processes
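A minimal sketch of how such a mapping might be held and queried follows. The phase names are the nine phases of GSBPM V4.0 as commonly listed; the flow records are invented for illustration and do not reproduce the taskforce's findings. One query supports the designer of a metadata type, the other the designer of a business process.

```python
GSBPM_PHASES = [
    "Specify Needs", "Design", "Build", "Collect", "Process",
    "Analyse", "Disseminate", "Archive", "Evaluate",
]

# Hypothetical flow records: (metadata type, phase, action).
FLOWS = [
    ("Variable", "Design", "created"),
    ("Variable", "Collect", "reused"),
    ("Variable", "Process", "reused"),
    ("Classification", "Design", "created"),
    ("Classification", "Process", "updated"),
]


def usage_of(metadata_type):
    """Return (phase, action) pairs for one metadata type, in phase order."""
    pairs = [(p, a) for m, p, a in FLOWS if m == metadata_type]
    return sorted(pairs, key=lambda pair: GSBPM_PHASES.index(pair[0]))


def checklist_for(phase):
    """Return the likely metadata inputs and outputs of one phase."""
    if phase not in GSBPM_PHASES:
        raise ValueError(f"unknown phase: {phase!r}")
    # Metadata that is reused or updated must already exist, so it is an input.
    inputs = sorted({m for m, p, a in FLOWS
                     if p == phase and a in ("reused", "updated")})
    # Metadata that is created or updated leaves the phase as an output.
    outputs = sorted({m for m, p, a in FLOWS
                      if p == phase and a in ("created", "updated")})
    return {"inputs": inputs, "outputs": outputs}


variable_usage = usage_of("Variable")
process_checklist = checklist_for("Process")
```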

114.      The taskforce’s application of the frameworks to practical (but generalised) business analysis is identifying potential gaps and areas of ambiguity in both GSIM V1.0 and the current version of GSBPM (V4.0).

115.      In addition, GSBPM V4.0 was released prior to the development of GSIM.  It therefore describes some information objects from GSIM using out of date, or nonaligned, terminology.  The taskforce is noting these instances and will feed them into an update and alignment process for GSBPM which will commence later in 2013.

d.   GSIM / SDMX / DDI

116.      In January 2013, HLG also initiated a work package to determine in more detail the relationship between

    the information objects in GSIM

o       the business oriented conceptual framework for statistical information, and

    objects defined in the information models associated with SDMX and DDI-L

o       two commonly used standards for representing and exchanging statistical data and metadata.

117.      This work will gauge how fully, and how effectively, the existing standards support the full range of information objects defined in GSIM.

118.      Change proposals may then be initiated which seek to bring the standards into fuller, and more consistent, alignment with GSIM. [23]

119.      The aim, which will be achieved progressively, is to ensure that agencies which choose to adopt GSIM as their conceptual framework for statistical information will have simple, consistent and standards based means available to them for applying the framework in practice when defining, representing and exchanging data and metadata. [24]

120.      Agencies which are already implementing SDMX and/or DDI-L are expected to benefit from this work.  In many cases the standards are able to support several different methods for representing a particular business concept (eg information associated with a particular time series of economic data).  Aligning use of the standards with GSIM means that the solution chosen, where multiple options exist, should

    best address the business uses associated with that information rather than necessarily being the most technically “satisfying” or technically “least demanding” option

    be in common, or at least fully consistent, with the way other implementers decide to address the same need

121.      This is a further example of where agreement on a common conceptual framework enables agencies to then work toward common implementation practices.

e.   Designing and agreeing the next version of GSIM

122.      A GSIM V1.0 Discussion Forum was established once GSIM V1.0 was released.  The Discussion Forum has a list of issues.  Some of these are “open” (still under discussion) while others already have recommendations for specific changes to be made in the next version of GSIM.

123.      Any implementer, or reviewer, of GSIM V1.0 is welcome to contribute issues.  The GSIM Implementers Group (see VIIa.) reviews open issues collectively and negotiates recommendations.

124.      Based on the feedback gathered by September 2013, a recommendation will be made to HLG as to whether a revised version of the GSIM is needed, and if so, whether this would be a minor revision (version 1.1), or a major change (version 2.0).

125.      The volume and significance of feedback gathered so far (eg agreement that at least one commonly required information object is absent from GSIM V1.0) would seem to suggest some form of revision of GSIM will be agreed in September 2013.

IX.         Conclusion

126.      The body of analysis, from national and international perspectives, drawn together by HLG demonstrates that practical collaboration between producers of official statistics – underpinned by common frameworks and standards – is an essential element of being able to respond to changing needs and opportunities in an agile manner and to remain relevant to governments’ and societies’ needs for statistical information to support analysis and decision making.

127.      The HLG strategy includes facilitating reuse and sharing of business processes, statistical methods, IT components (in terms of design or implementation) and content from statistical metadata and data repositories.

128.      As a pre-requisite for being able to collaborate efficiently and effectively in practice, including ensuring that the outputs from multiple collaborations are consistent with each other and can work together, it is necessary to have common terminology and a common reference model in regard to the statistical data and metadata.  This provides a common means of describing the inputs to, and transformed outputs from, statistical production processes.

129.      The need for a common framework in regard to statistical information was identified when the Statistical Network formed in June 2010 in order to collaborate in practical small steps to industrialise methods and processes.

130.      Development of GSIM became of global interest, and was accelerated, as HLG’s Strategic Vision became more widely understood and supported during 2011.

131.      In a number of ways, development of GSIM became an exemplar of the approach to collaboration which HLG is championing for the future.  This included:

    mobilising resources internationally, and defining and managing the project, through collective agreement and commitment by heads of national and international agencies rather than through a single international agency assuming responsibility for governing the project

    adopting an “agile” development framework, focusing on timely initial delivery to be refined through application and using methods such as Sprints to supplement – and improve the effectiveness of – more traditional approaches to collaboration between agencies

132.      HLG members have assessed the approach to developing and delivering GSIM V1.0 as highly successful.

133.      HLG’s strategy of increasing GSIM’s fitness for purpose through statistical agencies starting to apply V1.0 in practice, and identifying future refinements and extensions, appears at this early stage to have momentum and to be succeeding.

134.      HLG focuses on collaborations between agencies in regard to business processes, statistical methods, IT components and statistical metadata and data repositories.  In other contexts, however, the positive experiences with developing and implementing GSIM as a common reference framework may have some relevance for approaches used to develop and evolve common conceptual frameworks for specific domains of statistics.


[1] http://unstats.un.org/unsd/nationalaccount/docs/Workprogram1993SNAupdate.pdf

[2] http://unstats.un.org/unsd/nationalaccount/timeline.asp

[3] http://unstats.un.org/unsd/statcom/doc13/2013-4-NationalAccounts-E.pdf

[4] http://www1.unece.org/stat/platform/pages/viewpage.action?pageId=59703371

[5] CSTAT is the OECD Committee on Statistics

[6] For further information in regard to the Statistical Network, see http://www1.unece.org/stat/platform/display/msis/Statistical+Network

[7] http://www.unece.org/stats/gsbpm

[8] Statistical Data and Metadata eXchange. See http://sdmx.org/

[9] DDI (Data Documentation Initiative) Lifecycle Specification. See http://www.ddialliance.org/Specification/

[10] http://www1.unece.org/stat/platform/display/metis/v0.1+GSIM+Common+Reference+Model

[11] See http://www1.unece.org/stat/platform/display/hlgbas/High-Level+Group+for+the+Modernisation+of+Statistical+Production+and+Services

[12] The body was referred to as HLG-BAS until the end of 2012.

[13] http://www1.unece.org/stat/platform/download/attachments/58492100/HLG+ToR+2013+to+2015.doc

[14] http://www1.unece.org/stat/platform/display/hlgbas/Strategic+Vision

[15] http://www.unece.org/stats/documents/2011.10.hlgbas.html

[16] http://www1.unece.org/stat/platform/display/msis/Inventory+of+International+Groups

[17] http://www1.unece.org/stat/platform/download/attachments/64880986/21+GSIM+Sprint+Proposal.doc?version=1

[18] http://www1.unece.org/stat/platform/download/attachments/59703371/19-+GSIM+Business+Plan.doc?version=1

[19] http://www.unece.org/stats/documents/2012.11.hlgbas.html

[20] http://www.uml.org/

[21] For readers more familiar with XML, UML provides strong support for modelling while XML provides strong support for representing and exchanging information which conforms to a particular model.  For example, the SDMX Information Model is expressed in UML, with SDMX-ML being one (of two) syntaxes defined for representing data and metadata in accordance with the SDMX Information Model.

[22] http://www1.unece.org/stat/platform/download/attachments/58492100/Plug+and+Play+project+outline.docx?version=1&modificationDate=1360670755446

[23] The DDI Alliance has already confirmed that improved support for implementing GSIM will be an important design consideration for future versions of DDI-L.

[24] While standard implementation via SDMX and DDI-L will be available, agencies can still choose to adopt GSIM as a conceptual model without choosing to use SDMX and DDI-L to implement the framework in practice.  “Semantic consistency” based on GSIM adds value even in the absence of “syntactic consistency” based on use of common representation standards.