Fostering Interoperability in Official Statistics:

 

Common Statistical Production Architecture

 

 

 

(Version 0.1, April 2013)

 

 

 

 

 

 

DRAFT FOR REVIEW

 

 

Please note the development of the Common Statistical Production Architecture is a work in progress. This is not intended for official publication and reflects the thoughts and ideas gathered at the first Sprint on this topic held in Ottawa, Canada during April 2013.

 

This document is aimed at readers who have some knowledge of Enterprise Architecture.

 

Instructions for reviewers and a template for providing feedback are available at http://www1.unece.org/stat/platform/display/msis/CSPA+v0.1

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

Contents

I.               Introduction

II.               Common Statistical Production Architecture

III.               How the Architecture could be used.

IV.               Who uses the Architecture

V.               Impact on organizations

VI.               Architecture Description

A.               Principles

Decision Principles

Design Principles

Statistical Service Design Principles

B.               Requirements

C.               Statistical Services and "Plug and Play"

Generic Statistical Business Process Model (GSBPM)

Generic Statistical Information Model (GSIM)

Quality Attributes / Non-Functional Requirements

Architecture Patterns

D.               Catalogues

VII.               Governance of the architecture

Annex 1: Quality Attributes / Non-Functional Requirements

Annex 2: Differences between “Classic” SOA and EDA

Annex 3: Glossary


I.               Introduction

 

1.               Many statistical organizations are facing common challenges. They have built up their organizational structures, production processes, enabling statistical infrastructure and technology over many years, through many iterations and technology changes. The result can be described as an 'accidental architecture', because it was not designed from a holistic, top-down view. The cost of maintaining this business model and the associated asset bases (process, statistical, technology) is becoming prohibitive, and the model of delivery is not sustainable.

 

2.               Two major threats to the continued efficient and effective supply of core statistics come from within statistical organizations. These are:

 

1) rigid processes and methods, and

2) an inflexible, ageing technology environment.

 

3.               Often it is difficult to replace even one of the components supporting statistical production. Rigid processes and methods, combined with an inflexible and ageing technology environment, make it difficult for statistical organizations to produce data and information aligned to modern standards. Process and methodology changes are time consuming and expensive, resulting in an inflexible, unresponsive statistical organization.

 

4.               Historically, statistical organizations have developed their own business processes and IT-systems for producing statistical products. Therefore, although the products and the processes conceptually are very similar, the individual solutions are not.   Statistical organizations have attempted many times over the years to share their processes, methodologies and solutions, as it has long been believed that there is value in this. The mechanism for sharing has historically meant an organization taking a copy of a component and integrating it into their own environment. Examples include CANCEIS (CANadian Census Edit and Imputation System) and Banff (an editing and imputation system for business surveys). However, most cases of sharing have involved significant work to integrate the component into a different processing and technology environment.  

 

5.               Many statistical organizations are modernizing and transforming using Enterprise Architecture to underpin their vision and change strategy. This work will   enable   the organizations to develop statistical services   in a standard way within their organization. An   enterprise   architecture aims to create an environment which can change and support business goals. It shows what the business needs are, where the organization wants to be, and ensures that the IT strategy aligns with this. Enterprise architecture helps to remove silos, improves collaboration across an organization and ensures that the technology is aligned to the business needs.  

 

6.               Figure 1 attempts to explain why sharing remains difficult even when organizations have an Enterprise Architecture. The figure assumes that the two statistical organizations each develop all their business capability and supporting components in a standard way (i.e. they each have an Enterprise Architecture). The first row of the figure shows that Canadian components have a zigzag shape, and the second shows that Swedish components have slanted edges. If Sweden needs a new component, it ideally needs one with a slanted edge. The third row shows that, while a component from Canada might support the same process and incorporate robust statistical methodologies, it will not be simple to integrate it into the Swedish environment.

 


 

Figure 1: Why sharing/reuse is hard now

 

7.               The High Level Group for the Modernization of Statistical Production and Services (HLG) has put priority on the development of a Common Statistical Production Architecture (CSPA).

 

8.               The CSPA is meant to be a generic architecture for statistical production. It will serve as an industry architecture for statistical   organizations. By adopting this common reference architecture, it   will be easier for each organization to standardize and combine the components of statistical production, regardless of where the statistical services   are built. As shown in Figure 2, Sweden could easily reuse a statistical service from Canada because they both use the same “shape”.

 

9.               The CSPA will facilitate the sharing and reuse of statistical services both across and within statistical organizations. The statistical services that are shared or reused across statistical organizations might be new statistical services or legacy/existing services which have been wrapped to comply with the architecture. This is shown in Figure 2 by the shapes inside the Lego blocks. The CSPA also provides a starting point for concerted developments of statistical infrastructure and shared investment across statistical organizations.

 

Figure 2: How it could be easier with architecture
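The idea in paragraphs 8 and 9, that a legacy component can be wrapped so that it presents the same "shape" as any other statistical service, can be illustrated with a minimal Python sketch. Everything named here (`StatisticalService`, `LegacyMeanImputer`, `WrappedMeanImputer`, the `run` method and the dictionary-based inputs) is a hypothetical illustration and not part of the CSPA.

```python
from typing import Protocol


class StatisticalService(Protocol):
    """Hypothetical common 'shape': a service takes and returns a dict."""

    def run(self, inputs: dict) -> dict: ...


class LegacyMeanImputer:
    """Stand-in for an existing component with its own calling convention."""

    def impute_column(self, values: list) -> list:
        known = [v for v in values if v is not None]
        if not known:
            # Nothing to base an imputation on: return the values unchanged.
            return list(values)
        mean = sum(known) / len(known)
        return [mean if v is None else v for v in values]


class WrappedMeanImputer:
    """Wrapper exposing the legacy component through the common shape."""

    def __init__(self, legacy: LegacyMeanImputer) -> None:
        self._legacy = legacy

    def run(self, inputs: dict) -> dict:
        return {"values": self._legacy.impute_column(inputs["values"])}


def execute(service: StatisticalService, inputs: dict) -> dict:
    # The caller depends only on the common shape, not on who built the service.
    return service.run(inputs)


result = execute(WrappedMeanImputer(LegacyMeanImputer()), {"values": [1.0, None, 3.0]})
```

In this sketch the calling organization never touches `impute_column` directly, so the legacy component could later be replaced by a newly built service without changing the caller.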

 

10.               The goal of the CSPA is to provide statistical organizations with a standard framework to:

 

       facilitate the process of modernization

       provide guidance for operating change within statistical organizations  

       provide statisticians with flexible information systems to accomplish their mission and to respond to new challenges   and opportunities

       reduce costs of production through the reuse / sharing of solutions and services and the standardization of processes

       provide guidance for building reliable and high quality services to be shared and reused in a distributed environment (within and across statistical organizations)

       enable international collaboration initiatives for building common infrastructures and services

       foster alignment with existing industry standards such as the Generic Statistical Business Process Model (GSBPM) and the Generic Statistical Information Model (GSIM), and

       encourage interoperability of systems and processes

 

11.               This document provides the first thoughts on what this CSPA will look like. It is the output from the Architecture Sprint held at Statistics Canada 8 - 12 April 2013. This output is not designed to be the final output of this project. It is intended to outline the first ideas on the architecture in order to seek feedback from the wider official statistics community.

 

 

 

II.               Common Statistical Production Architecture  

 

12.               The CSPA aims to be the industry architecture for the official statistics industry. An industry architecture is a set of agreed common principles and standards designed to promote greater interoperability within and between the different stakeholders that make up an "industry", where an industry is defined as a set of organizations with similar inputs, processes, outputs and goals (in this case official statistics). The value of the architecture is that it enables collaboration in developing and using services, which will allow statistical organizations to create flexible business processes and systems for statistical production more easily. The CSPA is sometimes referred to as a "plug and play" architecture. The idea is that replacing a component could be as easy as pulling it out and plugging another one in.

 

13.               The CSPA   gives users an understanding of the different statistical production elements that make up a statistical organization and how those elements relate to each other.   It also provides a common vocabulary with which to discuss implementations, often with the aim to stress commonality.   It is   an approach to enabling the vision and strategy of an industry,   by providing a clear, cohesive, and achievable picture of what’s required to get there.

 

14.               An important concept in architecture is the “separation of concerns”. For that reason, the architecture is separated into a number of “layers”. These “layers” are:

 

       Business Architecture which defines and evolves understanding of what the industry does and how it is done (statistics in our case),

       Information Architecture which builds understanding of the information, its flows and uses across the industry, and how that information is managed,

       Application Architecture which is the set of practices used to select, define or design software components and their relationships,   and

       Technology Architecture which   describes the infrastructure technology underlying (supporting) the business and application layers.

 

15.               The CSPA provides a template architecture for official statistics. It describes:

 

      What the official statistics industry wants to achieve – These are the goals and vision (or future state).

      How the industry can achieve this – These are the principles that guide decisions on strategic development and on how statistics are produced.

      What the industry will have to do – The industry will need to adopt a service oriented architecture, which will require it to comply with requirements, quality attributes / non-functional requirements and architecture patterns.

 

16.               Catalogues are one of the key enablers for the CSPA. It is envisaged that each statistical organization will have catalogues of processes, information objects and statistical services. There will also be a Global Artefact Catalogue, which contains the shareable/reusable artefacts (i.e. processes, information objects and statistical services) from the statistical organizations.

 

17.               The Global Artefact Catalogue will contain planning metadata. This will include information about developments that are planned or in progress. This information will facilitate the creation of a Global Roadmap of statistical organizations' developments, which can be consulted to see which organizations are developing what, and when.
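As a rough illustration of paragraphs 16 and 17, the following Python sketch shows how catalogue entries carrying planning metadata could be filtered into a roadmap. The class names, field names, status values and example entries are all assumptions made for this sketch; the document does not define a catalogue schema.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class CatalogueEntry:
    name: str
    kind: str      # "process", "information object" or "statistical service"
    owner: str     # organization sharing the artefact
    status: str    # assumed values: "available", "in progress", "planned"
    expected: Optional[date] = None  # planning metadata: expected availability


@dataclass
class GlobalArtefactCatalogue:
    entries: List[CatalogueEntry] = field(default_factory=list)

    def register(self, entry: CatalogueEntry) -> None:
        self.entries.append(entry)

    def roadmap(self) -> List[CatalogueEntry]:
        """Entries still planned or in progress, earliest expected date first."""
        pending = [e for e in self.entries if e.status != "available"]
        # Entries without a date are placed at the end of the roadmap.
        return sorted(pending, key=lambda e: e.expected or date.max)


catalogue = GlobalArtefactCatalogue()
catalogue.register(CatalogueEntry("Editing service", "statistical service", "Org A", "available"))
catalogue.register(CatalogueEntry("Questionnaire builder", "statistical service", "Org B", "planned", date(2014, 6, 1)))
catalogue.register(CatalogueEntry("Seasonal adjustment", "statistical service", "Org C", "in progress", date(2013, 12, 1)))

roadmap_names = [entry.name for entry in catalogue.roadmap()]
```

Here the roadmap is derived from the catalogue rather than maintained separately, which is one possible reading of how planning metadata "facilitates" the Global Roadmap.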

 

18.               The architecture is described in more detail in Section VI.

 

III.               How the Architecture could be used.  

 

19.               The Architecture Sprint proposed a number of ways in which the CSPA may be used. These are outlined below:

 

20.               Component redesign: A statistical organization uses a statistical service and would like to modify it to meet new functional and/or non-functional requirements. An approach to the collaborative renewal of existing statistical services is required. For example, Statistics New Zealand and the Office for National Statistics (UK) would like new functionalities in CANCEIS.

 

21.               New component (strategic level or design level): The need for a new statistical service has been identified as a gap in a statistical organization’s Strategic Plan. To fill the gap, the statistical organization starts looking for available solutions in the collaborative space (Global Artefact Catalogue). If none is found, the statistical organization can either start designing and developing internally or seek collaboration with other statistical organizations. An example of this could be a need for metadata driven questionnaire creation.

 

22.               Vendor: A vendor has released a new product. Statistical organizations should verify together whether the product is relevant to the community's requirements. If it is not, statistical organizations can try to influence the vendor to meet the requirements. If it is, statistical organizations ask the vendor to integrate the product in the CSPA. For example, a SAS component for analytics.

 

23.               Vendor sought: Statistical organizations identify a new need or solution and decide to investigate the possibility of a vendor developing it. For example, the Australian Bureau of Statistics seeking SAS agreement to develop a new procedure for regression that protects confidentiality.

 

24.               Roadmap: Statistical organizations need to modify and integrate their roadmaps to align them with the CSPA framework. They will also integrate/streamline their investment strategies.

 

25.               Strategy: Statistical organizations are   creating and using an industry strategy ("Industry Architecture") and this leads to projects/work programs. For example, the Proof of Concept for CSPA.

 

26.               Hosted services: One international organization hosts/deploys services available for all statistical organizations. For example, Eurostat delivers statistical services in the cloud.

 

27.               Statistical organization   cooperation:   A cluster of statistical organizations have been working on a project and want to share the results.

 

28.               Transition Strategy: Each statistical organization needs to define a strategy to move from its current state to the common future state defined in its roadmap.

 

IV.               Who uses the Architecture  

 

29.               Using the Architecture will create new functional roles within a   statistical organization. These roles may already exist in some   statistical organizations and may have particular people (for example, Chief Information Officer, Enterprise Architect) assigned to them.   This section explains the functional roles and leaves it to each   statistical organization to decide who performs that role in their organization.  

 

 

Figure 3: Roles in project lifecycle

 

30.               An   Investor is a person, or a group of people, who undertakes the   strategic planning and investment decision making role to support statistical production   in a   statistical organization. In   their   decision making, they balance benefits, opportunities, risks and costs. To support this decision making, the   Investor   could consult a catalogue   which includes processes, components and statistical services and evaluate the relevant components.  

 

 

 

 

Technical Specialist   roles

 

31.               Once the need for a component is identified and the project is funded, a   Designer from within or outside the organization   will   reuse, wrap or design methods and services to provide the essential capability needed by the organization or wider statistical community. There could be a number of types of   Designer including   Process Designers,   Information Designers,   Methodology Designers   and   Service Designers.   (See Box 1 for a more detailed description of the Designer role). The Service Designer will consult the architecture to ensure that the service is consistent with it and also authorize the service to be added to the catalogue.

 

32.               If a new statistical service has been designed, the   Service Designer will give the work to a   Service Builder.   A   Service Builder   is a specialist or expert from within or outside of an organization who builds statistical services according to the specifications provided by Service Designers.

 

33.               A   Service Assembler   is a specialist responsible for integrating or assembling relevant statistical services into an end-to-end process that delivers a critical business capability. This role requires the specialist skills of people experienced in integration and methodology.  

 

Non-Technical Specialist   Roles

 

34.               Services deliver generic business capability within the GSBPM framework. Services are then used in a specific context, within a specific statistical process. Services therefore must be capable of being configured or parameterized for this specific task. A Service Configurer is the role   responsible for configuring either the end-to-end process or a specific process to achieve the business outcomes specified by the business area responsible for statistical production. People who fulfill this role would include Subject Matter Experts and Methodologists. Their role is to configure the parameters needed for a particular business process. For example, for predetermined capabilities, they would decide what is used and in what order. Their interactions with a service are only through a User Interface.

 

35.               Finally, Users are the business area staff who manage and operate the configured end-to-end process to achieve the business outcomes sought.

 

 

Box 1: An illustration of the Designer Role

 

Jose is a Service Designer in a statistical organization. He has been asked to design a solution to collect data for the Labour Force Survey. Jose would look into the statistical requirements of all data collections in the organization and design a service which would meet not just the requirements of the Labour Force Survey, but also the data collection approaches of other surveys in the organization. These statistical requirements would include details on the variables to be collected, international standards and classifications, the sampling techniques to be used, the questionnaire logic and the outputs to be obtained.

 

Once the requirements are understood, Jose searches in the "GSIM aligned Information Object Catalogue"   for the availability of data and classifications that can be used for the survey.

Using the "GSBPM aligned Process Catalogue", Jose selects appropriate processes according to the requirements for different phases of the Data Collection, i.e. "Sample Selection", "Set up Collection", "Run Collection" and "Finalize Collection".

 

Then Jose browses the organization’s artefact catalogue to search for suitable existing services he can use to support the survey methodology, processes and information flows. If multiple modules are available in the organization’s artefact catalogue for a single phase, Jose chooses according to the requirements and interfaces, possibly verifying with statisticians.

 

If some of the needed modules are not available in the organization’s artefact catalogue, Jose searches in the Global Artefact Catalogue. If no suitable services are found there, Jose investigates various options to make an informed decision:

 

       internal legacy systems that are candidates to become services (by wrapping)

       planned initiatives on the global roadmap that meet the requirements and timelines (using Planning Metadata in the related Catalogue)

       existing services that would require extension following the collaborative change management process (internal and or external)

       vendors listed in the catalogues of the Global Artefact Catalogue that could partner or provide a solution

 

Jose will need to consult with the Enterprise Architect on his evaluations and the cost/benefit analysis of the options, seeking confirmation that his proposals align with transitional planning. If there are no alternatives, Jose will design a new service that can be re-used both internally in his statistical organization and in other organizations.

 

In doing so, he follows the planning metadata, frameworks, standards, policies, guidelines and other knowledge assets in the related Catalogues, so that the design conforms to the requirements of the CSPA.

 

After designing the services to be used, he ensures that inputs and outputs required align with the "Service Interfaces" and integration specifications of the services selected. After verifying that all necessary components are available, Jose selects the appropriate "Architectural Patterns" and composes the Services package including defining the "Guidelines" to be used by "Service Builders" and "Service Assemblers" to meet all functional and non-functional requirements.

The service package is then sent to the Enterprise Architect for review and validation for both internal and international re-use.
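The search order that Jose follows in Box 1 (the organization's own catalogue first, then the Global Artefact Catalogue, then the fallback options) can be sketched in Python as follows. The function name, the use of plain sets of service names as catalogues, and the wording of the option strings are assumptions made for illustration only.

```python
from typing import List, Set, Tuple

# Options to evaluate when neither catalogue has a suitable service (paraphrased from Box 1).
FALLBACK_OPTIONS = [
    "wrap an internal legacy system",
    "use a planned initiative on the global roadmap",
    "extend an existing service through collaborative change management",
    "partner with a vendor listed in the Global Artefact Catalogue",
]


def find_service(need: str, local: Set[str], global_: Set[str]) -> Tuple[str, List[str]]:
    """Return where a service was found and which options remain to be evaluated."""
    if need in local:
        return "organization catalogue", []
    if need in global_:
        return "Global Artefact Catalogue", []
    # Nothing suitable was found: the designer weighs the fallback options.
    return "not found", list(FALLBACK_OPTIONS)


local_catalogue = {"Run Collection"}
global_catalogue = {"Run Collection", "Sample Selection"}

found_locally = find_service("Run Collection", local_catalogue, global_catalogue)
found_globally = find_service("Sample Selection", local_catalogue, global_catalogue)
missing = find_service("Set up Collection", local_catalogue, global_catalogue)
```

If none of the fallback options is viable, Box 1 has the designer create a new service intended for re-use; that last step is a human decision and is deliberately left out of the sketch.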

 

V.               Impact on organizations

 

36.               There will be a number of required changes for an organization implementing the CSPA.   Adoption of the CSPA   will require investment with a view to generating the long term benefits identified in the value proposition.  

 

 

 

 

37.               The main changes required at the organization level can be grouped in layers:

 

A.      People Changes

      Openness to international cooperation

      Building trust in international partners (especially as they may be building services for your organization)

      Sense of compromise (acceptance that nothing will be optimized for local use, rather it will be optimized for international or corporate use)

      Development of new functional roles to support use of the architecture (e.g. Assembler, Builder)
 

B.      Process Changes

      Adoption of an industry wide perspective

      Different approach to business process management and design

      Commitment to service (contract between different functional units)
 

C.     Technology Changes

      Setting up an adequate middleware infrastructure (messaging, repositories)

      Uplift of physical network capabilities (bandwidth, etc.)

      Management of security features

 

38.               In addition to the costs and the targeted benefits, an organization adopting the CSPA   will benefit from:

 

       a sustainable and efficient strategy to cope with legacy and phasing out of existing applications

       a cycle that enables cost saving from reduction in production costs to be reinvested in further infrastructure transformation  

       a positive image both on national and international/industry scene

 

VI.               Architecture Description

 

39.               The CSPA has a number of design artefacts. These design artefacts describe the structure of statistical services, their interrelationships and the principles that govern their design. The first ideas for a number of design artefacts were produced at the Architecture Sprint. These include:

 

       Principles

       Requirements

       Statistical Services and Plug and Play

o          Generic Statistical Business Process Model (GSBPM) Alignment

o          GSIM Alignment

o          Quality Attributes / Non-functional Requirements

o          Architecture Patterns

       Catalogues

 

 

A.               Principles  

 

40.               Principles are high level decisions or guidelines that influence the way processes and systems are to be designed, built and governed. Principles are derived from the mission and values of the organization, taking into account the opportunities and threats that the organization faces.   In the CSPA, principles are used to express the high level design decisions that will shape the future statistical processes and systems.

 

Decision Principles

 

41.               Decision principles are guidelines that help decide on strategic development. They provide a basis for decision making throughout an enterprise, inform how the organization sets about fulfilling its mission, and help enable sound investment decisions. The following decision principles have been selected for the CSPA. They represent the outcomes sought by the High Level Group and key elements of the United Nations Fundamental Principles of Official Statistics.

 

42.               These principles are outlined below and are to be applied as a cohesive set:

 

       Increase the cost efficiency of the statistical organization

       Deliver enterprise-wide benefits

       Capitalize on or influence national or international developments

       Promote sharing and collaboration

       Increase the value of the nation's statistical assets

       Maintain information security and community trust.  

 

 

Design Principles

 

43.               The following design principles have been developed to provide an understanding of the goals of the CSPA and the sponsors of the architecture.  The principles are provided across the architectural layers outlined earlier (business, information, application and technical) and inform design related decisions.  Applying the principles (see Table 1) in a balanced way will ensure that statistical services will be delivered to meet the industry vision.

 

 

 

 

 

 

 

 

 

 

 

Table 1: Design Principles for the different “layers” of architecture and the Statistical Service

Business Design Principles

    Use available standards

    Focus on efficiency

    Support responsiveness to new needs

    Organization wide awareness in design

    Output driven architecture

    Empowering business to reach their goals independently through metadata driven processes

    Re-use existing before creating new

    Create new for re-use and flexible, easy assembly, plug and play

    Use recognized standards

    Have the right delegations and governance with supporting metrics

    Consider all capability elements e.g. methods, standards, processes, skills, IT

Information Design Principles

    Information is an asset that has value to the organization and must be managed accordingly.

    Information is consistent across all relevant services.

    Use available standards to maintain consistency of language across all services.

    Users/services have access to the necessary information to perform their duties.

    Human intervention at run time is minimized by automated population.

    Changes to information assets are managed rigorously to provide traceability and versioning.

    Persistence is maintained within the service and is not exposed to service consumers.

    Services do not provide persistence as a main (or sole) purpose. They support a GSBPM service.

Application Design Principles

    Use available standards

    Leverage learning from international application platforms (for example COmmon Reference Environment (CORE))

    Follow both "classic" and "event-driven" SOA architecture patterns in our Plug and Play framework depending on best fit to requirements

Technology Design Principles

    Automate as much as possible

    Implementation agnostic

 

Statistical Service Design Principles

       Managed standardized service contracts based on GSIM objects.

       Enable services to be loosely coupled externally and be aware of internal coupling.

       Maximize service autonomy (completeness) to enable share-ability and reusability (External & Internal)

       Non-functional requirements (quality attributes) form a key input in design decisions.

       Granularity based on the level below a GSBPM sub phase

       Independence between design and implementation

 

Statistical Service Design Principles

 

44.               The design principles (see Table 1 above) have been selected to maximize the flexibility of the statistical services wrapped or developed in the context of the CSPA.  The flexibility of the statistical service directly impacts the level of reuse, the flexibility required of the industry vision and the ease with which a statistical organization can implement a statistical service.

 

B.               Requirements

 

45.               A requirement is a specific functional or physical need that a process or system must fulfill. It is a statement that identifies a necessary attribute, capability, characteristic, or quality of a system for it to have value and utility to a customer, organization, internal user, or other stakeholder. In the context of the CSPA, the requirements shown in Figure 4, in addition to the principles, shape the way the architecture will be realized.

 

Figure 4: Requirements for the CSPA

 

C.               Statistical Services and   "Plug and Play"

 

46.               The architecture is based on an architectural style called Service Oriented   Architecture (SOA). This style focuses on Services (or Statistical Services in this case).   A service is a representation of a real world business activity with a specified outcome. It is self-contained and can be reused by a number of business processes (either within or across statistical organizations).  

 

47.               The level of reusability promised using an SOA is a direct result of the standardized definition of the service capabilities through the service contract.  For CSPA statistical services these capabilities have been defined as:

 

      the GSBPM for the reusable functional context

      the GSIM for the information model (note this will be the implementation model)

      the quality attributes/non-functional requirements

      the architectural pattern(s) to be supported (traditional and event driven SOA).
 


Figure 5: CSPA Service Features
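The four elements of a service contract listed in paragraph 47 could be recorded in a simple structure such as the Python sketch below. The class and field names, the pattern labels and the example values (including the GSBPM label, the GSIM-style object names and the quality attributes) are illustrative assumptions and are not official identifiers.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Tuple

# Assumed labels for the two architecture patterns named in the text.
ALLOWED_PATTERNS = frozenset({"classic SOA", "event-driven SOA"})


@dataclass(frozen=True)
class ServiceContract:
    name: str
    gsbpm_context: str                  # reusable functional context
    gsim_inputs: FrozenSet[str]         # information objects consumed
    gsim_outputs: FrozenSet[str]        # information objects produced
    quality_attributes: Tuple[str, ...]  # non-functional requirements addressed
    patterns: FrozenSet[str]            # architecture patterns supported


def contract_problems(contract: ServiceContract) -> List[str]:
    """List the parts of the contract that are missing or not recognized."""
    problems = []
    if not contract.gsbpm_context:
        problems.append("no GSBPM context")
    if not contract.gsim_inputs or not contract.gsim_outputs:
        problems.append("GSIM inputs and outputs must both be declared")
    if not contract.quality_attributes:
        problems.append("no quality attributes")
    if not contract.patterns or not contract.patterns <= ALLOWED_PATTERNS:
        problems.append("patterns must be classic SOA and/or event-driven SOA")
    return problems


complete = ServiceContract(
    name="Imputation service",
    gsbpm_context="impute",                       # hypothetical label
    gsim_inputs=frozenset({"unit data set"}),     # hypothetical object names
    gsim_outputs=frozenset({"unit data set"}),
    quality_attributes=("availability", "performance"),
    patterns=frozenset({"classic SOA"}),
)
incomplete = ServiceContract("Draft service", "", frozenset(), frozenset(), (), frozenset({"batch"}))

complete_problems = contract_problems(complete)
incomplete_problems = contract_problems(incomplete)
```

A check of this kind could run when a service is proposed for a catalogue, although the document does not prescribe when or how contracts are validated.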

 

48.               A statistical service will perform a task in the statistical process. Statistical services will be at different levels of granularity. An atomic or fine grained statistical service encapsulates a small piece of functionality.   An atomic service may, for example, support the application of a particular methodological option or a methodological step within a GSBPM sub phase. Coarse grained or aggregate statistical services will encapsulate a larger piece of functionality, for example, a whole GSBPM sub phase. These may be composed of a number of atomic services.  

 

49.               The CSPA is sometimes referred to as a "plug and play" architecture. Plug and Play is a concept that was borrowed from the world of computer hardware. It is used to indicate that installing a new hardware module is easy: just plug it in and it will work. In the context of statistical processes, we use the term to indicate that installing a new "module" of functionality in an existing process is easy. In fact, the idea is that it should be possible to build a complete new process by stringing together a number of such modules, like building a wall from Lego bricks.

 

50.               Within the Plug and Play concept, the Service is the “pluggable” element. The granularity of statistical services should be based on a balanced consideration between the efficiency of the statistical service and the flexibility required for sharing purposes: larger statistical services will usually enable greater efficiency, whereas a finer granularity will allow greater flexibility for supporting sharing and reuse. It is envisaged that the statistical services will be shared or reused at one level below a GSBPM sub phase. Services, regardless of their granularity, must meet the architectural requirements and be aligned with the CSPA principles. Figure 5 shows that these statistical services must align to GSBPM and GSIM as well as adhere to the architectural patterns and quality attributes / non-functional requirements described by the architecture.
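Paragraphs 48 to 50 describe aggregate services built from atomic ones and processes built by stringing modules together. A minimal Python sketch of that composition follows; the `compose` helper, the dictionary-in, dictionary-out convention and the three toy services are assumptions for illustration, not CSPA definitions.

```python
from typing import Callable, Dict

# Assumed convention: every service maps a dict of inputs to a dict of outputs.
Service = Callable[[Dict], Dict]


def compose(*services: Service) -> Service:
    """Build an aggregate service that runs the given services in order."""

    def aggregate(data: Dict) -> Dict:
        for service in services:
            data = service(data)
        return data

    return aggregate


def remove_negative(data: Dict) -> Dict:
    # Atomic service: drop negative values.
    return {**data, "values": [v for v in data["values"] if v >= 0]}


def absolute_values(data: Dict) -> Dict:
    # Alternative atomic service with the same interface: keep magnitudes.
    return {**data, "values": [abs(v) for v in data["values"]]}


def add_total(data: Dict) -> Dict:
    # Atomic service: add the sum of the values to the output.
    return {**data, "total": sum(data["values"])}


# Two aggregate services that differ only in the module plugged into step one.
drop_then_total = compose(remove_negative, add_total)
abs_then_total = compose(absolute_values, add_total)

first = drop_then_total({"values": [4, -1, 6]})
second = abs_then_total({"values": [4, -1, 6]})
```

Swapping `remove_negative` for `absolute_values` changes the result without touching `add_total` or `compose`, which is the kind of replaceability the Plug and Play concept aims for; real services would of course exchange GSIM-aligned information objects rather than bare dictionaries.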

 

Generic Statistical Business Process Model (GSBPM)

 

51.               The GSBPM is a flexible tool to describe and define the set of business processes needed to produce official statistics. The GSBPM can also be utilized to harmonize statistical computing infrastructures, facilitating the sharing of software components, and providing a framework for process quality assessment and improvement.

 

52.               For further information please refer to the GSBPM web site http://www1.unece.org/stat/platform/display/metis/The+Generic+Statistical+Business+Process+Model

 

Generic Statistical Information Model (GSIM)

 

53.               The GSIM is the first internationally endorsed reference framework for statistical information. This overarching conceptual framework plays an important part in modernising, streamlining and aligning the standards and production associated with official statistics at both national and international levels.

 

54.               GSIM is a reference framework of information objects, which enables generic descriptions of the definition, management and use of data and metadata throughout the statistical production process. It provides a set of standardized, consistently described information objects, which are the inputs and outputs in the design and production of statistics. As a reference framework, GSIM helps to explain significant relationships among the entities involved in statistical production, and can be used to guide the development and use of consistent implementation standards or specifications.

 

55.               GSIM is one of the cornerstones for modernising official statistics and moving away from subject matter silos. It is a key element of the strategic vision prepared by the High-Level Group for the Modernization of Statistical Production and Services (HLG), and endorsed by the Conference of European Statisticians.

 

56.               For further information please refer to the GSIM web site

http://www1.unece.org/stat/platform/pages/viewpage.action?pageId=59703371

 

Quality Attributes/ Non-Functional Requirements

 

57.               It is important to capture Quality Attributes / Non-Functional Requirements in the design of the services. They have a significant influence on the software architecture of a service.

 

58.               The Quality Attributes / Non-Functional Requirements listed in Table 2 are grouped into four specific areas linked to design, runtime, system and user qualities. Further explanations of these quality attributes including descriptions can be found in Annex 1.

 

Table 2: Quality Attributes / Non-Functional Requirements

Design Qualities

      Reusability (Exploitability)

      Legal and licensing issues or patent-infringement-avoidability

      Privacy

      Multilanguage support (internationalization and localization)

      Compliance

      Extensibility (adding features, and carry-forward of customizations at next major version upgrade)

      Maintainability

      Platform compatibility

 

Run time Qualities

 

      Security

      Efficiency (resource consumption for given load)

      Interoperability

      Effectiveness (resulting performance in relation to effort)

      Performance / response time (performance engineering)

      Availability (see service level agreement)

      Fault tolerance (e.g. Operational System Monitoring, Measuring, and Management)

      Backup

      Robustness

 

System Qualities

 

      Failure management

      Quality (e.g. faults discovered, faults delivered, fault removal efficacy)

      Configuration management

      Deployment

      Operability

      Portability

      Scalability (horizontal, vertical)

      Backward Compatibility

      Supportability

      Testability

 

User Qualities

 

      Documentation

      Usability by target user community

 

 

 

Architecture Patterns  

 

59.               In simple terms, architecture patterns describe a re-usable solution to certain classes of problems. They explain how, when and why statistical services can be used, as well as the impact of using them in that way. They help a Service Assembler to identify combinations that have been used successfully in the past. Although not identified in this document, there are also anti-patterns, which are examples of what should not be done.

 

60.               The benefits of using architecture patterns can be described by using the analogy of an expert chess player. To play chess, you must learn the rules and the principles (for example, the value of different pieces and squares). However, to improve and become a really good player, you need to learn the patterns used by more experienced players and apply them to your game. In the same way, you can use the requirements, principles and quality attributes of the CSPA, but to get the maximum benefit, Service Assemblers should learn the architecture patterns.

 

61.               The CSPA will incorporate both "Classic" and Event Driven Service Oriented Architecture patterns. Both patterns provide capabilities to cope with the future challenges facing statistical organizations. Event Driven Service Oriented Architecture would be preferred in cases such as Big Data, new dissemination processes and cross-organization cooperation.

 

62.               For other issues, the choice of pattern either makes no difference (e.g. security) or its effect is still undetermined (e.g. future technology, loss of control of data collection). These aspects, including the impact on granularity, may be investigated on the basis of the Proof of Concept results.

 

63.               Further details comparing the two types can be found in Annex 2.

 

"Classic" Service Oriented Architecture (SOA)

 

64.               "Classic" SOA is a pattern that uses the Request/Response pattern for activating services. This implies a rather fixed routing of messages between services. The integration infrastructural platform implementing the process “orchestrates” the routing and execution of services. This pattern leads to less flexibility and tighter coupling between services than the Event Driven (Service Oriented) Architecture pattern described below.

 

65.               "Classic" SOA can be used if:

       A functional style and sequential flow is required  

       It is known precisely which service interface should be called  

       The service should be called exactly once  

       A response when service execution completes is expected  

       A response is expected

 

66.               An example of how this pattern could be applied in relation to collection: each questionnaire is stored in an entity service, which exposes an operation to get the questionnaire through a service call. The indicators are then computed, stored and made available through an entity service call.

 

Figure 6: Examples of “Classic” SOA
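The collection example for "Classic" SOA can be sketched in a few lines. This is a minimal in-process illustration, not a CSPA-defined interface: the class `QuestionnaireEntityService`, its operations, the fault type and the `turnover` variable are all hypothetical names chosen for the example. It shows the characteristics listed above: the caller knows exactly which operation to call, calls it once per questionnaire in sequence, and waits for a response.

```python
from dataclasses import dataclass, field
from typing import Dict, List


class QuestionnaireNotFound(Exception):
    """Fault returned when the requested questionnaire does not exist."""


@dataclass
class QuestionnaireEntityService:
    """Hypothetical entity service that stores questionnaires."""
    _store: Dict[str, Dict[str, float]] = field(default_factory=dict)

    def put_questionnaire(self, qid: str, answers: Dict[str, float]) -> None:
        self._store[qid] = dict(answers)

    def get_questionnaire(self, qid: str) -> Dict[str, float]:
        # Request/response: the caller names this operation and waits for the reply.
        try:
            return dict(self._store[qid])
        except KeyError:
            raise QuestionnaireNotFound(qid) from None


def compute_indicator(service: QuestionnaireEntityService,
                      qids: List[str], variable: str) -> float:
    """Orchestrating step: call the entity service once per questionnaire,
    in sequence, and return the mean of one variable."""
    values = [service.get_questionnaire(qid)[variable] for qid in qids]
    return sum(values) / len(values)


if __name__ == "__main__":
    service = QuestionnaireEntityService()
    service.put_questionnaire("q1", {"turnover": 100.0})
    service.put_questionnaire("q2", {"turnover": 300.0})
    print(compute_indicator(service, ["q1", "q2"], "turnover"))
```

The fixed call sequence in `compute_indicator` is what makes the routing rigid: adding a new consumer of questionnaires means writing another explicit caller.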

 

Event Driven (Service Oriented) Architecture (EDA)

 

67.               EDA is based on publish/subscribe concepts and makes heavy use of SOA approaches and design patterns. Some consider EDA to be an asynchronous version of SOA.

 

68.               SOA emphasizes the importance of loose coupling. EDA goes a step further. When an event is generated by an event source and sent to the processing middleware, it is not known which functionality will be triggered next. In SOA, a concrete service call would have been made, but this is not the case for Event Driven Architecture. For this reason, EDA talks about "decoupling" rather than loose coupling.

 

69.               EDA can be used if:

 

       All recipients that may be interested in the event should be notified  

       It is not exactly known which and how many recipients are interested in the event  

       It is not known how recipients respond to this event  

       Different recipients respond differently to the same event  

       Only one-way communication from the sender to the recipient is possible

 

70.               An example of how this pattern could be applied in relation to collection: each completed questionnaire publishes an event, as soon as it is available, for subscribers downstream. Early indicators can be produced by processing collection events straight through aggregation.

 


Figure 7: Examples of EDA
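The same collection example can be sketched in the event driven style. The `EventBus` class below is a hypothetical, in-process stand-in for the processing middleware (a real platform would normally deliver events asynchronously), and the event type name `questionnaire.completed` is an assumption made for the example. The publisher does not know which or how many subscribers exist, and receives no response.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Event = Dict[str, float]
Handler = Callable[[Event], None]


class EventBus:
    """Minimal in-process stand-in for the event-processing middleware."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Handler]] = defaultdict(list)

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._subscribers[event_type].append(handler)

    def publish(self, event_type: str, event: Event) -> None:
        # One-way: the publisher gets no response and does not know who listens.
        for handler in list(self._subscribers[event_type]):
            handler(event)


class EarlyIndicator:
    """Subscriber that updates a running mean as collection events arrive."""

    def __init__(self) -> None:
        self.count = 0
        self.total = 0.0

    def on_completed(self, event: Event) -> None:
        self.count += 1
        self.total += event["turnover"]

    @property
    def mean(self) -> float:
        return self.total / self.count if self.count else 0.0


if __name__ == "__main__":
    bus = EventBus()
    indicator = EarlyIndicator()
    audit_log: List[Event] = []
    # Two recipients respond differently to the same event.
    bus.subscribe("questionnaire.completed", indicator.on_completed)
    bus.subscribe("questionnaire.completed", audit_log.append)
    bus.publish("questionnaire.completed", {"turnover": 100.0})
    bus.publish("questionnaire.completed", {"turnover": 300.0})
    print(indicator.mean, len(audit_log))
```

A further subscriber can be added without changing the publisher, which is the "decoupling" property described above.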

 

Enabling Architectural Mechanisms

 

71.               Within each statistical organization, there needs to be an infrastructural environment in which the generic services can be combined and configured to run as elements of organization-specific processes. This environment is not part of the CSPA. The CSPA assumes that each statistical organization has such an environment and makes statements about the characteristics and capabilities that such a platform must have in order to be able to accept and run statistical services that comply with CSPA.

 

72.               Platform for Service Communication: A communication platform provides the capability for communication between statistical services. It enables inter-service communication while allowing statistical services to remain autonomous, and adds capabilities for monitoring and orchestrating the information flow. To assemble a built statistical service, the communication platform is updated so that it integrates with the new service. There are multiple ways of establishing a communication platform. Examples of architectural components include BPMS, ESB, Workflow Engines, Orchestration Engines, and Message Queuing and Routing.
 
73.               Platform for Configuring and Controlling Services and Processes: The Platform for Configuring and Controlling Services and Processes encompasses the functionalities and tools that support the management and maintenance of service metadata, artefacts and policies. Examples of how this mechanism could be achieved include Business Process Modelling System, Lifecycle Management, and Service Monitoring and Management.

   
74.               Platform for Reporting on Services and Processes: The Platform for Reporting is responsible for enabling real-time monitoring and near-real-time presentation of user-defined business key performance indicators (KPIs). Examples of how this mechanism could be achieved are a Static Dashboard or Business Activity Monitoring, which also generates alerts and notifications to users when these KPIs cross specified thresholds.
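The alerting behaviour attributed to Business Activity Monitoring can be sketched as a simple threshold check. This is an illustration only: the `KpiThreshold` structure, the `check_kpis` function and the KPI names are hypothetical, and no particular monitoring product works exactly this way.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class KpiThreshold:
    """User-defined threshold for one business KPI (hypothetical structure)."""
    name: str
    limit: float
    alert_when_above: bool = True  # False: alert when the value falls below the limit


def check_kpis(values: Dict[str, float], thresholds: List[KpiThreshold]) -> List[str]:
    """Return one alert message per KPI whose current value crosses its threshold."""
    alerts = []
    for t in thresholds:
        if t.name not in values:
            continue  # no measurement reported yet for this KPI
        value = values[t.name]
        crossed = value > t.limit if t.alert_when_above else value < t.limit
        if crossed:
            alerts.append(f"{t.name}={value} crossed threshold {t.limit}")
    return alerts


if __name__ == "__main__":
    thresholds = [
        KpiThreshold("response_rate", 0.6, alert_when_above=False),
        KpiThreshold("failed_service_calls", 10),
    ]
    current = {"response_rate": 0.55, "failed_service_calls": 3}
    for message in check_kpis(current, thresholds):
        print(message)
```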

 

 

 

 

 

D.               Catalogues

 

75.               In the context of a CSPA, catalogues of various artefacts are seen as key enablers. They provide lists and descriptions of standardized artefacts, and, where relevant, information on how to obtain and use them.

 

76.               The catalogues can be at many levels, from global to local. However, for the purposes of this project, it is the global level that is of primary interest. The global catalogue is called the Global Artefact Catalogue. The Catalogue will provide information about resources and potential collaboration partners, helping to ensure that any new or modified component conforms to the requirements of the CSPA. Governance and support mechanisms and processes will need to be defined to ensure the continued relevance and utility of the catalogues.

 

77.               The Global Artefact Catalogue will include three sub catalogues. These are shown in Figure 8.


 

Figure 8: Types of Catalogues

 

78.               The catalogues should support the lifecycle management, governance and use of components and services, providing the right level of artefacts for service design, integration and transition to a production environment.

 

79.               They are not meant to function as an "app store", as they are not designed to be platform specific, and will not necessarily hold copies of executable code. They will, however, provide all necessary information about how to access the artefacts, including contact details for further information. They should provide sufficient contents to support the CSPA.

 


Figure 9: Types of artefacts in the Catalogues

 

80.               As shown in Figure 9, the types of artefacts in Catalogues are:

 

       Frameworks include artefacts such as the GSBPM and the GSIM

       Standards include DDI and SDMX

       Policies support governance and use, for example "no changes to GSBPM or GSIM unless they are the result of a formal approval process"

       Guidelines concern the practical use and integration of artefacts

       Other knowledge assets could include architectural principles and information on how to use the catalogues

       Planning metadata is intended as a repository for information about developments in progress or planned, to promote collaboration at the earliest possible stage in the development of artefacts, e.g. "Statistics New Zealand aim to develop and make available component X by (date)". This will provide the basis for a global roadmap.
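One possible shape for a catalogue entry is sketched below. The field names, the `ArtefactType` enumeration and the `find_by_type` helper are assumptions made for illustration; the CSPA does not prescribe a catalogue schema. In line with the description above, an entry records how to obtain an artefact and whom to contact, and does not hold executable code.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class ArtefactType(Enum):
    """The artefact types listed above."""
    FRAMEWORK = "framework"
    STANDARD = "standard"
    POLICY = "policy"
    GUIDELINE = "guideline"
    KNOWLEDGE_ASSET = "other knowledge asset"
    PLANNING_METADATA = "planning metadata"


@dataclass(frozen=True)
class CatalogueEntry:
    """Hypothetical catalogue record describing one artefact."""
    name: str
    artefact_type: ArtefactType
    description: str
    access_info: str  # how to obtain the artefact; no executable code is stored
    contact: str      # where to ask for further information


def find_by_type(entries: List[CatalogueEntry],
                 artefact_type: ArtefactType) -> List[CatalogueEntry]:
    """Return the entries of one artefact type, in catalogue order."""
    return [e for e in entries if e.artefact_type is artefact_type]


if __name__ == "__main__":
    # Contact addresses below are placeholders, not real contact points.
    catalogue = [
        CatalogueEntry("GSBPM", ArtefactType.FRAMEWORK,
                       "Generic Statistical Business Process Model",
                       "see the GSBPM web site", "contact@example.org"),
        CatalogueEntry("SDMX", ArtefactType.STANDARD,
                       "Statistical data and metadata exchange standard",
                       "see the SDMX web site", "contact@example.org"),
    ]
    print([e.name for e in find_by_type(catalogue, ArtefactType.FRAMEWORK)])
```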

 

VII.               Governance of the architecture

 

81.               Creating a CSPA is only the first part of the story. It is likely that the architecture, or elements of it, will need to be revised and refined from time to time to ensure continued relevance. As the architecture will be a common asset for the international official statistics community, there will need to be a clear and transparent process for change management.

 

82.               This process does not need to be invented from scratch. The CSPA is being developed as a project overseen by the High-Level Group for the Modernization of Statistical Production and Services (HLG) so it is assumed that this group will be the custodians of the outputs. Whilst the architecture itself will be “owned” by the international official statistics community, it will be administered by the HLG, as top-level representatives of that community.

 

83.               However the HLG itself is unlikely to be involved in detailed discussions about specific elements of the architecture. Therefore, for practical purposes, proposals for changes will usually be discussed at a lower level. These discussions will take place either in one of the four modernization committees that are being established under the HLG, or in a specific task team. These groups will then formulate recommendations for change, which will typically be bundled together as a package to be formally signed off by the HLG.

 

84.               The HLG and the modernization committees will need to carefully balance the advantages of change, in terms of increasing relevance and usefulness, against the costs of having to implement those changes within statistical organizations. A reasonable degree of stability over time is therefore a key requirement for the architectural framework.

 



Annex 1: Quality Attributes / Non-Functional Requirements

 

Table 3: Further details on Quality Attributes

CATEGORY

Quality Attribute

Definition

Notes

FURPS

Design Qualities

Reusability (Exploitability)

Is the capability for components and subsystems to be suitable for use in other applications and other scenarios. It minimizes the duplication of components and also the implementation time.

 

Functionality

 

Legal and licensing issues or patent-infringement-avoidability

Is the conformance of a system to legal, licensing and patent regulations.

 

Functionality

 

Privacy

Is the ability of a system to prevent the disclosure or loss of information.

Confidentiality

Functionality

 

Multilanguage support (internationalization and localization)

Is the capability of a system to handle different languages and regional differences.

 

Functionality



Compliance

Is the degree to which a system conforms to a rule, such as a specification, an architecture description or reference, policy, standard or law.

 

Supportability

 

Extensibility (adding features, and carry-forward of customizations at next major version upgrade)

Is the ability of a system to take future growth into consideration. It is measured by the ability of a service to be extended and the level of effort required to implement the extension.

 

Supportability

 

Maintainability

The ease with which a software system or component can be modified to correct faults, improve performance, or other attributes, or adapt to a changed environment

 

Supportability

 

Platform compatibility

Is the extent to which a system needs to be adapted to a given technological environment.

 

Supportability

 

Price

Is the measure of the cost-benefit relation of a system in creating a solution for a given problem.

 

Supportability

Run time Qualities

Security

Is the capability of a system to prevent malicious or accidental actions outside of the designed usage, and to prevent disclosure or loss of information. A secure system aims to protect assets and prevent unauthorized modification of information.

 

Functionality

 

Efficiency (resource consumption for given load)

Is the degree to which a process uses the lowest amount of inputs to create the greatest amount of outputs. It relates to the use of all inputs to produce any given output.

 

Performance

 

Interoperability

Is the ability of diverse systems and organizations to work together (inter-operate). The term is often used in a technical systems engineering sense, or alternatively in a broad sense, taking into account social, political, and organizational factors that impact system to system performance.

 

Functionality

 

Effectiveness (resulting performance in relation to effort)

Is the capability of a system to help users reach their specific goals in a given context.

 

Performance

 

Performance / response time (performance engineering)

Is an indication of the responsiveness of a system to execute any action within a given time interval. It can be measured in terms of latency or throughput. Latency is the time taken to respond to any event. Throughput is the number of events that take place within a given amount of time.

 

Performance

 

Availability (see service level agreement)

Is the proportion of time that the system is functional and working. It can be measured as a percentage of the total system downtime over a predefined period. Availability will be affected by system errors, infrastructure problems, malicious attacks, and system load.

 

Reliability

 

Fault tolerance (e.g. Operational System Monitoring, Measuring, and Management)

Is the ability of a system to continue its intended operation, possibly at a reduced level, rather than failing completely, when some part of the system fails.

 

Reliability

 

Backup

Is the capacity of a system to copy and to archive computer data so it may be used to restore the original after a data loss event.

 

Reliability

 

Robustness

Is the ability of a computer system to cope with errors during execution or the ability of an algorithm to continue to operate despite abnormalities in input, calculations, etc.

 

Reliability

System Qualities

Failure management

Is the anticipation, detection and resolution of programming, application, and communication errors.

 

Reliability

 

Quality (e.g. faults discovered, faults delivered, fault removal efficacy)

Is the capability of a system to meet the needs and concerns of the stakeholders for whom it was developed.

 

Reliability

 

Configuration management

Is the capacity to instantiate and customize new versions of the system.

 

Supportability

 

Deployment

Is the ability to schedule software distribution

 

Supportability

 

Operability

Is the ability to keep a system in a reliable condition according to predefined operational requirements

 

Supportability

 

Portability

Is the capacity of a system to be adapted to different specified environments without applying other actions or means than those provided for this purpose for the software considered.

 

Supportability

 

Scalability (horizontal, vertical)

Is the ability of a system, network, or process to handle a growing amount of work in a capable manner or its ability to be enlarged to accommodate that growth.

 

Supportability

 

Backward Compatibility

Is the ability of a system to work with input generated by an older product or technology. If products designed for the new standard can receive, read, view or play older standards or formats, then the product is said to be backward-compatible

Just standards

Supportability

 

Supportability

Is the ability of the system to provide information helpful to identifying and resolving issues when it fails to work correctly.

 

Supportability

 

Testability

Is a measure of how easy it is to create test criteria for the system and its components, and to execute these tests in order to determine if the criteria are met.

 

Supportability

User Qualities

Documentation

Is the degree to which a system is described in all its parts, in order to give complete knowledge of how it has been developed and how it must be used.

 

Usability

 

Usability by target user community

Is the measure of how easily the system can be used by a given group of users who share certain characteristics.

 

Usability

 


Annex 2: Differences between “Classic” SOA and EDA

 

67.               The following table summarizes the main differences between the architecture patterns "Classic" SOA and Event Driven Service Oriented Architecture.

 

Table 4: Side by side comparison of the characteristics of "Classic" SOA and EDA

 

 
Characteristics

"Classic" SOA

Event Driven SOA (EDA)

Architectural Style

Functional service style is required

Event driven service style (publish-subscribe) is required

Service Interface

Based on the operation contracts,
data contracts (for parameters) and fault contracts.

Based on the lists of events that the service can publish.

Control Flow

Sequential flow is required

Sequential flow is possible by using additional platform functionalities.

Request

The service can be called exactly once.

Calls are not made explicitly.

Request access point

It is known precisely which service interface should be called

Requests are not made directly to a service interface.  

Response

A response when service execution completes can be explicitly expected.

A response when service execution completes cannot be explicitly expected unless there is an event associated with it.
Different recipients can respond differently to the same event.
It is not known how recipients respond to this event.

Message Exchange Pattern (MEPs)

Various MEPs available.
Synchronous and asynchronous

Only one-way communication from the publisher service to subscriber(s) (publish-subscribe pattern).

Multiple consumers

One at a time

All recipients listening to the event are notified

 

 

 

 

 

 

 

 

 

Table 5: Side by side comparison of the potential adherence to the Statistical Service Design Principles of "Classic" SOA and EDA (No, Depends on, Yes-partially, Yes-fully)

 

  Service Design Principles

"Classic" SOA

Event Driven SOA (EDA)

Managed standardized service contracts based on GSIM fragments/entities.
 

Yes-fully
Functionality explicitly defined in the service interface (in the operation parameters).

Yes-fully
Functionality needs to be described externally (textually). The event type data payload is a contract based on GSIM.

Enable services to be loosely coupled externally and be aware of internal coupling.

Yes-partially
Loose coupling based on a strict service contract is possible.

Yes-fully
Very loose coupling. Provider is not aware of consumer's existence.
Consumer is coupled to the event only (not to the source of the event, if a mediator is in place for subscription).

Maximize service autonomy (completeness) to enable share-ability and reusability (External & Internal)

Depends on implementation but potentially Yes-fully

Depends on implementation but potentially Yes-fully

Non-functional requirements (quality attributes) form a key input in design decisions.

Depends on implementation.

Depends on implementation.

Granularity based on the level below a GSBPM sub phase

Depends on design

Depends on design

Independence between design and implementation

Depends on implementation but potentially Yes-fully

Depends on implementation but potentially Yes-fully

 

 


Annex 3: Glossary

 

Table 6: Glossary

Term

Definition

Application Architecture

The set of practices used to select, define or design software components and their relationships.

Source: Sprint

Architectural Pattern

The description of a particular recurring design problem that arises in different design contexts. The solution schema is specified by describing its components, their responsibilities, their relations and the ways they collaborate.

Business Architecture

Defines and evolves understanding of what our organization does and how we do it (statistics in our case)  

Source: Statistics New Zealand

Business Process

A series of logically related activities or tasks performed together to produce a defined set of results.

Capability Architecture

1.                   Start with business outcomes.   To a CEO, the metrics — the business outcomes — that most determine the organization’s success boil down to the balance sheet, statement of operations, and statement of cash flow, combined with soft metrics that indicate the organization’s power, stature, and influence in the marketplace.

2.                   Direct the evolution of business capabilities.   Business plans make a poor foundation for tech strategy. They will — and should — constantly change in response to competitive and political dynamics. A better foundation is an organization’s core capabilities — product design, customer service, etc. — which we build on and evolve to achieve better outcomes. But we need to do more than model and plan business capabilities, we need to implement them using an integrated, operational, measurable combination of people, processes, technology, and physical resources. Thus, Forrester places the notion of an evolving   business capability implementation   — a complete running part of a business — as the target for Business Capability Architecture.

3.                   Provide design implementation models oriented around business change.   A business capability map is good, but if you can’t carry it through to implementation, it’s just a pretty picture. So Forrester’s Business Capability Architecture — by analyzing the ways that businesses change and evolve — provides design principles and implementation models   for integrated, coherent, holistic implementation of business capabilities. This includes the notion of a   business capability platform   — a cohesive, integrated, multitechnology, business-focused platform — that gets past the single technology focus of our current day BPM apps, event-driven apps, and the like .

Source: Forrester

Common Statistical Production Architecture

A set of principles for increased interoperability within and between statistical organizations through the sharing of processes and components, to facilitate real collaboration opportunities, international decisions and investments, and sharing of designs, knowledge and practices.

Source: Sprint

Composition

Assembly of Services that in itself is a service

Enterprise Architecture

Enterprise architecture is about understanding all of the different elements that go to make up the enterprise and how those elements interrelate. It is an approach to enabling the vision and strategy of an organization, by providing a clear, cohesive, and achievable picture of what’s required to get there.

Source: Sprint

Function

A sequence of program instructions that perform a specific task, packaged as a unit.

Global Platform roadmap

An international level roadmap showing at what point in future new capabilities will be added to the Statistical Organization library

GSBPM aligned process catalogue  

Collection of pre-assembled GSBPM sub processes. Each element listing preconditions, inputs, output, outcome

GSIM aligned Information Object Catalog

Collection of schemas, each schema describing one information object exchanged between services through an interface.

Industry Architecture

A set of agreed common principles and standards designed to promote greater interoperability within and between the different players that make up an "industry", where an industry is defined as a set of organizations with similar inputs, processes, outputs and goals.

Source: Sprint

Information Architecture

Builds understanding of our information, its flows and uses across the organization, and how we manage that information.

Source: Sprint

Integration Architecture

An Architecture specifically focusing on the integration, co-operation, etc.   of certain elements, e.g. Components, Services or data.

Source: Sprint

Interface

A type of contract by which subsystems or components communicate.

Source: Sprint

Message Exchange Pattern

A model that describes how two different parts of a message passing system connect and communicate with each other. There are two major message exchange patterns: a request-response pattern and a one-way pattern (also called fire-and-forget). For example, HTTP is a request-response protocol and UDP is a one-way protocol.

Metadata Driven

An architecture that supports the design, composition, operation, and management of statistical business processing and its inputs and outputs through interaction with standard metadata - all of which is automated to the maximum degree possible.

Orchestration

A way of stringing together a number of services to build a statistical process. Orchestration composes services to a new service that has central control over the whole process.

Orchestration vs Workflow

The main difference between workflow automation and orchestration is that workflows are processed and completed as processes within a single domain. (Erl)

Orchestration is the connecting and automating of workflows when applicable to deliver a defined service.

Principles

General rules and guidelines, intended to be enduring and seldom amended, that inform and support design and decision making, and the way in which an organization sets about fulfilling its mission.

Source: Sprint

Protocol

Formats and rules for exchanging messages in or between computing systems

Quality attributes

Quality attributes are the overall factors that affect runtime behavior, system design, and user experience.

They represent areas of concern that have the potential for application wide impact. When designing applications to meet any of the quality attributes requirements, it is necessary to consider the potential impact on other requirements and to analyze the tradeoffs between them.

The importance or priority of each quality attributes differs from system to system.

Reuse

Reuse is the concept of using a common asset (implemented component, a component definition, a pattern...) repetitively in different (or similar) contexts (for example in different business processes), and/or by different participants, and/or over time.

Source: Sprint

Service

A service is a logical representation of a repeatable business activity that has a specified outcome and is self-contained, may be composed of other services and is a "black box" to consumers of the service.

Source: TOGAF (G113)

Service Contract

A   service contract   is comprised of one or more published documents (called service description documents) that express meta information about a service. The fundamental part of a service contract consists of the service description documents that express its technical interface. These form the technical service contract which essentially establishes an API into the functionality offered by the service. A service contract can be further comprised of human-readable documents, such as a Service Level Agreement (SLA) that describes additional quality-of-service features, behaviors, and limitations.

Source: http://serviceorientation.com/soaglossary/service_contract

Service Design Principles

A service design principle represents a highly recommended guideline for shaping solution logic in a certain way and with certain goals in mind. These goals are usually associated with establishing one or more specific design characteristics (as a result of applying the principle).

Source: http://serviceorientation.com/index.php/soaglossary/design_principle

Service Interface

A service interface is the abstract boundary that a service exposes. It defines the types of messages and the message exchange patterns that are involved in interacting with the service, together with any conditions implied by those messages.

Source: http://www.w3.org/TR/ws-arch/#service_interface
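As an illustration only, the definition above can be sketched in code: the message types and the exchange pattern (here a simple request-response) form the abstract boundary, and any implementation sits behind it. All names in this sketch (EditRequest, EditResponse, EditingService, InMemoryEditingService) are hypothetical and are not taken from this architecture or from any catalogue.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass


# Message types exchanged across the boundary (hypothetical examples).
@dataclass(frozen=True)
class EditRequest:
    dataset_id: str
    rule_set: str


@dataclass(frozen=True)
class EditResponse:
    dataset_id: str
    records_flagged: int


class EditingService(ABC):
    """Abstract boundary: one request-response exchange, nothing about internals."""

    @abstractmethod
    def run(self, request: EditRequest) -> EditResponse:
        """Accept an EditRequest and return an EditResponse."""


class InMemoryEditingService(EditingService):
    """Placeholder implementation; a consumer only ever sees EditingService."""

    def run(self, request: EditRequest) -> EditResponse:
        # Condition implied by the message: the dataset identifier must be present.
        if not request.dataset_id:
            raise ValueError("dataset_id must not be empty")
        # No real editing is performed in this sketch, so nothing is flagged.
        return EditResponse(dataset_id=request.dataset_id, records_flagged=0)
```

A consumer written against EditingService would not need to change if InMemoryEditingService were swapped for another implementation that honours the same messages.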

Service Level

Service levels describe named categories of measurable increments of non-functional/quality attributes (such as scalability, availability, security, transaction support…) offered as packages by a service provider. For example, service levels for availability might be offered as GOLD (99.999% uptime), SILVER (98%) and BRONZE (95%). Service levels are used by the provider to negotiate a service-level agreement (SLA) with a service consumer.
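To make the availability example concrete, the following minimal sketch converts an uptime percentage into a yearly downtime budget and finds the highest level that a measured uptime satisfies. It assumes the GOLD figure is read as 99.999%, uses a 365-day year, and the function names are hypothetical.

```python
# Tier names and uptime targets follow the availability example in the definition.
# Assumption: the GOLD figure is read as 99.999 percent.
SERVICE_LEVELS = {"GOLD": 99.999, "SILVER": 98.0, "BRONZE": 95.0}

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes, ignoring leap years


def allowed_downtime_minutes(uptime_percent: float) -> float:
    """Return the yearly downtime budget implied by an uptime percentage."""
    if not 0.0 <= uptime_percent <= 100.0:
        raise ValueError("uptime_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (100.0 - uptime_percent) / 100.0


def best_level_met(measured_uptime_percent: float):
    """Return the highest tier whose target is met, or None if none is met."""
    ranked = sorted(SERVICE_LEVELS.items(), key=lambda item: item[1], reverse=True)
    for name, target in ranked:
        if measured_uptime_percent >= target:
            return name
    return None
```

Under these assumptions GOLD allows roughly 5.3 minutes of downtime per year, while BRONZE allows 26,280 minutes (about 18 days), which shows how far apart the named categories are in practice.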

Service Oriented Architecture

Service-Oriented Architecture (SOA) is an architectural style that supports a way of thinking (Service Orientation) in terms of services and service-based development and the outcomes of services.

The SOA architectural style has the following distinctive features:

 

                   It is based on the design of the services – which mirror real-world business activities – comprising the enterprise (or inter-enterprise) business processes.

                   Service representation utilizes business descriptions to provide context (i.e., business process, goal, rule, policy, service interface, and service component) and implements services using service orchestration.

                   It places unique requirements on the infrastructure – it is recommended that implementations use open standards to realize interoperability and location transparency.

                   Implementations are environment-specific – they are constrained or enabled by context and must be described within that context.

                   It requires strong governance of service representation and implementation.

                   It requires a “Litmus Test”, which determines a “good service”.

 

Source: The Open Group, http://www.opengroup.org/soa/source-book/soa/soa.htm

Share

Share is an ownership concept where an asset is made available to other participants for use.

 

There are levels of sharing. A limited form of sharing would be to provide another participant with the means to replicate (make a copy of) the asset, for example by giving them the source code (i.e. they share only an aspect of the asset). A more involved form of sharing would entail the asset actually being made entirely common (in this case the asset is also reused).

Source: Sprint

Technology Architecture

The architecture describing the infrastructure technology underlying (supporting) the business and application layers.

Source: Sprint