EUROSTAT_FINAL

EUROPEAN COMMISSION

EUROSTAT

Directorate B: Quality, methodology and information systems

Unit B6: Reference databases and metadata

Feasibility Study on the collection and production of process related metadata

APRIL 2011


1.    Overview of the applied Methodology

This section outlines the methodology applied in the feasibility study on process metadata. The study comprised the following stages:

 

       Determination of target population

The target population was unchanged: it comprised the 33 NSIs that also participated in the second phase of the assessment analysis, namely those of the 27 EU Member States, the 3 EFTA countries and 3 candidate countries for EU membership [1].

       Determination of needs for information

The use of statistical business process models and the availability of process metadata in relation to the main phases of those processes have been defined as the main subjects of the analysis.

The integration and harmonisation of statistical business processes are mainly related to 1) the use of a specific model/standard for statistical business processes and 2) the documentation of those processes. For that reason, these two elements were defined as the main areas of investigation in the feasibility analysis.

       Data Collection

The questionnaire was sent to the 33 NSIs concerned on 16 December 2010, and replies were requested by 21 January 2011.

       Analysis

Finally, 32 out of 33 NSIs participated in the survey by filling in the questionnaire on process related metadata and statistical business process models.


2.    Current Situation

This section presents the main outcomes of the analysis, both for the Statistical Business Process Models and for the metadata that describe those processes. The feasibility analysis investigated the following aspects:

1) The role of the Statistical Business Process Models in the statistical business lifecycle of the NSIs within the ESS

2) Process metadata in terms of current and future availability, content and their relation to IT applications that concern their production, storage and dissemination. The analysis for process metadata was conducted in relation to the 9 main phases of the Generic Statistical Business Process Model (GSBPM).

2.1.    Statistical Business Process Models

The idea of modelling the statistical process (through the development of statistical business process models) is not new: the statistical community has been working in this direction for more than ten years. This work has now reached a level of maturity at which the logical next step was the development of a generic international model, such as the GSBPM.

A generic model can provide answers and solutions to many of the daily challenges that the statistical offices have to tackle, such as:

       Harmonisation of used terminologies;

       Common framework for metadata systems development;

       Facilitation of quality management procedures;

       Software sharing and reuse;

       Enabling of process-based management, among others.

However, despite the importance and added value of generic business process models, and the resources invested in their development, their adoption by the statistical offices is estimated to be far from uniform.

In the rest of this subsection, we will present and comment on the findings of the “2010/2011” questionnaire survey of the project on monitoring of national metadata systems, which are related to the issue of statistical business process models.

Use of models/standards in the statistical business lifecycle

The first important finding of the analysis is related to the fact that the European Statistical Community is by no means homogeneous, as far as the use of statistical business process models is concerned.

More specifically, 19 of the 32 NSIs declare that they use a model/standard for modelling their business processes, but only 4 of them have adopted and use the GSBPM (See Table 2.1). Of the remaining 15 NSIs, which use other models/standards, 12 use a model related to the GSBPM and 2 use a model not related to it; for the last one no information is available (See Table 2.2).

Finally, about 40% of the NSIs (13/32) do not use any kind of model/standard.
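The shares quoted above follow directly from the counts in Table 2.1. As an illustration only, the short Python sketch below recomputes them; the variable and function names are hypothetical and are not part of the survey material.

```python
# Counts reported in Table 2.1 (replies of the 32 responding NSIs).
model_use = {
    "GSBPM": 4,
    "other model/standard": 15,
    "no model/standard": 13,
}

respondents = sum(model_use.values())
with_model = model_use["GSBPM"] + model_use["other model/standard"]


def share(count, total):
    """Return count as a percentage of total, rounded to one decimal place."""
    return round(100 * count / total, 1)


print(f"NSIs using a model/standard: {with_model}/{respondents}")
print(f"NSIs without a model/standard: {share(model_use['no model/standard'], respondents)}%")
print(f"GSBPM users among NSIs with a model: {share(model_use['GSBPM'], with_model)}%")
```

Keeping the raw counts in one place and deriving every share from them makes it easier to check that the figures in the text and in the tables agree.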

 

If we focus only on those NSIs that have adopted some kind of model for business process modelling (GSBPM or other), their level of maturity again differs considerably.

In fact, only 5 out of the 19 NSIs currently use their model to a large extent for their processes, and 8 use it to a smaller extent (See Table 2.3).

The picture is different when considering new/future processes. In this case, 5 out of the 8 NSIs which currently use the model to a smaller extent plan to extend its use, whereas the other 3 have no particular plans. Finally, all the NSIs that currently use the statistical business process model to a large extent plan to maintain this policy for future processes.

 

Table 2.1: Statistical Business Process Model

Do you have a Model/Standard for describing the statistical business processes? | Total
Yes, we have the Generic Statistical Business Process Model (GSBPM) | 4
Yes, we have other than GSBPM | 15
No, we do not have a specific Model/Standard | 13
Total | 32

 

Table 2.2: Relation of the model to the GSBPM

Is the model related to the GSBPM? | Total [2]
Yes | 12
No | 2
NA | 1
Total | 15

 


Table 2.3: Extent of use of the Statistical Business Process Model

Rows: current processes. Columns: future processes.

Current processes | Use of the Model to a large extent | Use of the Model to a smaller extent | We have adopted the Model but we are not actually using it for the moment | NA | Total
Use of the Model to a large extent | 5 | 0 | 0 | 0 | 5
Use of the Model to a smaller extent | 5 | 3 | 0 | 0 | 8
We have adopted the Model but we are not actually using it for the moment | 2 | 0 | 2 | 0 | 4
NA | 1 | 0 | 0 | 1 | 2
Total | 13 | 3 | 2 | 1 | 19

 

Obstacles in using a Specific Model/Standard

Another important objective of the questionnaire survey was to pinpoint the reasons that inhibit or discourage NSIs from adopting and using any type of generalised business process model.

The majority of the NSIs that do not have a model (9/13) indicated “limited human and/or financial resources” as the main obstacle to introducing models/standards (See Table 2.4). The "absence of (corporate) strategy for improving the degree of harmonisation and standardisation of business process" was mentioned as a reason by 2 NSIs.

 

Table 2.4: Reasons for not having a model/standard

Reasons for not having a Model/Standard | Total [3]
Absence of strategy for improving the degree of harmonisation and standardisation of business processes | 2
Limited human and/or financial resources | 9
Other | 3

 


Future plans for the adoption of a Model/Standard

Another important aim of the survey questionnaire was to draw a general picture of the future of statistical business process models and standards in the NSIs. To this end, the NSIs that do not currently use any kind of model or standard (13 in total) were invited to provide information on their future plans.

Again, the results were varied (See Table 2.5): 4 out of the 13 NSIs declared that they intend to adopt GSBPM in the future, although the implementation has not started yet. The same number of respondents (i.e. 31%) replied that they intend to use some kind of model/standard but have not yet decided which one. Moreover, another 2 NSIs stated that they are already running a project to introduce a model/standard in the organisation, although this is not GSBPM.

None of the NSIs mentioned a progressing project for the adoption of GSBPM.

 

Finally, one NSI reported that it has no plan for adoption of a model (GSBPM or other) and one respondent did not provide any answer to this question.

 

Table 2.5: Future plans for using a model/standard

Existence of plans for the future adoption and implementation of a statistical business process model | Total [4]
Yes, we plan to adopt GSBPM. The implementation did not start yet | 4
Yes, but we do not know yet which model/standard we will adopt | 4
Yes, a project for the adoption of a Model/Standard other than GSBPM is in progress | 2
No, we do not have any plans | 1
Other | 1
NA | 1
Total | 13

 


Assessment of the contribution of GSBPM

Another aspect of the questionnaire survey addressed the contribution of GSBPM to various aspects of statistical production. The respondents were asked to evaluate the importance of the contribution of GSBPM to a selection of 8 main issues, which are listed in Table 2.6.

Importance was rated on a four-level scale, ranging from “very important” to “not important at all”.

The analysis revealed that NSIs consider the most important contribution of the GSBPM to be the “standardisation of statistical business processes”, for which 23/31 NSIs (74%) selected the option “Very important” (See Table 2.6). The least important contributions were found to be the “impact on the organisational structure” and the "measurement of operational costs", each rated “not important at all” by 5/31 NSIs (16%).

Looking at each individual issue, it can also be noted that the majority of the NSIs considered that the GSBPM makes a significant contribution (answers “very important” and “important”) in the fields of:

       “Development of statistical metadata systems” (10 and 17 respondents respectively),

       “Quality assessment of statistical business processes” (also 10 and 17),

       “Description of statistical business processes” (18 and 11),

       “Increase of understanding of statistical business processes” (17 and 12).
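The combined scores listed above can be reproduced from the counts in Table 2.6. The following Python sketch is illustrative only: it adds the “very important” and “important” replies for each issue and ranks the issues by that sum. The names `TABLE_2_6`, `significant` and `ranking` are hypothetical helpers introduced for this example.

```python
# Replies per issue from Table 2.6, in the order:
# (very important, important, not all that important,
#  not important at all, don't know)
TABLE_2_6 = {
    "Development of statistical metadata systems": (10, 17, 4, 0, 0),
    "Standardisation of statistical business processes": (23, 7, 1, 0, 0),
    "Quality assessment of statistical business processes": (10, 17, 2, 2, 0),
    "Description of statistical business processes": (18, 11, 1, 1, 0),
    "Impact on the organisational structure": (5, 9, 12, 5, 0),
    "Increase of understanding of statistical business processes": (17, 12, 0, 2, 0),
    "Provision of an input to high-level corporate work planning": (6, 12, 8, 3, 2),
    "Measurement of operational costs": (1, 13, 10, 5, 2),
}


def significant(replies):
    """Number of NSIs answering "very important" or "important"."""
    very_important, important = replies[0], replies[1]
    return very_important + important


# Rank the issues by the combined number of positive assessments.
ranking = sorted(
    TABLE_2_6, key=lambda issue: significant(TABLE_2_6[issue]), reverse=True
)

for issue in ranking:
    replies = TABLE_2_6[issue]
    print(f"{significant(replies):2d}/{sum(replies)}  {issue}")
```

Each row sums to the 31 NSIs that answered this question, which gives a simple consistency check on the transcription of the table.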

 

Table 2.6: Contribution of GSBPM

Importance of GSBPM's contribution

Issue to which GSBPM contributes | Very important | Important | Not all that important | Not important at all | Don’t know
Development of statistical metadata systems | 10 | 17 | 4 | 0 | 0
Standardisation of statistical business processes | 23 | 7 | 1 | 0 | 0
Quality assessment of statistical business processes | 10 | 17 | 2 | 2 | 0
Description of statistical business processes | 18 | 11 | 1 | 1 | 0
Impact on the organisational structure | 5 | 9 | 12 | 5 | 0
Increase of understanding of statistical business processes | 17 | 12 | 0 | 2 | 0
Provision of an input to high-level corporate work planning | 6 | 12 | 8 | 3 | 2
Measurement of operational costs | 1 | 13 | 10 | 5 | 2


 

The entities of the NSIs that are involved in each phase of the GSBPM

As a last step in this part of the analysis, the questionnaire focused on the nine phases of the statistical business process, as defined by the statistical community, and on the way each statistical institute handles each phase (central handling, handling by production units, handling by other entities, or any combination of the three).

In total, 30 organisations responded to this question (See Table 2.7).

The results reveal that, in general, Statistical Production Teams are involved in most of the phases of the statistical business process, either alone or in cooperation with other partners (mainly central units/departments). In each phase of the statistical business process (except Dissemination), the main responsibilities are assigned exclusively to the Statistical Production Teams in about half of the participating countries. This is particularly the case for phases “5.Process” and “6.Analyse”. The role of the Statistical Production Teams is also essential in phases “2.Design” and "9.Evaluate" (involvement in 29/30 NSIs and 25/27 NSIs respectively), in which Central Units/Departments often collaborate.

The most centralised phase is Dissemination. This phase is exclusively under the responsibility of Central Units/Departments in 37% of the NSIs (11/30). Moreover, the proportion of NSIs in which dissemination processes are carried out jointly by Central Units/Departments and Statistical Production Teams is also 37%.
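The involvement figures quoted above (29/30, 25/27 and 11/30) are obtained by summing the relevant columns of Table 2.7. The Python sketch below shows this for three phases only; the column abbreviations ("C", "S", "O") and the helper `involvement` are assumptions introduced for illustration and do not appear in the questionnaire.

```python
# Column order follows Table 2.7. Abbreviations (assumed for this sketch):
# "C" = Central Unit/Dept., "S" = Statistical Production Teams,
# "O" = other entity.
COLUMNS = ["C+S+O", "C+S", "C", "S", "S+O", "C+O", "Other", "NA"]

# Three rows transcribed from Table 2.7 (number of NSIs per combination).
TABLE_2_7 = {
    "2. Design": [1, 12, 1, 13, 3, 0, 0, 0],
    "7. Disseminate": [4, 11, 11, 2, 0, 1, 1, 0],
    "9. Evaluate": [1, 10, 0, 12, 2, 0, 1, 1],
}


def involvement(phase, entity):
    """Return (NSIs where `entity` is involved, all NSIs replying) for a phase."""
    row = TABLE_2_7[phase]
    involved = sum(
        count for column, count in zip(COLUMNS, row) if entity in column.split("+")
    )
    return involved, sum(row)


print(involvement("2. Design", "S"))    # Statistical Production Teams in Design
print(involvement("9. Evaluate", "S"))  # Statistical Production Teams in Evaluate

# Dissemination handled exclusively by Central Units/Departments.
central_only = TABLE_2_7["7. Disseminate"][COLUMNS.index("C")]
total = sum(TABLE_2_7["7. Disseminate"])
print(f"{central_only}/{total} = {100 * central_only / total:.0f}%")
```

Note that the denominator used here is the row total, which includes the "NA" replies, in line with the totals reported in the table.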

 

Table 2.7: Entities involved in each phase of the statistical business process

The entities of the NSIs that are involved in the different phases

Phase | Central Unit/Dept. + Statistical Production Teams + Other entity | Central Unit/Dept. + Statistical Production Teams | Central Unit/Dept. | Statistical Production Teams | Statistical Production Teams + Other entity | Central Unit/Dept. + Other entity | Other | NA | Total
1. Specify needs | 3 | 3 | 7 | 14 | 2 | 0 | 0 | 0 | 29
2. Design | 1 | 12 | 1 | 13 | 3 | 0 | 0 | 0 | 30
3. Build | 4 | 8 | 4 | 11 | 1 | 1 | 0 | 1 | 30
4. Collect | 3 | 4 | 8 | 13 | 1 | 0 | 1 | 0 | 30
5. Process | 2 | 8 | 1 | 17 | 2 | 0 | 0 | 0 | 30
6. Analyse | 1 | 5 | 1 | 20 | 3 | 0 | 0 | 0 | 30
7. Disseminate | 4 | 11 | 11 | 2 | 0 | 1 | 1 | 0 | 30
8. Archive | 1 | 6 | 6 | 11 | 2 | 0 | 2 | 0 | 28
9. Evaluate | 1 | 10 | 0 | 12 | 2 | 0 | 1 | 1 | 27


 

2.2.    Process Metadata

The main objective of the survey for process metadata was to measure the extent to which the production and provision of process metadata within the ESS are feasible. Hence, within the framework of the feasibility study the following issues were investigated:

       The extent of current and future availability of process metadata in the ESS for each phase of the Generic Statistical Business Process Model (GSBPM).

       The content of process metadata. The type of information that is currently provided or is feasible to be produced in the future.

       The identification of the phases of the statistical business process for which better documentation is necessary.

       The degree to which the existing IT infrastructure of the NSIs currently supports the production, storage and dissemination of process metadata within specific phases of the statistical production lifecycle.

 

Current availability of process metadata

 

Process metadata concern the description of the statistical business process. The main reason for investigating the availability of process metadata for each phase of GSBPM is that the structure of the model is considered as the proper basis for the compilation of process metadata.

The results of the analysis indicate very clearly that process metadata are currently available within each phase of GSBPM, but the extent of availability differs among the 9 phases (See Figure 2.8).

Phase “7.Disseminate” is the one for which process metadata are most often fully available in NSIs (61%). Furthermore, in 32% of the NSIs, information is partially available for this phase.

The same cumulative share (93%) of NSIs with fully or partially available process metadata is found for phases "4.Collect" and "5.Process", but the share of NSIs where the information is fully available is lower for these two phases than for phase "7.Disseminate" (52% and 42% respectively).

Among all phases, the availability of metadata is lowest for phases “8.Archive” and “9.Evaluate”. More concretely, 11/30 NSIs (37%) do not have any metadata for these two phases.

Concerning the remaining phases ("1.Specify Needs", "2.Design", "3.Build" and "6.Analyse"), process metadata are partially available in about half of the NSIs.


Figure 2.8: Availability of process metadata in each phase of GSBPM


Future availability of process metadata

 

The feasibility of providing process metadata in the future was investigated for the subset of NSIs that do not currently provide process metadata for the different phases of GSBPM. The results of this analysis are provided in Table 2.9.

According to this table, phases "1.Specify needs", "8.Archive" and "9.Evaluate" are those where the future collection of information seems most feasible.

To a lesser extent, information concerning phase "3.Build" could also be partially collected.

By contrast, it appears difficult for NSIs which do not collect any process metadata on phase "2.Design" to obtain more information in the future.

Finally, there is uncertainty about the possible future collection of metadata related to phase "6.Analyse", where 2 out of the 4 NSIs replying to this question could not say whether any improvement is possible.

 

Table 2.9: Feasibility to collect currently non-provided process metadata

Feasibility to collect currently non available process metadata

Phases of GSBPM | Feasible | Partially feasible | Not feasible | I don't know | Total
1. Specify needs | 3 | 2 | 3 | 2 | 10
2. Design | 0 | 1 | 3 | 0 | 4
3. Build | 0 | 4 | 1 | 0 | 5
4. Collect | 0 | 1 | 1 | 0 | 2
5. Process | 0 | 1 | 1 | 0 | 2
6. Analyse | 0 | 1 | 1 | 2 | 4
7. Disseminate | 0 | 1 | 1 | 0 | 2
8. Archive | 3 | 2 | 2 | 0 | 7
9. Evaluate | 3 | 4 | 2 | 1 | 10

 


Types of process metadata currently available

 

Apart from the extent of availability, the content of the process metadata currently collected by the NSIs within each phase of GSBPM was also investigated. NSIs were therefore asked to indicate to which of the following categories the metadata that they currently collect belong.

The proposed categories of process metadata were:

1)       Methodological process metadata: Describe the methodological tools and standards used along a particular statistical production process.

2)       Technical process metadata: Describe the workflow, IT tools and staff activities at each step of the production cycle.

3)       Process quality metadata: Describe the quality of the statistical output and the underlying statistical production process.

 

The distribution of the available types of metadata is provided in Table 2.8.

According to the replies, every possible combination of the 3 types of process metadata is currently available somewhere in the ESS.

The analysis revealed that methodological process metadata are the most common type of process metadata currently collected in the ESS. They are the predominant type of metadata collected in most of the phases of the GSBPM. This is particularly the case for phases "2.Design", "5.Process" and "6.Analyse", where their availability is often combined with process quality metadata.

This latter type of process metadata is also collected for phases "4.Collect" and "7.Disseminate", and is found mainly in phase "9.Evaluate", where 8 out of 11 NSIs mentioned it as a type of available metadata.

In phases "4.Collect" and "8.Archive", the three proposed types of metadata are generally available to more or less the same extent.

Finally, it should also be noted that the proposed field "other type" was often chosen by NSIs when responding to this question (See footnote of Table 2.8).


Table 2.8: Types of process metadata

Total Number of NSIs by type of process metadata for each phase of GSBPM

Phases of GSBPM | Methodological + Technical + Process quality | Methodological + Process quality | Methodological | Process quality | Methodological + Technical | Technical + Process quality | Technical | Other Types [5] | Total
1. Specify needs | 1 | 1 | 9 | 2 | 0 | 0 | 0 | 7 | 20
2. Design | 2 | 6 | 6 | 2 | 2 | 0 | 1 | 6 | 25
3. Build | 2 | 3 | 2 | 0 | 1 | 0 | 5 | 10 | 23
4. Collect | 3 | 4 | 4 | 3 | 2 | 0 | 2 | 8 | 26
5. Process | 2 | 6 | 4 | 2 | 3 | 1 | 0 | 9 | 27
6. Analyse | 2 | 6 | 4 | 2 | 1 | 0 | 0 | 7 | 22
7. Disseminate | 2 | 3 | 4 | 4 | 0 | 1 | 1 | 10 | 25
8. Archive | 1 | 2 | 2 | 1 | 0 | 0 | 5 | 5 | 16
9. Evaluate | 1 | 2 | 3 | 5 | 0 | 0 | 0 | 0 | 11


Improvement of documentation of statistical business processes

 

Another aspect of the questionnaire survey concerned the identification of the phases of the statistical business process for which better documentation is considered necessary. Table 2.9 indicates that, for every phase of the statistical business process, more than half of the NSIs that replied to this question (24 replies in total) consider better documentation to be necessary.

Phase "2.Design" is the one for which the largest share of NSIs would require more documentation (83%), followed by phases "9.Evaluate" and "6.Analyse" (79% and 75% respectively).

 

Table 2.9: Need for better documentation

Phases of statistical business process | % of Total respondents [6]
1. Specify needs | 71
2. Design | 83
3. Build | 67
4. Collect | 63
5. Process | 71
6. Analyse | 75
7. Disseminate | 54
8. Archive | 67
9. Evaluate | 79

 


IT applications in the production, storage and dissemination of process metadata

 

In the survey, the use of dedicated IT applications in the production, storage and dissemination of process related metadata was investigated for the phases 4 to 7 of the GSBPM.

 

The results from the 29 replies received are available in Table 2.10.

They show that dedicated IT applications are mainly used for the production and the storage of process metadata collected within phases "4.Collect" and "5.Process" (between 15 and 19 NSIs in the four cases).

Phase "6.Analyse" is the phase where dedicated IT applications are least often used, whether for the production, storage or dissemination of process related metadata (fewer than half of the NSIs concerned).

Logically, phase "7.Disseminate" is the one where dedicated IT applications are most often used for the dissemination of process related metadata (16/29). However, the use of IT applications for the production and storage of process related metadata within this phase concerns a similar number of NSIs (15/29).
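The comparison with "half of the NSIs" can be checked against the counts in Table 2.10. The following Python sketch is a minimal illustration, assuming the 29 replies as the common denominator for all cells; the names `TABLE_2_10`, `below_half` and `phases_below_half` are hypothetical.

```python
RESPONDENTS = 29  # replies received for this question

# Number of NSIs with a dedicated IT application, per Table 2.10, in the order:
# (production, storage, dissemination) of process related metadata.
TABLE_2_10 = {
    "4. Collect": (19, 16, 9),
    "5. Process": (15, 17, 10),
    "6. Analyse": (8, 10, 6),
    "7. Disseminate": (15, 15, 16),
}


def below_half(counts, total=RESPONDENTS):
    """True if every count is strictly below half of the respondents."""
    return all(2 * count < total for count in counts)


# Phases where production, storage and dissemination are all below half.
phases_below_half = [
    phase for phase, counts in TABLE_2_10.items() if below_half(counts)
]
print(phases_below_half)
```

Comparing `2 * count` with the total avoids fractional thresholds when the number of respondents is odd.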

 

Table 2.10: The use of dedicated IT applications

Dedicated IT Application(s) for:

Phases of statistical business process | The production of process related metadata | The storage of process related metadata | The dissemination of process related metadata
4. Collect | 19 | 16 | 9
5. Process | 15 | 17 | 10
6. Analyse | 8 | 10 | 6
7. Disseminate | 15 | 15 | 16

 

 


[1] Iceland is a member of EFTA and has also been a candidate country since 2009. In the analysis it is considered an EU candidate country. Hence, in the 3rd phase of the assessment analysis the population of monitored candidate countries was enlarged. The set of candidate countries is composed of Croatia, Turkey and Iceland.

[2] The Total concerns the 15 NSIs which use a model other than GSBPM.

[3] The Total concerns the 13 NSIs which do not use a model/standard. One country selected two answers, therefore the sum of the Total column equals 14.

[4] The Total concerns the 13 NSIs which do not use a model/standard.

[5] Other refers to very analytical descriptions of process metadata that cannot directly be classified into one or more of the main types (Methodological, Technical, Process Quality).

[6] In total, 24 NSIs provided information on the need to improve the documentation within each phase.