|Statistics South Africa||2. Modelling the Information and Processes of a Statistical Organization (Statistics South Africa)|
1.2 Current situation
High-Level Organisational Structure
Number of Staff: ± 2,000
Figure 1: Stats SA Organization Chart
The Data Management and Information Delivery (DMID) project (magenta shaded box) is located within the Data Management and Technology Division (DMT).
The yellow shaded boxes indicate the ongoing projects that are concurrent with the DMID project.
The following chart shows how the DMID project is structured:
Figure 2: The DMID Project Structure
The following chart shows how the DMID project is structured, including the supplier's resources:
Figure 3: DMID Organization Chart
Prescient Business Technologies (PBT) - the supplier to the DMID project, which is developing the ESDMF system.
ESDMF - End to end Statistical Data Management Facility.
PM - Project Manager.
Number of staff:
- Stats SA: 1 Project Manager, 1 Technical Lead/Project Manager, 7 Developers, 1 Chief Standards Officer, 6 Data Quality officers/specialists, 3 Methodologists, 1 Systems Analyst, DBA (as needed), Network Support Technician (as needed) (20 total - excluding "as needed")
- PBT: 1 Project Manager, 1 Technical Lead (50%), 1 Architect, 1 Business Analyst (50%), 1 Release Coordinator, 1 Trainer, 1 Organisational Change Management Lead, 3 Developers, 1 Account Manager (10 total in full-time equivalents, counting the two 50% roles as half each)
Statistics South Africa's development of the metadata management system has its origins in the organisation's requirement to develop a data warehouse. The idea of a data warehouse came about because the organisation wanted to improve the quality of the statistics it produced. It was believed that the data warehouse would play a major role in positioning the organisation within its vision of becoming the "preferred supplier of quality statistics". To begin our data warehouse initiative, we paid exploratory visits to various statistical organisations that had embarked on data warehouse developments in order to learn from their experiences. These visits taught us a number of things about the complexities, difficulties and peculiarities of developing a data warehouse. In particular, our visit to the Australian Bureau of Statistics showed us that for a data warehouse to have any chance of succeeding in a statistical organisation, it needs a strong foundation of standards and policies that govern the statistical production processes. Standardisation of concepts and their definitions, as well as classifications of the terms of the actual survey process, were all found to be necessary for the production of quality statistics. To be successful, a data warehouse also needs to operate in this environment.
A formal process for standardization was developed through consultation with standards experts. A standards development and implementation lifecycle was also developed to monitor the standardization process. The following is the standards development lifecycle.
Figure 4: Standards Lifecycle
The next step for us was to investigate the strength of our standards and policy foundation. This investigation identified a number of gaps. Chief among these was the lack of standard metadata in the organisation. The need to standardise metadata necessitated the development of a metadata management system. However, this system had to fit well with all the other ingredients identified as necessary for the production of quality statistics.
Strategically, our metadata management system forms part of a larger system of applications called the End-to-end Statistical Data Management Facility (ESDMF). As an end-to-end system, the ESDMF will consist of tools and applications to support the whole statistical production process. Within this facility there is a metadata subsystem (see Figure 5), which plays a central role because the ESDMF was conceived to be metadata-driven. In a statistical organisation, a metadata-driven system is inevitable because metadata is used and generated at every stage of the statistical production process.
Figure 5: Conceptual components of the ESDMF
As a data factory, a statistical organisation needs to organise and package data in ways that make it useful to the end user. The data produced must also meet certain minimum quality standards. Metadata serve both requirements. In packaging its data and statistical products, a statistical organisation must ensure that they are accompanied by metadata, so that users can analyse and interpret them easily. Metadata also play a key role in ensuring that the end products of this data factory are of good quality. Such metadata include descriptions of the concepts used in the organisation, classifications of these concepts, methodologies and business rules.
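As a rough illustration of the idea of packaging a product together with its metadata, the following Python sketch bundles a dataset with the four kinds of metadata named above and refuses to call it release-ready while any of them is empty. All class names, field names and the release check are hypothetical and chosen for illustration only; they do not describe the actual ESDMF design.

```python
from dataclasses import dataclass, field


@dataclass
class Metadata:
    """Hypothetical container for the metadata kinds named in the text."""
    concepts: dict[str, str] = field(default_factory=dict)  # concept -> definition
    classifications: dict[str, list[str]] = field(default_factory=dict)
    methodology: str = ""
    business_rules: list[str] = field(default_factory=list)

    def missing(self) -> list[str]:
        """Return the names of metadata parts that are still empty."""
        return [name for name, value in vars(self).items() if not value]


@dataclass
class StatisticalProduct:
    """A dataset packaged together with its metadata (illustrative only)."""
    title: str
    rows: list[dict]
    metadata: Metadata

    def ready_for_release(self) -> bool:
        # Assumed rule: a product is releasable only with complete metadata.
        return not self.metadata.missing()


# Example with deliberately incomplete metadata (all values are made up).
product = StatisticalProduct(
    title="Example survey release",
    rows=[{"region": "A", "value": 10}],
    metadata=Metadata(concepts={"household": "A hypothetical definition."}),
)
print(product.metadata.missing())
print(product.ready_for_release())
```

In this sketch the completeness check is the only quality gate; a real system would presumably also validate the metadata against organisation-wide standards.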
The development of a metadata management system was informed by the following principles:
- Maintenance of trust in official statistics: Descriptions of data collection methods, data processing, and storage need to form part of how statistical data are presented to the end user. Presented in this way, statistical data and products engender trust among users.
- Facilitation of correct interpretation of statistical data: Metadata accompany datasets and other statistical products so that users can interpret them correctly.
- Quality of statistics: Standard metadata contributes to the improvement of a number of quality dimensions. Standardisation of concepts and their definitions and classifications are essential ingredients of standardized metadata.
Programme Providing Frame for Stats SA Projects
The work of all Stats SA components is mapped out in the organisation's Work Programme. Organisational units must support the following strategic themes to advance the work of the organisation:
- Providing Relevant Statistical Information to meet user Needs
- Enhancing the Quality of Products and Services
- Developing and Promoting Statistical Coordination and Partnerships
- Building Human Capacity
This project is aimed at supporting the strategic theme "Enhancing the Quality of Products and Services". Within the DMID project, the metadata management system, more than any other component, addresses this strategic theme.
Overall Project Objective
Statistics South Africa's metadata management system therefore forms part of the organisation's broader objective to continuously improve the quality of its products. As the driver of the overall facility, the metadata management system is the first deliverable of the DMID project. The metadata management system is also divided into smaller logical units based on the organisation's classification of its metadata. Survey metadata, consisting of elements that provide the overall description of a statistical survey, is the first of these metadata deliverables. The survey metadata component is fashioned along the lines of Statistics Canada's Integrated Metadata Database (IMDB) Metastat.
Following the survey metadata component will be the definitional metadata component. This will incorporate into the metadata management system the standardised organisation-wide concepts and their definitions and classifications as well as other components that form part of definitional metadata.
The supplier had a difficult time understanding the business of Stats SA, namely its statistical production processes. In addition, the goal of the project is to improve quality, in support of the Stats SA vision "to be the preferred supplier of quality statistics". Even in the face of this vision, the supplier failed to recognize that quality was a primary business objective.
Under pressure to meet the deliverables, the supplier ignored the Skills Transfer Plan, with the result that the Stats SA developers were not involved in the final design and development of the system.
For a project of this magnitude (three years), we decided to break down the deliverables into twelve phases, each planned to be three months long. Each phase was also planned to be a complete deliverable in its own right, even though each subsequent phase was planned to build on the previous ones. The first phase was delivered late, mainly because of the supplier's lack of understanding. The key lesson is that a clear understanding of the requirements is essential to meeting the deliverables and their milestones.
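The phase arithmetic described here (three years split into twelve phases of three months each) can be checked with a short Python sketch. The `Phase` class and `plan_phases` function are hypothetical names introduced for illustration; months are counted as offsets from the project start, since no calendar dates are given.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    number: int       # 1-based phase number
    start_month: int  # 1-based month offset from project start
    end_month: int    # inclusive


def plan_phases(total_months: int = 36, phase_months: int = 3) -> list[Phase]:
    """Split a project into equal, consecutive phases."""
    if total_months % phase_months != 0:
        raise ValueError("total_months must be a multiple of phase_months")
    count = total_months // phase_months
    return [
        Phase(n + 1, n * phase_months + 1, (n + 1) * phase_months)
        for n in range(count)
    ]


phases = plan_phases()
print(len(phases))   # number of phases
print(phases[0])     # first phase
print(phases[-1])    # last phase
```

With the defaults, the plan covers months 1 to 36 with no gaps or overlaps, which matches the twelve three-month phases described above.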