An effective SMS can provide the following benefits to all users of statistical metadata:
- Better quality statistical information;
- Improved interpretability of statistics;
- Improved quality of metadata;
- Better discovery, retrieval and exchange of data and metadata;
- Common terminology, names and descriptions for standard metadata elements to improve communication;
- Improved efficiency through central metadata repositories that are organized to facilitate reuse of existing data;
- Improved knowledge of metadata flows.
An effective SMS allows organizations to be flexible and responsive to rapidly evolving requirements for statistical information. Statistical organizations can interoperate more effectively with their respondents, with data consumers and with other agencies that form part of the larger statistical system. Well-maintained metadata allow the organization to operate in a more transparent and quality-assured manner. The SMS supports more effective change management processes, reducing risks to business continuity and the barriers to business process improvement.
Statistical organizations and other stakeholders can benefit when metadata exist throughout the entire statistical production process, rather than being captive to a particular statistical processing system or infrastructure package.
Users within a statistical organization
In addition to the general benefits described above, the SMS provides senior managers with a tool to facilitate the design, planning, decision-making and evaluation processes of statistical information systems, and enables other strategic initiatives to be designed and implemented more effectively, efficiently and confidently. An SMS should provide the tools for answering questions such as: To what extent do users actually use the statistical outputs? Are they satisfied with the quality of data and metadata? Are there complaints or unmet demands from users or respondents?
Answering these questions facilitates the administrative management of a statistical system. Last but not least, senior managers of statistical information systems will be able to verify the costs and benefits of individual statistical activities.
To achieve these benefits, metadata about the following will be needed:
- End-user needs and stakeholder requirements on a national and international level;
- Available statistical services;
- External information systems related to statistical information systems;
- Suppliers and sources of data in statistical information systems;
- Statistical production process;
- Statistical publications, release calendar, copyright and other dissemination issues;
- Responsibilities inside the statistical organization;
- Costs and revenues.
Information system designers
These people are responsible for the design, implementation, maintenance and evaluation of statistical systems. When designing new systems, they need access to metadata from existing systems, either within or outside the organization, to inform the design, development and implementation of a new system. For existing systems under their responsibility, they need feedback about performance, costs, usage, and user satisfaction.
The following information is required when designing and developing an SMS:
- How similar systems have been designed in the past;
- What observation data are already available;
- How these data can be obtained;
- What methods, tools and software components are available and how they can be used.
For maintenance and evaluation of an existing SMS, the following information will be needed:
- Detailed, up-to-date documentation of the system;
- Feedback, both formal and informal, concerning production and usage of the system;
- Experiences from similar systems;
- Knowledge about methods, tools and software components;
- Special evaluation studies performed on an ad hoc basis.
Subject-matter statisticians
The subject-matter statistician is the expert in a particular field of statistics within a statistical organization. Subject-matter statisticians have the crucial role of understanding the users' information requirements, in the context of the users' policy and program environment, as well as the capabilities of their statistical office, i.e. what it can do to provide the required information. Subject-matter staff work with other specialists to design and construct an appropriate collection mechanism and produce statistics. The statistician then has the role of communicating the information to users through the creation of statistical products and the provision of associated metadata to assist users in understanding the results. Evaluation is also an important responsibility for the subject-matter specialist.
Given these roles, the SMS is a knowledge management system for the subject-matter statistician. In this information system they would want to be able to create, update, search, browse and retrieve many different types of metadata entities that would cover many aspects, such as:
- user (customer) requirements;
- standard concepts, data elements and classifications;
- operational information and quality metrics about the operation of their survey system;
- documentation about statistical techniques (methodology) applied to their survey;
- products created from the statistical data.
The benefits of an SMS to the subject-matter statistician include:
- access to a consistent store of standard classifications, data elements and process engines that can be used in new statistical process development, with the knowledge that using these elements will assist greatly in ensuring statistical integration;
- tools and links that enable the subject-matter statistician to create statistical products for the organization with a common 'look and feel';
- a record of statistical collections, including all previous cycles, and a reference point to find information on related collections. This is an invaluable resource for new employees coming into a statistical field and for statisticians in other fields who might be researching a new collection - there may be elements in another statistical activity that can be reused;
- standard processes, such as the registration of new data elements, which would provide a common method for the creation and use of metadata;
- registration of who is making use of which metadata, and noting of any issues encountered with any of the existing metadata content. This facilitates effective information sharing between subject matter areas. It also facilitates change management for metadata which recognizes and supports the stakes held by each subject matter area.
Methodologists
An SMS creates a framework for the design and implementation of statistical tasks and surveys to meet obligations in the production of official statistics and the needs of end users. The SMS provides tools for safeguarding the integration of statistical information systems at the national and international level. It is an indispensable tool for the maintenance, use and further development of statistical classifications and nomenclatures, statistical registers, statistical standards, and knowledge about statistical methods and relevant research methods.
Methodologists require metadata relating to the following:
- Content of available statistical data (microdata, macrodata) and associated data concepts;
- Quality of statistical data (relevance, accuracy, timeliness, punctuality, accessibility, clarity, coherence and comparability);
- Existing statistical tasks and surveys (questionnaires, other sources, etc.);
- End users and their feedback;
- Requests of international organizations and related standards;
- Data sources and their links;
- Respondents' information systems;
- Administrative data;
- Information systems and their output databases (portals);
- Statistical registers (population, farms, etc.);
- Statistical classifications, nomenclatures and related international standards;
- Statistical population, statistical units, measurement units, time series;
- Statistical methods and relevant research projects.
Administrators of metadata content
The SMS should ensure smooth and systematic update and maintenance of statistical metadata. Maintenance of metadata content will be performed by subject-matter specialists, methodologists and standards/metadata specialists responsible for metadata content. Metadata should be updated once and in one place. This will help avoid inconsistencies and unnecessary redundancies. Updates to all the dimensions of the corporate metadata repository should be automated.
The administrator will need a user-friendly interface that requires no special technical skills. The system should assist in identifying stakeholders who will be impacted by any administrative action, and in assessing the impact of that action on their use of the metadata. This assists in change notification, stakeholder consultation and risk/impact remediation. The administrator will need the following metadata:
- Information related to the content of and links between statistical metadata;
- Information about organization of metadata in the corporate metadata repository;
- Metadata allowing discovery and retrieval;
- Updating methods and procedures.
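The principle of updating metadata once and in one place, together with the identification of impacted stakeholders described above, can be illustrated with a minimal sketch. It assumes a hypothetical in-memory registry in which each metadata item is stored once and subject-matter areas register their use of it. The names `MetadataRegistry`, `put`, `register_use` and `impacted_by`, as well as the item identifier and area names, are illustrative assumptions and are not taken from any particular SMS.

```python
from collections import defaultdict


class MetadataRegistry:
    """Hypothetical single store: each metadata item is held once."""

    def __init__(self):
        self._items = {}                 # item id -> current definition
        self._users = defaultdict(set)   # item id -> areas that use it

    def put(self, item_id, definition):
        """Create or update an item; return the areas to notify."""
        self._items[item_id] = definition
        return self.impacted_by(item_id)

    def get(self, item_id):
        """Return the single, current definition of an item."""
        return self._items[item_id]

    def register_use(self, item_id, area):
        """Record that a subject-matter area relies on an item."""
        if item_id not in self._items:
            raise KeyError(f"unknown metadata item: {item_id}")
        self._users[item_id].add(area)

    def impacted_by(self, item_id):
        """List the areas affected by a change to an item."""
        return sorted(self._users.get(item_id, ()))


registry = MetadataRegistry()
registry.put("CLS_ACTIVITY", "Economic activity classification, version 1")
registry.register_use("CLS_ACTIVITY", "Business statistics")
registry.register_use("CLS_ACTIVITY", "Labour statistics")

# A single update is visible to every user of the item, and the
# registry reports which areas should be consulted about the change.
to_notify = registry.put("CLS_ACTIVITY", "Economic activity classification, version 2")
print(to_notify)
```

Because every area reads the item through the same registry, there is no second copy to fall out of step. A real repository would also need versioning, access control and persistent storage, which this sketch leaves out.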
Technical administrators are responsible for the technical maintenance of the SMS. They should cooperate with designers, evaluators and content administrators in solving technological issues and in the further development of the SMS. The technical administrator will use, oversee and maintain the following metadata:
- Technical metadata related to the SMS, and to the links with production systems;
- Information and knowledge about technological aspects of statistical production;
- Information about technical links to other information systems;
- Information about tools and software used by content administrators.
Information technology specialists
People operating and monitoring the statistical production process are important metadata users. Ideally, the SMS supports tuning of statistical business processes. For example, the statistical impact of a particular process (or of choosing a particular threshold for that process, such as significance editing) can be assessed so that the practical "value added" of the process can be weighed against its costs in terms of resources and time. This is not so much tuning how the process works with metadata as using metadata to tune the process.
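As a rough illustration of using metadata to tune a process, the sketch below compares candidate thresholds for a significance-editing step. It assumes that process metadata from a past cycle record, for each unit, the significance score and the size of the correction found on review. The `EditRecord` structure, the cost per review and all values are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class EditRecord:
    score: float        # significance score assigned before review
    correction: float   # absolute change to the estimate found on review


def assess_threshold(records, threshold, cost_per_review):
    """Summarise cost and value added if only records scoring at or
    above the threshold are sent for manual review."""
    reviewed = [r for r in records if r.score >= threshold]
    total = sum(r.correction for r in records)
    captured = sum(r.correction for r in reviewed)
    return {
        "threshold": threshold,
        "reviewed": len(reviewed),
        "cost": len(reviewed) * cost_per_review,
        "share_captured": captured / total if total else 0.0,
    }


# Illustrative process metadata from a past cycle (made-up values).
history = [
    EditRecord(score=0.9, correction=50.0),
    EditRecord(score=0.7, correction=30.0),
    EditRecord(score=0.4, correction=15.0),
    EditRecord(score=0.2, correction=4.0),
    EditRecord(score=0.1, correction=1.0),
]

# Compare the cost and the share of total correction captured
# at several candidate thresholds.
for t in (0.0, 0.3, 0.6):
    print(assess_threshold(history, t, cost_per_review=2.0))
```

In this toy data, raising the threshold reduces the number of reviews while still capturing most of the total correction. That trade-off between cost and value added is what the process metadata are meant to expose.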
Metadata-driven statistical production creates favourable conditions for standardization and thus efficiency of statistical production systems. Metadata on the content of statistical data and associated concepts, including all other delimiting metadata (statistical classifications, statistical units, measurement units, time series, statistical population, etc.), are a key condition for the whole throughput of production phases (data collection, processing, analysis and dissemination). Technical metadata on the organization of the corporate metadata repository, and links to the production systems, belong to the metadata set needed for fulfilling functions of data processing. The Generic Statistical Business Process Model can be used as an organizing framework for metadata, as well as a means of benchmarking business processes.
Ideally, statistical production processes will generate metadata about their own performance, giving producers feedback about the functioning and efficiency of metadata-driven production. In this respect, producers should cooperate with SMS designers, subject-matter specialists, methodologists, and content and technical administrators on the design, implementation, evaluation and further development of the SMS.
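One possible way for a production step to generate metadata about its own performance is sketched below. It assumes a step is an ordinary function that takes and returns a list of records. The decorator `record_process_metadata`, the `process_log` list standing in for a process-metadata store, and the toy step `drop_missing` are all hypothetical.

```python
import time
from functools import wraps

process_log = []  # stand-in for a process-metadata store


def record_process_metadata(step_name):
    """Decorator that logs basic performance metadata for a step."""
    def decorator(func):
        @wraps(func)
        def wrapper(records, *args, **kwargs):
            started = time.perf_counter()
            result = func(records, *args, **kwargs)
            process_log.append({
                "step": step_name,
                "records_in": len(records),
                "records_out": len(result),
                "seconds": time.perf_counter() - started,
            })
            return result
        return wrapper
    return decorator


@record_process_metadata("drop_missing_values")
def drop_missing(records):
    """Toy processing step: remove records with no reported value."""
    return [r for r in records if r.get("value") is not None]


cleaned = drop_missing([{"value": 10}, {"value": None}, {"value": 7}])
print(process_log[-1])
```

The step itself is unchanged. The performance metadata are collected as a side effect of running it, so producers get feedback without maintaining separate documentation by hand.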
Users outside the statistical organization
Benefits for respondents and data suppliers
Respondents are important partners of any statistical information system. Statistical data suppliers are often also the users of statistical data. Their role is becoming more important with the growing number of systems and online communication possibilities. Bearing in mind the possibility of electronic data reporting from respondents' information systems to the statistical information system and the possibility of online access of respondents to the statistical information system, it is evident that the needs of respondents will change. The SMS will play a key role in those tasks.
There is a growing need to harmonize methodological definitions of data and related metadata from respondents and statistical information systems. Attention should be drawn to the implementation and use of relevant technological metadata standards. SDMX (Statistical Data and Metadata eXchange) has been developed specifically for the exchange of statistical data and metadata. The SDMX standards and guidelines aim at establishing a set of commonly recognised rules, adhered to by all players. This makes it possible to have easy access not only to statistical data but also to metadata, making the data more meaningful and usable. The standards will allow statistical organizations to fulfil their responsibilities towards users and partners, including international organizations, more efficiently, for example by using their online databases to give access as soon as the data are released.
Respondents and data suppliers will require the following information:
- Metadata related to the content (definitions, terminology) of statistical data in the input stage of the statistical production;
- Security and confidentiality of microdata;
- Feedback from statistical outputs;
- Information about the content of statistical warehouses;
- Knowledge about comparability of statistical and respondents' data/systems;
- Technical parameters for search and retrieval of metadata in the common metadata repository, and links to statistical warehouses;
- Knowledge about potential interface between statistical information systems and respondents' information systems;
- Relevant technological standards for metadata and data supply;
- Information about software and other tools supporting supply of data and metadata;
- Information about strategies for further SMS development;
- Training in use of the SMS.
Benefits for end users on the national level
Understanding and classifying different communities of end users could help in determining user requirements. The SMS will help users to better discover, understand, interpret and interrogate the data they need. The proliferation of statistical information has raised the issue of consistency and comparability of data. Comparability of data is desirable, but not always possible. It is important to know what the differences are and the reasons for them, with explanations according to differing levels of user understanding of statistical concepts. The SMS will also help to convey the credibility of statistical data and to recognize intellectual property rights.
It is important to monitor user feedback and to embrace the need for metadata in both directions. The SMS will offer the possibility to understand how users search and the terms that they use. The SMS will also support the management of access to microdata. The fact that users are increasingly requesting access to microdata calls for tools that allow concerns about confidentiality protection to be overcome.
With the spreading use of the Internet it is important to provide users with the appropriate information about the data available from statistical websites. However, there is a potential to flood users with too much metadata. Appropriate communication of metadata should be based on principles of 'cognitive psychology', recognizing the important role that presentation plays in metadata consumption.
An increase in the possibilities for syndication and reuse of data by external websites, such as online communities, web services and 'mashups', means that metadata need to be more closely, but flexibly, coupled with data in a way that both web services and people can use. The websites of statistical organizations may not be the main source for data, with users going to an increasing number of secondary providers of online statistical information.
This heterogeneity, together with more visible methodological differences and inconsistencies of statistics disseminated via the Internet, poses difficulties for the users. Clearly, there is a need for harmonization of metadata accompanying statistical information on the Internet. International standards should play an important role in this respect.
The following metadata are vital for end users of statistical metadata and data:
- Availability of statistical outputs;
- Metadata related to the statistical outputs (metadata and data concepts and definitions, classifications, aggregations, statistical and evaluation methods, terminology, history, etc.);
- Metadata about quality (e.g. explanatory notes);
- Access to microdata;
- Time series;
- Updating procedures;
- Statistical revisions;
- Responsibility for individual statistical outputs;
- Links to other information systems both national and international;
- Planned changes in statistical outputs;
- Content related standards, both national and international;
- Outcomes from statistical analysis of user feedback;
- Rules for searching, accessing and downloading statistical metadata and data from output databases;
- Technological standards relevant for extraction and transfer of data and metadata;
- Information about software and other tools supporting search, retrieval and downloading of metadata and data;
- User training possibilities;
- Metadata based services such as classification coders and metadata mappings that other producers and users of statistics can apply.
Benefits for international users
International users increasingly demand greater consistency when interacting with statistical organizations. In the case of international organizations, the metadata and data requirements (particularly concerning collection and exchange) have to be coordinated so as not to overburden countries with duplicate requests. In order to fulfil this task, better integration of metadata at the national and international level is needed.
A lot of metadata are available on websites of international organizations. Links could be inserted from the metadata of international organizations to more detailed metadata on national websites. Coordination of access could be achieved through a single gateway for data and metadata. To this end, joint hubs based on SDMX standards are at present under intensive development.
The needs of international users increasingly impact the architecture of the SMS of national statistical organizations. International collaboration and alignment should be driven as much by national statistical organizations - and their national interests - as by international organizations. Processing and storage power allows formats that are globally rather than locally optimised (e.g. XML) to be viable, opening up practical application of standards. This is coupled with movements and standard, open toolsets that promote "open source" and similar collaborative developments. There are unprecedented opportunities to collaborate. At the same time there are unprecedented rates of change in technologies, end user expectations, information needs and data sources, which require national statistical offices to collaborate in order to continue meeting user requirements.
Metadata needed by international users are similar to those needed by end users at the national level (see above). Furthermore, the following information would be required:
- Compliance with international standards (coherence, comparability, explanatory notes);
- Standards used for electronic metadata and data transfer;
- Information about other international and national users;
- Indication of needs for revision and/or standardization of statistical data and metadata concepts.