1. Broad description
Statistical dissemination at the Brazilian Institute of Geography and Statistics (IBGE) has traditionally been carried out in two ways: for the general public, through the media, supported by media releases and press conferences; and for general users, through printed and electronic publications. For more specialized users and government agencies, requirements are met through customized tables and public use microdata files.
IBGE has had a policy of free dissemination of all products through the Internet since 2001, and use of this communication channel has grown strongly.
In addition to the electronic publications, the IBGE web page contains two important databases: Aggregated Statistical Tables (SIDRA), a database with information grouped at territorial level that allows users to construct tables according to selected information; and the Multidimensional Statistical Database (BME), a database with microdata information that allows users to construct tables according to selected information and confidentiality constraints. BME requires an Internet subscription.
IBGE has been releasing public use microdata files for household statistics since the early 1990s. Measures taken to protect the confidentiality of these microdata include suppression of geographical detail. However, no public use microdata files are released for business data, for the 1996 Agricultural Census or for the short form of the 2000 Population Census.
Increasing demand, advances in technology and growing sensitivity to privacy issues have encouraged the development of arrangements that give researchers restricted access to data files that the statistical agency does not release to the general public. These arrangements permit a more in-depth analysis than is possible with aggregated tabular data. At IBGE, this is done through on-site access at the agency's headquarters.
This Case Study briefly summarizes the procedures that IBGE has had in place since 2003 to allow external researchers and analysts from government, academia and other organizations to access restricted data.
2. Why is it good practice?
Confidentiality is a key element of respondents' trust and therefore of their continued cooperation in providing accurate data. As a result, the policy for the release of data is to prevent disclosure of information about individual persons or businesses, consistent with the legislation supporting confidentiality at IBGE.
It is also essential, however, to try to meet the needs of the research community while maintaining confidentiality and security.
Providing restricted data access for analysis requires collaboration between all parties involved and preparation to deal with a variety of situations and questions.
Technical developments may allow new ways of meeting the needs of the research community while maintaining confidentiality and security.
3. Target audience
The target audience is researchers requiring special data access to information not available through the web site or public use data files.
4. Detailed description
The following describes the administrative and technical measures that regulate access to restricted microdata and ensure that output is released with an adequate level of protection, so that individual data cannot be disclosed. The procedures cover the following steps:
(1) submission of the project
The researcher submits the research project, which is evaluated to determine whether it is of public or academic interest, whether it is for statistical purposes and whether it is feasible.
(2) evaluation of the project
A Committee of Assessment of Restricted Data Access evaluates the project, based on submissions from the thematic area responsible for the survey microdata. The Committee authorizes or refuses access to internal data files under the appropriate conditions.
The Committee is chaired by the Deputy Director for Surveys and composed of senior staff members dealing with business, methodology and dissemination coordination.
(3) formal agreements to access
Once a project has been authorized, formal agreements between the researcher and the agency are established. These agreements involve a written contract (contractual arrangement), and an agreement form outlining the conditions of access and setting out fees for the proposed work.
(4) on-site access
The databases are installed on special computers in a room reserved for the researchers. The security features of the computers include a block on access to external networks to prevent transfer of data. Furthermore, the external disk drives and the serial and parallel ports are disabled. The identification of the enterprises is recoded in the databases, whether they come from IBGE business surveys or from external sources.
The researchers do the work, save the output on the hard disk of the special computer and then prepare a report document. IBGE staff prepare a CD-ROM with this information, to be analysed by the thematic survey area.
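The recoding of enterprise identifiers mentioned under on-site access can be illustrated with a short sketch. The text does not say how IBGE recodes identifiers, so everything here is an assumption made for illustration: the keyed-hash (HMAC-SHA256) approach, the function names `recode_identifier` and `recode_records`, the field name `enterprise_id`, the key and the sample records are all hypothetical. The point is only that the same enterprise receives the same code across files, so records can still be followed over time, while the original identifier never reaches the researcher's workstation.

```python
import hashlib
import hmac


def recode_identifier(enterprise_id: str, secret_key: bytes, length: int = 16) -> str:
    """Return a stable pseudonym for an enterprise identifier.

    Assumption: a keyed hash is used so that the mapping cannot be
    reproduced by anyone who does not hold the agency's secret key.
    """
    digest = hmac.new(secret_key, enterprise_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:length]


def recode_records(records, secret_key, id_field="enterprise_id"):
    """Return copies of the records with the identifier field replaced."""
    return [
        {**rec, id_field: recode_identifier(rec[id_field], secret_key)}
        for rec in records
    ]


if __name__ == "__main__":
    key = b"held-by-the-agency-only"  # hypothetical key, never given to researchers
    sample = [  # hypothetical records, not real survey data
        {"enterprise_id": "ENT-0001", "year": 2001, "turnover": 1200},
        {"enterprise_id": "ENT-0001", "year": 2002, "turnover": 1350},
        {"enterprise_id": "ENT-0002", "year": 2001, "turnover": 800},
    ]
    for row in recode_records(sample, key):
        print(row)
```

In this sketch the two records for `ENT-0001` receive the same recoded value, which keeps longitudinal analysis possible. Recoding alone does not remove disclosure risk, which is why the output is still reviewed before release.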
(5) evaluation of output
The statistical output must be analysed before its release to the researcher, so that disclosure risks are technically assessed and confidentiality requirements are met. The analysis is undertaken by the thematic area responsible for the survey microdata, the same area that made submissions to the Committee for its decision.
Once the output of the project has been approved, that is, once the thematic area judges that there is no risk of disclosure, another formal agreement is established. This new agreement outlines the conditions of use of the data generated through the special access: the user has to recognize that the data are the property of IBGE and has to acknowledge this special access when releasing results and analysis involving these data.
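The output evaluation in step (5) can be sketched in code. The text does not state which disclosure rules the thematic area applies, so the rule below is an assumption: a minimum-frequency (threshold) rule, a commonly used check in statistical disclosure control, under which a table cell based on very few respondents is flagged. The threshold of 3, the function names and the example cell labels are hypothetical and are not taken from IBGE practice.

```python
from typing import Dict, List

MIN_CONTRIBUTORS = 3  # assumed threshold for illustration, not an IBGE rule


def find_risky_cells(cell_counts: Dict[str, int],
                     min_contributors: int = MIN_CONTRIBUTORS) -> List[str]:
    """Return labels of non-empty cells built from too few respondents."""
    return sorted(
        label
        for label, n_respondents in cell_counts.items()
        if 0 < n_respondents < min_contributors
    )


def review_output(cell_counts: Dict[str, int],
                  min_contributors: int = MIN_CONTRIBUTORS) -> dict:
    """Summarize the check: approved only if no cell is flagged."""
    risky = find_risky_cells(cell_counts, min_contributors)
    return {"approved": not risky, "risky_cells": risky}


if __name__ == "__main__":
    # Hypothetical number of responding enterprises behind each table cell.
    counts = {"Region A / Sector 1": 25, "Region A / Sector 2": 2, "Region B / Sector 1": 0}
    print(review_output(counts))
```

A check of this kind would only support the reviewer. The judgement that there is no risk of disclosure remains with the thematic area, which knows the survey and its respondents.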
Table 1 shows the number of projects analysed by the Committee from September 2003 to February 2006. Of the 37 projects analysed, 3 involved data from the long form of the 2000 Population Census. In these cases, the researchers needed geographical areas different from the weighting areas used in the sample weighting process. One project involved data from an annual trade survey; one from an annual services survey; 30 from manufacturing surveys; and 2 involved data from manufacturing, trade or services surveys simultaneously.
Table 1 - Number of Projects Analysed by the Committee
(September 2003 - February 2006)
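|Thematic area||Number of projects|
|2000 Population Census (long form)||3|
|Annual trade survey||1|
|Annual services survey||1|
|Manufacturing surveys||30|
|At least 2 businesses surveys||2|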
5. Supporting legislation
IBGE established the regulations for the provision of restricted data access through the following instruments:
- Resolution of the Board of Directors, n. 7, of May 29, 2003 - that created the Committee of Assessment of Restricted Data Access.
- Regulation of the Chief Statistician, n. 485, of July 8, 2003 - that appointed the members of the Committee.
- Regulation of the General Coordinator of the Centre for Documentation and Dissemination of Information, n.1, of September 10, 2003 - that established the objectives of the rooms for use in the on-site restricted access.
The on-site arrangement provides a secure way of giving researchers access to IBGE data for projects that are of clear statistical or academic benefit.
Although about 40 projects have used this on-site system at IBGE since 2003, we have faced a number of difficulties. It has been:
- time-consuming to analyse projects because, in many cases, there is a need to contact the proponents to redesign the project or to provide detailed explanations of why the project is not feasible;
- time-consuming to prepare user-friendly documentation;
- time-consuming to analyse the outputs due to faults in the documentation.
In general, the expected work time is underestimated.
Another issue involves managing the tension between the agency and the researchers in regard to the acceptability of the current practice. The culture and value system of the research community is very different from that of a National Statistical Office.
Researchers still think of microdata access arrangements as unnecessary bureaucracy, too limiting and inconvenient. This lack of convenience for the researcher includes the requirement to work at the agency. That can be an expensive option, especially for researchers living in other cities or countries. Another point is that sometimes the researcher is forced to use unfamiliar data analysis software.
There is an internal debate about the acceptability of this practice. Even with measures in place to regulate access to restricted microdata, there is a worry that the practice could still alarm public opinion and raise suspicion of disclosure. Such a reaction among respondents would have some impact on response rates.
Increasingly, researchers are looking to link other data sets with the data sets of the agency. Although matching of databases brings benefits, it also increases identification risks.
There are some issues concerning transparency. The IBGE web site would be an effective way to inform researchers about how to obtain access. However, information about the procedures is provided only through the Intranet, and users learn about the procedures only when they ask for special data.
Therefore, it is a challenge for us to be transparent about the arrangements for providing researchers with access to data under controlled conditions and for specific purposes. The visibility of such arrangements is nevertheless necessary to increase public confidence that microdata will be used properly. We want to be completely transparent about the specific uses of microdata, to avoid suspicion of misuse and to ensure that researchers are aware of the consequences for them and their institutions if there are breaches of confidentiality. On the other hand, there is a fear of increasing demand excessively.
There is demand for on-site access rooms outside the headquarters of the agency, especially in big cities such as São Paulo and Brasília. Meeting this demand, however, requires investment in resources to train staff and prepare the infrastructure.
IBGE (2003), Resolução do Conselho Diretor nº 7, de 29.05.2003. (Resolution of the Board of Directors of IBGE, n. 7, of May 29, 2003 - that created the Committee of Assessment of Restricted Data Access).
IBGE (2003), Portaria do Presidente nº 485, de 08.07.2003. (Regulation of the IBGE's Chief Statistician, n. 485, of July 8, 2003 - that appointed the members of the Committee).
IBGE (2003), Norma de Serviço CDDI nº 1, de 10.09.2003. (Regulation of the General Coordinator of the IBGE's Centre for Documentation and Dissemination of Information, n. 1, of September 10, 2003 - that established the objectives of the rooms for use in the on-site restricted access).
Lei nº 5534, de 14 de novembro de 1968. Brasília, Diário Oficial da União. (Law 5534 of November 14, 1968. Law on the obligatory character of providing statistical data and confidentiality).
30 Aug 2013