Big Data Home




Country/Organization name

Istat (Italy)

Contact person for this case study

Giulio Barcaroli




Type of Big Data used


Project description

This case study refers to the "Survey on information and communication technology in enterprises". The aim is to replicate the information content of the questionnaire sections that relate to the websites indicated by a subset of responding enterprises, using the Internet as a Data Source (IAD). The content of each website is analysed and modelled with a machine learning approach. The techniques that will be used are: (i) web scraping; (ii) text mining.
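As a rough illustration of the web scraping step, the sketch below extracts the visible text of a page and splits it into lower-case word tokens, using only the Python standard library. It is not the tool used in the project, which had not been chosen at the time of writing. The names VisibleTextParser, extract_text and tokenize are hypothetical, and the HTML string is a made-up example; a real page would first have to be downloaded, for instance with urllib.request.

```python
import re
from html.parser import HTMLParser


class VisibleTextParser(HTMLParser):
    """Collects page text, ignoring the contents of <script> and <style>."""

    SKIPPED_TAGS = {"script", "style"}

    def __init__(self):
        super().__init__()
        self._skip_depth = 0  # > 0 while inside a skipped element
        self.chunks = []      # pieces of visible text, in document order

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIPPED_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self._skip_depth == 0 and text:
            self.chunks.append(text)


def extract_text(html):
    """Return the visible text of an HTML document as one string."""
    parser = VisibleTextParser()
    parser.feed(html)
    parser.close()
    return " ".join(parser.chunks)


def tokenize(text):
    """Split text into lower-case alphabetic tokens (ASCII letters only)."""
    return re.findall(r"[a-z]+", text.lower())


# Made-up page content, standing in for a downloaded enterprise website.
sample_html = (
    "<html><head><style>p{}</style><script>var x = 1;</script></head>"
    "<body><h1>Online Shop</h1><p>Add to cart</p></body></html>"
)
tokens = tokenize(extract_text(sample_html))
```

The resulting token list is the kind of input a text mining step could work on. The tokenizer keeps ASCII letters only, which is a simplification; Italian websites would need accented characters to be handled as well.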


National or international scope of data source


Public/private source

Private sector

Data access framework

The data are open for public access

Payment for data


Data access

Raw (micro)data are transferred to the statistical organization for processing

Domain and use

Information society

Degree of progress in use of the Big Data source


Tools and methods for processing

Different tools are being evaluated in order to choose the best one for each of the following tasks:
1. web scraping
2. text mining
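To make the text mining task more concrete, the sketch below trains a very small multinomial Naive Bayes classifier on tokenized website text to predict a yes/no questionnaire item. The choice of Naive Bayes is an assumption made for illustration only, since the tools and methods were still under evaluation. The function names, the toy documents and the example item (whether a website offers online ordering) are all hypothetical.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, labels):
    """Count documents per class and word occurrences per class."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for tokens, label in zip(docs, labels):
        word_counts[label].update(tokens)
        vocabulary.update(tokens)
    return class_counts, word_counts, vocabulary


def predict(model, tokens):
    """Return the class with the highest log-posterior for the tokens."""
    class_counts, word_counts, vocabulary = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in class_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        total_words = sum(word_counts[label].values())
        for token in tokens:
            if token not in vocabulary:
                continue  # ignore words never seen in training
            # Laplace (add-one) smoothing avoids zero probabilities.
            numerator = word_counts[label][token] + 1
            denominator = total_words + len(vocabulary)
            score += math.log(numerator / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Toy training data: tokens from websites whose survey answer is known.
train_docs = [
    ["add", "to", "cart", "checkout"],
    ["shop", "cart", "order", "online"],
    ["about", "us", "contact"],
    ["company", "history", "contact"],
]
train_labels = ["yes", "yes", "no", "no"]  # offers online ordering?

model = train_naive_bayes(train_docs, train_labels)
answer_a = predict(model, ["cart", "order"])
answer_b = predict(model, ["contact", "history"])
```

In this setup the survey responses act as training labels, and the fitted model is applied to the scraped text of websites whose answers are unknown or need to be checked. A real evaluation would use far more data and a held-out test set.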

Privacy and confidentiality issues

For the moment, no confidentiality issue is evident, as the scraped websites are open to the public.

Links and attachments

No files have been attached to this page.



