Statistical Disclosure Control: Techniques to protect confidential information
10am, A54 PSC
Abstract: Statistical agencies collect data, process it, and publish summary statistics, mainly in the form of tables. In doing so, an agency must provide data users with useful statistical information while ensuring that the responses from individuals remain protected. The two most popular techniques for achieving this are “cell suppression” and “controlled rounding”. Applying either technique effectively requires solving complex combinatorial optimization problems, and for large data sets sophisticated algorithmic approaches are needed to find optimal or near-optimal solutions. The research that we have conducted on cell suppression was supported by EUROSTAT through the IV European Union framework (IST-2000-25069, “Computational Aspects of Statistical Confidentiality”, 2000-2002). The research on controlled rounding was supported by the U.K. statistical agency through the ONS Contract IT 03/0763 (2003-2005). The methods developed in both projects have been incorporated into a software package called tau-ARGUS, which is now in use by many statistical agencies around the world.
(a) J.J. Salazar González, “A Unified Mathematical Programming Framework for different Statistical Disclosure Limitation Methods”, Operations Research 53/5 (2005) 819-829.
(b) J.J. Salazar González, “Controlled Rounding and Cell Perturbation: Statistical Disclosure Limitation Methods for Tabular Data”, Mathematical Programming 105 (2006) 583-603.
(c) J.J. Salazar González, “Statistical Confidentiality: Optimization Techniques to Protect Tables”, Computers & Operations Research 35 (2008) 1638-1651.
(d) G.T. Duncan, M. Elliot, J.J. Salazar González, “Statistical Confidentiality: Principles and Practice”, Springer (2011).
About the speaker: Juan-Jose Salazar-Gonzalez is a full professor in the Department of Statistics and Operational Research at the University of La Laguna, Tenerife. He is known internationally for his work in two distinct research areas: optimisation problems arising in transportation and logistics, and methods for statistical disclosure control. He is currently involved in a large FP7-funded project called “Data without Boundaries”.