Analysing Sensitive Data from Dynamically-Generated Overlapping Contingency Tables

Published in Journal of Official Statistics, 2020

Recommended citation: Bon, J.J., Baffour, B., Spallek, M., & Haynes, M. (2020). "Analysing Sensitive Data from Dynamically-Generated Overlapping Contingency Tables." Journal of Official Statistics, Volume 36, Issue 2, Pages 275-296. https://doi.org/10.2478/jos-2020-0015

Contingency tables provide a convenient format to publish summary data from confidential survey and administrative records that capture a wide range of social and economic information. By their nature, contingency tables enable aggregation of potentially sensitive data, limiting disclosure of identifying information. Furthermore, censoring or perturbation can be used to desensitise low cell counts when they arise. However, access to detailed cross-classified tables for research is often restricted by data custodians when too many censored or perturbed cells are required to preserve privacy. In this article, we describe a framework for selecting and combining log-linear models when accessible data is restricted to overlapping marginal contingency tables. The approach is demonstrated through application to housing transition data from the Australian Census Longitudinal Dataset provided by the Australian Bureau of Statistics.