Exact Statistical Inference for Differentially Private Data


Differential privacy (DP) is a mathematical framework that protects confidential information in a transparent and quantifiable way. I discuss how two classes of approximate computation techniques can be systematically adapted to produce exact statistical inference using DP data. For likelihood inference, we call for an importance sampling implementation of Monte Carlo expectation-maximization, and for Bayesian inference, an approximate Bayesian computation (ABC) algorithm suitable for possibly complex likelihood. Both approaches deliver exact statistical inference with respect to the joint statistical model inclusive of the differential privacy mechanism, yet do not require analytical access of such joint specification. Highlighted is a transformation of the statistical tradeoff between privacy and efficiency, into the computational tradeoff between approximation and exactness. Open research questions on two fronts are posed: 1) how to afford computationally accessible and (approximately) correct statistical analysis tools to DP data users; 2) how to understand and remedy the effect of any necessary post-processing with statistical analysis.

Slides from this talk can be found here.