Academic Seminar

【online】 Balancing Inferential Integrity and Disclosure Risk via Model ......

Posted: 2021-03-24


Title: Balancing Inferential Integrity and Disclosure Risk via Model Targeted Masking and Multiple Imputation


Speaker: Dr. Linglong Kong (Department of Mathematical and Statistical Sciences, University of Alberta, Canada)


Time: April 1, 2021, 11:00


Format: Tencent Meeting, ID: 140 324 213


Abstract: There is a growing expectation that data collected by government-funded studies should be openly available to ensure research reproducibility, which also increases concerns about data privacy. A strategy to protect individuals' identity is to release multiply imputed (MI) synthetic datasets with masked sensitive values (Rubin, 1993). However, information loss or incorrectly specified imputation models can weaken or invalidate the inferences obtained from the MI datasets. We propose a new masking framework with a data-augmentation (DA) component and a tuning mechanism that balances protection against identity disclosure with preservation of data utility. Applying it to a restricted-use Canadian Scleroderma Research Group (CSRG) dataset, we found that this DA-MI strategy achieved a $0 \%$ identity disclosure risk and preserved all inferential conclusions. It yielded $95 \%$ confidence intervals (CIs) that had overlaps of $98.5 \%$ $(95.5 \%)$ on average with the CIs constructed using the full, unmasked CSRG dataset in a work-disability (interstitial lung disease) study. The CI overlaps were lower for several other methods considered, ranging from $73.9 \%$ to $91.9 \%$ on average, with the lowest value being $28.1 \%$; such low CI overlaps further led to some incorrect inferential conclusions. These findings indicate that the DA-MI masking framework facilitates sharing of useful research data while protecting participants' identities. Joint work with Adrian E. Raftery, Russell J. Steele, and Naisyin Wang.
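The abstract reports CI overlap percentages without saying how overlap is computed. As a rough illustration only, the sketch below uses one common definition from the data-utility literature: the length of the intersection of two intervals, averaged as a fraction of each interval's own length. The function name `ci_overlap` and the example intervals are hypothetical and not taken from the talk; the authors' exact measure may differ.

```python
def ci_overlap(ci_a, ci_b):
    """Average fractional overlap of two confidence intervals.

    Assumption: overlap is the intersection length divided by each
    interval's length, averaged over the two intervals. This is one
    common convention, not necessarily the one used in the talk.
    """
    (lo_a, hi_a), (lo_b, hi_b) = ci_a, ci_b
    lo, hi = max(lo_a, lo_b), min(hi_a, hi_b)
    if hi <= lo:
        # Disjoint (or merely touching) intervals share no length.
        return 0.0
    inter = hi - lo
    return 0.5 * (inter / (hi_a - lo_a) + inter / (hi_b - lo_b))


# Hypothetical intervals: one from the unmasked data, one from a masked release.
full_ci = (0.0, 2.0)
masked_ci = (1.0, 3.0)
overlap = ci_overlap(full_ci, masked_ci)
```

Under this definition, identical intervals give an overlap of 1 and disjoint intervals give 0, so a value such as $28.1 \%$ means the masked-data interval shares little of its length with the full-data interval.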

Speaker bio: Dr. Linglong Kong is an associate professor in the Department of Mathematical and Statistical Sciences at the University of Alberta. He is a Canada Research Chair in Statistical Learning. He has published more than 50 peer-reviewed manuscripts, including in top journals (AOS, JASA, and JRSSB) and at top conferences (ICML, ICDM, AAAI, and IJCAI). Currently, Dr. Kong serves as an associate editor of the Journal of the American Statistical Association, the International Journal of Imaging Systems and Technology, and the Canadian Journal of Statistics; as a member of the Board of Directors of the Statistical Society of Canada and of the Western North American Region of The International Biometric Society; and as past program chair of the ASA Statistical Imaging Session and program chair-elect of the ASA Statistical Computing Session. His research interests include statistical machine learning, high-dimensional data analysis, neuroimaging data analysis, robust statistics, and quantile regression.


Host: Yang Xiaoping (杨孝平)