Airiti Academic Literature Database

Title	Predictive Mean Matching Imputation Procedure Based on Machine Learning Models for Complex Survey Data
DOI	10.6339/24-JDS1135
Authors	Sixia Chen; Chao Xu
Keywords	imputation; missing data; nonresponse bias
Journal	Journal of Data Science
Volume/Issue (Publication Date)	Vol. 22, No. 3 (2024/07/01)
Pages	456-468
Language	English
Abstract	Missing data is a common occurrence in various fields, spanning social science, education, economics, and biomedical research. Disregarding missing data in statistical analyses can introduce bias to study outcomes. To mitigate this issue, imputation methods have proven effective in reducing nonresponse bias and generating complete datasets for subsequent analysis of secondary data. The efficacy of imputation methods hinges on the assumptions of the underlying imputation model. While machine learning techniques such as regression trees, random forest, XGBoost, and deep learning have demonstrated robustness against model misspecification, their optimal performance may necessitate fine-tuning under specific conditions. Moreover, imputed values generated by these methods can sometimes deviate unnaturally, falling outside the normal range. To address these challenges, we propose a novel Predictive Mean Matching imputation (PMM) procedure that leverages popular machine learning-based methods. PMM strikes a balance between robustness and the generation of appropriate imputed values. In this paper, we present our innovative PMM approach and conduct a comparative performance analysis through Monte Carlo simulation studies, assessing its effectiveness against other established methods.
Subject Classification	Basic and Applied Sciences > Information Science; Basic and Applied Sciences > Statistics