Title

訊息設限與測量誤差下之復發事件分析

Parallel title

Recurrent event data analysis with informative censoring and measurement error

Author

游翔

Keywords

復發事件資料 ; 訊息設限 ; 測量誤差 ; Recurrent event data ; Informative censoring ; Measurement error

Journal name

Degree theses of the Institute of Statistics, National Tsing Hua University (清華大學統計學研究所學位論文)

Volume and issue / Publication date

2017

Degree

Doctoral

Advisors

鄭又仁;王清雲

Language

English

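Abstract (translated from Chinese)

Recurrent events are a common data type in long-term follow-up studies and clinical trials. In recurrent event data analysis, investigators are usually interested in the effects of covariates on the rate function of the recurrent event. Many statistical methods in the literature can estimate covariate effects on the rate function, but most require independent censoring and accurately measured covariates. In real data analysis, however, the recurrent event process may be terminated by another event (for example, death), which violates the independent censoring assumption. We refer to this situation as informative censoring. In addition, covariate measurements may be subject to measurement error and need to be corrected. This dissertation proposes semiparametric estimation methods for regression analysis of covariates in recurrent event data under informative censoring and covariate measurement error. The dissertation has two parts. The first part studies estimation for univariate recurrent events. We use a shared frailty model to account for the association between informative censoring and the recurrent events and for the correlation among events within the same subject. Specifically, we assume that, given the frailty variable, the recurrent events follow a Poisson process whose intensity function is a shared frailty model, and the distribution of the frailty variable is left unspecified. Under the assumption that the covariates and the measurement errors are both normally distributed, we propose a regression calibration approach and a moment corrected approach to correct the bias that measurement error induces in the regression parameter estimates. Both are parametric correction methods and require replicated data to estimate the measurement error variance. In the second part, we extend the methods of the first part to multivariate recurrent event data, in which investigators are interested in two or more types of recurrent events simultaneously. We further consider the setting in which every subject has an unbiased surrogate measurement but only a subset of subjects has an instrumental variable; neither replicated data nor validation data are available. We assume that the rate functions of the different types of recurrent events follow different shared frailty models, where the frailty variable describes the association between informative censoring and the recurrent events as well as the correlation among the different types of recurrent events. To correct for measurement error, we propose two nonparametric correction approaches for estimating the regression parameters. The first uses only the subsample for which the instrumental variable is available. To improve estimation efficiency, the second also incorporates the remaining subjects. Unlike the first part, the methods of the second part require neither the Poisson process assumption nor distributional assumptions on the covariates and measurement errors. The frailty distribution is also left unspecified in the estimation. In both parts, we establish large-sample theory for the proposed estimators and use simulation studies to examine their performance. Finally, we apply the proposed methods to data from the Nutritional Prevention of Cancer trial, a double-blind trial of selenium and cancer prevention, to estimate the effect of selenium supplementation on preventing recurrences of squamous cell carcinoma and basal cell carcinoma.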

English abstract

Recurrent event data are frequently observed in longitudinal and clinical studies. Various methods have been proposed in the literature to analyze covariate effects on the occurrence rate of a recurrent event, yet these methods usually assume independent censoring and accurately measured covariates. In many real data applications, however, informative censoring occurs when the recurrent event process is stopped by a terminal event (e.g. death) that is related to the recurrent event. Additionally, the covariates may be measured with error and need to be corrected. In this doctoral dissertation, we develop semiparametric estimation methods that deal with informative censoring and measurement errors in recurrent event data. The dissertation contains two works. In the first work, we propose two approaches to estimate regression parameters for univariate recurrent event data in the presence of informative censoring and measurement errors. Specifically, we impose a shared frailty model on the intensity function of a Poisson process to characterize the informative censoring and the dependence of the events within a subject, without specifying the frailty distribution. To estimate the regression parameters, a regression calibration method and a moment corrected method are proposed for adjusting for measurement errors. Both methods are referred to as parametric corrections because they assume that the underlying covariates and error terms are normally distributed. Moreover, replicated data are needed to estimate the measurement error variance. In the second work, we extend the first work to accommodate informative censoring and measurement errors in multivariate recurrent event data, in which more than one type of event is of interest. We also consider a situation in which a surrogate is available for all subjects but an instrumental variable is obtained only for a fraction of subjects, and neither replicated data nor a validation set is available. To formulate the dependence of the informative censoring on the recurrent event processes, a shared frailty model is imposed on the rate function for each type of recurrent event, where the frailty distribution is unspecified. The shared frailty model also characterizes the association among different types of recurrent events. For regression parameter estimation, we first construct a simple correction approach in which only subjects with an observed instrumental variable are involved in the estimation. To improve on the efficiency of the simple correction estimator, we further develop a new correction approach that incorporates the information from the whole cohort. Distinct from the approaches in our first work, the approaches in the second work require neither the assumption of a Poisson process nor distributional assumptions on the underlying covariates and measurement errors. The asymptotic properties of the four proposed estimators are established, and the performance of all proposed methods is investigated through simulation studies. We illustrate the proposed methods with data from the Nutritional Prevention of Cancer trial, which aimed to assess the effect of plasma selenium supplement on recurrences of squamous cell carcinoma and basal cell carcinoma.
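
The regression calibration idea mentioned in the abstract can be illustrated with a minimal sketch. It shows only the generic calibration step under the stated normality assumption with replicated measurements, not the dissertation's estimator for recurrent events. The function name `regression_calibration`, the additive error model W_ij = X_i + U_ij, and the equal number of replicates per subject are assumptions made for this illustration.

```python
from statistics import fmean, variance


def regression_calibration(replicates):
    """Hypothetical helper: calibrate error-prone covariates from replicates.

    Assumes W_ij = X_i + U_ij with X_i and U_ij independent and normal,
    and the same number k >= 2 of replicates for every subject.
    Returns (calibrated values, reliability ratio, error variance estimate).
    """
    n = len(replicates)
    k = len(replicates[0])
    if n < 2 or k < 2 or any(len(r) != k for r in replicates):
        raise ValueError("need >= 2 subjects, each with the same k >= 2 replicates")

    # Subject-level means of the replicated measurements.
    w_bar = [fmean(r) for r in replicates]

    # Pooled within-subject variance estimates the measurement error variance.
    sigma_u2 = sum(
        (w - m) ** 2 for r, m in zip(replicates, w_bar) for w in r
    ) / (n * (k - 1))

    mu = fmean(w_bar)
    var_w_bar = variance(w_bar)  # estimates Var(X) + Var(U) / k

    # Moment estimate of Var(X), truncated at zero.
    sigma_x2 = max(var_w_bar - sigma_u2 / k, 0.0)

    # Reliability ratio: the shrinkage factor in E[X | W_bar] under normality.
    lam = sigma_x2 / var_w_bar if var_w_bar > 0 else 0.0

    # Calibrated covariate: estimated E[X_i | W_bar_i].
    calibrated = [mu + lam * (m - mu) for m in w_bar]
    return calibrated, lam, sigma_u2


# Tiny worked example: two subjects, two replicates each.
calibrated, lam, sigma_u2 = regression_calibration([[1.0, 3.0], [5.0, 7.0]])
print(calibrated, lam, sigma_u2)  # [2.25, 5.75] 0.875 2.0
```

In a regression calibration analysis, the calibrated values would then replace the unobserved covariate in the estimating equation for the regression parameters. How that substitution interacts with the shared frailty model is specific to the dissertation and is not reproduced here.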

Subject classification: Basic and Applied Sciences > Statistics
College of Science > Institute of Statistics