Title:Model selection and combination for estimating treatment effects
Time:2021.7.20 15:30-17:00
Location:A618
Presenter:Yuhong Yang
About the presenter:
Yuhong Yang received his Ph.D. in statistics from Yale in 1996. He then joined the Department of Statistics at Iowa State University and moved to the University of Minnesota in 2004. His research interests include model selection, multi-armed bandit problems, forecasting, high-dimensional data analysis, and machine learning. He has published in top journals in several fields, including Annals of Statistics, JASA, JRSSB, Biometrika, IEEE Transactions on Information Theory, Journal of Econometrics, Proceedings of AMS, Journal of Machine Learning Research, and International Journal of Forecasting. He is a fellow of the Institute of Mathematical Statistics and was a recipient of the US NSF CAREER Award.
Host:Prof. Zheng Haitao
Abstract:
It is well understood that a treatment’s effect on a response may be heterogeneous with respect to baseline covariates. This is an important premise of personalized medicine and personalized business/economic policy/decision making. Various methods for estimating heterogeneous treatment effects have been proposed. However, little attention has been given to the problem of choosing between estimators of treatment effects. Models that best estimate the regression function may not be best for estimating the effect of a treatment; therefore, there is a need for model selection methods that are targeted to treatment effect estimation directly. We develop a treatment effect cross-validation aimed at minimizing treatment effect estimation errors. Theoretically, treatment effect cross-validation has a model selection consistency property when the data splitting ratio is properly chosen. Practically, treatment effect cross-validation has the flexibility to compare different types of models. We illustrate the methods by using simulation studies and data from a clinical trial comparing treatments of patients with human immunodeficiency virus.
When estimating conditional treatment effects, the currently dominant practice is to select a statistical model or procedure based on the data. However, because identifying the best model can be very difficult with limited information, combining estimates from the candidate procedures often provides a more accurate and much more stable estimate than selecting a single procedure. We propose a method of model combination that targets accurate estimation of the treatment effect conditional on covariates. We provide a risk bound for the resulting estimator under squared error loss and illustrate the method using data from a labor skills training program.