TY - JOUR
T1 - Statistical modeling methods
T2 - challenges and strategies
AU - Henley, Steven S.
AU - Golden, Richard M.
AU - Kashner, T. Michael
N1 - Publisher Copyright:
© 2019 Martingale Research Corporation.
PY - 2020/1/1
Y1 - 2020/1/1
AB - Statistical modeling methods are widely used in clinical science, epidemiology, and health services research to analyze data collected in clinical trials and in observational studies of existing data sources, such as claims files and electronic health records. Diagnostic and prognostic inferences from statistical models are critical to researchers advancing science, clinical practitioners making patient care decisions, and administrators and policy makers impacting the health care system to improve quality and reduce costs. The veracity of such inferences relies not only on the quality and completeness of the collected data, but also on the validity of the statistical model. A key component of establishing model validity is determining when a model is not correctly specified and therefore incapable of adequately representing the Data Generating Process (DGP). In this article, model validity is first described, and methods designed for assessing model fit, specification, and selection are reviewed. Second, data transformations that improve the model’s ability to represent the DGP are addressed. Third, model search and validation methods are discussed. Finally, methods for evaluating predictive and classification performance are presented. Together, these methods provide a practical framework with recommendations to guide the development and evaluation of statistical models that provide valid statistical inferences.
KW - Goodness-of-fit
KW - Information Matrix Test
KW - model misspecification
KW - model selection
KW - specification analysis
UR - http://www.scopus.com/inward/record.url?scp=85073326243&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85073326243&partnerID=8YFLogxK
U2 - 10.1080/24709360.2019.1618653
DO - 10.1080/24709360.2019.1618653
M3 - Article
SN - 2470-9360
VL - 4
SP - 105
EP - 139
JO - Biostatistics and Epidemiology
JF - Biostatistics and Epidemiology
IS - 1
ER -