• Evidence-based Medicine and Clinical Epidemiology Center, Xinqiao Hospital, Third Military Medical University, Chongqing 400037, China

Objective To examine how independent variables and the stepwise selection method should be chosen in multiple logistic regression analysis.

Methods Using data from a case-control study of coronary heart disease (CHD, Y), the following variables were analyzed with SPSS 18.0: age (X1), hypertension history (X2), family history of hypertension (X3), smoking (X4), hyperlipidemia history (X5), animal fat intake (X6), weight index (X7), and type A personality (X8). Multiple logistic regression was performed, and the risk factors identified by six stepwise variable selection methods were compared.

Results Univariate analysis showed no difference in age distribution between the CHD and non-CHD groups (P=0.116). Multivariate logistic regression, however, showed that, compared with people aged over 65 years, younger age was a protective factor (under 45: OR=0.100, 0.000 to 0.484, P=0.020; 45-54: OR=0.051, 0.003 to 0.975, P=0.048). When age was entered as a categorical variable, the risk factors for CHD were animal fat intake (X6), type A personality (X8), hyperlipidemia history (X5), and age (X1) (all P<0.05). When age was entered as a continuous variable, its effect was not statistically significant (P=0.053). Animal fat intake (X6) and type A personality (X8) were identified as risk factors by all six stepwise selection methods. Additional risk factors were hyperlipidemia history (X5) with the three forward methods (forward conditional, forward LR, forward Wald); family history of hypertension (X3) and age (X1) with backward conditional and backward LR; and hypertension history (X2) with backward Wald.

Conclusion Stepwise regression should be applied to all candidate variables, including those that are not statistically significant in univariate analysis. If a categorical variable is treated as continuous, information may be lost and risk factors may even be missed. When different stepwise selection methods identify different risk factors, the final choice should also take into account clinical and epidemiological significance, biological mechanisms, and other subject-matter knowledge.
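
The contrast between categorical and continuous coding of age can be sketched as follows. This is only an illustration, not the authors' SPSS procedure: the function names are hypothetical, the "55-64" band and the exact cut-point at 65 are assumptions (the abstract names only the under-45 group, the 45-54 group, and the over-65 reference group), and no study data are used.

```python
import math

# Hypothetical age bands (lower bound inclusive, upper bound exclusive).
# The abstract names only "<45", "45-54" and the over-65 reference group;
# the "55-64" band and the boundary at 65 are assumptions for illustration.
AGE_BANDS = [("<45", 0, 45), ("45-54", 45, 55), ("55-64", 55, 65)]


def dummy_code_age(age):
    """Return one 0/1 indicator per non-reference age band.

    Ages of 65 and over form the reference group, so all indicators are 0.
    Each indicator gets its own coefficient in the logistic model, which
    lets the odds ratio differ freely from band to band.
    """
    return {label: int(lo <= age < hi) for label, lo, hi in AGE_BANDS}


def odds_ratio(beta):
    """Convert a logistic regression coefficient to an odds ratio."""
    return math.exp(beta)


if __name__ == "__main__":
    print(dummy_code_age(40))  # indicator set for the "<45" band
    print(dummy_code_age(70))  # reference group: all indicators are 0
```

With continuous coding, age contributes a single coefficient, so each additional year is assumed to multiply the odds by the same factor. With dummy coding, each band has its own odds ratio relative to the reference group, which is how an age effect can appear in the categorical model yet not reach significance in the continuous one.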

Citation: XU Ru-fu. Selection for Independent Variables and Regression Method in Logistic Regression: An Example Analysis. Chinese Journal of Evidence-Based Medicine, 2016, 16(11): 1360-1364. doi: 10.7507/1672-2531.20160205

Copyright © the editorial department of Chinese Journal of Evidence-Based Medicine of West China Medical Publisher. All rights reserved
