I was using logistic regression today and as always am troubled by how to report the results. I did a little sleuthing and here's what I decided. Odds Ratios = Bad, Relative Risk = Not so Bad.
To begin at the beginning.
Logistic regression is a form of regression which is used when the dependent variable is a dichotomous variable. In other words, continuous variables are not used as dependents in logistic regression. However, independent variables included in the model may be of any type.
Multinomial logistic regression is used when the dependent variable has more than two classes, although it can also be used for binary dependent variables. When the classes of the dependent variable can be ranked, ordinal logistic regression is preferred to multinomial logistic regression.
Critically, the impact of predictor variables is usually explained in terms of odds ratios.
Unfortunately, odds ratios are not intuitive to most people and many interpret them (incorrectly) as probabilities. For example, suppose there was a group of dogs, 40 of which were male and 60 of which were female. The probability of randomly selecting a male dog is 40 / (60+40) or 40/100 = 40%. The odds, however, of randomly selecting a male dog are quite different: 40/(100-40) or 40/60 = .67, that is, 2 to 3. Odds are a ratio of "yes" to "no", not a share of the total, so they are not a percentage.
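To make the difference concrete, here is a minimal Python sketch of the dog example. The function names (probability, odds) are my own, chosen just for illustration; the counts are the ones from the paragraph above.

```python
def probability(successes, total):
    """Share of the whole group that has the trait."""
    return successes / total


def odds(successes, total):
    """Ratio of those with the trait to those without it."""
    return successes / (total - successes)


males, females = 40, 60
total = males + females

p_male = probability(males, total)  # 40 / 100
odds_male = odds(males, total)      # 40 / 60

# The two quantities carry the same information and convert back and forth.
odds_from_p = p_male / (1 - p_male)
p_from_odds = odds_male / (1 + odds_male)

print(f"probability = {p_male:.2f}, odds = {odds_male:.2f}")
```

The conversion lines show that odds and probability are two ways of writing the same thing, which is exactly why it is so easy to confuse them.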
Let's look at another example. Suppose that we have a group of students, some of whom are classified as ADD. Of 80 boys, 13 were classified as ADD and 67 were not. Of 100 girls, 6 were classified as ADD and 94 were not. The odds of a boy being classified as ADD (as the logistic regression output would report) are 13/67 = .194; the odds of a girl being so classified are 6/94 = .064. The odds of being classified as ADD vary with sex. We could report the odds ratio of a boy being classified as compared to a girl as .194/.064 = 3.03125:1, or roughly 3:1 (without rounding the two odds first, the ratio is about 3.04). Unfortunately, again, this is not the probability that a boy will be classified as ADD compared to a girl. What it means is that the number of boys classified as ADD for every boy not classified is about 3 times the number of girls classified as ADD for every girl not classified.
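The same arithmetic can be sketched in a few lines of Python. The variable names are mine, the counts come from the example above, and nothing is rounded along the way, so the ratio comes out a little higher than the hand calculation with rounded odds.

```python
# Counts from the ADD example: classified vs. not classified, by sex.
boys_add, boys_not = 13, 67
girls_add, girls_not = 6, 94

odds_boys = boys_add / boys_not      # about .194
odds_girls = girls_add / girls_not   # about .064

# Odds ratio of boys relative to girls, computed without rounding first.
odds_ratio = odds_boys / odds_girls  # about 3.04

# Equivalent cross-product form for a 2x2 table.
cross_product = (boys_add * girls_not) / (boys_not * girls_add)

print(f"odds (boys)  = {odds_boys:.3f}")
print(f"odds (girls) = {odds_girls:.3f}")
print(f"odds ratio   = {odds_ratio:.2f}")
```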
Do not write that in your report. Because it is uninterpretable (which is not a word), but you know what I mean.
So instead, reporting an estimated relative risk may be the best bet.
Estimated Relative Risk = Odds Ratio / ((1-Pr) + (Pr * Odds Ratio))
where Pr is the proportion of non-treated persons that exhibit the outcome of interest.
In our example the odds ratio is 3.03125 (computed above from the rounded odds) and Pr is the proportion of girls who are classified as ADD = 6/100 = .06
Thus the estimated relative risk of a boy being classified as ADD is:
3.03125 / ((1-.06) + (.06*3.03125)) = 3.03125 / (.94+.1819) = 3.03125/1.1219 = 2.702, or "Boys are 2.7 times as likely as girls to be classified as ADD."
Compared to the true relative risk = (13/80) / (6/100) = .1625/.06 = 2.71, our estimated risk is not a bad estimate - and much easier to explain! (The small gap here comes only from rounding the odds before dividing; with an odds ratio from a model that includes other covariates, the formula gives an approximation.)
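For anyone who wants to check the numbers, here is a small Python sketch of the conversion formula. The function name estimated_relative_risk and its parameter names are my own, not from any statistics package; the inputs are the figures from the example.

```python
def estimated_relative_risk(odds_ratio, baseline_risk):
    """Convert an odds ratio to an estimated relative risk.

    baseline_risk is Pr: the proportion of the reference (non-treated)
    group that exhibits the outcome.
    """
    return odds_ratio / ((1 - baseline_risk) + baseline_risk * odds_ratio)


pr = 6 / 100  # proportion of girls classified as ADD

# Using the odds ratio obtained from the rounded odds (.194 / .064).
rr_rounded = estimated_relative_risk(3.03125, pr)

# Using the odds ratio computed straight from the counts.
rr_unrounded = estimated_relative_risk((13 / 67) / (6 / 94), pr)

# Relative risk computed directly from the two proportions.
rr_true = (13 / 80) / (6 / 100)

print(f"estimate (rounded OR)   = {rr_rounded:.3f}")
print(f"estimate (unrounded OR) = {rr_unrounded:.3f}")
print(f"true relative risk      = {rr_true:.3f}")
```

With the unrounded odds ratio the estimate matches the directly computed relative risk for this simple two-group table, which is a handy sanity check on the formula.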
Friday, October 30, 2009
1 comment:
Hi - I'm a statistician working on the analysis of a randomized design: Intervention vs Control group. There are several covariates and tons of outcomes. I use SAS to do my analyses, and have definitely encountered the problem you write about. I found this reference today: http://www.ats.ucla.edu/stat/sas/faq/relative_risk.htm I think this might be a good approach for us.