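When fitting this model, graded responses such as "somewhat like" and "like very much" are coded as numbers. These numbers only label an ordered category; adding or subtracting them is not meaningful.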
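A minimal R sketch of this coding step. The response labels and values are made up for illustration and are not data from this post. An ordered factor keeps the ranking without treating the codes as quantities:

```r
# Illustrative responses only (hypothetical labels, not the wallet data).
resp <- c("dislike", "somewhat like", "like very much", "somewhat like")

# An ordered factor stores the ranking of the categories.
rating <- factor(resp,
                 levels = c("dislike", "somewhat like", "like very much"),
                 ordered = TRUE)

codes <- as.integer(rating)         # integer codes are rank labels, not amounts
is_higher <- rating[3] > rating[2]  # order comparisons are meaningful
```

Comparisons such as `>` work on an ordered factor, while the distance between code 1 and code 2 carries no interpretation.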
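The ordered logit model is also called the proportional-odds model. It assumes that the odds ratio for each predictor is the same at every cut point between categories, so the slopes do not change across categories; the categories differ mainly in their intercepts β.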
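Written out, the assumption looks roughly like this. The notation is an assumption made here for illustration: $\beta_{0j}$ are the category-specific intercepts (cut points), $\beta$ is the shared slope vector, $J$ is the number of ordered categories, and the minus sign follows the convention used by `polr` in MASS.

```latex
\operatorname{logit} P(Y \le j \mid x)
  = \log \frac{P(Y \le j \mid x)}{P(Y > j \mid x)}
  = \beta_{0j} - x^{\top}\beta,
  \qquad j = 1, \dots, J-1
```

Only $\beta_{0j}$ carries the index $j$, which is why one set of slopes serves every cut point.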
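The difference between the ordered logit model and the multinomial logit (MNL) model is shown in the figure below.
An example:
Predict whether a person who finds a wallet will return it.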
- dependent variable:
- Least ethical (kept the wallet and the money)
- Ethical (returned the wallet)
- Most ethical (returned the wallet and the money)
- Independent variables
- Gender (1=male or 0=female)
- Business School (1=yes or 0=no)
- Punish (disciplinary measures by parent)
(1) punished in elementary
(2) punished in elementary and middle school
(3) punished in elementary, middle and high school
- Explanation by parents regarding disciplinary measures (1=yes or 0=no)
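- male and female form a binary variable, so only one of the two appears in the output. Likewise, a variable with three categories shows only two in the output; the omitted category is the reference level.
- The results above therefore indicate that males are more likely to be less ethical: the intercept for Least ethical is the highest at 1.21, the intercept for Ethical is 1.18, and Most ethical is the baseline at 0.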
- Residual Deviance = −2LL = (−2) × (−151.8837) = 303.7675
AIC = −2LL + 2k = 303.7675 + 2(10) = 323.768
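Two points from the list above can be checked with a few lines of base R. This is a sketch: the `punish` values are invented for illustration, while `LL` and `k` are the log-likelihood and parameter count quoted above.

```r
# Dummy coding: a factor with 3 levels contributes 3 - 1 = 2 columns.
punish <- factor(c(1, 2, 3, 2, 1))   # illustrative values, not the wallet data
X <- model.matrix(~ punish)          # level 1 is absorbed as the reference
dummy_cols <- setdiff(colnames(X), "(Intercept)")

# Deviance and AIC recomputed from the quoted log-likelihood.
LL <- -151.8837
k  <- 10
deviance_val <- -2 * LL
aic_val      <- deviance_val + 2 * k
```

The recomputed values agree with the figures above up to rounding of the log-likelihood.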
Finally, the code:
#Ordered logit model
#Hess=TRUE: This will generate a model with the observed information matrix from optimization (called the Hessian) which is used to get standard errors
library(MASS)
ordlog<-polr(honestfac~male+business+punish_el+explain,data=wallet, Hess=TRUE)
summary(ordlog)
c(deviance(ordlog),ordlog$edf) #Deviance and number of parameters (includes intercepts)
ci<-round(confint(ordlog),3) #confidence intervals
round(exp(coef(ordlog)),3) #odds ratios
round(exp(cbind(OR=coef(ordlog),ci)),3) #confidence intervals with respective odds ratios
#show the cutoffs on a plot
#Example using ord data for gender
x<-seq(-4,4,by=0.5)
plot(x,dlogis(x),type="l") #prob density function for a logistic distribution
abline(v=c(-1.6315,0.1229),col="red",lwd=2) #female
abline(v=c(-1.6315,0.1229)-1.0729,lty=2,col="blue",lwd=2) #males
text(-3.6,0.10,"P(Less \nEthical)")
text(2.8,0.10,"P(More \nEthical)")
#legend order must match the lines drawn above: red solid = females, blue dashed = males
legend("topright",lty=1:2,lwd=2,legend=c("Females","Males"),col=c("red","blue"),bty="n")
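The cutoffs drawn on the plot can also be turned into category probabilities. The sketch below reuses the two cutpoints and the shift of 1.0729 from the plotting code and assumes a standard logistic latent variable. The helper `cat_probs` and the generic category names are made up for this example; which names correspond to less or more ethical depends on how the levels of `honestfac` are ordered.

```r
# Cutpoints and shift copied from the plotting code above.
cuts_female <- c(-1.6315, 0.1229)
cuts_male   <- cuts_female - 1.0729

# Hypothetical helper: two cutpoints -> three category probabilities,
# i.e. the areas under the logistic density split by the cutoffs.
cat_probs <- function(cuts) {
  cum <- c(plogis(cuts), 1)   # cumulative probabilities at each cutoff, then 1
  p <- diff(c(0, cum))        # successive differences give the category areas
  names(p) <- c("below_cut1", "between_cuts", "above_cut2")
  p
}

p_female <- cat_probs(cuts_female)
p_male   <- cat_probs(cuts_male)
round(rbind(female = p_female, male = p_male), 3)
```

Each row sums to 1, and comparing the two rows shows how the shift in cutoffs moves probability between the tails.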