(Latest changes: 22.11, second version.)
Understanding and comparing R print-outs
The print-outs below are from LM (lm), GLM (glm), multivariate GLM (vglm), LMM (lmer) and GLMM (glmer).
Below we have fitted a model to a data set and then printed the
summary of the model. For each of the print-outs you need
to know (be able to identify and explain) every entry. In particular,
identify and explain:
- which model is fitted, and the model requirements
- how the model is fitted (which version of maximum likelihood)
- the parameter estimates for \(\beta\)
- inference about \(\beta\): how to find confidence intervals and test
hypotheses (and which hypothesis the reported test statistic, and
possibly \(p\)-value, is for)
- model fit (deviance, AIC, R-squared, F)
In addition, further inference can be made using
anova(fit1, fit2), confint,
residuals, fitted, AIC,
ranef and the other functions we have worked with in the PL, IL
and the compulsory exercises.
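As a reminder of that workflow, here is a minimal, self-contained sketch. It uses the built-in mtcars data (not one of the course data sets) so it runs without any packages:

```r
# Two nested linear models on the built-in mtcars data
fit1 <- lm(mpg ~ wt, data = mtcars)
fit2 <- lm(mpg ~ wt + hp, data = mtcars)

anova(fit1, fit2)   # F-test: does adding hp improve the fit?
confint(fit2)       # 95% confidence intervals for the coefficients
AIC(fit1, fit2)     # lower AIC is preferred
head(residuals(fit2))
head(fitted(fit2))
# ranef() applies only to mixed models fitted with lme4
```

The same calls work on the fits below (with glm, vglm, lmer and glmer objects), although what they return differs by model class.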
MLR - multiple linear regression
library(gamlss.data)
fitLM = lm(rent ~ area + location + bath + kitchen + cheating, data = rent99)
summary(fitLM)
fitGLM = glm(rent ~ area + location + bath + kitchen + cheating, data = rent99)
summary(fitGLM)
##
## Call:
## lm(formula = rent ~ area + location + bath + kitchen + cheating,
## data = rent99)
##
## Residuals:
## Min 1Q Median 3Q Max
## -633.41 -89.17 -6.26 82.96 1000.76
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -21.9733 11.6549 -1.885 0.0595 .
## area 4.5788 0.1143 40.055 < 2e-16 ***
## location2 39.2602 5.4471 7.208 7.14e-13 ***
## location3 126.0575 16.8747 7.470 1.04e-13 ***
## bath1 74.0538 11.2087 6.607 4.61e-11 ***
## kitchen1 120.4349 13.0192 9.251 < 2e-16 ***
## cheating1 161.4138 8.6632 18.632 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Residual standard error: 145.2 on 3075 degrees of freedom
## Multiple R-squared: 0.4504, Adjusted R-squared: 0.4494
## F-statistic: 420 on 6 and 3075 DF, p-value: < 2.2e-16
##
##
## Call:
## glm(formula = rent ~ area + location + bath + kitchen + cheating,
## data = rent99)
##
## Coefficients:
## Estimate Std. Error t value Pr(>|t|)
## (Intercept) -21.9733 11.6549 -1.885 0.0595 .
## area 4.5788 0.1143 40.055 < 2e-16 ***
## location2 39.2602 5.4471 7.208 7.14e-13 ***
## location3 126.0575 16.8747 7.470 1.04e-13 ***
## bath1 74.0538 11.2087 6.607 4.61e-11 ***
## kitchen1 120.4349 13.0192 9.251 < 2e-16 ***
## cheating1 161.4138 8.6632 18.632 < 2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for gaussian family taken to be 21079.53)
##
## Null deviance: 117945363 on 3081 degrees of freedom
## Residual deviance: 64819547 on 3075 degrees of freedom
## AIC: 39440
##
## Number of Fisher Scoring iterations: 2
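The lm and glm fits above are the same model, and the print-outs can be reconciled by hand: the Gaussian dispersion parameter reported by glm is the residual deviance divided by its degrees of freedom, and its square root is the residual standard error reported by lm. A Wald-type 95% CI for a coefficient also follows from estimate plus/minus a normal quantile times the standard error (numbers copied from the print-outs above):

```r
# Dispersion = residual deviance / residual df (Gaussian family)
disp <- 64819547 / 3075
sqrt(disp)                                    # matches the residual standard error, 145.2

# Approximate 95% Wald CI for the area coefficient
4.5788 + c(-1, 1) * qnorm(0.975) * 0.1143
```

For the LM print-out, exact intervals use t-quantiles with 3075 df; with this many observations the normal approximation is essentially identical.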
GLM - Binomial regression with logit-link
library(investr)
fitgrouped = glm(cbind(y, n - y) ~ ldose, family = "binomial", data = investr::beetle)
summary(fitgrouped)
##
## Call:
## glm(formula = cbind(y, n - y) ~ ldose, family = "binomial", data = investr::beetle)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -60.717 5.181 -11.72 <2e-16 ***
## ldose 34.270 2.912 11.77 <2e-16 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for binomial family taken to be 1)
##
## Null deviance: 284.202 on 7 degrees of freedom
## Residual deviance: 11.232 on 6 degrees of freedom
## AIC: 41.43
##
## Number of Fisher Scoring iterations: 4
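From this print-out we can, for example, compute an approximate 95% Wald CI for the ldose coefficient, and the dose giving 50% kill probability (LD50 = -beta_0/beta_1 on the logit scale). A sketch using the rounded numbers above (a CI for LD50 itself would need the delta method, not shown here):

```r
beta0 <- -60.717; se0 <- 5.181   # (Intercept)
beta1 <-  34.270; se1 <- 2.912   # ldose

beta1 + c(-1, 1) * qnorm(0.975) * se1   # Wald CI for the ldose coefficient
-beta0 / beta1                          # LD50, approx 1.772 on the ldose scale
```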
GLM - Poisson regression with log-link
crab = read.table("https://www.math.ntnu.no/emner/TMA4315/2018h/crab.txt")
colnames(crab) = c("Obs", "C", "S", "W", "Wt", "Sa")
crab = crab[, -1] #remove column with Obs
crab$C = as.factor(crab$C)
model3 = glm(Sa ~ W + C, family = poisson(link = log), data = crab, contrasts = list(C = "contr.sum"))
summary(model3)
##
## Call:
## glm(formula = Sa ~ W + C, family = poisson(link = log), data = crab,
## contrasts = list(C = "contr.sum"))
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) -2.92089 0.56010 -5.215 1.84e-07 ***
## W 0.14934 0.02084 7.166 7.73e-13 ***
## C1 0.27085 0.11784 2.298 0.0215 *
## C2 0.07117 0.07296 0.975 0.3294
## C3 -0.16551 0.09316 -1.777 0.0756 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## (Dispersion parameter for poisson family taken to be 1)
##
## Null deviance: 632.79 on 172 degrees of freedom
## Residual deviance: 559.34 on 168 degrees of freedom
## AIC: 924.64
##
## Number of Fisher Scoring iterations: 6
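With the log link, exponentiated coefficients are multiplicative rate ratios. Using the printed estimate for W, each extra unit of width multiplies the expected number of satellites by about 1.16; a Wald CI can be exponentiated the same way:

```r
est <- 0.14934; se <- 0.02084            # W coefficient and SE from the print-out
exp(est)                                  # rate ratio per unit W, approx 1.16
exp(est + c(-1, 1) * qnorm(0.975) * se)   # 95% Wald CI on the rate-ratio scale
```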
Categorical regression, nominal model
# data from Agresti (2015), section 6, with use of the VGAM packages
data = "http://www.stat.ufl.edu/~aa/glm/data/Alligators.dat"
ali = read.table(data, header = T)
attach(ali)
y.data = cbind(y2, y3, y4, y5, y1)
x.data = model.matrix(~size + factor(lake), data = ali)
library(VGAM)
# We fit a multinomial logit model with fish (y1) as the reference category:
fit.main = vglm(cbind(y2, y3, y4, y5, y1) ~ size + factor(lake), family = multinomial,
data = ali)
summary(fit.main)
pchisq(deviance(fit.main), df.residual(fit.main), lower.tail = FALSE)
##
## Call:
## vglm(formula = cbind(y2, y3, y4, y5, y1) ~ size + factor(lake),
## family = multinomial, data = ali)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept):1 -3.2074 0.6387 -5.021 5.13e-07 ***
## (Intercept):2 -2.0718 0.7067 -2.931 0.003373 **
## (Intercept):3 -1.3980 0.6085 -2.297 0.021601 *
## (Intercept):4 -1.0781 0.4709 -2.289 0.022061 *
## size:1 1.4582 0.3959 3.683 0.000231 ***
## size:2 -0.3513 0.5800 -0.606 0.544786
## size:3 -0.6307 0.6425 -0.982 0.326296
## size:4 0.3316 0.4482 0.740 0.459506
## factor(lake)2:1 2.5956 0.6597 3.934 8.34e-05 ***
## factor(lake)2:2 1.2161 0.7860 1.547 0.121824
## factor(lake)2:3 -1.3483 1.1635 -1.159 0.246529
## factor(lake)2:4 -0.8205 0.7296 -1.125 0.260713
## factor(lake)3:1 2.7803 0.6712 4.142 3.44e-05 ***
## factor(lake)3:2 1.6925 0.7804 2.169 0.030113 *
## factor(lake)3:3 0.3926 0.7818 0.502 0.615487
## factor(lake)3:4 0.6902 0.5597 1.233 0.217511
## factor(lake)4:1 1.6584 0.6129 2.706 0.006813 **
## factor(lake)4:2 -1.2428 1.1854 -1.048 0.294466
## factor(lake)4:3 -0.6951 0.7813 -0.890 0.373608
## factor(lake)4:4 -0.8262 0.5575 -1.482 0.138378
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Names of linear predictors: log(mu[,1]/mu[,5]), log(mu[,2]/mu[,5]),
## log(mu[,3]/mu[,5]), log(mu[,4]/mu[,5])
##
## Residual deviance: 17.0798 on 12 degrees of freedom
##
## Log-likelihood: -47.5138 on 12 degrees of freedom
##
## Number of Fisher scoring iterations: 5
##
## Warning: Hauck-Donner effect detected in the following estimate(s):
## '(Intercept):1'
##
##
## Reference group is level 5 of the response
## [1] 0.1466189
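The final number is the output of the pchisq call: a deviance goodness-of-fit test comparing the residual deviance to a chi-squared distribution with 12 df. It can be reproduced from the printed deviance, and p = 0.147 means the main-effects model is not rejected:

```r
dev <- 17.0798; df <- 12
pchisq(dev, df, lower.tail = FALSE)   # approx 0.147, as printed above
qchisq(0.95, df)                      # critical value, approx 21.0 > 17.08
```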
Categorical regression, ordinal model
# Read mental health data from the web:
library(knitr)
data = "http://www.stat.ufl.edu/~aa/glm/data/Mental.dat"
mental = read.table(data, header = T)
library(VGAM)
# We fit a cumulative logit model with main effects of 'ses' and 'life':
fit.imp = vglm(impair ~ life + ses, family = cumulative(parallel = T), data = mental)
# parallel = T gives the proportional odds structure - only the intercepts differ
summary(fit.imp)
##
## Call:
## vglm(formula = impair ~ life + ses, family = cumulative(parallel = T),
## data = mental)
##
## Coefficients:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept):1 -0.2819 0.6231 -0.452 0.65096
## (Intercept):2 1.2128 0.6511 1.863 0.06251 .
## (Intercept):3 2.2094 0.7171 3.081 0.00206 **
## life -0.3189 0.1194 -2.670 0.00759 **
## ses 1.1112 0.6143 1.809 0.07045 .
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Names of linear predictors: logitlink(P[Y<=1]), logitlink(P[Y<=2]),
## logitlink(P[Y<=3])
##
## Residual deviance: 99.0979 on 115 degrees of freedom
##
## Log-likelihood: -49.5489 on 115 degrees of freedom
##
## Number of Fisher scoring iterations: 5
##
## No Hauck-Donner effect found in any of the estimates
##
##
## Exponentiated coefficients:
## life ses
## 0.7269742 3.0380707
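The exponentiated coefficients printed by VGAM are odds ratios for the cumulative odds under the proportional odds model, and can be recovered directly from the estimates (how the sign translates into "better" or "worse" impairment depends on VGAM's cumulative parameterisation):

```r
exp(-0.3189)   # life: cumulative odds ratio per life event, approx 0.727
exp(1.1112)    # ses: cumulative odds ratio for high vs low ses, approx 3.04
```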
LMM - random intercept and slope
library(lme4)
fm1 <- lmer(Reaction ~ Days + (Days | Subject), sleepstudy)
summary(fm1)
## Linear mixed model fit by REML ['lmerMod']
## Formula: Reaction ~ Days + (Days | Subject)
## Data: sleepstudy
##
## REML criterion at convergence: 1743.6
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -3.9536 -0.4634 0.0231 0.4634 5.1793
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Subject (Intercept) 612.10 24.741
## Days 35.07 5.922 0.07
## Residual 654.94 25.592
## Number of obs: 180, groups: Subject, 18
##
## Fixed effects:
## Estimate Std. Error t value
## (Intercept) 251.405 6.825 36.838
## Days 10.467 1.546 6.771
##
## Correlation of Fixed Effects:
## (Intr)
## Days -0.138
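Note that lmer reports t-values but no p-values. A rough 95% interval for the Days effect can be sketched from the printed numbers with a normal approximation (confint(fm1) gives profile intervals and is the preferred tool; this is only a back-of-envelope check):

```r
est <- 10.467; se <- 1.546                # Days fixed effect from the print-out
est + c(-1, 1) * qnorm(0.975) * se        # approx (7.44, 13.50)
```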
GLMM - random intercept and slope Poisson
RIKZ <- read.csv("https://oharar.github.io/TMA4315/Module08/RIKZ.csv")
library(lme4)
fitRI = glmer(Richness ~ NAP + (1 + NAP | Beach), data = RIKZ, family = poisson(link = log))
summary(fitRI)
## Generalized linear mixed model fit by maximum likelihood (Laplace
## Approximation) [glmerMod]
## Family: poisson ( log )
## Formula: Richness ~ NAP + (1 + NAP | Beach)
## Data: RIKZ
##
## AIC BIC logLik -2*log(L) df.resid
## 218.7 227.8 -104.4 208.7 40
##
## Scaled residuals:
## Min 1Q Median 3Q Max
## -1.35846 -0.51129 -0.21846 0.09802 2.45384
##
## Random effects:
## Groups Name Variance Std.Dev. Corr
## Beach (Intercept) 0.2630 0.5128
## NAP 0.0891 0.2985 0.18
## Number of obs: 45, groups: Beach, 9
##
## Fixed effects:
## Estimate Std. Error z value Pr(>|z|)
## (Intercept) 1.6942 0.1868 9.071 < 2e-16 ***
## NAP -0.6074 0.1374 -4.421 9.81e-06 ***
## ---
## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
##
## Correlation of Fixed Effects:
## (Intr)
## NAP 0.121
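As in the Poisson GLM, the log link means fixed effects act multiplicatively on the expected richness (here conditional on the random effects for each beach):

```r
exp(-0.6074)   # approx 0.54: expected richness is roughly halved per unit NAP
```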
Exam and exam preparation
The exam: 27th November 9am
Aids: “Specified printed and hand-written support material is allowed.
A specific basic calculator is allowed.”
(= a yellow sheet; you can get these from the department offices on
the 7th floor)
Relevant exams are found on the bottom of each module page, and
linked to from the wiki page.
There will be a revision session in R4 on Tuesday 26th November,
10:00-12:00 (R4 is behind the cafe in Realfagbygget).