Pairwise differences in cross-over design

R
intermediate
nlme
Author

Dieter Menne

Published

November 27, 2024

A colleague asked me how to present the results of a study in gastric emptying. As you have seen multiple time here, the emptying time t50 is the target variable. But read on even if you work with blood pressure (also often misused as example), rehabilitation, or pharmacology. The methods are the same.

12 subjects received 4 different meals on separate occasions in random order. The gastric half-emptying time t50 measured with MRI techniques was the target variable. How do I compute the confidence intervals of t50 and the 6 possible pairwise differences between meals with their confidence intervals?

Generate simulated study results

See the source code how to generate the data.

'data.frame':   48 obs. of  3 variables:
 $ subject: chr  "S01" "S01" "S01" "S01" ...
 $ meal   : Factor w/ 4 levels "a","b","c","d": 1 2 3 4 1 2 3 4 1 2 ...
 $ t50    : num  80.8 84.5 85.3 83.1 82.3 ...

Simulated data: 12 subjects, each received 4 different meals.

Mixed model and contrasts with confidence intervals

Click for code
# Contrasts to get confidence intervals of emptying times
absContrast = rbind("a" = c(1,0,0,0), "b" = c(1,1,0,0), 
                    "c" = c(1,0,1,0), "d" = c(1,0,1,0))
# Linear mixed model fit with meal as fixed and 
# subject as random 
# lme is a function in package nlme
d_lme = lme(t50~meal, random = ~1|subject, data = d)
summary(d_lme)  
Linear mixed-effects model fit by REML
  Data: d 
  AIC BIC logLik
  212 223   -100

Random effects:
 Formula: ~1 | subject
        (Intercept) Residual
StdDev:        0.78        2

Fixed effects:  t50 ~ meal 
            Value Std.Error DF t-value p-value
(Intercept)    81      0.61 33     132  0.0000
mealb           2      0.81 33       2  0.0417
mealc           2      0.81 33       2  0.0334
meald           3      0.81 33       4  0.0009
 Correlation: 
      (Intr) mealb mealc
mealb -0.66             
mealc -0.66   0.50      
meald -0.66   0.50  0.50

Standardized Within-Group Residuals:
  Min    Q1   Med    Q3   Max 
-2.01 -0.57  0.14  0.72  1.71 

Number of Observations: 48
Number of Groups: 12 

t50 with confidence intervals

When the design is complete, i.e. no data missing, the estimate of t50 is equal to the treatment mean. The 95% confidence intervals give an approximate range where future means of studies will be expected. Note that I avoid the formulation “where with 95% percent probability the result will be expected”; watch my favorite youtube video to understand the subtle differences.

Click for code
# glht is a function in package multcomp
glh_abs = glht(d_lme, linfct = absContrast)
ci_abs = data.frame(round(confint(glh_abs)$confint,1))
ci_abs
  Estimate lwr upr
a       81  79  82
b       82  81  84
c       83  81  84
d       83  81  84

Pairwise differences with confidence intervals

The pairwise differences are computed with the Tukey correction.

Click for code
# mcp is a function in package multcomp
tukey_contrast = mcp(meal = "Tukey")
glh_tukey = glht(d_lme, linfct = tukey_contrast)
ci_tukey = data.frame(round(confint(glh_tukey)$confint,1))
ci_tukey
      Estimate  lwr upr
b - a      1.7 -0.4 3.8
c - a      1.8 -0.3 3.9
d - a      3.0  0.9 5.0
c - b      0.1 -2.0 2.2
d - b      1.2 -0.8 3.3
d - c      1.2 -0.9 3.2

See the vignette coming with package multcomp for alternatives, such as the Dunnett correction; it is useful if you plan to compare the treatments against one reference treatment, here a, and gives slightly narrower confidence interval.

Click for code
dunnet_contrast = mcp(meal = "Dunnet")
glh_dunnet = glht(d_lme, linfct = dunnet_contrast)
ci_dunnet = data.frame(round(confint(glh_dunnet)$confint,1))
ci_dunnet
      Estimate  lwr upr
b - a      1.7 -0.2 3.6
c - a      1.8 -0.1 3.7
d - a      3.0  1.1 4.9

See the paper by Gelman, Hill and Yajima for why you should not worry about multiple corrections when using a function like lme to estimate contrasts. Some reviewers will disagree, though, but give it a try.

How to put that into a publication

I leave it up to you to arrange the pairwise differences neatly in a triangular table or use the results inline.

And which pairing is significant?

Only one pair is “significant”: both lower and upper limit have the same sign.

But my boss wants real p-values!

No problem, these are easy to calculate (hint: summary(glh_tukey)$test$pvalues), but I am not showing the details here to make it harder for you to produce the horrible p-value table. Once there are p-values, no one will look at the confidence intervals of the differences. Better journals with statistical reviewers will applaud if you just present confidence intervals; lesser journals with medical reviewers who have built their reputations on p-values will ask you to p. If you do understand the “reputation” part: watch the youtube video.

p with a view