Download the program gastempt1.r into RStudio.

# gastempt1.r # Vanilla fit of gastric emptying curves

library(nlme)This loads

*package*

*nlme*, where core function like

*nlme*and

*lme*are defined. For reasons only R-historian can understand, the library() functions loads the package

*nlme*, and not the library

*nlme*. Details of this package are documented in

*Pinheiro PC, Bates DM (2000) Mixed-Effects Models in S and S-Plus*. Springer, ISBN 0-387-98957-0, or PB for short.

### Self-Starting Emptying Function

options(digits=3) # print 3 significant digits # Self starting functions used to fit gastric emptying curves EmptInit0= function(mCall,LHS,data) { model=as.character(mCall[[1]]) xy=sortedXyData(mCall[["t"]],LHS,data) if (nrow(xy) < 6) { .Exit("Too few distinct input values to fit tmodel") } # we use the first of the sorted data points for # the initial volume (v0) estimation v0=xy[1,2] xy$y=log(xy$y/v0) tempt=log(0.5)/coef(lm(y~x,data=xy))[2] kappa = 0.8 # a good starting value tempt=max(min(tempt/2,200),20) # empirical correction value = c(v0,kappa,tempt) names(value)=mCall[c("v0","kappa","tempt")] value }

For non-linear function fits, good initial values are required. We define function *EmptInit0* that uses a linear fit to the log-transformed data points for a good guess. This function later will be extended to handle other function models, and we will hide it in a separate file because it rarely need changes.

# Standard LinExp model SSEmptLinExp=selfStart(~v0*(1+kappa*t/tempt)*exp(-t/tempt), initial=EmptInit0, parameters= c("v0","kappa","tempt"))

Here we define the model function *LinExp* for gastric emptying with accomodation we introduced in the publications. Conventionally, the function name is prepended by *SS* since
this is a **s**elf-**s**tarting function. Three parameters v_{o}, kappa and t_{empt} are to be fitted. Time t is no fit parameter, it is defined by the data values (PB ch. 8.1.2).

# Get Data comma-separated (csv) file with header ge = read.table("gastempt.csv",sep=",",header=TRUE)

The file *gastempt.csv* can be downloaded from here and should be unzipped into the current directory of R. The first lines of the file read as follows:

```
subj,treat,t,v
```

2,alb,3,416.53

2,alb,6,418.97

2,alb,9,442.77

When analyzing your own data, the header line should be exactly the same as given here. The
first number in each row is the ID of the subject or patient; you can also use initials instead of numbers here. The second field is the treatment, it should be a short text, e.g. *alb* for albumin. If you have one treatment only, put in a dummy text instead; in the example on this page, we do not explicitely use the treatment, but we include it in the more advanced examples. The third number is the time in minutes after start, followed by the volume. The latter does not need to have units of ml, it may also represent counts or voxels. The sort order of the data is arbitrary. Click here if you want to know how to use field separators other than the comma, or retrieve data from a database.

# Make sure subject are not treated as numbers ge$subj = as.factor(ge$subj)This line is only required when your subject IDs are numbers; in this case

*read.table*assumes that subjects are continuous variables, and R would try to fit some linear regression to the data in analysis. By explicitly converting to factors, we make sure that

*subj*is treated as a categorical variable. If the IDs were something like "A01","A02", this line were not needed, because

*read.table*reads in string-like values as factors by default.

ge$subjtreat = as.factor(paste(ge$subj,ge$treat,sep="."))

*nlsList*itself could handle both factors separately in a formula, we are going to pass the result as an approximation to

*nlme*, whose current implementation can handle only one combined factor.

gelist = nlsList(v~SSEmptLinExp(t,v0,kappa,tempt)|subjtreat,data=ge) print(gelist)

See PB chapter 8.1.3. If we had only one curve, we would call *n**ls*, which determines values of v_{0}, kappa and t_{empt} of the nonlinear function SSEmptLinExp such that the sum-of-square deviation of the measured volumes v is minimized. Function *nlsList* collects the data with common parameter *subjtreat* from dataframe *ge* and fits individual curves to each subset, returning results in a list.

Let's look at the output, printed in green here.

### nlsList Fit

>gelist = nlsList(v~SSEmptLinExp(t,v0,kappa,tempt)|subjtreat,data=ge) Error in nls(formula = formula, data = data, control = control) : step factor 0.000488281 reduced below 'minFactor' of 0.000976563 Error in nls(formula = formula, data = data, control = control) : step factor 0.000488281 reduced below 'minFactor' of 0.000976563

Our data contained a total of 8 gastric emptying curves, and for two of these es we got error messages. Compared to nonlinear fit software such as *Microcal Origin*, *Sigma Plot*, *GraphPad Prism* or *Systat*, function *nls* is very outspoken about results it does not like. It simply reports an error when it cannot find a unique combination of kappa and t_{empt} that minimizes the least square fit. When we tried to fit a data set that gave no result in *nls* with *GraphPad Prism*, we got a result of kappa=0.0 and t*empt*=311, which looked resonable, but looking at the respective standard errors of 55 and 17159 (seventeen-thousand!) told us that the rude refusal of *nls* probably was a polite hint after all. This is not specific to *L**inExp* model; it will also occur with conventional power-exponential fitting, but has rarely gained attention in the context of gastric emptying.

> print(gelist) Call: Model: v ~ SSEmptLinExp(t, v0, kappa, tempt) | subjtreat Data: ge Coefficients: v0 kappa tempt10.alb NA NA NA10.lip 547 2.06 43.711.alb NA NA NA11.lip 818 1.50 67.4 12.alb 433 2.50 45.1 12.lip 607 2.01 39.5 2.alb 431 1.11 97.8 2.lip 694 1.83 39.6 Degrees of freedom: 66 total; 48 residual Residual standard error: 25.4

# compute simple nlme fit # This is similar to the example in PB, page 368 (CO2) contr = nlmeControl(pnlsTol=0.3) ge0.nlme=nlme(gelist,control=contr) # Show summary ge0.nlme # We will do a bit more work with the coeffs, so we store them ge0.coef = coef(ge0.nlme)

### nlme Fit

Now we come to the core function. The first line of code is a bit of
tweaking.
When we call *nlme* with the result of *nlsList* passed as start values, we ask for a combined fit of all data sets, asssuming that all curves have some common statistics. This will
tame some of the runaway *NA* results of *nlsList* as if they had been told: "If in doubt, try to behave like your mainstream peers". Or, somewhat more mathematical: "If several pairs of kappa and t_{empt} give the same quality of fit, take that combination that is closer to the mean set of coefficents for all emptying curves".
We will see what this means when we plot the results in the
next example.

You cannot be sure that *nlme* converges for all data sets; sometimes it simply disappears into Nowhere-Land, telling you after 5 minutes that it ran out of patience. But **if** *nlme* reports result, you can be quite sure that all for all coefficients reasonable estimates have been found. Well, almost, but it worked for us with quite a few datasets.

> print(coef(ge0.nlme)) v0 kappa tempt10.alb 545 0.168 151.210.lip 557 1.973 44.611.alb 1053 0.677 113.711.lip 813 1.529 67.0 12.alb 489 1.874 51.4 12.lip 601 2.052 39.3 2.alb 428 1.170 93.8 2.lip 661 2.061 37.8

There is no "NA" entry in this table of coefficients, but the two albumin curves printed in bold that did not converge in *nlsList* above are still outliers (Details). Both have a kappa < 1, indicating that the gastric volume did not rise above the initial volume as generated by secretion or accomodation,

The task of parametrization of the curves is finished, but medical researchers
prefer to quote the half-emtpying time t_{50}. To compute t_{50} from kappa and t_{empt}, you can use our Add-On for Excel, or a Newton approximation by the R function *uniroot* as shown below.

# Determine t(50%) from tempt and kappa. # No closed solution known, we use newton approximation t50 = function(x){ value= uniroot(function(t) SSEmptLinExp(t,1,x$kappa,x$tempt) - 0.5, low=10,up=2000) # we expect between 10 and 2000 minutes value$root }

For a nice display, we will append the t_{50} to the primary estimated coefficients. We do it with a for-loop here to show that R can be used like a normal programming language; in the next example we will apply cryptic-short R code instead.

ge0.coef$t50=NA for (i in 1:nrow(ge0.coef)) { ge0.coef$t50[i]=t50(ge0.coef[i,]) } print(ge0.coef)

> print(ge0.coef) v0 kappa tempt t5010.alb 545 0.168 151.2 124.410.lip 557 1.973 44.6 109.711.alb 1053 0.677 113.7 152.111.lip 813 1.529 67.0 143.9 12.alb 489 1.874 51.4 123.2 12.lip 601 2.052 39.3 98.5 2.alb 428 1.170 93.8 172.9 2.lip 661 2.061 37.8 95.2

We feel better know about the two fits printed in bold; even if the values for t_{empt}
are on the large side, the medically relevant values for t_{50} are within a reasonable range.

In the next example, we will clean up the code and add graphics for presentation and statistical diagnostic.