The parameters estimated in a moult analysis are mean start date,
duration and standard deviation. The estimates in the moult
package are found by maximising a likelihood. One way to construct
confidence intervals for these estimates is to first obtain an estimate
for the standard error (SE) and then assuming a normal sampling
distribution for the estimate and constructing a confidence interval
as
estimate ± 1.96 × SE(estimate).
In the moult package, SEs are found from the observed information matrix (inverse of the matrix of 2nd derivatives of the log-likelihood function). The above normal confidence interval only works if the sampling distributions of the statistics are close to normal.
A second reason for introducing bootstrap confidence intervals: the
standard deviation parameter in the moult
package is
estimated on the log-scale. It’s SE on the original scale can be
approximated (using the Delta method) but finding covariances is much
more complicated, e.g. it is difficult to obtain the covariance between
start date and standard deviation in start date for both parameters on
the original scale.
One solution to solve both of the above problems (normal assumption possibly not valid, and covariances for all parameters) is bootstrapping. Bootstrapping also allows us to find confidence intervals for any derived parameters, such as the mean end date of moult.
The confint.moult
function finds percentile bootstrap
confidence intervals for a range of parameters, and also returns
bootstrap estimates of the covariance matrix and standard errors.
Bootstrapping approximates the sampling distribution of any statistic. It works by repeatedly sampling from the data with replacement, and estimating the statistic of interest in each of these re-samples. The resulting bootstrap sample can be used to estimate standard errors and to obtain confidence intervals. Bootstrap percentile intervals estimate confidence intervals using the percentiles (e.g. 2.5% and 97.5% for a 95% confidence level) of the boostrap distribution.
library(moult)
data(sanderlings)
m2 <- moult(MIndex ~ Day, data = sanderlings)
summary(m2)
#>
#> Call:
#> moult(formula = MIndex ~ Day, data = sanderlings)
#>
#>
#> Duration coefficients:
#> Estimate Std. Error
#> duration 96.08 6.902
#>
#> Mean start date coefficients:
#> Estimate Std. Error
#> mean-start-day 131.4 2.096
#>
#> Coefficients for standard deviation in start date:
#> Estimate Std. Error
#> SD-start-day 19.18 1.984
#>
#> Log-likelihood: -255 on 3 Df
We want to obtain confidence intervals for the usual parameters mean start date, duration, SD in start date. In addition we might be interested in confidence intervals for the mean end date (mean start date + duration) and the mid-date (start date + 0.5 × duration).
In the following code the moult data are bootstrapped B = 1000 times (1000 bootstrap
samples). The same moult model as used for the original data
(m2
above) is fitted to each bootstrap sample, and the mean
start date, standard deviation in start date, duration, mean end date,
and mid-date are estimated. From these bootstrapped values a covariance
matrix can be obtained. Standard errors are square roots of the
diagonals of this covariance matrix.
95% bootstrap percentile intervals are obtained from the quantiles of the bootstrap distribution. Percentile intervals usually work relatively well if the bootstrap distribution is symmetric.
boot.ci$bootstrap.percentile.ci
#> duration mean sd half.est end.est
#> 2.5% 77.8694 127.3481 14.00026 169.4970 208.6179
#> 97.5% 112.8390 135.3101 22.83148 187.6531 243.7498
boot.ci$bootstrap.vcov
#> duration mean sd half.est end.est
#> duration 77.285649 -2.498993 11.709161 36.143832 74.786656
#> mean -2.498993 4.301364 1.635802 3.051868 1.802372
#> sd 11.709161 1.635802 5.283523 7.490382 13.344963
#> half.est 36.143832 3.051868 7.490382 21.123784 39.195700
#> end.est 74.786656 1.802372 13.344963 39.195700 76.589028
boot.ci$bootstrap.SE
#> duration mean sd half.est end.est
#> 8.791226 2.073973 2.298592 4.596062 8.751516