Implementation of the SIMEX algorithm for measurement error models according to Cook and Stefanski

simex(model, SIMEXvariable, measurement.error, lambda = c(0.5, 1, 1.5,
  2), B = 100, fitting.method = "quadratic",
  jackknife.estimation = "quadratic", asymptotic = TRUE)

# S3 method for simex
plot(x, xlab = expression((1 + lambda)),
  ylab = colnames(b[, -1]), ask = FALSE, show = rep(TRUE, NCOL(b) -
  1), ...)

# S3 method for simex
predict(object, newdata, ...)

# S3 method for simex
print(x, digits = max(3, getOption("digits") - 3), ...)

# S3 method for summary.simex
print(x, digits = max(3, getOption("digits") -
  3), ...)

# S3 method for simex
refit(object, fitting.method = "quadratic",
  jackknife.estimation = "quadratic", asymptotic = TRUE, ...)

# S3 method for simex
summary(object, ...)

Arguments

model

the naive model

SIMEXvariable

character or vector of characters containing the names of the variables with measurement error

measurement.error

given standard deviations of measurement errors. In case of homoskedastic measurement error it is a matrix with dimension 1xlength(SIMEXvariable). In case of heteroskedastic error for at least one SIMEXvariable it is a matrix of dimension nx

lambda

vector of lambdas for which the simulation step should be done (without 0)

B

number of iterations for each lambda

fitting.method

fitting method for the extrapolation. linear, quadratic, nonlinear are implemented. (first 4 letters are enough)

jackknife.estimation

specifying the extrapolation method for jackknife variance estimation. Can be set to FALSE if it should not be performed

asymptotic

logical, indicating if asymptotic variance estimation should be done, in the naive model the option x = TRUE has to be set

x

object of class 'simex'

xlab

optional name for the X-Axis

ylab

vector containing the names for the Y-Axis

ask

logical. If TRUE, the user is asked for input, before a new figure is drawn

show

vector of logicals indicating for wich variables a plot should be produced

arguments passed to other functions

object

of class 'simex'

newdata

optionally, a data frame in which to look for variables with which to predict. If omitted, the fitted linear predictors are used

digits

number of digits to be printed

Value

An object of class 'simex' which contains:

coefficients

the corrected coefficients of the SIMEX model,

SIMEX.estimates

the estimates for every lambda,

model

the naive model,

measurement.error

the known error standard deviations,

B

the number of iterations,

extrapolation

the model object of the extrapolation step,

fitting.method

the fitting method used in the extrapolation step,

residuals

the residuals of the main model,

fitted.values

the fitted values of the main model,

call

the function call,

variance.jackknife

the jackknife variance estimate,

extrapolation.variance

the model object of the variance extrapolation,

variance.jackknife.lambda

the data set for the extrapolation,

variance.asymptotic

the asymptotic variance estimates,

theta

the estimates for every B and lambda,

...

Details

Nonlinear is implemented as described in Cook and Stefanski, but is numerically instable. It is not advisable to use this feature. If a nonlinear extrapolation is desired please use the refit() method.

Asymptotic is only implemented for naive models of class lm or glm with homoscedastic measurement error.

refit() refits the object with a different extrapolation function.

Methods (by generic)

  • plot: Plot the simulation and extrapolation step

  • predict: Predict using simex correction

  • print: Print simex nicely

  • print: Print summary nicely

  • refit: Refits the model with a different extrapolation function

  • summary: Summary of simulation and extrapolation

References

Cook, J.R. and Stefanski, L.A. (1994) Simulation-extrapolation estimation in parametric measurement error models. Journal of the American Statistical Association, 89, 1314 -- 1328

Carroll, R.J., Küchenhoff, H., Lombard, F. and Stefanski L.A. (1996) Asymptotics for the SIMEX estimator in nonlinear measurement error models. Journal of the American Statistical Association, 91, 242 -- 250

Carrol, R.J., Ruppert, D., Stefanski, L.A. and Crainiceanu, C. (2006). Measurement error in nonlinear models: A modern perspective., Second Edition. London: Chapman and Hall.

Lederer, W. and Küchenhoff, H. (2006) A short introduction to the SIMEX and MCSIMEX. R News, 6(4), 26--31

See also

mcsimex for discrete data with misclassification, lm, glm

Examples

## Seed set.seed(49494) ## simulating the measurement error standard deviations sd_me <- 0.3 sd_me2 <- 0.4 temp <- runif(100, min = 0, max = 0.6) sd_me_het1 <- sort(temp) temp2 <- rnorm(100, sd = 0.1) sd_me_het2 <- abs(sd_me_het1 + temp2) ## simulating the independent variables x (real and with measurement error): x_real <- rnorm(100) x_real2 <- rpois(100, lambda = 2) x_real3 <- -4*x_real + runif(100, min = -10, max = 10) # correlated to x_real x_measured <- x_real + sd_me * rnorm(100) x_measured2 <- x_real2 + sd_me2 * rnorm(100) x_het1 <- x_real + sd_me_het1 * rnorm(100) x_het2 <- x_real3 + sd_me_het2 * rnorm(100) ## calculating dependent variable y: y <- x_real + rnorm(100, sd = 0.05) y2 <- x_real + 2*x_real2 + rnorm(100, sd = 0.08) y3 <- x_real + 2*x_real3 + rnorm(100, sd = 0.08) ### one variable with homoscedastic measurement error (model_real <- lm(y ~ x_real))
#> #> Call: #> lm(formula = y ~ x_real) #> #> Coefficients: #> (Intercept) x_real #> -0.004752 0.994554 #>
(model_naiv <- lm(y ~ x_measured, x = TRUE))
#> #> Call: #> lm(formula = y ~ x_measured, x = TRUE) #> #> Coefficients: #> (Intercept) x_measured #> -0.01217 0.90327 #>
(model_simex <- simex(model_naiv, SIMEXvariable = "x_measured", measurement.error = sd_me))
#> #> Naive model: #> lm(formula = y ~ x_measured, x = TRUE) #> #> SIMEX-Variables: x_measured #> Number of Simulations: 100 #> #> Coefficients: #> (Intercept) x_measured #> -0.01528 0.97007 #>
plot(model_simex)
### two variables with homoscedastic measurement errors (model_real2 <- lm(y2 ~ x_real + x_real2))
#> #> Call: #> lm(formula = y2 ~ x_real + x_real2) #> #> Coefficients: #> (Intercept) x_real x_real2 #> -0.01411 0.99094 2.00011 #>
(model_naiv2 <- lm(y2 ~ x_measured + x_measured2, x = TRUE))
#> #> Call: #> lm(formula = y2 ~ x_measured + x_measured2, x = TRUE) #> #> Coefficients: #> (Intercept) x_measured x_measured2 #> 0.2050 0.8909 1.8593 #>
(model_simex2 <- simex(model_naiv2, SIMEXvariable = c("x_measured", "x_measured2"), measurement.error = cbind(sd_me, sd_me2)))
#> #> Naive model: #> lm(formula = y2 ~ x_measured + x_measured2, x = TRUE) #> #> SIMEX-Variables: x_measured, x_measured2 #> Number of Simulations: 100 #> #> Coefficients: #> (Intercept) x_measured x_measured2 #> -0.04414 0.93716 1.97311 #>
plot(model_simex2)
# NOT RUN { ### one variable with increasing heteroscedastic measurement error model_real (mod_naiv1 <- lm(y ~ x_het1, x = TRUE)) (mod_simex1 <- simex(mod_naiv1, SIMEXvariable = "x_het1", measurement.error = sd_me_het1, asymptotic = FALSE)) plot(mod_simex1) ### two correlated variables with heteroscedastic measurement errors (model_real3 <- lm(y3 ~ x_real + x_real3)) (mod_naiv2 <- lm(y3 ~ x_het1 + x_het2, x = TRUE)) (mod_simex2 <- simex(mod_naiv2, SIMEXvariable = c("x_het1", "x_het2"), measurement.error = cbind(sd_me_het1, sd_me_het2), asymptotic = FALSE)) plot(mod_simex2) ### two variables, one with homoscedastic, one with heteroscedastic measurement error model_real2 (mod_naiv3 <- lm(y2 ~ x_measured + x_het2, x = TRUE)) (mod_simex3 <- simex(mod_naiv3, SIMEXvariable = c("x_measured", "x_het2"), measurement.error = cbind(sd_me, sd_me_het2), asymptotic = FALSE)) ### glm: two variables, one with homoscedastic, one with heteroscedastic measurement error t <- x_real + 2*x_real2 + rnorm(100, sd = 0.01) g <- 1 / (1 + exp(t)) u <- runif(100) ybin <- as.numeric(u < g) (logit_real <- glm(ybin ~ x_real + x_real2, family = binomial)) (logit_naiv <- glm(ybin ~ x_measured + x_het2, x = TRUE, family = binomial)) (logit_simex <- simex(logit_naiv, SIMEXvariable = c("x_measured", "x_het2"), measurement.error = cbind(sd_me, sd_me_het2), asymptotic = FALSE)) summary(logit_simex) print(logit_simex) plot(logit_simex) ### polr: two variables, one with homoscedastic, one with heteroscedastic measurement error if(require("MASS")) {# Requires MASS yerr <- jitter(y, amount=1) yfactor <- cut(yerr, 3, ordered_result=TRUE) (polr_real <- polr(yfactor ~ x_real + x_real2)) (polr_naiv <- polr(yfactor ~ x_measured + x_het2, Hess = TRUE)) (polr_simex <- simex(polr_naiv, SIMEXvariable = c("x_measured", "x_het2"), measurement.error = cbind(sd_me, sd_me_het2), asymptotic = FALSE)) summary(polr_simex) print(polr_simex) plot(polr_simex) } # }