Package 'lodi' reference manual

Title:	Limit of Detection Imputation for Single-Pollutant Models
Description:	Impute observed values below the limit of detection (LOD) via censored likelihood multiple imputation (CLMI) in single-pollutant models, developed by Boss et al. (2019) <doi:10.1097/EDE.0000000000001052>. CLMI handles exposure detection limits that may change throughout the course of exposure assessment. 'lodi' provides functions for imputing and pooling for this method.
Authors:	Jonathan Boss [aut], Alexander Rix [aut, cre]
Maintainer:	Alexander Rix <[email protected]>
License:	GPL-3
Version:	0.9.2
Built:	2025-02-20 04:14:36 UTC
Source:	https://github.com/umich-cphds/lodi

Censored Likelihood Multiple Imputation

Description

This function performs censored likelihood multiple imputation for single-pollutant models where the pollutant of interest is subject to varying detection limits across batches (it also works if there is only one distinct detection limit). The function outputs a list containing the imputed datasets and details regarding the imputation procedure (i.e., number of imputed datasets, covariates used to impute the non-detects, etc.).

Usage

clmi(formula, df, lod, seed, n.imps = 5, verbose = FALSE)

Arguments

`formula`	A formula in the form of `exposure ~ outcome + covariates`. That is, the first variable on the right hand side of `formula` should be the outcome of interest.
`df`	A data.frame with `exposure`, `outcome` and `covariates`.
`lod`	Name of limit of detection variable in `df`.
`seed`	Random seed, for reproducibility.
`n.imps`	Number of datasets to impute. Default is 5.
`verbose`	If `TRUE`, `clmi` prints out useful debugging information while running. Default is `FALSE`.

Details

clmi is somewhat picky regarding the formula parameter. It tries to infer what transformation you'd like to apply to the exposure you are imputing, what the exposure is, and what the outcome is. It attempts to check to make sure that everything is working correctly, but it can fail. Roughly, the rules are:

The left hand side of formula should be the exposure you are trying to impute.
The exposure may be optionally wrapped in a univariate transformation function. If the transformation function is not univariate, you ought to get an error about a "complicated" transformation.
The first variable on the right hand side of formula should be your outcome of interest.
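The rules above can be checked by hand with base R before calling clmi. The sketch below is not lodi's own parsing code; it only illustrates which pieces of a formula correspond to the transformation, the exposure, and the outcome. The variable names are taken from the toy_data examples.

```r
# Illustrative only: pull apart a formula along the lines of the rules above.
fm <- log(poll) ~ case_cntrl + smoking + gender

lhs <- fm[[2]]                                 # the call log(poll)
transformation <- as.character(lhs[[1]])       # name of the wrapping function
exposure <- all.vars(lhs)                      # variable being imputed
outcome <- attr(terms(fm), "term.labels")[1]   # first right-hand-side term

print(c(transformation = transformation, exposure = exposure, outcome = outcome))
```

If the left hand side is a bare variable with no transformation, `fm[[2]]` is a symbol rather than a call, so a real check would need to handle that case separately.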

Note

clmi only supports categorical variables that are numeric (i.e., not factors or characters). You can use the model.matrix function to convert a data frame with factors to a numeric design matrix and subsequently convert that matrix back into a data frame using as.data.frame.
If you get the error message "L-BFGS-B needs finite values of 'fn'", try normalising your data.
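A minimal sketch of the factor conversion described above, using only base R. The data frame and its column names are hypothetical, and it assumes non-detects are stored as NA, so the model frame is built with na.pass to avoid silently dropping those rows.

```r
# Hypothetical data frame with a factor covariate (not part of lodi).
df <- data.frame(
  poll    = c(1.2, NA, 0.7, 2.4),
  outcome = c(1, 0, 0, 1),
  site    = factor(c("a", "b", "c", "a"))
)

# model.matrix() drops rows containing NA by default; building the model
# frame with na.action = na.pass keeps them.
mf <- model.frame(~ poll + outcome + site, df, na.action = na.pass)
mm <- model.matrix(~ poll + outcome + site, mf)

# Drop the intercept column and convert back to a data frame.
df_numeric <- as.data.frame(mm[, -1, drop = FALSE])
str(df_numeric)
```

The factor `site` becomes numeric 0/1 indicator columns (one per non-reference level), which can then be used as covariates in the formula.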

References

Boss J, Mukherjee B, Ferguson KK, et al. Estimating outcome-exposure associations when exposure biomarker detection limits vary across batches. Epidemiology. 2019;30(5):746-755. doi:10.1097/EDE.0000000000001052

Examples

library(lodi)

# Note that the outcome of interest is the first variable on the right hand
# side of the formula.
clmi.out <- clmi(poll ~ case_cntrl + smoking + gender, toy_data, lod, 1)

# you can specify a transformation to the exposure in the formula
clmi.out <- clmi(log(poll) ~ case_cntrl + smoking + gender, toy_data, lod, 1)

Single pollutant complete case analysis.

Description

lod_cca is a helper function that does complete case analysis for single pollutant models. The function can be used to compare with clmi.
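For orientation, complete case analysis can be sketched in base R as follows. This is not lod_cca's implementation; the simulated data, the assumed LOD of 0.5, and the coding of non-detects as NA are all assumptions made for illustration.

```r
# Hypothetical data; NA marks a measurement below the limit of detection.
set.seed(1)
n <- 200
dat <- data.frame(
  poll    = exp(rnorm(n)),
  smoking = rbinom(n, 1, 0.3)
)
dat$case_cntrl <- rbinom(n, 1, plogis(-0.5 + 0.4 * log(dat$poll)))
dat$poll[dat$poll < 0.5] <- NA   # censor values below an assumed LOD of 0.5

# Complete case analysis: keep only rows with an observed exposure, then fit.
complete <- dat[!is.na(dat$poll), ]
fit <- glm(case_cntrl ~ poll + smoking, data = complete, family = binomial())
summary(fit)$coefficients
```

Discarding the non-detects shrinks the sample and restricts it to higher exposures, which is why the result is mainly useful as a point of comparison.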

Usage

lod_cca(formula, df, type)

Arguments

`formula`	An R formula in the form `outcome ~ exposure + covariates`.
`df`	A data.frame that contains the variables `formula` references.
`type`	The type of regression to perform. Acceptable options are linear and logistic.

Examples

library(lodi)
# load lodi's toy data
data("toy_data")
x <- lod_cca(case_cntrl ~ poll + smoking + gender, toy_data, logistic)
# see the fit model
x$model

Single pollutant `sqrt(2)` imputation.

Description

lod_root2 is a helper function that performs single imputation with lod / sqrt(2), a common ad hoc approach used in single-pollutant modeling. The function can be used to compare with clmi.
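The substitution itself is simple to write out in base R. The vectors below are hypothetical, and the sketch assumes non-detects are stored as NA with the applicable detection limit recorded alongside each measurement; it is not lod_root2's own code.

```r
# Hypothetical measurements; NA marks a value below the limit of detection.
poll <- c(0.8, NA, 2.5, NA, 1.1)
lod  <- c(0.5, 0.5, 0.7, 0.7, 0.5)   # batch-specific detection limits

# Replace each non-detect with its own lod / sqrt(2); keep observed values.
poll_imputed <- ifelse(is.na(poll), lod / sqrt(2), poll)
poll_imputed
```

Because every non-detect in a batch receives the same value, this approach ignores the uncertainty about where below the LOD the true value lies.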

Usage

lod_root2(formula, df, lod, type)

Arguments

`formula`	An R formula in the form `outcome ~ exposure + covariates`.
`df`	A data.frame that contains the variables `formula` references.
`lod`	Name of the limit of detection variable.
`type`	The type of regression to perform. Acceptable options are linear and logistic.

Note

Depending on the transformation used, a "Complicated transformation" error may occur. For example, the transformation a * exposure will cause an error. In this case, define a transformation function as f <- function(exposure) a * exposure and use f in your formula. This technical limitation is unavoidable at the moment.

Examples

# load lodi's toy data
library(lodi)
data("toy_data")
lodi.out <- lod_root2(case_cntrl ~ poll + smoking + gender, toy_data, lod,
                        logistic)
# see the fit model
lodi.out$model

# we can log transform poll to make it normally distributed
lodi.out <- lod_root2(case_cntrl ~ log(poll) + smoking + gender, toy_data,
                        lod, logistic)
lodi.out$model

# transforming the exposure results in a new column being added to data,
# representing the transformed lod.
head(lodi.out$data)

# You can even define your own transformation functions and use them
f <- function(x) exp(sqrt(x))
lodi.out <- lod_root2(case_cntrl ~ f(poll) + smoking + gender, toy_data, lod,
                        logistic)
head(lodi.out$data)

Calculate pooled estimates from `clmi.out` objects using Rubin's rules

Description

Calculate pooled estimates from clmi.out objects using Rubin's rules
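As a reminder of what Rubin's rules compute for a single coefficient, here is a minimal base R sketch. The estimates and standard errors are made-up numbers standing in for fits on five imputed datasets; pool.clmi's actual output may include further quantities (such as degrees of freedom or confidence intervals) that are not shown here.

```r
# Hypothetical estimates and standard errors from m = 5 imputed datasets.
est <- c(0.42, 0.38, 0.45, 0.40, 0.41)
se  <- c(0.10, 0.11, 0.10, 0.12, 0.10)
m   <- length(est)

pooled_est <- mean(est)                        # pooled point estimate
within     <- mean(se^2)                       # within-imputation variance
between    <- var(est)                         # between-imputation variance
total_var  <- within + (1 + 1 / m) * between   # total variance
pooled_se  <- sqrt(total_var)

c(estimate = pooled_est, se = pooled_se)
```

The pooled standard error is never smaller than the average within-imputation standard error, since the between-imputation term adds the uncertainty due to imputation.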

Usage

pool.clmi(formula, clmi.out, type)

Arguments

`formula`	Formula to fit. Exposure variable should end in `_transform_imputed`.
`clmi.out`	An object generated by clmi.
`type`	Type of regression to pool. Valid types are logistic and linear.

Examples

# continue example from clmi
# fit model on imputed data and pool results
library(lodi)
data("toy_data")
clmi.out <- clmi(log(poll) ~ case_cntrl + smoking + gender, toy_data, lod, 1)
results <- pool.clmi(case_cntrl ~ poll_transform_imputed + smoking, clmi.out,
                       logistic)

results$output

Synthetic toy data for clmi

Description

Synthetic toy data for clmi

Usage

toy_data

Format

A data.frame with 100 observations on 7 variables:

id: Patient ID number.
case_cntrl: Patient's case-control status. Either 1 or 0.
poll: Concentration of pollutant in patient's blood sample.
smoking: Smoking status. Either 1 or 0.
gender: Gender. 1 for male, 0 for female.
batch1: Batch status. Integer.
lod: Limit of detection for the patient's batch.
