Package 'dvmisc' reference manual

Title:	Convenience Functions, Moving Window Statistics, and Graphics
Description:	Collection of functions for running and summarizing statistical simulation studies, creating visualizations (e.g. CART Shiny app, histograms with fitted probability mass/density functions), calculating moving-window statistics efficiently, and performing common computations.
Authors:	Dane R. Van Domelen
Maintainer:	Dane R. Van Domelen <[email protected]>
License:	GPL-3
Version:	1.1.5
Built:	2025-03-04 04:49:08 UTC
Source:	https://github.com/vandomed/dvmisc

Convert Continuous BMI Values into 3-Level Factor

Description

Converts a continuous BMI variable into a 3-level factor variable: Normal weight if [-Inf, 25), Overweight if [25, 30), and Obese if [30, Inf).

Usage

bmi3(x, labels = TRUE)
bmi3(x, labels = TRUE)

Arguments

`x`	Numeric vector of BMI values.
`labels`	If `TRUE`, factor levels are labeled `"Normal weight"`, `"Overweight"`, and `"Obese"`; if `FALSE`, factor levels are `[-Inf, 25)`, `[25, 30)`, and `[30, Inf)`.

Value

Factor variable with 3 levels.

Convert Continuous BMI Values into 4-Level Factor

Description

Converts a continuous BMI variable into a 4-level factor variable: Underweight if [-Inf, 18.5), Normal weight if [18.5, 25), Overweight if [25, 30), and Obese if [30, Inf).

Usage

bmi4(x, labels = TRUE)
bmi4(x, labels = TRUE)

Arguments

`x`	Numeric vector of BMI values.
`labels`	If `TRUE`, factor levels are labeled `"Underweight"`, `"Normal weight"`, `"Overweight"`, and `"Obese"`; if `FALSE`, factor levels are `[-Inf, 18.5)`, `[18.5, 25)`, `[25, 30)`, and `[30, Inf)`.

Value

Factor variable with 4 levels.

Shiny App Interface with rpart::rpart

Description

User selects response variable and predictors, app displays optimal tree in terms of cross-validation error along with a summary table and importance plot. Relies on rpart for fitting and fancyRpartPlot for plotting.

Usage

cart_app(data)
cart_app(data)

Arguments

data

Data frame.

Value

Shiny app.

Create a Clean Summary Table from a glm Object

Description

Formats a glm object for printing to console or inputting to kable.

Usage

clean_glm(
  fit,
  columns = NULL,
  expand_factors = TRUE,
  variable_labels = NULL,
  prep_kable = FALSE,
  decimals = 2,
  formatp_list = NULL
)
clean_glm(
  fit,
  columns = NULL,
  expand_factors = TRUE,
  variable_labels = NULL,
  prep_kable = FALSE,
  decimals = 2,
  formatp_list = NULL
)

Arguments

`fit`	Object returned from `glm`.
`columns`	Character vector specifying what columns to include. Choices for each element are `"beta"`, `"se"`, `"betaci"`, `"beta_se"`, `"beta_ci"` `"or"`, `"orci"`, `"or_ci"`, `"hr"`, `"hrci"`, `"hr_ci")`, `"z"`, `"t"`, and `"p"`.
`expand_factors`	Logical value for whether to include two blank rows for factor variables (name of variable and reference group).
`variable_labels`	Character vector in case you want labels other than the variable names.
`prep_kable`	Logical value for whether to prepare for printing via `kable`. Right now, it just adds forward slashes so factor levels are indented, which only applies if there are factor variables and `expand_factors = TRUE`.
`decimals`	Numeric value of vector specifying number of decimal places for each column.
`formatp_list`	Arguments to pass to `formatp`.

Value

Data frame.

Examples

fit <- glm(mpg ~ wt + as.factor(cyl) + hp, data = mtcars)
clean_glm(fit)
fit %>% clean_glm(prep_kable = TRUE) %>% knitr::kable()

fit <- glm(mpg ~ wt + as.factor(cyl) + hp, data = mtcars)
clean_glm(fit)
fit %>% clean_glm(prep_kable = TRUE) %>% knitr::kable()

Convert Numeric to Factor with Convenient Interface

Description

So you can stop guess-and-checking with cut.

Usage

cleancut(x, breaks, labels = NULL)
cleancut(x, breaks, labels = NULL)

Arguments

`x`	Numeric vector.
`breaks`	Character string, e.g. `"[-Inf, 0), [0, 10], (10, Inf)"`.
`labels`	Character vector.

Value

Factor or integer vector.

Examples

x <- rnorm(100)
y <- cleancut(x, "(-Inf, -1), [-1, 1], (1, Inf)")
tapply(x, y, range)

y <- cleancut(x, "(-Inf, -1), [-1, 1], (1, Inf)", c("<-1", "-1 to 1", ">1"))
tapply(x, y, range)

x <- rnorm(100)
y <- cleancut(x, "(-Inf, -1), [-1, 1], (1, Inf)")
tapply(x, y, range)

y <- cleancut(x, "(-Inf, -1), [-1, 1], (1, Inf)", c("<-1", "-1 to 1", ">1"))
tapply(x, y, range)

Create Quantile Groups

Description

Combines quantile and cut into a single function, with strata-specific quantiles possible. For example, you could create sex-specific height tertiles with create_qgroups(height, groups = 3, strata = sex). Compatible with dplyr functions like mutate and transmute.

Usage

create_qgroups(
  x,
  groups = 4,
  probs = seq(1/groups, 1 - 1/groups, 1/groups),
  strata = NULL,
  quantile_list = list(na.rm = TRUE),
  cut_list = list(include.lowest = TRUE)
)
create_qgroups(
  x,
  groups = 4,
  probs = seq(1/groups, 1 - 1/groups, 1/groups),
  strata = NULL,
  quantile_list = list(na.rm = TRUE),
  cut_list = list(include.lowest = TRUE)
)

Arguments

`x`	Numeric vector.
`groups`	Numeric value, e.g. 3 for tertiles, 4 for quartiles, etc.
`probs`	Numeric vector.
`strata`	Factor specifying subgroups to calculate quantiles within. For multivariable subgroups, you can use `interaction`.
`quantile_list`	Arguments to pass to `quantile`.
`cut_list`	Arguments to pass to `cut`.

Value

Factor variable.

Examples

# In mtcars dataset, create tertiles for mpg
mtcars$mpg_tertiles <- create_qgroups(mtcars$mpg, groups = 3)
table(mtcars$mpg_tertiles)

# Define tertile cutpoints separately for 4-, 6-, and 8-cylinder vehicles
mtcars$mpg_tertiles <- create_qgroups(mtcars$mpg, groups = 3, strata = mtcars$cyl)
table(mtcars$mpg_tertiles)

# Works with dplyr functions like mutate
mtcars <- mtcars %>% 
  dplyr::mutate(mpg_tertiles = create_qgroups(mpg, groups = 3, strata = cyl))
table(mtcars$mpg_tertiles)

# Can embed in lm, glm, etc.
summary(lm(mpg ~ create_qgroups(wt), data = mtcars))

# In mtcars dataset, create tertiles for mpg
mtcars$mpg_tertiles <- create_qgroups(mtcars$mpg, groups = 3)
table(mtcars$mpg_tertiles)

# Define tertile cutpoints separately for 4-, 6-, and 8-cylinder vehicles
mtcars$mpg_tertiles <- create_qgroups(mtcars$mpg, groups = 3, strata = mtcars$cyl)
table(mtcars$mpg_tertiles)

# Works with dplyr functions like mutate
mtcars <- mtcars %>% 
  dplyr::mutate(mpg_tertiles = create_qgroups(mpg, groups = 3, strata = cyl))
table(mtcars$mpg_tertiles)

# Can embed in lm, glm, etc.
summary(lm(mpg ~ create_qgroups(wt), data = mtcars))

Create Quantile Groups (Complex Survey Data)

Description

Complex survey version of create_qgroups. Relies heavily on the survey package [1,2].

Usage

create_qgroups_svy(
  x,
  groups = 4,
  probs = seq(1/groups, 1 - 1/groups, 1/groups),
  strata = NULL,
  design,
  svyquantile_list = list(na.rm = TRUE),
  cut_list = list(include.lowest = TRUE)
)
create_qgroups_svy(
  x,
  groups = 4,
  probs = seq(1/groups, 1 - 1/groups, 1/groups),
  strata = NULL,
  design,
  svyquantile_list = list(na.rm = TRUE),
  cut_list = list(include.lowest = TRUE)
)

Arguments

`x`	Numeric vector.
`groups`	Numeric value, e.g. 3 for tertiles, 4 for quartiles, etc.
`probs`	Numeric vector.
`strata`	Factor specifying subgroups to calculate quantiles within. For multivariable subgroups, you can use `interaction`.
`design`	Survey design object.
`svyquantile_list`	Arguments to pass to `svyquantile`.
`cut_list`	Arguments to pass to `cut`.

Value

Factor variable.

References

1. Therneau, T. (2015). A Package for Survival Analysis in S. R package version 2.38. https://cran.r-project.org/package=survival.

2. Therneau, T.M. and Grambsch, P.M. (2000). Modeling Survival Data: Extending the Cox Model. Springer, New York. ISBN 0-387-98784-3.

Cut with Decreasing Factor Levels

Description

Convenience function to get decreasing factor levels from cut. Currently requires specifying breaks as vector of cutpoints rather than number of desired intervals.

Usage

cut_decreasing(x, breaks, include.lowest = FALSE, right = TRUE, ...)
cut_decreasing(x, breaks, include.lowest = FALSE, right = TRUE, ...)

Arguments

`x`, `breaks`, `include.lowest`, `right`	See `cut`. specifying number of intervals is not currently supported).
`...`	Arguments to pass to `cut`.

Value

Factor variable.

Examples

# In mtcars dataset, create 3 mpg groups
table(cut(mtcars$mpg, breaks = c(-Inf, 15, 20, Inf)))

# Repeat with cut_decreasing to get factor levels ordered from high to low. 
# To match cut here, need to specify right = FALSE
table(cut_decreasing(mtcars$mpg, breaks = c(Inf, 20, 15, -Inf), right = FALSE))

# You can specify breaks from low to high, but then include.lowest and right 
# arguments get confusing
table(cut_decreasing(mtcars$mpg, breaks = c(-Inf, 15, 20, Inf), right = TRUE))

# In mtcars dataset, create 3 mpg groups
table(cut(mtcars$mpg, breaks = c(-Inf, 15, 20, Inf)))

# Repeat with cut_decreasing to get factor levels ordered from high to low. 
# To match cut here, need to specify right = FALSE
table(cut_decreasing(mtcars$mpg, breaks = c(Inf, 20, 15, -Inf), right = FALSE))

# You can specify breaks from low to high, but then include.lowest and right 
# arguments get confusing
table(cut_decreasing(mtcars$mpg, breaks = c(-Inf, 15, 20, Inf), right = TRUE))

Plot Points +/- Error Bars

Description

Creates plot showing user-specified points (e.g. means, medians, regression coefficients) along with user-specified error bars (e.g. standard deviations, min/max, 95% confidence intervals).

Usage

dots_bars(
  y = NULL,
  bars = NULL,
  bars.lower = y - bars,
  bars.upper = y + bars,
  truth = NULL,
  group.labels = NULL,
  group.dividers = TRUE,
  subgroup.spacing = 1,
  subgroup.labels = NULL,
  subgroup.pch = NULL,
  subgroup.col = NULL,
  points.list = NULL,
  arrows.list = NULL,
  xaxis.list = NULL,
  yaxis.list = xaxis.list,
  abline.dividers.list = NULL,
  abline.truth.list = NULL,
  legend.list = NULL,
  ...
)
dots_bars(
  y = NULL,
  bars = NULL,
  bars.lower = y - bars,
  bars.upper = y + bars,
  truth = NULL,
  group.labels = NULL,
  group.dividers = TRUE,
  subgroup.spacing = 1,
  subgroup.labels = NULL,
  subgroup.pch = NULL,
  subgroup.col = NULL,
  points.list = NULL,
  arrows.list = NULL,
  xaxis.list = NULL,
  yaxis.list = xaxis.list,
  abline.dividers.list = NULL,
  abline.truth.list = NULL,
  legend.list = NULL,
  ...
)

Arguments

`y`	Numeric vector of y-values for different groups, or numeric matrix where each column contains y-values for clustered subgroups within a group.
`bars`	Numeric vector or matrix (matching whichever type `y` is) specifying the length of the error bar for each group/subgroup (i.e. distance from point to one end of error bar).
`bars.lower`	Numeric vector or matrix (matching whichever type `y` is) specifying the position of the lower end of the error bar for each group/subgroup.
`bars.upper`	Numeric vector or matrix (matching whichever type `y` is) specifying the position of the upper end of the error bar for each group/subgroup.
`truth`	Numeric value specifying true value of parameter being estimated. If specified, a horizontal reference line is added to the plot.
`group.labels`	Character vector of labels for the groups.
`group.dividers`	Logical value for whether to add vertical lines distinguishing the groups.
`subgroup.spacing`	Numeric value controlling the amount of spacing between subgroups, with values > 1 corresponding to more spacing.
`subgroup.labels`	Character vector giving labels for the subgroups.
`subgroup.pch`	Plotting symbol for different subgroups within each group.
`subgroup.col`	Plotting color for different subgroups within each group.
`points.list`	Optional list of inputs to pass to `points`.
`arrows.list`	Optional list of inputs to pass to `arrows`.
`xaxis.list`	Optional list of inputs to pass to `axis` for x-axis.
`yaxis.list`	Optional list of inputs to pass to `axis` for y-axis.
`abline.dividers.list`	Optional list of inputs to pass to `abline` for group dividers. Only used if `group.dividers = TRUE`.
`abline.truth.list`	Optional list of inputs to pass to `abline` for horizontal line at true value of parameter. Only used if `truth` is specified.
`legend.list`	Optional list of inputs to pass to `legend`.
`...`	Additional arguments to pass to `plot` function.

Value

Plot showing points +/- error bars across groups/subgroups.

Examples

# Generate 100 values from normal distributions with different means, and 
# graph mean +/- standard deviation across groups 
dat <- cbind(rnorm(100, 2), rnorm(100, 2.5), rnorm(100, 1.75))
means <- apply(dat, 2, mean)
sds <- apply(dat, 2, sd)
fig1 <- dots_bars(y = means, bars = sds, main = "Mean +/- SD by Group",
                  ylab = "Mean +/- SD")
                  
# Simulate BMI values for males and females in 3 different age groups, and 
# graph mean +/- 95% CI
sex <- as.factor(c(rep("Male", 300), rep("Female", 300)))
age <- as.factor(rep(c("Young", "Middle", "Old"), 2))
bmi <- c(rnorm(100, 25, 4), rnorm(100, 26, 4.25), rnorm(100, 27, 4.5),
         rnorm(100, 26.5, 4.5), rnorm(100, 27.25, 4.75), rnorm(100, 28, 5))
dat <- data.frame(sex = sex, age = age, bmi = bmi)
means <- tapply(dat$bmi, dat[, c("sex", "age")], mean)
ci.lower <- tapply(dat$bmi, dat[, c("sex", "age")],
                   function(x) t.test(x)$conf.int[1])
ci.upper <- tapply(dat$bmi, dat[, c("sex", "age")],
                   function(x) t.test(x)$conf.int[2])
fig2 <- dots_bars(y = means, bars.lower = ci.lower, bars.upper = ci.upper,
                  main = "BMI by Sex and Age",
                  ylab = "BMI (mean +/- CI)",
                  xlab = "Age group")

# Generate 100 values from normal distributions with different means, and 
# graph mean +/- standard deviation across groups 
dat <- cbind(rnorm(100, 2), rnorm(100, 2.5), rnorm(100, 1.75))
means <- apply(dat, 2, mean)
sds <- apply(dat, 2, sd)
fig1 <- dots_bars(y = means, bars = sds, main = "Mean +/- SD by Group",
                  ylab = "Mean +/- SD")
                  
# Simulate BMI values for males and females in 3 different age groups, and 
# graph mean +/- 95% CI
sex <- as.factor(c(rep("Male", 300), rep("Female", 300)))
age <- as.factor(rep(c("Young", "Middle", "Old"), 2))
bmi <- c(rnorm(100, 25, 4), rnorm(100, 26, 4.25), rnorm(100, 27, 4.5),
         rnorm(100, 26.5, 4.5), rnorm(100, 27.25, 4.75), rnorm(100, 28, 5))
dat <- data.frame(sex = sex, age = age, bmi = bmi)
means <- tapply(dat$bmi, dat[, c("sex", "age")], mean)
ci.lower <- tapply(dat$bmi, dat[, c("sex", "age")],
                   function(x) t.test(x)$conf.int[1])
ci.upper <- tapply(dat$bmi, dat[, c("sex", "age")],
                   function(x) t.test(x)$conf.int[2])
fig2 <- dots_bars(y = means, bars.lower = ci.lower, bars.upper = ci.upper,
                  main = "BMI by Sex and Age",
                  ylab = "BMI (mean +/- CI)",
                  xlab = "Age group")

Convenience Functions, Moving Window Statistics, and Graphics

Description

Collection of functions for running and summarizing statistical simulation studies, creating visualizations (e.g. CART Shiny app, histograms with fitted probability mass/density functions), calculating moving-window statistics efficiently, and performing common computations.

Details

Package:	dvmisc
Type:	Package
Version:	1.1.5
Date:	2020-09-27
License:	GPL-3

See CRAN documentation for full list of functions.

Author(s)

Dane R. Van Domelen
[email protected]

References

Eddelbuettel, D. and Francois, R. (2011) Rcpp: Seamless R and C++ Integration. Journal of Statistical Software, 40(8), 1-18. http://www.jstatsoft.org/v40/i08/.

Eddelbuettel, D. (2013) Seamless R and C++ Integration with Rcpp. Springer, New York. ISBN 978-1-4614-6867-7.

Eddelbuettel, D. and Balamuta, J.J. (2017). Extending R with C++: A Brief Introduction to Rcpp. PeerJ Preprints 5:e3188v1. https://doi.org/10.7287/peerj.preprints.3188v1.

Acknowledgment: This material is based upon work supported by the National Science Foundation Graduate Research Fellowship under Grant No. DGE-0940903.

Similar to expand.grid but with Sequences Reversed and Ability to Treat Variables as Sets

Description

Loops over the last argument, then the second-last, and so on. It should be faster than expand.grid.

Usage

expand_grid(..., together = NULL)
expand_grid(..., together = NULL)

Arguments

`...`	Vectors you want all combinations of.
`together`	Data frame of vectors, where each row is a set of parameter values that are always kept together.

Value

Data table.

Examples

# Simple example of expand.grid vs. expand_grid
expand.grid(x = c("a", "b", "c"), y = c(1, 2), z = c(TRUE, FALSE))
expand_grid(x = c("a", "b", "c"), y = c(1, 2), z = c(TRUE, FALSE))

# How to keep certain variables together
expand_grid(x = 1: 5, 
            together = data.frame(y = c("1a", "2a"), z = c("1b", "2b")))

# Simple example of expand.grid vs. expand_grid
expand.grid(x = c("a", "b", "c"), y = c(1, 2), z = c(TRUE, FALSE))
expand_grid(x = c("a", "b", "c"), y = c(1, 2), z = c(TRUE, FALSE))

# How to keep certain variables together
expand_grid(x = 1: 5, 
            together = data.frame(y = c("1a", "2a"), z = c("1b", "2b")))

Constant-Scale Gamma Model for Y vs. Covariates with Y Potentially Subject to Multiplicative Lognormal Errors

Description

Uses maximum likelihood to fit Y|X ~ Gamma(exp(beta_0 + beta_x^T X), b), with the shape-scale (as opposed to shape-rate) parameterization described in GammaDist. Y can be precisely measured or subject to multiplicative mean-1 lognormal errors, in which case replicates can be incorporated by specifying y as a list.

Usage

gammareg(
  y,
  x = NULL,
  merror = FALSE,
  integrate_tol = 1e-08,
  integrate_tol_hessian = integrate_tol,
  estimate_var = TRUE,
  fix_posdef = FALSE,
  ...
)
gammareg(
  y,
  x = NULL,
  merror = FALSE,
  integrate_tol = 1e-08,
  integrate_tol_hessian = integrate_tol,
  estimate_var = TRUE,
  fix_posdef = FALSE,
  ...
)

Arguments

`y`	Numeric vector.
`x`	Numeric vector or matrix. If `NULL`, model reduces to marginal Gamma model Y ~ Gamma(exp(beta_0), b).
`merror`	Logical value for whether to model multiplicative lognormal measurement errors in Y.
`integrate_tol`	Numeric value specifying the `tol` input to `hcubature`. Only used if `merror = TRUE`.
`integrate_tol_hessian`	Same as `integrate_tol`, but for use when estimating the Hessian matrix only. Sometimes more precise integration (i.e. smaller tolerance) than used for maximizing the likelihood helps prevent cases where the inverse Hessian is not positive definite.
`estimate_var`	Logical value for whether to return Hessian-based variance-covariance matrix.
`fix_posdef`	Logical value for whether to repeatedly reduce `integrate_tol_hessian` by factor of 5 and re-estimate Hessian to try to avoid non-positive definite variance-covariance matrix.
`...`	Additional arguments to pass to `nlminb`.

Value

List containing:

Numeric vector of parameter estimates.
Variance-covariance matrix (if estimate_var = TRUE).
Returned nlminb object from maximizing the log-likelihood function.
Akaike information criterion (AIC).

Extract Mean Squared Error (MSE) from Fitted Regression Model

Description

The MSE, defined as the sum of the squared residuals divided by n-p (n = number of observations, p = number of regression coefficients), is an unbiased estimator for the error variance in a linear regression model. This is a convenience function that extracts the MSE from a fitted lm or glm object. The code is rev(anova(model.fit)$"Mean Sq")[1] if model.fit is a lm object and sum(model.fit$residuals^2) / model.fit$df.residual if model.fit is a glm object.

Usage

get_mse(model.fit, var.estimate = FALSE)
get_mse(model.fit, var.estimate = FALSE)

Arguments

`model.fit`	Fitted regression model returned from `lm` or `glm`.
`var.estimate`	If `TRUE`, function returns a variance estimate for the error variance, defined as `2 * MSE^2 / (n - p)`.

Value

If var.estimate = FALSE, numeric value indicating the MSE; if var.estimate = TRUE, named numeric vector indicating both the MSE and a variance estimate for the error variance.

Examples

# Generate 100 values: Y = 0.5 + 1.25 X + e, e ~ N(0, 1)
set.seed(123)
x <- rnorm(100)
y <- 0.5 + 1.25 * x + rnorm(100, sd = 1)

# Fit regression model using lm and using glm
lm.fit <- lm(y ~ x)
glm.fit <- glm(y ~ x)

# Extract MSE from lm.fit and glm.fit
get_mse(lm.fit)
get_mse(glm.fit)

# Generate 100 values: Y = 0.5 + 1.25 X + e, e ~ N(0, 1)
set.seed(123)
x <- rnorm(100)
y <- 0.5 + 1.25 * x + rnorm(100, sd = 1)

# Fit regression model using lm and using glm
lm.fit <- lm(y ~ x)
glm.fit <- glm(y ~ x)

# Extract MSE from lm.fit and glm.fit
get_mse(lm.fit)
get_mse(glm.fit)

Quickly Format and Save a ggplot Object for Posting to Facebook

Description

Optionally resizes title and axis labels and saves a 1000 dpi png 5 inches tall and ~9.5 inches wide via ggsave. Values are adjustable.

Usage

gg_facebook(
  filename = paste("fig1-", Sys.Date(), ".png", sep = ""),
  path = getwd(),
  plot = last_plot(),
  resize_labels = TRUE,
  title_size = 16,
  axistitle_size = 13,
  axistext_size = 9,
  height = 5,
  width = height/0.525,
  dpi = 1000
)
gg_facebook(
  filename = paste("fig1-", Sys.Date(), ".png", sep = ""),
  path = getwd(),
  plot = last_plot(),
  resize_labels = TRUE,
  title_size = 16,
  axistitle_size = 13,
  axistext_size = 9,
  height = 5,
  width = height/0.525,
  dpi = 1000
)

Arguments

`filename`	Character string
`path`	Character string
`plot`	`ggplot`
`resize_labels`	Logical value
`title_size`	Numeric value
`axistitle_size`	Numeric value
`axistext_size`	Numeric value
`height`	Numeric value
`width`	Numeric value
`dpi`	Numeric value

Quickly Format and Save a ggplot Object with Reasonable Defaults

Description

Optionally resizes title and axis labels and saves a 1000 dpi png 6 inches tall and 8 inches wide via ggsave. Values are adjustable.

Usage

gg_quicksave(
  filename = paste("fig1-", Sys.Date(), ".png", sep = ""),
  path = getwd(),
  plot = last_plot(),
  resize_labels = TRUE,
  title_size = 16,
  axistitle_size = 13,
  axistext_size = 9,
  height = 6,
  width = height/0.75,
  dpi = 1000
)
gg_quicksave(
  filename = paste("fig1-", Sys.Date(), ".png", sep = ""),
  path = getwd(),
  plot = last_plot(),
  resize_labels = TRUE,
  title_size = 16,
  axistitle_size = 13,
  axistext_size = 9,
  height = 6,
  width = height/0.75,
  dpi = 1000
)

Arguments

`filename`	Character string
`path`	Character string
`plot`	`ggplot` object
`resize_labels`	Logical value
`title_size`	Numeric value
`axistitle_size`	Numeric value
`axistext_size`	Numeric value
`height`	Numeric value
`width`	Numeric value
`dpi`	Numeric value

Quickly Format and Save a ggplot Object for Posting to Twitter

Description

Optionally resizes title and axis labels and saves a 600 dpi png 5 inches tall and ~8.9 inches wide via ggsave. Values are adjustable.

Usage

gg_twitter(
  filename = paste("fig1-", Sys.Date(), ".png", sep = ""),
  path = getwd(),
  plot = last_plot(),
  resize_labels = TRUE,
  title_size = 16,
  axistitle_size = 13,
  axistext_size = 9,
  height = 5,
  width = height/0.5625,
  dpi = 600
)
gg_twitter(
  filename = paste("fig1-", Sys.Date(), ".png", sep = ""),
  path = getwd(),
  plot = last_plot(),
  resize_labels = TRUE,
  title_size = 16,
  axistitle_size = 13,
  axistext_size = 9,
  height = 5,
  width = height/0.5625,
  dpi = 600
)

Arguments

`filename`	Character string
`path`	Character string
`plot`	`ggplot` object
`resize_labels`	Logical value
`title_size`	Numeric value
`axistitle_size`	Numeric value
`axistext_size`	Numeric value
`height`	Numeric value
`width`	Numeric value
`dpi`	Numeric value

Return the First and Last Part of an Object

Description

Simply head and tail combined.

Usage

headtail(x, ...)
headtail(x, ...)

Arguments

`x`	Input object.
`...`	Additional arguments to pass to `head` and `tail` functions.

Value

Same class as x.

Examples

# Generate data from N(0, 1), sort, and look at smallest and largest 3 values
x <- rnorm(1000)
x.sorted <- sort(x)
headtail(x.sorted, 3)

# Generate data from N(0, 1), sort, and look at smallest and largest 3 values
x <- rnorm(1000)
x.sorted <- sort(x)
headtail(x.sorted, 3)

Histogram with Added Options

Description

Similar to base R function hist, but with two added features: (1) Can overlay one or more fitted probability density/mass functions (PDFs/PMFs) for any univariate distribution supported in R (see Distributions); and (2) Can generate more of a barplot type histogram, where each possible value gets its own bin centered over its value (useful for discrete variables with not too many possible values).

Usage

histo(
  x,
  dis = "none",
  dis_shift = NULL,
  integer_breaks = NULL,
  colors = rep("black", length(dis)),
  lty = 1:length(dis),
  legend_form = ifelse(length(dis) == 1, 0, 1),
  aic_decimals = 1,
  points_list = NULL,
  axis_list = NULL,
  legend_list = NULL,
  ...
)
histo(
  x,
  dis = "none",
  dis_shift = NULL,
  integer_breaks = NULL,
  colors = rep("black", length(dis)),
  lty = 1:length(dis),
  legend_form = ifelse(length(dis) == 1, 0, 1),
  aic_decimals = 1,
  points_list = NULL,
  axis_list = NULL,
  legend_list = NULL,
  ...
)

Arguments

`x`	Numeric vector of values.
`dis`	Character vector indicating which distributions should be used to add fitted PDF/PMF to the histogram. If not `"none"`, choices for each element are: `"beta"` `"binom"` (must specify `size`) `"cauchy"` `"chisq"` `"exp"` `"f"` `"gamma"` `"geom"` `"hyper"` (must specify total number of balls in urn, `N`, and number of balls drawn each time, `k`) `"lnorm"` `"nbinom"` (must specify `size`) `"norm"` `"pois"`, `"t"` `"unif"` `"weibull"`
`dis_shift`	Numeric value for shifting the fitted PDF/PMF along the x-axis of the histogram.
`integer_breaks`	If `TRUE`, integers covering the range of `x` are used for breaks, so there is one bin for each integer. Useful for discrete distributions that don't take on too many unique values.
`colors`	Character vector of colors for each PDF/PMF.
`lty`	Integer vector specifying line types for each curve.
`legend_form`	Integer value controlling what type of legend to include. Choices are 0 for no legend, 1 for legend naming each distribution, and 2 for legend naming each distribution and the corresponding AIC.
`aic_decimals`	Integer value for number of decimals for AIC.
`points_list`	Optional list of inputs to pass to `points` function, which is used to add the fitted PDF/PMF.
`axis_list`	Optional list of inputs to pass to `axis`.
`legend_list`	Optional list of inputs to pass to `legend`.
`...`	May include arguments to pass to `hist` and/or parameter values needed for certain distributions (`size` if `dis = "binom"` or `dis = "nbinom"`, `N` and `k` if `dis = "hyper"`).

Details

When x takes on whole numbers, you typically want to set dis_shift = -0.5 if right = TRUE (hist's default) and dis_shift = 0.5 if right = FALSE. The function will do this internally by default.

To illustrate, suppose a particular bin represents (7, 10]. Its midpoint will be at x = 8.5 on the graph. But if input values are whole numbers, this bin really only includes values of 8, 9, and 10, which have a mean of 9. So you really want f(9) to appear at x = 8.5. This requires shifting the curve to the left 0.5 units, i.e. setting dis_shift = -0.5.

When x takes on whole numbers with not too many unique values, you may want the histogram to show one bin for each integer. You can do this by setting integer_breaks = TRUE. By default, the function sets integer_breaks = TRUE if x contains whole numbers with 10 or fewer unique values.

Value

Histogram with fitted PDFs/PMFs if requested.

Examples

# Sample 10,000 Poisson(2) values and commpare default hist vs. histo
set.seed(123)
x <- rpois(n = 10000, lambda = 2)
par(mfrow = c(1, 2))
hist(x, main = "hist function")
histo(x, main = "histo function")

# Sample 10,000 lognormal(0, 0.35) values. Create histogram with curves
# showing fitted lognormal, normal, and Gamma PDFs.
set.seed(123)
x <- rlnorm(n = 10000, meanlog = 0, sdlog = 0.35)
par(mfrow = c(1, 1))
histo(x, c("lnorm", "norm", "gamma"), main = "X ~ Lognormal(0, 0.35)")

# Generate 10,000 Binomial(8, 0.25) values. Create histogram, specifying
# size = 5, with blue line/points showing fitted PMF.
set.seed(123)
x <- rbinom(n = 10000, size = 5, prob = 0.25)
par(mfrow = c(1, 1))
histo(x, dis = "binom", size = 5, colors = "blue", 
      points_list = list(type = "b"))

# Sample 10,000 Poisson(2) values and commpare default hist vs. histo
set.seed(123)
x <- rpois(n = 10000, lambda = 2)
par(mfrow = c(1, 2))
hist(x, main = "hist function")
histo(x, main = "histo function")

# Sample 10,000 lognormal(0, 0.35) values. Create histogram with curves
# showing fitted lognormal, normal, and Gamma PDFs.
set.seed(123)
x <- rlnorm(n = 10000, meanlog = 0, sdlog = 0.35)
par(mfrow = c(1, 1))
histo(x, c("lnorm", "norm", "gamma"), main = "X ~ Lognormal(0, 0.35)")

# Generate 10,000 Binomial(8, 0.25) values. Create histogram, specifying
# size = 5, with blue line/points showing fitted PMF.
set.seed(123)
x <- rbinom(n = 10000, size = 5, prob = 0.25)
par(mfrow = c(1, 1))
histo(x, dis = "binom", size = 5, colors = "blue", 
      points_list = list(type = "b"))

Check Whether Numeric Value Falls Inside Two Other Numeric Values

Description

Returns TRUE if x falls inside range defined by ends and FALSE otherwise. Also works for multiple sets of values and/or endpoints.

Usage

inside(x, ends, inclusive = TRUE)
inside(x, ends, inclusive = TRUE)

Arguments

`x`	Numeric value or vector of numeric values.
`ends`	Numeric vector of length 2 specifying the endpoints for the interval, or a 2-column numeric matrix where each row specifies a pair of endpoints.
`inclusive`	Logical value indicating whether endpoints should be included.

Value

Logical value or vector.

Examples

# Check whether 2 is inside [0, 2.5]
inside(1, c(0, 2.5))

# Check whether 2 and 3 are inside (0, 3)
inside(c(2, 3), c(0, 3), inclusive = FALSE)

# Check whether 1 is inside [1, 2] and [3, 4]
inside(1, rbind(c(1, 2), c(3, 4)))

# Check whether 2 is inside [0, 2.5]
inside(1, c(0, 2.5))

# Check whether 2 and 3 are inside (0, 3)
inside(c(2, 3), c(0, 3), inclusive = FALSE)

# Check whether 1 is inside [1, 2] and [3, 4]
inside(1, rbind(c(1, 2), c(3, 4)))

Split Continuous Variable into Equal-Width Groups

Description

Splits a continuous variable into equal-width groups. Useful for assessing linearity in regression models.

Usage

interval_groups(x, groups = 5, ...)
interval_groups(x, groups = 5, ...)

Arguments

`x`	Numeric vector.
`groups`	Numeric value specifying number of groups to create.
`...`	Arguments to pass to `cut`.

Value

Factor variable.

Examples

# Convert values from N(0, 1) into 6 equal-width groups
x <- rnorm(1000)
groups <- interval_groups(x, 6)
table(groups)

# Use interval_groups to detect non-linearity
set.seed(123)
x <- rnorm(1000)
y <- 1.5 + 1.25 * x + 0.25 * x^2 + rnorm(1000)
plot(tapply(y, interval_groups(x), mean))

# Convert values from N(0, 1) into 6 equal-width groups
x <- rnorm(1000)
groups <- interval_groups(x, 6)
table(groups)

# Use interval_groups to detect non-linearity
set.seed(123)
x <- rnorm(1000)
y <- 1.5 + 1.25 * x + 0.25 * x^2 + rnorm(1000)
plot(tapply(y, interval_groups(x), mean))

Iterate Function Over All Combinations of User-Specified Inputs, Potentially Multiple Times

Description

Same idea as purrr::pmap, but with some different functionality. It can runs all combinations of vector-valued arguments in ... or the 1st set, 2nd set, and so forth, and multiple trials can be run for each scenario, which can be useful for simulations.

Usage

iterate(
  f,
  ...,
  all_combinations = TRUE,
  fix = NULL,
  trials = 1,
  varnames = NULL
)
iterate(
  f,
  ...,
  all_combinations = TRUE,
  fix = NULL,
  trials = 1,
  varnames = NULL
)

Arguments

`f`	A function.
`...`	Arguments to `f`, any of which can be vector-valued.
`all_combinations`	Logical value for whether to iterate over all combinations of arguments in `...`, or just use the first set of elements, then the second, and so on.
`fix`	List of arguments to `f` to hold fixed rather than loop over.
`trials`	Numeric value.
`varnames`	Character vector of names for values that `f` returns, to avoid generic labels (V1, V2, ...).

Value

Data frame.

Examples

# Define function to generate data from N(mu, sigsq) and perform t-test.
f <- function(n = 100, mu = 0, sigsq = 1, alpha = 0.05) {
  x <- rnorm(n = n, mean = mu, sd = sqrt(sigsq))
  fit <- t.test(x = x, alpha = alpha)
  return(list(t = fit$statistic, p = fit$p.value))
}

# Call f once for various sample sizes and means
f %>% iterate(n = c(100, 500), mu = c(0.1, 0.25))

# Run 100 trials for each scenario and calculate empirical power
f %>% iterate(n = c(100, 500), mu = c(0.1, 0.25), trials = 100) %>%
  group_by(n, mu) %>%
  summarise(mean(p < 0.05))

# Define function to generate data from N(mu, sigsq) and perform t-test.
f <- function(n = 100, mu = 0, sigsq = 1, alpha = 0.05) {
  x <- rnorm(n = n, mean = mu, sd = sqrt(sigsq))
  fit <- t.test(x = x, alpha = alpha)
  return(list(t = fit$statistic, p = fit$p.value))
}

# Call f once for various sample sizes and means
f %>% iterate(n = c(100, 500), mu = c(0.1, 0.25))

# Run 100 trials for each scenario and calculate empirical power
f %>% iterate(n = c(100, 500), mu = c(0.1, 0.25), trials = 100) %>%
  group_by(n, mu) %>%
  summarise(mean(p < 0.05))

Add Elements of Second List to First List, Replacing Elements with Same Name

Description

Adds each element of list2 to list1, overriding any elements of the same name. Similar to modifyList function in utils package, but either list can be NULL. Useful for do.call statements, when you want to combine a list of default inputs with a list of user-specified inputs.

Usage

list_override(list1, list2)
list_override(list1, list2)

Arguments

`list1`	Initial list that has some number of named elements. Can be `NULL` or an empty list.
`list2`	List with named elements that will be added to `list1`, replacing any elements with the same name. Can be `NULL` or an empty list.

Value

List containing the named elements initially in list1 and not in list2, any additional named elements in list2, and any named elements in list1 that were replaced by elements of the same name in list2.

Examples

# Create list that has default inputs to the plot function
list.defaults <- list(x = 1: 5, y = 1: 5, type = "l", lty = 1)

# Create list of user-specified inputs to the plot function
list.user <- list(main = "A Straight Line", lty = 2, lwd = 1.25)

# Combine the two lists into one, giving priority to list.user
list.combined <- list_override(list.defaults, list.user)

# Plot data using do.call
do.call(plot, list.combined)

# Create list that has default inputs to the plot function
list.defaults <- list(x = 1: 5, y = 1: 5, type = "l", lty = 1)

# Create list of user-specified inputs to the plot function
list.user <- list(main = "A Straight Line", lty = 2, lwd = 1.25)

# Combine the two lists into one, giving priority to list.user
list.combined <- list_override(list.defaults, list.user)

# Plot data using do.call
do.call(plot, list.combined)

Convert Logit to Probability

Description

Defined as: exp_x <- exp(x); out <- exp_x / (1 + exp_x). This 2-step approach is faster than exp(x) / (1 + exp(x)) because the exponentials only have to be calculated once.

Usage

logit_prob(x)
logit_prob(x)

Arguments

`x`	Numeric vector.

Value

Numeric vector.

Linear Regression of log(Y) vs. Covariates with Y Potentially Subject to Multiplicative Lognormal Errors

Description

Uses maximum likelihood to fit Y|X ~ Lognormal(beta_0 + beta_x^T X, sigsq). Y can be precisely measured or subject to multiplicative mean-1 lognormal errors, in which case replicates can be incorporated by specifying y as a list).

Usage

lognormalreg(
  y,
  x = NULL,
  merror = FALSE,
  estimate_var = TRUE,
  fix_posdef = FALSE,
  ...
)
lognormalreg(
  y,
  x = NULL,
  merror = FALSE,
  estimate_var = TRUE,
  fix_posdef = FALSE,
  ...
)

Arguments

`y`	Numeric vector or list.
`x`	Numeric vector or matrix. If `NULL`, model reduces to marginal lognormal model Y ~ Lognormal(beta_0, sigsq).
`merror`	Logical value for whether to model multiplicative lognormal measurement errors in Y.
`estimate_var`	Logical value for whether to return Hessian-based variance-covariance matrix.
`fix_posdef`	Logical value for whether to repeatedly reduce `integrate_tol_hessian` by factor of 5 and re-estimate Hessian to try to avoid non-positive definite variance-covariance matrix.
`...`	Additional arguments to pass to `nlminb`.

Value

List containing:

Numeric vector of parameter estimates.
Variance-covariance matrix (if estimate_var = TRUE).
Returned nlminb object from maximizing the log-likelihood function.
Akaike information criterion (AIC).

Graph Log-Odds of Binary Variable Across A Grouping Variable

Description

Creates plot showing sample log-odds of binary Y variable across levels of a grouping variable, with customizable error bars. Observations with missing values for y and/or group are dropped.

Usage

logodds_graph(
  y,
  group,
  error.bars = "none",
  alpha = 0.05,
  p.legend = "chi",
  plot.list = NULL,
  lines.list = NULL,
  axis.list = NULL,
  legend.list = NULL,
  ...
)
logodds_graph(
  y,
  group,
  error.bars = "none",
  alpha = 0.05,
  p.legend = "chi",
  plot.list = NULL,
  lines.list = NULL,
  axis.list = NULL,
  legend.list = NULL,
  ...
)

Arguments

`y`	Vector of values for binary response variable. Must take on 2 values, but can be any type (e.g. numeric, character, factor, logical). Function plots log-odds of second value returned by `table(y)`.
`group`	Vector of values indicating what group each `y` observation belongs to. Function plots group levels across x-axis in same order as `table(group)`.
`error.bars`	Character string indicating what the error bars should represent. Possible values are `"exact.ci"` for exact 95% confidence interval based on binomial distribution, `"z.ci"` for approximate 95% confidence interval based on Z distribution, and `"none"` for no error bars.
`alpha`	Numeric value indicating what alpha should be set to for confidence intervals. Only used if `error.bars` is `"exact.ci"` or `"z.ci"`.
`p.legend`	Character string controlling what p-value is printed in a legend. Possible values are `"chi"` for Chi-square test of association, `"fisher"` for Fisher's exact test, and `"none"` for no legend at all.
`plot.list`	Optional list of inputs to pass to `plot` function.
`lines.list`	Optional list of inputs to pass to `lines` function.
`axis.list`	Optional list of inputs to pass to `axis` function.
`legend.list`	Optional list of inputs to pass to `legend` function.
`...`	Additional arguments to pass to `chisq.test` or `fisher.test` functions.

Value

Plot showing log-odds of y across levels of group.

Maximum of Numeric Values

Description

Written in C++, this function tends to run faster than max for large numeric vectors/matrices.

Usage

max_n(x)
max_n(x)

Arguments

`x`	Numeric vector.

Value

Numeric value.

Examples

# For large objects, max_n is faster than max
x <- rnorm(100000)
max(x) == max_n(x)
benchmark(max(x), max_n(x), replications = 1000)

# For smaller objects, max_n is slower than max
x <- rnorm(100)
max(x) == max_n(x)
benchmark(max(x), max_n(x), replications = 1000)

# For large objects, max_n is faster than max
x <- rnorm(100000)
max(x) == max_n(x)
benchmark(max(x), max_n(x), replications = 1000)

# For smaller objects, max_n is slower than max
x <- rnorm(100)
max(x) == max_n(x)
benchmark(max(x), max_n(x), replications = 1000)

Mean of Integer Values

Description

Written in C++, this function runs faster than mean for large integer vectors/matrices.

Usage

mean_i(x)
mean_i(x)

Arguments

`x`	Integer vector or matrix.

Value

Numeric value.

Examples

# For very large integer objects, sum_i is faster than sum
x <- rpois(100000, lambda = 5)
mean(x) == mean_i(x)
benchmark(mean(x), mean_i(x), replications = 1000)

# For smaller integer objects, sum_i is slower than sum 
x <- rpois(1000, lambda = 5)
mean(x) == mean_i(x)
benchmark(mean(x), mean_i(x), replications = 1000)

# For very large integer objects, sum_i is faster than sum
x <- rpois(100000, lambda = 5)
mean(x) == mean_i(x)
benchmark(mean(x), mean_i(x), replications = 1000)

# For smaller integer objects, sum_i is slower than sum 
x <- rpois(1000, lambda = 5)
mean(x) == mean_i(x)
benchmark(mean(x), mean_i(x), replications = 1000)

Graph Means Across a Grouping Variable

Description

Creates plot showing mean of Y variable across levels of a grouping variable, with customizable error bars. Observations with missing values for y and/or group are dropped.

Usage

means_graph(
  y,
  group,
  error.bars = "t.ci",
  alpha = 0.05,
  p.legend = TRUE,
  plot.list = NULL,
  lines.list = NULL,
  axis.list = NULL,
  legend.list = NULL,
  ...
)
means_graph(
  y,
  group,
  error.bars = "t.ci",
  alpha = 0.05,
  p.legend = TRUE,
  plot.list = NULL,
  lines.list = NULL,
  axis.list = NULL,
  legend.list = NULL,
  ...
)

Arguments

`y`	Numeric vector of values for the continuous variable.
`group`	Vector of values indicating what group each `y` observation belongs to. Function plots group levels across x-axis in same order as `table(group)`.
`error.bars`	Character string indicating what the error bars should represent. Possible values are `"sd"` for +/- one standard deviation, `"se"` for +/- one standard error, `"t.ci"` for 95% confidence interval based on t distribution, `"z.ci"` for 95% confidence interval based on Z distribution, and `"none"` for no error bars.
`alpha`	Numeric value indicating what alpha should be set to for confidence intervals. Only used if `error.bars` is `"t.ci"` or `"z.ci"`.
`p.legend`	If `TRUE`, p-value (from `t.test` function if group has 2 levels, otherwise `aov` function) is printed in a legend.
`plot.list`	Optional list of inputs to pass to `plot` function.
`lines.list`	Optional list of inputs to pass to `lines` function.
`axis.list`	Optional list of inputs to pass to `axis` function.
`legend.list`	Optional list of inputs to pass to `legend` function.
`...`	Additional arguments to pass to `t.test` or `aov`.

Value

Plot showing mean of y across levels of group.

Minimum of Numeric Values

Description

Written in C++, this function tends to run faster than min for large numeric vectors/matrices.

Usage

min_n(x)
min_n(x)

Arguments

`x`	Numeric vector.

Value

Numeric value.

Examples

# For large objects, min_n is faster than min
x <- rnorm(100000)
min(x) == min_n(x)
benchmark(min(x), min_n(x), replications = 1000)

# For smaller objects, min_n is slower than min
x <- rnorm(100)
min(x) == min_n(x)
benchmark(min(x), min_n(x), replications = 20000)

# For large objects, min_n is faster than min
x <- rnorm(100000)
min(x) == min_n(x)
benchmark(min(x), min_n(x), replications = 1000)

# For smaller objects, min_n is slower than min
x <- rnorm(100)
min(x) == min_n(x)
benchmark(min(x), min_n(x), replications = 20000)

Maximum Likelihood Estimation for X[1], ..., X[n] ~ Gamma(alpha, beta)

Description

Performs maximization via nlminb. alpha and beta correspond to the shape and scale (not shape and rate) parameters described in GammaDist.

Usage

mle_gamma(x, alpha = NULL, beta = NULL, estimate_var = FALSE, ...)
mle_gamma(x, alpha = NULL, beta = NULL, estimate_var = FALSE, ...)

Arguments

`x`	Numeric vector.
`alpha`	Numeric value specifying known alpha.
`beta`	Numeric value specifying known beta.
`estimate_var`	Logical value for whether to return Hessian-based variance-covariance matrix.
`...`	Additional arguments to pass to `nlminb`.

Value

List containing:

Numeric vector of parameter estimates.
Variance-covariance matrix (if estimate_var = TRUE).
Returned nlminb object from maximizing the log-likelihood function.
Akaike information criterion (AIC).

Examples

# Generate 1,000 values from Gamma(0.5, 1) and estimate alpha and beta
set.seed(123)
x <- rgamma(1000, shape = 0.5, scale = 1)
mle_gamma(x)


# Generate 1,000 values from Gamma(0.5, 1) and estimate alpha and beta
set.seed(123)
x <- rgamma(1000, shape = 0.5, scale = 1)
mle_gamma(x)

Maximum Likelihood Estimation for X[1], ..., X[n] ~ Gamma(alpha, beta) Lognormal(mu, sigsq)

Description

Each observation is assumed to be the product of a Gamma(alpha, beta) and Lognormal(mu, sigsq) random variable. Performs maximization via nlminb. alpha and beta correspond to the shape and scale (not shape and rate) parameters described in GammaDist, and mu and sigsq correspond to meanlog and sdlog^2 in Lognormal.

Usage

mle_gamma_lnorm(
  x,
  gamma_mean1 = FALSE,
  lnorm_mean1 = TRUE,
  integrate_tol = 1e-08,
  estimate_var = FALSE,
  ...
)
mle_gamma_lnorm(
  x,
  gamma_mean1 = FALSE,
  lnorm_mean1 = TRUE,
  integrate_tol = 1e-08,
  estimate_var = FALSE,
  ...
)

Arguments

`x`	Numeric vector.
`gamma_mean1`	Whether to use restriction that the Gamma variable is mean-1.
`lnorm_mean1`	Whether to use restriction that the lognormal variable is mean-1.
`integrate_tol`	Numeric value specifying the `tol` input to `hcubature`.
`estimate_var`	Logical value for whether to return Hessian-based variance-covariance matrix.
`...`	Additional arguments to pass to `nlminb`.

Value

List containing:

Numeric vector of parameter estimates.
Variance-covariance matrix (if estimate_var = TRUE).
Returned nlminb object from maximizing the log-likelihood function.
Akaike information criterion (AIC).

Examples

# Generate 1,000 values from Gamma(0.5, 1) x Lognormal(-1.5/2, 1.5) and 
# estimate parameters
## Not run: 
set.seed(123)
x <- rgamma(1000, 0.5, 1) * rlnorm(1000, -1.5/2, sqrt(1.5))
mle_gamma_lnorm(x, control = list(trace = 1))

## End(Not run)


# Generate 1,000 values from Gamma(0.5, 1) x Lognormal(-1.5/2, 1.5) and 
# estimate parameters
## Not run: 
set.seed(123)
x <- rgamma(1000, 0.5, 1) * rlnorm(1000, -1.5/2, sqrt(1.5))
mle_gamma_lnorm(x, control = list(trace = 1))

## End(Not run)

Maximum Likelihood Estimation for X[1], ..., X[n] ~ Lognormal(mu, sigsq)

Description

Performs maximization via nlminb. mu and sigsq correspond to meanlog and sdlog^2 in Lognormal.

Usage

mle_lnorm(x, mu = NULL, sigsq = NULL, estimate_var = FALSE, ...)
mle_lnorm(x, mu = NULL, sigsq = NULL, estimate_var = FALSE, ...)

Arguments

`x`	Numeric vector.
`mu`	Numeric value specifying known mu.
`sigsq`	Numeric value specifying known sigsq.
`estimate_var`	Logical value for whether to return Hessian-based variance-covariance matrix.
`...`	Additional arguments to pass to `nlminb`.

Value

List containing:

Numeric vector of parameter estimates.
Variance-covariance matrix (if estimate_var = TRUE).
Returned nlminb object from maximizing the log-likelihood function.
Akaike information criterion (AIC).

Examples

# Generate 1,000 values from Lognormal(0.5, 1) and estimate mu and sigsq
set.seed(123)
x <- rlnorm(1000, meanlog = 0.5, sdlog = sqrt(1))
mle_lnorm(x)


# Generate 1,000 values from Lognormal(0.5, 1) and estimate mu and sigsq
set.seed(123)
x <- rlnorm(1000, meanlog = 0.5, sdlog = sqrt(1))
mle_lnorm(x)

Maximum Likelihood Estimation for X[1], ..., X[n] ~ Lognormal(mu1, sigsq1) Lognormal(mu2, sigsq2)

Description

Each observation is assumed to be the product of a Lognormal(mu1, sigsq1) and Lognormal(mu2, sigsq2) random variable, with mu2 and sigsq2 known. Performs maximization via nlminb. mu and sigsq correspond to meanlog and sdlog^2 in Lognormal.

Usage

mle_lnorm_lnorm(x, mu2 = NULL, sigsq2 = NULL, estimate_var = FALSE, ...)
mle_lnorm_lnorm(x, mu2 = NULL, sigsq2 = NULL, estimate_var = FALSE, ...)

Arguments

`x`	Numeric vector.
`mu2`	Numeric value specifying known mu2.
`sigsq2`	Numeric value specifying known sigsq2.
`estimate_var`	Logical value for whether to return Hessian-based variance-covariance matrix.
`...`	Additional arguments to pass to `nlminb`.

Value

List containing:

Numeric vector of parameter estimates.
Variance-covariance matrix (if estimate_var = TRUE).
Returned nlminb object from maximizing the log-likelihood function.
Akaike information criterion (AIC).

Examples

# Generate 1,000 values from Lognormal(0.5, 1) x Lognormal(0.75, 1.5) and 
# estimate parameters based on known mu and sigsq for one of them
set.seed(123)
x <- rlnorm(1000, 0.5, sqrt(1)) * rlnorm(1000, 0.75, sqrt(1.5))
mle_lnorm_lnorm(x, mu2 = 0.75, sigsq2 = 1.5)


# Generate 1,000 values from Lognormal(0.5, 1) x Lognormal(0.75, 1.5) and 
# estimate parameters based on known mu and sigsq for one of them
set.seed(123)
x <- rlnorm(1000, 0.5, sqrt(1)) * rlnorm(1000, 0.75, sqrt(1.5))
mle_lnorm_lnorm(x, mu2 = 0.75, sigsq2 = 1.5)

Moving Averages

Description

Calculates moving averages or maximum moving average. For optimal speed, use integer = TRUE if x is an integer vector and integer = FALSE otherwise.

Usage

moving_mean(x, window, integer = FALSE, max = FALSE)
moving_mean(x, window, integer = FALSE, max = FALSE)

Arguments

`x`	Integer or numeric vector.
`window`	Integer value specifying window length.
`integer`	Logical value for whether `x` is an integer vector.
`max`	Logical value for whether to return maximum moving average (as opposed to vector of moving averages).

Value

Numeric value or vector depending on max.

Examples

# 5-unit moving average for integer vector of length 10
x <- rpois(10, lambda = 3)
moving_mean(x, 5)


# 5-unit moving average for integer vector of length 10
x <- rpois(10, lambda = 3)
moving_mean(x, 5)

Calculate Per-Group Sample Size for Two-Sample Equal Variance T-Test

Description

Same idea as power.t.test. Less flexible, but faster.

Usage

n_2t_equal(d, sigsq, alpha = 0.05, beta = 0.2)
n_2t_equal(d, sigsq, alpha = 0.05, beta = 0.2)

Arguments

`d`	Numeric value specifying true difference in group means.
`sigsq`	Numeric value specifying the variance of observations.
`alpha`	Numeric value specifying type-1 error rate.
`beta`	Numeric value specifying type-2 error rate.

Value

Numeric value indicating per-group sample size, rounded up to the nearest whole number.

Examples

# Per-group sample size for 90% power to detect difference of 0.2 with
# sigsq = 1
n_2t_equal(d = 0.2, sigsq = 1, beta = 0.1)


# Per-group sample size for 90% power to detect difference of 0.2 with
# sigsq = 1
n_2t_equal(d = 0.2, sigsq = 1, beta = 0.1)

Calculate Per-Group Sample Size for Two-Sample Unequal Variance T-Test

Description

Unequal variance version of n_2t_equal. Assumes an equal sample size for both groups, which is actually not optimal.

Usage

n_2t_unequal(d, sigsq1, sigsq2, alpha = 0.05, beta = 0.2)
n_2t_unequal(d, sigsq1, sigsq2, alpha = 0.05, beta = 0.2)

Arguments

`d`	Numeric value specifying true difference in group means.
`sigsq1`, `sigsq2`	Numeric value specifying the variance of observations in each group.
`alpha`	Numeric value specifying type-1 error rate.
`beta`	Numeric value specifying type-2 error rate.

Value

Numeric value indicating per-group sample size, rounded up to the nearest whole number.

Examples

# Per-group sample size for 90% power to detect difference of 0.2 with
# sigsq's of 1 and 1.25
n_2t_unequal(d = 0.2, sigsq1 = 1, sigsq2 = 1.25, beta = 0.1)


# Per-group sample size for 90% power to detect difference of 0.2 with
# sigsq's of 1 and 1.25
n_2t_unequal(d = 0.2, sigsq1 = 1, sigsq2 = 1.25, beta = 0.1)

Convert Odds to Probability

Description

Defined simply as log(x / (x + 1)).

Usage

odds_prob(x)
odds_prob(x)

Arguments

`x`	Numeric vector.

Value

Numeric vector.

Plot Log-Likelihood vs. Values of One Parameter

Description

Generates plot of log-likelihood vs. one parameter of interest while other parameters are held fixed at certain values (e.g. MLEs). This is not a profile likelihood, and is mainly intended for use with a Shiny app.

Usage

plot_ll(
  start,
  objective,
  lower = -Inf,
  upper = Inf,
  xaxis_param = 1,
  xaxis_range = NULL,
  param_values = NULL,
  mles = NULL,
  return_info = FALSE
)
plot_ll(
  start,
  objective,
  lower = -Inf,
  upper = Inf,
  xaxis_param = 1,
  xaxis_range = NULL,
  param_values = NULL,
  mles = NULL,
  return_info = FALSE
)

Arguments

`start`	See `nlminb`.
`objective`	See `nlminb`.
`lower`	See `nlminb`.
`upper`	See `nlminb`.
`xaxis_param`	Integer value specifying which parameter should be plotted on the x-axis.
`xaxis_range`	Numeric vector specifying x-axis range over which to vary the parameter of interest. Only values with likelihood ratio > 0.01 are ultimately plotted.
`param_values`	Numeric vector of values to use for other parameters in model, in case you want an additional curve for log-likelihood function vs. parameter of interest at certain non-MLE values for other parameters. For example, if there are 3 parameters and `xaxis_param = 2`, you could set `param_values = c(0, NA, 0)`.
`mles`	Numeric vector of previously obtained maximum likelihood estimates.
`return_info`	Logical value for whether to return the estimated MLEs and 99.99% confidence intervals for parameters rather than create the plot.

Details

Note that objective should be the negative log-likelihood function, since internal optimization uses (nlminb), which does minimization.

Value

Plot of log-likelihood vs. value of parameter of interest, generated by ggplot.

Examples

# Generate normal data, define log-likelihood function, and plot likelihood
set.seed(123)
x <- rnorm(100, mean = 0.5, sd = sqrt(0.25))
ll.f <- function(theta) {
  return(-sum(dnorm(x, log = TRUE, mean = theta[1], sd = sqrt(theta[2]))))
}
plot_ll(start = c(0, 1), objective = ll.f, lower = c(-Inf, 1e-6))


# Generate normal data, define log-likelihood function, and plot likelihood
set.seed(123)
x <- rnorm(100, mean = 0.5, sd = sqrt(0.25))
ll.f <- function(theta) {
  return(-sum(dnorm(x, log = TRUE, mean = theta[1], sd = sqrt(theta[2]))))
}
plot_ll(start = c(0, 1), objective = ll.f, lower = c(-Inf, 1e-6))

Pooled Sample Variance

Description

Calculates pooled sample variance used in equal variance two-sample t-test.

Usage

pooled_var(x, y, integer = FALSE)
pooled_var(x, y, integer = FALSE)

Arguments

`x`, `y`	Integer or numeric vectors.
`integer`	Logical value for whether `x` and `y` are integer vectors.

Value

Numeric value.

Calculate Power for Two-Sample Equal Variance T-Test

Description

Same idea as power.t.test. Less flexible, but faster.

Usage

power_2t_equal(n = 100, d, sigsq, alpha = 0.05)
power_2t_equal(n = 100, d, sigsq, alpha = 0.05)

Arguments

`n`	Numeric value specifying per-group sample size.
`d`	Numeric value specifying true difference in group means. Should be positive.
`sigsq`	Numeric value specifying the variance of observations.
`alpha`	Numeric value specifying type-1 error rate.

Value

Numeric value.

Examples

# Power to detect difference of 0.2 with 100 subjects per group and sigsq = 1
power_2t_equal(n = 100, d = 0.2, sigsq = 1)

# Power to detect difference of 0.2 with 100 subjects per group and sigsq = 1
power_2t_equal(n = 100, d = 0.2, sigsq = 1)

Calculate Power for Two-Sample Unequal Variance T-Test

Description

Unequal variance version of power_2t_equal. Assumes an equal sample size for both groups, which is actually not optimal.

Usage

power_2t_unequal(n = 100, d, sigsq1, sigsq2, alpha = 0.05)
power_2t_unequal(n = 100, d, sigsq1, sigsq2, alpha = 0.05)

Arguments

`n`	Numeric value specifying per-group sample size.
`d`	Numeric value specifying true difference in group means. Should be positive.
`sigsq1`, `sigsq2`	Numeric value specifying the variance of observations in each group.
`alpha`	Numeric value specifying type-1 error rate.

Value

Numeric value.

Examples

# Power to detect difference of 0.2 with 100 subjects per group and sigsq's
# of 1 and 1.25
power_2t_unequal(n = 100, d = 0.2, sigsq1 = 1, sigsq2 = 1.25)

# Power to detect difference of 0.2 with 100 subjects per group and sigsq's
# of 1 and 1.25
power_2t_unequal(n = 100, d = 0.2, sigsq1 = 1, sigsq2 = 1.25)

Convert Probability to Logit

Description

Defined simply as log(x / (1 - x)).

Usage

prob_logit(x)
prob_logit(x)

Arguments

`x`	Numeric vector.

Value

Numeric vector.

Convert Probability to Odds

Description

Defined simply as x / (1 - x).

Usage

prob_odds(x)
prob_odds(x)

Arguments

`x`	Numeric vector.

Value

Numeric vector.

Split Continuous Variable into Quantile Groups

Description

Splits a continuous variable into quantiles groups. Basically combines quantile and cut into a single function. Note that create_qgroups will likely supersede this function in future versions of dvmisc.

Usage

quant_groups(
  x,
  groups = 4,
  probs = NULL,
  quantile.list = NULL,
  cut.list = NULL
)
quant_groups(
  x,
  groups = 4,
  probs = NULL,
  quantile.list = NULL,
  cut.list = NULL
)

Arguments

`x`	Numeric vector.
`groups`	Numeric value specifying number of quantile groups.
`probs`	Numeric vector specifying probabilities.
`quantile.list`	Arguments to pass to `quantile`.
`cut.list`	Arguments to pass to `cut`.

Value

Factor variable.

Examples

# Convert values from N(0, 1) into quintiles (i.e. 5 groups)
x <- rnorm(1000)
groups <- quant_groups(x, 5)
table(groups)

# Convert values from N(0, 1) into quintiles (i.e. 5 groups)
x <- rnorm(1000)
groups <- quant_groups(x, 5)
table(groups)

Split Continuous Variable into Quantile Groups (Survey Version)

Description

Complex survey version of quant_groups. Speeds up process of creating quantile groups based on survey weighted percentiles.

Usage

quant_groups_svy(x, by = NULL, groups = 4, probs = NULL, design)
quant_groups_svy(x, by = NULL, groups = 4, probs = NULL, design)

Arguments

`x`	Formula, e.g. `~varname`.
`by`	Formula, e.g. `~varname`.
`groups`	Numeric value specifying number of quantile groups.
`probs`	Numeric vector.
`design`	A `svydesign` or `svrepdesign` object.

Value

Factor variable.

Remove All Labels

Description

Calls remove_all_labels and then restores special variable names.

Usage

ral(x)
ral(x)

Arguments

`x`	Vector or data frame.

Value

Vector or data frame.

Reverse Cut

Description

Convenience function to get reversed factor levels from cut. Currently requires specifying breaks as vector of cutpoints rather than number of desired intervals.

Usage

reverse_cut(x, breaks, include.lowest = FALSE, right = TRUE, ...)
reverse_cut(x, breaks, include.lowest = FALSE, right = TRUE, ...)

Arguments

`x`, `breaks`, `include.lowest`, `right`	See `cut`. specifying number of intervals is not currently supported).
`...`	Arguments to pass to `cut`.

Value

Factor variable.

Examples

# In mtcars dataset, create 3 mpg groups
table(cut(mtcars$mpg, breaks = c(-Inf, 15, 20, Inf)))

# Repeat with reverse_cut to get factor levels ordered from high to low
table(reverse_cut(mtcars$mpg, breaks = c(Inf, 20, 15, -Inf)))

# You can specify breaks from low to high, but then include.lowest and right 
# arguments get confusing
table(reverse_cut(mtcars$mpg, breaks = c(-Inf, 15, 20, Inf), right = TRUE))

# In mtcars dataset, create 3 mpg groups
table(cut(mtcars$mpg, breaks = c(-Inf, 15, 20, Inf)))

# Repeat with reverse_cut to get factor levels ordered from high to low
table(reverse_cut(mtcars$mpg, breaks = c(Inf, 20, 15, -Inf)))

# You can specify breaks from low to high, but then include.lowest and right 
# arguments get confusing
table(reverse_cut(mtcars$mpg, breaks = c(-Inf, 15, 20, Inf), right = TRUE))

Moving Correlations as Short Vector Slides Across Long Vector

Description

Uses C++ code for efficiency.

Usage

sliding_cor(short, long)
sliding_cor(short, long)

Arguments

`short`	Numeric vector.
`long`	Numeric vector.

Value

Numeric vector.

Examples

short <- rnorm(4)
long <- rnorm(10)
sliding_cor(short, long)


short <- rnorm(4)
long <- rnorm(10)
sliding_cor(short, long)

Moving Covariance as Short Vector Slides Across Long Vector

Description

Uses C++ code for efficiency.

Usage

sliding_cov(short, long)
sliding_cov(short, long)

Arguments

`short`	Numeric vector.
`long`	Numeric vector.

Value

Numeric vector.

Examples

short <- rnorm(4)
long <- rnorm(10)
sliding_cov(short, long)


short <- rnorm(4)
long <- rnorm(10)
sliding_cov(short, long)

Prep for Snipping or Copy/Pasting out of R

Description

Calls kable and kable_styling as appropriate based on class of input object and creates object ready to snip or copy/paste out of R. Currently works for data frames, matrices, and tables.

Usage

snip(x)
snip(x)

Arguments

`x`	Object to snip

Examples

# Print 4-cylinder cars
data(mtcars)
mtcars %>%
  dplyr::filter(cyl == 4) %>%
  snip()

# Crosstab of gears and cylinder using mtcars dataset
snip(table(mtcars$cyl, mtcars$gear))


# Print 4-cylinder cars
data(mtcars)
mtcars %>%
  dplyr::filter(cyl == 4) %>%
  snip()

# Crosstab of gears and cylinder using mtcars dataset
snip(table(mtcars$cyl, mtcars$gear))

Sum of Integer Values

Description

Written in C++, this function runs faster than sum for large integer vectors/matrices.

Usage

sum_i(x)
sum_i(x)

Arguments

`x`	Integer vector or matrix.

Value

Numeric value.

Examples

# For very large integer objects, sum_i is faster than sum
x <- rpois(100000, lambda = 5)
sum(x) == sum_i(x)
benchmark(sum(x), sum_i(x), replications = 1000)

# For smaller integer objects, sum_i is slower than sum 
x <- rpois(1000, lambda = 5)
sum(x) == sum_i(x)
benchmark(sum(x), sum_i(x), replications = 1000)

# For very large integer objects, sum_i is faster than sum
x <- rpois(100000, lambda = 5)
sum(x) == sum_i(x)
benchmark(sum(x), sum_i(x), replications = 1000)

# For smaller integer objects, sum_i is slower than sum 
x <- rpois(1000, lambda = 5)
sum(x) == sum_i(x)
benchmark(sum(x), sum_i(x), replications = 1000)

Summarize Simulation Results

Description

Creates table summarizing results of statistical simulations, providing common metrics of performance like mean bias, standard deviation, mean standard error, mean squared error, and confidence interval coverage.

Usage

sumsim(
  estimates,
  ses = NULL,
  truth = NULL,
  theta_0 = 0,
  statistics = c("mean_bias", "sd", "mean_se", "mse", "coverage"),
  alpha = 0.05,
  digits = 3,
  listwise_deletion = TRUE
)
sumsim(
  estimates,
  ses = NULL,
  truth = NULL,
  theta_0 = 0,
  statistics = c("mean_bias", "sd", "mean_se", "mse", "coverage"),
  alpha = 0.05,
  digits = 3,
  listwise_deletion = TRUE
)

Arguments

`estimates`	Numeric matrix where each column gives the point estimates for a particular method across multiple trials.
`ses`	Numeric matrix where each column gives the standard errors for a particular method across multiple trials.
`truth`	Numeric value specifying the true value of the parameter being estimated.
`theta_0`	Numeric value specifying null value for hypothesis test `H_0: theta = theta_0`. Only used for calculating empirical power.
`statistics`	Numeric vector specifying which performance metrics should be calculated. Possible values are `"n"` for number of trials, `"mean"`, `"median"`, `"mean_bias"`, `"median_bias"`, `"sd"`, `"iqr"`, `"mean_se"` (for mean standard error), `"mse"` (for mean squared error), `"coverage"` (for confidence interval coverage), `"ci_width"` for median confidence interval width, and `"power"` for empirical power.
`alpha`	Numeric value specifying alpha for confidence interval. Set to `0.05` for the usual 95% CI, `0.1` for a 90% CI, and so forth.
`digits`	Numeric value or vector specifying the number of decimal places to include.
`listwise_deletion`	Logical value for whether to remove trials in which any of the estimators have missing values.

Value

Numeric matrix.

Examples

# For X ~ N(mu, sigma^2), the MLE for sigma^2 is the sample variance with n 
# in the denominator, but the unbiased version with (n - 1) is typically used 
# for its unbiasedness. Compare these estimators in 1,000 trials with n = 25. 
MLE <- c()
Unbiased <- c()
for (ii in 1: 1000) {
   x <- rnorm(n = 25)
   MLE[ii] <- sum((x - mean(x))^2) / 25
   Unbiased[ii] <- sum((x - mean(x))^2) / 24
 }
sumsim(estimates = cbind(MLE, Unbiased), truth = 1)

# For X ~ N(mu, sigma^2), the MLE for sigma^2 is the sample variance with n 
# in the denominator, but the unbiased version with (n - 1) is typically used 
# for its unbiasedness. Compare these estimators in 1,000 trials with n = 25. 
MLE <- c()
Unbiased <- c()
for (ii in 1: 1000) {
   x <- rnorm(n = 25)
   MLE[ii] <- sum((x - mean(x))^2) / 25
   Unbiased[ii] <- sum((x - mean(x))^2) / 24
 }
sumsim(estimates = cbind(MLE, Unbiased), truth = 1)

Trim Tail Values off of a Vector

Description

Returns input vector with tail values trimmed off of it. User can specify tail probability to trim or lower and upper cutpoints for values to retain.

Usage

trim(x, p = NULL, tails = "both", cutpoints = NULL, keep.edge = TRUE)
trim(x, p = NULL, tails = "both", cutpoints = NULL, keep.edge = TRUE)

Arguments

`x`	Numeric vector.
`p`	Numeric value giving tail probability to trim from `x`. Can leave as `NULL` if you specify `cutpoints`.
`tails`	Numeric value indicating which tail should be trimmed. Possible values are `"both"`, `"lower"`, and `"upper"`.
`cutpoints`	Numeric vector indicating what range of values should be retained. For example, set to `c(0, 1)` to trim all values below 0 or greater than 1. Can leave as `NULL` if you specify `p`.
`keep.edge`	Logical value indicating whether values in `x` that are on the edge of being trimmed (i.e. equal to one of the endpoints) should be retained.

Value

Numeric vector.

Examples

# Generate data from N(0, 1) and then trim the lower and upper 1\%
x <- rnorm(1000)
y <- trim(x, p = 0.01)

# Generate data from N(0, 1) and then trim values outside of (-1.5, 1.5)
x <- rnorm(100000)
y <- trim(x, cutpoints = c(-1.5, 1.5))

# Generate data from N(0, 1) and then trim the lower and upper 1\%
x <- rnorm(1000)
y <- trim(x, p = 0.01)

# Generate data from N(0, 1) and then trim values outside of (-1.5, 1.5)
x <- rnorm(100000)
y <- trim(x, cutpoints = c(-1.5, 1.5))

Range of a Vector (Not Min/Max!)

Description

The base R function range returns the minimum and maximum of a vector, but the "range" is actually defined as the difference between the minimum and maximum. This function calculates the actual range. It is equivalent to the base R code diff(range(x)), but a bit simpler and much faster.

Usage

truerange(x, integer = FALSE)
truerange(x, integer = FALSE)

Arguments

`x`	Integer or numeric vector.
`integer`	Logical value for whether `x` is an integer vector.

Value

Integer or numeric value.

Examples

# truerange vs. diff(range()) for integer vector
x <- rpois(1000, lambda = 5)
all.equal(diff(range(x)), truerange(x, TRUE))
benchmark(diff(range(x)), truerange(x, TRUE), replications = 2000)

# truerange vs. diff(range()) for numeric vector
x <- rnorm(1000)
all.equal(diff(range(x)), truerange(x))
benchmark(diff(range(x)), truerange(x), replications = 2000)


# truerange vs. diff(range()) for integer vector
x <- rpois(1000, lambda = 5)
all.equal(diff(range(x)), truerange(x, TRUE))
benchmark(diff(range(x)), truerange(x, TRUE), replications = 2000)

# truerange vs. diff(range()) for numeric vector
x <- rnorm(1000)
all.equal(diff(range(x)), truerange(x))
benchmark(diff(range(x)), truerange(x), replications = 2000)

Return (Row, Column) Index of (First) Maximum of an Integer Matrix

Description

Written in C++, this function tends to run much faster than the equivalent (if maximum is unique) base R solution which(x == max(x), arr.ind = TRUE).

Usage

which_max_im(x)
which_max_im(x)

Arguments

`x`	Integer matrix.

Details

For optimal speed, choose the version of this function that matches the class of your x:

which_max_nv for numeric vector.
which_max_iv for integer vector.
which_max_nm for numeric matrix.
which_max_im for integer matrix.

Value

Integer vector.

Examples

# which_max_im is typically much faster than 
# which(x == max(x), arr.ind = TRUE)
x <- matrix(rpois(100, lambda = 15), ncol = 10)
all(which(x == max(x), arr.ind = TRUE) == which_max_im(x))
benchmark(which(x == max(x), arr.ind = TRUE), which_max_im(x), 
          replications = 5000)

# which_max_im is typically much faster than 
# which(x == max(x), arr.ind = TRUE)
x <- matrix(rpois(100, lambda = 15), ncol = 10)
all(which(x == max(x), arr.ind = TRUE) == which_max_im(x))
benchmark(which(x == max(x), arr.ind = TRUE), which_max_im(x), 
          replications = 5000)

Return Index of (First) Maximum of an Integer Vector

Description

Written in C++, this function tends to run faster than which.max for large integer vectors.

Usage

which_max_iv(x)
which_max_iv(x)

Arguments

`x`	Integer vector.

Details

For optimal speed, choose the version of this function that matches the class of your x:

which_max_nv for numeric vector.
which_max_iv for integer vector.
which_max_nm for numeric matrix.
which_max_im for integer matrix.

Value

Integer value.

Examples

# For long vectors, which_max_iv is faster than which.max
x <- rpois(10000, lambda = 15)
which.max(x) == which_max_iv(x)
benchmark(which.max(x), which_max_iv(x), replications = 5000)

# For shorter vectors, which_max_iv is slower than which.max
x <- rpois(100, lambda = 15)
which.max(x) == which_max_iv(x)
benchmark(which.max(x), which_max_iv(x), replications = 20000)

# For long vectors, which_max_iv is faster than which.max
x <- rpois(10000, lambda = 15)
which.max(x) == which_max_iv(x)
benchmark(which.max(x), which_max_iv(x), replications = 5000)

# For shorter vectors, which_max_iv is slower than which.max
x <- rpois(100, lambda = 15)
which.max(x) == which_max_iv(x)
benchmark(which.max(x), which_max_iv(x), replications = 20000)

Return (Row, Column) Index of (First) Maximum of a Numeric Matrix

Description

Written in C++, this function tends to run much faster than the equivalent (if maximum is unique) base R solution which(x == max(x), arr.ind = TRUE).

Usage

which_max_nm(x)
which_max_nm(x)

Arguments

`x`	Numeric matrix.

Details

For optimal speed, choose the version of this function that matches the class of your x:

which_max_nv for numeric vector.
which_max_iv for integer vector.
which_max_nm for numeric matrix.
which_max_im for integer matrix.

Value

Integer vector.

Examples

# which_max_nm is typically much faster than 
# which(x == max(x), arr.ind = TRUE)
x <- matrix(rnorm(100), ncol = 10)
all(which(x == max(x), arr.ind = TRUE) == which_max_nm(x))
benchmark(which(x == max(x), arr.ind = TRUE), which_max_nm(x),
          replications = 5000)

# which_max_nm is typically much faster than 
# which(x == max(x), arr.ind = TRUE)
x <- matrix(rnorm(100), ncol = 10)
all(which(x == max(x), arr.ind = TRUE) == which_max_nm(x))
benchmark(which(x == max(x), arr.ind = TRUE), which_max_nm(x),
          replications = 5000)

Return Index of (First) Maximum of a Numeric Vector

Description

Written in C++, this function tends to run faster than which.max for large numeric vectors.

Usage

which_max_nv(x)
which_max_nv(x)

Arguments

`x`	Numeric vector.

Details

For optimal speed, choose the version of this function that matches the class of your x:

which_max_nv for numeric vector.
which_max_iv for integer vector.
which_max_nm for numeric matrix.
which_max_im for integer matrix.

Value

Integer value.

Examples

# For long vectors, which_max_nv is faster than which.max
x <- rnorm(100000)
which.max(x) == which_max_nv(x)
benchmark(which.max(x), which_max_nv(x), replications = 500)

# For shorter vectors, which_max_nv is slower than which.max
x <- rnorm(100)
which.max(x) == which_max_nv(x)
benchmark(which.max(x), which_max_nv(x), replications = 10000)

# For long vectors, which_max_nv is faster than which.max
x <- rnorm(100000)
which.max(x) == which_max_nv(x)
benchmark(which.max(x), which_max_nv(x), replications = 500)

# For shorter vectors, which_max_nv is slower than which.max
x <- rnorm(100)
which.max(x) == which_max_nv(x)
benchmark(which.max(x), which_max_nv(x), replications = 10000)

Return (Row, Column) Index of (First) Minimum of an Integer Matrix

Description

Written in C++, this function tends to run much faster than the equivalent (if minimum is unique) base R solution which(x == min(x), arr.ind = TRUE).

Usage

which_min_im(x)
which_min_im(x)

Arguments

`x`	Integer matrix.

Details

For optimal speed, choose the version of this function that matches the class of your x:

which_min_nv for numeric vector.
which_min_iv for integer vector.
which_min_nm for numeric matrix.
which_min_im for integer matrix.

Value

Integer vector.

Examples

# which_min_im is typically much faster than 
# which(x == min(x), arr.ind = TRUE)
x <- matrix(rpois(100, lambda = 10), ncol = 10)
all(which(x == min(x), arr.ind = TRUE) == which_min_im(x))
benchmark(which(x == min(x), arr.ind = TRUE), which_min_im(x),
          replications = 5000)

# which_min_im is typically much faster than 
# which(x == min(x), arr.ind = TRUE)
x <- matrix(rpois(100, lambda = 10), ncol = 10)
all(which(x == min(x), arr.ind = TRUE) == which_min_im(x))
benchmark(which(x == min(x), arr.ind = TRUE), which_min_im(x),
          replications = 5000)

Return Index of (First) Minimum of an Integer Vector

Description

Written in C++, this function tends to run faster than which.min for large integer vectors.

Usage

which_min_iv(x)
which_min_iv(x)

Arguments

`x`	Integer vector.

Details

For optimal speed, choose the version of this function that matches the class of your x:

which_min_nv for numeric vector.
which_min_iv for integer vector.
which_min_nm for numeric matrix.
which_min_im for integer matrix.

Value

Integer value.

Examples

# For long vectors, which_min_iv is faster than which.min 
x <- rpois(10000, lambda = 15)
which.min(x) == which_min_iv(x)
benchmark(which.min(x), which_min_iv(x), replications = 5000)

# For shorter vectors, which_min_iv is slower than which.min
x <- rpois(100, lambda = 15)
which.min(x) == which_min_iv(x)
benchmark(which.min(x), which_min_iv(x), replications = 20000)

# For long vectors, which_min_iv is faster than which.min 
x <- rpois(10000, lambda = 15)
which.min(x) == which_min_iv(x)
benchmark(which.min(x), which_min_iv(x), replications = 5000)

# For shorter vectors, which_min_iv is slower than which.min
x <- rpois(100, lambda = 15)
which.min(x) == which_min_iv(x)
benchmark(which.min(x), which_min_iv(x), replications = 20000)

Return (Row, Column) Index of (First) Minimum of a Numeric Matrix

Description

Written in C++, this function tends to run much faster than the equivalent (if minimum is unique) base R solution which(x == min(x), arr.ind = TRUE).

Usage

which_min_nm(x)
which_min_nm(x)

Arguments

`x`	Numeric matrix.

Details

For optimal speed, choose the version of this function that matches the class of your x:

which_min_nv for numeric vector.
which_min_iv for integer vector.
which_min_nm for numeric matrix.
which_min_im for integer matrix.

Value

Integer vector.

Examples

# which_min_nm is typically much faster than 
# which(x == min(x), arr.ind = TRUE)
x <- matrix(rnorm(100), ncol = 10)
all(which(x == min(x), arr.ind = TRUE) == which_min_nm(x))
benchmark(which(x == min(x), arr.ind = TRUE), which_min_nm(x), 
          replications = 5000)

# which_min_nm is typically much faster than 
# which(x == min(x), arr.ind = TRUE)
x <- matrix(rnorm(100), ncol = 10)
all(which(x == min(x), arr.ind = TRUE) == which_min_nm(x))
benchmark(which(x == min(x), arr.ind = TRUE), which_min_nm(x), 
          replications = 5000)

Return Index of (First) Minimum of a Numeric Vector

Description

Written in C++, this function tends to run faster than which.min for large numeric vectors.

Usage

which_min_nv(x)
which_min_nv(x)

Arguments

`x`	Numeric vector.

Details

For optimal speed, choose the version of this function that matches the class of your x:

which_min_nv for numeric vector.
which_min_iv for integer vector.
which_min_nm for numeric matrix.
which_min_im for integer matrix.

Value

Integer value.

Examples

# For long vectors, which_min_nv is faster than which.min
x <- rnorm(100000)
which.min(x) == which_min_nv(x)
benchmark(which.min(x), which_min_nv(x), replications = 1000)

# For shorter vectors, which_min_nv is slower than which.min
x <- rnorm(100)
which.min(x) == which_min_nv(x)
benchmark(which.min(x), which_min_nv(x), replications = 10000)

# For long vectors, which_min_nv is faster than which.min
x <- rnorm(100000)
which.min(x) == which_min_nv(x)
benchmark(which.min(x), which_min_nv(x), replications = 1000)

# For shorter vectors, which_min_nv is slower than which.min
x <- rnorm(100)
which.min(x) == which_min_nv(x)
benchmark(which.min(x), which_min_nv(x), replications = 10000)

Return Index of (First) Maximum of a Vector

Description

Returns index of maximum for vectors and index or (row, column) position for matrices. For optimal speed, use integer = TRUE if x is an integer vector/matrix and integer = FALSE otherwise. Typically faster than which.max for matrices and for large vectors.

Usage

which.max2(x, arr.ind = FALSE, integer = FALSE)
which.max2(x, arr.ind = FALSE, integer = FALSE)

Arguments

`x`	Integer or numeric vector/matrix.
`arr.ind`	Logical value for whether to return (row, col) position rather than vector position, if `x` is a matrix.
`integer`	Logical value for whether `x` is an integer vector/matrix.

Value

Numeric value.

Examples

# which.max2 vs. which.max for integer vector
x <- rpois(10000, lambda = 5)
all.equal(which.max(x), which.max2(x, integer = TRUE))
benchmark(which.max(x), which.max2(x, integer = TRUE), replications = 10000)

# which.max2 vs. which.max for numeric vector
x <- rnorm(10000)
all.equal(which.max(x), which.max2(x))
benchmark(which.max(x), which.max2(x), replications = 10000)


# which.max2 vs. which.max for integer vector
x <- rpois(10000, lambda = 5)
all.equal(which.max(x), which.max2(x, integer = TRUE))
benchmark(which.max(x), which.max2(x, integer = TRUE), replications = 10000)

# which.max2 vs. which.max for numeric vector
x <- rnorm(10000)
all.equal(which.max(x), which.max2(x))
benchmark(which.max(x), which.max2(x), replications = 10000)

Return Index of (First) Minimum of a Vector

Description

Returns index of minimum for vectors and index or (row, column) position for matrices. For optimal speed, use integer = TRUE if x is an integer vector/matrix and integer = FALSE otherwise. Typically faster than which.min for matrices and for large vectors.

Usage

which.min2(x, arr.ind = FALSE, integer = FALSE)
which.min2(x, arr.ind = FALSE, integer = FALSE)

Arguments

`x`	Integer or numeric vector/matrix.
`arr.ind`	Logical value for whether to return (row, col) position rather than vector position, if `x` is a matrix.
`integer`	Logical value for whether `x` is an integer vector/matrix.

Value

Numeric value.

Examples

# which.min2 vs. which.min for integer vector
x <- rpois(10000, lambda = 10)
all.equal(which.min(x), which.min2(x, integer = TRUE))
benchmark(which.min(x), which.min2(x, integer = TRUE), replications = 10000)

# which.min2 vs. which.min for numeric vector
x <- rnorm(10000)
all.equal(which.min(x), which.min2(x))
benchmark(which.min(x), which.min2(x), replications = 10000)


# which.min2 vs. which.min for integer vector
x <- rpois(10000, lambda = 10)
all.equal(which.min(x), which.min2(x, integer = TRUE))
benchmark(which.min(x), which.min2(x, integer = TRUE), replications = 10000)

# which.min2 vs. which.min for numeric vector
x <- rnorm(10000)
all.equal(which.min(x), which.min2(x))
benchmark(which.min(x), which.min2(x), replications = 10000)

Package 'dvmisc'

Help Index

Convert Continuous BMI Values into 3-Level Factor

Description

Usage

Arguments

Value

Convert Continuous BMI Values into 4-Level Factor

Description

Usage

Arguments

Value

Shiny App Interface with rpart::rpart

Description

Usage

Arguments

Value

Create a Clean Summary Table from a glm Object

Description

Usage

Arguments

Value

Examples

Convert Numeric to Factor with Convenient Interface

Description

Usage

Arguments

Value

Examples

Create Quantile Groups

Description

Usage

Arguments

Value

Examples

Create Quantile Groups (Complex Survey Data)

Description

Usage

Arguments

Value

References

Cut with Decreasing Factor Levels

Description

Usage

Arguments

Value

Examples

Plot Points +/- Error Bars

Description

Usage

Arguments

Value

Examples

Convenience Functions, Moving Window Statistics, and Graphics

Description

Details

Author(s)

References

Similar to expand.grid but with Sequences Reversed and Ability to Treat Variables as Sets

Description

Usage

Arguments

Value

Examples

Constant-Scale Gamma Model for Y vs. Covariates with Y Potentially Subject to Multiplicative Lognormal Errors

Description

Usage

Arguments

Value

Extract Mean Squared Error (MSE) from Fitted Regression Model

Description

Usage

Arguments

Value

Examples

Quickly Format and Save a ggplot Object for Posting to Facebook

Description

Usage

Arguments

Quickly Format and Save a ggplot Object with Reasonable Defaults