The function fits a POIS-glmnet model and applies cross-validation (CV) to choose the optimal regularization parameter.

fit_pois_glmnet_mmdcv(
  z,
  r = max(z),
  train_ratio = 0.7,
  nlam = 12,
  lambda = 10^seq(-4, -1.3, length.out = nlam),
  nreps = 2,
  agg_func = mean,
  alpha = 1,
  use_parallel = FALSE,
  ncores = 7,
  symmetrize = TRUE
)

Arguments

z

a data array, potentially sparse, of dimensions (sample size) x (data dimension).

r

maximum number of levels (K); defaults to max(z).

train_ratio

fraction of the sample used for training, so that the train/validation split ratio is train_ratio/(1 - train_ratio).

nlam

number of regularization parameters (lambda) to use; ignored if lambda is provided.

lambda

the vector of regularization parameters to use.

nreps

the number of CV splits to average over.

agg_func

the aggregation function for the MMDs.
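
As a rough illustration of the defaults described above, the following base R sketch builds the default lambda grid and a single train/validation split. The variable names (n, n_train, train_idx, valid_idx) and the rounding rule are assumptions made for illustration; they are not taken from the package's implementation.

```r
# Default grid from the usage section: nlam values, log-spaced
# between 10^-4 and 10^-1.3
nlam <- 12
lambda <- 10^seq(-4, -1.3, length.out = nlam)

# Hypothetical single split for a sample of size n (not the package's code)
set.seed(1)
n <- 100
train_ratio <- 0.7
n_train <- round(train_ratio * n)            # assumed rounding rule
train_idx <- sample.int(n, n_train)          # rows used for fitting
valid_idx <- setdiff(seq_len(n), train_idx)  # rows held out for validation

# Train/validation ratio implied by train_ratio
length(train_idx) / length(valid_idx)
```

With nreps splits, a procedure of this kind would repeat the sampling step nreps times and average the resulting validation scores.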

Value

The lambda vector, the regularization curve, a list of fitted POIS models, the index of the optimal model, and the optimal lambda.

Details

The maximum mean discrepancy (MMD) is used as the evaluation metric for CV. It is computed between a sample from the original data and a sample from the fitted model, for a sequence of Gaussian kernels with varying bandwidths, and the resulting values are aggregated using the user-supplied function agg_func.
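
To make the metric concrete, here is a minimal base R sketch of a multi-bandwidth Gaussian-kernel MMD. It uses the biased (V-statistic) estimate of the squared MMD; the helper names (sq_dists, mmd2_gauss, mmd_agg), the bandwidth values, and the choice of estimator are all assumptions for illustration and may differ from what the package does internally.

```r
# Squared Euclidean distances between the rows of a and the rows of b
sq_dists <- function(a, b) {
  d <- outer(rowSums(a^2), rowSums(b^2), "+") - 2 * tcrossprod(a, b)
  pmax(d, 0)  # clip tiny negative values caused by rounding
}

# Biased (V-statistic) estimate of the squared MMD for one Gaussian bandwidth
mmd2_gauss <- function(x, y, bw) {
  k <- function(a, b) exp(-sq_dists(a, b) / (2 * bw^2))
  mean(k(x, x)) + mean(k(y, y)) - 2 * mean(k(x, y))
}

# Evaluate over several bandwidths and aggregate, in the spirit of agg_func
# (the bandwidth values here are hypothetical)
mmd_agg <- function(x, y, bws = c(0.5, 1, 2, 4), agg_func = mean) {
  agg_func(vapply(bws, function(bw) mmd2_gauss(x, y, bw), numeric(1)))
}

set.seed(1)
x      <- matrix(rpois(200, 2), ncol = 4)  # stand-in for held-out data
y_same <- matrix(rpois(200, 2), ncol = 4)  # sample from a matching model
y_diff <- matrix(rpois(200, 6), ncol = 4)  # sample from a mismatched model

mmd_agg(x, y_same)  # small when the two samples share a distribution
mmd_agg(x, y_diff)  # larger when the distributions differ
```

In a CV loop of this kind, the aggregated value would be computed once per lambda, and the lambda with the smallest value would be selected.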

Examples

out = fit_pois_glmnet_mmdcv(amazon, lambda = 10^seq(-4,-1.3, length.out = 5))
#> Loading required package: Matrix
#> --- Starting CV ---
#> Rep 1/2 ...
#> Error in sample.int(length(x), size, replace, prob): invalid 'size' argument