Wrapper for lgb.train tree-based models
with some expanded/advanced options.
Usage
train_lightgbm(
  x,
  y,
  num_iterations = 10,
  max_depth = 17,
  num_leaves = 31,
  link_max_depth = FALSE,
  add_to_linked_depth = 2L,
  categorical_feature = NULL,
  weight = NULL,
  validation = 0,
  sample_type = "random",
  early_stop = NULL,
  max_bin = NULL,
  feature_pre_filter = FALSE,
  free_raw_data = TRUE,
  verbose = 0,
  save_tree_error = FALSE,
  ...
)
Arguments
- x
A matrix of predictors.
- y
A numeric vector of outcome data.
- num_iterations
Integer value for the number of iterations (trees) to grow.
- max_depth
Integer value for the maximum leaf distance from the root node.
- num_leaves
Integer value for the maximum possible number of leaves in one tree.
- link_max_depth
Logical, default FALSE. When TRUE, and when max_depth is unconstrained (-1), max_depth will be set to floor(log2(num_leaves)) + add_to_linked_depth.
- add_to_linked_depth
Integer value to add to max_depth when it is linked to num_leaves.
- categorical_feature
A character vector of feature names or an integer vector with the indices of the features.
- weight
A numeric vector of sample weights. Should be the same length as the number of rows of x.
- validation
A non-negative number on [0, 1). validation is the proportion of data in x and y that is used for performance assessment and early stopping.
- sample_type
The sampling method for the validation set. Can be either "random" (a completely random sample) or "recent" (the last X of the rows, where X is the proportion specified by validation).
- early_stop
An integer or NULL. If an integer, it is the number of iterations without improvement before stopping. Must be set when validation is > 0.
- max_bin
Max number of bins that feature values will be bucketed in.
- feature_pre_filter
Tell LightGBM to ignore the features that are unsplittable based on min_data_in_leaf.
- free_raw_data
LightGBM constructs its data format, called a "Dataset", from tabular data. By default, that Dataset object on the R side does not keep a copy of the raw data. This reduces LightGBM's memory consumption, but it means that the Dataset object cannot be changed after it has been constructed. If you'd prefer to be able to change the Dataset object after construction, set free_raw_data = FALSE. Useful for debugging.
- verbose
Integer. < 0: Fatal, = 0: Error (Warning), = 1: Info, > 1: Debug.
- save_tree_error
Boolean. Whether or not to use the training set to compute errors for each tree that will be stored on the record_evals attribute. Note that this parameter is mutually exclusive with validation and early_stop because otherwise it can override the set used for cross validation.
- ...
Engine arguments, hyperparameters, etc. that are passed on to lgb.train.
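Examples
The arithmetic behind link_max_depth and the two sample_type options can be sketched in base R. This is an illustrative sketch only, not the package's implementation: the helper names linked_max_depth() and split_validation() are hypothetical, and rounding the validation set size down is an assumption.

```r
# Hypothetical helper: the depth described for link_max_depth = TRUE,
# i.e. floor(log2(num_leaves)) + add_to_linked_depth.
linked_max_depth <- function(num_leaves, add_to_linked_depth = 2L) {
  as.integer(floor(log2(num_leaves))) + as.integer(add_to_linked_depth)
}

# Hypothetical helper: choose which of n rows go to the validation set.
# Assumption: the validation set size is rounded down.
split_validation <- function(n, validation = 0,
                             sample_type = c("random", "recent")) {
  sample_type <- match.arg(sample_type)
  stopifnot(validation >= 0, validation < 1)
  n_val <- floor(n * validation)
  if (n_val == 0) {
    return(integer(0))  # validation = 0 means no held-out rows
  }
  if (sample_type == "recent") {
    # "recent": the last n_val rows, in their original order
    seq.int(n - n_val + 1, n)
  } else {
    # "random": n_val rows drawn without replacement
    sort(sample.int(n, n_val))
  }
}

depth <- linked_max_depth(31)                     # floor(log2(31)) = 4, plus 2
recent_rows <- split_validation(8, 0.25, "recent")  # the last two of eight rows
random_rows <- split_validation(8, 0.25, "random")  # two rows chosen at random
```

With the defaults num_leaves = 31 and add_to_linked_depth = 2L, the linked depth works out to 6.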
