Wrapper for lgb.train
train_lightgbm() fits lightgbm tree-based models (via lgb.train()) with some expanded/advanced options.
train_lightgbm(
  x,
  y,
  num_iterations = 10,
  max_depth = 17,
  num_leaves = 31,
  link_max_depth = FALSE,
  add_to_linked_depth = 2L,
  categorical_feature = NULL,
  weight = NULL,
  validation = 0,
  sample_type = "random",
  early_stop = NULL,
  max_bin = NULL,
  feature_pre_filter = FALSE,
  free_raw_data = TRUE,
  verbose = 0,
  save_tree_error = FALSE,
  ...
)
x: A matrix of predictors.
y: A numeric vector of outcome data.
num_iterations: Integer value for the number of iterations (trees) to grow.
max_depth: Integer value for the maximum leaf distance from the root node.
num_leaves: Integer value for the maximum possible number of leaves in one tree.
link_max_depth: Logical, default FALSE. When TRUE, and when max_depth is unconstrained (-1), max_depth will be set to floor(log2(num_leaves)) + add_to_linked_depth (see the worked sketch after the argument descriptions).
add_to_linked_depth: Integer value to add to max_depth when it is linked to num_leaves.
categorical_feature: A character vector of feature names or an integer vector of feature indices identifying the categorical features.
weight: A numeric vector of sample weights. Its length should equal the number of rows of x.
validation: A number on [0, 1). The proportion of data in x and y that is used for performance assessment and early stopping.
sample_type: The sampling method for the validation set. Either "random" (a completely random sample) or "recent" (the last rows of x and y, in the proportion specified by validation).
early_stop: An integer or NULL. If an integer, it is the number of iterations without improvement before stopping. Must be set when validation is > 0.
max_bin: Max number of bins that feature values will be bucketed in.
feature_pre_filter: Logical. Whether to tell LightGBM to ignore features that are unsplittable based on min_data_in_leaf.
free_raw_data: LightGBM constructs its data format, called a "Dataset", from tabular data. By default, that Dataset object on the R side does not keep a copy of the raw data. This reduces LightGBM's memory consumption, but it means that the Dataset object cannot be changed after it has been constructed. If you'd prefer to be able to change the Dataset object after construction, set free_raw_data = FALSE. Useful for debugging.
verbose: Integer. < 0: Fatal, = 0: Error (Warning), = 1: Info, > 1: Debug.
save_tree_error: Logical. Whether or not to use the training set to compute the error for each tree; the results are stored in the record_evals attribute. Note that this parameter is mutually exclusive with validation and early_stop, since it would otherwise override the set used for validation.
...: Engine arguments, hyperparameters, etc. that are passed on to lgb.train().
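To make the max_depth linking concrete, here is a small worked sketch of the documented formula, computed by hand with the defaults shown in the usage above (the wrapper's internal code may differ):

# Depth-linking rule used when link_max_depth = TRUE and max_depth is -1:
# max_depth = floor(log2(num_leaves)) + add_to_linked_depth
num_leaves <- 31
add_to_linked_depth <- 2L
floor(log2(num_leaves)) + add_to_linked_depth
#> [1] 6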
Returns a fitted lgb.Booster object.
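A minimal usage sketch, assuming train_lightgbm() is accessible from the package that provides it and that the lightgbm package is installed (the data and settings below are purely illustrative):

x <- as.matrix(mtcars[, -1])  # predictors
y <- mtcars$mpg               # numeric outcome

fit <- train_lightgbm(
  x, y,
  num_iterations = 50,
  num_leaves = 31,
  validation = 0.2,        # hold out 20% of rows for performance assessment
  sample_type = "recent",  # use the most recent rows as the validation set
  early_stop = 5L,         # stop after 5 iterations without improvement
  verbose = 0
)

# The result is an lgb.Booster, so lightgbm's predict() method applies
preds <- predict(fit, x)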