Package website: release | dev


This package provides the hyperband and successive halving tuning algorithms for mlr3.

Installation

Install the last release from CRAN:

install.packages("mlr3hyperband")

Install the development version from GitHub:

remotes::install_github("mlr-org/mlr3hyperband")

Short description

Hyperband is a budget-oriented procedure that weeds out suboptimally performing configurations early in their training process, aiming to increase the efficiency of the tuning procedure. For this, several brackets are constructed, each with an associated set of configurations. These configurations are initialized by stochastic, often uniform, sampling. Each bracket is divided into multiple stages, and configurations are evaluated with an increasing budget in each stage. Note that currently all configurations are trained completely from the beginning, so no online updates to the models are performed.
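
For intuition, the following plain R sketch (an illustration only, not part of the mlr3hyperband API; the function name and its parameters are hypothetical) computes the bracket and stage schedule that hyperband derives from a minimum budget, a maximum budget, and a halving rate eta:

# illustration only (not part of the package): the schedule hyperband derives
# from a minimum budget, a maximum budget and a halving rate eta
hyperband_schedule = function(r_min, r_max, eta) {
  s_max = floor(log(r_max / r_min, base = eta))
  for (s in seq(s_max, 0)) {
    # each bracket samples n fresh configurations and starts them on a low budget
    n = ceiling((s_max + 1) * eta^s / (s + 1))
    for (i in seq(0, s)) {
      # after every stage, only the best 1 / eta fraction of configurations is kept
      cat("bracket", s, "stage", i, ":", floor(n / eta^i),
        "configurations at budget", r_max / eta^(s - i), "\n")
    }
  }
}

# e.g. with the budget bounds of the XGBoost example below (nrounds from 1 to 16)
hyperband_schedule(r_min = 1, r_max = 16, eta = 2)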

Different brackets are initialized with different numbers of configurations and different budget sizes. To identify the budget used by hyperband, the user has to specify explicitly which hyperparameter of the learner influences the budget by tagging a single hyperparameter in the parameter set with "budget". An alternative approach using subsampling and pipelines is described further below.

Examples

Basic

If you are already familiar with mlr3tuning, the only change compared to other tuners is tagging a numeric hyperparameter of the search space as the budget. Afterwards, hyperband can be handled like any other tuner. Hyperband was originally designed with a “natural” learning parameter in mind as the budget, such as nrounds of the XGBoost learner.

library(mlr3hyperband)
library(mlr3learners)

# define hyperparameter and budget parameter
search_space = ps(
  nrounds = p_int(lower = 1, upper = 16, tags = "budget"),
  eta = p_dbl(lower = 0, upper = 1),
  booster = p_fct(levels = c("gbtree", "gblinear", "dart"))
)

# hyperparameter tuning on the pima indians diabetes data set
instance = tune(
  method = "hyperband",
  task = tsk("pima"),
  learner = lrn("classif.xgboost", eval_metric = "logloss"),
  resampling = rsmp("cv", folds = 3),
  measure = msr("classif.ce"),
  search_space = search_space
)

# best performing hyperparameter configuration
instance$result
##    nrounds       eta booster learner_param_vals  x_domain classif.ce
## 1:       2 0.4364793    dart          <list[6]> <list[3]>  0.2669271
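
A common next step, using the standard mlr3tuning accessors, is to apply the best configuration to a fresh learner and train it on the full task (sketch only):

# train a final model with the best configuration found during tuning
learner = lrn("classif.xgboost", eval_metric = "logloss")
learner$param_set$values = instance$result_learner_param_vals
learner$train(tsk("pima"))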

Subsampling

It is also possible to use mlr3hyperband to tune learners that do not have a natural fidelity parameter. In this case, mlr3pipelines can be used to define data subsampling as a preprocessing step. The frac parameter of the subsampling step, which defines the fraction of the training data to be used, then acts as the budget parameter.

library(mlr3verse)
library(mlr3hyperband)

# combine data subsampling with the learner and wrap the graph as a learner
learner = as_learner(po("subsample") %>>% lrn("classif.rpart"))

# define subsampling parameter as budget
search_space = ps(
  classif.rpart.cp = p_dbl(lower = 0.001, upper = 0.1),
  classif.rpart.minsplit = p_int(lower = 1, upper = 10),
  subsample.frac = p_dbl(lower = 0.1, upper = 1, tags = "budget")
)

# hyperparameter tuning on the pima indians diabetes data set
instance = tune(
  method = "hyperband",
  task = tsk("pima"),
  learner = learner,
  resampling = rsmp("cv", folds = 3),
  measure = msr("classif.ce"),
  search_space = search_space
)

# best performing hyperparameter configuration
instance$result
##    classif.rpart.cp classif.rpart.minsplit subsample.frac learner_param_vals  x_domain classif.ce
## 1:       0.02258595                      4              1          <list[6]> <list[3]>  0.2421875
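
The archive of the instance records every evaluated configuration together with the budget (here subsample.frac) it was evaluated with, which is useful for inspecting the individual brackets and stages:

# inspect all evaluated configurations and their budgets
library(data.table)
as.data.table(instance$archive)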

Successive Halving

The package also provides the successive halving algorithm, which hyperband uses as a subroutine: a single bracket of configurations starts on a small budget, and after each stage only the best-performing fraction is promoted to the next stage with an increased budget. The usage is identical; only the method changes.

library(mlr3hyperband)
library(mlr3learners)

# define hyperparameter and budget parameter
search_space = ps(
  nrounds = p_int(lower = 1, upper = 16, tags = "budget"),
  eta = p_dbl(lower = 0, upper = 1),
  booster = p_fct(levels = c("gbtree", "gblinear", "dart"))
)

# hyperparameter tuning on the pima indians diabetes data set
instance = tune(
  method = "successive_halving",
  task = tsk("pima"),
  learner = lrn("classif.xgboost", eval_metric = "logloss"),
  resampling = rsmp("cv", folds = 3),
  measure = msr("classif.ce"),
  search_space = search_space
)

# best performing hyperparameter configuration
instance$result
##    nrounds     eta booster learner_param_vals  x_domain classif.ce
## 1:       8 0.57244  gbtree          <list[6]> <list[3]>  0.2304688
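
Since the configurations within a stage are evaluated independently, tuning can be parallelized via the future framework that mlr3 uses; a minimal setup, assuming the future package is installed, is:

# evaluate the configurations of a stage in parallel on the local machine
future::plan("multisession")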