TunerHyperband class that implements hyperband tuning. Hyperband is a budget-oriented procedure that weeds out poorly performing configurations early in a sequential training process, which increases tuning efficiency.

For this, several brackets are constructed, each with an associated set of configurations. Each bracket has several stages. Different brackets are initialized with different numbers of configurations and different budget sizes. To see what the bracket layout looks like for a given set of arguments, see the Details section.

To identify the budget for evaluating hyperband, the user has to specify explicitly which hyperparameter of the learner influences the budget by tagging a single hyperparameter in the paradox::ParamSet with "budget". An alternative approach using subsampling and pipelines is described below.

Naturally, hyperband terminates once all of its brackets are evaluated, so a bbotk::Terminator in the tuning instance acts as an upper bound and should only be set to a low value if one is unsure how long hyperband will take to finish under the given settings.
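To judge whether a terminator would cut a run short, it can help to count how many evaluations a complete hyperband run performs. The following base R sketch is not part of the package: `hyperband_n_evals` is a hypothetical helper that sums the per-stage configuration counts obtained from the bracket formulas in the Details section, where `R` is taken to be the upper budget after rescaling the lower budget to 1.

```r
# Hypothetical helper (not part of mlr3hyperband): counts the evaluations of a
# complete hyperband run from the bracket formulas in the Details section.
hyperband_n_evals = function(R, eta) {
  smax = floor(log(R, eta))
  B = (smax + 1) * R
  total = 0
  for (s in smax:0) {
    # number of configurations at the start of this bracket
    n = ceiling((B / R) * (eta^s / (s + 1)))
    # stage i of the bracket evaluates floor(n * eta^(-i)) configurations
    total = total + sum(floor(n * eta^(-(0:s))))
  }
  total
}

# Settings of the example on this page: budget from 1 to 4, eta = 2
hyperband_n_evals(R = 4, eta = 2)
```

For these settings the count is 14, which agrees with the 14 rows of the archive printed in the Examples section.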

## Source

Li L, Jamieson K, DeSalvo G, Rostamizadeh A, Talwalkar A (2018). “Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization.” Journal of Machine Learning Research, 18(185), 1-52. https://jmlr.org/papers/v18/16-558.html.

## Details

This section explains how the constants for each bracket are calculated. A short overview is given here; for more details, please see the original paper (see Source). To stay consistent with the notation in the paper (and to save space in the formulas), R denotes the upper budget that the last remaining configuration should reach. The number of brackets is floor(log(R, eta)) + 1. The starting budget in each bracket is R * eta^(-s), where s is the maximum bracket minus the current bracket index. The number of starting configurations in each bracket is ceiling((B/R) * ((eta^s)/(s+1))), with B = (number of brackets) * R. To obtain a table with the full bracket layout, load the following function and execute it for the desired R and eta.

hyperband_brackets = function(R, eta) {

  result = data.frame()
  smax = floor(log(R, eta))
  B = (smax + 1) * R

  # outer loop - iterate over brackets
  for (s in smax:0) {

    n = ceiling((B/R) * ((eta^s)/(s+1)))
    r = R * eta^(-s)

    # inner loop - iterate over bracket stages
    for (i in 0:s) {

      ni = floor(n * eta^(-i))
      ri = r * eta^i
      result = rbind(result, c(smax - s + 1, i + 1, ri, ni))
    }
  }

  names(result) = c("bracket", "bracket_stage", "budget", "n_configs")
  return(result)
}

hyperband_brackets(R = 81L, eta = 3L)
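As a compact cross-check of the three formulas, the starting values of each bracket can also be computed in one vectorised step. This is only a sketch in base R; `bracket_summary` and its column names `n_start` and `r_start` are hypothetical and not part of the package.

```r
# Hypothetical helper: starting number of configurations and starting budget
# of every bracket, computed directly from the formulas in this section.
bracket_summary = function(R, eta) {
  smax = floor(log(R, eta))      # number of brackets minus one
  B = (smax + 1) * R             # B = (number of brackets) * R
  s = smax:0                     # one entry per bracket
  data.frame(
    bracket = smax - s + 1,
    n_start = ceiling((B / R) * (eta^s / (s + 1))),
    r_start = R * eta^(-s)
  )
}

bracket_summary(R = 81, eta = 3)
```

For R = 81 and eta = 3 this gives five brackets that start with 81, 34, 15, 8 and 5 configurations at budgets 1, 3, 9, 27 and 81, respectively.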


## Parameters

eta

numeric(1)
Fraction parameter of the successive halving algorithm: with every stage, the budget per configuration is increased by a factor of eta and only the best 1/eta configurations are carried over to the next stage. Non-integer values are supported, but eta must be greater than 1.

sampler

Object defining how the samples of the parameter space should be drawn during the initialization of each bracket. The default is uniform sampling.
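The role of eta can be illustrated with a single successive halving step. The sketch below is an illustration in base R under stated assumptions, not the package's implementation: `halving_step` is a hypothetical helper, lower scores are assumed to be better (as for classif.ce), and keeping at least one configuration is an assumption made for this example.

```r
# Hypothetical helper: one successive halving step.
# Keeps the best floor(n / eta) configurations and multiplies the budget by eta.
halving_step = function(scores, budget, eta) {
  n_keep = max(1, floor(length(scores) / eta))   # assumption: keep at least one
  keep = order(scores)[seq_len(n_keep)]          # indices of the lowest scores
  list(keep = keep, budget = budget * eta)
}

# Four configurations evaluated at budget 1, eta = 2
step = halving_step(scores = c(0.08, 0.40, 0.04, 0.12), budget = 1, eta = 2)
step$keep     # indices of the surviving configurations
step$budget   # budget used in the next stage
```

Here the two configurations with the lowest scores (indices 3 and 1) survive and are re-evaluated with a budget of 2.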

## Archive

The mlr3tuning::ArchiveTuning holds the following additional columns that are specific to the hyperband tuner:

• bracket (integer(1))
The bracket index reported in the console logs does not match the original hyperband algorithm, which counts the brackets down and stops after evaluating bracket 0. The true bracket indices are given in this column.

• bracket_stage (integer(1))
The bracket stage of each bracket. Hyperband starts counting at 0.

• budget_scaled (numeric(1))
The intermediate budget in each bracket stage calculated by hyperband. Because hyperband was originally formulated only for budgets starting at 1, some rescaling is done to allow budgets starting at different values. For this, budgets are internally divided by the lower budget bound to get a lower budget of 1. Before the learner receives its budgets for evaluation, the budget is transformed back to match the original scale again.

• budget_real (numeric(1))
The real budget values the learner uses for evaluation after hyperband calculated its scaled budget.

• n_configs (integer(1))
The number of configurations evaluated in each stage. These correspond to the n_i in the original paper.
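The relation between budget_scaled and budget_real can be illustrated with plain arithmetic. The following base R sketch only mirrors the description above and does not show the package internals; the bounds `lower = 4` and `upper = 64` are made-up values for a hypothetical budget parameter.

```r
lower = 4    # hypothetical lower bound of the budget parameter
upper = 64   # hypothetical upper bound of the budget parameter
eta = 2

R = upper / lower                      # scaled upper budget; the lower bound maps to 1
smax = floor(log(R, eta))
budget_scaled = R * eta^(-(smax:0))    # starting budget of each bracket on the scaled range
budget_real = budget_scaled * lower    # transformed back to the original scale

data.frame(bracket = smax:0, budget_scaled, budget_real)
```

With these values the scaled starting budgets are 1, 2, 4, 8 and 16, and the learner would receive 4, 8, 16, 32 and 64.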

## Hyperband without learner budget

Thanks to mlr3pipelines, it is possible to use hyperband in combination with learners lacking a natural budget parameter. For example, any mlr3::Learner can be augmented with an mlr3pipelines::PipeOp operator such as mlr3pipelines::PipeOpSubsample. With the subsampling rate as budget parameter, the resulting mlr3pipelines::GraphLearner is fitted on small proportions of the mlr3::Task in the first brackets, and on the complete Task in the last brackets. See examples for some code.

## Custom sampler

Hyperband supports custom paradox::Sampler objects for drawing the initial configurations in each bracket. A custom sampler may look like this (the full example is given in the examples section):

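# - beta distribution with alpha = 2 and beta = 5
# - categorical distribution with custom probabilities
#   (the probabilities below are illustrative values)
sampler = SamplerJointIndep$new(list(
  Sampler1DRfun$new(params[[2]], function(n) rbeta(n, 2, 5)),
  Sampler1DCateg$new(params[[3]], prob = c(0.2, 0.3, 0.5))
))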
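To get a feeling for what such a sampler draws, the two distributions can be mimicked in base R without paradox. This is only a sketch: the level names are taken from the booster parameter in the Examples section, and the probabilities are illustrative values, not defaults of any package.

```r
set.seed(42)
n = 5

# numeric hyperparameter on [0, 1]: Beta(2, 5) puts more mass on small values
eta_draws = rbeta(n, 2, 5)

# categorical hyperparameter: levels drawn with unequal, illustrative probabilities
booster_levels = c("gbtree", "gblinear", "dart")
booster_draws = sample(booster_levels, size = n, replace = TRUE,
                       prob = c(0.2, 0.3, 0.5))

data.frame(eta = eta_draws, booster = booster_draws)
```

Because a Beta(2, 5) distribution has mean 2/7, initial configurations drawn this way favour small eta values compared to uniform sampling.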


## Super class

mlr3tuning::Tuner -> TunerHyperband

## Methods

### Public methods

### Method new()

Creates a new instance of this R6 class.

### Method clone()

The objects of this class are cloneable with this method.

#### Arguments

deep

Whether to make a deep clone.

## Examples

if(requireNamespace("xgboost")) {
library(mlr3)
library(mlr3learners)
library(mlr3tuning)
library(mlr3hyperband)

# Define hyperparameter and budget parameter for tuning with hyperband
ps = ParamSet$new(list(
  ParamInt$new("nrounds", lower = 1, upper = 4, tags = "budget"),
  ParamDbl$new("eta", lower = 0, upper = 1),
  ParamFct$new("booster", levels = c("gbtree", "gblinear", "dart"))
))

# Define termination criterion
# Hyperband terminates itself
terminator = trm("none")

# Create tuning instance
inst = TuningInstanceSingleCrit$new(
  task = tsk("iris"),
  learner = lrn("classif.xgboost"),
  resampling = rsmp("holdout"),
  measure = msr("classif.ce"),
  search_space = ps,
  terminator = terminator
)

# Load tuner
tuner = tnr("hyperband", eta = 2L)

# Trigger optimization
tuner$optimize(inst)

# Print all evaluations
as.data.table(inst$archive)
}
#> Loading required namespace: xgboost
#> [04:29:21] WARNING: amalgamation/../src/learner.cc:1061: Starting in XGBoost 1.3.0, the default evaluation metric used with the objective 'multi:softprob' was changed from 'merror' to 'mlogloss'. Explicitly set eval_metric if you'd like to restore the old behavior.
#>            eta  booster nrounds bracket bracket_stage budget_scaled budget_real
#>  1: 0.03745117     dart       1       2             0             1           1
#>  2: 0.59355379     dart       1       2             0             1           1
#>  3: 0.23697755   gbtree       1       2             0             1           1
#>  4: 0.90629727     dart       1       2             0             1           1
#>  5: 0.03745117     dart       2       2             1             2           2
#>  6: 0.59355379     dart       2       2             1             2           2
#>  7: 0.03745117     dart       4       2             2             4           4
#>  8: 0.91766051     dart       2       1             0             2           2
#>  9: 0.97284422 gblinear       2       1             0             2           2
#> 10: 0.81908245     dart       2       1             0             2           2
#> 11: 0.91766051     dart       4       1             1             4           4
#> 12: 0.72175973   gbtree       4       0             0             4           4
#> 13: 0.86661570     dart       4       0             0             4           4
#> 14: 0.23845311 gblinear       4       0             0             4           4
#>     n_configs classif.ce                                uhash
#>  1:         4       0.08 8095ddab-98fc-43b0-941b-eff009a394c2
#>  2:         4       0.08 e66e53cd-6dea-4dbb-9e66-52cb98e2ecf7
#>  3:         4       0.08 a636914d-b302-4147-a3bc-51abc1cc1c9b
#>  4:         4       0.08 00e531b5-a64f-4f63-817a-54a552a99b82
#>  5:         2       0.08 14a355cd-fd1d-412c-b3a4-fe0a1f0dea6d
#>  8:         3       0.04 148f7095-69c6-4912-aa8f-f6645b1ce078
#>  9:         3       0.40 540e1f69-ed8f-411e-9d14-379f4f910204
#> 10:         3       0.04 0163bf92-2bae-400c-9dd7-1fa2c869dbed
#> 12:         3       0.04 2e6d8e02-4d1b-40a7-8b16-78c56d711389
#> 13:         3       0.04 c3fedf01-6fc5-48bc-a83d-a5254b5d6345
#> 14:         3       0.40 3255a53a-76d8-4b93-8977-b648e5f75c0d
#>               timestamp batch_nr x_domain_nrounds x_domain_eta x_domain_booster
#>  1: 2021-03-21 04:29:22        1                1   0.03745117             dart
#>  2: 2021-03-21 04:29:22        1                1   0.59355379             dart
#>  3: 2021-03-21 04:29:22        1                1   0.23697755           gbtree
#>  4: 2021-03-21 04:29:22        1                1   0.90629727             dart
#>  5: 2021-03-21 04:29:22        2                2   0.03745117             dart
#>  6: 2021-03-21 04:29:22        2                2   0.59355379             dart
#>  7: 2021-03-21 04:29:22        3                4   0.03745117             dart
#>  8: 2021-03-21 04:29:23        4                2   0.91766051             dart
#>  9: 2021-03-21 04:29:23        4                2   0.97284422         gblinear
#> 10: 2021-03-21 04:29:23        4                2   0.81908245             dart
#> 11: 2021-03-21 04:29:23        5                4   0.91766051             dart
#> 12: 2021-03-21 04:29:23        6                4   0.72175973           gbtree
#> 13: 2021-03-21 04:29:23        6                4   0.86661570             dart
#> 14: 2021-03-21 04:29:23        6                4   0.23845311         gblinear