BPt.CV

class BPt.CV(splits=3, n_repeats=1, cv_strategy=None, random_state='context', only_fold=None, **cv_strategy_kwargs)

This object is used to define a BPt-style custom cross-validation (CV) strategy, e.g., K-fold.

Parameters
splits : int, float, str or list of str, optional

splits specifies the base CV strategy to use.

Specifically, the options for splits are:

  • int

    The number of k-fold splits to conduct. (E.g., 3 for a 3-fold CV split to be conducted at every hyper-parameter evaluation.)

  • float

    Must satisfy 0 < splits < 1. Defines a single train-test style split, with that fraction of the current training data used as a validation set (e.g., 0.2 holds out 20%).

  • str

    If a str is passed, it must correspond to a loaded categorical non-input variable. In this case, a leave-out-group CV will be used according to the values of that variable.

Note that the parameter n_repeats is designed to work with any of these choices.

default = 3
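The three kinds of splits value described above can be sketched in plain Python. This is only an illustration of the resulting train/validation partitions, not BPt's implementation; the function name `sketch_partitions`, its `groups` argument (standing in for the values of the named categorical variable), and the shuffling scheme are all assumptions made for the example.

```python
import random


def sketch_partitions(n_samples, splits, groups=None, seed=0):
    """Return a list of (train_idx, val_idx) pairs for one pass of CV.

    Illustrative sketch only; not how BPt computes its splits.
    """
    indices = list(range(n_samples))

    if isinstance(splits, str):
        # str: leave one group out at a time. `groups` is a hypothetical
        # per-sample list of values for the variable named by `splits`.
        folds = []
        for g in sorted(set(groups)):
            val = [i for i in indices if groups[i] == g]
            train = [i for i in indices if groups[i] != g]
            folds.append((train, val))
        return folds

    # Shuffle once so that folds are not tied to the original row order.
    random.Random(seed).shuffle(indices)

    if isinstance(splits, float):
        # float: a single split, holding out that fraction for validation.
        if not 0 < splits < 1:
            raise ValueError("a float splits must satisfy 0 < splits < 1")
        n_val = max(1, round(n_samples * splits))
        return [(sorted(indices[n_val:]), sorted(indices[:n_val]))]

    # int: k-fold, each sample lands in exactly one validation set.
    folds = []
    for k in range(splits):
        val_set = set(indices[k::splits])
        train = [i for i in range(n_samples) if i not in val_set]
        folds.append((train, sorted(val_set)))
    return folds
```

In this sketch, an int yields `splits` partitions, a float yields exactly one, and a str yields one partition per distinct group value.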
n_repeats : int, optional

The number of times to repeat the strategy defined by splits.

For example, if n_repeats is set to 2 and splits is 3, then a twice-repeated 3-fold CV will be performed.

default = 1
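As a rough illustration of a repeated k-fold scheme, the sketch below enumerates every fold for splits=3 and n_repeats=2. It is not BPt's code: the helper `repeated_kfold` is hypothetical, and reshuffling the samples at the start of each repeat is an assumption about how repeats differ from one another.

```python
import random


def repeated_kfold(n_samples, splits=3, n_repeats=1, seed=0):
    """Enumerate (repeat, fold, train_idx, val_idx) for repeated k-fold."""
    rng = random.Random(seed)
    all_folds = []
    for repeat in range(n_repeats):
        indices = list(range(n_samples))
        rng.shuffle(indices)  # assumed: a fresh shuffle for every repeat
        for k in range(splits):
            val_set = set(indices[k::splits])
            train = [i for i in range(n_samples) if i not in val_set]
            all_folds.append((repeat, k, train, sorted(val_set)))
    return all_folds


folds = repeated_kfold(n_samples=12, splits=3, n_repeats=2)
print(len(folds))  # 6 folds in total: 3 folds x 2 repeats
```

Within each repeat, every sample appears in exactly one validation set, so the total number of model fits is splits * n_repeats.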
cv_strategy : None or CVStrategy, optional

Optional cv_strategy to employ for calculating splits. If passed None, use no strategy.

See CVStrategy

Valid cv_strategy arguments can also be passed separately. Any that are passed will override the corresponding values set in the passed cv_strategy, if one was given.

default = None
random_state : 'context', int or None, optional

The fixed random seed to which this CV object should adhere.

If left as the default value of 'context', the random state will be set based on the context in which the object is called, i.e., typically the random_state set in ProblemSpec.

default = 'context'
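The 'context' behaviour can be pictured as a simple fallback. The sketch below is a guess at the logic rather than BPt's actual code; `resolve_random_state` and its `context_random_state` argument (standing in for, e.g., the random_state of a ProblemSpec) are hypothetical names.

```python
def resolve_random_state(random_state, context_random_state):
    """Pick the seed to use, deferring to the caller's context if asked.

    Hypothetical helper; illustrates the 'context' fallback only.
    """
    if random_state == "context":
        # Defer to whatever seed the surrounding context provides.
        return context_random_state
    # An explicit int (or None) is used as given.
    return random_state
```

With this sketch, `resolve_random_state('context', 42)` gives 42, while an explicit `resolve_random_state(7, 42)` keeps 7.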
only_fold : int, list of int, or None, optional

This parameter specifies whether only a subset of the requested CV folds should be used. If kept as None, all folds are run as normal. Otherwise, if passed as an int, that int must refer to a valid CV fold, e.g.,

only_fold = 0

would run the CV, but only the first fold. Likewise, if a list is passed,

only_fold = [0, 2]

then only the first and third folds will be run. This parameter is useful when the base experiment is too computationally intensive and it is desirable to run a complete CV in smaller chunks.

Warning

When used with n_repeats > 1, only_fold indexes folds across all repeats. For example, with splits=2 and n_repeats=2, only_fold can be 0, 1, 2, or 3. Functionally, however, n_repeats will be treated as 1, e.g., when computing summary scores such as the std across repeats.

For example, if only_fold=[0, 1, 2] is passed with the setup above, then progress bars and summary stats will still show n_repeats=1.

default = None
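The indexing described in the warning can be sketched as follows. This is an illustration under stated assumptions, not BPt's implementation: `select_folds` and `to_repeat_and_fold` are hypothetical helpers, and the repeat-major ordering (all folds of repeat 0 first, then repeat 1, and so on) is an assumption about how the flattened fold index is laid out.

```python
def select_folds(splits, n_repeats, only_fold=None):
    """Return the flattened fold indices that would be run."""
    total = splits * n_repeats
    if only_fold is None:
        return list(range(total))  # normal behaviour: run every fold
    wanted = [only_fold] if isinstance(only_fold, int) else list(only_fold)
    for f in wanted:
        if not 0 <= f < total:
            raise ValueError(f"only_fold={f} is not in 0..{total - 1}")
    return wanted


def to_repeat_and_fold(index, splits):
    """Map a flattened index to (repeat, fold); assumes repeat-major order."""
    return divmod(index, splits)


chosen = select_folds(splits=2, n_repeats=2, only_fold=[0, 2])
print(chosen)                                      # [0, 2]
print([to_repeat_and_fold(i, 2) for i in chosen])  # [(0, 0), (1, 0)]
```

Running disjoint only_fold selections in separate jobs (e.g., [0, 1] and [2, 3]) covers the same folds as one full run, which is the chunking use case described above.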
cv_strategy_kwargs : kwargs, optional

If any additional parameters are passed in kwargs style, e.g.,

splits = 3

then an attempt will be made to set them on the base cv_strategy.
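The override behaviour of keyword-style parameters can be pictured as a dictionary merge. The sketch below uses plain dicts in place of a real CVStrategy object; `merge_strategy_params` is a hypothetical helper, and the keys `groups` and `stratify` are placeholder parameter names chosen for illustration.

```python
def merge_strategy_params(base_params, **cv_strategy_kwargs):
    """Combine a base strategy's parameters with separately passed ones.

    Separately passed values win; a base of None means no strategy given.
    """
    merged = dict(base_params or {})
    merged.update(cv_strategy_kwargs)  # keyword arguments take priority
    return merged


base = {"groups": "site", "stratify": None}  # placeholder parameter names
print(merge_strategy_params(base, stratify="sex"))
# {'groups': 'site', 'stratify': 'sex'}
```

The base dictionary is copied rather than modified, so the original strategy's settings are left untouched in this sketch.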

Methods

copy()

This method returns a deepcopy of the base object.

get_params([deep])

Get parameters for this estimator.

set_params(**params)

Set the parameters of this estimator.