BPt.Ensemble

class BPt.Ensemble(obj, models, params=0, scope='all', param_search=None, target_scaler=None, base_model=None, cv=None, single_estimator=False, n_jobs_type='ensemble', **extra_params)

The Ensemble object is a ModelPipeline (or Pipeline) piece, designed to be passed as an estimator in the same way as Model. This class is used to create a variety of different ensemble-based estimators.

Parameters
obj : str
Each str passed to Ensemble refers to a type of ensemble to train. The resulting ensemble also depends on the input passed to the models parameter and on the additional parameters passed when initializing this object.
See Ensemble Types to see all available options for ensembles.

Warning

Passing custom objects here, while technically possible, is not currently fully supported. A custom object must meet certain assumptions in order to work; specifically, it should have input params similar to those of comparable existing ensembles. For example, if single_estimator is False and needs_split is also False, then the passed object must accept an input parameter estimators, which takes a list of (str, estimator) tuples. If needs_split is still False but single_estimator is True, then the passed object must support an init param base_estimator, which takes a single estimator.
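As a rough illustration of the estimators convention described in this warning, the sketch below defines a toy ensemble whose init accepts a list of (str, estimator) tuples. The class names MeanEnsemble and ConstantRegressor are hypothetical and not part of BPt. A real custom object would most likely also need to follow the scikit-learn estimator API (for example get_params and set_params), which this sketch omits.

```python
from statistics import mean


class ConstantRegressor:
    """Toy estimator that always predicts a fixed value (illustration only)."""

    def __init__(self, value=0.0):
        self.value = value

    def fit(self, X, y):
        # Nothing to learn; return self to allow call chaining
        return self

    def predict(self, X):
        return [self.value for _ in X]


class MeanEnsemble:
    """Hypothetical custom ensemble following the `estimators` convention."""

    def __init__(self, estimators):
        # estimators: list of (str, estimator) tuples
        self.estimators = estimators

    def fit(self, X, y):
        for _, est in self.estimators:
            est.fit(X, y)
        return self

    def predict(self, X):
        # Average the per-sample predictions of every base estimator
        per_model = [est.predict(X) for _, est in self.estimators]
        return [mean(sample) for sample in zip(*per_model)]


ensemble = MeanEnsemble(
    estimators=[("low", ConstantRegressor(1.0)), ("high", ConstantRegressor(3.0))]
)
preds = ensemble.fit([[0], [1]], [1.0, 3.0]).predict([[0], [1]])
```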

models : Model, Ensemble or list of
The models parameter accepts any model-like pipeline piece, i.e., a Model or even another Ensemble, or a list of these. The passed pieces are combined with the requested ensemble object to create the requested ensemble.
See Model for how to create valid base model(s) to pass as input here.
New in version 2.1.4: You may now pass instances of class Pipeline directly, without wrapping them in a Model.
params : int, str or dict of params, optional
The parameter params can be used to set an associated distribution of hyper-parameters, fixed parameters, or a combination of both.
Preset parameter options and distributions are listed for each choice of params, alongside the corresponding obj, at Pipeline Options.
More information on how this parameter works can be found at Params.
default = 0
scope : Scope, optional
The scope parameter determines the subset of features / columns on which this object should operate within the created pipeline. For example, if scope = 'float' is specified, this object will only operate on columns with scope float.
See Scope for more information on how scopes can be specified.
default = 'all'
param_search : ParamSearch, None, optional
This parameter optionally specifies that this object should be nested with a hyper-parameter search.
If passed an instance of ParamSearch, the underlying object, or components of the underlying object (if a pipeline), must have at least one valid hyper-parameter distribution to search over.
If left as None, the default, then no hyper-parameter search will be performed.
default = None
target_scaler : Scaler, None, optional
You may optionally pass an instance of Scaler here to have properly nested target scaling / reverse scaling (before scoring) applied.

Warning

This parameter is still experimental. It has not been fully tested in complicated nesting cases, e.g., if Model is wrapping a nested Pipeline, this param will likely break.

default = None
base_model : Model, None, optional
If the requested ensemble method has a final_estimator parameter (as distinct from its base models), as stacking does, you may pass a Model-type object here to be used as that final estimator.
Otherwise this is left as None by default, and if the requested ensemble has a final_estimator parameter, None is passed to it, which typically selects that ensemble's default.
default = None
cv : CV or None, optional
Used for passing custom nested internal CV split behavior to ensembles which employ splits, e.g., stacking.
The passed input can be either an instance of CV or can be any valid scikit-learn style cv, e.g., the integer 5.
default = None
single_estimator : bool, optional
The parameter single_estimator tells the Ensemble object whether the passed models should be treated as a single estimator, in other words whether the base ensemble object expects its input as just one estimator.
This parameter is used for ensemble types that require an init param base_estimator. If multiple models are passed to models but single_estimator is True, the models are automatically wrapped in a voting ensemble, creating one single estimator.
default = False
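The effect of single_estimator on which init param gets populated can be sketched roughly as below. The helper resolve_init_kwargs, the placeholder model strings, the "model_i" naming scheme and the tuple used to represent a voting wrapper are all assumptions made for illustration; this is not BPt's internal code, and it assumes needs_split is False.

```python
def resolve_init_kwargs(models, single_estimator):
    """Hypothetical helper: choose the init param an ensemble would receive."""
    if not single_estimator:
        # The ensemble expects a list of (str, estimator) tuples
        named = [(f"model_{i}", m) for i, m in enumerate(models)]
        return {"estimators": named}
    if len(models) == 1:
        # A single model is passed through unchanged
        return {"base_estimator": models[0]}
    # Several models: wrap them in one voting ensemble (stand-in tuple here)
    return {"base_estimator": ("voting", list(models))}


one_model = resolve_init_kwargs(["ridge"], single_estimator=True)
wrapped = resolve_init_kwargs(["ridge", "svm"], single_estimator=True)
listed = resolve_init_kwargs(["ridge", "svm"], single_estimator=False)
```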
n_jobs_type : 'ensemble' or 'models', optional
Valid options are either 'ensemble' or 'models'.
This parameter controls how the total n_jobs are distributed. If 'ensemble', all of the n_jobs are used by the ensemble object, and every sub-model is set to n_jobs = 1. If 'models', the ensemble object is not multi-processed, i.e., it is set to n_jobs = 1, and the n_jobs are instead distributed to the base models.
For example, if you are training a stacking regressor with n_jobs = 16 and you have 16 or more models, then 'ensemble' is likely a good choice. If instead you have only 3 base models, and one or more of those 3 could benefit from a higher n_jobs, then setting n_jobs_type to 'models' might give a speed-up.
default = 'ensemble'
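The two n_jobs_type options can be summarized with a minimal sketch. The function split_n_jobs is hypothetical and only mirrors the description above. In particular, the "models" entry is the n_jobs budget handed to the base models, and how that budget is shared among individual models is not specified here.

```python
def split_n_jobs(n_jobs, n_jobs_type="ensemble"):
    """Hypothetical helper mirroring the documented n_jobs_type behaviour."""
    if n_jobs_type == "ensemble":
        # Ensemble object is parallelized; every sub-model runs with n_jobs = 1
        return {"ensemble": n_jobs, "models": 1}
    if n_jobs_type == "models":
        # Ensemble object runs with n_jobs = 1; base models get the budget
        return {"ensemble": 1, "models": n_jobs}
    raise ValueError("n_jobs_type must be 'ensemble' or 'models'")


stacking_many_models = split_n_jobs(16, "ensemble")
few_heavy_models = split_n_jobs(16, "models")
```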
extra_params : Extra Params
You may pass additional kwargs-style arguments for this piece as Extra Params. Any values passed here will be used to try to set that value in the requested obj.
Any parameter-value pairs specified here take priority over any set via params. For example, say the object we are initializing, 'fake obj', has a parameter called size, and we want it fixed at 10; we can specify that with:
(obj='fake obj', ..., size=10)

See Extra Params for more information.
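The priority rule for extra params can be pictured as a dictionary merge in which later values win. This is only a sketch of the rule as stated above, not of how BPt applies it internally; resolve_params and the example parameter names are hypothetical.

```python
def resolve_params(preset_params, extra_params):
    """Hypothetical helper: extra params override values set via params."""
    # Later entries in the merge win, so extra_params takes priority
    return {**preset_params, **extra_params}


# Values as they might come from `params`, then an override via size=10
resolved = resolve_params({"size": 5, "depth": 3}, {"size": 10})
```

Here size ends up fixed at 10, while depth keeps the value that came from params.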

Methods

build([dataset, problem_spec])

This method is used to convert a single pipeline piece into the base sklearn style object used in the pipeline.

copy()

This method returns a deepcopy of the base object.

get_params([deep])

Get parameters for this estimator.

set_params(**params)

Set the parameters of this estimator.