BPt.evaluate

BPt.evaluate(pipeline, dataset, problem_spec='default', cv=5, progress_bar=True, store_preds=True, store_estimators=True, store_timing=True, store_cv=True, store_data_ref=True, decode_feat_names=True, eval_verbose=1, progress_loc=None, mute_warnings=False, **extra_params)

This function evaluates a model pipeline with cross-validation.

Parameters
pipeline : Pipeline
A BPt input class Pipeline to be initialized according to the passed dataset and problem_spec. This parameter can be either an instance of Pipeline, ModelPipeline, or one of the shorthand cases below (see the sketch that follows them).
In the case that a single str is passed, it will be assumed to be a model indicator str, and the pipeline used will be:
pipeline = Pipeline(Model(pipeline))

Likewise, if just a Model is passed, then the input will be cast as:

pipeline = Pipeline(pipeline)
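As a minimal sketch of these shorthands (assuming top-level BPt imports and a hypothetical, already-loaded Dataset named data with a target defined), the following three calls are equivalent:

from BPt import evaluate, Pipeline, Model

# These calls are equivalent; 'ridge' is treated as a model indicator str
results = evaluate(pipeline='ridge', dataset=data)
results = evaluate(pipeline=Model('ridge'), dataset=data)
results = evaluate(pipeline=Pipeline(Model('ridge')), dataset=data)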
dataset : Dataset
The Dataset that this operation should be evaluated in the context of; in other words, the dataset is used as the data source for this operation.
Arguments within problem_spec can be used to select just subsets of data. For example, the parameter scope can be used to select only some columns, or the parameter subjects to select a subset of subjects.
problem_spec : ProblemSpec or 'default', optional

This parameter accepts an instance of the params class ProblemSpec. The ProblemSpec is essentially a wrapper around commonly used parameters needed to define the context the model pipeline should be evaluated in. It includes parameters like problem_type, scorer, n_jobs, random_state, etc.

See ProblemSpec for more information and for how to create an instance of this object.

If left as 'default', then a ProblemSpec will be initialized with default params.

default = "default"
cv : CV or sklearn CV, optional

This parameter controls what type of cross-validation splitting strategy is used. You may pass a number of options here, as sketched below.

  • An instance of CV representing a custom strategy as defined by the BPt style CV.

  • The custom str ‘test’, which specifies that the whole train set should be used to train the pipeline and the full test set used to validate it (assuming that a train test split has been defined in the underlying dataset)

  • Any valid scikit-learn style option: an int to specify the number of folds in a (Stratified) KFold, a sklearn CV splitter, or an iterable yielding (train, test) splits as arrays of indices.

default = 5
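As a brief sketch of two of these options (data is assumed to be a loaded Dataset; for the 'test' option, a train/test split is assumed to already be defined on it):

from sklearn.model_selection import KFold

# Any scikit-learn style splitter may be passed directly
results = evaluate(pipeline='ridge', dataset=data,
                   cv=KFold(n_splits=5, shuffle=True, random_state=42))

# Or train on the full train set and validate on the full test set
results = evaluate(pipeline='ridge', dataset=data, cv='test')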
progress_bar : bool, optional

If True, then a progress bar will be displayed showing fit progress across evaluation splits.

default = True
store_preds : bool, optional

If set to True, the returned EvalResults will store the predictions made during evaluation under BPt.EvalResults.preds. This includes a saved copy of the true target values as well.

If False, the preds attribute will be empty, and it will not be possible to use some related functions.

default = True
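A minimal sketch of accessing stored predictions (again with data a hypothetical loaded Dataset):

results = evaluate(pipeline='ridge', dataset=data, store_preds=True)

# Predictions (and a copy of the true target values) are stored here
print(results.preds)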
store_estimators : bool, optional

If True, then the returned EvalResults will store the fitted estimators from evaluation under BPt.EvalResults.estimators.

If False, the estimators attribute will be empty, and it will not be possible to access measures of feature importance.

default = True
store_timing : bool, optional

If True, then the returned EvalResults will store the time it took to fit and score the pipeline under BPt.EvalResults.timing.

default = True
store_cv : bool, optional

If True, then the returned EvalResults will store a copy of the exact CV splitter object used during evaluation.

default = True
store_data_ref : bool, optional

If True, then a shallow copy of the dataset used in evaluate will be stored, which can be re-used to calculate different post-evaluation attributes.

default = True
decode_feat_names : bool, optional

If True, then the BPt.EvalResults.feat_names as computed during evaluation will try to use the original values as first loaded to inform their naming. Note that this is only relevant if the Dataset was used to encode one or more columns in the first place.

default = True
eval_verbose : int, optional

The requested verbosity of the evaluator. A value of 0 or greater prints just warnings, 1 or greater adds some info, and 2 or greater adds even more.

Set to negative values to mute warnings. Specifically, setting to -1 will mute all warnings generated by the call to evaluate, and setting to -2 or lower will mute all warnings regardless of where they are generated. Note: you can also set the binary flag mute_warnings to accomplish the same thing.

Note: This parameter is called eval_verbose, as the pipeline has an argument called verbose, which can be used to set verbosity for the pipeline.

Changed in version 2.0.3: default changed from 0 to 1.

default = 1
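For instance, a small sketch of muting warnings through this parameter (data assumed to be a loaded Dataset):

# Mute warnings generated by the call to evaluate itself
results = evaluate(pipeline='ridge', dataset=data, eval_verbose=-1)

# Mute all warnings, regardless of where they are generated
results = evaluate(pipeline='ridge', dataset=data, eval_verbose=-2)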
progress_loc : str or None, optional

This parameter is not currently implemented.

default = None
mute_warnings : bool, optional

Mute any warnings regardless of where they are generated. This can also be done by setting eval_verbose to -2 or lower.

default = False
extra_params : problem_spec or pipeline params, optional

You may pass any pipeline or problem_spec argument to this function as extra python kwargs style key-value pairs.

For example:

target=1

Would override the value of the target parameter in the passed problem_spec. Or, for example:

model=Model('ridge')

Would override the model parameter of the passed pipeline (assuming the pipeline defines one, e.g., a ModelPipeline).
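A compact sketch of this kwargs style (with data a hypothetical loaded Dataset):

# target=1 overrides the target parameter of the problem_spec in use
results = evaluate(pipeline='ridge', dataset=data, target=1)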
Returns
evaluator : EvalResults

Returns an instance of the EvalResults class. This object stores a wealth of information, including the scores from this evaluation, as well as utilities such as functions for calculating feature importances from trained models.

See also

cross_val_score

Similar sklearn style function.

cross_validate

Similar sklearn style function.

Compare

Input class for specifying comparisons between parameter options.

Notes

This function can accept, within the pipeline and problem_spec parameters, the special input Compare class. This option is designed for explicitly running evaluate multiple times under different configurations, as sketched below.
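As a hedged sketch of using Compare (the 'elastic' model indicator str here is an illustrative assumption; data is again a hypothetical loaded Dataset), evaluate is effectively run once per option:

from BPt import evaluate, Compare, Pipeline, Model

# Each option within Compare is evaluated as its own configuration
results = evaluate(pipeline=Compare([Pipeline(Model('ridge')),
                                     Pipeline(Model('elastic'))]),
                   dataset=data)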
This function supports predictions on an underlying target with missing values. It does this by automatically moving any data points with a NaN in the target to the validation set (keeping in mind this is done after the folds are computed by the CV, so final fold sizes may vary). While subjects with missing target values will obviously not contribute to the validation score, as long as store_preds is set to True, predictions will still be made for these subjects.