BPt.Dataset.k_bin#

Dataset.k_bin(scope, n_bins=5, strategy='uniform', inplace=False)[source]#

This method is used to apply k binning to a column, or columns. On the backend this function used the scikit-learn KBinsDiscretizer.

Parameters

scopeScope

A BPt style Scope used to select a subset of column(s) in which to apply the current function to. See Scope for more information on how this can be applied.

n_binsint, optional

The number of bins to discretize the passed columns to. This same value is applied for all columns within scope.

default = 5

strategy‘uniform’, ‘quantile’ or ‘kmeans’, optional

The strategy in which the binning should be adhere to. Options are:

‘uniform’
All bins in each feature have identical widths.
‘quantile’
All bins in each feature have the same number of points.
‘kmeans’
Values in each bin have the same nearest center of a 1D k-means cluster.

default = 'uniform'

inplacebool, optional

If True, perform the current function inplace and return None.

default = False

Examples

import BPt as bp
data = bp.Dataset([.1, .2, .3, .4, .5, .6, .7, .8, .9],
                  columns=['feat'])

# Apply k_bin, not in place, then plot
data.k_bin('feat', n_bins=3, strategy='uniform').plot('feat')

# Apply with dif params
data.k_bin('feat', n_bins=5, strategy='quantile').plot('feat')