BPt.Dataset.k_bin#

Dataset.k_bin(scope, n_bins=5, strategy='uniform', inplace=False)[source]#

This method is used to apply k binning to a column, or columns. On the backend this function used the scikit-learn KBinsDiscretizer.

Parameters
scopeScope

A BPt style Scope used to select a subset of column(s) in which to apply the current function to. See Scope for more information on how this can be applied.

n_binsint, optional

The number of bins to discretize the passed columns to. This same value is applied for all columns within scope.

default = 5
strategy‘uniform’, ‘quantile’ or ‘kmeans’, optional

The strategy in which the binning should be adhere to. Options are:

  • ‘uniform’

    All bins in each feature have identical widths.

  • ‘quantile’

    All bins in each feature have the same number of points.

  • ‘kmeans’

    Values in each bin have the same nearest center of a 1D k-means cluster.

default = 'uniform'
inplacebool, optional

If True, perform the current function inplace and return None.

default = False

Examples

import BPt as bp
data = bp.Dataset([.1, .2, .3, .4, .5, .6, .7, .8, .9],
                  columns=['feat'])

# Apply k_bin, not in place, then plot
data.k_bin('feat', n_bins=3, strategy='uniform').plot('feat')

# Apply with dif params
data.k_bin('feat', n_bins=5, strategy='quantile').plot('feat')
../../_images/BPt-Dataset-k_bin-1.png