BPt.Dataset.k_bin
- Dataset.k_bin(scope, n_bins=5, strategy='uniform', inplace=False)
This method applies k-binning to one or more columns. Internally, it uses the scikit-learn KBinsDiscretizer.
- Parameters
- scope : Scope
A BPt style Scope used to select the subset of column(s) to which the current function is applied. See Scope for more information on how this can be applied.
- n_bins : int, optional
The number of bins into which the passed columns are discretized. The same value is applied to all columns within scope.
default = 5
- strategy : ‘uniform’, ‘quantile’ or ‘kmeans’, optional
The strategy the binning should adhere to. Options are:
- ‘uniform’
All bins in each feature have identical widths.
- ‘quantile’
All bins in each feature have the same number of points.
- ‘kmeans’
Values in each bin have the same nearest center of a 1D k-means cluster.
default = 'uniform'
- inplace : bool, optional
If True, perform the current function in place and return None.
default = False
Examples
import BPt as bp

data = bp.Dataset([.1, .2, .3, .4, .5, .6, .7, .8, .9], columns=['feat'])

# Apply k_bin, not in place, then plot
data.k_bin('feat', n_bins=3, strategy='uniform').plot('feat')

# Apply with different parameters
data.k_bin('feat', n_bins=5, strategy='quantile').plot('feat')
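
The difference between the 'uniform' and 'quantile' strategies described above is easiest to see on skewed data. The following is a minimal pure-Python sketch of the two ideas, not BPt or scikit-learn code: the helper names (uniform_edges, quantile_edges, assign_bins) are hypothetical, and the quantile edges assume simple linear interpolation between sorted values.

```python
from bisect import bisect_right


def uniform_edges(values, n_bins):
    """Edges of n_bins equal-width bins spanning min(values)..max(values)."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins
    return [lo + i * width for i in range(n_bins + 1)]


def quantile_edges(values, n_bins):
    """Edges at evenly spaced quantiles (assumes linear interpolation)."""
    s = sorted(values)
    edges = []
    for i in range(n_bins + 1):
        pos = i * (len(s) - 1) / n_bins  # fractional index into sorted values
        k = int(pos)
        frac = pos - k
        nxt = s[min(k + 1, len(s) - 1)]
        edges.append(s[k] + frac * (nxt - s[k]))
    return edges


def assign_bins(values, edges):
    """Map each value to a bin index in 0..n_bins-1 using the interior edges."""
    interior = edges[1:-1]
    return [bisect_right(interior, v) for v in values]


# Skewed illustrative data: one large outlier.
values = [1, 2, 3, 4, 5, 6, 7, 8, 100]

uniform = assign_bins(values, uniform_edges(values, 3))
quantile = assign_bins(values, quantile_edges(values, 3))

print("uniform: ", uniform)
print("quantile:", quantile)
```

With the outlier present, the equal-width bins place eight of the nine values in the first bin, while the quantile bins hold three values each. This is the practical reason to prefer strategy='quantile' for heavily skewed columns.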