BPt.Dataset.to_binary
- Dataset.to_binary(scope, drop=True, inplace=False)
This method converts every column within scope to exactly two binary categories. The two most frequent values in each column (the top two categories) are kept; subjects with any other value are either dropped or have that value replaced with NaN, depending on drop.
This method is designed for converting data that is already categorical into an explicitly binary type.
- Parameters
- scope : Scope
A BPt style Scope used to select the subset of column(s) to which the current function is applied. See Scope for more information on how this can be applied.
- drop : bool, optional
If True (the default), then whenever more than two categories are found when converting a column to binary, the subjects / rows with these extra values are dropped from the Dataset. If False, these values are set to NaN and no rows are dropped.
default = True
- inplace : bool, optional
If True, perform the current function in place and return None.
default = False
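The effect of drop on a single column can be illustrated with a minimal pure-Python sketch. This is not BPt's implementation: to_binary_sketch is a hypothetical helper, and it assumes that the "top two" categories are the two most frequent values and that the codes 0 and 1 are assigned in sorted order.

```python
from collections import Counter


def to_binary_sketch(values, drop=True):
    """Hypothetical helper, not part of BPt: encode one column as 0 / 1.

    Returns a dict mapping the original row index to its binary code.
    """
    counts = Counter(values)
    # Assumption: the "top two" categories are the two most frequent values.
    top_two = sorted(value for value, _ in counts.most_common(2))
    # Assumption: codes are assigned in sorted order of the kept values.
    codes = {value: code for code, value in enumerate(top_two)}

    encoded = {}
    for index, value in enumerate(values):
        if value in codes:
            encoded[index] = codes[value]
        elif not drop:
            # drop=False: the extra category becomes NaN and the row is kept.
            encoded[index] = float("nan")
        # drop=True: rows holding an extra category are left out entirely.
    return encoded


animals = ["cat", "cat", "dog", "dog", "elk"]
dropped = to_binary_sketch(animals, drop=True)
kept = to_binary_sketch(animals, drop=False)
print(dropped)
print(kept)
```

With drop=True the 'elk' row disappears from the result, while with drop=False it stays and holds NaN.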
See also
binarize
For converting float data to binary.
Notes
This function will not work on columns of type Data Files.
Examples
A simple example with drop=True and drop=False:
In [1]: data = bp.read_csv('data/example1.csv')

In [2]: data
Out[2]:
  animals  numbers
0   'cat'      1.0
1   'cat'      2.0
2   'dog'      1.0
3   'dog'      2.0
4   'elk'      NaN

In [3]: data.to_binary('all', drop=True)
Out[3]:
  animals  numbers
0       0        0
1       0        1
2       1        0
3       1        1

In [4]: data.to_binary('all', drop=False)
Out[4]:
  animals  numbers
0       0        0
1       0        1
2       1        0
3       1        1
4     NaN      NaN