BPt.Dataset.to_binary

Dataset.to_binary(scope, drop=True, inplace=False)

This method converts every column within scope to exactly two binary categories. In each column, the top two categories are kept as the two binary values; subjects with any other value are either dropped or have that value replaced with NaN.

This method is designed for converting data that is already categorical to an explicitly binary type.

Parameters
scope : Scope

A BPt style Scope used to select the subset of column(s) to which the current function is applied. See Scope for more information on how this can be applied.

drop : bool, optional

If True (the default), and more than two categories are found when converting a column to binary, the subjects / rows holding the extra values are dropped from the Dataset. If False, those values are set to NaN instead and no rows are dropped.

default = True

inplace : bool, optional

If True, perform the current function inplace and return None.

default = False

See also

binarize

For converting float data to binary.

Notes

This function will not work on columns of type Data Files.

Examples

A simple example, first with drop=True and then with drop=False:

In [1]: data = bp.read_csv('data/example1.csv')

In [2]: data
Out[2]: 
  animals  numbers
0   'cat'      1.0
1   'cat'      2.0
2   'dog'      1.0
3   'dog'      2.0
4   'elk'      NaN

In [3]: data.to_binary('all', drop=True)
Out[3]: 
  animals numbers
0       0       0
1       0       1
2       1       0
3       1       1

In [4]: data.to_binary('all', drop=False)
Out[4]: 
  animals numbers
0       0       0
1       0       1
2       1       0
3       1       1
4     NaN     NaN
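
The behaviour shown above can be approximated in plain Python to make the drop logic concrete. This is only a rough sketch and not BPt's implementation: the helper names `is_missing` and `to_binary_sketch` are hypothetical, and it assumes that "top two" means the two most frequent non-missing values, that the kept values are encoded as 0 and 1 in sorted order, and that values which are already missing stay NaN without causing a row to be dropped.

```python
from collections import Counter
import math


def is_missing(value):
    """Return True for None or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def to_binary_sketch(columns, drop=True):
    """Illustrative only, not BPt's actual code.

    `columns` maps a column name to a list of values, one per row.
    Assumption: the two most frequent non-missing values are kept and
    encoded as 0 and 1 in sorted order.
    """
    n_rows = len(next(iter(columns.values())))
    encoded = {}
    extra_rows = set()  # rows holding a value outside the kept two

    for name, values in columns.items():
        counts = Counter(v for v in values if not is_missing(v))
        top_two = sorted(v for v, _ in counts.most_common(2))
        codes = {v: i for i, v in enumerate(top_two)}

        new_values = []
        for row, v in enumerate(values):
            if is_missing(v):
                # Assumption: existing missing values simply stay NaN.
                new_values.append(math.nan)
            elif v in codes:
                new_values.append(codes[v])
            else:
                # Extra category: mark as NaN and remember the row.
                new_values.append(math.nan)
                extra_rows.add(row)
        encoded[name] = new_values

    if drop:
        # Remove every row that held an extra category in any column.
        keep = [r for r in range(n_rows) if r not in extra_rows]
        encoded = {
            name: [vals[r] for r in keep] for name, vals in encoded.items()
        }
    return encoded


example = {
    "animals": ["cat", "cat", "dog", "dog", "elk"],
    "numbers": [1.0, 2.0, 1.0, 2.0, math.nan],
}
print(to_binary_sketch(example, drop=True))
print(to_binary_sketch(example, drop=False))
```

Under these assumptions, the first call removes the 'elk' row from both columns, while the second keeps all five rows and leaves NaN in the last one, matching the two outputs above.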