BPt.Dataset.set_role

Dataset.set_role(scope, role, inplace=False)

Set the role of one or more columns, as selected by the scope parameter. See Role for more information about how roles are used within BPt.

Parameters
scope : Scope

A BPt style Scope used to select the subset of column(s) to which the current function is applied. See Scope for more information on how this can be applied.

role : Role

The role to assign to all columns in the passed scope. Must be one of ‘input data’ (or its alias ‘data’), ‘target’ or ‘non input’. Note: every column starts with the role ‘input data’, which can be referenced by either of the reserved keys ‘data’ or ‘input data’.

See Role for more information on how each role differs.
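The accepted role strings described above can be pictured as a small alias table. The sketch below is a toy illustration only, not BPt's validation code; the names `ROLE_ALIASES` and `normalize_role` are hypothetical and exist just for this example.

```python
# Toy sketch only: this is NOT BPt's actual validation code.
# 'data' is treated as an alias for 'input data', as described above.
ROLE_ALIASES = {
    "data": "input data",
    "input data": "input data",
    "target": "target",
    "non input": "non input",
}


def normalize_role(role):
    """Return the canonical role name, or raise ValueError if invalid."""
    try:
        return ROLE_ALIASES[role]
    except (KeyError, TypeError):
        raise ValueError(f"Invalid role: {role!r}") from None


print(normalize_role("data"))       # input data
print(normalize_role("non input"))  # non input
```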

inplace : bool, optional

If True, perform the current function in place and return None.

default = False
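The inplace convention can be illustrated with a small stand-in class. This is a minimal sketch under stated assumptions: `ToyDataset` is hypothetical, tracks only a role dictionary, and takes a single column name rather than a full Scope. It is not how BPt implements the method.

```python
import copy


class ToyDataset:
    """Hypothetical stand-in used only to show the inplace convention."""

    def __init__(self, columns):
        # Every column starts with the default role 'input data'.
        self.roles = {col: "input data" for col in columns}

    def set_role(self, column, role, inplace=False):
        if not inplace:
            # Work on a copy and return it; the original is untouched.
            new = copy.deepcopy(self)
            new.set_role(column, role, inplace=True)
            return new
        self.roles[column] = role
        return None


data = ToyDataset(["animals", "numbers"])

# Default (inplace=False): a new object is returned, so reassign it.
updated = data.set_role("animals", "target")
print(data.roles["animals"])     # input data
print(updated.roles["animals"])  # target

# inplace=True: the object is modified and None is returned.
result = data.set_role("numbers", "non input", inplace=True)
print(result)                    # None
print(data.roles["numbers"])     # non input
```

One consequence of this convention: writing `data = data.set_role(..., inplace=True)` rebinds `data` to None, so reassignment should only be used with the default `inplace=False`.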

See also

set_target

Specifically for setting the target role.

set_non_input

Specifically for setting the non input role.

get_roles

Returns a dictionary with saved roles.

Examples

Setting column roles within the Dataset is an essential part of using the object.

In [1]: data = bp.read_csv('data/example1.csv')

In [2]: data = data.set_role('animals', 'target')

In [3]: data
Out[3]: 
  animals  numbers
0   'cat'      1.0
1   'cat'      2.0
2   'dog'      1.0
3   'dog'      2.0
4   'elk'      NaN

In [4]: data.get_roles()
Out[4]: {'animals': 'target', 'numbers': 'input data'}

We can also use the method to set columns to the role ‘non input’, which has the additional constraint that no NaN values can be present in that column. As shown below, the one row containing a NaN value is dropped.

In [5]: data = data.set_role('numbers', 'non input')
Dropped 1 Rows

In [6]: data
Out[6]: 
  animals  numbers
0   'cat'      1.0
1   'cat'      2.0
2   'dog'      1.0
3   'dog'      2.0

In [7]: data.get_roles()
Out[7]: {'animals': 'target', 'numbers': 'non input'}
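The row-dropping behaviour seen in the last example can be sketched in plain Python. This is a toy illustration, not BPt's implementation: the helpers `is_missing` and `drop_missing` are hypothetical, and the rows simply mirror the example table above.

```python
import math

# Values mirroring the example table above.
rows = [
    {"animals": "cat", "numbers": 1.0},
    {"animals": "cat", "numbers": 2.0},
    {"animals": "dog", "numbers": 1.0},
    {"animals": "dog", "numbers": 2.0},
    {"animals": "elk", "numbers": float("nan")},
]


def is_missing(value):
    """Treat None and float NaN as missing (an assumption of this sketch)."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def drop_missing(rows, column):
    """Keep only the rows where `column` has a non-missing value."""
    return [row for row in rows if not is_missing(row[column])]


kept = drop_missing(rows, "numbers")
print(f"Dropped {len(rows) - len(kept)} Rows")  # Dropped 1 Rows
print([row["animals"] for row in kept])         # ['cat', 'cat', 'dog', 'dog']
```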