BPt.Dataset.split_by

Dataset.split_by(scope, decode_values=True)

Splits the dataset into sub-datasets, one for each unique value of the passed scope, and returns them as a dictionary. This method is similar to the native pandas.DataFrame.groupby(), but differs slightly in practice.
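For comparison, here is a minimal sketch of the closest plain-pandas idiom, which builds a dictionary of sub-frames from a groupby. It uses only pandas, not BPt; the column names `site` and `score` are invented for illustration, and it is not meant to represent how split_by is implemented.

```python
import pandas as pd

# Hypothetical example data: one categorical-like column and one value column.
df = pd.DataFrame({
    "site": ["a", "b", "a", "c"],
    "score": [1.0, 2.0, 3.0, 4.0],
})

# groupby returns a lazy GroupBy object; iterating over it yields
# (unique value, sub-frame) pairs, which can be collected into a dict.
splits = {value: sub for value, sub in df.groupby("site")}

print(sorted(splits))
print(splits["a"]["score"].tolist())
```

The practical difference this highlights is that a groupby object is meant for aggregation, while a dictionary of splits gives direct access to each subset by its value.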

Parameters
scope : Scope
Any valid BPt-style Scope selecting the single column, or combination of columns, to split the dataset by. If multiple columns are selected, the dataset is split by the unique combinations of their values.
Note that any column(s) selected should be categorical.
decode_values : bool, optional

If True, try to use the original value names, from before any encoding, as the keys of the returned dictionary.

default = True
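The idea behind decoding can be sketched in plain pandas: a column stored as integer codes is split by code, and the keys are then mapped back to the original names. This only illustrates the concept, using `pandas.factorize` as a stand-in encoder; how BPt stores and restores its encodings is not shown here, and the names `group` and `score` are hypothetical.

```python
import pandas as pd

# Hypothetical original labels, encoded to integer codes in order of appearance.
original = pd.Series(["control", "patient", "control"])
codes, names = pd.factorize(original)

df = pd.DataFrame({"group": codes, "score": [1.0, 2.0, 3.0]})

# Splitting on the encoded column gives integer keys ...
encoded_splits = {int(code): sub for code, sub in df.groupby("group")}

# ... which can be mapped back to the original value names.
decoded_splits = {names[code]: sub for code, sub in encoded_splits.items()}

print(sorted(encoded_splits))
print(sorted(decoded_splits))
```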
Returns
splits : dict of Dataset

A dictionary of splits from the original Dataset, where each entry is keyed by the unique value it corresponds to.
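When more than one column is selected, each split corresponds to one combination of values that actually occurs in the data. The sketch below shows that behavior with a plain-pandas groupby over two categorical columns. It is an analogy under assumed column names (`sex`, `site`, `score`); the exact form of the keys that split_by returns for multiple columns is not specified above and may differ from the tuples pandas produces.

```python
import pandas as pd

# Hypothetical data with two categorical columns.
df = pd.DataFrame({
    "sex": pd.Categorical(["f", "m", "f", "m"]),
    "site": pd.Categorical(["a", "a", "b", "a"]),
    "score": [1.0, 2.0, 3.0, 4.0],
})

# observed=True keeps only the combinations present in the data,
# so ("m", "b") does not appear as an empty split.
combo_splits = {
    combo: sub
    for combo, sub in df.groupby(["sex", "site"], observed=True)
}

for combo, sub in combo_splits.items():
    print(combo, len(sub))
```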