BPt.Dataset.to_data_file

Dataset.to_data_file(scope, load_func=<function load>, inplace=False)

This method can be used to cast any existing columns whose values are file paths to data files.

Parameters
scope : Scope

A BPt-style Scope used to select the subset of column(s) to which this method is applied. See Scope for more information on how this can be applied.

load_func : python function, optional
Fundamentally, columns of type ‘data file’ represent a path to a saved file, which means you must also provide some information on how to load that file. This parameter is where the loading function should be passed. The passed load_func will be called on each file individually, and whatever it returns will be passed on to the different loading functions.
You might need to pass a user-defined custom function in some cases, e.g., if you want to use numpy.load() followed by numpy.stack(). Just wrap those two functions in one and pass the new function.
import numpy as np

def my_wrapper(x):
    return np.stack(np.load(x))
Note that when a custom function is defined, it is recommended that you define it in a separate file from the one where the main script will be run, and then import the function.
By default, this method assumes data files are saved as numpy arrays and uses numpy.load() when nothing else is specified.
default = np.load
inplace : bool, optional

If True, perform the operation in place and return None.

default = False
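The load_func wrapper described above can be checked on its own before it is handed to to_data_file. The following is a minimal sketch, assuming the files are saved with numpy.save(); the name load_and_stack and the temporary file are hypothetical and used only for illustration.

```python
import os
import tempfile

import numpy as np


def load_and_stack(path):
    """Load a saved array, then stack its entries along the first axis."""
    return np.stack(np.load(path))


# Try the wrapper by itself on a temporary file (illustrative data only).
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "example.npy")
    np.save(path, np.arange(6).reshape(2, 3))
    loaded = load_and_stack(path)

# The wrapper would then be passed through the load_func parameter, e.g.:
# data = data.to_data_file('files', load_func=load_and_stack)
```

Testing the function in isolation like this makes it easier to confirm that it returns the expected array shape before it is called on every file in a column.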

See also

add_data_files

Method for adding new data files

consolidate_data_files

Merge existing data files into one column.

Examples

This method can be used as the primary way to prepare data files. We will walk through a simple example here.

In [1]: import BPt as bp

In [2]: data = bp.Dataset()

In [3]: data['files'] = ['data/loc1.npy', 'data/loc2.npy']

In [4]: data
Out[4]: 
           files
0  data/loc1.npy
1  data/loc2.npy

We now have a Dataset, but our column ‘files’ is not quite ready, as by default it won’t know what to do with str values. To get it to treat the column as a data file, we will cast it.

In [5]: data = data.to_data_file('files')

In [6]: data
Out[6]: 
  files
0     0
1     1

What’s happened here? The column no longer shows paths, but instead shows integers. That’s actually the desired behavior: each integer is a key into file_mapping, which we can inspect.

In [7]: data.file_mapping
Out[7]: 
{0: DataFile(loc='/home/runner/work/BPt/BPt/doc/data/loc1.npy'),
 1: DataFile(loc='/home/runner/work/BPt/BPt/doc/data/loc2.npy')}

The file_mapping is then used internally with Loader to load objects on the fly.
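The idea of storing integer keys in the column and resolving them to files only when needed can be illustrated with plain Python. This is a conceptual sketch only, not BPt's implementation: toy_mapping and column are hypothetical names, and a plain dict of paths stands in for the DataFile objects shown above.

```python
import os
import tempfile

import numpy as np

with tempfile.TemporaryDirectory() as tmp_dir:
    # Save two small arrays to stand in for the data files.
    paths = [os.path.join(tmp_dir, f"loc{i}.npy") for i in (1, 2)]
    for i, path in enumerate(paths):
        np.save(path, np.full(3, i))

    # Hypothetical stand-in for file_mapping: integer key -> file path.
    toy_mapping = dict(enumerate(paths))

    # The column itself only needs to hold the integer keys.
    column = list(toy_mapping)

    # Resolve each key to its path and load the file on demand.
    load_func = np.load
    loaded = [load_func(toy_mapping[key]) for key in column]
```

Keeping only small integer keys in the column means the file contents are read only at the point where they are actually required.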