BPt.Dataset

class BPt.Dataset(data=None, index=None, columns=None, dtype=None, copy=None, roles=None, scopes=None, targets=None, non_inputs=None, verbose=1)
The BPt Dataset class is the main class used for preparing data into a format compatible with machine learning. This class is new as of BPt version 2 (replacing the built-in loading functions of the old BPt_ML).
See Loading Data for a more comprehensive guide on this object.
This class can be initialized like a pandas.DataFrame or, more typically, from an existing pandas.DataFrame. It has some constraints relative to plain DataFrames. In particular, column names must be strings (int-like names are cast to strings), and duplicate column names are not allowed.
This class can be initialized in most of the same ways that a pandas DataFrame can be initialized, for example:
In [1]: data = bp.Dataset(np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]]),
   ...:                   columns=['a', 'b', 'c'])
   ...: 

In [2]: data
Out[2]: 
   a  b  c
0  1  2  3
1  4  5  6
2  7  8  9

Or from a pandas DataFrame.

In [3]: import pandas as pd

In [4]: df = pd.DataFrame([1, 2, 3], columns=['a'])

In [5]: df
Out[5]: 
   a
0  1
1  2
2  3

In [6]: data = bp.Dataset(df)

In [7]: data
Out[7]: 
   a
0  1
1  2
2  3
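
The column constraints mentioned earlier (string column names, no duplicates) can be illustrated with a small pure-Python sketch. This is not BPt's implementation: the helper name `normalize_columns` is hypothetical, and raising `ValueError` on duplicates is an assumption made only for illustration.

```python
def normalize_columns(columns):
    """Hypothetical helper mirroring the documented column rules.

    Every label is cast to str, and an error is raised if any
    name appears more than once after casting.
    """
    names = [str(col) for col in columns]

    seen = set()
    for name in names:
        if name in seen:
            # Assumption: the error type is illustrative only.
            raise ValueError(f"Duplicate column name: {name!r}")
        seen.add(name)

    return names


print(normalize_columns([0, 1, 'a']))  # ['0', '1', 'a']
```

Note that under these rules the labels `1` and `'1'` would collide once cast to strings, which is one reason to pass string names from the start.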

The Dataset also has some extra optional constructor parameters: roles, scopes, targets and non_inputs. These are helpers for setting the corresponding roles and scopes at the time of construction. For example:

In [8]: data = bp.Dataset([1, 2, 3], columns=['1'], targets=['1'])

In [9]: data.get_roles()
Out[9]: {'1': 'target'}
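
As a rough mental model of what the `targets` and `non_inputs` helpers do, the sketch below builds a role dictionary shaped like the one returned by `get_roles()`. It is plain Python rather than BPt code: `resolve_roles` is a hypothetical name, the arguments are simple lists of column names, and the sketch assumes that any column not named in either argument keeps the role `'data'`.

```python
def resolve_roles(columns, targets=(), non_inputs=()):
    """Hypothetical sketch: map each column name to a role string."""
    # Assumption: every column starts out with the role 'data'.
    roles = {str(col): 'data' for col in columns}

    def assign(names, role):
        for col in names:
            key = str(col)
            if key not in roles:
                raise KeyError(f"Unknown column: {key!r}")
            roles[key] = role

    assign(targets, 'target')
    assign(non_inputs, 'non input')
    return roles


print(resolve_roles(['1'], targets=['1']))  # {'1': 'target'}
```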

New in version 2.0.0.

Attributes

at

Access a single value for a row/column label pair.

attrs

Dictionary of global attributes of this dataset.

axes

Return a list representing the axes of the DataFrame.

columns

The column labels of the DataFrame.

dtypes

Return the dtypes in the DataFrame.

empty

Indicator whether Series/DataFrame is empty.

flags

Get the properties associated with this pandas object.

iat

Access a single value for a row/column pair by integer position.

iloc

Purely integer-location based indexing for selection by position.

index

The index (row labels) of the DataFrame.

loc

Access a group of rows and columns by label(s) or a boolean array.

ndim

Return an int representing the number of axes / array dimensions.

reserved_roles

The Dataset class has three fixed, and therefore reserved (i.e., cannot be set as a scope), roles.

reservered_scopes

There are a number of reserved, fixed scopes in the Dataset class.

shape

Return a tuple representing the dimensionality of the DataFrame.

size

Return an int representing the number of elements in this object.

style

Returns a Styler object.

values

Return a Numpy representation of the DataFrame.

verbose

This parameter takes a verbosity level as an integer. Level 0 shows warnings only, a value lower than 0 mutes warnings, and higher values give more verbose output.

T

Methods

abs()

Return a Series/DataFrame with absolute numeric value of each element.

add(other[, axis, level, fill_value])

Get Addition of dataframe and other, element-wise (binary operator add).

add_data_files(files[, file_to_subject, ...])

This method allows adding columns of type 'data file' to the Dataset class.

add_prefix(prefix)

Prefix labels with string prefix.

add_scope(scope, scope_val[, inplace])

This method is designed as a helper for adding a new scope value to a number of columns at once, using the existing scope system.

add_suffix(suffix)

Suffix labels with string suffix.

add_unique_overlap(cols, new_col[, ...])

This function is designed to add a new column

agg([func, axis])

Aggregate using one or more operations over the specified axis.

aggregate([func, axis])

Aggregate using one or more operations over the specified axis.

align(other[, join, axis, level, copy, ...])

Align two objects on their axes with the specified join method.

all([axis, bool_only, skipna, level])

Return whether all elements are True, potentially over an axis.

any(*[, axis, bool_only, skipna, level])

Return whether any element is True, potentially over an axis.

append(other[, ignore_index, ...])

(DEPRECATED) Append rows of other to the end of caller, returning a new object.

apply(func[, axis, raw, result_type, args])

Apply a function along an axis of the DataFrame.

apply_exclusions(subjects[, inplace])

This method will drop all subjects that overlap with the passed subjects to this function.

apply_inclusions(subjects[, inplace])

This method will drop all subjects that do not overlap with the passed subjects to this function.

applymap(func[, na_action])

Apply a function to a Dataframe elementwise.

asfreq(freq[, method, how, normalize, ...])

Convert time series to specified frequency.

asof(where[, subset])

Return the last row(s) without any NaNs before where.

assign(**kwargs)

Assign new columns to a DataFrame.

astype(dtype[, copy, errors])

Cast a pandas object to a specified dtype dtype.

at_time(time[, asof, axis])

Select values at particular time of day (e.g., 9:30AM).

auto_detect_categorical([scope, obj_thresh, ...])

This function will attempt to automatically add scope "category" to any loaded categorical variables.

backfill(*[, axis, inplace, limit, downcast])

Synonym for DataFrame.fillna() with method='bfill'.

between_time(start_time, end_time[, ...])

Select values between particular times of the day (e.g., 9:00-9:30 AM).

bfill(*[, axis, inplace, limit, downcast])

Synonym for DataFrame.fillna() with method='bfill'.

binarize(scope, threshold[, replace, drop, ...])

This method contains utilities for binarizing a variable.

bool()

Return the bool of a single element Series or DataFrame.

boxplot([column, by, ax, fontsize, rot, ...])

Make a box plot from DataFrame columns.

clip([lower, upper, axis, inplace])

Trim values at input threshold(s).

combine(other, func[, fill_value, overwrite])

Perform column-wise combine with another DataFrame.

combine_first(other)

Update null elements with value in the same location in other.

compare(other[, align_axis, keep_shape, ...])

Compare to another DataFrame and show the differences.

consolidate_data_files(save_dr[, ...])

This function is designed as a helper to consolidate all or a subset of the loaded data files into one column.

convert_dtypes([infer_objects, ...])

Convert columns to best possible dtypes using dtypes supporting pd.NA.

copy([deep])

Creates and returns a copy of this dataset, either deep or shallow.

copy_as_non_input(col, new_col[, ...])

This method is used for making a copy of an existing column, ordinalizing it and then setting it to have role = non input.

corr([method, min_periods, numeric_only])

Compute pairwise correlation of columns, excluding NA/null values.

corrwith(other[, axis, drop, method, ...])

Compute pairwise correlation.

count([axis, level, numeric_only])

Count non-NA cells for each column or row.

cov([min_periods, ddof, numeric_only])

Compute pairwise covariance of columns, excluding NA/null values.

cummax([axis, skipna])

Return cumulative maximum over a DataFrame or Series axis.

cummin([axis, skipna])

Return cumulative minimum over a DataFrame or Series axis.

cumprod([axis, skipna])

Return cumulative product over a DataFrame or Series axis.

cumsum([axis, skipna])

Return cumulative sum over a DataFrame or Series axis.

describe([percentiles, include, exclude, ...])

Generate descriptive statistics.

diff([periods, axis])

First discrete difference of element.

display_scopes()

Display an HTML representation of the Dataset, as split by scope, instead of the default repr html as split by role.

div(other[, axis, level, fill_value])

Get Floating division of dataframe and other, element-wise (binary operator truediv).

divide(other[, axis, level, fill_value])

Get Floating division of dataframe and other, element-wise (binary operator truediv).

dot(other)

Compute the matrix multiplication between the DataFrame and other.

drop([labels, axis, index, columns, level, ...])

Drop specified labels from rows or columns.

drop_cols([scope, exclusions, inclusions, ...])

This method is designed to allow dropping columns based on some flexible arguments.

drop_cols_by_nan([scope, threshold, inplace])

This method is used for dropping columns based on the amount of missing values per column, dropping any which exceed a user defined threshold.

drop_cols_by_unique_val([scope, threshold, ...])

This method will drop any columns whose number of unique values is less than or equal to the passed threshold.

drop_duplicate_cols([scope, inplace])

This method is used for checking to see if there are any columns loaded with duplicate values.

drop_duplicates([subset, keep, inplace, ...])

Return DataFrame with duplicate rows removed.

drop_id_cols([scope, inplace])

This method will drop any str-type / object-type columns where the number of unique values is equal to the length of the dataframe.

drop_nan_subjects(scope[, inplace])

This method is used for dropping all of the subjects which have NaN values for a given scope / column.

drop_subjects_by_nan([scope, threshold, inplace])

This method is used for dropping subjects based on the amount of missing values found across a subset of columns as selected by scope.

droplevel(level[, axis])

Return Series/DataFrame with requested index / column level(s) removed.

dropna(*[, axis, how, thresh, subset, inplace])

Remove missing values.

duplicated([subset, keep])

Return boolean Series denoting duplicate rows.

eq(other[, axis, level])

Get Equal to of dataframe and other, element-wise (binary operator eq).

equals(other)

Test whether two objects contain the same elements.

eval(expr, *[, inplace])

Evaluate a string describing operations on DataFrame columns.

ewm([com, span, halflife, alpha, ...])

Provide exponentially weighted (EW) calculations.

expanding([min_periods, center, axis, method])

Provide expanding window calculations.

explode(column[, ignore_index])

Transform each element of a list-like to a row, replicating index values.

ffill(*[, axis, inplace, limit, downcast])

Synonym for DataFrame.fillna() with method='ffill'.

fillna([value, method, axis, inplace, ...])

Fill NA/NaN values using the specified method.

filter([items, like, regex, axis])

Subset the dataframe rows or columns according to the specified index labels.

filter_categorical_by_percent([scope, ...])

This method is designed to allow performing outlier filtering on categorical type variables.

filter_outliers_by_percent([scope, fop, ...])

This method is designed to allow dropping a fixed percent of outliers from the requested columns.

filter_outliers_by_std([scope, n_std, drop, ...])

This method is designed to allow dropping outliers from the requested columns based on comparisons with each column's standard deviation.

first(offset)

Select initial periods of time series data based on a date offset.

first_valid_index()

Return index for first non-NA value or None, if no non-NA value is found.

floordiv(other[, axis, level, fill_value])

Get Integer division of dataframe and other, element-wise (binary operator floordiv).

from_dict(data[, orient, dtype, columns])

Construct DataFrame from dict of array-like or dicts.

from_records(data[, index, exclude, ...])

Convert structured or record ndarray to DataFrame.

ge(other[, axis, level])

Get Greater than or equal to of dataframe and other, element-wise (binary operator ge).

get(key[, default])

Get item from object for given key (ex: DataFrame column).

get_Xy([problem_spec])

This function is used to get a sklearn-style grouping of input data (X) and target data (y) from the Dataset, according to a passed problem_spec.

get_cols(scope[, limit_to])

This method is the main internal and external facing way of getting the names of columns which match a passed scope from the Dataset.

get_file_mapping([cols])

This function is used to access the up to date file mapping.

get_nan_subjects(scope)

TODO - write docstring.

get_non_nan_subjects(scope)

TODO - write docstring

get_permuted_Xy([problem_spec, ...])

This method is otherwise identical to Dataset.get_Xy(), except that it returns a version of X, y in which the values in y are permuted.

get_roles()

This function can be used to get a dictionary with the currently loaded roles. See Role for more information on how roles are defined and used within BPt.

get_scopes()

This returns the up to date scopes for the Dataset.

get_subjects(subjects[, return_as, only_level])

Method to get a set of subjects, from a set of already loaded ones, or from a saved location.

get_values(col[, dropna, decode_values, ...])

This method is used to obtain either the normally loaded and stored values from a passed column or, in the case of a data file column, the loaded data file proxy values.

groupby([by, axis, level, as_index, sort, ...])

Group DataFrame using a mapper or by a Series of columns.

gt(other[, axis, level])

Get Greater than of dataframe and other, element-wise (binary operator gt).

head([n])

Return the first n rows.

hist([column, by, grid, xlabelsize, xrot, ...])

Make a histogram of the DataFrame's columns.

idxmax([axis, skipna, numeric_only])

Return index of first occurrence of maximum over requested axis.

idxmin([axis, skipna, numeric_only])

Return index of first occurrence of minimum over requested axis.

infer_objects()

Attempt to infer better dtypes for object columns.

info([verbose, buf, max_cols, memory_usage, ...])

Print a concise summary of a DataFrame.

insert(loc, column, value[, allow_duplicates])

Insert column into DataFrame at specified location.

interpolate([method, axis, limit, inplace, ...])

Fill NaN values using an interpolation method.

isetitem(loc, value)

Set the given value in the column with position 'loc'.

isin(values)

Whether each element in the DataFrame is contained in values.

isna()

Detect missing values.

isnull()

DataFrame.isnull is an alias for DataFrame.isna.

items()

Iterate over (column name, Series) pairs.

iteritems()

(DEPRECATED) Iterate over (column name, Series) pairs.

iterrows()

Iterate over DataFrame rows as (index, Series) pairs.

itertuples([index, name])

Iterate over DataFrame rows as namedtuples.

join(other[, on, how, lsuffix, rsuffix, ...])

Join columns of another DataFrame.

k_bin(scope[, n_bins, strategy, inplace])

This method is used to apply k binning to a column, or columns.

keys()

Get the 'info axis' (see Indexing for more).

kurt([axis, skipna, level, numeric_only])

Return unbiased kurtosis over requested axis.

kurtosis([axis, skipna, level, numeric_only])

Return unbiased kurtosis over requested axis.

last(offset)

Select final periods of time series data based on a date offset.

last_valid_index()

Return index for last non-NA value or None, if no non-NA value is found.

le(other[, axis, level])

Get Less than or equal to of dataframe and other, element-wise (binary operator le).

lookup(row_labels, col_labels)

(DEPRECATED) Label-based "fancy indexing" function for DataFrame.

lt(other[, axis, level])

Get Less than of dataframe and other, element-wise (binary operator lt).

mad([axis, skipna, level])

(DEPRECATED) Return the mean absolute deviation of the values over the requested axis.

mask(cond[, other, inplace, axis, level, ...])

Replace values where the condition is True.

max([axis, skipna, level, numeric_only])

Return the maximum of the values over the requested axis.

mean([axis, skipna, level, numeric_only])

Return the mean of the values over the requested axis.

median([axis, skipna, level, numeric_only])

Return the median of the values over the requested axis.

melt([id_vars, value_vars, var_name, ...])

Unpivot a DataFrame from wide to long format, optionally leaving identifiers set.

memory_usage([index, deep])

Return the memory usage of each column in bytes.

merge(right[, how, on, left_on, right_on, ...])

Merge DataFrame or named Series objects with a database-style join.

min([axis, skipna, level, numeric_only])

Return the minimum of the values over the requested axis.

mod(other[, axis, level, fill_value])

Get Modulo of dataframe and other, element-wise (binary operator mod).

mode([axis, numeric_only, dropna])

Get the mode(s) of each element along the selected axis.

mul(other[, axis, level, fill_value])

Get Multiplication of dataframe and other, element-wise (binary operator mul).

multiply(other[, axis, level, fill_value])

Get Multiplication of dataframe and other, element-wise (binary operator mul).

nan_to_class([scope, inplace])

This method will cast any of the passed columns that are not already categorical to categorical.

ne(other[, axis, level])

Get Not equal to of dataframe and other, element-wise (binary operator ne).

nlargest(n, columns[, keep])

Return the first n rows ordered by columns in descending order.

notna()

Detect existing (non-missing) values.

notnull()

DataFrame.notnull is an alias for DataFrame.notna.

nsmallest(n, columns[, keep])

Return the first n rows ordered by columns in ascending order.

nunique([axis, dropna])

Count number of distinct elements in specified axis.

ordinalize(scope[, nan_to_class, inplace])

This method is used to ordinalize a group of columns.

pad(*[, axis, inplace, limit, downcast])

Synonym for DataFrame.fillna() with method='ffill'.

pct_change([periods, fill_method, limit, freq])

Percentage change between the current and a prior element.

pipe(func, *args, **kwargs)

Apply chainable functions that expect Series or DataFrames.

pivot(*[, index, columns, values])

Return reshaped DataFrame organized by given index / column values.

pivot_table([values, index, columns, ...])

Create a spreadsheet-style pivot table as a DataFrame.

plot(scope[, subjects, cut, decode_values, ...])

This function creates plots for each of the passed columns (as specified by scope) separately.

plot_bivar(scope1, scope2[, subjects, ...])

This method can be used to plot the relationship between two variables.

plots(scope[, subjects, ncols, figsize, ...])

This function creates a multi-figure plot containing all of the passed columns (as specified by scope) in their own axes.

pop(item)

Return item and drop from frame.

pow(other[, axis, level, fill_value])

Get Exponential power of dataframe and other, element-wise (binary operator pow).

prod([axis, skipna, level, numeric_only, ...])

Return the product of the values over the requested axis.

product([axis, skipna, level, numeric_only, ...])

Return the product of the values over the requested axis.

quantile([q, axis, numeric_only, ...])

Return values at the given quantile over requested axis.

query(expr, *[, inplace])

Query the columns of a DataFrame with a boolean expression.

radd(other[, axis, level, fill_value])

Get Addition of dataframe and other, element-wise (binary operator radd).

rank([axis, method, numeric_only, ...])

Compute numerical data ranks (1 through n) along axis.

rdiv(other[, axis, level, fill_value])

Get Floating division of dataframe and other, element-wise (binary operator rtruediv).

reindex([labels, index, columns, axis, ...])

Conform Series/DataFrame to new index with optional filling logic.

reindex_like(other[, method, copy, limit, ...])

Return an object with matching indices as other object.

remove_scope(scope, scope_val[, inplace])

This method is used for removing scopes from an existing column or subset of columns, as selected by the scope parameter.

rename([mapper, index, columns, axis, copy, ...])

Calls method according to: https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.rename.html

rename_axis([mapper, inplace])

Set the name of the axis for the index or columns.

reorder_levels(order[, axis])

Rearrange index levels using input order.

replace([to_replace, value, inplace, limit, ...])

Replace values given in to_replace with value.

resample(rule[, axis, closed, label, ...])

Resample time-series data.

reset_index([level, drop, inplace, ...])

Reset the index, or a level of it.

rfloordiv(other[, axis, level, fill_value])

Get Integer division of dataframe and other, element-wise (binary operator rfloordiv).

rmod(other[, axis, level, fill_value])

Get Modulo of dataframe and other, element-wise (binary operator rmod).

rmul(other[, axis, level, fill_value])

Get Multiplication of dataframe and other, element-wise (binary operator rmul).

rolling(window[, min_periods, center, ...])

Provide rolling window calculations.

round([decimals])

Round a DataFrame to a variable number of decimal places.

rpow(other[, axis, level, fill_value])

Get Exponential power of dataframe and other, element-wise (binary operator rpow).

rsub(other[, axis, level, fill_value])

Get Subtraction of dataframe and other, element-wise (binary operator rsub).

rtruediv(other[, axis, level, fill_value])

Get Floating division of dataframe and other, element-wise (binary operator rtruediv).

sample([n, frac, replace, weights, ...])

Return a random sample of items from an axis of object.

save_test_split(loc)

Saves the currently defined test subjects in a text file with one subject / index per line.

save_train_split(loc)

Saves the currently defined train subjects in a text file with one subject / index per line.

select_dtypes([include, exclude])

Return a subset of the DataFrame's columns based on the column dtypes.

sem([axis, skipna, level, ddof, numeric_only])

Return unbiased standard error of the mean over requested axis.

set_axis(labels, *[, axis, inplace, copy])

Assign desired index to given axis.

set_flags(*[, copy, allows_duplicate_labels])

Return a new object with updated flags.

set_index(keys, *[, drop, append, inplace, ...])

Set the DataFrame index using existing columns.

set_non_input(scope[, inplace])

This method is used to set either a single column, or multiple, specifically with role non input.

set_role(scope, role[, inplace])

This method is used to set a role for either a single column or multiple, as set through the scope parameter.

set_roles(scopes_to_roles[, inplace])

This method is used to set multiple roles across multiple scopes as specified by a passed dictionary with keys as scopes and values as the role to set for all columns corresponding to that scope.

set_target(scope[, inplace])

This method is used to set either a single column, or multiple, specifically with role target.

set_test_split([size, subjects, ...])

Defines a set of subjects to be reserved as test subjects.

set_train_split([size, subjects, ...])

Defines a set of subjects to be reserved as train subjects.

shift([periods, freq, axis, fill_value])

Shift index by desired number of periods with an optional time freq.

skew([axis, skipna, level, numeric_only])

Return unbiased skew over requested axis.

slice_shift([periods, axis])

(DEPRECATED) Equivalent to shift without copying data.

sort_index(*[, axis, level, ascending, ...])

Sort object by labels (along an axis).

sort_values(by, *[, axis, ascending, ...])

Sort by the values along either axis.

sparse

alias of pandas.core.arrays.sparse.accessor.SparseFrameAccessor

split_by(scope[, decode_values])

This method allows splitting the dataset into sub datasets by the different unique values of a passed scope.

squeeze([axis])

Squeeze 1 dimensional axis objects into scalars.

stack([level, dropna])

Stack the prescribed level(s) from columns to index.

std([axis, skipna, level, ddof, numeric_only])

Return sample standard deviation over requested axis.

sub(other[, axis, level, fill_value])

Get Subtraction of dataframe and other, element-wise (binary operator sub).

subtract(other[, axis, level, fill_value])

Get Subtraction of dataframe and other, element-wise (binary operator sub).

sum([axis, skipna, level, numeric_only, ...])

Return the sum of the values over the requested axis.

summary(scope[, subjects, measures, ...])

This method is used to generate a summary across some data.

swapaxes(axis1, axis2[, copy])

Interchange axes and swap values axes appropriately.

swaplevel([i, j, axis])

Swap levels i and j in a MultiIndex.

tail([n])

Return the last n rows.

take(indices[, axis, is_copy])

Return the elements in the given positional indices along an axis.

test_split([size, subjects, cv_strategy, ...])

This method defines and returns a Train and Test Dataset

to_binary(scope[, drop, inplace])

This method works by setting all columns within scope to just two binary categories.

to_clipboard([excel, sep])

Copy object to the system clipboard.

to_csv([path_or_buf, sep, na_rep, ...])

Write object to a comma-separated values (csv) file.

to_data_file(scope[, load_func, inplace])

This method can be used to cast any existing columns where the values are file paths, to a data file.

to_dict([orient, into])

Convert the DataFrame to a dictionary.

to_excel(excel_writer[, sheet_name, na_rep, ...])

Write object to an Excel sheet.

to_feather(path, **kwargs)

Write a DataFrame to the binary Feather format.

to_gbq(destination_table[, project_id, ...])

Write a DataFrame to a Google BigQuery table.

to_hdf(path_or_buf, key[, mode, complevel, ...])

Write the contained data to an HDF5 file using HDFStore.

to_html([buf, columns, col_space, header, ...])

Render a DataFrame as an HTML table.

to_json([path_or_buf, orient, date_format, ...])

Convert the object to a JSON string.

to_latex([buf, columns, col_space, header, ...])

Render object to a LaTeX tabular, longtable, or nested table.

to_markdown([buf, mode, index, storage_options])

Print DataFrame in Markdown-friendly format.

to_numpy([dtype, copy, na_value])

Convert the DataFrame to a NumPy array.

to_orc([path, engine, index, engine_kwargs])

Write a DataFrame to the ORC format.

to_parquet([path, engine, compression, ...])

Write a DataFrame to the binary parquet format.

to_period([freq, axis, copy])

Convert DataFrame from DatetimeIndex to PeriodIndex.

to_pickle(path[, compression, protocol, ...])

Pickle (serialize) object to file.

to_records([index, column_dtypes, index_dtypes])

Convert DataFrame to a NumPy record array.

to_sql(name, con[, schema, if_exists, ...])

Write records stored in a DataFrame to a SQL database.

to_stata(path, *[, convert_dates, ...])

Export DataFrame object to Stata dta format.

to_string([buf, columns, col_space, header, ...])

Render a DataFrame to a console-friendly tabular output.

to_timestamp([freq, how, axis, copy])

Cast to DatetimeIndex of timestamps, at beginning of period.

to_xarray()

Return an xarray object from the pandas object.

to_xml([path_or_buffer, index, root_name, ...])

Render a DataFrame to an XML document.

train_split([size, subjects, cv_strategy, ...])

This method defines and returns a Train and Test Dataset

transform(func[, axis])

Call func on self producing a DataFrame with the same axis shape as self.

transpose(*args[, copy])

Transpose index and columns.

truediv(other[, axis, level, fill_value])

Get Floating division of dataframe and other, element-wise (binary operator truediv).

truncate([before, after, axis, copy])

Truncate a Series or DataFrame before and after some index value.

tshift([periods, freq, axis])

(DEPRECATED) Shift the time index, using the index's frequency if available.

tz_convert(tz[, axis, level, copy])

Convert tz-aware axis to target time zone.

tz_localize(tz[, axis, level, copy, ...])

Localize tz-naive index of a Series or DataFrame to target time zone.

unstack([level, fill_value])

Pivot a level of the (necessarily hierarchical) index labels.

update(other[, join, overwrite, ...])

Modify in place using non-NA values from another DataFrame.

update_data_file_paths(old, new)

Go through and update saved file paths within the Dataset's file mapping attribute.

value_counts([subset, normalize, sort, ...])

Return a Series containing counts of unique rows in the DataFrame.

var([axis, skipna, level, ddof, numeric_only])

Return unbiased variance over requested axis.

where(cond[, other, inplace, axis, level, ...])

Replace values where the condition is False.

xs(key[, axis, level, drop_level])

Return cross-section from the Series/DataFrame.

nan_info
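
The one-line summaries of `apply_inclusions` and `apply_exclusions` in the method list describe complementary filters over the loaded subjects. A minimal pure-Python sketch of that set logic follows. The function names and the use of plain lists of subject labels are illustrative assumptions, not BPt's API.

```python
def keep_after_exclusions(loaded, subjects):
    """Drop every loaded subject that overlaps with `subjects`."""
    excluded = set(subjects)
    return [s for s in loaded if s not in excluded]


def keep_after_inclusions(loaded, subjects):
    """Drop every loaded subject that does not overlap with `subjects`."""
    included = set(subjects)
    return [s for s in loaded if s in included]


loaded = ['s1', 's2', 's3', 's4']
print(keep_after_exclusions(loaded, ['s2', 's9']))        # ['s1', 's3', 's4']
print(keep_after_inclusions(loaded, ['s2', 's3', 's9']))  # ['s2', 's3']
```

In both cases, labels that are passed but not currently loaded (such as `'s9'` here) simply have no effect in this sketch.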