BPt.EvalResultsSubset.run_permutation_test

EvalResultsSubset.run_permutation_test(n_perm=100, dataset=None, random_state=None, blocks=None, within_grp=True, plot=False)

Compute significance values for the original results according to a permutation test scheme. In this setup, the null model is estimated by randomly permuting the target variable and re-evaluating the same pipeline with the same CV. This generates a null distribution of size n_perm against which the real, unpermuted results can be compared.
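The scheme described above can be sketched in plain Python. This is a minimal illustration, not BPt's implementation: the one-feature least-squares model, the fixed folds, the negative-MSE score and the "+1" p-value correction are all assumptions chosen to keep the example self-contained, and every function name here is hypothetical.

```python
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares fit of y = a * x + b for a single feature."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    a = sxy / sxx
    return a, my - a * mx


def cv_score(xs, ys, folds):
    """Mean negative MSE over the given test folds (higher is better)."""
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in range(len(xs)) if i not in held_out]
        a, b = fit_line([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        errors = [(ys[i] - (a * xs[i] + b)) ** 2 for i in test_idx]
        scores.append(-statistics.fmean(errors))
    return statistics.fmean(scores)


def permutation_test(xs, ys, n_perm=100, n_folds=5, seed=0):
    """Permute the target n_perm times, re-scoring with the same folds each time."""
    rng = random.Random(seed)
    n = len(xs)
    # The same CV splits are reused for the true and every permuted evaluation.
    folds = [list(range(k, n, n_folds)) for k in range(n_folds)]
    true_score = cv_score(xs, ys, folds)

    null_scores = []
    for _ in range(n_perm):
        permuted = list(ys)
        rng.shuffle(permuted)  # breaks any link between features and target
        null_scores.append(cv_score(xs, permuted, folds))

    # Assumed convention: count null scores at least as good as the true one,
    # with a +1 correction so the p-value can never be exactly zero.
    n_as_good = sum(s >= true_score for s in null_scores)
    p_value = (n_as_good + 1) / (n_perm + 1)
    return p_value, null_scores, true_score


# Synthetic data with a strong linear signal (illustrative values only).
data_rng = random.Random(42)
xs = [data_rng.gauss(0, 1) for _ in range(60)]
ys = [2.0 * x + data_rng.gauss(0, 0.5) for x in xs]

p_value, null_scores, true_score = permutation_test(xs, ys, n_perm=100)
print(p_value, len(null_scores))
```

Because the folds are built once and reused, the only thing that changes between the true run and each null run is the ordering of the target.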

Note: If using a custom scorer with no sign_ attribute, this method assumes that higher metric values are better.
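To see why the direction of the metric matters, here is a small sketch of how a sign could enter the p-value calculation. The function name, the `sign` argument and the "+1" correction are assumptions for illustration, not BPt's internals, and the scores are made-up values.

```python
def permutation_p_value(true_score, null_scores, sign=1):
    """Fraction of null scores at least as good as the true score.

    sign=1 means higher is better (the assumed fallback when a scorer
    has no sign information); sign=-1 means lower is better.
    """
    if sign not in (1, -1):
        raise ValueError("sign must be 1 or -1")
    # Multiply by sign so that "larger is better" holds in both cases.
    true_adj = sign * true_score
    n_as_good = sum(sign * s >= true_adj for s in null_scores)
    return (n_as_good + 1) / (len(null_scores) + 1)


null = [0.1, 0.2, 0.3, 0.4]  # illustrative null scores

# Higher is better: only 0.4 is at least as good as 0.35.
p_higher = permutation_p_value(0.35, null)
# Lower is better: 0.1, 0.2 and 0.3 are all at least as good as 0.35.
p_lower = permutation_p_value(0.35, null, sign=-1)
print(p_higher, p_lower)
```

The same true score and null distribution give different p-values depending on the assumed direction, so a loss-style custom scorer evaluated under the higher-is-better fallback would yield a misleading result.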

Parameters
n_perm : int, optional

The number of permutations to test.

default = 100
dataset : Dataset

The instance of Dataset originally passed to evaluate().

Note

If a different dataset is passed, then unexpected behavior may occur.

If left as the default (None), the method tries to use a shallow copy of the dataset passed to the original evaluate() call (this requires that evaluate() was run with store_data_ref=True).

default = None
random_state : int or None, optional

Seed for the pseudo-random number generator that controls the permutations of the target variable. If left as None, a new random state is initialized for each permutation.

default = None
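The effect of random_state can be illustrated with the standard library's random module. This is a hedged sketch of the behavior described above, not BPt's code: `permuted_targets` is a hypothetical helper, and using one seeded generator across all permutations is an assumption about how an integer seed is consumed.

```python
import random


def permuted_targets(y, n_perm, random_state=None):
    """Return n_perm shuffled copies of y.

    With an int random_state, one seeded generator drives every shuffle,
    so the whole sequence of permutations is reproducible. With None, a
    fresh unseeded generator is created for each permutation.
    """
    shared = random.Random(random_state) if random_state is not None else None
    out = []
    for _ in range(n_perm):
        rng = shared if shared is not None else random.Random()
        shuffled = list(y)
        rng.shuffle(shuffled)
        out.append(shuffled)
    return out


y = [1, 2, 3, 4, 5, 6]
a = permuted_targets(y, 3, random_state=0)
b = permuted_targets(y, 3, random_state=0)
c = permuted_targets(y, 3)  # unseeded, differs from run to run

print(a == b)
```

Passing the same integer twice yields identical permutations, which is what makes a reported p-value reproducible.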
blocks : None, array, pd.Series or pd.DataFrame, optional

This parameter is only available when the neurotools library is installed. See: sahahn/neurotools

This parameter represents the underlying exchangeability-block structure of the data passed, and is used to constrain which permutations are allowed.

See PALM’s documentation for an introduction on how to format ExchangeabilityBlocks: https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/PALM/ExchangeabilityBlocks

This parameter accepts the same style input as PALM, except it is passed here as an array or DataFrame instead of as a file. The main requirement is that the shape of the structure match the number of subjects / data points in the first dimension.

default = None
within_grp : bool, optional

This parameter is only relevant when a permutation structure (blocks) is passed. In that case, it describes how the left-most column of the exchangeability / permutation structure should act. If True, the left-most column is treated as groups within which samples may be swapped, with no swaps across groups. If False, the groups in the left-most column are instead swapped as whole units, and only with other groups of the same size.

default = True
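The two behaviors can be sketched for a single column of group labels. This is only an illustration of the description above, not the neurotools implementation, and it ignores multi-column block structures; both function names are hypothetical. Each function returns a list perm where position i receives the target value from index perm[i].

```python
import random
from collections import defaultdict


def _group_members(groups):
    """Map each group label to the list of sample indices it contains."""
    members = defaultdict(list)
    for i, g in enumerate(groups):
        members[g].append(i)
    return members


def permute_within_groups(groups, rng):
    """Shuffle samples only inside their own group (like within_grp=True)."""
    perm = list(range(len(groups)))
    for idx in _group_members(groups).values():
        shuffled = list(idx)
        rng.shuffle(shuffled)
        for dst, src in zip(idx, shuffled):
            perm[dst] = src
    return perm


def permute_whole_groups(groups, rng):
    """Swap entire groups with same-size groups (like within_grp=False)."""
    members = _group_members(groups)
    by_size = defaultdict(list)
    for g, idx in members.items():
        by_size[len(idx)].append(g)
    perm = list(range(len(groups)))
    for same_size in by_size.values():
        shuffled = list(same_size)
        rng.shuffle(shuffled)
        # Each group takes the values of another group of equal size,
        # keeping the order of samples inside the group.
        for g_dst, g_src in zip(same_size, shuffled):
            for dst, src in zip(members[g_dst], members[g_src]):
                perm[dst] = src
    return perm


groups = [0, 0, 0, 1, 1, 1, 2, 2]  # illustrative labels: two groups of 3, one of 2
rng = random.Random(0)
within = permute_within_groups(groups, rng)
whole = permute_whole_groups(groups, rng)
print(within, whole)
```

In this example the group of size 2 has no same-size partner, so under whole-group swapping it always stays in place.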
plot : bool, optional

If True, also generate a plot visualizing the true result in comparison to the generated null distribution.

default = False
Returns
p_values : dict of float

A dictionary of the computed p-values, indexed by each valid metric.

p_scores : dict of array

The null distribution of scores, indexed by each valid metric.