BPt.CompareDict.pairwise_t_stats#
- CompareDict.pairwise_t_stats(metric='first')[source]#
This method performs pairwise t-test comparisons between all different options, assuming this object holds instances of
EvalResults
. The method used to generate the t-test comparisons here is based on the example code from: https://scikit-learn.org/stable/auto_examples/model_selection/plot_grid_search_stats.html

Note
In the case that the sizes of the training and validation sets vary dramatically from fold to fold, it is unclear whether these statistics remain valid. In that case, the mean training set size and mean validation set size are used when computing the statistics.
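As a rough illustration of the statistics discussed in the note, the sketch below implements the corrected resampled paired t-test from the referenced scikit-learn example (the Nadeau and Bengio variance correction). The function name and the use of scalar mean train / validation sizes are assumptions for illustration, not the actual BPt internals:

```python
import numpy as np
from scipy import stats

def corrected_paired_ttest(scores_a, scores_b, n_train, n_test):
    """Hypothetical helper: paired t-test over per-fold scores with
    the Nadeau & Bengio variance correction for overlapping CV folds.

    n_train / n_test are the (mean) training and validation set sizes.
    """
    diffs = np.asarray(scores_a) - np.asarray(scores_b)
    k = len(diffs)  # number of folds
    # Sample variance inflated by (1/k + n_test/n_train) to account
    # for the correlation between folds sharing training data
    corrected_var = diffs.var(ddof=1) * (1 / k + n_test / n_train)
    t_stat = diffs.mean() / np.sqrt(corrected_var)
    # Two-sided p value with k - 1 degrees of freedom
    p_val = 2 * stats.t.sf(abs(t_stat), df=k - 1)
    return t_stat, p_val
```

Because the corrected variance is larger than the naive sample variance, this test is more conservative than a standard paired t-test over fold scores.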
- Parameters
- metricstr, optional
The name of a single valid metric / scorer to compare across options. Notably, all
EvalResults
must have been evaluated with respect to this scorer. The reserved key ‘first’ (the default) indicates that whichever scorer appears first should be used to produce the pairwise t statistics.

default = 'first'
- Returns
- stats_dfpandas DataFrame
A DataFrame comparing all pairwise combinations of the original
Compare
options. ‘t_stat’ and ‘p_val’ columns are generated for each comparison, representing the t statistic corrected for the non-independence of folds and the corresponding Bonferroni-corrected p value (accounting for the multiple comparisons that arise from testing all pairwise combinations). See the referenced scikit-learn example for more information.
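To make the multiple-comparison step concrete, here is a minimal sketch of a Bonferroni correction over all pairwise combinations. The option names and raw (uncorrected) p values are made up for illustration; in practice the raw p values would come from the corrected t-test applied to each pair:

```python
from itertools import combinations
import pandas as pd

# Made-up option names and raw (uncorrected) p values per pair
options = ['ridge', 'lasso', 'elastic']
raw_p = {('ridge', 'lasso'): 0.004,
         ('ridge', 'elastic'): 0.030,
         ('lasso', 'elastic'): 0.012}

# One comparison per pairwise combination of options
pairs = list(combinations(options, 2))
n_comparisons = len(pairs)  # 3 here

# Bonferroni: multiply each raw p value by the number of
# comparisons, capping the result at 1.0
stats_df = pd.DataFrame(
    [{'option_1': a, 'option_2': b,
      'p_val': min(raw_p[(a, b)] * n_comparisons, 1.0)}
     for a, b in pairs])
```

With three options there are three pairwise comparisons, so each raw p value is tripled before being reported in the ‘p_val’ column.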