Why does ensembling provide a reliable performance boost?
Ensembling over multiple parcellations shows a clear advantage over single parcellations, despite employing random parcellations. So… why?
One obvious potential explanation for the observed ensemble performance gain is that it is due solely to the inherent utility of ensembling, which has been shown to reliably increase performance across a wide range of ML applications (Dietterich 2000, Zhou 2009). This likely applies in our case, given the outsized performance of the ‘All’ ensemble relative to the SVM-based ensemble, as seen on page Whats Best.
On the other hand, ensembles as employed in this work are specifically designed to capture information from multiple overlapping parcellations. It is plausible that the performance boost obtained this way is related to the boost from increasing resolution; this could indicate that the “true” best parcellations are not neat and uniform. Instead, by allowing overlapping parcellations, more predictive information can be extracted despite noisy ground truth data. Alternatively, ensembling over multiple views may provide benefit by forcing different classifiers to exploit different, unique predictive signals (Allen-Zhu 2020). The cortical surface exhibits high covariance between different brain regions on the measures employed as input features (e.g., cortical thickness); it may therefore be reasonable to assume that there is more than one multivariate predictive pattern capable of performing well out of sample on the target of interest (Alexander-Bloch 2013). In this case, different instances of random parcellations may help the base estimators of the ensemble learn distinct predictive patterns that, when combined, can exploit a larger region of competency when generating predictions for new samples.
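To make the mechanics concrete, here is a minimal sketch of ensembling over random parcellations, written in plain Python. It is not the pipeline used in this work. The data are synthetic, the random vertex-to-parcel assignment is a simplified stand-in for spatially contiguous random parcellations, and the base estimator is a toy replacement for the SVM and other models discussed above. All function names are hypothetical.

```python
import random
from statistics import fmean


def random_parcellation(n_vertices, n_parcels, rng):
    """Randomly assign vertices to parcels, keeping every parcel non-empty.

    Assumption: real random parcellations are spatially contiguous; this is not.
    """
    labels = [v % n_parcels for v in range(n_vertices)]
    rng.shuffle(labels)
    return labels


def parcel_features(subject, labels, n_parcels):
    """Average one subject's vertex values within each parcel."""
    sums = [0.0] * n_parcels
    counts = [0] * n_parcels
    for value, parcel in zip(subject, labels):
        sums[parcel] += value
        counts[parcel] += 1
    return [s / c for s, c in zip(sums, counts)]


def fit_base(features, y):
    """Toy base estimator: the average of per-parcel univariate least-squares fits."""
    y_mean = fmean(y)
    coefs = []
    for j in range(len(features[0])):
        col = [row[j] for row in features]
        c_mean = fmean(col)
        var = sum((c - c_mean) ** 2 for c in col)
        cov = sum((c - c_mean) * (t - y_mean) for c, t in zip(col, y))
        coefs.append((cov / var if var > 0 else 0.0, c_mean))

    def predict(row):
        return y_mean + fmean(s * (f - m) for (s, m), f in zip(coefs, row))

    return predict


def fit_ensemble(x_train, y_train, n_parcels, n_members, seed=0):
    """Fit one base estimator per random parcellation."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        labels = random_parcellation(len(x_train[0]), n_parcels, rng)
        feats = [parcel_features(s, labels, n_parcels) for s in x_train]
        members.append((labels, fit_base(feats, y_train)))
    return members


def ensemble_predict(members, subject, n_parcels):
    """Average the members' predictions, each computed in its own parcellation."""
    return fmean(
        predict(parcel_features(subject, labels, n_parcels))
        for labels, predict in members
    )


def mse(pred, y):
    return fmean((p - t) ** 2 for p, t in zip(pred, y))


# Synthetic data: the target depends on the first quarter of the vertices.
data_rng = random.Random(42)
x = [[data_rng.gauss(0, 1) for _ in range(40)] for _ in range(80)]
y = [sum(s[:10]) + data_rng.gauss(0, 1) for s in x]
x_train, y_train, x_test, y_test = x[:60], y[:60], x[60:], y[60:]

members = fit_ensemble(x_train, y_train, n_parcels=8, n_members=5, seed=1)
member_mses = [
    mse([predict(parcel_features(s, labels, 8)) for s in x_test], y_test)
    for labels, predict in members
]
ensemble_mse = mse([ensemble_predict(members, s, 8) for s in x_test], y_test)
print("mean single-parcellation MSE:", fmean(member_mses))
print("ensemble MSE:", ensemble_mse)
```

For squared error, the averaged prediction can never do worse than the mean error of its members (a consequence of Jensen's inequality), which is one way to read the "inherent utility" of ensembling. How much it gains beyond that depends on how different the members' errors are, which is where differing parcellations could plausibly contribute.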