K. Mnich, A. Polewko-Klim, A. Kitlas Golińska, W. Lesiński and W. R. Rudnicki, “Super Learning with Repeated Cross Validation,” 2020 International Conference on Data Mining Workshops (ICDMW), Sorrento, Italy, 2020, pp. 629-635, doi: 10.1109/ICDMW51313.2020.00089.
The super learner algorithm was created to combine the results of multiple base learners using cross validation. In many cases, however, it does not significantly outperform a simple average of the base learners' results. We propose applying multiple repeats of cross validation to improve the performance of super learning. Two approaches to applying repeated cross validation were tested on artificial data sets and on real-life biomedical data sets. One of them, the MEAN OUTPUT strategy, significantly improved the results. To reduce the computational complexity of the algorithm, we suggest using 3-fold rather than the previously recommended 10-fold cross validation. The tests showed that this simplification does not affect the super learning results.
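The abstract does not spell out how the MEAN OUTPUT strategy combines the repeats, so the following is only a minimal sketch under one plausible reading: the out-of-fold predictions of each base learner are averaged over all repeats of 3-fold cross validation, and the combining weights are fitted once on those averaged predictions. Everything here is an illustrative assumption rather than the authors' implementation: the two toy base learners (`fit_line`, `fit_mean`), the single convex weight used as the meta-learner, the helper names, and the synthetic data.

```python
import random
from statistics import mean

def fit_mean(xs, ys):
    """Toy base learner: always predicts the training mean."""
    m = mean(ys)
    return lambda x: m

def fit_line(xs, ys):
    """Toy base learner: ordinary least-squares line through (xs, ys)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    return lambda x: my + slope * (x - mx)

def out_of_fold(xs, ys, learners, k, rng):
    """One pass of k-fold CV: z[i][j] is learner j's prediction for held-out point i."""
    n = len(xs)
    idx = list(range(n))
    rng.shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    z = [[0.0] * len(learners) for _ in range(n)]
    for held in folds:
        held_set = set(held)
        train = [i for i in range(n) if i not in held_set]
        tx = [xs[i] for i in train]
        ty = [ys[i] for i in train]
        for j, fit in enumerate(learners):
            predict = fit(tx, ty)
            for i in held:
                z[i][j] = predict(xs[i])
    return z

def mean_output(xs, ys, learners, k=3, repeats=10, seed=0):
    """Assumed MEAN OUTPUT reading: average out-of-fold predictions over repeats."""
    rng = random.Random(seed)
    n = len(xs)
    z_bar = [[0.0] * len(learners) for _ in range(n)]
    for _ in range(repeats):
        z = out_of_fold(xs, ys, learners, k, rng)
        for i in range(n):
            for j in range(len(learners)):
                z_bar[i][j] += z[i][j] / repeats
    return z_bar

def convex_weight(z, ys):
    """Least-squares weight a in [0, 1] for a*z0 + (1 - a)*z1 (two learners only)."""
    num = sum((y - r[1]) * (r[0] - r[1]) for r, y in zip(z, ys))
    den = sum((r[0] - r[1]) ** 2 for r in z)
    a = num / den if den else 0.5
    return min(1.0, max(0.0, a))

# Synthetic demo data (an assumption for illustration): a noisy straight line.
data_rng = random.Random(42)
xs = [i / 10 for i in range(30)]
ys = [2 * x + 1 + data_rng.gauss(0, 0.1) for x in xs]

z_bar = mean_output(xs, ys, [fit_line, fit_mean], k=3, repeats=10)
weight = convex_weight(z_bar, ys)  # weight given to fit_line
print(f"weight on fit_line: {weight:.3f}, on fit_mean: {1 - weight:.3f}")
```

Because the demo data are nearly linear, most of the weight should go to the line learner. Setting `repeats=1` reduces the sketch to plain single-pass cross validation, which gives a baseline for comparison.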