This repo allows you to
- download the (preprocessed) data used in the study
- reproduce the results (table, figures etc.) presented in the paper
- rerun the entire benchmark experiment

If you use the code or data, please cite:
Moritz Herrmann, Philipp Probst, Roman Hornung, Vindi Jurinovic, Anne-Laure Boulesteix, Large-scale benchmark study of survival prediction methods using multi-omics data, Briefings in Bioinformatics, Volume 22, Issue 3, May 2021, bbaa167, https://doi.org/10.1093/bib/bbaa167
- The preprocessed data (described in the study) is available via OpenML
- The OpenML dataset IDs can be found in `data/datset_ids.txt` or `data/datset_ids.RData`
- Note that the datasets had to be split into two to three parts in order to be uploaded to OpenML
- R users can use the code in `R/bench_experiment.R` (lines 44-81) to directly download the data (and convert it to `mlr` tasks)
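
Because each dataset is stored on OpenML in several parts, the parts have to be downloaded and recombined before use. The following is a minimal sketch of that idea, not the code from `R/bench_experiment.R`. It assumes the parts were split column-wise (same rows, different features), and the helper name `download_dataset`, its `fetch` argument, and the target column names in the usage comment are hypothetical. See `R/bench_experiment.R` (lines 44-81) for the authoritative version.

```r
# Download every part of one dataset and combine the parts into one data frame.
# `fetch` is injectable so the logic can be tried without network access;
# the default calls OpenML::getOMLDataSet(), which requires the OpenML package.
download_dataset <- function(part_ids,
                             fetch = function(id) OpenML::getOMLDataSet(data.id = id)$data) {
  parts <- lapply(part_ids, fetch)
  n_rows <- vapply(parts, nrow, integer(1))
  # ASSUMPTION: parts were split column-wise, so all parts must have the same rows.
  stopifnot(length(unique(n_rows)) == 1L)
  do.call(cbind, parts)
}

# Hypothetical usage (IDs and target column names are placeholders):
# dat  <- download_dataset(c(<id_part1>, <id_part2>))
# task <- mlr::makeSurvTask(id = "example", data = dat, target = c("time", "status"))
```

The actual part IDs for each dataset should be taken from `data/datset_ids.txt` or `data/datset_ids.RData`.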
- to only reproduce the tables, figures etc. displayed in the paper without rerunning the benchmark experiments use `reproduce_table-and-figures.Rmd`
- to rerun the full experiments (this takes several days or weeks, depending on the available resources) use `R/bench_experiment.R`
  - see the instructions in `R/packages.R`!
  - make sure the required packages are installed
  - make sure to use the correct package versions via checkpoint
  - not all packages are covered by checkpoint, this is specifically relevant for `mlr` (see `R/packages.R`)!
- to merge the benchmark results use `R/merge_bmr_results.R`
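
Before starting a run that may take days, it can help to check that the required packages are installed and which `mlr` version is in use. The sketch below is only an illustration: the helper `missing_packages` and the package list are hypothetical, and the authoritative list and version instructions are in `R/packages.R`.

```r
# Return the subset of `pkgs` that is not installed in the current library.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# ASSUMPTION: illustrative package list only; see R/packages.R for the real one.
required <- c("mlr", "checkpoint", "OpenML")
missing <- missing_packages(required)
if (length(missing) > 0) {
  message("Missing packages: ", paste(missing, collapse = ", "))
} else {
  # Report the installed mlr version, since checkpoint does not cover mlr here.
  message("mlr version: ", as.character(utils::packageVersion("mlr")))
}
```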

Note that mlr has been deprecated in the meantime (https://github.com/mlr-org/mlr). Its successor is the mlr3 framework (https://mlr3.mlr-org.com/).