Project Details

Resampling-based comparison studies of prediction methods with emphasis on high-dimensional biological data

Subject Area: Epidemiology and Medical Biometry/Statistics
Term: 2009 to 2018
Project identifier: Deutsche Forschungsgemeinschaft (DFG) - Project number 158005760
 
Resampling-based comparison studies of prediction methods using real data sets are routinely conducted in computational research, for instance in biostatistics, bioinformatics, and machine learning. For example, supervised classification methods may be compared with respect to their cross-validation error on, say, five exemplary real data sets. However, many methodological questions related to such comparison studies, and to resampling-based methods in general, remain unanswered. This project addresses such questions, with emphasis on, but not limited to, applications to high-dimensional biological data.
In the first part of the project we address the design of comparison studies from a statistical testing perspective by drawing parallels between comparison studies in the computational sciences (in which the performance of methods is compared on real data sets) and clinical trials (in which therapy effects are compared in patients). In light of this analogy we develop methods to address issues such as the choice of the resampling procedure with respect to its variability, and the relationship between the performance of a method and the characteristics of the data sets, with emphasis on statistical inference and power. We extend this statistical framework to issues other than prediction models.
The second part deals with scientific practice and the interpretation of the literature in computational science, with a focus on prediction methods, borrowing concepts from biomedical and clinical research. More precisely, we extend concepts such as meta-analysis, inclusion criteria for patients in clinical studies, and researcher degrees of freedom (related to multiple testing and fishing for significance) to the world of computational research. The goal of this part is to suggest initial concepts and to investigate their feasibility.
In the third part we develop methods related to parameter tuning for prediction methods, which is often performed using resampling methods in practice. We propose methods to measure the impact of tuning parameters on the performance of a method, we systematically assess and compare various resampling-based procedures for tuning, and we develop alternatives to resampling.
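To make the setting concrete, the following is a minimal sketch of a resampling-based comparison with parameter tuning. It is not taken from the project itself: the synthetic data generator, the k-nearest-neighbour classifier, and all function names (`make_dataset`, `knn_predict`, `cv_error`, `tune_k`) are assumptions chosen for illustration. It estimates the cross-validation error of two settings of a method on five data sets and picks a tuning parameter by cross-validation.

```python
import random


def make_dataset(n, shift, seed):
    """Synthetic two-class data (an assumption for illustration):
    class 1 is shifted by `shift` in both features."""
    rng = random.Random(seed)
    data = []
    for i in range(n):
        label = i % 2
        x = (rng.gauss(label * shift, 1.0), rng.gauss(label * shift, 1.0))
        data.append((x, label))
    rng.shuffle(data)
    return data


def knn_predict(train, x, k):
    """Majority vote among the k nearest training points (squared Euclidean distance)."""
    nearest = sorted(
        train,
        key=lambda p: (p[0][0] - x[0]) ** 2 + (p[0][1] - x[1]) ** 2,
    )[:k]
    votes = sum(label for _, label in nearest)
    return 1 if 2 * votes > k else 0


def cv_error(data, k, n_folds=5):
    """Misclassification rate estimated by n_folds-fold cross-validation."""
    errors = 0
    for fold in range(n_folds):
        # Every n_folds-th observation, starting at `fold`, forms the test set.
        test = data[fold::n_folds]
        train = [p for i, p in enumerate(data) if i % n_folds != fold]
        errors += sum(knn_predict(train, x, k) != y for x, y in test)
    return errors / len(data)


def tune_k(data, candidates, n_folds=5):
    """Pick the candidate k with the smallest cross-validation error."""
    return min(candidates, key=lambda k: cv_error(data, k, n_folds))


# Compare two settings of the method on five data sets, then tune k on each.
datasets = [make_dataset(60, shift=1.5, seed=s) for s in range(5)]
for i, data in enumerate(datasets):
    errs = {k: cv_error(data, k) for k in (1, 5)}
    best = tune_k(data, candidates=(1, 3, 5, 7))
    print(f"data set {i}: error 1-NN={errs[1]:.2f}, 5-NN={errs[5]:.2f}, tuned k={best}")
```

Note that the cross-validation error of the tuned k is selected as a minimum over candidates and is therefore optimistically biased as an estimate of prediction error; an outer resampling loop (nested cross-validation) would be one way to assess the tuned method.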
DFG Programme: Research Grants
 
 
