Stability Analysis for Clustering

Subject Area Mathematics
Term Funded from 2008 to 2012
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 40095828
 
This project develops and analyzes a new model validation principle based on information theory. Discrete structures such as data partitions in clustering or graph cuts are inferred from noisy data according to an objective function. Because the measurements (data) are noisy, learning algorithms have to return a set of approximate partitionings that are considered statistically indistinguishable. The uncertainty in the data induces a quantization of the space of partitionings and thereby defines a coding scheme. An information-theoretic analysis of this code yields an approximation capacity of the underlying model, which is represented by an objective function. This selection criterion trades informativeness against stability and controls the model complexity through the approximation precision. Approximate solutions are sampled by Gibbs sampling at a finite temperature. This novel information-theoretic model selection principle will be applied to correlation clustering in the context of clustering protein interaction data. Furthermore, we will apply this principle to learning dynamical systems in systems biology and to inferring user roles in information security applications.
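The idea of sampling approximate solutions at a finite temperature can be illustrated with a minimal sketch. It is not the project's implementation. It assumes a toy one-dimensional setting with fixed centroids and a squared-distance cost, none of which appear in the description above, and all function names are hypothetical. Under these assumptions the Gibbs distribution over partitions factorizes over data points, so each label can be drawn independently with probability proportional to exp(-cost / T).

```python
import math
import random


def gibbs_assignment_probs(costs, temperature):
    """Boltzmann weights p(k) proportional to exp(-cost_k / T).

    Subtracting the minimum cost keeps every exponent <= 0, which avoids
    overflow at low temperatures.
    """
    lowest = min(costs)
    weights = [math.exp(-(c - lowest) / temperature) for c in costs]
    total = sum(weights)
    return [w / total for w in weights]


def sample_partition(points, centroids, temperature, rng):
    """Draw one partition (list of cluster labels) from the Gibbs distribution.

    Assumption: centroids are fixed and the cost of assigning point x to
    cluster k is the squared distance (x - centroid_k) ** 2.
    """
    labels = []
    for x in points:
        costs = [(x - c) ** 2 for c in centroids]
        probs = gibbs_assignment_probs(costs, temperature)
        labels.append(rng.choices(range(len(centroids)), weights=probs)[0])
    return labels


if __name__ == "__main__":
    rng = random.Random(0)
    points = [0.1, 0.2, 4.9, 5.1]   # hypothetical toy data
    centroids = [0.0, 5.0]          # hypothetical fixed centroids
    for temperature in (0.01, 10.0, 1000.0):
        samples = [tuple(sample_partition(points, centroids, temperature, rng))
                   for _ in range(200)]
        print(temperature, len(set(samples)), "distinct partitions sampled")
```

At a low temperature the sampler returns essentially only the cost-minimizing partition, while at a high temperature many partitions become nearly equally likely. The temperature therefore acts as the approximation precision that sets how large the set of statistically indistinguishable solutions is.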
DFG Programme Research Units
International Connection Switzerland
 
 
