Project Details

Stability Analysis for Clustering

Subject Area Mathematics
Term from 2008 to 2012
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 40095828
 
This project develops and analyzes a new model validation principle based on information theory. Discrete structures such as data partitions in clustering or graph cuts are inferred from noisy data according to an objective function. Because of the noise in the measurements (data), learning algorithms have to return a set of approximate partitionings that are considered statistically indistinguishable. The uncertainty in the data induces a quantization of the space of partitionings and thereby defines a coding scheme. An information-theoretic analysis of this code yields the approximation capacity of the underlying model, which is represented by an objective function. This selection criterion trades informativeness against stability and controls the model complexity through the approximation precision. Approximate solutions are sampled by Gibbs sampling at a finite temperature. This novel information-theoretic model selection principle will be applied to correlation clustering in the context of protein interaction data. Furthermore, we will apply the principle to learning dynamical systems in systems biology and to inferring user roles in information security applications.
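The sampling step can be illustrated with a minimal sketch. It is not the project's implementation: it assumes one-dimensional data and a k-means style cost as a stand-in objective (the project itself targets correlation clustering), and the names kmeans_cost and gibbs_sample_partitions are hypothetical. Partitions are drawn with probability proportional to exp(-beta * cost), where beta is the inverse temperature.

```python
import math
import random


def kmeans_cost(points, labels, k):
    """Sum of squared distances of each point to its cluster mean (assumed cost)."""
    cost = 0.0
    for c in range(k):
        members = [p for p, l in zip(points, labels) if l == c]
        if not members:
            continue  # an empty cluster contributes nothing
        mean = sum(members) / len(members)
        cost += sum((p - mean) ** 2 for p in members)
    return cost


def gibbs_sample_partitions(points, k, beta, sweeps, rng):
    """Return one partition per sweep, sampled at inverse temperature beta."""
    labels = [rng.randrange(k) for _ in points]
    samples = []
    for _ in range(sweeps):
        for i in range(len(points)):
            # Cost of the whole partition for each candidate label of point i.
            costs = []
            for c in range(k):
                labels[i] = c
                costs.append(kmeans_cost(points, labels, k))
            # Subtract the minimum before exponentiating for numerical stability.
            low = min(costs)
            weights = [math.exp(-beta * (c - low)) for c in costs]
            labels[i] = rng.choices(range(k), weights=weights)[0]
        samples.append(tuple(labels))
    return samples


if __name__ == "__main__":
    data = [0.0, 0.1, 0.2, 5.0, 5.1, 5.2]
    for beta in (0.0, 1.0, 50.0):
        drawn = gibbs_sample_partitions(data, 2, beta, 30, random.Random(0))
        distinct = len(set(drawn))
        mean_cost = sum(kmeans_cost(data, s, 2) for s in drawn) / len(drawn)
        print(f"beta={beta}: {distinct} distinct partitions, mean cost {mean_cost:.3f}")
```

At low beta (high temperature) the sampler returns many different partitions; at high beta it concentrates on low-cost ones. The set of partitions visited at a given temperature plays the role of the statistically indistinguishable approximate solutions described above.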
DFG Programme Research Units
International Connection Switzerland
 
 
