Lernalgorithmen zur Konstruktion neuer relationaler Strukturen einer gegebenen Klasse
Zusammenfassung der Projektergebnisse
We introduced and investigated the problem of learning to construct novel structured objects with properties implicitly defined by training data or a labelling oracle. In contrast, traditional machine learning has focussed on learning to predict properties of novel objects. The new constructive machine learning setting captures formally a variety of high-potential application areas of machine learning like computational drug design and content creation. Our main objective has been to contribute theoretical understanding and algorithmic approaches relevant to this setting and to explore its applications. Constructive machine learning processes typically iterate through analyse, design, make, and test phases. The analyse phase extracts a hypothesis from available data, the design phase constructs a digital model of a promising structure from the design space, the make phase realises the designed structure in its context, and the test phase evaluates the quality of the realised structure. A challenging and important example for this is the drug design cycle. The way traditional machine learning has been applied to constructive processes is often by a sample and filter approach: An uninformed sampler produces candidate structures and a learned hypothesis is used to filter these. This approach is often inefficient and in iterative processes not wellfounded. Indeed, the assumptions of traditional machine learning approaches are often violated in constructive settings, e.g., the iterative nature of this process causes dependent samples. Our research concentrated mainly along two lines. On the one hand, to cope with the iterative and interactive nature of the analyse phase, we investigated knowledge-based, online, and active machine learning approaches with a focus on structured domains. On the other hand, to cope with the huge number of candidate structures in the design phase, we investigated data mining techniques that can be used to break the structures into fragments for which the space of recombinations can be searched effectively. We have also explored different application areas of constructive machine learning, including drug design and adaptive games. Our work on online learning has initially focussed on spaces structured by a partially ordered set and later turned to abstract convexity spaces; it was mostly explored in a game context. Our work on active learning has initially focussed on model selection and later turned to active search; it was mostly explored in a drug design context. Our work on knowledge-based algorithms concentrated on interactive visualisations for data exploration that can be manipulated more intuitively by domain experts. Our work on fragmentation focussed initially on enumeration and random sampling algorithms and eventually led to a separate investigation in another project funded by the DFG. The project made contributions to machine learning and data mining research, to research in application areas of machine learning, was further promoted by workshops, and improved the qualifications of several involved researchers.
Projektbezogene Publikationen (Auswahl)
- Formal concept sampling for counting and threshold-free local pattern mining. In Proceedings of the SIAM International Conference on Data Mining, 2010
Mario Boley, Thomas Gärtner, and Henrik Grosskreutz
- A fixed parameter tractable integer program for finding the maximum order preserving submatrix. In 11th IEEE International Conference on Data Mining, 2011
Jens Humrich, Thomas Gärtner, and Gemma C. Garriga
- Direct local pattern sampling by efficient two–step random procedures. In: KDD '11 Proceedings of the 17th ACM SIGKDD international conference on Knowledge discovery and data mining
Pages 582-590. San Diego, California, USA — August 21 - 24, 2011. ACM New York c 2011. ISBN: 978-1-4503-0813-7
Mario Boley, Claudio Lucchese, Daniel Paurat, and Thomas Gärtner
(Siehe online unter https://dx.doi.org/10.1145/2020408.2020500) - Predicting dynamic difficulty. In Advances in Neural Information Processing Systems 24, 2011
Olana Missura and Thomas Gärtner
- Discriminative clustering for market segmentation. In Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2012
Peter Haider, Luca Chiarandini, and Ulf Brefeld
- Dynamic difficulty for checkers and chinese chess. In IEEE Conference on Computational Intelligence and Games, 2012
Laurentiu Ilici, Jiaojian Wang, Olana Missura, and Thomas Gärtner
- Interactive knowledge-based kernel PCA. In Proceedings of the European Conference on Machine Learning and Knowledge Discovery in Databases, 2014
Dino Oglic, Daniel Paurat, and Thomas Gärtner
(Siehe online unter https://doi.org/10.1007/978-3-662-44851-9_32) - Predicting unexpected influxes of players in EVE online. In IEEE Conference on Computational Intelligence and Games, 2014
Roman Garnett, Thomas Gärtner, Timothy Ellersiek, Eyjolfur Gudmondsson, and Petur Oskarsson
(Siehe online unter https://doi.org/10.1109/CIG.2014.6932878) - Sampling for inference in probabilistic models with fast Bayesian quadrature. In Advances in Neural Information Processing Systems 27, 2014
Tom Gunter, Michael A. Osborne, Roman Garnett, Philipp Hennig, and Stephen J. Roberts
- Introducing the ‘active search’ method for iterative virtual screening. Journal of Computer-Aided Molecular Design, 2015
Roman Garnett, Thomas Gärtner, Martin Vogt, and Jürgen Bajorath
(Siehe online unter https://doi.org/10.1007/s10822-015-9832-9)