Effective Higher-Order Automated Theorem Proving
Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Final Report Abstract
Research Context: The primary focus in the automated theorem proving (ATP) community has for decades been on classical first-order logic (FOL), and very powerful ATPs for FOL, such as Vampire, E and SPASS, have been developed. Practical applications, however, often motivate the use of more expressive logics. Multiple sorts, formulas embedded at term-level, λ-abstraction and higher-order quantifiers, as provided in classical higher-order logic (HOL), enable more intuitive and often more adequate problem encodings, and they may even facilitate shorter and more efficient proofs. Adding (restricted forms) of polymorphism further enhances the expressivity of HOL. Because of its virtues HOL has been selected as the base logic of choice in many modern interactive proof assistants, including HOL4, HOL Light, Isabelle/HOL and PVS, which have prominent applications in computer science and mathematics. When utilised as a meta-logic, HOL even allows for semantical embeddings of a multitude of expressive (quantified) non-classical logics, which in turn facilitate many further applications in AI, philosophy and natural language processing. These observations have triggered a recent shift of interest in the ATP community from FOL to HOL. Research Results: In the DFG funded project Effective Higher-Order Automated Theorem Proving, we have developed the new ATP system Leo-III for (polymorphic) HOL. An independent, recent study empirically evaluated the world’s leading ATPs for FOL and HOL, including Leo-III, on a large and representative benchmark set of theorem proving problems called the “Grand Unified ATP Challenge”. This benchmark set was obtained by translating 12140 theorems from the HOL4 standard library into multiple logical formalisms. Leo-III outperformed all its competitor systems in this evaluation study, including most prominent theorem provers such as Vampire, E, CVC4, SPASS, Prover9, Zipperposition, Satallax and HOLyHammer: Leo-III automatically proved 57% (7062) of the 12140 theorems, while the next best system, Vampire, only proved 47% (5929). This independent study confirms that the Leo-III project has well met its core objective, namely to leverage the state-of-the-art in practical higher-order automated theorem and to contribute an effective new automated theorem proving systems that is applicable to a wide range of practically relevant proof problems. The applicability range of Leo-III has in fact been extended beyond the above benchmark set: the prover natively supports a range of quantified non-classical logics based on an adoption of semantical embedding techniques that has been developed in Benzmüller’s related DFG project Studies in Computational Metaphysics. Leo-III’s success rests on a range of theoretical and practical contributions, including (i) a novel proof calculus for extensional higher-order ordered paramodulation, (ii) an extension of this calculus for prefix polymorphic HOL, (iii) the design and implementation of effective data structures for (polymorphic) HOL, (iv) an efficient implementation of Leo-III in Scala, and (v) the incorporation of the shallow semantical embedding technique in Leo-III to enable the automation of prominent (quantified) non-classical logics.
Publications
- “There Is No Best Beta-Normalization Strategy for Higher-Order Reasoners”. In: Logic for Programming, Artificial Intelligence, and Reasoning (LPAR). ed. by M. Davis, A. Fehnker, A. McIver, and A. Voronkov. Vol. 9450. LNAI. Suva, Fiji: Springer, 2015, pp. 329–339. isbn: 978-3-662-48898-0
A. Steen and C. Benzmüller
(See online at https://doi.org/10.1007/978-3-662-48899-7_23) - “Effective Normalization Techniques for HOL”. in: Automated Reasoning. 8th Intl. Joint Conference, IJCAR 2016, Coimbra, Portugal, June 27–July 2, 2016, Proceedings. Ed. by N. Olivetti and A. Tiwari. Vol. 9706. LNCS. Springer, 2016, pp. 362–370. isbn: 978-3-319-40228-4
M. Wisniewski, A. Steen, K. Kern, and C. Benzmüller
(See online at https://doi.org/10.1007/978-3-319-40229-1_25) - “Going Polymorphic—TH1 Reasoning for Leo-III”. in: IWIL@LPAR 2017 Workshop and LPAR-21 Short Presentations, Maun, Botswana, May 7-12, 2017. Ed. by T. Eiter, D. Sands, G. Sutcliffe, and A. Voronkov. Vol. 1. Kalpa Publications in Computing. Maun, Botswana: EasyChair, 2017, pp. 100–112
A. Steen, M. Wisniewski, and C. Benzmüller
(See online at https://doi.org/10.29007/jgkw) - “Theorem Provers for Every Normal Modal Logic”. In: LPAR-21. 21st International Conference on Logic for Programming, Artificial Intelligence and Reasoning. Ed. by T. Eiter and D. Sands. Vol. 46. EPiC Series in Computing. Maun, Botswana: EasyChair, 2017, pp. 14–30
T. Gleißner, A. Steen, and C. Benzmüller
(See online at https://doi.org/10.29007/jsb9) - “Extensional Paramodulation for Higher-Order Logic and its Effective Implementation Leo-III”. PhD thesis. Freie Universität Berlin, 2018. Published in Dissertations in Artificial Intelligence (DISKI), volume 345, EAN/ISBN 978-3-89838-739-2, AKA-Verlag
A. Steen
- “The Higher-Order Prover Leo-III”. in: Automated Reasoning. IJCAR 2018. Ed. by D. Galmiche, S. Schulz, and R. Sebastiani. Vol. 10900. LNCS. Springer, Cham, 2018, pp. 108–116. isbn: 978-3-319-94204-9
A. Steen and C. Benzmüller
(See online at https://doi.org/10.1007/978-3-319-94205-6_8)