Representing Sets in Embeddings of Relational Information
General and Comparative Linguistics, Experimental Linguistics, Typology, Non-European Languages
Final Report Abstract
This project addressed gaps and limitations in current research on representing relational knowledge, focusing on how knowledge graph vector embeddings and relational fact knowledge from natural language are modeled. Over the course of the project, new developments emerged in knowledge graph embeddings and artificial intelligence, and we adapted our work to take up the most relevant of them. We created an evaluation set designed specifically for assessing reasoning capabilities and the prediction of sets of answers (rather than the ranking of hypotheses). We proposed and studied a prediction model that includes the notion of a region in embedding space, modeling variance and allowing for regularization and priors, as an integral part of learning. We published a method for post-hoc thresholding of trained graph embedding models that predicts sets of facts rather than rankings; the thresholding is based on Gaussian processes, a Bayesian method for estimation from a few observations. The project also proposed new modeling techniques that treat sets of entities as an integral part of probabilistic models of structured knowledge and language. We studied knowledge-graph-based link prediction with Transformers in a multimodal setting (text, knowledge graphs, and images represented as scene graphs), focusing on a challenging dataset of internet memes (images and text). We studied counterfactual reasoning with knowledge graphs and compared it to the counterfactual reasoning capabilities of state-of-the-art large language models. In a controlled setting, our graph-based methods, outlined in the original proposal, proved competitive with the capabilities of current large language models such as ChatGPT. During the project, we published five peer-reviewed articles in international venues, including papers at prestigious conferences.
Publications
-
Ranking vs. Classifying: Measuring Knowledge Base Completion Quality. Automated Knowledge Base Construction, 2020.
Speranskaya, M.; Schmitt, M. & Roth, B.
-
Knodle: Modular Weakly Supervised Learning with PyTorch. Proceedings of the 6th Workshop on Representation Learning for NLP (RepL4NLP-2021), 100-111. Association for Computational Linguistics.
Sedova, Anastasiia; Stephan, Andreas; Speranskaya, Marina & Roth, Benjamin
-
ACTC: Active Threshold Calibration for Cold-Start Knowledge Graph Completion. Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 1853-1863. Association for Computational Linguistics.
Sedova, Anastasiia & Roth, Benjamin
-
MemeGraphs: Linking Memes to Knowledge Graphs. Lecture Notes in Computer Science, 534-551. Springer Nature Switzerland.
Kougia, Vasiliki; Fetzel, Simon; Kirchmair, Thomas; Çano, Erion; Baharlou, Sina Moayed; Sharifzadeh, Sahand & Roth, Benjamin
-
ReInform: Selecting paths with reinforcement learning for contextualized link prediction
Speranskaya, M.; Schmitt, M. & Roth, B.
-
Counterfactual Reasoning with Knowledge Graph Embeddings. Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), 2753-2772. Association for Computational Linguistics.
Zellinger, Lena; Stephan, Andreas & Roth, Benjamin
