Project Details

Joining graph- and vector-based sense representations for semantic end-user information access (JOIN-T 2)

Subject Area Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term from 2014 to 2019
Project identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 259256643
 
In recent years, research in Natural Language Processing (NLP) has led to major breakthroughs in language understanding. Computational semantics is one of the key areas of NLP, and accordingly a plethora of work has focused on representations of machine-readable knowledge along orthogonal dimensions such as manual vs. automatic acquisition, lexical vs. conceptual, and dense vs. sparse representations. However, much work still lies ahead in combining these dimensions so that their strengths complement each other and yield a unified semantic model and knowledge resource for tackling complex high-end NLP tasks. We propose an approach to meaning representation in context that is based on the graph-vector duality: the hypothesis that graph and vector representations of lexical entities should be used together to describe the semantics of these entities. To this end, we propose a computational framework and resource that integrates all these dimensions and combines the interpretability of manually crafted resources and sparse representations with the accuracy and high coverage of dense neural embeddings. We build upon our previous work on joining ontologies with graph-based distributional semantics, and take it to the next level by: i) joining it with dense semantic vector representations (a.k.a. embeddings) of text and knowledge bases (KBs) in a unified graph-vector semantic model; ii) extending the coverage to the long tail of (infrequent) named entities, including emerging ones, by leveraging extractions from Web-scale corpora; iii) exploring the benefits of a joint lexical, distributional and ontological representation for a high-end NLP task such as the browsing of document collections along structures such as entities and events. This is an application for the continuation of our previous project "JOIN-T".
We have successfully addressed most work packages from the first project phase and plan to complete the remaining ones during the proposal review period. The choice of topics for this continuation is informed by the key takeaways from the first project phase, namely: i) linking distributional semantic information with lexical ontologies is possible with high accuracy; ii) disambiguating lexical items in context with respect to distributional or ontological senses is possible with high accuracy using graph-based representations, but at the expense of computational efficiency, which hampers scaling to very large corpora.
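As an illustration only, the graph-vector duality described above can be sketched as a word sense that carries both a sparse set of graph neighbours and a dense vector, with disambiguation in context scoring both views together. This is a toy sketch and not the project's implementation: the `Sense` class, the `disambiguate` function, the `alpha` weighting and all data values are assumptions made up for the example.

```python
import math
from dataclasses import dataclass


@dataclass
class Sense:
    """Hypothetical word sense holding both views of the same entity."""
    label: str
    neighbours: set  # sparse, interpretable graph view (related words)
    vector: list     # dense embedding view (toy 2-d values)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def overlap(neighbours, context_words):
    """Fraction of context words found among the sense's graph neighbours."""
    if not context_words:
        return 0.0
    return len(neighbours & set(context_words)) / len(context_words)


def disambiguate(senses, context_words, context_vector, alpha=0.5):
    """Pick the sense with the best weighted sum of graph and vector scores.

    alpha is an assumed mixing weight: 1.0 uses only the graph view,
    0.0 uses only the vector view.
    """
    def score(sense):
        graph_score = overlap(sense.neighbours, context_words)
        vector_score = cosine(sense.vector, context_vector)
        return alpha * graph_score + (1 - alpha) * vector_score

    return max(senses, key=score)


# Toy inventory for the ambiguous word "python" (invented data).
SENSES = [
    Sense("python (animal)", {"snake", "reptile", "jungle"}, [0.9, 0.1]),
    Sense("python (language)", {"code", "java", "programming"}, [0.1, 0.9]),
]

best = disambiguate(SENSES, ["code", "programming", "bug"], [0.2, 0.8])
print(best.label)
```

In this sketch the graph view stays human-readable (the matching neighbours explain the decision), while the vector view still yields a score when no context word matches a neighbour, which loosely mirrors the interpretability versus coverage trade-off the abstract describes.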
DFG Programme Research Grants
 
 
