Project Details

QASciInf: Question Answering for Scientific Information

Subject Area: Image and Language Processing, Computer Graphics and Visualisation, Human Computer Interaction, Ubiquitous and Wearable Computing
Term: 2014 to 2025
Project identifier: Deutsche Forschungsgemeinschaft (DFG) - Project number 252295018

Final Report Year: 2025

Final Report Abstract

The number of scientific articles is increasing exponentially, making it challenging for scientists to keep up with the latest research and benefit from all relevant work. Novel technologies must be developed to help researchers access this rapidly expanding body of knowledge. Natural Language Processing (NLP), including Information Retrieval (IR), Question Answering (QA) systems, and generative methods, can play a pivotal role in addressing this challenge. Retrieval systems find relevant information for downstream QA systems, but can also be used directly, e.g., in literature review. Given a question, QA systems can efficiently extract information from a publication. Table understanding is a core skill for any system applied to scientific content; a generative model can use it, e.g., to summarize the results presented in a table. However, when the proposal was submitted, no resources were available for evaluating models on scientific tasks. Therefore, in the QASciInf project, we published datasets and benchmarks that introduce relevant tasks and enable the systematic evaluation of models. We introduced a zero-shot benchmark for retrieval to evaluate systems on domains they have not been trained on (e.g., scientific domains where training data is scarce). Further, we proposed a context-aware retrieval benchmark to measure how well a system can take long contexts, such as scientific publications, into account. We introduced a dialog dataset over publications, and we published a QA dataset over papers with expert questions and answers. We also introduced a table-to-text dataset of scientific tables and their descriptions. Together, these datasets and benchmarks allow a comprehensive evaluation of NLP methods in the scientific domain. Further, we developed methods that improve performance on these tasks and thereby support scientists.
We introduced a few-shot information retrieval task and proposed a method in which a system learns refined query representations from a few user demonstrations, which is helpful for literature review. In 2024, we evaluated baseline systems that take document context into account when representing a single passage, which improves retrieval in scientific QA scenarios. We also introduced a pre-training method that enhances the numerical reasoning ability of Large Language Models (LLMs), which in turn improves performance on table-to-text generation tasks. Finally, we evaluated QA systems in collaboration with the DFG-funded UKP-SQuARE project. We developed a public demonstrator, including various datastores and QA models, that allows users to compose and analyze methods (Baumgärtner et al., 2022; Sachdeva et al., 2022; Puerto et al., 2023).

