
CHORUS - Top-k Composition of Browsing Scripts for Efficient, Socially Aware Use of Web-Based Services

Applicant Dr. Sudhir Agarwal
Subject Area Software Engineering and Programming Languages
Funding Period 2013 to 2015
Project Identifier Deutsche Forschungsgemeinschaft (DFG) - Project number 241316025
 
Year of Creation 2019

Summary of Project Results

Users often need to quickly integrate and explore data from multiple sources in an ad hoc manner in order to perform planning tasks, make data-driven decisions, verify or falsify hypotheses, or gain entirely new insights. The data can be public or private, and structured or semi-structured. In particular, deep web pages can also contain useful personal data. We have presented a novel approach that enables end users to extract data from any web page, structure it, and store it locally. This is already useful on its own, since it allows structured search over the visited information at a later stage without visiting the same pages again. We have also shown how end users can integrate the extracted data with the help of Datalog rules, and how queries over the unified view of the data can be answered. Web pages that make semantic data available are a special case in which extraction can be done automatically, cleaning becomes almost obsolete, and the integration step is much easier. The proposed approach has been implemented and evaluated as a browser plugin. Our formalization of cleaning steps and integration rules enables their reuse and thus accelerates the cleaning and integration of extracted data.

We presented Dexter, a tool that empowers users to expressively explore structured data from various sources, such as local files, Web APIs, and databases, in an ad hoc manner. Dexter stores a user's data locally inside the user's browser, which ensures that users can combine their private and confidential data with public data without compromising their privacy. Dexter is also equipped with a client-side parallel algorithm for efficiently computing answers to queries that require data from multiple, possibly remote, sources.

Finally, we presented Jabberwocky, which addresses the lack of a tool that enables individuals to make use of the vast amount of data released under open data provisions by more and more companies and organizations. Jabberwocky integrates structured data from authoritative data sources into a triples-based conceptual model and can be operated entirely through any popular web browser. Jabberwocky enables its users to browse the integrated open data as a graph, supports highly expressive queries, and provides modern styles and visualizations for different types of objects in a flexible manner. To ensure adequate data exploration performance and high interactivity, Jabberwocky employs novel hybrid (server-side and client-side) data caching techniques. One surprise was that almost everyone, once they saw what they could do with Dexter and Jabberwocky, realized how much they cannot do with popular search engines.
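The rule-based integration mentioned in the summary can be illustrated with a minimal sketch. It is not the project's implementation: the relation names, the sample facts, and the two helper functions are hypothetical, and the Datalog rules are only shown as comments next to the Python set operations that stand in for them. The sketch assumes facts extracted from two web pages are stored locally as tuples and are then combined with a private relation.

```python
# Hypothetical facts extracted from two web pages and stored locally.
# Relation names and values are illustrative, not taken from the project.
concerts_site_a = {
    ("Jazz Night", "Karlsruhe", "2014-05-02"),
}
listings_site_b = {
    ("Open Air Rock", "Stuttgart", "2014-06-11"),
    ("Jazz Night", "Karlsruhe", "2014-05-02"),  # duplicate of a fact from site A
}

# Hypothetical private data that never leaves the local store.
friends_city = {
    ("Alice", "Karlsruhe"),
    ("Bob", "Berlin"),
}


def unified_events():
    """Unified view over both sources.

    Corresponds to the Datalog rules:
        event(T, C, D) :- concerts_site_a(T, C, D).
        event(T, C, D) :- listings_site_b(T, C, D).
    Set union removes the duplicate fact.
    """
    return concerts_site_a | listings_site_b


def events_near_friends():
    """Join private and public data on the city attribute.

    Corresponds to the Datalog rule:
        near(F, T) :- friends_city(F, C), event(T, C, _).
    """
    return {
        (friend, title)
        for (friend, city) in friends_city
        for (title, event_city, _date) in unified_events()
        if city == event_city
    }


if __name__ == "__main__":
    print(sorted(unified_events()))
    print(sorted(events_near_friends()))
```

Because the rules are written once against the unified `event` view, adding a third source in this sketch would only require one more union term, which is the kind of reuse the summary attributes to formalized integration rules.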

Project-Related Publications (Selection)

  • Extraction and integration of web data by end-users. CIKM 2013: 2405-2410
    Sudhir Agarwal, Michael R. Genesereth
    (See online at https://doi.org/10.1145/2505515.2505635)
  • Dexter: Plugging-n-Playing with Data Sources in Your Browser. AAAI Workshop on Semantic Cities, 2014
    Abhijeet Mohapatra, Sudhir Agarwal, Michael Genesereth
  • Rule-Based Exploration of Structured Data in the Browser. RuleML 2015: 161-175
    Sudhir Agarwal, Abhijeet Mohapatra, Michael R. Genesereth, Harold Boley
    (See online at https://doi.org/10.1007/978-3-319-21542-6_11)
 
 
