Analyzing real-world SPARQL queries and ontology-based data access in the context of probabilistic data

Schönfisch, Jörg ; Stuckenschmidt, Heiner

Document Type: Article
Year of publication: 2017
Journal or publication series: International journal of approximate reasoning
Volume: 90
Page range: 374-388
Place of publication: Amsterdam [u.a.]
Publisher: Elsevier
ISSN: 0888-613X
Publication language: English
Institution: School of Business Informatics and Mathematics > Praktische Informatik II (Stuckenschmidt 2009-)
Subject: 004 Computer science, internet
Keywords (English): Safeness of probabilistic queries ; SPARQL ; Probabilistic data ; Linked data
Abstract: Handling uncertain knowledge is crucial for modeling many real-world domains. Ontologies and ontology-based data access (OBDA) have proven to be versatile methods to capture this knowledge. Multiple systems for OBDA have been developed, and there is theoretical work towards probabilistic OBDA, namely on identifying queries that can be processed efficiently. These queries are called safe queries. However, there is no analysis of the safeness of probabilistic queries in real-world applications, and there is no tool support for applying the existing formalisms to the standard query language SPARQL. In this paper, we investigate queries collected from several public SPARQL endpoints and determine the distribution of safe and unsafe queries. This analysis shows that many queries in practice are safe, making probabilistic OBDA feasible and practical for fulfilling real-world users' information needs. Furthermore, we design and conduct benchmarks on real-world and generated data sets, which show that answering safe queries in the appropriate way scales to large amounts of data.

