Estimating missing temporal meta-information using Knowledge-Based-Trust

Oulabi, Yaser ; Bizer, Christian

Document Type: Conference or workshop publication
Year of publication: 2017
Book title: KDWEB 2017 : proceedings of the 3rd International Workshop on Knowledge Discovery on the WEB, Cagliari, Italy, September 11 to 12, 2017
Journal or publication series title: CEUR Workshop Proceedings
Volume: 1959
Page range: Paper 4
Conference title: 3rd International Workshop on Knowledge Discovery on the WEB
Location of the conference venue: Cagliari, Italy
Date of the conference: September 11-12, 2017
Editor: Armano, Giuliano
Place of publication: Aachen
Publishing house: RWTH
ISSN: 1613-0073
Publication language: English
Institution: School of Business Informatics and Mathematics > Wirtschaftsinformatik V (Bizer)
Subject: 004 Computer science, internet
Keywords (English): temporal meta-information, time-dependent data, knowledge-based-trust, knowledge bases
Abstract: A large number of HTML tables on the Web contain relational data which can be used to augment knowledge bases such as DBpedia, Yago, or Wikidata. A large part of this data is time-dependent, i.e., the correctness of a fact depends on a specific temporal scope. In order to use this data for knowledge base augmentation, we need temporal meta-information. Existing methods rely on timestamps within the table itself or its context as temporal meta-information. Yet, the relationship between these timestamps and the data within a table is often unclear. Additionally, timestamps are rather sparse, and there are many web tables for which no timestamps exist. Knowledge-Based-Trust (KBT) uses the overlap with ground truth to estimate the trustworthiness of a dataset. This paper introduces Timed-KBT, which overcomes the dependence on sparse and possibly misinterpreted timestamps by propagating temporal meta-information from a knowledge base to web table data using KBT. It also derives a trust score that estimates the correctness of the data and the assigned temporal meta-information. We evaluate Timed-KBT on the use case of fusing data from a large corpus of web tables for filling missing facts in a knowledge base. Our evaluation shows that Timed-KBT yields an increase in F0.25-Measure of 19.01% when compared to KBT and 9.44% when compared to a method that relies solely on timestamps extracted from the table and its context.
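
The F0.25-Measure named in the abstract is presumably the standard F-beta score with beta = 0.25, i.e. a weighted harmonic mean of precision and recall that weights precision more heavily than recall. The following is a minimal sketch under that assumption; the function name `f_beta` and the sample precision/recall values are illustrative and do not come from the paper.

```python
def f_beta(precision: float, recall: float, beta: float = 0.25) -> float:
    """Standard F-beta score: weighted harmonic mean of precision and recall.

    beta < 1 favours precision, beta > 1 favours recall.
    The default beta = 0.25 is assumed to match the F0.25-Measure
    mentioned in the abstract.
    """
    if not (0.0 <= precision <= 1.0 and 0.0 <= recall <= 1.0):
        raise ValueError("precision and recall must lie in [0, 1]")
    b2 = beta * beta
    denominator = b2 * precision + recall
    if denominator == 0.0:
        # Both precision and recall are zero; define the score as 0.
        return 0.0
    return (1.0 + b2) * precision * recall / denominator


if __name__ == "__main__":
    # Made-up values, only to show that beta = 0.25 rewards precision.
    print(f_beta(0.9, 0.5))  # high precision, lower recall
    print(f_beta(0.5, 0.9))  # lower precision, high recall
```

With beta = 0.25, a system with high precision and moderate recall scores higher than one with the values swapped, which suits a knowledge base augmentation setting where adding wrong facts is costlier than missing correct ones.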

