This repository contains all the code and gold standards for creating the DBkWik dataset.
While popular knowledge graphs such as DBpedia and YAGO are built from Wikipedia, Wikifarms like Fandom host Wikis on specific topics, which often complement the information contained in Wikipedia, and thus in DBpedia and YAGO. Extracting these Wikis with the DBpedia extraction framework is possible, but results in many isolated knowledge graphs. In this paper, we show how to create one consolidated knowledge graph, called DBkWik, from thousands of Wikis. We perform entity resolution and schema matching, and show that the resulting large-scale knowledge graph is complementary to DBpedia.
Links to the individual resources:
- to the gold standards (in alignment format; see the Alignment API)
- to a CSV file listing all wikis available on Wikia/Fandom: Link
- to the unprocessed wiki dumps
- to the processed dump
- to the MTurk survey example (also available in the folder e_gold_mapping_interwiki)
- to the publicly available endpoint
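The gold standards above use the Alignment API format, an RDF/XML schema in which each correspondence is a `Cell` holding two entity URIs, a relation, and a confidence measure. As a minimal sketch of how such a file can be consumed, the following standard-library Python snippet parses the cells of an alignment document; the sample document and its URIs are made up for illustration and do not come from the actual gold standard files.

```python
import xml.etree.ElementTree as ET

ALIGN_NS = "http://knowledgeweb.semanticweb.org/heterogeneity/alignment"
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

def parse_alignment(xml_text):
    """Return (entity1, entity2, relation, measure) tuples from an
    Alignment API document given as a string."""
    root = ET.fromstring(xml_text)
    cells = []
    for cell in root.iter(f"{{{ALIGN_NS}}}Cell"):
        e1 = cell.find(f"{{{ALIGN_NS}}}entity1").get(f"{{{RDF_NS}}}resource")
        e2 = cell.find(f"{{{ALIGN_NS}}}entity2").get(f"{{{RDF_NS}}}resource")
        rel = cell.findtext(f"{{{ALIGN_NS}}}relation", default="=")
        measure = float(cell.findtext(f"{{{ALIGN_NS}}}measure", default="1.0"))
        cells.append((e1, e2, rel, measure))
    return cells

# Hypothetical single-cell alignment document (URIs are illustrative only):
SAMPLE = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://knowledgeweb.semanticweb.org/heterogeneity/alignment">
  <Alignment>
    <map>
      <Cell>
        <entity1 rdf:resource="http://wiki-a.example/resource/Foo"/>
        <entity2 rdf:resource="http://wiki-b.example/resource/Foo"/>
        <relation>=</relation>
        <measure rdf:datatype="http://www.w3.org/2001/XMLSchema#float">1.0</measure>
      </Cell>
    </map>
  </Alignment>
</rdf:RDF>
"""
```

The returned tuples can then be compared directly against the output of an entity-resolution system to compute precision and recall against the gold standard.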