
E10 – Knowledge Graphs in Practice

Lecturer: Tobias Käfer


Slides: Tobias Käfer, Kian Schmalenbach
CC BY 4.0
INSTITUT AIFB – WEB SCIENCE

KIT – The Research University in the Helmholtz Association


www.kit.edu
Quiz (1)
Decide whether the following statements are true or false:

a) Open data means that anyone should be able to purchase the data.
FALSE: Open means anyone can freely access, use, modify, and share the data
for any purpose.

b) A comma-separated values (CSV) file accessible via HTTP is 4-star open data.
FALSE: It is 3-star open data.

c) Openly licensed data published according to the Linked Data principles is
5-star open data.
TRUE
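
The star ratings in statements b) and c) can be checked mechanically. The following is a minimal sketch, assuming the usual cumulative reading of the five-star scheme (each star requires all lower ones); the class and field names are made up for illustration, and the CSV example assumes the file is openly licensed.

```python
from dataclasses import dataclass


@dataclass
class Publication:
    """Properties of a published data set (field names are illustrative)."""
    on_web_open_license: bool      # 1 star: on the web under an open license
    machine_readable: bool         # 2 stars: structured, machine-readable data
    non_proprietary_format: bool   # 3 stars: non-proprietary format such as CSV
    uses_rdf_and_uris: bool        # 4 stars: URIs and W3C standards such as RDF
    links_to_other_data: bool      # 5 stars: links to other data sources


def stars(p: Publication) -> int:
    """Count stars cumulatively: stop at the first criterion that is not met."""
    criteria = [
        p.on_web_open_license,
        p.machine_readable,
        p.non_proprietary_format,
        p.uses_rdf_and_uris,
        p.links_to_other_data,
    ]
    count = 0
    for met in criteria:
        if not met:
            break
        count += 1
    return count


# Assumption: the CSV file from statement b) is openly licensed.
csv_over_http = Publication(True, True, True, False, False)
linked_data = Publication(True, True, True, True, True)

print(stars(csv_over_http))  # 3
print(stars(linked_data))    # 5
```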

2 13.07.2020 Knowledge Graphs in Practice Institut für Angewandte Informatik und Formale
Beschreibungsverfahren (AIFB)
Quiz (2)
Decide whether the following statements are true or false:
d) The knowledge graph movement embraces a graph-structured approach
to representing data.
TRUE

5) Loose restrictions on syntax and technology for publishing structured data,
as put forward by Google, facilitate both easy publication and easy
consumption of the data.
FALSE: While data publication is made easy (publishers can see their data processed
even if it does not strictly follow the standards), consumers need a sophisticated
non-standard pipeline to clean the data before processing.

e) Extract-Transform-Load pipelines materialize data at a central location.


TRUE

f) In virtual integration systems, the data remains at the source.


TRUE

Quiz (3)
Decide whether the following statements are true or false:
7) All SPARQL query processors evaluate queries against a set of
endpoints.
FALSE: They can also use a link traversal engine to retrieve the data to be queried via
GET requests from HTTP servers.

8) SPARQL queries can be executed against a set of RDF graphs.


TRUE

9) According to the SPARQL recommendation, users must specify an endpoint for
a federated SPARQL query using the SERVICE clause.
TRUE: If there is a SERVICE clause in a SPARQL query, then federated execution is
performed; there is no other way to perform federated queries. Note that a FROM
clause that causes dereferencing is different from SERVICE: FROM in that case
sends a GET request to retrieve a graph, whereas SERVICE sends a query to an
endpoint to get SPARQL query results. If automatic source selection takes place for a
SPARQL query without SERVICE, then the source selection mechanism composes a
SPARQL query with SERVICE and then executes it (in a federated fashion).
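
The difference between dereferencing for FROM and querying for SERVICE shows up in the HTTP requests involved. The following is a rough sketch that only builds the two requests without sending them; both URLs and the helper function names are hypothetical, and the SERVICE request assumes the endpoint accepts queries via GET with a `query` parameter as in the SPARQL Protocol.

```python
from urllib.parse import urlencode
from urllib.request import Request

GRAPH_URL = "https://example.org/data/graph.ttl"  # hypothetical RDF document
ENDPOINT_URL = "https://example.org/sparql"       # hypothetical SPARQL endpoint


def dereference_request(graph_url: str) -> Request:
    """FROM with dereferencing: a plain GET that asks for the graph itself."""
    return Request(graph_url, headers={"Accept": "text/turtle"}, method="GET")


def service_request(endpoint_url: str, subquery: str) -> Request:
    """SERVICE: a query is sent to the endpoint, which answers with results."""
    url = endpoint_url + "?" + urlencode({"query": subquery})
    return Request(
        url,
        headers={"Accept": "application/sparql-results+json"},
        method="GET",
    )


from_req = dereference_request(GRAPH_URL)
svc_req = service_request(ENDPOINT_URL, "SELECT * WHERE { ?s ?p ?o } LIMIT 5")

# The requests are only constructed here, not sent.
print(from_req.get_method(), from_req.full_url)
print(svc_req.get_method(), svc_req.full_url)
```

In the first case the query processor evaluates the query locally over the retrieved graph; in the second case the remote endpoint evaluates the subquery and only the results travel back.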

Question 2
Consider the RDF document available at
https://www.govdata.de/ckan/dataset/stlae-service-14341-01-03-2.rdf,
printed partially in the following in Turtle serialization:
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
@prefix dcat: <http://www.w3.org/ns/dcat#> .
@prefix dct: <http://purl.org/dc/terms/> .
@prefix schema: <http://schema.org/> .

<https://ckan.govdata.de/dataset/246fb390-a74a-5155-9ec1-f123896b2e5d>
    rdf:type dcat:Dataset ;
    dct:title """Landtagswahlen: Berlin - Wahlberechtigte und –beteiligung,
Gültige Stimmen nach Parteien - Wahl-/Stichtag - regionale
Tiefe: Land""" ;
    dct:issued "2017-12-14T00:00:00"^^xsd:dateTime ;
    dct:modified "2019-02-07T02:01:25.760067"^^xsd:dateTime ;
    dct:temporal [
        rdf:type dct:PeriodOfTime ;
        schema:startDate "1995-10-22T00:00:00"^^xsd:dateTime ;
        schema:endDate "2016-09-18T23:59:59"^^xsd:dateTime ] .

Question 2 (2)
Consider the RDF document available at
https://www.govdata.de/ckan/dataset/stlae-service-14341-01-03-2.rdf,
printed partially in the following in Turtle serialization (continued):
<https://ckan.govdata.de/dataset/246fb390-a74a-5155-9ec1-f123896b2e5d/resource/0ed92201-030c-4f1b-a2d0-6574719ee357>
    rdf:type dcat:Distribution ;
    dcat:accessURL <https://www.regionalstatistik.de/genesisws/downloader/00/14341-01-03-2_00.csv> ;
    dct:description "CSV-Datei der Tabelle ’14341-01-03-2’" ;
    dct:format "CSV" ;
    dct:language "http://publications.europa.eu/resource/authority/language/DEU" ;
    dct:license <http://dcat-ap.de/def/licenses/dl-by-de/2.0> .

Question 2 (3)
Consider the RDF document available at
https://www.govdata.de/ckan/dataset/stlae-service-14341-01-03-2.rdf.
Based on the five-star schema, how many stars would you assign?

Solution:

Since the data is published on the web in RDF format but does not
include links to other Linked Data sources, we can assign 4 stars.

Question 3
Consider the CSV file at
https://www.regionalstatistik.de/genesisws/downloader/00/14341-01-03-2_00.csv,
printed partially in the following:
GENESIS-Tabelle: 14341-01-03-2
Landtagswahlen: Berlin - Wahlberechtigte und;;;;;;;;;;
-beteiligung, Gültige Stimmen nach Parteien;;;;;;;;;;
- Wahl-/Stichtag - regionale Tiefe: Land;;;;;;;;;;
Berlin;;;;;;;;;;
Landtagswahlen: Berlin;;;;;;;;;;
; Gültige Stimmen;Gültige Stimmen;Gültige Stimmen;Gültige Stimmen;Gültige Stimmen;…
;;;;Parteien;Parteien;Parteien;Parteien;Parteien;Parteien;Parteien
;;;;CDU/CSU;SPD;GRÜNE;FDP;DIE LINKE;AfD;Sonstige Parteien
;Anzahl;Prozent;Anzahl;Anzahl;Anzahl;Anzahl;Anzahl;Anzahl;Anzahl;Anzahl
18.09.2016;2485379;66,9;1635169;287997;352430;248324;109500;255701;231492;149725
18.09.2011;2469716;60,2;1461185;341158;413332;257063;26943;171050;-;251639
17.09.2006;2425480;58,0;1377355;294026;424054;180865;104584;185185;-;188641
21.10.2001;2417574;68,1;1623338;385692;481772;148066;160953;366292;-;80563
10.10.1999;2414493;65,5;1563576;637311;349731;155322;34280;276869;-;110063
22.10.1995;2479735;68,6;1669186;625005;393245;219990;42391;244196;-;144359

Question 3 (2)
Consider the CSV file at
https://www.regionalstatistik.de/genesisws/downloader/00/14341-01-03-2_00.csv.
Based on the five-star schema, how many stars would you assign?

Solution:

Since the file is available on the web in a non-proprietary format
and has a machine-readable structure, we can assign 3 stars.
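
To illustrate what "machine-readable structure" means here, the following sketch parses two of the data rows printed in Question 3 with Python's standard `csv` module. The names of the first three value columns (eligible voters, participation, valid votes) are an assumption based on the table title and the solution to Question 4, and a `-` cell is read as "no value".

```python
import csv
import io
from datetime import datetime

# One title line and two data rows copied from the listing in Question 3.
SAMPLE = (
    "Landtagswahlen: Berlin;;;;;;;;;;\n"
    "18.09.2016;2485379;66,9;1635169;287997;352430;248324;109500;255701;231492;149725\n"
    "18.09.2011;2469716;60,2;1461185;341158;413332;257063;26943;171050;-;251639\n"
)

PARTIES = ["CDU/CSU", "SPD", "GRÜNE", "FDP", "DIE LINKE", "AfD", "Sonstige Parteien"]


def to_int(cell):
    """Read a vote count; '-' is interpreted as a missing value."""
    return None if cell.strip() == "-" else int(cell)


def parse_rows(text):
    """Keep only lines that start with a date and have all eleven columns."""
    rows = []
    for record in csv.reader(io.StringIO(text), delimiter=";"):
        if len(record) != 11:
            continue
        try:
            day = datetime.strptime(record[0], "%d.%m.%Y").date()
        except ValueError:
            continue  # title and header lines do not start with a date
        rows.append({
            "date": day.isoformat(),
            "eligible": int(record[1]),
            # German decimal comma: "66,9" becomes 66.9
            "participation_percent": float(record[2].replace(",", ".")),
            "valid_votes": int(record[3]),
            "results": dict(zip(PARTIES, map(to_int, record[4:]))),
        })
    return rows


rows = parse_rows(SAMPLE)
print(rows[0]["date"], rows[0]["results"]["SPD"])
```

The amount of cleaning needed (skipping title lines, semicolon delimiter, decimal comma, `-` placeholders) shows that the file is structured but carries no explicit semantics.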

Question 4
Represent the information about parties and their election results
from Question 3 in RDF and describe how you would publish the
RDF graph as Linked Data.

Solution (Example):
_:bn1 :date "2016-09-18"^^xsd:date ;
    :eligible 2485379 ;
    :participation 0.669 ;
    :results [
        :afd 231492 ; :cdu 287997 ; :fdp 109500 ; :grüne 248324 ;
        :linke 255701 ; :sonstige 149725 ; :spd 352430
    ] ;
    :validVotes 1635169 .
_:bn2 :date "2011-09-18"^^xsd:date ;
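
The listing above breaks off after the first triple of _:bn2. One way the remaining rows could be produced in the same shape is sketched below for the 2011 row of the CSV file from Question 3. This is an illustration, not the official solution: the function name is made up, the participation is written as a fraction by analogy with _:bn1, and a missing result (the `-` cell) is simply left out.

```python
# The 2011 row from Question 3, keyed like the predicates used for _:bn1.
ROW_2011 = {
    "date": "2011-09-18",
    "eligible": 2469716,
    "participation": 0.602,   # 60,2 percent, written as a fraction
    "validVotes": 1461185,
    "results": {
        "cdu": 341158, "spd": 413332, "grüne": 257063, "fdp": 26943,
        "linke": 171050, "afd": None, "sonstige": 251639,
    },
}


def to_turtle(label, row):
    """Serialize one election row as Turtle, mirroring the shape of _:bn1."""
    results = " ; ".join(
        f":{party} {votes}"
        for party, votes in sorted(row["results"].items())
        if votes is not None  # no triple for a missing value
    )
    return (
        f'_:{label} :date "{row["date"]}"^^xsd:date ;\n'
        f'    :eligible {row["eligible"]} ;\n'
        f'    :participation {row["participation"]} ;\n'
        f'    :results [ {results} ] ;\n'
        f'    :validVotes {row["validVotes"]} .\n'
    )


turtle_2011 = to_turtle("bn2", ROW_2011)
print(turtle_2011)
```

The output still needs declarations for the `:` and `xsd:` prefixes before it parses as a complete Turtle document.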

Question 4 (2)
Represent the information about parties and their election results
from Question 3 in RDF and describe how you would publish the
RDF graph as Linked Data.

Solution (continued):

In order to publish the RDF graph as Linked Data, one would…

save the RDF graph to a file, e.g. a Turtle file,

upload the Turtle file to a web server,

and make sure to include URIs to other sources (e.g. Wikidata, GovData.de, …).
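
The publishing steps above can be sketched with the Python standard library alone. This is a minimal illustration under several assumptions: the `https://example.org/` namespace, the file name, and the helper names are hypothetical, the link to GovData.de reuses the document URL from Question 2, and a production setup would use a proper web server rather than `http.server`.

```python
import tempfile
from functools import partial
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer
from pathlib import Path

# A tiny graph that links to another source (the GovData.de document).
EXAMPLE_GRAPH = """\
@prefix : <https://example.org/election#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .

<#election-2016> :date "2016-09-18"^^xsd:date ;
    :validVotes 1635169 ;
    rdfs:seeAlso <https://www.govdata.de/ckan/dataset/stlae-service-14341-01-03-2.rdf> .
"""


class TurtleHandler(SimpleHTTPRequestHandler):
    """Static file handler that serves .ttl files with the Turtle media type."""
    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".ttl": "text/turtle",
    }


def save_graph(turtle, directory, filename="elections.ttl"):
    """Step 1: save the RDF graph to a Turtle file."""
    path = Path(directory) / filename
    path.write_text(turtle, encoding="utf-8")
    return path


def make_server(directory, port=8000):
    """Step 2: create a server for the directory; call serve_forever() to run it."""
    handler = partial(TurtleHandler, directory=str(directory))
    return ThreadingHTTPServer(("", port), handler)


# Demonstrate only the saving step; the server is not started here.
with tempfile.TemporaryDirectory() as tmp:
    saved_text = save_graph(EXAMPLE_GRAPH, tmp).read_text(encoding="utf-8")
```

Serving the file with the `text/turtle` media type lets clients that dereference the URI recognize the response as RDF.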

Question 5

The first task of the Bonus Exercise was:


1. Pick a tram station in Karlsruhe (please not everybody
Durlacher Tor)
2. Create a Turtle file to describe the tram station (incl.
name, latitude and longitude - I don't limit your
creativity regarding predicates)
3. Publish the file on your web space
Name and describe the steps necessary to build a
Linked Data integration system (virtual or
warehoused)

Question 5 (2)

Steps
1. Provide mappings between the sources (schema and
instance level)
2. Gather and reason over the data
3. Provide an interface
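The difference between virtual and warehoused integration lies in step 2:
a warehoused system gathers the data up front and materializes it at a
central location, whereas a virtual system leaves the data at the sources
and retrieves it at query time.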
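
The three steps can be sketched over two toy sources that describe the same tram station with different identifiers and predicates. All identifiers, predicates, and class names below are made up for illustration, the "reasoning" is reduced to rewriting terms according to the mappings, and the sources are in-memory lists standing in for Turtle files on the web.

```python
# Step 1: mappings between the sources (all names are hypothetical).
SCHEMA_MAP = {"b:label": "a:name"}                 # schema level
INSTANCE_MAP = {"b:station42": "a:DurlacherTor"}   # instance level


def apply_mappings(triples):
    """Rewrite subjects and predicates into the vocabulary of source A."""
    return {(INSTANCE_MAP.get(s, s), SCHEMA_MAP.get(p, p), o) for s, p, o in triples}


def gather(sources):
    """Step 2: fetch every source and apply the mappings."""
    triples = set()
    for fetch in sources:
        triples |= apply_mappings(fetch())
    return triples


class Warehouse:
    """Warehoused integration: gather once and keep a central copy."""

    def __init__(self, sources):
        self.triples = gather(sources)

    def query(self, subject, predicate):  # Step 3: the interface
        return sorted(o for s, p, o in self.triples if (s, p) == (subject, predicate))


class VirtualIntegration:
    """Virtual integration: data stays at the sources and is fetched per query."""

    def __init__(self, sources):
        self.sources = sources

    def query(self, subject, predicate):  # Step 3: the interface
        return sorted(o for s, p, o in gather(self.sources) if (s, p) == (subject, predicate))


data_a = [("a:DurlacherTor", "a:name", "Durlacher Tor")]
data_b = [("b:station42", "b:label", "Durlacher Tor (U)")]
sources = [lambda: list(data_a), lambda: list(data_b)]

warehouse = Warehouse(sources)
virtual = VirtualIntegration(sources)

# Source B changes after the warehouse has been loaded.
data_b.append(("b:station42", "b:label", "KA Durlacher Tor"))

print(warehouse.query("a:DurlacherTor", "a:name"))  # central copy, now stale
print(virtual.query("a:DurlacherTor", "a:name"))    # reflects the new label
```

The trade-off visible here is the usual one: the warehouse answers from its own copy without contacting the sources, while the virtual system always sees current data at the cost of fetching on every query.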

