Krzysztof Bartoszek
(slides by Leif Jonsson and Måns Magnusson)
Linköping University
krzysztof.bartoszek@liu.se
18 September 2017
K. Bartoszek (STIMA LiU) STIMA LiU
Lecture 5
Input and output Basic I/O Cloud storage web APIs web scraping Shiny Relational Databases
Today
Basic I/O
Cloud storage
web scraping
Shiny
Relational Databases
Format, localization
http://www.joelonsoftware.com/articles/Unicode.html
The Absolute Minimum Every Software Developer Absolutely, Positively Must Know About Unicode and Character Sets (No Excuses!)
"Formats"
Localization
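One place where localization shows up in I/O is the decimal separator. The sketch below uses made-up data (the file contents and the column names `city` and `temp` are assumptions for illustration) written with semicolons and decimal commas, as spreadsheet software often does in Swedish locales. Base R's `read.csv2()` assumes exactly this convention.

```r
# Hypothetical data: semicolon-separated fields, decimal comma
lines <- c("city;temp", "Lund;7,5", "Kalmar;6,2")
path <- tempfile(fileext = ".csv")
writeLines(lines, path)

# read.csv2() defaults to sep = ";" and dec = ","
weather <- read.csv2(path, stringsAsFactors = FALSE)
str(weather)

# The same conventions spelled out explicitly with read.table()
weather2 <- read.table(path, header = TRUE, sep = ";", dec = ",",
                       stringsAsFactors = FALSE)
```

Reading the same file with `read.csv()` would instead give a single text column, because it expects commas as field separators.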
load()
save()
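A minimal sketch of how `save()` and `load()` fit together, using a temporary file. The object names `x` and `measurements` are made up for illustration. `saveRDS()` and `readRDS()` are not mentioned in the slides; they are shown only as a base R alternative for a single object.

```r
x <- c(1.5, 2.5, 3.5)
measurements <- data.frame(id = 1:3, value = x)

path <- tempfile(fileext = ".RData")
save(x, measurements, file = path)  # stores both objects under their names

rm(x, measurements)                 # remove them from the workspace
loaded <- load(path)                # restores the objects, returns their names
loaded

# Alternative for one object: the caller chooses the name on reading
rds_path <- tempfile(fileext = ".rds")
saveRDS(measurements, rds_path)
copy <- readRDS(rds_path)
```

Note that `load()` silently overwrites any existing objects with the same names, which is one reason to prefer `readRDS()` when only one object is stored.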
software/data            package
Excel                    XLConnect
SAS, SPSS, Stata, ...    foreign
XML                      XML
JSON (GeoJSON)           RJSONIO, rjson
Documents                tm
Maps                     sp
Images                   raster
Table: Format - R package
Cloud storage
Why?
Robust
Backups
Cloud computing
but
Localization
API Packages
Remote         package
General        downloader
GitHub         repmis, downloader
Dropbox        rdrop
Amazon         RAmazonS3
Google Docs    googlesheets
web APIs
application programming interface over HTTP
examples:
GitHub
Riksdagen
Statistics Sweden
RESTful
Basic principles:
Verbs
Verb      Description
GET       Get "data" from server
POST      Post "data" to server (to get something)
PUT       Update "data" on server
DELETE    Delete resource on server
Status codes
Code    Description
1XX     Information from server
2XX     Yay! Gimme data! (success)
3XX     Redirections
4XX     You failed (client error)
5XX     Server failed (server error)
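The status code classes above can be sketched as a small base R helper. The function name `status_class` is made up for illustration and is not part of any package.

```r
# Map HTTP status codes to the classes in the table above
status_class <- function(code) {
  stopifnot(is.numeric(code), code >= 100, code <= 599)
  classes <- c("information", "success", "redirection",
               "client error", "server error")
  classes[code %/% 100]  # the first digit selects the class
}

status_class(200)
status_class(404)
status_class(c(301, 503))
```

In practice the code comes from an HTTP client. Assuming the httr package (the slides do not name a client), `httr::status_code(httr::GET(url))` returns it for a GET request.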
http://www.linkoping.se/open/data/Luftkvalitet/
Linköping Luftkvalitet API
https://developers.google.com/maps/documentation/geocoding/intro
Google Maps Geocoding API
JSON
{
  "firstName": "John",
  "lastName": "Smith",
  "age": 25,
  "address": {
    "streetAddress": "21 2nd Street",
    "city": "New York",
    "state": "NY",
    "postalCode": "10021"
  },
  "phoneNumber": [
    { "type": "home", "number": "212 555" },
    { "type": "fax", "number": "646 555" }
  ],
  "newSubscription": false,
  "companyName": null
}
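A sketch of how a JSON record like this one maps onto R objects. It assumes the jsonlite package is installed; jsonlite is swapped in here for illustration and is not one of the JSON packages named in the slides. The record is shortened to keep the example small.

```r
library(jsonlite)  # assumption: installed; not named in the slides

json_text <- '{
  "firstName": "John",
  "age": 25,
  "address": { "city": "New York", "state": "NY" },
  "phoneNumber": [
    { "type": "home", "number": "212 555" },
    { "type": "fax",  "number": "646 555" }
  ],
  "newSubscription": false,
  "companyName": null
}'

person <- fromJSON(json_text)

person$age                   # JSON number -> numeric
person$address$city          # nested object -> nested list
person$phoneNumber           # array of objects -> data frame
person$newSubscription       # JSON false -> FALSE
is.null(person$companyName)  # JSON null -> NULL
```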
XML
<?xml version="1.0" encoding="utf-8"?>
<wikimedia>
  <projects>
    <project name="Wikipedia" launch="2001-01-05">
      <editions>
        <edition language="English">en.wikipedia.org</edition>
        <edition language="German">de.wikipedia.org</edition>
        <edition language="French">fr.wikipedia.org</edition>
        <edition language="Polish">pl.wikipedia.org</edition>
        <edition language="Spanish">es.wikipedia.org</edition>
      </editions>
    </project>
    <project name="Wiktionary" launch="2002-12-12">
      <editions>
        <edition language="English">en.wiktionary.org</edition>
        <edition language="French">fr.wiktionary.org</edition>
        <edition language="Vietnamese">vi.wiktionary.org</edition>
        <edition language="Turkish">tr.wiktionary.org</edition>
        <edition language="Spanish">es.wiktionary.org</edition>
      </editions>
    </project>
  </projects>
</wikimedia>
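A sketch of pulling attributes and text out of such a document with XPath. It assumes the xml2 package is installed; xml2 is used here in place of the XML package listed earlier, purely for illustration. The document is shortened to one project with two editions.

```r
library(xml2)  # assumption: installed; swapped in for the XML package

xml_input <- '<?xml version="1.0" encoding="utf-8"?>
<wikimedia>
  <projects>
    <project name="Wikipedia" launch="2001-01-05">
      <editions>
        <edition language="English">en.wikipedia.org</edition>
        <edition language="German">de.wikipedia.org</edition>
      </editions>
    </project>
  </projects>
</wikimedia>'

doc <- read_xml(xml_input)

# XPath: all <edition> nodes below the project named "Wikipedia"
editions <- xml_find_all(doc, "//project[@name='Wikipedia']//edition")

data.frame(
  language = xml_attr(editions, "language"),  # attribute value
  host     = xml_text(editions)               # element content
)
```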
web scraping
HTML
(ha)rvest
Download data
Parse data
Follow links
Fill out forms
Store crawling history
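The first two steps (download, parse) can be sketched with rvest. To keep the example self-contained, it parses a made-up HTML string instead of downloading a page; passing a URL to `read_html()` would fetch the page first. The example assumes rvest 1.0.0 or later for `html_elements()`.

```r
library(rvest)  # assumption: rvest >= 1.0.0 is installed

# Made-up page content, used instead of a real URL
html_input <- '<html><body>
  <h1>Courses</h1>
  <ul>
    <li><a href="/r-programming">Advanced R programming</a></li>
    <li><a href="/statistics">Statistics</a></li>
  </ul>
</body></html>'

page  <- read_html(html_input)         # parse the document
links <- html_elements(page, "li a")   # select nodes with a CSS selector

data.frame(
  title = html_text(links),            # visible link text
  href  = html_attr(links, "href")     # link target
)
```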
Scraping is fragile!
Difficulties and bad spiders
www.domain.se/robots.txt
Politeness
robot traps
JavaScript
delays
Shiny?
online or local
R as "backend"
Shiny?
https://www.rstudio.com/products/shiny/shiny-user-showcase/
Shiny Examples
How it works
Application
Reactive
MyAppName/server.R
MyAppName/ui.R
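A minimal sketch of the two parts of an application, assuming the shiny package is installed. For brevity both parts live in one script rather than in separate `ui.R` and `server.R` files; the input and output names `n` and `hist` are made up for illustration.

```r
library(shiny)

# ui.R part: what the user sees
ui <- fluidPage(
  sliderInput("n", "Number of observations", min = 10, max = 500, value = 100),
  plotOutput("hist")
)

# server.R part: R code that is re-run whenever an input it uses changes
server <- function(input, output) {
  output$hist <- renderPlot({
    hist(rnorm(input$n), main = paste(input$n, "normal draws"))
  })
}

# Creates the app object; printing it at the console, or passing it
# to runApp(), starts the application
app <- shinyApp(ui = ui, server = server)
```

The plot is reactive: because `renderPlot()` reads `input$n`, moving the slider triggers a redraw without any explicit event handling.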
Shiny Example
library(shiny)
# Examples with code
runExample("01_hello")
runExample("03_reactivity")
Publish Shiny
locally
zip-file in cloud
GitHub (see runGitHub())
Relational Databases
local or online
difficult to design
A good database
Easy to query
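To illustrate what "easy to query" can look like from R, here is a sketch using an in-memory SQLite database. It assumes the DBI and RSQLite packages are installed (the slides do not name a database package), and the tables `students` and `grades` with their contents are made up.

```r
library(DBI)  # assumption: DBI and RSQLite are installed

con <- dbConnect(RSQLite::SQLite(), ":memory:")

# Two made-up tables linked by the student id
dbWriteTable(con, "students",
             data.frame(id = 1:3, name = c("Anna", "Bo", "Cai"),
                        stringsAsFactors = FALSE))
dbWriteTable(con, "grades",
             data.frame(student_id = c(1L, 1L, 2L, 3L),
                        points = c(4, 5, 3, 5)))

# One SQL statement joins the tables and aggregates per student
result <- dbGetQuery(con, "
  SELECT s.name, AVG(g.points) AS mean_points
  FROM students AS s
  JOIN grades AS g ON g.student_id = s.id
  GROUP BY s.name
  ORDER BY s.name")

dbDisconnect(con)
result
```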