
Introduction to open data journalism: finding stories in data

The Open Data Institute, London, 23 April, 2013


The SKOR Codex (2012), La Société Anonyme, ODI Commission Data as Culture
Slides by Lisa Evans and Kathryn Corrick

Introductions
Lisa Evans
Data Wrangler, School of Data, Open Knowledge Foundation
Former data journalist, Guardian

Kathryn Corrick
Training Business Manager, ODI
UK Chair, Online News Association



Data journalism packs


Go to: http://tinyurl.com/odi-dj


Telling stories with data


The SKOR Codex (2012), La Société Anonyme, ODI Commission Data as Culture

http://understandinguncertainty.org

http://understandinguncertainty.org/files/animations/Nightingale11/Nightingale1.html

Where does my money go?

http://wheredoesmymoneygo.org/bubbletree-map.html#/~/total/health

Data digging


Big leaks

Source: http://www.icij.org/offshore/how-icijs-project-team-analyzed-offshore-files

Gunter Sachs offshore network


http://www.icij.org/offshore/interactive-gunter-sachs-network

Data on maps can be enough


http://www.guardian.co.uk/news/datablog/interactive/2011/aug/09/uk-riots-incident-map

Riots and poverty mapped


http://www.guardian.co.uk/news/datablog/interactive/2011/aug/16/riots-poverty-map

http://www.guardian.co.uk/news/datablog/2012/feb/29/uk-hospital-heart-surgery-mortality-rate

Funnel plots
www.ncbi.nlm.nih.gov/pubmed/15568194
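
A funnel plot compares each unit's observed rate with a target rate, using control limits that narrow as the number of cases grows. The following is a minimal Python sketch, assuming the simple normal approximation to the binomial for the limits. The hospital figures and the target rate are invented for illustration and are not real data.

```python
import math

def funnel_limits(target_rate, n, z):
    """Approximate control limits for a proportion at sample size n.

    Uses the normal approximation to the binomial (an assumption made
    here for simplicity), clipped to the valid range 0..1.
    """
    half_width = z * math.sqrt(target_rate * (1 - target_rate) / n)
    return max(0.0, target_rate - half_width), min(1.0, target_rate + half_width)

def classify(deaths, operations, target_rate, z=1.96):
    """Label a unit as inside or outside the control limits."""
    lower, upper = funnel_limits(target_rate, operations, z)
    rate = deaths / operations
    if rate > upper:
        return "above limits"
    if rate < lower:
        return "below limits"
    return "within limits"

# Hypothetical units: (name, deaths, operations). Not real data.
units = [("Hospital A", 4, 100), ("Hospital B", 30, 300), ("Hospital C", 12, 400)]
TARGET = 0.04  # hypothetical overall mortality rate

results = {name: classify(d, n, TARGET) for name, d, n in units}
for name, label in results.items():
    print(name, label)
```

A unit outside the limits is a prompt for further questions, not a conclusion in itself; small units swing widely by chance, which is why the limits are wider for them.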

Exercise
Discuss in your groups what you have just seen.
Any surprises?

Any things to note?


What is data?

Photo: Lisa Evans



Defining data

Data is a record of some information, e.g. written or digital
Digital data is something you can keep on your computer
Raw data is exactly as it was collected from a source
Structured data is organised so it's easier to use, e.g. data in a spreadsheet
Big data is too big to be stored on one computer; instead you need parallel servers
Personal data relates to an individual who can be identified from that information

Open Definition
Open data is data that can be freely used, reused and redistributed by anyone, subject only, at most, to the requirement to attribute and sharealike.
Source: OpenDefinition.org


Break

Photo: Kathryn Corrick



Discussion: What makes a trusted (data) source?



Trusted data sources


Show their methods
Are open to inquiries and timely in their replies

The team includes a statistician


Have a good track record


Trusted data sources


Question everything


How to stay up to date with data releases


Office for National Statistics release calendar
Parliamentary releases mailing list
Planning alerts mailing list
RSS feeds
Press releases
(see your packs for links)
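
RSS feeds can also be checked with a short script rather than a reader. The following is a minimal Python sketch, assuming a standard RSS 2.0 layout; the sample feed, its titles and its example.org links are invented, and a real feed would be downloaded from the publisher's feed address instead.

```python
import xml.etree.ElementTree as ET

# Invented sample feed in RSS 2.0 layout; not a real publisher's feed.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example statistics releases</title>
    <item>
      <title>Labour market statistics</title>
      <link>http://example.org/releases/labour-market</link>
    </item>
    <item>
      <title>Crime statistics</title>
      <link>http://example.org/releases/crime</link>
    </item>
  </channel>
</rss>"""

def list_releases(feed_xml, keyword=None):
    """Return (title, link) pairs, optionally filtered by a keyword in the title."""
    root = ET.fromstring(feed_xml)
    releases = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        link = item.findtext("link", default="")
        if keyword is None or keyword.lower() in title.lower():
            releases.append((title, link))
    return releases

for title, link in list_releases(SAMPLE_FEED, keyword="crime"):
    print(title, link)
```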

UK law & licensing*

* What follows should not be taken or used as legal advice.


Photo: Jason Morrison, http://www.sxc.hu/photo/952313

Key laws affecting data journalism


Intellectual Property - copyright and database rights

Computer Misuse
Data Protection

Freedom of Information Act


What are intellectual property rights?


Rights granted by law that allow ownership of creations:
Patents
Trade marks
Design rights
Copyright
Database rights
Many creations are a bundle of rights, protected by more than one or all of the above

Copyright, Designs and Patents Act 1988

Original works, e.g. content, graphics, text, music
Gives the author exclusive rights to control the copying and exploitation of the work
Arises automatically
Fair dealing: criticism or review, reporting current events, non-commercial research, educational use
Beware the public domain assumption and myth


Database definition
A collection of independent works, data or other materials which are arranged in a systematic or methodical way and are individually accessible by electronic or other means
See: http://www.out-law.com/page-5698
The Copyright and Rights in Databases (Amendment) Regulations 2003: http://www.legislation.gov.uk/uksi/2003/2501/contents/made
The Copyright and Rights in Databases Regulations 1997: http://www.legislation.gov.uk/uksi/1997/3032/contents/made

Databases
Copyright
Creative effort and substantial investment in the selection and presentation
Individual components of the database

Database rights
Substantial investment in obtaining, verifying and presenting the database


Rule of thumb
Do you have rights or permission to publish?

Do you have rights to use the information/data?


Is the data derived from other sources?
(see licensing)


Computer Misuse Act


Offences
Unauthorised access to computer material

Unauthorised access with intent to commit or facilitate further offences


Unauthorised modification of computer material

Penalties
2 to 10 years' imprisonment

Fines

Rule of thumb
Leaks: get the legal team in


Data Protection
Personal Data
UK Data Protection Act 1998

Data relating to a living identifiable person must be processed fairly and lawfully
Processing that is not immediately apparent to users, e.g. cookies (new laws and guidance)
Damages available to data subjects


Rule of thumb
Does this data contain personally identifiable data?
Could this data be combined with another data set to create personally identifiable data?
Anonymisation is hard
Further reading: ODI Friday lectures on these topics
http://www.scribd.com/doc/128356210/Business-considerations-for-privacy-and-open-datahow-not-to-get-caught-out
http://www.scribd.com/doc/125638490/Getting-to-grips-withthe-National-Pupil-Database-personal-data-in-an-open-data-world
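
The risk of combining data sets can be shown with a small Python sketch. Everything here is hypothetical: the records, the field names and the choice of postcode, birth year and sex as the shared fields are assumptions made for illustration.

```python
# Hypothetical "anonymised" survey: names removed, but postcode district,
# birth year and sex are still present.
anonymised = [
    {"postcode": "EC2A", "birth_year": 1975, "sex": "F", "condition": "asthma"},
    {"postcode": "EC2A", "birth_year": 1982, "sex": "M", "condition": "diabetes"},
    {"postcode": "N1", "birth_year": 1982, "sex": "M", "condition": "none"},
]

# Hypothetical public list that carries names alongside the same fields.
public = [
    {"name": "Person One", "postcode": "EC2A", "birth_year": 1975, "sex": "F"},
    {"name": "Person Two", "postcode": "N1", "birth_year": 1982, "sex": "M"},
    {"name": "Person Three", "postcode": "N1", "birth_year": 1982, "sex": "M"},
]

SHARED_FIELDS = ("postcode", "birth_year", "sex")

def key(record):
    """Build a lookup key from the fields both data sets have in common."""
    return tuple(record[field] for field in SHARED_FIELDS)

def reidentify(anonymised_rows, public_rows):
    """Return (name, condition) pairs where the shared fields match exactly one person."""
    names_by_key = {}
    for person in public_rows:
        names_by_key.setdefault(key(person), []).append(person["name"])
    matches = []
    for row in anonymised_rows:
        names = names_by_key.get(key(row), [])
        if len(names) == 1:  # a unique match points at one individual
            matches.append((names[0], row["condition"]))
    return matches

matches = reidentify(anonymised, public)
print(matches)
```

Neither data set names a person next to a condition, yet the join does so for any combination of shared fields that is unique to one person.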


Licenses: what to look for


Licenses identify the scope and limits of how intellectual property can be used
Commonly used in the UK:

All rights reserved


Royalty free license

Paid-for license
Open Government Licence

Creative Commons License



Rule of thumb
If you are uncertain about what rights you have over a piece of content, data or a dataset, or how you can use it:
Contact the owner. Ask.


Exercise:
See journalist pack for trusted sources exercise:

http://tinyurl.com/odi-dj


Freedom of Information Act 2000


Provides public access to recorded information held by public authorities
The Act does not necessarily cover every organisation that receives public money
Recorded information includes printed documents, computer files, letters, emails, photographs, and sound or video recordings

FOIA tips
Sign up to 'What Do They Know?': https://www.whatdotheyknow.com/
Always check commercial confidentiality. See the Information Commissioner's Office advice:
http://www.ico.org.uk/~/media/documents/library/Environmental_info_reg/Practical_application/eir_confidentiality_of_commercial_or_industrial_information.ashx


Finding the story: choosing your data


Exercise: http://tinyurl.com/dj-tax-exercise

Exercise (optional)
Create a decision/story tree for local council spending or NHS reforms


Exercise
Find your data at: http://tinyurl.com/odi-dj
See: The Data, The Source

1. What does your data tell you?


2. Add a new sheet with your names and email addresses

Exercise: data cleaning


Remove people or signs from your data
Check spelling and clarity

Remove the dummy data column


Bold headings and freeze 1st row and maybe 1st column
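
The same cleaning steps can be scripted once a spreadsheet is exported as CSV. The following is a minimal Python sketch using only the standard library. The sample data, the column names and the reading of "signs" as currency signs are all assumptions made for illustration.

```python
import csv
import io

# Hypothetical raw spreadsheet export; column names and values are invented.
RAW_CSV = """Department , spend ,dummy
 Health,"£1,200",x
education,£800,x
Transport ,£450.50,x
"""

def clean_rows(raw_csv, drop_columns=("dummy",), numeric_columns=("Spend",)):
    """Tidy headings, drop unwanted columns and turn money strings into numbers."""
    reader = csv.reader(io.StringIO(raw_csv))
    header = [h.strip().title() for h in next(reader)]
    keep = [i for i, h in enumerate(header) if h.lower() not in drop_columns]
    cleaned = [[header[i] for i in keep]]
    for row in reader:
        values = []
        for i in keep:
            cell = row[i].strip()
            if header[i] in numeric_columns:
                # Assumed meaning of "signs": strip the pound sign and thousands commas.
                cell = float(cell.replace("£", "").replace(",", ""))
            else:
                cell = cell.title()  # consistent capitalisation for labels
            values.append(cell)
        cleaned.append(values)
    return cleaned

cleaned = clean_rows(RAW_CSV)
for row in cleaned:
    print(row)
```

Keeping the raw export untouched and writing the cleaned rows to a new file makes every cleaning step repeatable and easy to check.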


Exercise: using your data


Does it make sense to sum your data or is it already summed?
If it is already summed, move the totals to the bottom of the sheet; if not, calculate them.
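
Checking for an existing total matters because summing a column that already contains one double counts. The following is a minimal Python sketch, assuming two-column rows and that a total row is labelled "Total"; both the labelling convention and the figures are hypothetical.

```python
def ensure_total_row(rows, label_column=0, value_column=1):
    """Return the data rows followed by a single total row.

    Assumes a row is a total if its label is "total" (case-insensitive).
    That naming convention is an assumption; real data sets vary.
    """
    def is_total(row):
        return str(row[label_column]).strip().lower() == "total"

    data = [row for row in rows if not is_total(row)]
    existing = [row for row in rows if is_total(row)]
    computed = sum(row[value_column] for row in data)
    # If the source already has a total, check it against the rows.
    if existing and abs(existing[0][value_column] - computed) > 1e-9:
        raise ValueError("published total does not match the sum of the rows")
    return data + [["Total", computed]]

# Hypothetical sheet where the total sits at the top.
sheet = [["Total", 30], ["Health", 10], ["Education", 20]]
print(ensure_total_row(sheet))
```

A mismatch between a published total and the sum of its rows is worth a question to the source before it becomes a story.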


Exercise: using your data


Take two columns of your data and copy them to a new sheet
Go to the chart icon when on your new sheet
Choose a suitable chart, give it a title and label the axes

Does the chart show what you wanted it to?


What difficulties did you encounter and how did you solve them?


Break

Photo: Kathryn Corrick



Crowdsourcing data


Haiti 2010

http://blog.ushahidi.com/2012/01/12/haiti-and-the-power-of-crowdsourcing/
http://www.guardian.co.uk/technology/2010/feb/04/mapping-open-source-victor-keegan
http://wiki.openstreetmap.org/wiki/WikiProject_Haiti/Earthquake_map_resources
Images: http://irevolution.files.wordpress.com/2010/01/ex41.png



Boston bombing 2013

http://www.bbc.co.uk/news/technology-22214511
http://www.reddit.com/r/findbostonbombers
http://blog.reddit.com/2013/04/reflections-on-recent-bostoncrisis.html?m=1



Selection of tools
Ushahidi.com
Swiftriver.com

Crowdmap.com (closed beta at the moment)


Google Drive Forms and Spreadsheets

Twitter


Google Drive Demo


Ushahidi
Ushahidi was designed to easily crowdsource information using multiple channels, including SMS, email, Twitter and the web.

Ushahidi.com, http://vimeo.com/7838030

Crowdmap.com

https://womenundersiegesyria.crowdmap.com/

Exercise: Crowdsourcing data


Find the crowdsourcing exercise at:

http://tinyurl.com/odi-dj
Complete the form to see the results


Crowdsourcing tips

State in one sentence what you want to achieve with crowdsourcing


Have a clear procedure for verifying data

Can you identify individuals from your data presentation? What effect will this have?
People are more likely to join in if they feel in safe hands
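
One possible verification procedure is to treat a claim as verified only when several different people report it independently. The following is a minimal Python sketch of that rule; the threshold of two sources, the field names and the reports are hypothetical, and a real newsroom procedure would involve further checks.

```python
from collections import defaultdict

def verify_reports(reports, min_sources=2):
    """Mark each (place, category) claim as verified or unverified.

    A claim counts as verified when at least min_sources distinct
    submitters reported it. This rule is an illustrative assumption.
    """
    sources = defaultdict(set)
    for report in reports:
        claim = (report["place"].strip().lower(), report["category"])
        sources[claim].add(report["submitter"])
    return {
        claim: "verified" if len(who) >= min_sources else "unverified"
        for claim, who in sources.items()
    }

# Hypothetical reports; submitters are stored as IDs rather than names.
reports = [
    {"place": "High Street", "category": "road closed", "submitter": "u1"},
    {"place": "high street ", "category": "road closed", "submitter": "u2"},
    {"place": "Station Road", "category": "road closed", "submitter": "u1"},
    {"place": "Station Road", "category": "road closed", "submitter": "u1"},
]

status = verify_reports(reports)
for claim, label in status.items():
    print(claim, label)
```

Repeat reports from the same submitter do not count as corroboration, and storing IDs rather than names keeps individuals out of the published data.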


Visualising your data

https://google-developers.appspot.com/chart/interactive/docs/gallery

Exercise: Google Charts


Find the Google Charts exercise at:

http://tinyurl.com/odi-dj


https://github.com/mbostock/d3/wiki/Gallery


Time for questions


Thank you
Lisa Evans, @objectgroup, Lisa.Evans@okfn.org
Kathryn Corrick, @kcorrick, Kathryn.Corrick@theodi.org


Links mentioned on the course


http://OpenCorporates.com
http://Scribd.com
http://Slideshare.net
http://www.Prescribinganalytics.com
www.alltrials.net
RSS Readers: https://docs.google.com/spreadsheet/ccc?key=0ApTo6f5Yj1iJdFRfWmhUVjV0WkktTjJhUUE4dGR5WUE#gid=0
More data
http://data.worldbank.org/
http://Openstreetmap.org
http://Datawrapper.de