Toolbox
A Brief Introduction to Data Science
Objectives
* Understand some basic concepts of Data Science and Data Analytics
* Describe the three important parts of the Data Analysis Process
* Enumerate some commercial and open source tools and systems used for Data Analytics
Topic Coverage
* Basic Concepts on Data Science
* The Data Analysis Process
Why Bother with Data Analysis?
Basic Concepts on Data Science
Technology has made everything measurable and available.
But who is capable of understanding and managing large quantities of data?
It's possible for a person to have vast quantities of data, but does that person UNDERSTAND their value?
What is Data Analysis?
Data Analysis is the process of:
1. Obtaining raw data
2. Converting data to information
3. Designing a model / analysis from the information
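The three steps above can be sketched in a few lines of Python. This is only a minimal illustration, not a prescribed method: the sample CSV data, the function names, and the choice of "average day-to-day change" as the model are all assumptions made up for this example.

```python
import csv
import io
from statistics import mean

# Hypothetical raw data: daily unit counts as CSV text (values are made up).
RAW_CSV = """day,units
Mon,12
Tue,15
Wed,18
Thu,21
"""

def obtain_raw_data(text):
    """Step 1: obtain raw data as a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))

def convert_to_information(rows):
    """Step 2: convert raw strings into typed values we can reason about."""
    return [int(row["units"]) for row in rows]

def design_model(units):
    """Step 3: a deliberately simple model, the average day-to-day change."""
    changes = [later - earlier for earlier, later in zip(units, units[1:])]
    return mean(changes)

rows = obtain_raw_data(RAW_CSV)
units = convert_to_information(rows)
trend = design_model(units)
print(units, trend)
```

In a real project each step would be far richer, but the shape stays the same: raw input, then cleaned values, then a model built from those values.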
Data Analysis is a Closed-Loop Process
The Goal of Data Analytics:
Data Visualization
Data Collection
Data Analysis Process
What is Data Collection?
DATA ACCURACY
DATA INTEGRITY
Data Collection Tools / Sources
* Google Forms
* KoBoToolBox
* ThinkUp
* Piwik
* Commercial Reporting Management Systems
* Company Archives
Data Processing & Analysis
Data Analysis Process
Data Processing is the manipulation of raw data, using techniques such as aggregation, conversion, and sorting, in order to derive a desired information set.
Data Processing also involves the creation of data models, which are used to prune the data and extract only the significant parts.
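As a rough illustration of the techniques named above (conversion, aggregation, sorting, and pruning to the significant parts), here is a small Python sketch. The records, the region names, and the threshold value are hypothetical and chosen only for this example; a real data model would be much more involved than a single cutoff.

```python
from collections import defaultdict

# Hypothetical raw records: (region, amount as text); values are made up.
raw_records = [
    ("north", "10.5"),
    ("south", "4.0"),
    ("north", "2.5"),
    ("east", "7.0"),
    ("south", "1.0"),
]

# Conversion: turn text amounts into numbers.
converted = [(region, float(amount)) for region, amount in raw_records]

# Aggregation: total amount per region.
totals = defaultdict(float)
for region, amount in converted:
    totals[region] += amount

# Sorting: order regions from largest to smallest total.
ranked = sorted(totals.items(), key=lambda item: item[1], reverse=True)

# Pruning: a stand-in for a data model, keeping only regions
# whose total meets an assumed cutoff.
THRESHOLD = 6.0
significant = [(region, total) for region, total in ranked if total >= THRESHOLD]

print(ranked)
print(significant)
```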
Big Data Analysis Platforms and Tools
1. Hadoop
2. MapReduce
3. GridGain
4. HPCC
5. Storm
6. Cassandra
7. HBase
8. MongoDB
9. Neo4j
10. CouchDB
Data Analysis and Presentation
Data Analysis Process
Data Visualization
Data Analysis Process
Once data has been analyzed and processed, it's important to present the information in a clear, concise manner, in the form of visualizations.
Dashboards: For presenting multiple viewpoints at once
Infographic: An artistic version of a dashboard
Word Cloud: A replacement for the bar graph
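A word cloud typically sizes each word by how often it appears, which is the same count a bar graph would plot. The sketch below shows that underlying computation using only the Python standard library. The sample sentence and the sizing rule (a base size plus a fixed step per occurrence) are assumptions for illustration; actual word cloud tools use their own layout and scaling.

```python
import re
from collections import Counter

# Made-up sample text for illustration only.
text = "Data analysis turns data into information, and information into insight."

# Lowercase and split into words so "Data" and "data" count as the same word.
words = re.findall(r"[a-z']+", text.lower())
counts = Counter(words)

# Assumed sizing rule: a base size plus a fixed step per occurrence.
BASE_SIZE = 10
STEP = 6
font_sizes = {word: BASE_SIZE + STEP * n for word, n in counts.items()}

# Print each word with its count and the size it would be drawn at.
for word, n in counts.most_common():
    print(f"{word:12} count={n} size={font_sizes[word]}")
```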
Sample Visualization Tools:
1. Dygraphs
2. ZingChart
3. InstantAtlas
4. Wolfram Alpha
5. Tableau Public
6. Visual.ly
Data Analysis == Information == Intelligence
Questions?
End of Presentation.
Information Sources:
http://en.wikipedia.org/wiki/Data_analysis
http://www.emc.com/microsites/bigdata/infographic.htm
http://www-01.ibm.com/software/data/bigdata/images/4-Vs-of-big-data.jpg
https://columbiadatascience.files.wordpress.com/2013/09/screen-shot-2013-09-16-at-1-33-52-pm.png
http://nesurv.com/pdfs/multimode_data_collection.pdf
http://ori.hhs.gov/education/products/n_illinois_u/datamanagement/dctopic.html
http://en.wikipedia.org/wiki/Data_processing_system
http://www.datamation.com/data-center/50-top-open-source-tools-for-big-data-1.html