Ian Crosland
Jan 2016
Accelerate Big Data ROI
Qlik's platform drives higher ROI by delivering big data in context with
other data, so that big data stays relevant.
Partner for Success
Broad range of technology partnerships
And more
Hadoop
Apache Hadoop v2
HDFS2
Stinger Initiative
Hive on Tez
YARN integration
Distributed execution framework
Eliminate extra map reads
Dataflow model on DAG of nodes
Query Optimisations
Vectorised query execution
Filter at storage layer vs SQL engine
SQL cost based optimiser
ORCFile format
Higher compression
Columnar
Ideal for frequent fact filters
145 developers, 44 companies
Connect via ODBC
Source: http://hortonworks.com/labs/stinger/
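To illustrate "Filter at storage layer vs SQL engine", the sketch below skips whole blocks of a column when their min/max statistics rule out a match, which is the general idea behind statistics kept per stripe in columnar formats such as ORC. The `Stripe` class and `scan_equals` function are hypothetical names for illustration only, not part of Hive or ORC.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Stripe:
    """Hypothetical stand-in for one stripe of a single integer column."""
    values: List[int]

    @property
    def min(self) -> int:
        return min(self.values)

    @property
    def max(self) -> int:
        return max(self.values)


def scan_equals(stripes: List[Stripe], target: int) -> Tuple[List[int], int]:
    """Return the matching values and the number of stripes actually read."""
    matches: List[int] = []
    stripes_read = 0
    for stripe in stripes:
        # Storage-layer filter: the stats prove no row in this stripe can match.
        if target < stripe.min or target > stripe.max:
            continue
        stripes_read += 1
        matches.extend(v for v in stripe.values if v == target)
    return matches, stripes_read


stripes = [Stripe([1, 2, 3]), Stripe([10, 11, 12]), Stripe([20, 21, 22])]
matches, stripes_read = scan_equals(stripes, 11)
print(matches, stripes_read)
```

Only one of the three stripes is read here; a filter applied later in the SQL engine would have had to receive all nine values first.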
Impala
Parquet file format
Driven from Twitter use cases
Columnar data storage
Limits IO to data needed
Space saving
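A rough sketch of why columnar storage "limits IO to data needed": an aggregate over one column touches only that column's values, while a row layout reads every field of every row. The sample data and both helper functions are made up for illustration and do not use the Parquet API.

```python
# Hypothetical tweet records in a row-oriented layout.
rows = [
    {"user": "a", "lang": "en", "retweets": 3},
    {"user": "b", "lang": "de", "retweets": 7},
    {"user": "c", "lang": "en", "retweets": 1},
]

# The same data in a column-oriented layout: one list per column.
columns = {name: [row[name] for row in rows] for name in rows[0]}


def values_read_row_store(rows):
    """A row store reads every field of every row, whatever the query needs."""
    return sum(len(row) for row in rows)


def values_read_column_store(columns, wanted):
    """A column store reads only the columns the query refers to."""
    return sum(len(columns[name]) for name in wanted)


total_retweets = sum(columns["retweets"])
row_cost = values_read_row_store(rows)
col_cost = values_read_column_store(columns, ["retweets"])
print(total_retweets, row_cost, col_cost)
```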
Metastore
Can be same DB as Hive metastore e.g. MySQL
Query optimiser can use table/column stats
Can use HBase/HDFS with several file formats e.g. RCFile/Parquet
SQL Cost based optimisations
Authentication, AD/Kerberos
YARN integration
In memory caching
Impala Roadmap
Additional SQL support
S3 integration
Nested data
Connect via ODBC
Source: http://blog.cloudera.com/blog/2014/08/whats-next-for-impala-focus-on-advanced-sql-functionality/
Other Sources: http://blog.cloudera.com/blog/2015/02/how-to-do-real-time-big-data-discovery-using-cloudera-enterprise-and-qlik-sense/
Apache Drill
Alpha/Pre-alpha
Source: https://databricks.com/spark/about
Deeper Dive
Qlik Big Data Methodologies
Different data volumes and complexities are best met using different methods
On Demand App Generation
Different methods ensure an optimized experience for the user in every situation
In-Memory Segmentation
On Demand App Generation
Dimension selection to generate filtered analytics
2. User makes personalized selections from across many data sources
5. Highly interactive user experience within a purpose-built analysis app
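The selection-to-filtered-app flow can be sketched as turning the user's dimension selections into a parameterised query that loads only the matching slice of data. This is a minimal illustration, not Qlik's implementation: the `calls` table, its columns, and `build_filtered_query` are hypothetical, and an in-memory SQLite database stands in for the big data source.

```python
import sqlite3


def build_filtered_query(table, selections):
    """Build a parameterised SELECT from {dimension: [selected values]}.

    Table and dimension names are interpolated into the SQL text, so they
    must come from a trusted list; only the values are bound as parameters.
    """
    clauses, params = [], []
    for dimension, values in selections.items():
        if not values:
            continue  # no selection on this dimension means no filter
        placeholders = ", ".join("?" for _ in values)
        clauses.append(f"{dimension} IN ({placeholders})")
        params.extend(values)
    where = " AND ".join(clauses) if clauses else "1=1"
    return f"SELECT * FROM {table} WHERE {where}", params


# Hypothetical source table standing in for a large fact table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE calls (region TEXT, year INTEGER, minutes INTEGER)")
conn.executemany(
    "INSERT INTO calls VALUES (?, ?, ?)",
    [("North", 2015, 120), ("South", 2015, 80), ("North", 2014, 95)],
)

sql, params = build_filtered_query("calls", {"region": ["North"], "year": [2015]})
subset = conn.execute(sql, params).fetchall()
print(sql)
print(subset)
```

The resulting subset is what a purpose-built analysis app would hold in memory, rather than the full source table.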
Case Study - Telco
EDX
Publisher
Qlik Sense and Elastic Tweets Example
1. Tweets are populated into an Elastic DB via Logstash
2. User searches for Tweets stored in the Elastic DB from a custom web page
Diagram labels: Elastic APIs; Proxy, Engine and Repository; Web Page (Index.html)
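As a sketch of step 2, the custom web page could send a full-text search body like the one below to the index's `_search` endpoint. The `match` query is standard Elasticsearch query DSL; the field name `message`, the default result size, and the helper name `build_tweet_search` are assumptions for illustration, since the slides do not show the index mapping.

```python
import json


def build_tweet_search(text, size=10):
    """Build an Elasticsearch search body for tweets containing `text`.

    Assumes the tweet text is indexed in a field called "message";
    adjust the field name to the actual index mapping.
    """
    return {"size": size, "query": {"match": {"message": text}}}


body = build_tweet_search("big data")
print(json.dumps(body, indent=2))
```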
Architecture
External Authentication
Ports: 4242, 4243