18 Free Exploratory Data Analysis Tools For People who don’t code so
well
Introduction
Some of these tools are even better than programming (R, Python, SAS) tools.
All of us are born with special talents. It’s just a matter of time until we discover them and start believing in ourselves. We all
have limitations, but should we stop there? No.
When I started coding in R, I struggled, sometimes a lot more than one would think, because I had never coded
even "Hello World" in my entire life. My situation was similar to that of someone who didn’t know how to swim but was thrown
into the deep ocean, and who somehow saved himself from drowning but ended up gulping a lot of salty water.
Now when I look back, I laugh at myself. Do you know why? Because I could have chosen one of the several non-coding
tools available for data analysis and avoided the suffering.
Data exploration is an inevitable part of predictive modeling. You can’t make predictions unless you know what happened
in the past. The most important skill for mastering data exploration is ‘curiosity’, which is free of cost yet isn’t owned by
everyone.
I have written this article to help you discover the various free tools available for exploratory data analysis. Nowadays,
plenty of tools are available in the market that are free and quite interesting to work with. These tools don’t require you
to code explicitly; simple drag-and-drop clicks do the job.
1. Excel / Spreadsheet
Excel supports all the important features, such as summarizing data, visualizing data and data wrangling, which are powerful
enough to inspect data from all possible angles. No matter how many tools you know, Excel must feature in your armory.
Microsoft Excel is paid, but you can still try other spreadsheet tools such as OpenOffice and Google Docs, which
are certainly worth a try!
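For readers curious about what "summarizing data" amounts to behind the clicks, here is a minimal sketch in Python using only the standard library. The sample data, the column names and the summarize helper are all hypothetical and chosen for illustration; a spreadsheet produces the same figures through its built-in functions without any code.

```python
import csv
import io
import statistics

# Hypothetical sample data standing in for a small spreadsheet export.
SAMPLE_CSV = """region,units,price
North,12,9.50
South,7,11.00
North,3,9.75
East,15,8.25
"""


def summarize(csv_text, column):
    """Return count, mean, median, min and max for one numeric column."""
    rows = csv.DictReader(io.StringIO(csv_text))
    values = [float(row[column]) for row in rows]
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }


summary = summarize(SAMPLE_CSV, "units")
print(summary)
```

These are the same descriptive statistics that a spreadsheet's AVERAGE, MEDIAN, MIN and MAX functions return for a selected column.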
Free Download:
2. Trifacta
Trifacta’s Wrangler tool is challenging the traditional methods of data cleaning and manipulation.
Whereas Excel has limits on data size, this tool has no such boundaries, and you can securely
work on big data sets. It has incredible features such as chart recommendations, inbuilt
algorithms and analysis insights, with which you can generate reports in no time. It’s an intelligent tool
focused on solving business problems faster, thereby allowing us to be more productive at data
related exercises.
The availability of such free tools makes us feel more confident and supported, knowing that there are good people
around the world who are working extremely hard to make our lives better.
Free Download:
3. RapidMiner
This tool emerged as a leader in the 2016 Gartner Magic Quadrant for Advanced Analytics. Yes, it’s more
than a data cleaning tool: it extends its expertise to building machine learning models and includes
all the ML algorithms we use frequently. It is not just a GUI; it also supports people using
Python and R for model building.
It continues to fascinate people around the world with its remarkable capabilities. Above all, it claims to provide
a lightning-fast analytics experience. The product line includes several products built for big data, visualization and model
deployment, some of which (enterprise) require a subscription fee. In short, it’s a complete tool for any
business that needs to perform every task from data loading to model deployment.
Free Download:
4. Rattle GUI
If you tried using R but couldn’t get the hang of what’s going on, Rattle should be your first choice.
This GUI is built on R: install it by typing install.packages("rattle"), then launch it with library(rattle)
followed by rattle() in R. Therefore, to use Rattle you must install R. It’s also more than just a data mining tool.
Rattle supports various ML algorithms such as Tree, SVM, Boosting, Neural Net, Survival and Linear models.
It’s widely used these days. According to CRAN, rattle is installed 10,000 times every month. It provides
enough options to explore, transform and model data in just a few clicks. However, it has fewer options than SPSS for
statistical analysis, though SPSS is a paid tool.
Free Download:
5. Qlikview
Qlikview is one of the most popular tools in the business intelligence industry around the world. Deriving
business insights and presenting them in an awesome manner is what this tool does. With its state-of-the-art
visualization capabilities, you’d be amazed by the amount of control you get while working on data.
It has an inbuilt recommendation engine that updates you from time to time about the best visualization
methods while you work on data sets.
However, it is not statistical software. Qlikview is incredible at exploring data, trends and insights, but it can’t prove anything
statistically. For that, you might want to look at other software.
Free Download:
6. Weka
An advantage of using Weka is that it is easy to learn. For a machine learning tool, its interface is
intuitive enough for you to get the job done quickly. It provides options for data pre-processing,
classification, regression, clustering, association rules and visualization. Most of the steps you think
of while building a model can be achieved using Weka. It’s built on Java.
It was originally designed for research purposes at the University of Waikato, but it was later adopted
by more and more people around the world. However, over time I haven’t seen a Weka community as enthusiastic as those of R
and Python. The tutorial listed below should help you more.
Free Tutorial:
7. KNIME
Similar to RapidMiner, KNIME offers an open source analytics platform for analyzing data,
which can later be deployed and scaled using other supporting KNIME products. This tool has
an abundance of features for data blending, visualization and advanced machine learning
algorithms. Yes, you can also build models using this tool. Though there hasn’t been enough talk about this tool,
considering its state-of-the-art design, I think it will soon get the limelight it deserves.
Moreover, quick training lessons are available on their website to get you started with this tool right now.
Free Download:
8. Orange
As cool as it sounds, this tool is designed to produce interactive data visualizations and perform data
mining tasks. There are enough YouTube tutorials to learn this tool. It has an extensive library of data
mining tasks, which includes all the classification, regression and clustering methods. In addition, the
versatile visualizations formed during data analysis allow us to understand the data
more closely.
To build any model, you’ll be required to create a flowchart. This is interesting, as it helps us further understand the
exact procedure of data mining tasks.
Free Download:
9. Tableau Public
Tableau is data visualization software. We can say that Tableau and Qlikview are the most
powerful sharks in the business intelligence ocean. The race for superiority is never ending.
It’s fast visualization software that lets you explore data, down to every observation, using various
possible charts. Its intelligent algorithms figure out by themselves the type of data, the best method available and so on.
If you want to understand data in real time, Tableau can get the job done. In a way, Tableau imparts a colorful life to data
and lets us share our work with others.
Free Download:
12. OpenRefine
It started as Google Refine, but it looks like Google dropped the project for reasons that are unclear.
However, the tool is still available, renamed as OpenRefine. Among the generous list of open source
tools, OpenRefine specializes in messy data: cleaning, transforming and shaping it for predictive
modeling purposes. As an interesting fact, during model building, 80% of an analyst’s time is spent on data cleaning. Not so
pleasant, but it’s a fact. Using OpenRefine, analysts can not only save time but also put it to use for productive work.
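To give a feel for the kind of cleaning involved, here is a minimal sketch in Python. It is not how OpenRefine is implemented; the helper names and the list of placeholder tokens are assumptions made for illustration. It shows two common fixes: normalizing inconsistent spellings of the same label and mapping placeholder text to a proper missing value.

```python
def clean_label(raw):
    """Trim, collapse inner whitespace and lowercase a free-text label."""
    return " ".join(raw.split()).lower()


def clean_column(values, missing_tokens=("", "na", "n/a", "null")):
    """Normalize labels and map placeholder tokens to None (missing)."""
    cleaned = []
    for value in values:
        label = clean_label(value)
        cleaned.append(None if label in missing_tokens else label)
    return cleaned


# Hypothetical messy column: one city typed three ways, plus two blanks.
messy = ["  New York", "new  york ", "NEW YORK", "N/A", ""]
tidy = clean_column(messy)
print(tidy)
```

After this step the three spellings collapse into one category, which is the sort of result a cleaning tool aims for before any modeling starts.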
Free Download:
13. Talend
Decision making these days is largely driven by data. Managers and professionals no longer
make gut-based decisions. They require a tool which can help them quickly. Talend can help
them explore data and support their decision making. Precisely, it’s a data collaboration
tool capable of cleaning, transforming and visualizing data.
Moreover, it also offers an interesting automation feature where you can save and redo your previous task on a new data
set. This feature is unique and isn’t found in many tools. It also performs auto discovery and provides smart suggestions
to the user for enhanced data analysis.
Free Download:
A unique advantage of this tool is that the data set used for analysis doesn’t get stored in computer memory. This means you
can work on large data sets without running into speed or memory troubles.
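The general idea of working on data without holding it all in memory can be sketched in Python as follows. This is only an illustration of row-by-row processing under assumed column names; the source does not say how the tool itself achieves this, so its actual mechanism may differ.

```python
import csv
import io


def streaming_mean(lines, column):
    """Compute the mean of one column while reading one row at a time.

    Only a running total and a row count are kept, so memory use stays
    constant no matter how many rows the input has.
    """
    reader = csv.DictReader(lines)
    count = 0
    total = 0.0
    for row in reader:
        total += float(row[column])
        count += 1
    return total / count if count else None


# Hypothetical tiny input; in practice this would be an open file handle.
sample = io.StringIO("id,amount\n1,10\n2,20\n3,60\n")
mean_amount = streaming_mean(sample, "amount")
print(mean_amount)
```

The same function works unchanged on a file with millions of rows when it is passed an open file handle, because no row is retained after it has been added to the running total.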
Free Download:
15. DataCracker
It’s data analysis software that specializes in survey data. Many companies run surveys but
struggle to analyze them statistically. Survey data are never clean. They contain a lot of
missing and inappropriate values. This tool reduces our agony and improves the experience of working on messy data. The
tool is designed so that it can load data from all major internet survey programs such as SurveyMonkey and SurveyGizmo.
There are several interactive features which help to understand data better.
Free Download:
Along with supervised learning algorithms, it supports paradigms such as clustering, factorial analysis, parametric
and nonparametric statistics, association rules, and feature selection and construction algorithms. Its limitations
include the lack of a wide set of data sources, of direct access to data warehouses and databases, of data cleansing and of
interactive utilization.
Free Download:
18. H2o
H2o is one of the most popular software packages in the analytics industry today. In a few years, this organization has
succeeded in evangelizing the analytics community around the world. With this open source software,
they bring a lightning-fast analytics experience, which is further extended through APIs for programming languages. Beyond data
analysis, you can build advanced machine learning models in no time. The community support is great, so learning
this tool isn’t a worry. If you live in the US, chances are they are organizing a meetup near you. Do drop by!
Free Download:
Bonus Additions:
In addition to the awesome tools above, I also found some more tools which I thought you might be interested in.
These tools aren’t free, but you can still try them out on a trial basis:
1.
2.
3.
4.
End Notes
Once you start working with these tools (your choice), you’ll understand that knowing programming isn’t much of an
advantage for predictive modeling. You can accomplish the same thing with these open source tools. Therefore, if until now
you were disappointed at your lack of coding skills, now is the time to channel your enthusiasm into these tools. You
may be interested to check .
The only limitation I see with some of these tools is the lack of community support. Apart from a few, most of them
don’t have a community to turn to for help and suggestions. Still, they’re worth a try!
Did you like reading this article? Have you worked with any of the tools listed above? Which one do you think is the most
versatile? Drop your suggestions / opinions in the comments below.
21 Comments
Manish, I enjoy your articles a lot, comprehensive list and it would make life so much easier for non coders? Great job!
Very Informative. I will try Rattle GUI in R. Thanks for the information.
Keep posting….
Great article, I believe BGML is another one to lookout for as it is picking up pretty good pace with analysts and data
scientists. The awesome thing about this tool is that it lets you download the algorithm as a code which can be used
directly for predictions.
Can you provide the URL for BGML , I am not able to find it on internet
Thank you Manish for the nice and informative article for individuals like me who are not from coding background.
Hello Manish,
More than any of the above mentioned tools, I found Microsoft Azure ML studio very useful, user-friendly and easy to
learn.
It is free, cloud based and has support for R and Python.
Thank you for the article, Manish. I have Tableau Desktop, but I am always keeping my eye on what tools are available. I
enjoyed the article.
The section on DataWrapper is missing a link. Thanks for this excellent article.
I was wondering if we have similar kind of tools to do exploration of data in text format, specifically for NLP related
problems.
Thanks Manish
I wonder what about Nvivo and spss ? Are they involved?
Hari Galla says:
September 28, 2016 at 7:22 AM
Thank you manish, very informative article as always, will check out the trifacta tool.
Ramdas
Apache OpenOffice is not developed actively but LibreOffice is a good reincarnation and contains already several data
analysis tools – https://help.libreoffice.org/Calc/Data_Statistics_in_Calc
hi
i think this is a great website
© Copyright 2013-2018 Analytics Vidhya.