Fig. 1 4 Vs of Big Data (labels recovered from the figure: Volume, Variety, Veracity)
Big data analytics investigates the most granular details of business operations, which then find their way into standard reports.
Data at rest and data in motion both come under big data; accordingly, the technologies for handling such data are grouped into batch processing analytics for data at rest and stream processing analytics for data in motion.
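The split between batch and stream processing can be illustrated with a minimal Python sketch. It is not Hadoop code and does not represent any particular framework; the function names and the sample readings are assumptions made only for illustration. The batch function sees the whole stored data set at once, while the streaming function updates its result as each record arrives.

```python
from typing import Iterable, Iterator, List


def batch_average(stored_readings: List[float]) -> float:
    """Batch style: the whole data set is at rest, so process it in one pass."""
    if not stored_readings:
        raise ValueError("no data at rest to process")
    return sum(stored_readings) / len(stored_readings)


def streaming_average(readings: Iterable[float]) -> Iterator[float]:
    """Stream style: update a running average as each record arrives."""
    count = 0
    total = 0.0
    for value in readings:
        count += 1
        total += value
        # Emit the current result without waiting for the stream to end.
        yield total / count


if __name__ == "__main__":
    readings = [10.0, 20.0, 30.0, 40.0]  # hypothetical sample values
    print(batch_average(readings))
    print(list(streaming_average(iter(readings))))
```

The batch result is available only after all data has been stored, whereas the streaming version yields an intermediate answer after every record, which is the property that matters for data in motion.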
Large volumes (gigabytes, terabytes, petabytes) of data stored in memory or on disk can be called data at rest. Hadoop is one of the most popular technologies for batch processing. The Hadoop framework makes use of HDFS for
Manuscript received May 20, 2014.
Mrs. S.V. Balshetwar, Information Technology, Satara College of Engineering and Management, Limb, Satara, India.
Dr. R.M. Tugnayat, Principal, Shri Shankarprasad Agnihotri College of Engineering, Wardha, India.
317 www.erpublication.org
Big Data: Insight
[Figure: labels recovered: Data source-n, Data store, Procedural models]
IV. DATABASES FOR BIG DATA
The increasing volume of data is beyond what traditional RDBMS models can handle, but such data objects map more easily onto NoSQL databases. Databases used for big data storage include key-value databases, document storage databases, graph databases, object databases, multi-model databases, column-oriented databases, and schema-less or NoSQL databases [2].
Databases used for efficient querying and storage of big data include Cassandra, CouchDB, Greenplum Database, HBase, MongoDB, Vertica, Aster Data, Hypertable, Bigtable, SAP HANA, InfoGrid, HyperGraphDB, AllegroGraph, Bigdata, Versant, db4o, Virtuoso, Terrastore, LevelDB, Couchbase Server, Berkeley DB, Voldemort, MemcacheDB, Amazon DynamoDB, and Dynomite.
V. TOP LANGUAGES FOR BIG DATA ANALYTICS
The most popular languages continue to be R (used by 61% of KDnuggets readers), Python (39%), and SQL (37%). SAS is stable at around 20%. The highest growth was for Pig/Hive/Hadoop-based languages, R, and SQL, while Perl, C/C++, Unix tools, and Clojure are also rising in popularity.
VI. BIG DATA SOFTWARE
VIII. BIG DATA PROCESSING REQUIREMENTS
A proper structure is required for placing data, and the requirements are:
1. Reduced time for loading data.
2. Faster query processing.
3. Efficient use of storage space.
4. Handling of dynamic, unstructured data patterns.
IX. BIG DATA STORAGE ARCHITECTURE CONSIDERATIONS
To arrive at a proper solution for storing big data, the following factors can be considered:
1. Big data arrives at high speed, so the bandwidth requirement of the application is an important consideration.
2. What is the data type: structured, unstructured, or mixed?
3. Is the big data considered for the application distributed or concentrated at one physical location?
4. How is the data organized: is it stratified or not?
5. How is the data accessed? Is old data required often, and is it frequently mixed with access to new data?
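Two of the storage considerations above, the bandwidth requirement and how often old data is accessed, reduce to simple arithmetic. The following Python sketch works through them under stated assumptions: the `Workload` fields, the sample numbers, and the `hot_threshold` cut-off are all hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass


@dataclass
class Workload:
    """Hypothetical description of an application's data flow."""
    events_per_second: float       # arrival rate of new records
    bytes_per_event: float         # average record size in bytes
    old_data_reads_per_day: float  # how often historical data is read


def required_bandwidth_mbps(w: Workload) -> float:
    """Ingest bandwidth in megabits per second (1 Mb = 1,000,000 bits)."""
    bits_per_second = w.events_per_second * w.bytes_per_event * 8
    return bits_per_second / 1_000_000


def storage_tier(w: Workload, hot_threshold: float = 100.0) -> str:
    """Suggest a tier for old data; the threshold is an illustrative assumption."""
    return "hot" if w.old_data_reads_per_day >= hot_threshold else "cold"


if __name__ == "__main__":
    # Hypothetical workload: 50,000 events per second of 500 bytes each,
    # with historical data read about 10 times per day.
    workload = Workload(50_000, 500, 10)
    print(required_bandwidth_mbps(workload))
    print(storage_tier(workload))
```

For this assumed workload the ingest rate is 50,000 × 500 × 8 = 200,000,000 bits per second, that is 200 Mbps, and the rarely read historical data falls in the "cold" tier. Real sizing would also have to account for replication, protocol overhead, and peak load, which this sketch ignores.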
International Journal of Engineering and Technical Research (IJETR)
ISSN: 2321-0869, Volume-2, Issue-5, May 2014
2. Massively Parallel Processing (MPP) data warehousing appliances: In this infrastructure every processor has its own operating system and memory; it can grow horizontally and is used mainly for structured data.
3. NoSQL platforms: A NoSQL database, also called Not Only SQL, is an approach to data management and database design that is useful for very large sets of distributed data. NoSQL is especially useful when an enterprise needs to access and analyze massive amounts of unstructured data or data that is stored remotely on multiple virtual servers.
XI. BETTER USE OF BIG DATA
Like bacteria, big data are lurking in the stomachs of cows. Some farmers are using sensors and software to analyze it and predict when a cow is getting ill [8].
The casino company Caesars Entertainment uses data to spot when gamblers have lost so many times at the slot machines that they might not come back: "If the company can present, say, a free meal coupon to such customers while they're still at the slot machine, they are much more likely to return to the casino later."
London's Heathrow airport increased the number of on-time flights from 65% to 80% in just two months after using an algorithm to coordinate everything that goes into a flight turnaround process.
Telecom company Verizon has a unit that analyzes location data for other businesses, for example telling a basketball team where the fans at their stadium came from.
Manufacturing companies commonly embed sensors in their machinery to monitor usage patterns, predict maintenance problems, and enhance build quality. Studying these data streams allows them to improve their products and devise more accurate service cycles.
Insurance companies are now asking drivers to voluntarily contribute data that tracks their movement, locations, and where they are at various times of the day so they can develop better risk profiles for each customer. By showing that they drive at the speed limit, travel in areas that incur fewer accidents, and avoid high-crime areas, customers can qualify for a lower-cost insurance plan.
In multi-channel marketing and sentiment analysis, companies combine social media feeds, customer demographic information, psychographic data (values, attitudes, interests, or lifestyles), purchase data, and network usage data to paint a complete picture of each customer's behavior, likes, and dislikes. Harnessing this information helps retailers to understand each potential buyer as a market of one and to present personalized, tailored offerings to individual customers.
XII. CONCLUSION
Data in today's world has raised the importance of Big Data in academic research also.
ACKNOWLEDGMENT
I would like to thank Prof. (Dr.) R.M. Tugnayat for his encouragement and comments on earlier drafts of this paper. Without the support of my loving husband V.P. Balshetwar (B.E. Electronics) and our kids Siddhesh and Katyayani, it would have been impossible to complete this work.
I would also like to thank Dhiraj Mothghare, Vijay Urade, and Raman Bane, who encouraged me in selecting this topic for research.
REFERENCES
[1] Improving Decision Making in the World of Big Data, http://www.forbes.com/sites/christopherfrank/2012/03/25/improvingdecision-making-in-the-world-of-big-data/
[2] Thoran Rodrigues, 10 emerging technologies for Big Data, in Big Data Analytics, December 4, 2012.
[3] Merv Adrian and Colin White, Analytic platforms: Beyond the Traditional Data Warehouse, BeyeNETWORK Custom Research Report prepared for Vertica.
[4] http://hadoop.apache.org/
[5] http://www.facebook.com/press/info.php?statistics
[6] J. Dean and S. Ghemawat, MapReduce: Simplified data processing on large clusters, in OSDI, 2004, pp. 137-150.
[7] http://www.gartner.com/technology/research/big-data/
[8] How companies can make better use of big data, Los Angeles Times.
Mrs. S.V. Balshetwar is working as Head of the Information Technology department at Satara College of Engineering and Management, Limb, Satara. She completed her M.Tech (Computer Sci. & Technology) at Shivaji University, Kolhapur, India. She received her AMIE (Computer Sci. & Engg.) from the Institute of Engineers (India), Kolkata, India in 2008. Her research interests include data security and data mining, Artificial Intelligence, and Big Data.
Dr. R.M. Tugnayat, Principal, Shri Shankarprasad Agnihotri College of Engineering, Wardha, India.