
International Journal of Engineering and Technical Research (IJETR)

ISSN: 2321-0869, Volume-2, Issue-5, May 2014

Big Data: Insight


Mrs. S.V. Balshetwar, Dr. R.M. Tugnayat
Manuscript received May 20, 2014.
Mrs. S.V. Balshetwar, Information Technology, Satara College of Engineering and Management, Limb, Satara, India.
Dr. R.M. Tugnayat, Principal, Shri Shankarprasad Agnihotri College of Engineering, Wardha, India.
317 www.erpublication.org

Abstract: In the world of information technology, data is the one thing that keeps growing. Data can be any set of facts or observations held in digitized form. The volume of data today runs to trillions of gigabytes. Every person who uses social media leaves abundant data on the internet, and companies analyze this data for many purposes, from gauging the sentiment of people purchasing a product to detecting fraud and securing company data. This paper gives an overview of big data and the technology around it.

Index Terms: Big data, Big data analytics, NoSQL, unstructured data.

I. INTRODUCTION

The world around us holds a large amount of data: classified data, structured data, unstructured data, homogeneous data and heterogeneous data. Much of the available data carries abundant information, which can be extracted on three main parameters: accuracy, timeliness and completeness of data.

In this digital world, data is generated every now and then by sensors and social media, which are responsible for the explosive growth of data. A survey [1] shows that the rate of data creation has increased so much that 90% of the data in the world today was created in the last two years. The data of today, known as big data, is the large and rapidly growing volume, variety, velocity and veracity of information that cannot be leveraged by existing RDBMS and data warehousing systems. Fig. 1 shows the 4 Vs of big data.

Fig. 1 4 Vs of Big Data (Volume, Velocity, Variety, Veracity)

Business people capture and analyze big data to add value to the process of decision making. It can also be used by consumer services, government agencies, capital markets, healthcare, etc.

Web-based companies are increasingly interested in analyzing this data. This is possible because of the decreasing cost of storage devices, the flexibility and cost effectiveness of data centers, and the recent development of new frameworks that integrate with new data management systems and offer improved analytical capabilities.

II. BIG DATA ANALYTICS

Traditionally, the data in a data warehouse (DW) is structured and is extracted from operational systems. Big data, in contrast, comes from tweets, sensors and mobile devices, which do not share the same structure: the data is multi-structured or unstructured. Analysis of structured data is easier than analysis of semi-structured, multi-structured or unstructured data. To extract value from big data, analysts use many advanced analytical techniques.

Traditional structured data is stored in an RDBMS, with SQL as the analysis tool. This approach cannot be used for big data, because big data arrives in great variety and volume from various sources. Thus there are multiple trends in the management and processing of big data: Hadoop and MapReduce offer alternatives to traditional data warehousing, while ADBMSs (analytic RDBMSs) and non-relational systems (sometimes called NoSQL systems) are available for processing multi-structured data [2][3].

Big data analytics is the process of applying advanced analytic techniques to very large unstructured data sets. Analyzing such data can produce operational and business value.

Fig. 2 shows the components of the big data analytics process.

Big data analytics investigates business operations at the finest level of detail, and the results find their way into standard reports.

III. BIG DATA TECHNOLOGIES TO HANDLE DATA

Data at rest and data in motion both come under big data, and the technologies for handling them are grouped accordingly: batch processing analytics for data at rest and stream processing analytics for data in motion.

Large volumes of data (gigabytes, terabytes, petabytes) stored in memory or on disk are called data at rest. Hadoop is one of the most popular technologies for batch processing. The Hadoop framework uses HDFS for storing large files and the MapReduce programming model for handling large-scale data processing problems that are distributed and parallelized. Large streams of data that arrive every second, minute or hour are called data in motion. Stream processing is a growing area of research and does not yet have a single dominant technology like Hadoop.
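The MapReduce programming model named in this section can be illustrated with a small word count. The following is only a minimal single-process sketch in Python, not Hadoop's API: the function names (map_phase, shuffle_phase, reduce_phase, word_count) and the input strings are assumptions made for illustration. On a real cluster each input split would be mapped on a different node and the shuffle would move data across the network.

```python
from collections import defaultdict
from itertools import chain


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine the values of one key into a single count."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle and reduce in sequence over all input splits."""
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    grouped = shuffle_phase(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


# Hypothetical input splits, standing in for blocks of a large file.
splits = ["big data at rest", "big data in motion"]
counts = word_count(splits)
print(counts)
```

Because each call to map_phase depends only on its own split, and each call to reduce_phase only on one key, both phases can run in parallel. This is the property that lets the model distribute work across many machines.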

Fig. 2 Big Data Analytics: Components (Data source 1 to Data source n; data extract, transformation and loading application; analytic database management system; data store + procedural models; application for analyzing big data)
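The components named in Fig. 2 can be read as a pipeline: several data sources feed an extract, transformation and loading step, which fills an analytic data store that an analysis application then queries. The following is a minimal sketch of that flow in Python under stated assumptions: the in-memory SQLite database from the standard library stands in for the analytic database management system, and the source records, the purchases table and all function names are hypothetical.

```python
import sqlite3

# Hypothetical data sources; in practice these would be files, APIs or sensors.
SOURCE_1 = [{"customer": "a", "amount": "10.5"}, {"customer": "b", "amount": "4"}]
SOURCE_N = [{"customer": "a", "amount": "2.5"}, {"customer": "c", "amount": "bad"}]


def extract(*sources):
    """Extract: yield raw records from every source in turn."""
    for source in sources:
        yield from source


def transform(records):
    """Transform: convert amounts to float and skip records that fail to parse."""
    for record in records:
        try:
            yield record["customer"], float(record["amount"])
        except ValueError:
            continue


def load(rows, connection):
    """Load: write the cleaned rows into the analytic store."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS purchases (customer TEXT, amount REAL)"
    )
    connection.executemany("INSERT INTO purchases VALUES (?, ?)", rows)
    connection.commit()


def total_per_customer(connection):
    """Analysis application: total purchase amount per customer."""
    cursor = connection.execute(
        "SELECT customer, SUM(amount) FROM purchases "
        "GROUP BY customer ORDER BY customer"
    )
    return cursor.fetchall()


connection = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_1, SOURCE_N)), connection)
totals = total_per_customer(connection)
print(totals)
```

The three steps are generators chained together, so records flow through one at a time and the sources are never held in memory all at once. The same shape carries over when the in-memory store is replaced by a real analytic database.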

IV. DATABASES FOR BIG DATA

The increasing volume of data is beyond the reach of traditional RDBMS models, but such data objects map more easily onto NoSQL databases. The databases used for big data storage are: key-value databases, document storage databases, graph databases, object databases, multi-model databases, column-oriented databases, and schema-less or NoSQL databases [2].

Databases used for efficient querying and storage of big data include Cassandra, CouchDB, Greenplum Database, HBase, MongoDB, Vertica, Aster Data, Hypertable, Bigtable, SAP HANA, InfoGrid, HyperGraphDB, AllegroGraph, Bigdata, Versant, db4o, Virtuoso, Terrastore, LevelDB, Couchbase Server, Berkeley DB, Voldemort, MemcacheDB, Amazon DynamoDB and Dynomite.

V. TOP LANGUAGES FOR BIG DATA ANALYTICS

The most popular languages continue to be R (used by 61% of KDnuggets readers), Python (39%) and SQL (37%). SAS is stable at around 20%. The highest growth was for Pig/Hive/Hadoop-based languages, R and SQL, while Perl, C/C++, Unix tools and Clojure also rank high in popularity.

VI. BIG DATA SOFTWARE

Platfora, Datameer, Hadoop, Spark, HP Vertica, MongoDB, Splunk, Tableau.

VII. TECHNOLOGIES ASSOCIATED WITH BIG DATA ANALYTICS

NoSQL databases, Hadoop and MapReduce. These technologies form the core of an open source software framework that supports the processing of large data sets across clustered systems [4].

VIII. BIG DATA PROCESSING REQUIREMENTS

Placing data requires a proper structure, and the requirements are:
1. Reduced time for loading data.
2. Increased speed of query processing.
3. Efficient use of storage space.
4. Handling of dynamic, unstructured data patterns.

IX. BIG DATA STORAGE ARCHITECTURE CONSIDERATIONS

To arrive at a proper solution for the storage of big data, the following factors can be considered:
1. Big data arrives at high speed, so the bandwidth requirement of the application is an important consideration.
2. What is the data type: structured, unstructured or mixed?
3. Is the big data considered for the application distributed or concentrated at one physical location?
4. How is the data organized? Is it stratified or not?
5. How is the data accessed? Is old data required often, and is it frequently mixed with access to new data?

X. HARDWARE AND SOFTWARE ARCHITECTURE TO HANDLE BIG DATA

The three primary architectures used to handle big data are [7]:
1. Symmetric Multiprocessing (SMP) solutions: This infrastructure uses multiple processors that share a common operating system and memory. It is the basis of most BI/DW environments.

2. Massively Parallel Processing (MPP) data warehousing appliances: In this infrastructure every processor has its own operating system and memory. It can grow horizontally and is used mainly for structured data.
3. NoSQL platforms: A NoSQL database, also called Not Only SQL, is an approach to data management and database design that is useful for very large sets of distributed data. NoSQL is especially useful when an enterprise needs to access and analyze massive amounts of unstructured data or data that is stored remotely on multiple virtual servers.

XI. BETTER USE OF BIG DATA

Like bacteria, big data are lurking in the stomachs of cows. Some farmers are using sensors and software to analyze it and predict when a cow is getting ill [8].

The casino company Caesars Entertainment uses data to spot when gamblers have lost so many times at the slot machines that they might not come back: "If the company can present, say, a free meal coupon to such customers while they're still at the slot machine, they are much more likely to return to the casino later."

London's Heathrow airport increased the number of on-time flights from 65% to 80% in just two months after using an algorithm to coordinate everything that goes into a flight turnaround process.

Telecom company Verizon has a unit that analyzes location data for other businesses, for example telling a basketball team where the fans at their stadium came from.

Manufacturing companies commonly embed sensors in their machinery to monitor usage patterns, predict maintenance problems, and enhance build quality. Studying these data streams allows them to improve their products and devise more accurate service cycles.

Insurance companies are now asking drivers to voluntarily contribute data that tracks their movement, locations, and where they are at various times of the day, so that they can develop better risk profiles for each customer. By showing that they drive within the speed limit, travel in areas that incur fewer accidents, and avoid high-crime areas, customers can qualify for a lower-cost insurance plan.

In multi-channel marketing and sentiment analysis, companies combine social media feeds, customer demographic information, psychographic data (values, attitudes, interests, or lifestyles), purchase data, and network usage data to paint a complete picture of each customer's behavior, likes, and dislikes. Harnessing this information helps retailers to understand each potential buyer as a market of one and to present personalized, tailored offerings to individual customers.

XII. CONCLUSION

We have entered an era of big data. Analyzing new and diverse digital data streams can reveal new sources of economic value, provide fresh insights and identify market trends. We hope that highlighting several technologies related to big data, and the importance of big data in today's world, also raises the importance of big data in academic research.

ACKNOWLEDGMENT

I would like to thank Prof. (Dr.) R.M. Tugnayat for his encouragement and comments on earlier drafts of this paper. Without the support of my loving husband V.P. Balshetwar (B.E. Electronics) and our kids Siddhesh and Katyayani, it would have been impossible to complete this work. I would also like to thank Dhiraj Mothghare, Vijay Urade and Raman Bane, who encouraged me in selecting this topic for research.

REFERENCES

[1] "Improving Decision Making in the World of Big Data," http://www.forbes.com/sites/christopherfrank/2012/03/25/improving-decision-making-in-the-world-of-big-data/
[2] Thoran Rodrigues, "10 emerging technologies for Big Data," Big Data Analytics, December 4, 2012.
[3] Merv Adrian and Colin White, "Analytic Platforms: Beyond the Traditional Data Warehouse," BeyeNETWORK Custom Research Report prepared for Vertica.
[4] http://hadoop.apache.org/
[5] http://www.facebook.com/press/info.php?statistics
[6] J. Dean and S. Ghemawat, "MapReduce: Simplified data processing on large clusters," in OSDI, 2004, pp. 137-150.
[7] http://www.gartner.com/technology/research/big-data/
[8] "How companies can make better use of big data," Los Angeles Times.

Mrs. S.V. Balshetwar is working as Head of the Information Technology department at Satara College of Engineering and Management, Limb, Satara. She completed her M.Tech (Computer Sci. & Technology) at Shivaji University, Kolhapur, India. She received her AMIE (Computer Sci. & Engg.) from the Institute of Engineers (India), Kolkata, India in 2008. Her research interests include data security, data mining, artificial intelligence and big data.

Dr. R.M. Tugnayat, Principal, Shri Shankarprasad Agnihotri College of Engineering, Wardha, India.
