Beruflich Dokumente
Kultur Dokumente
ISSN:2320-0790
I.
INTRODUCTION
Big data analytics is the area where advanced
analytic Techniques operate on big data sets. It is
really about two things, Big data and analytics and
how the two have teamed up to Create one of the
most profound trends in business intelligence [2].
2. Number of Blocks
3. Heap Size
4. HDFS Bytes Read
5. Spilled Records
PERFORMANCE
HADOOP
ANALYSIS
S.
no
1
2
3
4
5
6
7
8
9
10
11
OF
1714
Size
of
data
(MB)
49.7
50
59.8
60.3
61.6
62.1
62.9
63.4
70
70.5
71.5
No of
blocks
6013
6013
6005
6013
6215
6198
6578
5360
6012
6023
6031
Heap
Size
(MB)
55.19
55.19
55.19
55.25
55.12
55.12
37.25
53.64
55.25
55.25
37.25
HDFS
bytes
read
50257
50550
60481
62018
62331
62758
63572
64122
70870
71293
72331
Spilled
records
10926
10948
13122
13170
13164
13108
13130
13130
15072
15300
15252
CPU
time
spent
5860
5810
6960
6960
6620
6950
6880
6810
7700
7530
8140
COMPUSOFT, An international journal of advanced computer technology, 4 (5), May-2015 (Volume-IV, Issue-V)
III.
VS
CPU
TIME
SPENT
SPILLED RECOREDS:
1715
COMPUSOFT, An international journal of advanced computer technology, 4 (5), May-2015 (Volume-IV, Issue-V)
IV.
FUTURE SCOPE
Though HDFS resolves the Big Data problem
efficiently but there is always scope for
improvements. Hadoop provides the Replication of
data to avoid data loss, thereby making the system
Fault Tolerant. The more number of copies are
created, the more the system becomes fault tolerant.
If a DataNode fails then map and reduce process of
that node is shifted to the other nodes carrying the
replicated data. But we have only one NameNodes
which contains the metadata of all the DataNodes,
and if this node gets crashes, then the processing
cannot takes place further. This weakness of Hadoop
is being resolved in Hadoop 2, by providing multiple
namespaces and multiple NameNodes in it.
REFRENCES
1716