

The Hadoop FAQ for Oracle DBAs

by Gwen Shapira (@gwenshap)

January 06, 2014


Oracle DBAs, get answers to many of your most common questions about getting started
with Hadoop.
As a former Oracle DBA, I get a lot of questions (most welcome!) from current DBAs in the
Oracle ecosystem who are interested in Apache Hadoop. Here are a few of the more frequently
asked questions, along with my most common replies.
How much does the IT industry value Oracle DBA professionals who have switched to Hadoop
administration, or added it to their skill set?
Right now, a lot. There are not many experienced Hadoop professionals around (yet)!
In many of my customer engagements, I work with the DBA team there to migrate parts of their
data warehouse from Teradata or Netezza to Hadoop. They don't realize it at the time, but while
working with me to write Apache Sqoop export jobs, Apache Oozie workflows, Apache Hive
ETL actions, and Cloudera Impala reports, they are learning Hadoop. A few months later, I'm
gone, but a new team of Hadoop experts who used to be DBAs is left in place.
My solutions architect team at Cloudera also hires ex-DBAs as solutions consultants or system
engineers. We view DBA experience as invaluable for those roles.
What do you look for when hiring people with no Hadoop experience?
I strongly believe that DBAs have the skills to become excellent Hadoop experts, but not just
any DBAs. Here are some of the characteristics I look for:

Comfort with the command line. Point-and-click DBAs and ETL developers need not apply.

Experience with Linux. Hadoop runs on Linux, so that's where much of the
troubleshooting will happen. You need to be very comfortable with the Linux OS, filesystem,
tools, and command line. You should understand OS concepts around memory
management, CPU scheduling, and IO.

Knowledge of networks. ISO layers, what ssh is really doing, name resolution, basic
understanding of switching.

Good SQL skills. You know SQL and you are creative in your use of it. Experience with
data warehouse basics such as partitioning and parallelism is a huge plus. ETL experience
is a plus. Tuning skills are a plus.

Programming skills. Not necessarily Java (see below). But, can you write a bash script?
Perl? Python? Can you solve a few simple problems in pseudo-code? If you can't code at
all, that's a problem.

Troubleshooting skills. This is huge, as Hadoop is far less mature than Oracle. You'll
need to Google error messages like a pro, but also be creative and knowledgeable about
where to look when Google isn't helpful.

For senior positions, we look for systems and architecture skills too. Prepare to
explain how you'll design a flight-scheduling system or something similar.

And since our team is customer facing, communication skills are a must. Do you
listen? Can you explain a complex technical point? How do you react when I challenge
your opinion?

Is that maybe too much to ask? Possibly. But I can't think of anything I could remove and still
expect success with our team.
How do I start learning Hadoop?
The first task we give new employees is to set up a five-node cluster in the AWS cloud. That's a
good place to start. Neither Cloudera Manager nor Apache Whirr is allowed; they make things
too easy.
The next step is to load data into your cluster and analyze it. I recommend following the tutorials
here, which show how to load Twitter data using Apache Flume and analyze it using Hive.

Also, Cloudera's QuickStart VM (download here) includes TPC-H data and queries. You can run
your own TPC-H benchmarks in the VM.
There are also some good books to help you get started. My favorite is Eric Sammer's Hadoop
Operations; it's concise and practical, and I think DBAs will find it very useful. The chapter on
troubleshooting is very entertaining. Other books that DBAs will find useful are Hadoop: The
Definitive Guide, Programming Hive, and Apache Sqoop Cookbook (all of which are authored or
co-authored by Clouderans).

I also recommend taking a Cloudera University training course or two and perhaps even getting
certified. Talking to a live instructor often provides insights that you can't find on your own.
For even more resources, see the New to Hadoop page on
Do I need to know Java?
Yes and no :)
You don't need to be a master Java programmer. I'm not, and many of my colleagues are not.
Some never write Java code at all.
You do need to be comfortable reading Java stack traces and error messages. You'll see many of
those. You'll also need to understand basic concepts like jars and classpath.
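To make the stack-trace point concrete, here is a minimal Python sketch of one common reading habit: skipping to the last "Caused by:" line, which usually names the most specific failure. The sample trace, the `com.example` class names, and the `root_cause` helper are all made up for illustration; this is not output from a real cluster. A `ClassNotFoundException` like the one below typically means a jar is missing from the classpath.

```python
import re

def root_cause(stack_trace: str) -> str:
    """Return the innermost exception line of a Java stack trace.

    Java chains exceptions with "Caused by:" lines; the last one is
    usually the most specific description of what went wrong.
    """
    causes = re.findall(r"^Caused by: (.+)$", stack_trace, flags=re.MULTILINE)
    if causes:
        return causes[-1].strip()
    # No chained cause: fall back to the first non-empty line,
    # which names the exception that was thrown.
    for line in stack_trace.splitlines():
        if line.strip():
            return line.strip()
    return ""

# Hypothetical trace written for this example (not real Hadoop output).
SAMPLE_TRACE = """java.lang.RuntimeException: job setup failed
\tat com.example.Job.run(Job.java:42)
Caused by: java.lang.ClassNotFoundException: com.example.MyMapper
\tat java.net.URLClassLoader.findClass(URLClassLoader.java:381)
"""

if __name__ == "__main__":
    print(root_cause(SAMPLE_TRACE))
```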
Being able to read Java source code is useful. Hadoop is open source, and digging into the code
often helps you understand why something works the way it does.
Even without mastery required, the ability to write Java is often useful. For example, Hive UDFs
are typically written in Java (and it's easier to do that than you think).

If you're an Oracle DBA interested in learning Hadoop (or working for Cloudera), this post
should get you started.
I'm happy to answer any other questions in comments!
Gwen Shapira is a Solutions Architect for Cloudera, and a former Oracle DBA.


11 Responses
Rakesh Tripathi / January 07, 2014 / 11:29 PM
Good and relevant information for Database Administrators. Thanks !

Egidio Ndabagoye / January 08, 2014 / 1:00 AM

Thanks Gwen. Very informative post.

Sridhar Govardhanan / January 15, 2014 / 8:56 AM

Thanks Gwen, it's a wonderful post. I am an Oracle DBA with 4 years of experience, I am
stuck in a dilemma whether to choose the path of Data Analytics or Hadoop Admin.
Given my DBA experience, can you guide me which would be my best option? Thank you.

Ramnath / February 06, 2014 / 11:40 PM

Thanks Gwen. I am an Oracle DBA struggling for job. but unfortunately i got an offer in
BigData based company.These information will help me a lot .

Amp / February 08, 2014 / 7:11 PM

Hi Gwen,
I am a 8+ years experienced Programmer with Java and open source technology and
looking to move into Hadoop ecosystem.
Will getting into Hadoop work for me, given that I don't have any DW or ETL
related experience.

Robin Dong / March 09, 2014 / 7:56 PM

Hi Gwen,
Your article here is very encourage, I am looking for Hadoop admin/developer job right
I am an Oracle prod/dev DBA and Sql developer for a long time, I just got my Hadoop
admin and developer certified.
I have a question for you. I tried to setup 2 nodes hadoop cluster at home. I have my
Internet(all IPs is dynamic IP).
I had no problem install Centos 6.2 and Cloudera Manager 4.5. However once I log onto
Cloudera manager, I add these 2 nodes to Hadoop, there is a error always popup: Can't
find scm server.
1. First of all, I like to know if possible I can just use my Internet and 3 machines at home
to setup this Hadoop cluster? what else I need for this setup?
2. Seemed, when log onto Cloudera Manager and add nodes, CM connect to to find package to load or so, I am afraid this dynamic IP on my
Internet service would not work that way.
3. If I am wrong on the above, please let me know what I can do to make this 2 nodes
cluster installed.

4. If I don't use Cloudera Manager, just pure Hadoop/java installed, do you think I can
make it?
Anyhow, basically, I am looking for a way to install Multiple nodes Hadoop at home.
Hope I don't need static IP for this.
BTW, I had no problem to install single note Hadoop on, not multiple
nodes yet.
Do you have any suggestion on how to find a Hadoop admin/developer job?
your help is greatly appreciate.

Justin Kestelyn (@kestelyn) / March 10, 2014 / 1:53 PM

I recommend you post these questions at

Robin Dong / July 24, 2014 / 4:29 PM

I am looking for the best tool for transfer data from Oracle SAP databases, do you have
any recommendation?
1. Tansfer the log file
2. Extract Oracle/SAP data from tables
Looking forward to hearing from you.

shashi / March 04, 2015 / 3:52 PM

Hello Gwen,
I am very much impressed with your article.
I am working as Oracle DBA with 4 yrs of experience and maintaining huge database.
and also having a knwledge on Hadoop..
I have one question. Is it possible to integrate my Oracle database with Hadoop. If so can
you please suggest, if we can go ahead with this. Please suggest any documents i can
Thank you very much.

Justin Kestelyn (@kestelyn) / March 05, 2015 / 12:16 PM

The best way to move data between Oracle and HDFS is via Apache Sqoop
(which ships inside CDH). There is also a certified Sqoop-based connector

Justin Kestelyn (@kestelyn) / February 26, 2015 / 9:16 AM

Or, consider the cloud-based approach:

Hadoop FAQ: But What About the DBAs?

Jan 24, 2013 / By Gwen Shapira
Tags: Big Data, DBA Lounge, Group Blog Posts, Hadoop
There is one question I hear every time I make a presentation about Hadoop to an audience of
DBAs. This question was also recently asked in LinkedIn's DBA Manager forum, so I finally
decided to answer it in writing, once and for all.
As we all see there are lot of things happening on Big Data using Hadoop etc.
Can you let me know where do normal DBAs like fit in this :
DBAs supporting normal OLTP databases using Oracle, SQL Server databases
DBAs who support day to day issues in Datawarehouse environments .
Do DBAs need to learn Java (or) Storage Admin ( like SAN technology ) to get into Big Data ?
I hear a few questions here:

Do DBAs have a place at all in Big Data and Hadoop world? If so, what is that place?
Do they need new skills? Which ones?

Let me start by introducing everyone to a new role that now exists in many organizations:
Hadoop Cluster Administrator.
Organizations that did not yet adopt Hadoop sometimes imagine Hadoop as a developer-only
system. I think this is the reason why I get so many questions about whether or not we need to
learn Java every time I mention Hadoop. Even within Pythian, when I first introduced the idea of
Hadoop services, my managers asked whether we will need to learn Java or hire developers.
Organizations that did adopt Hadoop found out that any production cluster larger than 20-30
nodes requires a full-time admin. This admin's job is surprisingly similar to a DBA's job: he is
responsible for the performance and availability of the cluster, the data it contains, and the jobs
that run there. The list of tasks is almost endless and also strangely familiar: deployment,
upgrades, troubleshooting, configuration, tuning, job management, installing tools, architecting
processes, monitoring, backups, recovery, etc.
I did not see a single organization with a production Hadoop cluster that didn't have a full-time
admin, but if you don't believe me, note that Cloudera is offering Hadoop Administrator
Certification and that O'Reilly is selling a book called Hadoop Operations.
So you are going to need a Hadoop admin.
Who are the candidates for the position? The best option is to hire an experienced Hadoop
admin. In 2-3 years, no one will even consider doing anything else. But right now there is an
extreme shortage of Hadoop admins, so we need to consider less perfect candidates. The usual
suspects tend to be junior Java developers, sysadmins, storage admins, and DBAs.
Junior Java developers tend not to do well in a cluster admin role, just like PL/SQL developers
rarely make good DBAs. Operations and dev are two different career paths that tend to attract
different types of personalities.
When we get to the operations personnel, storage admins are usually out of consideration
because their skillset is too unique and valuable to other parts of the organization. I've never seen
a storage admin who became a Hadoop admin, or any place where it was even seriously
considered.
I've seen both DBAs and sysadmins becoming excellent Hadoop admins. In my highly biased
opinion, DBAs have some advantages:

Everyone knows DBA stands for "Default Blame Acceptor." Since the
database is always blamed, DBAs typically have great troubleshooting skills,
processes, and instincts. All of these are critical for good cluster admins.

DBAs are used to managing systems with millions of knobs to turn, all of which
have a critical impact on the performance and availability of the system.
Hadoop is similar to databases in this sense: tons of configurations to fine-tune.

DBAs, much more than sysadmins, are highly skilled in keeping developers in
check and making sure no one accidentally causes critical performance
issues on an entire system. This skill is critical when managing Hadoop
clusters.

DBA experience with DWH (especially Exadata) is very valuable. There are
many similarities between DWH workloads and Hadoop workloads, and
similar principles guide the management of the system.

DBAs tend to be really good at writing their own monitoring jobs when
needed. Every production database system I've seen has a crontab file full of
customized monitors and maintenance jobs. This skill continues to be critical
for a Hadoop system.
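As a rough illustration of the kind of customized monitor mentioned in the last point, here is a minimal Python sketch of a disk-usage check that could be scheduled from cron. The `check` helper, the 90% threshold, and the message format are assumptions made for this example; a real monitor would check whatever matters on your cluster and send its alerts somewhere more useful than standard output.

```python
import shutil
from typing import Tuple

def usage_percent(path: str) -> float:
    """Percentage of the filesystem holding `path` that is in use."""
    total, used, _free = shutil.disk_usage(path)
    if total == 0:
        return 0.0  # Guard against pseudo-filesystems that report zero size.
    return 100.0 * used / total

def check(path: str, threshold: float) -> Tuple[bool, str]:
    """Return (ok, message); ok is False once usage reaches the threshold."""
    pct = usage_percent(path)
    ok = pct < threshold
    status = "OK" if ok else "ALERT"
    return ok, f"{status}: {path} is {pct:.1f}% full (threshold {threshold:.0f}%)"

if __name__ == "__main__":
    # Hypothetical defaults: watch the root filesystem, alert at 90%.
    _ok, message = check("/", 90.0)
    print(message)
```

A crontab entry would simply run the script on a schedule; keeping `check` as a plain function makes it easy to reuse for several mount points or to test without touching cron.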

To be fair, sysadmins also have important advantages:

They typically have more experience managing a huge number of machines
(much more so than DBAs).

They have experience working with configuration management and
deployment tools (Puppet, Chef), which is absolutely critical when managing
large clusters.

They can feel more comfortable digging in the OS and network when
configuring and troubleshooting systems, which is an important part of
Hadoop administration.

Note that in both cases I'm talking about good, experienced admins, not those who can just click
their way through the UI. I mean those who really understand their systems and much of what is going
on outside the specific system they are responsible for. You need DBAs who care about the OS,
who understand how hardware choices impact performance, and who understand workload
characteristics and how to tune for them.
There is another important role for DBAs in the Hadoop world: Hadoop jobs often get data from
databases or output data to databases. Good DBAs are very useful in making sure this doesn't
cause issues. (Even small Hadoop clusters can easily bring down an Oracle database by starting
too many full-table scans at once.) In this role, the DBA doesn't need to be part of the Hadoop
team as long as there is good communication between the DBA and Hadoop developers and
admins.
What about Java?
Hadoop is written in Java, and a fairly large number of Hadoop jobs will be written in Java too.
Hadoop admins will need to be able to read Java error messages (because this is typically what
you get from Hadoop), understand concepts of Java virtual machines and a bit about tuning
them, and write small Java programs that can help in troubleshooting. On the other hand, most
admins don't need to write huge amounts of Hadoop code (you have developers for that), and for
what they do write, non-Java solutions such as Streaming, Hive, and Pig (and Impala!) can be
enough. My experience taught me that good admins learn enough Java to work on a Hadoop
cluster within a few days. There's really not that much to know.
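To show what a non-Java job can look like, here is a minimal word-count sketch in Python written in the style Hadoop Streaming expects: the mapper reads lines from standard input and emits tab-separated key/value pairs, and the reducer receives those pairs sorted by key. The function names and the `map`/`reduce` command-line switch are choices made for this example, not part of Hadoop.

```python
import sys
from itertools import groupby
from typing import Iterable, Iterator

def map_lines(lines: Iterable[str]) -> Iterator[str]:
    """Mapper: emit 'word<TAB>1' for every whitespace-separated word."""
    for line in lines:
        for word in line.split():
            yield f"{word.lower()}\t1"

def reduce_lines(lines: Iterable[str]) -> Iterator[str]:
    """Reducer: sum the counts for each word.

    Input must already be sorted by key, which is what the framework
    provides between the map and reduce phases.
    """
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines)
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"

# Run as "script.py map" or "script.py reduce"; with no argument nothing happens.
if __name__ == "__main__" and len(sys.argv) == 2 and sys.argv[1] in ("map", "reduce"):
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for out in step(sys.stdin):
        print(out)
```

Before submitting anything to a cluster, the same logic can be tried locally by piping a text file through the map step, `sort`, and then the reduce step, which mimics the sorted hand-off between the two phases.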
What about SAN technology?
Hadoop's storage system is very different from SAN and generally uses local disks (JBOD), not
storage arrays and not even RAID. Hadoop admins will need to learn about HDFS, Hadoop's file
system, but not about traditional SAN systems. However, if they are DBAs or sysadmins, I
suspect they already know far too much about SAN storage.
So what skills do Hadoop Administrators need?
First and foremost, Hadoop admins need general operational expertise such as good
troubleshooting skills, understanding of systems capacity, bottlenecks, basics of memory, CPU,
OS, storage, and networks. I will assume that any good DBA has these covered.

Second, good knowledge of Linux is required, especially for DBAs who spent their life working
with Solaris, AIX, and HP-UX. Hadoop runs on Linux. They need to learn Linux security,
configuration, tuning, troubleshooting, and monitoring. Familiarity with open source
configuration management and deployment tools such as Puppet or Chef can help. Linux
scripting (Perl / bash) is also important: they will need to build a lot of their own tools here.
Third, they need Hadoop skills. There's no way to avoid this :) They need to be able to deploy
a Hadoop cluster, add and remove nodes, figure out why a job is stuck or failing, configure and
tune the cluster, find the bottlenecks, monitor critical parts of the cluster, configure name-node
high availability, pick a scheduler and configure it to meet SLAs, and sometimes even take
backups.
So yes, there's a lot to learn. But very little of it is Java, and there is no reason DBAs can't do it.
However, with Hadoop Administrator being one of the hottest jobs in the market (judging by my
LinkedIn inbox), they may not stay DBAs for long after they become Hadoop admins.
Any DBAs out there training to become Hadoop admins? Agree that Java isn't that important?
Let me know in the comments.



44 Responses to "Hadoop FAQ: But What About the DBAs?"

January 24, 2013 at 10:14 pm

Hi Gwen! I prefer DBA = Does Basically Anything. Agree on the Hadoop Admin role,
and I am proceeding full speed ahead into the world of Hadoop! Excellent article.

Uwe Hesse
January 25, 2013 at 2:04 am

Hi Gwen,
very interesting and helpful article!

I encounter some worried DBAs myself who want to know exactly what you addressed
here: How is Hadoop affecting my job role? You make it clear that it is actually a great
chance for them :)
Thank you &
Kind regards


January 25, 2013 at 2:27 am

Hi Gwen,
I can relate to the Default Blame Acceptor :-). I encounter it everyday. Great
Article. Hadoop sounds interesting. I will start doing some reading for this.

Janis Tupulis
January 25, 2013 at 3:11 pm

Excellent one, many thanks!


January 27, 2013 at 4:33 pm

Great Article. Few years ago I was started saying, there is should be Hadoop DBA
position in the company that using Hadoop. As same as for any other databases out there.


January 28, 2013 at 7:11 am

Thanks for sharing Gwen. I thinking about myself. Should I jump into magical Hadoop
world or stay an Oracle DBA for a while.
My current thoughts are:
There are still so many things waiting for me in the Oracle DBA space. I probably
would like to cover some of them before jumping somewhere else
BIG Data arrived so rapidly that I got some associations with other rapidly arrived
buzzwords before Hadoop (e.g. SOA). Would Hadoop/Big Data stay for good? Or those
may disappear in few years?
No matter what I think there are going to be enough work for both DBA and Hadoop
Admin in the datafication age :)

Gwen Shapira
January 28, 2013 at 5:37 pm

Hi Yury,
Nuno Pinto De Souto gave a similar sentiment in the Big Data SIG.
First of all, I agree that Oracle DBA is a huge world and there is always more to
learn. I can easily point to areas where I can learn or improve myself. I also agree
with Nuno who said that many skills are timeless and always needed. I totally
agree on that.
Is Hadoop here to stay? If I could tell how market trends work, I would be a far
richer woman. I'm just a simple DBA :)
I work with Hadoop because I love it. I love the brilliant simplicity of the
platform, the rich eco-system, the flexibility, the tools. I feel very creative when I
work with Hadoop, much more so than working with Oracle. But this is personal:
everyone has his own favorite tools.
I try to encourage DBAs to learn Hadoop for two reasons:
1. Maybe some of them will love it as much as I do. I want to spread the joy!
2. Hadoop is being actively adopted by many organizations. Hadoop Admins are
necessary. Someone has to do the job. I'm trying to encourage more people to
study Hadoop, so the job market will become a bit more balanced.
Will Hadoop go away? Personally, I see it as a real solution to real problems. I
don't see it going away any time soon. It will probably become boring. When I
started my career, XML was really hot: everyone was talking about it, learning
new technologies around XML, re-designed processes, etc. Now, people just use
XML and don't talk about it much. The time I spent working on DOMs and XSLT
and all was not wasted at all. It was fun back then, and is still sometimes useful.
Knowledge is never wasted, even when the trend is over.

January 29, 2013 at 11:22 am

Hi Gwen,
great post!
It is true that DBAs are already used to be in the center of attention working with sys
admins, network admins, storage admins, developers and vendors to solve complex
technical challenges.
I also find Hadoop admin to make sense for DBAs as a career move especially for the
infrastructure DBAs who focus more around DB infrastructure, not schema design.
Since there currently is a shortage of such admins, it might also be financially beneficial
but the key factor should be just passionate to play and own cool, new technologies

Jeff P.
January 30, 2013 at 11:44 am

Great post.
When I explain to folks what a lowly DBA does, I tell them
I don't get to drive the train, but when it jumps off the track, guess whose phone starts
ringing itself off his desk!


February 14, 2013 at 11:52 am

Amazing Article Gwen which describe what DBA future could be, its really Useful and
good To share .

February 18, 2013 at 5:50 am

You might find my recent posting interesting..



DBA, Grow Thyself Moving and Shaking in the Era of Data Dominance |
Steve Karam :: The Oracle Alchemist
February 20, 2013 at 4:05 pm

[...] is not going to hire an experienced Hadoop Administrator, it's a great job for the
DBA. Gwen Shapira makes a great note of this, arguing that the DBA's experience with
complex tuning requirements, data warehouses, and developers [...]

April 12, 2013 at 3:38 pm

Excellent inputs. One Question: does the experience of RAC DBA, in your opinion,
be helpful while administering the Hadoop clusters.

Gwen Shapira

April 15, 2013 at 6:51 pm

Here are a few examples of when my RAC DBA experience was helpful when
administering a Hadoop cluster:
1. When setting up a highly available namenode, you need to configure a STONITH
method. As a RAC DBA, you probably know all about STONITH and why it's
important, and can easily choose the correct configuration.
2. In Hadoop, troubleshooting often involves figuring out which specific node is
having trouble. RAC DBAs are pretty good at drilling down on randomly
occurring issues to find the faulty server.
3. Troubleshooting also involves correlating messages from large numbers of logs
and machines. RAC DBAs are usually experts on that too.
In general, RAC experience is experience in distributed systems, which is
critical for Hadoop administration.


April 23, 2013 at 7:02 am

Very good information.

I have 4+ on Linux/Hadoop curently with 7LPA, is it fair enough to ask for 18LPA?

dinesh singh
March 20, 2014 at 4:48 am

I need learn hadoop Administration please suggest me 9411882528


April 24, 2013 at 9:26 am

Hi Gwen
I'm fresher but i want learn hadoop admin please give me some suggestions
Is it good for me????

April 26, 2013 at 4:44 pm

Hi Gwen,
Very good & informative article.
I have 7+ years of exp as iSeries developer. Would Hadoop admin career helpful for
developer as well?
Thanks in advance!!

May 31, 2013 at 11:22 am

Excellent Article for upcoming Hadoop Admins


June 15, 2013 at 2:51 am

Good article.
i have a doubt sir actually am a B.Tech Passed out student well trained on
.NET(DOTNET) but no job still so i heard about Hadoop Admins have more demand in
market now..
So i want you to suggest whether i should go training for hadoop or stick to .NET

July 8, 2013 at 4:15 am

Hi Gwen,
Great Artical with good analytion about Hadoop,Let me Know i have around 1 year exp.
on Oracle i want learn Hadoop,please give me your valubale sugestion regarding this.

Syed Jahanzaib Bin Hassan

July 29, 2013 at 11:52 pm

Nice Article
Mixture of both skills required to troubleshoot the problems and a candidate which have
such kind of combination of skills in the market is very difficult

August 4, 2013 at 5:17 am

Hi Gwen: thanks for the wondeful article. Wanted to know your thoughts about the future
of the data / database architect and database developer e.g. PL/SQL developer (not the
DBA) in the Hadoop world. What would be the next logical step in the Hadoop world for
these types of professionals?

Sreekanth Matturthi
August 5, 2013 at 12:10 pm

I have 13+ yrs of experience in Solaris & Redhat Administrator with Symantec Cluster
Technology as an SME in a MNC company. By Using this experience, i have already
started learning Hadoop and created 2 nodes on my Virtual Machine doing some R&D.

then i decided to change my carrier to Hadoop Adminitration. Please suggest and help me

Rishi Jian

September 15, 2013 at 2:50 pm

Thanks Gwen for this article ..

I am 2013 fresher .. Currently jobless.. Will learning Hadoop/Bigdata is good for me for
getting a job having 0 years experience.. ??
Thanks in advance.

Mel Bourne

November 8, 2013 at 4:39 pm

Funny you keep mentioning the DBA fits the role of Hadoop administrator, but most of
the task you mentioned, like OS Performance Tuning, Hardware, Deployments, Storage
Configurations, Clusters, Backup Infrastructure, Networking, Coding are mostly done by
Unix SysAdmins. Read them again. I worked with DBAs a lot, none of them are
comfortable enough to work with hardwares, os tuning, and setting up the
infrastructure; they even come to me to write some automation scripts. I was a Unix
Sysadmin, and guess what, I was the only one selected to be trained in Teradata, and was
tasked to stood up and Manage our first TERADATA Database and Infrastructure. I'll
tell you what, that recommendation came from TERADATA itself, they'd rather train a
Unix Sys Admin who has a background in Database and coding/development, than
converting an Oracle DBA to manage a TERADATA environment.


November 24, 2013 at 7:59 am

thanks for the wonderful post.

i am a oracle DBA ,
i am planning for HADOOP.
i am planning for a HADOOP ADMIN .

shilpi Khandelwal
December 3, 2013 at 5:01 am

DBA is the basic thing which help to handle hadoop. There are many things which we
can get and along with this a perfact hadoop admin need to improve this technology in a
refining way.

March 7, 2014 at 3:37 am

I have 25 years experience in IT/Software Development/System Admin/DBA. Initially
started career as DBA and got 8 years in MSSQL/DB2/Oracle/MySQL. Also spend 4
years in Unix/Solaris Admin. Strong hands on PL/SQL and some exposure in Java/C#.
Later on moved to technical solution architecture. Now even after my age (52 Years), I am
very techie and involve myself in Hadoop and related technologies. It's my passion. I
LOVE HADOOP. Still working as Hadoop Admin.

March 17, 2014 at 2:04 am

I am working as Sr. Technical manager Database Architect , holding 14 years of
experience on Oracle Administration. I am very interested to move on Big Data Hadoop
technology but little confuse which stream will be match to my skill and experience. Off
course Haddop Administration is very similar type of work but as looking for senior level
to math the good package what will be the best suite for me.


May 8, 2014 at 11:14 pm

Nyc article..I have one question in my mind that is .. How can a java developer jump into
Hadoop without being an oracle DBA. So please suggest me , I'm Confused ..thanx

May 25, 2014 at 11:29 am

I am a MS SQL DBA with no experience in Linux/Unix and Java. Hadoop looks really
interesting. I was looking at job market and most of them have Linux administration as
one of the requirements.
Any MS SQL DBA have experience getting into Hadoop to give me hope of making it if
I dive into it.

Samarth Sharma
May 26, 2014 at 11:15 am

Nice article. Cleared all of my doubts. I am a DBA but soon going for Hadoop Cluster
Admin :)
Thanks for this article

July 17, 2014 at 1:11 am

I'm fresher but i want learn hadoop admin please give me some suggestions
How can a fresher can get a job as hadoop admin? what courses need to be done??


July 23, 2014 at 4:15 pm

I have a good experience of Oracle DB administration. If I want to get into Hadoop
admin where shud i start from .. some one pls guide me

July 28, 2014 at 10:27 pm

Great article!
As for those who are expecting career guidance from an expert blogger: no-one owns
your career except you. Teach yourself the skills you need and prove to your boss that
you are THE person in the company who can and who WANTS to support it.
You can get your hands on an image of a Hadoop implementation easily enough, so all
you need is a VM and the documentation.
IMO, enterprise adoption of Hadoop is very immature. Unlike database appliances, such
as Exadata, where the sentiment is that "it's a database on steroids on an engineered
system and has ORACLE on the front" so it's natural to leave it to the DBA, no-one
knows what to do with Hadoop.
Underneath it all, it's a file SYSTEM = sysadmins
You need to EXTRACT the raw data so it's usable = developers?
Just kidding about the last one :)
Should Hadoop be deployed as part of a Big Data Appliance, support will be expected
from the DBAs with the SAs and network admins saying "it's an appliance, talk to the
vendor about it."
If it's a roll-your-own commodity cluster, I can't see how the SAs and network admins
can throw it over the fence because they're responsible for the hardware support.

The savvy companies will realize that, like enterprise data warehouses, Hadoop crosses
many traditional organizational silos and a dedicated Enterprise Data Management
support group is needed with participation from DBAs, SAs and network admins.

Subhransu Sahoo

August 9, 2014 at 3:10 am

Great article, concise and inspiring. After deep diving into Hadoop and Big Data
ecosystem, I learned that my past DBA+ skills helped a lot. Completely agree with the
narration. Thank you.


September 17, 2014 at 3:29 pm

can anybody plz suggest me is hadoop for freshers also or it's only for the trained


October 9, 2014 at 10:32 am

Very Nice article I got an affinity to learn Hadoop Cluster administration. But basically
I am a SQL DBA work with Windows machines. Is it necessary to learn LINUX
administration for learning Hadoop Cluster Administration. Please advise.

February 1, 2015 at 12:22 pm

I have one question in my mind that is .. How can a java developer jump into Hadoop
without being an oracle DBA. So please suggest me , I'm Confused ..thanx

April 1, 2015 at 11:28 am

Hi, Read your article still not sure about the career path in hadoop admin .. i am DBA
and love to work on oracle RAC .. now i wanted to learn more on performance tuning
and design. Still if i get a chance to go into hadoop admin is it a good choice, so not sure
about it if for DBA it's a good choice or not as in hadoop admin you are not finding and
solving problem of Database where tuning database/queries are much more fun.

April 22, 2015 at 2:19 am

Thanks for such a great post! Today's ultra-connected world is generating massive
volumes of data at ever-accelerating rates. As a result, big data analytics has become a
powerful tool for businesses looking to leverage mountains of valuable data for profit and
competitive advantage. In the midst of this big data rush, Hadoop, as an on-premise or
cloud-based platform has been heavily promoted as the one-size fits all solution for the
business world's big data problems. While Hadoop has lived up to much of the hype,
there are certain situations where running workloads on a traditional database may be the
better solution. More at