
Verify and Fix the SAS and Hadoop Collection

Contents
Verify and Fix the SAS and Hadoop Collection
How to Verify or Fix SAS 9 on the Collection
How to Verify or Fix Hadoop on the Hadoop Collection
How to Restart Specific Hadoop Linux Servers
Resolving ALREADY EXISTS error with Hadoop

How to Verify or Fix SAS 9 on the Collection

The purpose of this section is to outline the steps to verify that SAS 9 is running on the Windows client to
support the course demonstrations, activities, and practices.

1. Open a remote desktop or Citrix connection to the Windows client in the collection.
2. Open the Firefox or Chrome browser from the taskbar or desktop.
3. Select SAS9 -> SAS Studio on the bookmarks bar.

a. Sign in as user: student, password: Metadata0
b. Sign out of SAS Studio once the connection is verified.
4. Verify: SAS9 is verified if you successfully get the SAS Logon screen and can sign into SAS Studio. An optional test program you can run while signed in is shown after this list.
5. Fix: If the SAS Logon screen is not available or you cannot successfully sign into SAS Studio with
the student credentials, then the SAS 9 services need to be restarted:
a. Open the Windows File Explorer on the taskbar.
b. Navigate to D:\ImageUtilities\SAS9\Dangerous.
c. Double-click stopSAS.bat to gracefully stop all of the SAS services.
d. Click the Services icon on the taskbar and verify that all of the SAS9 services have stopped.
e. Double-click startSAS.bat to start the SAS services.


6. After 45 minutes, re-verify the SAS Logon screen and SAS Studio sign-in as described above.
7. Fix: If the verification fails again, a new reservation should be started by selecting the Fresh
Image choice under Image in the Virtual Lab Reservation System.
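
The following is a minimal smoke test you can run in SAS Studio after signing in (see step 4). Any short program that writes to the log would serve equally well; the data set name here is arbitrary:

/* Create a one-row data set and print it. Output in the
   Results tab confirms the SAS session is executing code. */
data work.smoke_test;
   message = "SAS 9 session is running";
run;

proc print data=work.smoke_test;
run;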

How to Verify or Fix Hadoop on the Hadoop Collection

This section outlines the steps required to verify that the Cloudera 5.16 Hadoop ecosystem is running
properly. Hadoop is required to support the course demonstrations, activities, and practices. Allow at
least 45 minutes after the course image reservation has started before performing the following steps.
This delay ensures that the Hadoop ecosystem components have had enough time to start normally.

1. Open a remote desktop or Citrix connection to the Windows client in the collection.
2. Open the Firefox or Chrome browser from the taskbar or desktop.
3. Click Admin -> Cloudera Manager on the bookmarks bar.
4. Log in as user: admin, password: Student1
5. Verify: all of the Hadoop applications, except Sqoop, should be green.

6. Fix: Any Hadoop applications that are red or black should be restarted, or the entire Hadoop
cluster should be restarted.
7. To restart a specific application, click the down arrow for that application and select Start or
Restart.

8. To restart the entire Hadoop cluster, click the down arrow for Cluster 1 and select Start or
Restart.
9. After 15 minutes, verify that all of the Hadoop applications, except Sqoop, are green. A SAS-based connectivity check you can run from the client is shown after this list.
10. Fix: If the verification fails again, a new reservation should be started by selecting the Fresh
Image choice under Image in the Virtual Lab Reservation System.
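
As an additional check from the SAS side, the following pass-through query confirms that HiveServer2 is accepting connections. It is a sketch that reuses the connection settings from the Hive example later in this document; the schema name comes from that example and may differ in your course files. If your SAS release does not accept SHOW TABLES in pass-through, any simple SELECT against a known table works as well:

proc sql;
   connect to hadoop
      (server='server04.demo.sas.com' subprotocol=hive2 port=10000
       schema=DIACHDMYHIVE user="student" pw="Metadata0");
   /* Any returned table list confirms Hive is responding. */
   select * from connection to hadoop (show tables);
   disconnect from hadoop;
quit;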

How to Restart Specific Hadoop Linux Servers

The purpose of this section is to outline the steps to restart a specific Linux server machine in the
Hadoop cluster. This will take about 45 minutes to complete.

The Hadoop cluster consists of three Linux server machines:

• server04.demo.sas.com – NameNode running all of the Hadoop applications (HDFS, Hive, etc.)
• server05.demo.sas.com – DataNode
• server06.demo.sas.com – DataNode / ClientNode (student@HadoopClient)

Perform the following steps for each Linux server in the Hadoop cluster as needed.
1. Double-click the mRemoteNG application on the desktop or taskbar of the client machine.
2. Double-click a <pre-defined server connection>, for example server04-root, to open a terminal session to
that machine.

3. At the command prompt, type the following to shut down and restart the server:
shutdown -r now
4. Press ENTER.
5. To monitor the state of the servers you can:
a. Start Chrome or Firefox from the taskbar.
b. Click Admin -> Cloudera Manager.
c. Log in as user: admin, password: Student1
d. Monitor the state of the cluster services. When all of them turn green, the system is
ready. An optional HDFS write test you can run from SAS is shown after this list.
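
Once the services are green, you can optionally confirm from the SAS client that HDFS is accepting writes again. This sketch assumes the Hadoop configuration used by the course examples is already in place; the directory name is arbitrary and used only for the test:

proc hadoop user="student";
   /* Round trip: create a throwaway directory, then remove it.
      Both steps succeeding confirms HDFS is up and writable. */
   hdfs mkdir="/user/student/hdfs_write_test";
   hdfs delete="/user/student/hdfs_write_test";
run;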

Resolving ALREADY EXISTS error with Hadoop

The purpose of this section is to help resolve issues with Hadoop files or directories that already exist. If you
execute statements that create directories or files in HDFS twice, or statements that create tables in Hive
twice, you may get ERROR messages in the SAS log the second time. This is because these items were
created the first time you executed the program, and you cannot overwrite them using the methods in
your program.

You can identify these types of errors because one of the ERROR messages will usually indicate that the object
ALREADY EXISTS. If the existing item was created correctly the first time, you can ignore these errors. If you
need to re-create the item because it was not created correctly (or if you are unsure it was created correctly),
you will need to delete it first.

The following examples show the type of ERROR message to look for and code that you can use to delete
the item if needed.

1. HDFS file already exists:


example ERROR message:
ERROR: java.io.IOException: Target
/user/student/DIAHD/data/moby_dick_via_sas.txt already exists

If you need to delete a file so that you can re-create it successfully, you can execute statements similar
to the following, substituting the name of the file you need to delete for the file named in this
example:
proc hadoop user="student";
hdfs delete="/user/student/DIAHD/data/moby_dick_via_sas.txt";
run;

2. HDFS directory already exists:


example ERROR message:
ERROR: org.apache.hadoop.mapred.FileAlreadyExistsException:
Output directory
hdfs://server04.demo.sas.com:8020/user/student/DIAHD/data
/output already exists

If you need to delete the directory and its contents you can use the following example, substituting
your directory for the one used in this example:
proc hadoop user="student";
hdfs delete="/user/student/DIAHD/data/output";
run;

3. Hive table already exists:


example ERROR message:
ERROR: Execute error: Error while processing statement: FAILED:
Execution Error, return code 1 from
org.apache.hadoop.hive.ql.exec.DDLTask.
AlreadyExistsException(message:Table customer already
exists)

If you need to re-create this table because it was not created correctly, you can use the following
example, substituting the name of your Hive schema in the CONNECT statement and the name of your
Hive table in the DROP TABLE statement:
proc sql;
connect to hadoop
(server='server04.demo.sas.com' subprotocol=hive2 port=10000
schema=DIACHDMYHIVE user="student" pw="Metadata0");
execute (drop table customer) by hadoop;
disconnect from hadoop;
quit;
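
Alternatively, Hive's DROP TABLE statement accepts an IF EXISTS clause, which succeeds whether or not the table is present, so the step is safe to re-run. The sketch below uses the same connection settings as the example above:

proc sql;
   connect to hadoop
      (server='server04.demo.sas.com' subprotocol=hive2 port=10000
       schema=DIACHDMYHIVE user="student" pw="Metadata0");
   /* IF EXISTS makes the drop a no-op when the table is absent. */
   execute (drop table if exists customer) by hadoop;
   disconnect from hadoop;
quit;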
