
Best Practices for Monitoring Exadata
Enterprise Manager Cloud Control 12c

Farouk Abushaban
Senior Principal Technical Analyst
Oracle USA, Engineered Systems Support
September, 2014

Copyright 2014, Oracle and/or its affiliates. All rights reserved. | Oracle Confidential Internal/Restricted/Highly Restricted

Safe Harbor Statement


The following is intended to outline our general product direction. It is intended for
information purposes only, and may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality, and should not be relied upon
in making purchasing decisions. The development, release, and timing of any features or
functionality described for Oracle's products remains at the sole discretion of Oracle.


Objectives
Understand EM 12c Topology
Agent Deployment Best Practices
Component Level Monitoring
Discovery Deep-Dive


Program Agenda

Quick Overview of EM 12cR4

Tour of Exadata Monitoring

Deep-Dive Into Database Machine Discovery

Challenges and Troubleshooting

Q&A Session


Section 1:

EM OVERVIEW


Enterprise Manager Cloud Control 12c


Concepts

System Management Software
  Lights-Out Monitoring and Notification
  Management and Administration
  Single GUI

Centralized Management
  Target Administration
  Life-Cycle Management
  Automation

Complete IT Monitoring
  Oracle Products
  Non-Oracle technologies
  Out-of-the-box Metrics and Alerting
  Real-time + Historical Perf. Trending
  Reports Publishing

Enterprise Manager Cloud Control 12c


Major Components

Oracle Management Agent (EM Agent)


Oracle Management Server (OMS)
Oracle Management Repository (EMREP)

Oracle Management Plug-Ins


[Architecture diagram: EMCLI, EM Console, Management Server, Repository Database, and Agents]

Management Via Plug-Ins

Provide specific management capabilities per target type


Standard (default installed) Plug-Ins
Install Plug-Ins for each product as needed
Automatic Plug-In deployment during target discovery*
Automated Plug-In updates via Plug-In Manager
Online or Offline
Quarterly bundled updates since 12cR3 (12.1.0.3)


Exadata Plug-In


Exadata Plug-In 12.1.0.5 / 12.1.0.6


New Features and Enhancements

Supported HW and SW


SPARC SuperCluster T5-8 server


1/8 Rack and Multi-Rack
Expansion Rack, X4-2, 11.2.3.3, 12c GI

IB performance and on-demand schematic refresh


IORM active by default: CDB I/O with PDB breakdown
SNMP support for non-public community strings
Detailed Summary of Flash and Spindle disk performance side-by-side
Cell HW Fine Grained performance monitoring
Guided resolution for cell alerts
And more (see the References slide for the documentation link)


Section 2:

Exadata Component Monitoring Tour

Exadata Monitoring

Install
  Deploy Agents
  Introduces Host Targets

Discover
  Discover and Configure Exadata DBM
  Promote Monitored Targets

Monitor
  Customize Monitoring
  Automate Tasks

Exadata Rack Components


What's Monitored?

Database Servers

Storage Servers

InfiniBand Switches

Cisco Switch

PDUs

KVM Switch

Agent Deployment
Install on each compute node

Database Servers

Exadata Monitoring
Deploy Agents
Agents run on compute nodes only
Compute nodes are RAC host targets
Monitor Exadata targets remotely
No additional software on Cells, IBs, KVM, PDUs, Cisco, and ILOM

Built-in failover monitoring via OMS mediation


Assign 2 agents per target
Only the master agent actively monitors the target
The OMS switches to the backup agent when the current master agent is down
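
The master/backup behaviour described above can be pictured with a small sketch. This is only an illustration of the idea, not how the OMS implements mediation; the `Agent` class and `pick_monitoring_agent` function are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Agent:
    """Hypothetical stand-in for an EM Agent and its availability."""
    name: str
    is_up: bool


def pick_monitoring_agent(master: Agent, backup: Agent) -> Optional[Agent]:
    """Return the agent that should monitor the target.

    The master is preferred; the backup is used only when the master
    is down. None means no agent is currently available.
    """
    if master.is_up:
        return master
    if backup.is_up:
        return backup
    return None


# Example: the master on exa01db01 is down, so the backup takes over.
chosen = pick_monitoring_agent(
    Agent("exa01db01.acme.com:3872", is_up=False),
    Agent("exa01db02.acme.com:3872", is_up=True),
)
print(chosen.name if chosen else "no agent available")
```

Only one agent is ever returned, which mirrors the point that a target is actively monitored by a single agent at a time.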


Add Host Target


Agent Installation Properties


Section 3:

Exadata Discovery Deep-Dive

Exadata Guided Discovery


More than just discovery
Even better with 12.1.0.6 exa*

Specify Schematic
Active pre-requisite check
Sets up SSH user equivalence
Subscribes to SNMP
Supports re-discovery of newly added hardware components
Assigns Primary and Backup agents to each component

Discover Exadata DBM


Storage Cell Discovery


From Compute Node

Runs $ /usr/sbin/ibnetdiscover
  Reads the cell hostnames and IP addresses from the output

Pre-12.1.0.6: runs $ORACLE_HOME/bin/kfod op=cellconfig
  Reads /etc/oracle/cell/network-config/cellip.ora
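
To see what the older discovery path gets from cellip.ora, here is a minimal parsing sketch. It assumes the usual file layout of one `cell="<ip>"` entry per line, with optional additional addresses separated by semicolons. The function name and the sample addresses are made up for illustration and are not taken from the plug-in.

```python
import re

# Matches lines such as: cell="192.168.10.3" or cell="192.168.10.4;192.168.10.5"
CELL_LINE = re.compile(r'^\s*cell\s*=\s*"([^"]*)"')


def parse_cellip(text):
    """Return one list of IP addresses per cell entry found in the text."""
    cells = []
    for line in text.splitlines():
        match = CELL_LINE.match(line)
        if not match:
            continue  # skip blank lines and anything that is not a cell entry
        ips = [ip.strip() for ip in match.group(1).split(";") if ip.strip()]
        cells.append(ips)
    return cells


# Hypothetical file content for a rack with two cells.
sample = 'cell="192.168.10.3"\ncell="192.168.10.4;192.168.10.5"\n'
for ips in parse_cellip(sample):
    print(ips)
```

On a compute node the text would come from /etc/oracle/cell/network-config/cellip.ora; if a cell is missing from that file, discovery before 12.1.0.6 has no way to find it.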


InfiniBand Network Discovery


From Compute Node

Runs ssh nm2user@<ibswitch> ibnetdiscover
  Reads the IB switch names connected to the compute node
  Matches the compute node name against the agent hostname:

  https://exa01db01.acme.com:3872/emd/main
  ca 2 H-00212800.. # exa01db01 S 192.168.HCA-3
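
The hostname match-up above can be sketched as follows. This is an assumption-laden illustration, not the plug-in's code: it assumes the node description after `#` starts with the compute node name, and the complete GUID and IP address in the sample line are invented because the slide truncates them.

```python
from urllib.parse import urlparse


def node_name_from_ca_line(line):
    """Return the first word of the node description after '#', or None."""
    if "#" not in line:
        return None
    description = line.split("#", 1)[1].strip().strip('"')
    return description.split()[0] if description else None


def agent_short_host(agent_url):
    """Return the short (unqualified) hostname from an agent URL."""
    host = urlparse(agent_url).hostname or ""
    return host.split(".")[0]


def node_matches_agent(line, agent_url):
    """Compare the short node name with the short agent hostname."""
    node = node_name_from_ca_line(line)
    if node is None:
        return False
    return node.split(".")[0].lower() == agent_short_host(agent_url).lower()


# Hypothetical ibnetdiscover line; GUID and IP address are invented.
ca_line = 'Ca 2 "H-0021280001aa0001" # "exa01db01 S 192.168.10.1 HCA-1"'
print(node_matches_agent(ca_line, "https://exa01db01.acme.com:3872/emd/main"))
```

If the node description carries the client hostname while the agent was installed under the management hostname (or the reverse), a comparison like this fails, which is the situation the troubleshooting slides address with set_nodedesc.sh.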


Prerequisite Checks
You can run this prerequisite check manually ahead of time from the compute node:
$ORACLE_HOME/perl/bin/perl exadataDiscoveryPreCheck.pl


Guided Discovery Wizard Summary


Select agent (provide the RDBMS home path before 12.1.0.6)
Exadata cells: runs ibnetdiscover (or kfod and cellip.ora before 12.1.0.6)
InfiniBand switches + compute nodes >> ibnetdiscover
KVM, PDU, Cisco + ILOM through the schematic file on the compute node
Automatically subscribes to SNMP (cells and IB switches)
Agent mediation and target promotion

Monitoring Storage Cells


EM Agent runs cellcli via ssh to collect Storage Cell metrics
MS sends SNMP traps to the EM Agent for subscribed alert conditions
Requires ssh equivalence between the Agent user and cellmonitor
Associates ASM targets and disk groups
Collects rich storage data on the home page, plus:
  Aggregate storage metrics
  Cell alerts via SNMP (PUSH)
  Capacities
  IORM consumer and DB level metrics
  And much more

Monitoring InfiniBand Switches

EM Agent runs remote ssh calls to the switch to collect metrics
IB Switch sends SNMP traps (PUSH) for some alerts
Requires ssh equivalence for nm2user for metric collections such as:
  Response
  Various sensor status
    Fan
    Voltage
    Temperature
  Port performance data
  Port administration

Monitoring Cisco Switch


EM Agent runs remote SNMP get and push to collect metric data against the Cisco switch

Status / Availability
Port status
Vital signs: CPU, Memory, Power, Temperature
Various network interface data
  Incoming traffic errors, traffic kb/s and %
  Outgoing traffic errors, traffic kb/s and %
  Admin and Operational bandwidth Mb/s

Monitoring ILOM targets


EM Agent runs remote ipmitool and SNMP calls to each Compute Node ILOM target

Requires nm2user credentials to run ipmitool
Runs collections via perl script wrappers and collects:
  Response availability
  Sensor alerts
    Temperature
    Voltage
    Fan speeds
  Configuration data: firmware version, serial number, etc.

Monitoring Power Distribution Units


EM Agent runs remote SNMP get calls and receives SNMP traps (PUSH) from each PDU

Response / Ping status
Phase values

Monitoring KVM Switch


EM Agent runs remote SNMP get calls and receives SNMP traps (PUSH) from the KVM switch

Status / Response
Reboot events
Temperature
Fan status
Power state
Factory settings


Section 4:

Challenges and Troubleshooting

Challenges
Redeployment of a rack:
DEV to UAT to PROD etc
Partitioning full rack to smaller independent racks:
Full rack >> One rack + two racks
Combining partitioned racks to a larger rack:
Two racks >> Full rack
Two racks >> One rack
etc.

Update existing OneCommand configurations
Generate new schematic files for each partitioned rack
Generate new OneCommand configuration to consolidate racks

Discover Exadata DBM


Challenges
Discovery
Adding new hardware
  Expanding or adding storage cells
  Attaching additional rack
  Attaching Storage Expansion rack
  Adding spine switch
  etc.

Challenges
Networking
Network configuration changes
Re-IP some or all components
Domain name changes
Hostname changes
Subnet changes
Additional backup network / NICs
Additional listeners (IB listeners or TNS)
Firewall rules

etc.

Troubleshooting
Discovery Issues

Compute node not managed by EM

Check: Agent hostname is different from the compute node hostname
Fix: Run # ibnetdiscover and match the node name up to the agent hostname

Check: Wrong agent used for discovery
Fix: Select the compute node agent for discovery
Fix: Reset the compute node name from client to management or vice versa
     # /usr/sbin/set_nodedesc.sh

Check: Short hostname used for agents?
Fix: Re-install agents using the fully-qualified hostname <hostname.domain>

Troubleshooting
Discovery Issues
Extra or missing components in new DBM

Check: Examine extra components for DBM membership
Fix: De-select extra components manually from the discovered list

Check: Which schematic file was used for discovery?
Fix: Ensure that EM can read the latest xml file on the compute node

Check: Missing components
Fix: Check schematic file content
Fix: Need to generate a new schematic file
Fix: Log an SR and provide details

Troubleshooting
Discovery Issues
Discovery just hangs

Check: Examine network
Review / Fix: Hostname resolution
Review / Fix: Accessibility from OMS to Agent(s)
Review / Fix: Execute a simple job from the console

Check: OMS reported errors
Review / Fix: MW_HOME/gc_inst/sysman/log/emoms.log

Check: Repository issues
Review / Fix: Repository database alert.log

Check: Agent logs
Review / Fix: $AGENT/agent_inst/sysman/log/gcagent.log
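
For the hostname resolution check, a quick script run from the OMS host (and again from a compute node) can list names that do not resolve. This is a minimal sketch using only the Python standard library; the function names and the hostnames in the usage example are hypothetical, and the `resolver` parameter exists only so the logic can be exercised without a network.

```python
import socket


def resolves(hostname, resolver=socket.getaddrinfo):
    """Return True if the resolver returns at least one address."""
    try:
        return len(resolver(hostname, None)) > 0
    except socket.gaierror:
        return False


def unresolved(hostnames, resolver=socket.getaddrinfo):
    """Return the subset of hostnames that could not be resolved."""
    return [name for name in hostnames if not resolves(name, resolver)]


if __name__ == "__main__":
    # Replace with the agent, cell, and switch names of your own rack.
    names = ["exa01db01.acme.com", "exa01cel01.acme.com"]
    for name in unresolved(names):
        print("cannot resolve:", name)
```

A name that resolves on the OMS host but not on the compute node (or the reverse) is a common reason for discovery appearing to hang, so it is worth running the check on both sides.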


Troubleshooting
Schematic Issues
Schematic page blank
Check that the browser is supported by EM 12c
Run through discovery again and watch for messages
Check emoms.log for exceptions at the same time

Components missing
Add manually to the schematic page - Edit button
Check for component presence in EM (is it monitored?)


Troubleshooting
Target Status Issues
Target status shows DOWN inaccurately
Cell: Check ssh equivalence (cellmonitor user)

  ssh -i /home/oracle/.ssh/id_dsa -l cellmonitor <cell name> -e cellcli list cell

  Output should be:
  <cell name>
PDU: Check for access to the browser console of the PDU

  http://<pdu name>
  Is it connected to the LAN?
Cisco: Check for proper SNMP subscriptions

  See Exadata Management Doc, Post Discovery
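
The cell check above can be scripted when there are many cells to verify. The sketch below is an assumption, not part of the plug-in: the caller passes in the exact ssh command to run, and the helper only confirms that some output line begins with the expected cell name. The function names and the sample cell name are hypothetical.

```python
import subprocess


def output_names_cell(output, cell_name):
    """Return True if any output line starts with the expected cell name."""
    for line in output.splitlines():
        fields = line.split()
        if fields and fields[0] == cell_name:
            return True
    return False


def check_cell(command, cell_name, timeout=30):
    """Run a caller-supplied command (for example the ssh call for one cell)
    and report whether it succeeded and printed the cell name."""
    result = subprocess.run(
        command, capture_output=True, text=True, timeout=timeout
    )
    return result.returncode == 0 and output_names_cell(result.stdout, cell_name)


# The parsing step on its own, with made-up output for a cell named exa01cel01.
print(output_names_cell("\t exa01cel01\t online\n", "exa01cel01"))
```

A password prompt or a "Permission denied" message instead of the cell name usually points to missing ssh equivalence for the cellmonitor user.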



Troubleshooting
Metric Collections
Target status shows Metric Collection Error
Hover over the icon or navigate to Incident Manager
Read the full text of the error
Visit the Target Setup >> Monitoring Configuration page and examine
Trigger a new collection: Target menu > Configuration > Last Collected > Actions > Refresh
Access the monitoring Agent Metric Browser
  https://<agent URL>/emd/browser/main
Click the target >> click Response and evaluate the results / log an SR

Troubleshooting
Pending Status
Cellsys target in Pending status forever
Must have Cluster ASM, Database and Storage Cell association
Check / fix the status of the associated target database
Check / fix the status of the associated target ASM cluster
Ensure UP status for all cell server targets
Delete unassociated cellsys targets
Check for problematic DBMS_JOBS in the repository database


Troubleshooting
Pending Status
Database Machine target or any associated components in Pending status
Check for duplicate or pending delete targets:
  Setup >> Manage Cloud Control >> Health Overview
Check target configuration:
  Target Setup >> Monitoring Configurations
Search for the target name in the agent or OMS logs:
  $ grep <target name> gcagent.log
  $ grep <target name> emoms.log
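
When the same target name has to be searched in several logs at once, a small script can do what the grep does and keep the file name and line number with each hit. This is a sketch using only the standard library; the function name is hypothetical and the log paths are whatever applies to your installation.

```python
from pathlib import Path


def find_target_lines(target_name, log_paths):
    """Return (file name, line number, line text) for every line that
    mentions the target name. Missing files are skipped silently."""
    hits = []
    for path in log_paths:
        log_file = Path(path)
        if not log_file.is_file():
            continue
        # errors="replace" keeps the scan going past undecodable bytes
        with log_file.open(errors="replace") as handle:
            for line_number, line in enumerate(handle, start=1):
                if target_name in line:
                    hits.append((log_file.name, line_number, line.rstrip("\n")))
    return hits


if __name__ == "__main__":
    for name, number, text in find_target_lines(
        "exa01cel01", ["gcagent.log", "emoms.log"]
    ):
        print(f"{name}:{number}: {text}")
```

Reading the hits from both logs side by side makes it easier to see whether the agent or the OMS reported the problem first.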


Troubleshooting
Maintenance
EMDiag
Download and install the latest version
Always check for the latest repvfy drop (Note 1426773.1)
Run: repvfy verify exadata -level 9 -details
Run: repvfy verify
This will summarize all critical / fatal issues in the repository
Share the output with Support and explain the symptoms


Summary
What we covered today

EM 12c Topology and Design

Agent Deployment Best Practices


Component Monitoring / Discovery

First-aid Troubleshooting tips


References
Documentation Libraries, Notes, etc.

Exadata Management Online Docs
Plug-In Manager - EM Cloud Control Admin Guide
Exadata Plug-In BP Note 1613177.1
Database Plug-In BP Note 1580350.1
Exadata Monitoring Patch Requirement Note 1323298.1

Learn More
Available References and Resources to Get Proactive

About Oracle Support Best Practices
  www.oracle.com/goto/proactivesupport
Get Proactive in My Oracle Support
  https://support.oracle.com | Doc ID: 432.1
Get Proactive Blog
  https://blogs.oracle.com/getproactive/
Ask the Get Proactive Team
  get-proactive_ww@oracle.com


Questions & Answers


Drinks. Food. Fun.


My Oracle Support Monday Mix
Tonight!
Monday, September 29
6:00 to 8:00 p.m.
ThirstyBear Brewing Company
(only block from Moscone Center)

Join us for a relaxing Happy Hour after a busy day at Oracle OpenWorld!
Take a break and unwind with your peers
Get to know the Oracle support engineers you depend on
Meet My Oracle Support executives and developers
Enjoy drinks and hors d'oeuvres
Admission is free with your Oracle OpenWorld badge

Event details at:


www.oracle.com/goto/mondaymix


THANK YOU