Disclaimer
Technical feasibility and market demand will affect final delivery. Pricing and packaging for any new technologies or features
discussed or presented have not been determined.
Panelists
Fletcher Cocquyt, Stanford School of Medicine
Brian Mack, Maximus
Ian Dodd, Kaiser Permanente
vSphere
CIM4280
Stanford School of Medicine
How does vCenter Operations address our real-world VI monitoring requirements?
Virtual Infrastructure:
Servers: 310 VMs on 21 ESXi hosts
Storage: 20 TB NFS datastores replicated on campus, DR site
Networking: 10Gb ESXi upgrades are 75% complete
Metrics:
25,830 metrics monitored by static thresholds with Zabbix, Cacti, and Big Brother
Agenda
Virtual Infrastructure (VI) comes with unique monitoring requirements
Challenges posed when monitoring a consolidated VI
How vCenter Operations addresses these for us
Examples of issues vCenter Operations identified and helped us resolve
Confidential. Not for distribution.
Tuning static thresholds is possible, but requires domain knowledge and a large investment of time; over-tuned thresholds miss alerts
Misses inefficiencies and transient issues, which can be serious
Problems are amplified in a consolidated environment
No intelligence to filter the signal from the noise
Dynamic Thresholds
Obviate the need for domain knowledge and a large time investment in tuning signal from noise
Work with derived metrics to virtually eliminate false positives
Detect inefficiencies and transient conditions missed by static thresholds
Provide a holistic context for the sysadmin monitoring the virtual infrastructure
By incorporating derived metrics and dynamic thresholds, vCenter Operations presents a drop-in solution that delivers immediate value in detecting issues across the consolidated infrastructure
Some real-world examples showing the actionable value
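The contrast between static and dynamic thresholds can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the slides do not describe how vCenter Operations computes its thresholds, so the rolling mean and standard-deviation band below is a generic stand-in, and the function names, window size, multiplier `k`, and sample data are all hypothetical.

```python
from statistics import mean, stdev

def static_alerts(samples, limit):
    """Return indices of samples that exceed a fixed limit."""
    return [i for i, value in enumerate(samples) if value > limit]

def dynamic_alerts(samples, window=12, k=3.0):
    """Return indices of samples outside mean +/- k*stdev of the preceding window.

    Hypothetical rolling-baseline rule, not the vCenter Operations algorithm.
    """
    alerts = []
    for i in range(window, len(samples)):
        history = samples[i - window:i]
        baseline = mean(history)
        # Guard against a zero-width band when the history is perfectly flat.
        band = k * max(stdev(history), 1e-9)
        if abs(samples[i] - baseline) > band:
            alerts.append(i)
    return alerts

# Made-up CPU utilization (%) for a VM that is always busy but behaving normally.
busy = [86, 88, 87, 89] * 4
# Made-up CPU utilization (%) for a normally idle VM with one transient spike.
quiet = [10, 11, 10, 12, 11, 10, 11, 12, 10, 11, 10, 12, 45, 11, 10]

print("busy  static :", static_alerts(busy, 80))
print("busy  dynamic:", dynamic_alerts(busy))
print("quiet static :", static_alerts(quiet, 80))
print("quiet dynamic:", dynamic_alerts(quiet))
```

With this sample data, a static 80% limit fires on every sample of the busy VM and never on the idle VM, while the rolling baseline stays silent for the busy VM and flags only the spike at index 12 of the idle VM. That mirrors the two failure modes listed above: noise from normal-but-high workloads, and missed transient conditions.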
VMworld 2011
[Figure: panels labeled CPU, Network, Disk IO]
Virtual Environment
Infrastructure
15 ESX Servers
300 Virtual Machines
25 TB of Fibre Channel storage in a 3-tiered environment
Monitoring Tools (pre-vCenter Operations)
Uptime Software
vFoglight
vKernel
CapacityIQ
What was needed?
Improved communication and reporting of performance statistics to our internal and external clients.
Improved visibility for internal engineers into our virtual environment.
Better problem-solving tools.
vCenter Operations
MAXIMUS Real World Issue
MAXIMUS Real World Issue (continued)
vCenter Operations with CapacityIQ
vCenter Operations + CapacityIQ
Better report utilization in CapacityIQ
Better Root Cause Analysis for clients
Health monitoring / Better resource utilization
Return on Investment (ROI)
Where are we now?
We now have vCenter Operations running with production licenses for 250 VMs.
vCenter Operations has become our main dashboard for monitoring our production and development virtual environments.
Cost Savings
No other licenses required for other previously used tools
Better resource management
Less time and expense required for troubleshooting client problems
Shorter turnaround time for new VM deployment
Enterprise Overview
8 million members
15,000 servers
Virtualize-first policy
20 PB storage
Large mainframe environment
Enterprise Management Vendors
HP
IBM
VMware
Oracle
Quest
Microsoft
In-House Developed
The Challenges
Multiple management tools and consoles
Multiple approaches to performance management
Inconsistent approach to threshold management
High ticket volume due to threshold breaches
Proactive approach is challenging
Automation not fully exploited
Metric acquisition inconsistent
Workload is ticket driven
Goals
Reduce consoles to two operational views:
Availability
Performance
50% reduction in static thresholds
Increased fidelity and specificity of alerts
Apply consistent metric attributes across the enterprise
Create a dedicated approach to performance management
Approach
Acquisition
Apply best practices for enterprise-wide metric attribute gathering
Availability
Create a single pane of glass for availability monitoring based on automated alerting
Performance
Establish a Critical Application Support Team and tools focused on performance management
Automation
Develop a methodology for Availability and Performance Automation
Visualization
Develop and publish role-based visualization for Enterprise Availability and Performance
Architecture
[Architecture diagram; recoverable labels: Remedy, Availability Console, Orchestration / Automation, vCOPS VI Analytics]
Accomplishments
Performance & Availability consoles selected
Comprehensive best-practice metrics defined across the enterprise
Critical Application Support Team (CAST) established
First vCOPS Line of Business (LOB) environment in production
Identification of static thresholds to be removed is in progress
Decreased problem life cycle in production LOB
Automation process established in the CAST team
Increased fidelity and routing of alerts
Q&A
Fletcher Cocquyt, Stanford School of Medicine
Brian Mack, Maximus
Ian Dodd, Kaiser Permanente