
InfoSight | Dashboard Modules

Darren Wong
User Experience
Nimble Storage

Dashboard Modules | System

System: User Goals


As a User, I want to see the status of my physical environment, so that I may identify hardware issues early.
As a User, I want to see the status of (peripheral) services, so that I may ensure connectivity is constant and reliable.
As a User, I want to receive software notifications, so that I may be aware of updates and upcoming releases.

Dashboard Modules | System

System Manager: System Design Proposal


Maintaining exception-based criteria lets the User scan for and identify exceptions quickly.
It also allows multiple states (success, warning, error) to be displayed together if needed.
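
The exception-based roll-up described above could be sketched as follows. This is only an illustration: the state names come from the slide note, while `SEVERITY_ORDER`, `roll_up`, `exceptions_only`, and the drive counts are hypothetical.

```python
from collections import Counter

# Hypothetical ordering: most severe state first, so exceptions are listed first.
SEVERITY_ORDER = ("error", "warning", "success")


def roll_up(component_states):
    """Count components per state, keeping only states that actually occur."""
    counts = Counter(component_states)
    return {state: counts[state] for state in SEVERITY_ORDER if counts[state]}


def exceptions_only(rollup):
    """Drop the healthy count so only the exceptions remain to scan."""
    return {state: n for state, n in rollup.items() if state != "success"}


# Illustrative data only: 6 failed drives among otherwise healthy ones.
drives = ["success"] * 48 + ["error"] * 6
drive_rollup = roll_up(drives)
print(drive_rollup)                   # {'error': 6, 'success': 48}
print(exceptions_only(drive_rollup))  # {'error': 6}
```

Because every state that occurs is kept in the roll-up, a tile can show success, warning, and error counts together, or only the exceptions.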

SYSTEM
HARDWARE

CONNECTIVITY

Arrays

Controllers

Shelves

54

Drives
NICs

SOFTWARE
VERSION 2.3.1
Upgrade Available

Example displayed here: 6 drive failures, which also indicate that 1 controller is down; however, the arrays are still running.

MONITORING SERVICES
1

Syslog

SNMP Traps

SMTP

3RD PARTY MANAGEMENT


Active Directory

VMware vCenter

Microsoft Hypervisor

(software states)

i Upgrade Available
Up to date

Improved grouping and information architecture

Future: customizable

Dashboard Modules | System

InfoSight: System Design Proposal


Hardware roll ups link to Wellness page, with appropriate filters applied.

SYSTEM

Here are some examples:


Controller

HARDWARE

Arrays

24

Controllers

48

Drives

74

140

NICs

CONNECTIVITY

20

Call Home

24

NICs

Missing from RAID array


Has encountered read errors
Network connectivity lost, all links down on a subnet

SOFTWARE
VERSION 2.3.1

MONITORING SERVICES
Heart Beat

Drives

Shutdown - excessive temperature

Upgrade Available

(software states)

i Upgrade Available
Up to date

Links to asset list page, with filtered arrays

Data is being presented within desired limits

Data may be stale (exceeded desired limit)


Eng/PM to determine what these limits are
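
The two data-freshness states above could be derived from the age of the last data received. A minimal sketch, assuming a hypothetical 15-minute limit (the slide leaves the real value to Eng/PM); `DESIRED_LIMIT` and `freshness` are illustrative names.

```python
from datetime import datetime, timedelta

# Hypothetical limit; the actual value is still to be determined by Eng/PM.
DESIRED_LIMIT = timedelta(minutes=15)


def freshness(last_received, now, limit=DESIRED_LIMIT):
    """Return 'current' when data arrived within the limit, else 'stale'."""
    return "current" if now - last_received <= limit else "stale"


now = datetime(2020, 1, 1, 16, 40)
print(freshness(datetime(2020, 1, 1, 16, 30), now))  # current
print(freshness(datetime(2020, 1, 1, 15, 0), now))   # stale
```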

Dashboard Modules | Capacity

Capacity: User Goals


As a User, I want to see which volumes are running out of space very soon, so that I may allocate more space for them.
As a User, I want to see which volumes are running out of space in the near future, so that I may plan (budget) for them.
As a User, I would like to see pool capacity usage, so that I may plan accordingly (grow/shrink pool, purchase/scale deep).

Dashboard Modules | Capacity

OBSOLETE

System Manager/InfoSight: Capacity Design Proposal


CAPACITY
VOLUME UTILIZATION
7

30

> 90%

80% - 89%

6
400

< 79%

Group B

Group A

POOL USAGE

700TB

Pool A

1PB

Pool B

1PB

300x savings

Pool C

1PB

150x savings

Reserved

View more

Used

Available

Provide an affordance for when the User has more than x pools.

OR display only the top n pools.

Dashboard Modules | Capacity

System Manager: Capacity Design Proposal


CAPACITY
VOLUME UTILIZATION
> 90%

80% - 89%

6
400

< 79%

Group B

Group A

POOL USAGE

These percentages based on volume threshold settings


700TB

Pool A

1PB

Pool B

1PB

300x savings

Pool C

1PB

150x savings

Reserved

View more

Used

Available

Provide an affordance for when the User has more than x pools.

OR display only the top n pools.
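
One way the VOLUME UTILIZATION buckets on this slide might be counted from per-volume usage is sketched below. The 80% and 90% defaults mirror the bucket labels on the wireframe; treating the boundaries as inclusive, and the names `utilization_bucket`, `warn`, and `critical`, are assumptions for illustration, since the real values would come from the volume threshold settings.

```python
from collections import Counter


def utilization_bucket(used_pct, warn=80.0, critical=90.0):
    """Place a volume in a bucket based on its used percentage."""
    if used_pct >= critical:
        return "critical"
    if used_pct >= warn:
        return "warning"
    return "ok"


# Illustrative volumes and usage percentages.
volumes = {"vol-a": 95.0, "vol-b": 84.5, "vol-c": 42.0, "vol-d": 90.0}
bucket_counts = Counter(utilization_bucket(pct) for pct in volumes.values())
print(dict(bucket_counts))  # {'critical': 2, 'warning': 1, 'ok': 1}
```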

Dashboard Modules | Capacity

InfoSight: Capacity Design Proposal


CAPACITY
Timeframes are estimated

VOLUMES REACHING FULL CAPACITY

2-3 weeks

< week

> month

Group B

Group A

POOL USAGE

6
400

700TB

Pool A

1PB

Pool B

1PB

300x savings

Pool C

1PB

150x savings

Used

View more

Unused Reserve

Free

Provide an affordance for when the User has more than x pools.

OR display only the top n pools.

These time frames would ideally be customizable
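
The estimated time frames could be produced along these lines. This sketch assumes a simple linear growth projection, which the slides do not specify, and customizable boundaries of 7, 21, and 30 days chosen to close the gaps between the wireframe labels; `days_until_full` and `timeframe` are hypothetical names.

```python
import math


def days_until_full(size_gb, used_gb, growth_gb_per_day):
    """Estimate days until a volume is full, assuming constant daily growth."""
    if growth_gb_per_day <= 0:
        return math.inf  # not growing, so never projected to fill
    return max(size_gb - used_gb, 0) / growth_gb_per_day


def timeframe(days, boundaries=(7, 21, 30)):
    """Map an estimate to the first boundary (in days) it falls under."""
    for limit in boundaries:
        if days < limit:
            return f"within {limit} days"
    return f"beyond {boundaries[-1]} days"


print(timeframe(days_until_full(1000, 900, 20)))  # within 7 days
print(timeframe(days_until_full(1000, 500, 10)))  # beyond 30 days
```

Passing a different `boundaries` tuple is one way the time frames could be made customizable.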

Dashboard Modules | Protection

Protection: User Goals


As a User, I want to see which local protection (snapshots) has failed, so that I may take action (object limits/space) on it.
As a User, I want to see which remote protection (disaster recovery) has failed, so that I may take action (network/scheduling) on it.
As a User, I would like to see how well I am protected, so that I may adjust RPO objectives.

Dashboard Modules | Protection

Protection: Design Proposal A

OBSOLETE

PROTECTION
PROTECTION LEVEL

90% Snapshots
20% Replication
10% Unprotected

VOLUMES EXCEEDING RPO


Snapshot

Replication

Snapshot limit exceeded

10

< 8 hours

1 < 1 week

between 1 -3 days

2 between 2 -3 weeks

< 4 days

0 < month

Pool out of space

1. Abnormal bandwidth spike


2. Scheduling queue conflict

Dashboard Modules | Protection

OBSOLETE

Protection: Design Proposal B


PROTECTION
PROTECTION LEVEL

90% Snapshots
20% Replication
10% Unprotected

VOLUMES EXCEEDING RPO


Snapshot

Replication
1 < 1 week

2 between 2 -3 weeks
0 < month

1. Abnormal bandwidth spike


2. Scheduling queue conflict

Dashboard Modules | Protection

Protection: Design Proposal C


PROTECTION
PROTECTED VOLUMES
90% Local Only
20% Remote Protection
10% Unprotected

VOLUMES EXCEEDING RPO


Local

All Compliant

Exceeding Remote RPO Causes


1. Abnormal bandwidth spike
2. Scheduling queue conflict

Remote

1
2

< 1 week
between 2 -3 weeks

OBSOLETE

Dashboard Modules | Protection

OBSOLETE

Protection: Design Proposal D


PROTECTION
PROTECTED VOLUMES
90% Local Only
20% Remote Protection
10% Unprotected

MEETING RPO OBJECTIVES


Local

Remote

Lag

Exceeding Remote RPO Causes


1. Abnormal bandwidth spike
2. Scheduling queue conflict

Network

Scheduling

Dashboard Modules | Protection

Protection: Design Proposal D

OBSOLETE

PROTECTION
PROTECTED VOLUMES
90% Local Only
20% Remote Protection
10% Unprotected

MEETING RPO OBJECTIVES


Local

Exceeding Remote RPO Causes


1. Abnormal bandwidth spike
2. Scheduling queue conflict

Remote

Removed Lag breakdown to de-emphasize scheduling issues

Dashboard Modules | Protection

OBSOLETE

Protection: Design Proposal D


PROTECTION
PROTECTED VOLUMES
Protected
Local Only

70%

Remote Protection 20%


Unprotected

10%

MEETING RPO OBJECTIVES


Local

Exceeding Remote RPO Causes


1. Abnormal bandwidth spike
2. Scheduling queue conflict

Remote

Removed Lag breakdown to de-emphasize scheduling issues

Dashboard Modules | Protection

Protection: Design Proposal D


PROTECTION

90%

PROTECTION

PROTECTED
Local

Remote 30%
RPO !

UNPROTECTED 10%

90%

Local
PROTECTED

Remote 30%
RPO !

UNPROTECTED 10%

Dashboard Modules | Performance

Performance: User Goals


As a User, I want to see which pools are experiencing high IOPS, so that I may identify potentially heavy workloads.
As a User, I want to see which pools are experiencing high latency, so that I may identify lags in writes/reads to disk.
As a User, I would like an overview of trends, so that I may adjust workloads.

Dashboard Modules | Performance

EXPLORATION

System Manager: Performance Design Proposal


PERFORMANCE
Top Pools by Latency (within past 12 hours)

[Wireframe: Pool 1 to Pool 4 listed with latency (ms), IOPs (xxx,xxx), and throughput (mbps) values, beside a chart with a time axis of 9AM, 12PM, 3PM, 6PM, 9PM and a hover tooltip reading "Latency: 30 ms"]

Displays range of last 12 hours

Display in real time

Dashboard Modules | Performance

OBSOLETE

InfoSight: Performance
PERFORMANCE
LATENCY

[Wireframe: Array 1 to Array 10 listed with latency (ms), IOPs (xxx,xxx), and throughput (mbps) values, beside a latency chart with a date axis of 2/4 to 2/8, a hover tooltip reading "Latency: 30 ms", and chart labels CPU, Cache, Host, Seq, and I/O]

Max values within the last 24 hours

Displays range of last 5 days

Links to Array (asset) detail page on performance tab

Can the latency threshold be made customizable? If not, 5 ms is the suggested threshold.

Dashboard Modules | Performance

InfoSight: Performance
PERFORMANCE
LATENCY

[Wireframe: Array 1 to Array 10 listed with latency (ms), IOPs (xxx,xxx), and throughput (mbps) values, beside a latency chart with a date axis of 2/4 to 2/8, a hover tooltip reading "Latency: 30 ms", and chart labels CPU, Cache, Host, Seq, and I/O]

Max values within the last 24 hours

Displays range of last 5 days

Links to Array (asset) detail page on performance tab

Can the latency threshold be made customizable? If not, 5 ms is the suggested threshold.

Dashboard Modules | Performance

InfoSight: Performance
PERFORMANCE
LATENCY

[Wireframe: Array 1 to Array 10 listed with latency (ms), IOPs (xxx,xxx), and throughput (mbps) values; some rows carry a factor tag (Throughput, CPU, Cache); a latency chart with a date axis of 2/4 to 2/10 and a hover tooltip reading "Latency: 30 ms, Time: 16:40"]

Max values within the last 24 hours

Amount over the 5 ms threshold vs. peer arrays

(This wording still needs to be wordsmithed.)

Displays range of last week

Links to Array (asset) detail page on performance tab

Can the latency threshold be made customizable? If not, 5 ms is the suggested threshold.

Displays lead contributing factor
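
The "amount over threshold" figure could be computed roughly as below. The 5 ms default follows the suggested threshold on the slide; `amount_over`, `rank_by_excess`, and the sample latencies are hypothetical, and ranking arrays by their excess is only one reading of the comparison against peer arrays.

```python
# Suggested default from the slide; would ideally be customizable.
THRESHOLD_MS = 5.0


def amount_over(latency_ms, threshold_ms=THRESHOLD_MS):
    """Return how far a latency value exceeds the threshold (0 if it does not)."""
    return max(latency_ms - threshold_ms, 0.0)


def rank_by_excess(max_latency_ms, threshold_ms=THRESHOLD_MS):
    """List arrays over the threshold, largest excess first."""
    over = [(name, amount_over(ms, threshold_ms)) for name, ms in max_latency_ms.items()]
    return sorted((pair for pair in over if pair[1] > 0), key=lambda pair: pair[1], reverse=True)


# Illustrative max latencies (ms) per array within the last 24 hours.
sample = {"Array 1": 30.0, "Array 2": 4.0, "Array 3": 12.5}
ranked = rank_by_excess(sample)
print(ranked)  # [('Array 1', 25.0), ('Array 3', 7.5)]
```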

Dashboard Modules | Performance

InfoSight: Performance
PERFORMANCE
LATENCY

[Wireframe: Array 1 to Array 10 listed with latency (ms), IOPs (xxx,xxx), and throughput (mbps) values; some rows carry a factor tag (Throughput, CPU, Cache); a latency chart with a date axis of 2/4 to 2/10, a hover tooltip reading "Latency: 30 ms, Time: 16:40", and a callout near 2/7 reading "Top Latency: 100 ms"]

Max values within the last 24 hours

Dashboard Modules | Performance

EXPLORATION

InfoSight: Top VMs

Top VMs by Latency


Show last 24 hrs

12

Show events

nsdiag-1
pachinko
beta-nfs-1

32 ms
Max Latency

Virtual Machine              Average Latency (ms)
supportnfs01-rhel            32.44
beta-nfs-1                   21.73
infosightsupport-nmbl        26.55
nsdiag-1                     22.59
infosightdb-portal-test-2    21.33
infosightsupport-tsc         18.17
nsdiag-2                     17.31
insfosightweb-test-1         4.28
pachinko                     4.03
live-int                     3.72
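
A Top VMs list like the one above could be produced by selecting the n highest average latencies. A minimal sketch, using a few of the names and values from the table as sample data; `top_vms` is a hypothetical helper, not an InfoSight API.

```python
import heapq


def top_vms(avg_latency_ms, n=10):
    """Return the n VMs with the highest average latency, highest first."""
    return heapq.nlargest(n, avg_latency_ms.items(), key=lambda item: item[1])


# Sample data taken from the table above.
sample = {
    "supportnfs01-rhel": 32.44,
    "beta-nfs-1": 21.73,
    "infosightsupport-nmbl": 26.55,
    "nsdiag-1": 22.59,
    "live-int": 3.72,
}
top_three = top_vms(sample, n=3)
print(top_three)
# [('supportnfs01-rhel', 32.44), ('infosightsupport-nmbl', 26.55), ('nsdiag-1', 22.59)]
```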

Dashboard Modules | Top VMs


I/O and Latency presented individually

OBSOLETE

InfoSight: Top VMs


[Wireframe: two TOP VMs panels, each listing VM-1 to VM-8 with a "Sort by" control. One panel is sorted by total I/O (value label 3,837,980) and captioned "I/O: Values within the last 24 hours"; the other is sorted by average latency (value label 32.44 msec) and captioned "Latency: Values within the last 24 hours".]

Dashboard Modules | Top VMs

InfoSight: Top VMs


[Wireframe: two TOP VMs panels, each with I/O and Latency tabs and the caption "Values within the last 24 hours"]

TOP VMs by Total I/O      TOP VMs by Average Latency
VM-1                      VM-8
VM-2                      VM-6
VM-3                      VM-3
VM-4                      VM-4
VM-5                      VM-9
VM-6                      VM-12
VM-7                      VM-1
VM-8                      VM-18

Value labels shown: 3,837,980 and 3,837 k (I/O); 33 ms (latency)

Dashboard Modules | Overall Layout

Dashboard Modules: Overall Exploration


customer name

InfoSight

Performance
Latency
IOPS
Throughput

Read

Write

5 ms

5 ms

600

1,200

300 mbps

150 mbps

Something

Manage

4
Wellness

Tools

Reports

Protection

Administration

Search for an array

Capacity
Array

Recommendations

System

SLA for lorem ipsum will be expiring next month.


New software upgrades are available for 3 arrays.

Headlines
What can I help you with?

Topology
All

Array

3 volumes have seen high latency over the last 4 hours.


VM

5 VMs experiencing high CPU and Memory usage.

VM

1 array has been trending at unusually high capacity since yesterday.

Pool

Lorem ipsum de loriat. Vivamus vulputate velit viverra.


Folder

Lorem ipsum de loriat. Vivamus vulputate velit viverra.

Lorem ipsum de loriat. Vivamus vulputate velit viverra.


show more...

!
volume-name

volume-name

volume-name

Volume

