IBM System Storage

Client Technical Specialist

TS7600 ProtecTIER Virtual Tape De-duplication

Revision: 07 January 2013

This document is for IBM and IBM Business Partner use only

© 2013 IBM Corporation

Agenda
Introduction
IBM ProtecTIER TS7610/TS7620
  VTL configuration
  OST configuration
  FSI/CIFS configuration
De-duplication algorithms
ProtecTIER family
Backup replication and disaster recovery
Identifying opportunities
Sizing the equipment


Introduction


Backup/restore model: implementation options

[Diagram: application servers and backup servers, LAN-attached or SAN-attached, with primary storage (NAS, DAS, disk, metadata server) and secondary storage (NAS, VTL, disk, tape library, tape drives). A callout marks the focus of this training module.]


Tape virtualization solution models

[Diagram: a backup/restore client and a backup/restore server (TSM application) connect over the SAN to a VTL, a VTL gateway and a tape library. Labels: software emulator and storage repository (TS7620, TS7650, TS7720 mainframe); VTL gateway software emulator (TS7680, TS7650) with a disk system as storage repository (V7000, DS3500, DS5000, other); tape library or VTL/VTS with library manager, slots 1 to n and drives (TS3100, TS3200, TS3500, other; TS7740 mainframe).]

Reference: www.redbooks.ibm.com, IBM System Storage TS7650 GW and TS7620 redbook sg247652


Why virtualize the backup/restore process?

1. Improve the restore process
   a. Better RTO (Recovery Time Objective)
   b. The backup resides on disk
   c. The implemented model is disk-to-disk

2. Improve the RPO (Recovery Point Objective, the recovery point in time)
   a. More frequent incremental backups
   b. Use of virtualized disk for copies

3. Improve the backup process
   a. Parallel processes
   b. Several backups/restores possible simultaneously

4. Optimize the network infrastructure for remote backup
   a. Improve RTO and RPO during recovery

[Chart: backup window over time, virtual tape drives versus real tape drives]


IBM ProtecTIER - TS7620 - VTL


What is ProtecTIER?
A server used for data backup and restore
It presents itself to the backup servers in one of three ways:

1. Virtual Tape Library (VTL)
   Robot, cartridges and tape drives

2. Symantec OpenStorage Technology (OST)
   Delivers logical disk drives
   Integrates with NetBackup

3. File System Interface (FSI)
   Delivers file system shares
   Supports the CIFS protocol
   Used for backup/restore through a backup application
   Exports shares on the IP network

Uses a disk repository to store the backup data


What is ProtecTIER?
Configurable in two options
  Appliance: contains the server and the repository
  Gateway: the server accesses the repository over the SAN

The ProtecTIER server is based on System x
ProtecTIER is the software, which runs on Linux

Repository space is optimized
  Data de-duplication algorithm, referred to as HyperFactor
  Data compression

Remote data replication over TCP/IP


ProtecTIER family
TS7650G Gateway (3958-DD5)
  Repository: up to 1.0 PB (usable)
TS7650 Appliance
TS7610 Appliance Express
TS7620 Appliance Express (3959-SM2)
  Repository (usable capacity): 5.4 TB or 11.0 TB


ProtecTIER Virtual Tape Library


The backup software sees the data as being written to cartridges
ProtecTIER stores and restores the data directly on disk
The data in the repository is de-duplicated

[Diagram: ProtecTIER application and repository, 5.5 or 11 TB physical usable capacity]


TS7620 ProtecTIER Deduplication Appliance Express


VTL & CIFS performance, up to:
  145 MB/s backup
  190 MB/s restore
OST performance, up to:
  130 MB/s backup
  170 MB/s restore
Two configurations: 5.4 TB and 11 TB
  Usable capacity, not raw
  Field upgradeable by customer

Same enterprise-proven ProtecTIER technology


TS7620 Appliance Hardware Building Block


Integrated server, storage and ProtecTIER deduplication software
3U enclosure fits in a standard 19-inch rack
Storage: twelve 2 TB NL-SAS drives, RAID 6
Server: 6-core Intel Xeon E5645 (Westmere) 2.4 GHz processor
Memory: 48 GB RAM


TS7620 deployment
Shipped preconfigured as a TS3500 virtual library:
  16 LTO3 virtual drives, balanced evenly across both FC ports
  200 GB cartridge size
  16 virtual import/export slots
  Small model (5.4 TiB): 400 virtual slots and 400 virtual cartridges
  Medium model (11 TiB): 540 virtual slots and 540 virtual cartridges
Configuration can be modified by customer

Application interface support
VTL
  2 x 8 Gb FC ports for host connectivity
  2 x 1 Gb copper Ethernet for ProtecTIER Native Replication
  2 x 1 Gb copper Ethernet for customer network
OST, CIFS
  2 x 1 Gb Ethernet ports for host connectivity
  2 x 1 Gb copper Ethernet for ProtecTIER Native Replication
  2 x 1 Gb copper Ethernet for customer network

[Diagram: backup clients on the LAN, backup server, SAN, TS7620 ProtecTIER VTL]

OST configuration

ProtecTIER and OpenStorage Technology (OST)

[Diagram: a NetBackup server (NetBackup policy and control, ProtecTIER OST plug-in) connected over the TCP/IP network to the ProtecTIER servers]

The OST API integrates ProtecTIER with Symantec NetBackup
Enables backup to disk without Tape Library emulation
The OST API plug-in is installed on the NetBackup media server

The OST API separates the backup logic from the ProtecTIER logic
Supports data transfer and control between the ProtecTIER servers and the backup servers


OST architecture

[Diagram: clients send backup and restore traffic to the NetBackup media servers. The NetBackup master server holds the resource manager, disk service, configuration database (EMM) and catalog. On the media servers, remote access (configuration and management) and data movement (bptm/bpdm) sit on top of the OpenStorage API. Its plug-ins are Basic Disk, Advanced Disk, Shared Disk and the ProtecTIER OST interface, which connects to the ProtecTIER Gateway or Appliance.]



OST operation

[Diagram: the media server writes backup copy 1 to a ProtecTIER system, which replicates copy 2 to a second ProtecTIER system; the catalog is updated with both copies. Example retention periods shown: 2 weeks and 3 months.]

Benefits
Reduces workload on the media server
Catalog-awareness of off-site images
Faster and more flexible operations
  Choose which images to duplicate. Enables filtering for SLAs plus space and bandwidth utilization.
  Apply different retention periods.
  The second image is an independent copy.


Main benefits of OST

Defines a new kind of disk, the Logical Storage Unit (LSU), which can be duplicated, moved or shared among several NetBackup media servers
Backups can be replicated to a remote site or copied to tape under full NetBackup control
Supports the Machine-to-Machine solution (max. 12 nodes) and cascaded replication with NetBackup integration
Full or partial recovery of replicated backup images through a NetBackup user interface


FSI/CIFS configuration

ProtecTIER File System Interface (FSI)


Presents ProtecTIER as a backup NAS
ProtecTIER emulates Windows file system behavior and presents to CIFS clients a virtualized hierarchy of:
  File systems
  Directories
  Files
Used for data backup and restore through a backup application:
  CommVault
  Tivoli Storage Manager
  EMC NetWorker
  Symantec NetBackup
  Symantec BackupExec
Different file systems can reside within the ProtecTIER repository
Samba/CIFS is used internally
  The Samba VFS (virtual file system) is mapped to the native ProtecTIER file system interface (FSI)
ProtecTIER FSI is not meant to be used as a NAS server
Uses HyperFactor to de-duplicate data
Mirrors backups over TCP/IP, reducing the bandwidth needed on the links

[Diagram: ProtecTIER server on the IP network; SMB/CIFS emulation mapping onto the ProtecTIER native interface]


Authentication in a Windows domain

Active Directory
  The ProtecTIER system must belong to one of the Windows domains
  The domain contains the ProtecTIER system
  Authentication is performed on the AD server using Kerberos

Workgroups
  Users are defined within the ProtecTIER system
  ProtecTIER is the authentication server


File system and share view in ProtecTIER

TSM server example: defining the ProtecTIER IP address and the share

  DEFine DEVclass PT1 DEVType=FILE MOUNTLimit=32 MAXCAPacity=16G DIRectory=\\10.200.40.1\sharename1

(Format is \\FSI_IP\CIFS_name)


De-duplication algorithms

What is data deduplication?

Chunking: splitting the data into units in order to find duplicates
  Unit: a block, a file
  Repository: contains the unique chunks

[Diagram: a data object / data stream split into chunks]

Chunking methods:
  File based: the file is the chunk (dedup used in file systems)
  Block based: the object is broken into blocks (dedup used on disk)
  Format/content aware: for example PowerPoint, where the slides are the units


What is data deduplication? - Chunks

Processing:
  Chunks are identified and processed
  A value (hash number, digital signature, fingerprint) associated with the chunk's content is calculated

Calculation methods:
  Hashing
  Binary comparison

[Diagram: data object / data stream and repository]


Hashing algorithms: MD5 and SHA-1 codes

MD5: 16-byte long hash
Change one letter and the hash value is different

  # echo The Quick Brown Fox Jumps Over the Lazy Dog | md5sum
  9d56076597de1aeb532727f7f681bcb0
  # echo The Quick Brown Fox Dumps Over the Lazy Dog | md5sum
  5800fccb352352308b02d442170b039d

SHA-1: 20-byte long hash

  # echo The Quick Brown Fox Jumps Over the Lazy Dog | sha1sum
  f68f38ee07e310fd263c9c491273d81963fbff35
  # echo The Quick Brown Fox Dumps Over the Lazy Dog | sha1sum
  d4e6aa9ab83076e8b8a21930cc1fb8b5e5ba2335
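
The same sensitivity to a one-letter change can be checked with Python's standard hashlib module. This is a minimal sketch: the helper name digests is only an illustration, and the trailing newline is added on the assumption that echo appends one.

```python
import hashlib

def digests(text):
    """Return (MD5, SHA-1) hex digests of text plus the newline echo would add."""
    data = (text + "\n").encode("ascii")
    return hashlib.md5(data).hexdigest(), hashlib.sha1(data).hexdigest()

jumps_md5, jumps_sha1 = digests("The Quick Brown Fox Jumps Over the Lazy Dog")
dumps_md5, dumps_sha1 = digests("The Quick Brown Fox Dumps Over the Lazy Dog")

# Hex digests use two characters per byte: 16 bytes for MD5, 20 for SHA-1.
print(len(jumps_md5) // 2, len(jumps_sha1) // 2)
# A single changed letter yields a different digest for both algorithms.
print(jumps_md5 != dumps_md5, jumps_sha1 != dumps_sha1)
```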


Hashing algorithm

A table is maintained with hash values and the location of the data (hash value, pointer)

Does the hash value exist? If yes, the data is discarded
As more data is written:
  The table grows in size
  Table lookup time increases
  Backup time is affected

[Flowchart: Hash already exists? Yes: data is duplicated. No: store the new data, then update the hash index.]
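
The flow above can be sketched as a toy in-memory store. This is an illustration under stated assumptions, not any product's implementation: the class name HashDedupStore, the fixed-size chunking and the tiny chunk size used in the demo are all hypothetical choices.

```python
import hashlib

class HashDedupStore:
    """Toy hash-based chunk store: one index entry per unique chunk."""

    def __init__(self, chunk_size=16 * 1024):
        self.chunk_size = chunk_size
        self.index = {}    # hash value -> position of the chunk in self.chunks
        self.chunks = []   # stands in for the disk repository

    def write(self, data):
        """Store data and return the pointers needed to rebuild it."""
        pointers = []
        for start in range(0, len(data), self.chunk_size):
            chunk = data[start:start + self.chunk_size]
            key = hashlib.sha1(chunk).digest()
            if key not in self.index:
                # Hash not seen before: store the new data, update the index.
                self.index[key] = len(self.chunks)
                self.chunks.append(chunk)
            # Hash already present: the chunk is treated as a duplicate.
            pointers.append(self.index[key])
        return pointers

    def read(self, pointers):
        """Rebuild the original data from its chunk pointers."""
        return b"".join(self.chunks[p] for p in pointers)

store = HashDedupStore(chunk_size=4)
backup = b"AAAABBBBAAAACCCC"
pointers = store.write(backup)
print(pointers, len(store.chunks))
```

Note that the store trusts the hash alone when deciding that a chunk is a duplicate, which is exactly where a collision would silently lose data.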

Example of the size of a table

The SHA-1 algorithm has a hash value of 20 bytes
Repository size: 50 TB
Chunk size: 16 KB

  50,000,000,000 KB / 16 KB = 3,125,000,000 entries
  3,125,000,000 x 20 bytes =~ 63 GB
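
The arithmetic on this slide can be reproduced in a few lines. The sketch assumes decimal units, as the slide does, and counts only the hash values themselves, not the pointers or any other per-entry overhead.

```python
REPOSITORY_KB = 50_000_000_000   # 50 TB expressed in decimal KB, as on the slide
CHUNK_KB = 16
HASH_BYTES = 20                  # SHA-1 digest length

entries = REPOSITORY_KB // CHUNK_KB
table_bytes = entries * HASH_BYTES
table_gb = table_bytes / 1e9     # decimal GB

print(f"{entries:,} entries, {table_gb:.1f} GB")
```

The result is 62.5 GB, which the slide rounds to about 63 GB.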


Hashing algorithm: the collision problem

There is a probability that two chunks with different bytes generate the same hash value, causing a collision.

The hashing algorithm then discards the data and information is lost.

Reference: http://preshing.com/20110504/hash-collision-probabilities
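
To put a number on that probability, the usual birthday-bound approximation can be evaluated directly. This sketch assumes uniformly distributed hash values and accidental collisions only; the function name and the choice of 3,125,000,000 chunks (the 50 TB repository with 16 KB chunks used earlier in this deck) are illustrative.

```python
import math

def collision_probability(num_chunks, hash_bits):
    """Birthday-bound approximation: 1 - exp(-n(n-1) / (2 * 2**bits))."""
    exponent = -num_chunks * (num_chunks - 1) / (2 * 2 ** hash_bits)
    # expm1 keeps precision when the exponent is extremely close to zero.
    return -math.expm1(exponent)

p_sha1 = collision_probability(3_125_000_000, 160)   # SHA-1: 160-bit hash
p_md5 = collision_probability(3_125_000_000, 128)    # MD5: 128-bit hash
print(f"SHA-1: {p_sha1:.2e}, MD5: {p_md5:.2e}")
```

Under these assumptions the probability is very small but not zero, and it grows roughly with the square of the number of chunks.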


Basic concept of the ProtecTIER algorithm

[Diagram: backup servers send a new data stream to the ProtecTIER server, where HyperFactor uses the Memory Resident Index to filter the data before it is written to the repository]

Only 4 GB needed to map 1 PB of physical disk!


HyperFactor Algorithm Overview

ProtecTIER data deduplication is performed inline.

1. New data stream is sent to the ProtecTIER server, where it is received and analyzed by HyperFactor.
2. For each data element, HyperFactor searches the Memory Resident Index to locate the data in the repository that is most similar to the data element.
3. The similar data from the repository is read.
4. A binary differential between the new data element and the data from the repository is performed, resulting in the delta difference.
5. The delta is written to the disk repository after being compressed (LZH).
6. The Memory Resident Index is updated with the location of the new data that has been added.
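
The six steps can be illustrated with a deliberately naive toy. This is not the HyperFactor implementation, whose internals this deck does not describe. The sketch assumes a linear scan in place of the Memory Resident Index, Python's difflib for the binary differential and zlib in place of LZH, and the class name ToyDeltaStore is hypothetical.

```python
import difflib
import zlib

class ToyDeltaStore:
    """Toy similarity-plus-delta store; not the HyperFactor implementation."""

    def __init__(self):
        self.index = []   # stand-in for the index: previously stored elements

    def _most_similar(self, element):
        """Step 2 stand-in: linear scan for the most similar stored element."""
        best, best_ratio = b"", 0.0
        for old in self.index:
            ratio = difflib.SequenceMatcher(None, old, element, autojunk=False).ratio()
            if ratio > best_ratio:
                best, best_ratio = old, ratio
        return best

    def write(self, element):
        """Return the number of bytes that would be written for element."""
        base = self._most_similar(element)                    # steps 2 and 3
        matcher = difflib.SequenceMatcher(None, base, element, autojunk=False)
        new_bytes = b"".join(                                 # step 4: the delta
            element[j1:j2]
            for tag, _i1, _i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"
        )
        written = len(zlib.compress(new_bytes))               # step 5 (zlib, not LZH)
        self.index.append(element)                            # step 6
        return written

store = ToyDeltaStore()
first = bytes(range(256)) * 8
second = first[:1000] + b"changed" + first[1000:]
print(store.write(first), store.write(second))
```

The second write costs far fewer bytes than the first because only the inserted bytes differ from the most similar stored element. A real system would also store copy instructions so the element can be rebuilt, which this sketch omits.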


Post-processing versus Inline-processing


Post-processing de-duplication
  Backups run first; the de-dup algorithm runs thereafter
  Requires extra disk space to hold the interim full-sized copy of the backup
  Used when the de-dup algorithm is not fast enough to run inline

Inline de-duplication (e.g. HyperFactor)
  De-dup runs as part of the backup process
  Uses less disk
  Once the save is done, the entire process is done
  Only possible with a fast de-dup algorithm like ProtecTIER HyperFactor


Backup replication and disaster recovery

ProtecTIER Native Replication w/ TS7620


Up to 12 branch offices (spokes) supported per target TS7650 (hub)
TS7620 supported as hub with a limit of 4 spokes
Virtual cartridges can be cloned to tape by the main-site backup server

[Diagram: spokes connected over IP-based native replication (NR) links to the hub at the central / DR site, which has a backup server, a ProtecTIER gateway, physical capacity and a tape library]


TS7620 as Spoke
TS7620 is best suited in a replication topology as a spoke, replicating to a TS7650 Appliance or Gateway hub

Physical bandwidth is the primary concern for a TS7620 spoke

ProtecTIER native replication only replicates unique data, so the amount of bandwidth necessary will depend on the achieved deduplication rate.

Bandwidth sizing example:
  TS7620 spoke backing up 500 GB of data daily
  Average dedupe ratio measured or estimated at ~10:1
  Every daily backup (all 500 GB) must be replicated to the data center TS7650G hub
  12-hour replication window
  500 GB / 10:1 / 12 hr = 4.1 GB/hr, so a physical bandwidth pipe of about 1.2 MB/s is required
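
The bandwidth sizing arithmetic can be written as a small helper. This is a sketch under the slide's assumptions (only unique data is replicated, spread evenly over the window); the function name and the use of decimal units are choices made here.

```python
def replication_bandwidth(daily_backup_gb, dedupe_ratio, window_hours):
    """Return (GB per hour, MB per second) of unique data to replicate."""
    unique_gb = daily_backup_gb / dedupe_ratio
    gb_per_hour = unique_gb / window_hours
    mb_per_second = gb_per_hour * 1000 / 3600   # decimal units: 1 GB = 1000 MB
    return gb_per_hour, mb_per_second

gb_per_hour, mb_per_second = replication_bandwidth(500, 10, 12)
print(f"{gb_per_hour:.2f} GB/hr, {mb_per_second:.2f} MB/s")
```

For 500 GB per day at 10:1 over 12 hours this gives about 4.17 GB/hr and 1.16 MB/s, which the slide states as 4.1 GB/hr and roughly 1.2 MB/s.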


TS7620 as Hub
If deployed as a hub, the maximum number of spokes is limited to 4, as opposed to 12 spokes for a TS7650 hub

Capacity planning for TS7620 as hub:
  The TS7620 hub daily nominal workload (i.e. the pre-dedupe replication workload of all spokes plus local backup) should not exceed 500 GB
  Data is only deduplicated at the spokes/sources. It is not deduped again at the hub against other spokes, so even if multiple spokes back up very similar data, the data will appear different at the hub
  Bottom line: for planning purposes, always treat replicated data at the hub as part of the overall 500 GB daily backup limit
  Example of maximum use case: TS7620 used as hub + 4 spokes
    Hub performs 100 GB of daily backups
    (500 - 100) / 4 = 100
    On average, each of the 4 spokes can replicate up to 100 GB daily (although they may back up more than is replicated)

Hub performance implications: performance will not be an issue if the capacity guideline is maintained
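
The capacity guideline for a TS7620 hub can be expressed as a small check. This is a sketch: the constants come from this slide, while the function name and the error handling are illustrative assumptions.

```python
HUB_DAILY_LIMIT_GB = 500   # nominal daily guideline for a TS7620 hub
MAX_SPOKES = 4             # spoke limit for a TS7620 hub

def per_spoke_budget_gb(hub_local_backup_gb, spokes):
    """Average daily replication budget per spoke under the 500 GB guideline."""
    if not 1 <= spokes <= MAX_SPOKES:
        raise ValueError(f"a TS7620 hub supports 1 to {MAX_SPOKES} spokes")
    remaining_gb = HUB_DAILY_LIMIT_GB - hub_local_backup_gb
    if remaining_gb < 0:
        raise ValueError("local backup alone exceeds the daily guideline")
    return remaining_gb / spokes

print(per_spoke_budget_gb(100, 4))
```

With 100 GB of local backup and 4 spokes, each spoke can replicate 100 GB per day on average, matching the slide's example.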


ProtecTIER sizing

Qualifying the opportunity for TS7620

Is the solution right for the customer?
  Do the capacity and performance characteristics meet the customer's requirements?
  Does the data type benefit from the HyperFactor (dedup) algorithm?
    Poor fit: encrypted data, compressed data, etc.
  Is TS7620 interoperability certified for the environment?

Suggestion from the datasheet
  "...and is ideal for:"
  Customers experiencing significant data growth
  Weekly full backups of 3 TB or less
  Daily incremental backups of 1 TB or less
  Customers looking to make backup and recovery improvements without making radical changes


Capacity and Performance Requirements for TS7620 5.5TB Configuration

Capacity: the recommended backup workload for this configuration is 500 GB or less per day
  Daily backups exceeding 500 GB may be suitable with a relatively low data change rate, but because data change rate cannot be accurately gauged (without measurement), it's recommended to assume 15-20% by default
  The following table illustrates how physical space consumption is derived from the three general factors: backup size, retention, and data change rate

Performance: customer workload cannot exceed the TS7620 performance capability of 145 MB/s for VTL or 130 MB/s for FSI-CIFS
  Both backup and restore throughput requirements should be taken into consideration

Daily Backup  Retention (days)  Change Rate  Required Physical Space      Dedupe Ratio
300 GB        30                10%          1.17 TB (300 + 300*29*0.1)   7.6:1
300 GB        30                20%          2.04 TB                      4.5:1
300 GB        60                20%          3.84 TB                      4.6:1
300 GB        90                10%          3 TB                         9.0:1
500 GB        30                10%          1.95 TB                      7.7:1
500 GB        30                20%          3.4 TB                       4.4:1
500 GB        60                10%          3.45 TB                      8.7:1
500 GB        60                15%          4.93 TB                      6.1:1
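
The formula shown for the first row (300 + 300*29*0.1) generalizes to the other rows of the table. The sketch below assumes that retention is counted in days and that the dedupe ratio is nominal data divided by physical space, which approximately matches the table's last column; the function names are illustrative.

```python
def required_space_gb(daily_backup_gb, retention_days, change_rate):
    """First backup stored in full, then only the changed fraction of each later one."""
    return daily_backup_gb + daily_backup_gb * (retention_days - 1) * change_rate

def dedupe_ratio(daily_backup_gb, retention_days, change_rate):
    """Nominal data retained divided by the physical space it occupies."""
    nominal_gb = daily_backup_gb * retention_days
    return nominal_gb / required_space_gb(daily_backup_gb, retention_days, change_rate)

for daily, days, rate in [(300, 30, 0.10), (300, 60, 0.20), (500, 60, 0.15)]:
    space_tb = required_space_gb(daily, days, rate) / 1000
    ratio = dedupe_ratio(daily, days, rate)
    print(f"{daily} GB x {days} days at {rate:.0%}: {space_tb:.2f} TB, {ratio:.1f}:1")
```

This first-order model ignores compression, so it is a planning estimate rather than a measurement.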

Capacity and Performance Requirements for TS7620 11TB Configuration

Capacity: the recommended backup workload for this configuration is 1 TB or less per day
  Daily backups exceeding 1 TB may be suitable with a relatively low data change rate, but because data change rate cannot be accurately gauged (without measurement), it's recommended to assume 15-20% by default
  The following table illustrates how physical space consumption is derived from the three general factors: backup size, retention, and data change rate

Performance: customer workload cannot exceed the TS7620 performance capability of 145 MB/s
  Both backup and restore throughput requirements should be taken into consideration

Daily Backup  Retention (days)  Change Rate  Required Physical Space  Dedupe Ratio
600 GB        30                10%          2.34 TB                  7.6:1
600 GB        30                20%          4.08 TB                  4.5:1
600 GB        60                20%          7.68 TB                  4.6:1
600 GB        90                10%          6 TB                     9.0:1
1 TB          30                10%          3.9 TB                   7.7:1
1 TB          30                20%          6.8 TB                   4.4:1
1 TB          60                10%          6.9 TB                   8.7:1
1 TB          60                15%          8.6 TB                   6.1:1

Calculating the de-duplication factor

Workload               New data after HyperFactor  Disk usage after LZH compression  Nominal total  Accumulated on disk
Full backup #1, 10 TB  10 TB                       5 TB                              10 TB          5 TB
Incremental, 1 TB      ~300 GB                     ~150 GB                           11 TB          5.15 TB
Incremental, 1 TB      ~300 GB                     ~150 GB                           12 TB          5.3 TB
Incremental, 1 TB      ~300 GB                     ~150 GB                           13 TB          5.45 TB
Incremental, 1 TB      ~300 GB                     ~150 GB                           14 TB          5.6 TB
Incremental, 1 TB      ~300 GB                     ~150 GB                           15 TB          5.75 TB
Full backup #2, 10 TB  1 TB                        500 GB                            25 TB          6.25 TB

HyperFactor ratio: 25 TB nominal / 6.25 TB accumulated = 4:1
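
The 4:1 figure follows from dividing the nominal total by the accumulated disk usage. The sketch below reads the sequence from the slide's running totals (one full backup, five 1 TB incrementals, then a second full backup), and that reading of the garbled figure is an assumption.

```python
# Each entry: (nominal GB sent by the backup, GB written to disk after
# HyperFactor filtering and compression), as read from the slide.
backups = [(10_000, 5_000)]        # full backup #1
backups += [(1_000, 150)] * 5      # five 1 TB incrementals
backups += [(10_000, 500)]         # full backup #2

nominal_gb = sum(nominal for nominal, _ in backups)
physical_gb = sum(physical for _, physical in backups)
ratio = nominal_gb / physical_gb

print(f"{nominal_gb / 1000:g} TB nominal / {physical_gb / 1000:g} TB on disk = {ratio:g}:1")
```

The ratio keeps improving as more backups with little new data are retained, which is why retention is one of the three sizing factors.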


Support portal on PartnerWorld

Architecting a Solution Capacity Planner


Suggestions for good/poor candidates

Deduplication results will vary based on backup data change rate
Certain data types are prone to higher change rates due to their internal makeup and format

Good candidates
  Databases (uncompressed, unencrypted)
  Operating system and application software packages
  Text files, log files (usually dedupe very well)
  Email (PST, DBX, Domino DB, and similar files)
  Snapshots (filer snaps, VMware images, BCVs)

Problematic candidates
  Images, video (JPEG, GIF, TIF, MPEG, others), unless redundant
  Compressed and encrypted files (unless redundant)
  Rendering or seismic data
  CAD/CAM (depending on the type)


Technical and Delivery Assessment Product Checklist


Official pre-sale qualification document for the TS7620
A checklist that outlines the set of intuitive qualifiers of a TS7620 order
  No need for deep technical expertise to evaluate
  No review conference call required, as is the case for TS7650G TDAs

Checklist will be posted at the following location by GA:

BPs: http://partners.boulder.ibm.com/src/assur30i.nsf/WebIndex/SA933


Takeaways

ProtecTIER de-dup eliminates redundant data during backups
  One of the effective ways to manage exponential data growth
De-dup stores more backup data on less disk
  A disk-efficiency technology
ProtecTIER delivers fast backups and, above all, faster restores
Reduces the bandwidth needed to replicate data over IP between remote sites
HyperFactor is the patented de-duplication algorithm and guarantees 100% data integrity
  Unlike a hashing algorithm, where a collision means data loss
Simplifies the implementation of disaster recovery solutions
Emulates a Tape Library: robotics commands, LTO tape drives, virtual cartridges and slots
Two models: Gateway and Appliance, both with SAN connections over Fibre Channel


End
