Course Objectives
Evaluate storage architectures and key data center elements in classic, virtualized, and cloud environments
Explain physical and logical components of a storage infrastructure including storage subsystems, RAID, and intelligent storage systems
Describe storage networking technologies such as FC SAN, IP SAN, FCoE, NAS, and object-based and unified storage
Articulate business continuity solutions (backup and replication) and archive for managing fixed content
Describe information security requirements and solutions, and identify parameters for managing and monitoring storage infrastructure in classic, virtualized, and cloud environments
2012 EMC Corporation. All rights reserved. TDC-478 Information Storage and Management
Course Organization
Module 1: Introduction to Information Storage
Module 2: Data Center Environment
Module 3: Data Protection: RAID
Module 4: Intelligent Storage System
Module 5: Fibre Channel Storage Area Network (FC SAN)
Module 6: IP SAN and FCoE
Module 7: Network-Attached Storage (NAS)
Module 8: Object-based and Unified Storage
Module 9: Introduction to Business Continuity
Module 10: Backup and Archive
Module 11: Local Replication
Module 12: Remote Replication
Module 13: Cloud Computing
Module 14: Securing the Storage Infrastructure
Module 15: Managing the Storage Infrastructure
Next
Take a free online practice test to identify areas you may need to review.
For more information, please visit http://education.emc.com/certification.
Receive no-cost Knowledge Maintenance updates on the latest EMC products and technologies.
Learn from in-depth technical papers from our Knowledge Sharing library. Become an advocate for the Information Storage and Management industry.
Get published. Share your expertise and best practices in our Knowledge Sharing Competition.
Connect and Collaborate
Join a community of the most trusted professionals in the industry. Discuss, share, find answers, or simply connect in our EMC Proven Professional online community.
http://education.EMC.com/ProvenCommunity
Like Us: EMC Proven Professional
Follow Us: @EMCEducation, @EMCProven
Certification
TDC-478 students are eligible for a 50% discount with a free retake on one exam voucher for $100.00. (The standard exam price is $200.00 USD and allows for one attempt.) Students who are interested in taking this exam should obtain an ISM student exam voucher code from me at the end of the quarter.
Course Information
Contact Information
E-Mail:
jcannici@cdm.depaul.edu
Office Hours:
Course notes will be available on the Course Online website at http://col.cdm.depaul.edu. It is your responsibility to download and print a copy of the slides before coming to class.
Grade Components:
Homework (4 @ 5% each)
Exercises (2 @ 5% each)
Group Presentations
Midterm Exam
Final Exam
Midterm Exam will be on October 10th. Final Exam will be on November 21st.
Homework
4 Homework Assignments.
Must be submitted via COL. Four homework assignments will be given during the term. Late submissions will not be accepted, no exceptions.
Exercises
Two practical exercises will be given during the term. Late submissions will not be accepted, no exceptions.
Textbook REQUIRED!
Information Storage and Management by EMC - Wiley Publishing Inc. 2012 (ISBN: 978-1-118-09483-9)
Chapter 1 Objectives
Define data and information
Describe types of data
Describe the evolution of storage architecture
Describe the core elements of a data center
List the key characteristics of a data center
Provide an overview of virtualization and cloud computing
The 21st century is the information era
Information is being created at an ever-increasing rate
Information has become critical for success
Example: social networking sites, e-mail, video- and photo-sharing websites, online shopping, search engines, etc.
Information is the knowledge derived from data
Growth of digital information has resulted in an information explosion
We live in an on-command, on-demand world
Increasing dependency on fast and reliable access to information
Businesses seek to store, protect, optimize, and leverage the information
What is Data?
It is a collection of raw facts from which conclusions may be drawn.
Data is converted into a more convenient form: digital data
Factors for digital data growth are:
Increase in data-processing capabilities
Lower cost of digital storage
Affordable and faster communication technology
Examples of conversion to digital form: book to e-book, letter to email
Categories of Data
Data is classified as structured or unstructured
Unstructured (90%): PDFs, email attachments, X-rays, manuals, instant messages, documents, web pages, rich media, invoices, audio, video
Big Data
It refers to data sets whose sizes are beyond the ability of commonly used software tools to capture, store, manage, and process within acceptable time limits.
Includes both structured and unstructured data generated by a variety of sources
Big data analysis in real time requires new techniques and tools that provide:
High performance
Massively parallel processing (MPP) data platforms
Advanced analytics
Big data analytics provide an opportunity to translate large volumes of data into the right decisions
Storage
Data created by individuals/businesses must be stored for further processing
Type of storage used is based on the type of data and the rate at which it is created and used
Examples:
Individuals: digital camera, cell phone, DVDs, hard disk
Businesses: hard disk, external disk arrays, tape library
Evolution of Storage Architecture
Centralized: mainframe computers
Decentralized: client/server model
Centralized: storage networking
[Figure: storage architecture over time; labels: LAN, FC SAN]
Five core elements essential for the basic functionality of a data center:
[Figure: key characteristics of a data center; surviving labels: Manageability, Performance, Scalability, Capacity]
Monitoring: Continuous process of gathering information on various elements and services running in a data center
Reporting: Details on resource performance, capacity, and utilization
Provisioning: Configuration and allocation of resources to meet the capacity, availability, performance, and security requirements
Virtualization and cloud computing have changed the data center.
Virtualization: An Overview
Virtualization is a technique of abstracting physical resources and making them appear as logical resources
Pools physical resources and provides an aggregated view of physical resource capabilities
Virtual resources can be created from pooled physical resources
Cloud Computing: An Overview
Enables individuals and organizations to use IT resources as a service over a network
Enables self-service requesting and automates the request-fulfillment process
Enables users to scale up or scale down the usage of computing resources quickly
Consumers pay only for the resources they use
Example: CPU hours used, amount of data transferred, and gigabytes of data stored
Chapter 1 Summary
Data and information
Types of data
Big data
Evolution of storage architecture
Core elements of a data center
Key characteristics of a data center
Virtualization and cloud computing
Chapter 2 Objectives
Describe the core elements of a data center
Describe virtualization at the application and host layers
Describe disk drive components and performance
Describe host access to storage through DAS
Describe the working and benefits of flash drives
Application and application virtualization
DBMS
Components of host system
Compute and memory virtualization
Application
A software program that provides logic for computing operations
Commonly deployed applications in a data center:
Business applications: email, enterprise resource planning (ERP), decision support system (DSS)
Management applications: resource management, performance tuning, virtualization
Data protection applications: backup, replication
Security applications: authentication, antivirus
I/O characteristics of an application:
Read intensive vs. write intensive
Sequential vs. random
I/O size
Application Virtualization
It is the technique of presenting an application to an end user without any installation, integration, or dependencies on the underlying computing platform.
Aggregates Operating System (OS) resources and the application into a virtualized container
Ensures integrity of Operating System (OS) and applications
Avoids conflicts between different applications or different versions of the same application
A database is a structured way to store data in logically organized tables that are interrelated
The DBMS processes an application's request for data
The DBMS instructs the OS to retrieve the appropriate data from storage
Popular DBMS examples are MySQL, Oracle RDBMS, SQL Server, etc.
Host
Resource that runs applications with the help of underlying computing components
Hardware components include CPU, memory, and input/output (I/O) devices
Software components include OS, device driver, file system, volume manager, and so on
Software components
Virtualization layer controls the environment
OS works as a guest and only controls the application environment
In some implementations, the OS is modified to communicate with the virtualization layer
A device driver is software that enables the OS to recognize a specific device
Memory Virtualization
An OS feature that presents larger memory to the application than physically available
Additional memory space comes from disk storage
Space used on the disk for virtual memory is called swap space/swap file or page file
Inactive memory pages are moved from physical memory to the swap file
Provides efficient use of available physical memory
Data access from the swap file is slower; use of flash drives for swap space gives the best performance
Logical Storage
Physical view of storage is converted to a logical view by mapping
Logical data blocks are mapped to physical data blocks
LVM
Usually offered as part of the operating system or as third-party host software
LVM Components:
Volume Groups
One or more Physical Volumes form a Volume Group
LVM manages Volume Groups as a single entity
Physical Volumes can be added and removed from a Volume Group as necessary
Physical Volumes are typically divided into contiguous equal-sized disk blocks
A host will always have at least one disk group for the OS
Application and Operating System data are maintained in separate volume groups
[Figure: Volume Group; labels: Logical Volume, Logical Disk Block, Physical Volume 2, Physical Volume 3, Physical Disk Block]
[Figure: Partitioning and Concatenation; labels: Logical Volume, Physical Volume]
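The mapping of logical data blocks to physical data blocks can be illustrated with a minimal sketch of concatenation. It is not taken from the slides: the function name `map_logical_block`, the block-count list `pv_sizes`, and the example layout are hypothetical, and a real volume manager tracks far more state than this.

```python
from bisect import bisect_right
from itertools import accumulate


def map_logical_block(logical_block, pv_sizes):
    """Map a logical block number to (physical volume index, physical block).

    Assumes the logical volume is a plain concatenation of the physical
    volumes, in order, with sizes given in blocks (hypothetical layout).
    """
    if not pv_sizes:
        raise ValueError("at least one physical volume is required")
    # boundaries[i] is the first logical block that lies *after* volume i.
    boundaries = list(accumulate(pv_sizes))
    if not 0 <= logical_block < boundaries[-1]:
        raise ValueError("logical block is outside the logical volume")
    # The first boundary strictly greater than the block identifies the volume.
    pv_index = bisect_right(boundaries, logical_block)
    start_of_pv = boundaries[pv_index - 1] if pv_index else 0
    return pv_index, logical_block - start_of_pv


# Hypothetical example: three physical volumes of 100, 50, and 200 blocks.
pv_sizes = [100, 50, 200]
for block in (0, 99, 100, 120, 349):
    print(block, "->", map_logical_block(block, pv_sizes))
```

The host sees one contiguous range of 350 logical blocks, while each block actually lives at an offset inside one of the three physical volumes.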
[Figure: six-step mapping that ends at Disk Sectors; step labels: 1 Creates/Manages, 2 Reside in, 3 Mapped to, 4 Mapped to, 5 Mapped to, 6 Mapped to]
Compute Virtualization
It is a technique of masking or abstracting the physical compute hardware and enabling multiple operating systems (OSs) to run concurrently on a single or clustered physical machine(s).
Enables creation of multiple virtual machines (VMs), each running an OS and application
VMs are provided with standardized hardware resources
x86 Architecture
Before Virtualization
Runs single operating system (OS) per machine at a time
Couples s/w and h/w tightly
May create conflicts when multiple applications run on the same machine
Underutilizes resources
Is inflexible and expensive
After Virtualization
Runs multiple operating systems (OSs) per physical machine concurrently
Makes OS and applications h/w independent
Isolates VMs from each other, hence, no conflict
Improves resource utilization
Offers flexible infrastructure at low cost
[Figure: x86 hardware (CPU, NIC card, memory, hard disk) before and after virtualization]
Desktop Virtualization
It is a technology which enables detachment of the user state, the Operating System (OS), and the applications from endpoint devices.
Desktops run as virtual machines within the data center and are accessed over a network
[Figure: Desktop VMs accessed over LAN/WAN]
Benefits:
Enablement of thin clients
Improved data security
Simplified data backup and PC maintenance
Connectivity
Interconnection between hosts or between a host and any storage devices
Physical components of connectivity (figure labels): Host, Adapter, Cable, Port, Disk
Protocol = a defined format for communication between sending and receiving devices
Popular storage interface protocols: IDE/ATA and SCSI
Connectivity Protocol
Protocol = a defined format for communication between sending and receiving devices
Tightly connected entities such as central processor to RAM, or storage buffers to controllers (example: PCI)
Directly attached entities connected at moderate distances such as host to storage (example: IDE/ATA)
Network-connected entities such as networked hosts, NAS, or SAN (example: SCSI or FC)
IDE/ATA
Most popular interface used with modern hard disks
Good performance at low cost
Inexpensive storage interconnect
Used for internal connectivity
Serial ATA (SATA)
Serial version of the IDE/ATA specification that has replaced the parallel ATA
Inexpensive storage interconnect, typically used for internal connectivity
Provides data transfer rate up to 6 Gb/s (standard 3.0)
Hot-pluggable
SCSI
Higher cost than IDE/ATA, therefore not popular in PC environments
Available in a wide variety of related technologies and standards
Supports up to 16 devices on a single bus
Ultra-640 version provides data transfer speed up to 640 MB/s
Serial Attached SCSI (SAS)
Point-to-point serial protocol replacing parallel SCSI
Supports data transfer rate up to 6 Gb/s (SAS 2.0)
Fibre Channel (FC)
Widely used protocol for high-speed communication to the storage device
Provides a serial data transmission that operates over copper wire and/or optical fiber
Latest version of the FC interface, 16FC, allows transmission of data up to 16 Gb/s
Internet Protocol (IP)
Traditionally used to transfer host-to-host traffic
Provides an opportunity to leverage an existing IP-based network for storage communication
Magnetic Tape
Preferred option for backup destination in the past
Limitations: sequential data access; single application access at a time; physical wear and tear; storage/retrieval overhead
Optical Discs
Popularly used as a distribution medium in small, single-user computing environments
Limited in capacity and speed
Write once and read many (WORM): CD-ROM, DVD-ROM
Other variations: CD-RW, Blu-ray discs
Disk Drive
Most popular storage medium
Large storage capacity
Random read/write access
Flash Drives
Use semiconductor media
Provide high performance and low power consumption
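[Figure: disk drive components; labels: Controller, HDA, Interface, Power Connector]
[Figure: physical disk structure; labels: Cylinder, Track, Platter]
[Figure: logical block addressing; labels: Surface, Cylinder 2, Block 16, Block 32]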
Electromechanical device
Impacts the overall performance of the storage system
Time taken by a disk to complete an I/O request is the sum of seek time, rotational latency, and data transfer time:
Disk service time = seek time + rotational latency + data transfer time
Seek time: time taken to position the read/write head
The lower the seek time, the faster the I/O operation
Seek time specifications include:
[Figure: radial movement of the read/write head]
Rotational latency: the time taken by the platter to rotate and position the data under the R/W head
Depends on the rotation speed of the spindle
Average rotational latency:
One-half of the time taken for a full rotation
Approx. 5.5 ms for 5400 rpm drive
Approx. 2.0 ms for 15000 rpm drive
Rotational delay (in sec) = 0.5 / (RPM/60)
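The rotational delay formula and the disk service time sum can be checked with a short sketch. The function names and the 4 KB / 40 MB/s example values are illustrative assumptions, with 1 MB taken as 1,000,000 bytes.

```python
def avg_rotational_latency_s(rpm):
    """Average rotational latency in seconds: half of one full rotation."""
    rotations_per_second = rpm / 60
    return 0.5 / rotations_per_second


def disk_service_time_s(seek_s, rpm, io_size_bytes, transfer_rate_bytes_per_s):
    """Disk service time = seek time + rotational latency + data transfer time."""
    transfer_s = io_size_bytes / transfer_rate_bytes_per_s
    return seek_s + avg_rotational_latency_s(rpm) + transfer_s


for rpm in (5400, 15000):
    print(f"{rpm} rpm: {avg_rotational_latency_s(rpm) * 1000:.2f} ms")

# Illustrative values: 5 ms seek, 15000 rpm, 4 KB I/O, 40 MB/s transfer rate.
service = disk_service_time_s(0.005, 15000, 4_000, 40_000_000)
print(f"service time: {service * 1000:.1f} ms")
```

The 5400 rpm case evaluates to about 5.56 ms, which the slide quotes as approximately 5.5 ms.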
Data transfer rate: average amount of data per unit time that the drive can deliver to the HBA
Internal transfer rate: speed at which data moves from a platter's surface to the internal buffer of the disk
External transfer rate: rate at which data moves through the interface to the HBA
[Figure: Disk Drive (Head Disk Assembly, Buffer, Interface) connected to the HBA; internal transfer rate is measured between the Head Disk Assembly and the Buffer]
Little's Law
Describes the relationship between the number of requests in a queue and the response time.
N = a × R
N is the total number of requests in the system
a is the arrival rate
R is the average response time
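As a quick illustration of Little's Law, the sketch below evaluates N = a × R in both directions. The function names and the numbers (200 requests per second, 10 ms response time) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def requests_in_system(arrival_rate, avg_response_time):
    """Little's Law: N = a x R (arrival rate times average response time)."""
    return arrival_rate * avg_response_time


def response_time_from_queue(requests, arrival_rate):
    """Little's Law rearranged: R = N / a."""
    return requests / arrival_rate


# Hypothetical workload: 200 requests arrive per second and each spends
# 10 ms in the system on average.
n = requests_in_system(200, 0.010)
print(f"requests in the system: {n:.1f}")
print(f"response time recovered: {response_time_from_queue(n, 200) * 1000:.1f} ms")
```

Units must agree: if the arrival rate is per second, the response time has to be in seconds as well.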
Utilization law
[Figure: request arrival at the I/O queue; plot against Utilization with marks at 0%, 70%, and 100%]
Consider a disk I/O system in which I/O requests arrive at a rate of 100 I/Os per second. The service time, Rs, is 4 ms.
Utilization of I/O controller (U = a × Rs)
Total response time (R = Rs / (1 - U))
Calculate the response time at different % of utilization
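The example can be worked through with a small sketch that applies U = a × Rs and R = Rs / (1 - U) to the stated values, then repeats the response-time calculation at a few utilization levels. The function names and the list of utilization levels are assumptions made for illustration.

```python
def utilization(arrival_rate, service_time_s):
    """U = a x Rs."""
    return arrival_rate * service_time_s


def response_time_s(service_time_s, u):
    """R = Rs / (1 - U); only meaningful for 0 <= U < 1."""
    if not 0 <= u < 1:
        raise ValueError("utilization must be in the range [0, 1)")
    return service_time_s / (1 - u)


arrival_rate = 100      # I/Os per second, from the example
service_time = 0.004    # Rs = 4 ms, from the example

u = utilization(arrival_rate, service_time)
r = response_time_s(service_time, u)
print(f"U = {u:.0%}, R = {r * 1000:.2f} ms")

# Response time at several assumed utilization levels, same Rs.
for pct in (10, 40, 70, 90, 96):
    r_pct = response_time_s(service_time, pct / 100)
    print(f"{pct:3d}% utilization -> {r_pct * 1000:6.2f} ms")
```

With these inputs the controller is 40% utilized and the response time is about 6.67 ms. The sweep shows response time rising slowly at low utilization and steeply as utilization approaches 100%.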
Scenario
Require 1 TB of storage capacity
Peak I/O workload: 4900 IOPS
Typical I/O size is 4 KB
15K rpm drive with storage capacity = 73 GB
Average seek time = 5 ms
Data transfer rate = 40 MB/s
Task
Calculate the number of disks required for the application
Solution
Time required to perform one I/O is the sum of seek time, rotational delay, and transfer time
Therefore, 5 ms + 0.5/(15000/60) s + 4 KB/(40 MB/s) = 5 ms + 2 ms + 0.1 ms = 7.1 ms
Calculate max. number of IOPS a disk can perform: 1 / 7.1 ms = 140 IOPS
For acceptable response time, disk controller utilization must be less than 70%
Therefore, 140 × 0.7 = 98 IOPS
Performance requirement: we need 4900 / 98 = 50 disks
Capacity requirement: we need 1 TB / 73 GB = 14 disks
To meet the application's requirements, both counts must be satisfied, so the larger one applies: 50 disks
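The solution steps can be reproduced with a short sketch. It assumes decimal units (1 TB = 1000 GB, 1 MB = 1000 KB) and rounds the per-disk IOPS down to a whole number, which is what yields the slide's 140. The function name `disks_required` and its parameters are hypothetical.

```python
def disks_required(capacity_gb, peak_iops, disk_capacity_gb,
                   seek_ms, rpm, io_kb, transfer_mb_s, max_utilization_pct=70):
    """Return (disks needed, disks for performance, disks for capacity)."""
    rotational_ms = 0.5 / (rpm / 60) * 1000      # half a rotation, in ms
    transfer_ms = io_kb / transfer_mb_s          # KB / (MB/s) = ms when 1 MB = 1000 KB
    service_ms = seek_ms + rotational_ms + transfer_ms

    max_iops = int(1000 / service_ms)            # whole I/Os per second per disk
    usable_iops = max_iops * max_utilization_pct // 100

    # Ceiling division: a fraction of a disk still means one more disk.
    for_performance = -(-peak_iops // usable_iops)
    for_capacity = -(-capacity_gb // disk_capacity_gb)
    return max(for_performance, for_capacity), for_performance, for_capacity


total, perf, cap = disks_required(
    capacity_gb=1000, peak_iops=4900, disk_capacity_gb=73,
    seek_ms=5, rpm=15000, io_kb=4, transfer_mb_s=40,
)
print(f"performance: {perf} disks, capacity: {cap} disks, required: {total} disks")
```

Taking the maximum of the two counts reflects that the configuration must satisfy the IOPS requirement and the capacity requirement at the same time; here the workload is performance-bound.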
Mechanical disk drives
More power consumption due to mechanical operations
Low Mean Time Between Failure (MTBF)
Flash drives
No spinning magnetic media
No mechanical movement, so no seek time or rotational latency
Solid state enables consistent I/O performance
Lower power requirement per GB of storage
Lower power requirement per IOPS
High performance and low latency
Non-volatile memory
Uses single-level cell (SLC) or multi-level cell (MLC) to store data
Faster performance
Up to 30 times greater IOPS (benchmarked)
Typical applications: 8-12X
Less than 1 millisecond service time
[Figure: response time versus I/O per second; label: 1 Flash drive]
Better reliability
No moving parts
Faster RAID rebuilds
[Figure: host access to storage; labels: Compute, Application, File System, File-level Request, Block-level Request, Storage, Storage System, Direct-Attached Storage, Block-level Access, File-level Access]