Presentation on theme: "Everything You Wanted to Know About Storage, but Were Afraid to Ask."— Presentation transcript:
Everything You Wanted to Know About Storage, but Were Afraid to Ask
Do you have a cell phone, PDA, or smartphone?
Do you have a DIGITAL CAMERA?
Do you have a PC?
What do all of these devices have in common?
How do you protect your data?
Digital Footprint Calculator
Are you familiar with RAID?
RAID 0: Data is striped across the HDDs in a RAID set. The stripe size is specified at the host level for software RAID and is vendor specific for hardware RAID. As the number of drives in the array increases, performance improves because more data can be read or written simultaneously. Used in applications that need high I/O throughput. Does not provide data protection or availability in the event of drive failures.
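Striping can be pictured as a mapping from a logical block number to a disk and an offset on that disk. The sketch below is a minimal illustration, not any vendor's layout; the function name `locate_block` and the parameters `num_disks` and `blocks_per_strip` are assumptions introduced here.

```python
def locate_block(logical_block, num_disks, blocks_per_strip):
    """Map a logical block to (disk index, block offset on that disk) in a RAID 0 set.

    Hypothetical helper: a strip is the run of consecutive blocks placed on one
    disk before the layout moves on to the next disk.
    """
    strip_number = logical_block // blocks_per_strip   # which strip overall
    offset_in_strip = logical_block % blocks_per_strip
    disk = strip_number % num_disks                     # strips rotate across disks
    stripe_row = strip_number // num_disks              # full stripes before this one
    return disk, stripe_row * blocks_per_strip + offset_in_strip


# Eight consecutive logical blocks on a 4-disk set with 2-block strips
layout = [locate_block(b, num_disks=4, blocks_per_strip=2) for b in range(8)]
```

Consecutive strips land on different disks, which is why a large read or write can keep several drives busy at once.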
RAID 1: Mirroring is a technique whereby data is stored on two different HDDs, yielding two copies of the data. In addition to providing complete data redundancy, mirroring enables faster recovery from disk failure. Because mirroring duplicates data, the storage capacity needed is twice the amount of data being stored; mirroring is therefore considered expensive. It is preferred for mission-critical applications that cannot afford data loss.
Nested RAID: Mirroring can be implemented with striped RAID by mirroring entire stripes of disks to stripes on other disks. RAID 0+1 and RAID 1+0 combine the performance benefits of RAID 0 with the redundancy benefits of RAID 1. These types of RAID require an even number of disks, the minimum being four. RAID 0+1 is also called mirrored stripe: data is first striped across HDDs, and then the entire stripe is mirrored.
Nested RAID: RAID 1+0 is also called striped mirror. In RAID 1+0, data is first mirrored, and then both copies of the data are striped across multiple HDDs in a RAID set. Applications that benefit from RAID 1+0 include high-transaction-rate Online Transaction Processing (OLTP) and database applications that require a high I/O rate, random access, and high availability.
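The order of mirroring and striping changes which combinations of disk failures a set can ride out. The sketch below counts surviving two-disk failures on a four-disk set. It assumes disks 0 and 1 form one group and disks 2 and 3 the other, and that a RAID 0 stripe is unusable as soon as any member fails; the function names are made up for this illustration.

```python
from itertools import combinations

DISKS = [0, 1, 2, 3]  # assumed four-disk set, the minimum for nested RAID


def raid10_survives(failed):
    """RAID 1+0: disks (0,1) and (2,3) are mirrored pairs, then striped.

    Data survives as long as each mirrored pair keeps at least one disk.
    """
    pairs = [{0, 1}, {2, 3}]
    return all(not pair <= failed for pair in pairs)


def raid01_survives(failed):
    """RAID 0+1: disks (0,1) form one stripe, mirrored onto stripe (2,3).

    Data survives as long as at least one whole stripe is untouched.
    """
    stripes = [{0, 1}, {2, 3}]
    return any(not (stripe & failed) for stripe in stripes)


two_disk_failures = [set(c) for c in combinations(DISKS, 2)]
raid10_ok = sum(raid10_survives(f) for f in two_disk_failures)
raid01_ok = sum(raid01_survives(f) for f in two_disk_failures)
```

Under these assumptions, RAID 1+0 survives four of the six possible two-disk failures and RAID 0+1 survives two.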
RAID 3: RAID 3 stripes data for high performance and uses parity for improved fault tolerance. Parity information is stored on a dedicated drive so that data can be reconstructed if a drive fails. RAID 3 is used in applications that involve large sequential data access, such as video streaming.
RAID 4: Stripes data at the block level across all disks except the parity disk. Parity information is stored on a dedicated disk. Unlike RAID 3, data disks can be accessed independently, so specific data elements can be read or written on a single disk without reading or writing an entire stripe.
RAID 5: RAID 5 is a very versatile RAID implementation. The difference between RAID 4 and RAID 5 is the parity location: in RAID 4, parity is written to a dedicated drive, while in RAID 5, parity is distributed across all disks. Distributing parity overcomes the write bottleneck of a dedicated parity disk. RAID 5 is preferred for messaging, medium-performance media serving, and relational database management system (RDBMS) implementations in which database administrators (DBAs) optimize data access.
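Parity-based reconstruction can be sketched in a few lines, assuming parity is the byte-wise XOR of the data strips in a stripe (the usual choice, though the slides do not name it). The rotation in `parity_disk` is only one possible placement scheme and is an assumption of this sketch, as are the function names.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equally sized blocks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def parity_disk(stripe_number, num_disks):
    """Disk holding parity for a stripe; one possible rotation (assumed here)."""
    return (num_disks - 1 - stripe_number) % num_disks


data = [b"\x01\x02", b"\x10\x20", b"\xff\x00"]   # three data strips of one stripe
parity = xor_blocks(data)

# Lose the middle strip, then rebuild it from the survivors plus parity
rebuilt = xor_blocks([data[0], data[2], parity])
```

With a dedicated parity disk (RAID 3 and RAID 4), `parity_disk` would return the same disk for every stripe; rotating it spreads parity writes across all disks.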
RAID 6: RAID 6 works the same way as RAID 5 except that it includes a second parity element. Maintaining two parities enables the RAID group to survive the failure of two disks.
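The capacity cost of each RAID level described so far can be summarized with a small helper. This is a rough sketch that assumes equal-size disks and ignores hot spares and metadata; `usable_disks` is a hypothetical name.

```python
def usable_disks(level, num_disks):
    """Number of disks' worth of capacity left for data (equal-size disks assumed)."""
    if level == "0":
        return num_disks            # no redundancy
    if level in ("1", "1+0", "0+1"):
        return num_disks // 2       # every block is stored twice
    if level in ("3", "4", "5"):
        return num_disks - 1        # one disk's worth of parity
    if level == "6":
        return num_disks - 2        # two parity elements
    raise ValueError(f"unsupported RAID level: {level}")


summary = {level: usable_disks(level, 8) for level in ("0", "1+0", "5", "6")}
```

For an eight-disk set this gives 8, 4, 7, and 6 disks of usable capacity respectively.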
Hot Spare: A hot spare is a spare HDD in a RAID array that temporarily replaces a failed HDD of a RAID set. When the failed HDD is replaced with a new HDD, one of two things happens: either the hot spare takes over permanently (it is no longer a hot spare, and a new hot spare must be configured on the array), or data from the hot spare is copied to the new HDD and the hot spare returns to its idle state, ready to replace the next failed drive. A hot spare should be large enough to accommodate data from a failed drive. Some systems implement multiple hot spares to improve data availability. A hot spare can be configured as automatic or user initiated, which specifies how it will be used in the event of disk failure.
What is an Intelligent Storage System? Intelligent storage systems are RAID arrays that are highly optimized for I/O processing, have large amounts of cache for improving I/O performance, and have operating environments that provide: – intelligence for managing cache – array resource allocation – connectivity for heterogeneous hosts – advanced array-based local and remote replication options
Components of an Intelligent Storage System An intelligent storage system consists of four key components: front end, cache, back end, and physical disks.
Components of an Intelligent Storage System: The front end provides the interface between the storage system and the host. It consists of two components: front-end ports and front-end controllers. The front-end ports enable hosts to connect to the intelligent storage system. Each front-end port has processing logic that executes the appropriate transport protocol, such as SCSI, Fibre Channel, or iSCSI, for storage connections. Front-end controllers route data to and from cache via the internal data bus. When cache receives write data, the controller sends an acknowledgment back to the host.
Components of an Intelligent Storage System: Controllers optimize I/O processing by using command queuing algorithms. Command queuing is a technique implemented on front-end controllers. It determines the execution order of received commands and can reduce unnecessary drive head movements and improve disk performance.
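The effect of reordering queued commands can be illustrated by comparing head travel for arrival order against a simple sweep. The sweep policy, the block addresses, and the function names below are assumptions for illustration; real controllers use their own algorithms.

```python
def head_travel(start, requests):
    """Total distance the head moves when servicing requests in the given order."""
    travel, position = 0, start
    for target in requests:
        travel += abs(target - position)
        position = target
    return travel


def reorder_ascending(start, requests):
    """One simple policy (an assumption here): sweep upward from the head
    position, then come back for the requests that lie below it."""
    above = sorted(r for r in requests if r >= start)
    below = sorted((r for r in requests if r < start), reverse=True)
    return above + below


queue = [98, 183, 37, 122, 14, 124, 65, 67]   # made-up block addresses
fifo_cost = head_travel(53, queue)
queued_cost = head_travel(53, reorder_ascending(53, queue))
```

With the head starting at position 53, servicing the queue in arrival order moves the head 640 positions, while the reordered queue needs 299.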
Intelligent Storage System: Cache. Cache is an important component that enhances the I/O performance in an intelligent storage system. Cache improves storage system performance by isolating hosts from the mechanical delays associated with physical disks, which are the slowest components of an intelligent storage system. Accessing data from a physical disk usually takes a few milliseconds. Accessing data from cache takes less than a millisecond. Write data is placed in cache and then written to disk.
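The benefit of cache can be estimated as a weighted average of cache and disk access times. The latencies below (0.1 ms for cache, 5 ms for disk) are illustrative values chosen to match "less than a millisecond" and "a few milliseconds", not measurements; `average_read_time_ms` is a hypothetical helper.

```python
def average_read_time_ms(hit_ratio, cache_ms, disk_ms):
    """Expected read service time when a fraction of reads is served from cache."""
    if not 0.0 <= hit_ratio <= 1.0:
        raise ValueError("hit_ratio must be between 0 and 1")
    return hit_ratio * cache_ms + (1.0 - hit_ratio) * disk_ms


# Illustrative figures only: 0.1 ms for cache, 5 ms for disk
no_cache = average_read_time_ms(0.0, cache_ms=0.1, disk_ms=5.0)
mostly_hits = average_read_time_ms(0.9, cache_ms=0.1, disk_ms=5.0)
```

With these assumed numbers, a 90% hit ratio brings the average read time from 5 ms down to roughly 0.59 ms.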
Cache Data Protection: Cache mirroring: Each write to cache is held in two different memory locations on two independent memory cards. Cache vaulting: Cache is exposed to the risk of uncommitted data loss due to power failure. This risk can be addressed by using battery power to write the cache content to disk; storage vendors use a set of physical disks to dump the contents of cache during a power failure.
Intelligent Storage System: Back End. The back end consists of two components: back-end ports and back-end controllers. Physical disks are connected to ports on the back end. The back-end controller communicates with the disks when performing reads and writes and also provides additional, but limited, temporary data storage. The algorithms implemented on back-end controllers provide error detection and correction, along with RAID functionality. Multiple controllers also facilitate load balancing.
Intelligent Storage System: Physical Disks. Disks are connected to the back end with either a SCSI or a Fibre Channel interface.
What are LUNs? Physical drives or groups of RAID-protected drives can be logically split into volumes known as logical volumes, commonly referred to as Logical Unit Numbers (LUNs).
High-end Storage Systems: High-end storage systems, referred to as active-active arrays, are generally aimed at large enterprises for centralizing corporate data. These arrays are designed with a large number of controllers and cache memory. An active-active array implies that the host can perform I/Os to its LUNs across any of the available paths.
Midrange Storage Systems: Also referred to as active-passive arrays. A host can perform I/Os to a LUN only through active paths; other paths remain passive until the active path fails. Midrange arrays have two controllers, each with cache, RAID controllers, and disk drive interfaces. Designed for small and medium enterprises. Less scalable than high-end arrays.
CLARiiON Whiteboard Video
Direct-Attached Storage (DAS): Storage connects directly to servers. Applications access data from DAS using block-level access protocols. Examples: internal HDD of a host, tape libraries, and directly connected external HDD.
Direct-Attached Storage (DAS): DAS is classified as internal or external, based on the location of the storage device with respect to the host. Internal DAS: the storage device is internally connected to the host by a serial or parallel bus. The bus has distance limitations for high-speed connectivity and can support only a limited number of devices, and the devices occupy a large amount of space inside the host.
Direct-Attached Storage (DAS): External DAS: the server connects directly to the external storage device, usually communicating via the SCSI or FC protocol. External DAS overcomes the distance and device count limitations of internal DAS and provides centralized management of storage devices.
DAS Benefits: Ideal for local data provisioning; quick deployment for small environments; simple to deploy; reliability; low capital expense; low complexity
DAS Connectivity Options: Host-to-storage-device communication takes place via protocols: ATA/IDE and SATA – primarily for internal bus; SCSI – parallel (primarily for internal bus) and serial (external bus); FC – high-speed network technology
DAS Connectivity Options: Protocols are implemented on the HDD controller. A storage device is also known by the name of the protocol it supports.
DAS Management: LUN creation, file system layout, and data addressing. Internal – the host (or third-party software) provides disk partitioning (volume management) and file system layout.
DAS Management: External – array-based management; lower TCO for managing data and storage infrastructure.
DAS Challenges: Limited scalability (number of connectivity ports to hosts, number of addressable disks, distance limitations). For internal DAS, maintenance requires downtime. Limited ability to share resources (unused resources cannot be easily reallocated) – array front-end ports, storage space – resulting in islands of over- and under-utilized storage pools.
Introduction to SCSI SCSI–3 is the latest version of SCSI
SCSI Architecture Primary commands common to all devices
SCSI Architecture Standard rules for device communication and information sharing
SCSI Architecture Interface details such as electrical signaling methods and data transfer modes
SCSI Device Model SCSI initiator device – Issues commands to SCSI target devices – Example: SCSI host adaptor
SCSI Device Model SCSI target device – Executes commands issued by initiators – Examples: SCSI peripheral devices
SCSI Device Model CDB (command descriptor block) structure – a sequence of 8-bit bytes – defines the command to be executed – contains a 1-byte operation code, command-specific parameters, and a control byte
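As a concrete case, the READ(10) command uses a 10-byte CDB: an operation code, a flags byte, a 4-byte logical block address, a group number byte, a 2-byte transfer length, and a control byte. The sketch below packs one with Python's `struct` module; the function name `build_read10_cdb` is made up here, and the flag and group fields are simply left at zero.

```python
import struct

READ_10 = 0x28  # operation code of the SCSI READ(10) command


def build_read10_cdb(lba, num_blocks, control=0):
    """Pack a 10-byte READ(10) CDB: opcode, flags, LBA, group, length, control."""
    return struct.pack(
        ">BBIBHB",
        READ_10,     # byte 0: operation code
        0,           # byte 1: flag bits, left clear in this sketch
        lba,         # bytes 2-5: logical block address, big-endian
        0,           # byte 6: group number, unused here
        num_blocks,  # bytes 7-8: transfer length in blocks
        control,     # byte 9: control byte
    )


cdb = build_read10_cdb(lba=2048, num_blocks=8)
```

An initiator would hand a block like this to its transport, and the target would decode the operation code first to decide how to interpret the remaining bytes.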
SCSI Addressing Initiator ID – a number from 0 to 15, with the most common value being 7
SCSI Addressing Target ID – a number from 0 to 15
SCSI Addressing LUN – a number that specifies a device addressable through a target
SCSI Addressing Example: controller, device, target
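The example slide's labels suggest a device name built from controller, target, and device numbers. The sketch below assumes the `c<controller>t<target>d<device>` naming style used by some UNIX hosts; that format, the regular expression, and the function name are assumptions for illustration, not something the slide spells out.

```python
import re

# Assumed naming style for illustration: c<controller>t<target>d<device>
DEVICE_NAME = re.compile(r"^c(\d+)t(\d+)d(\d+)$")


def parse_device_name(name):
    """Split a name such as 'c0t3d1' into controller, target, and device numbers."""
    match = DEVICE_NAME.match(name)
    if match is None:
        raise ValueError(f"not a controller/target/device name: {name!r}")
    controller, target, device = (int(part) for part in match.groups())
    if not 0 <= target <= 15:
        raise ValueError("target ID must be between 0 and 15")
    return {"controller": controller, "target": target, "device": device}


address = parse_device_name("c0t3d1")
```

Here `c0t3d1` would refer to device 1 behind target 3 on controller 0.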
Areas Where DAS Fails: Just-in-time information to business users; integration of information infrastructure with business processes; flexible and resilient storage architecture
The Solution? Storage Networking: FC SAN, NAS, IP SAN
What is a SAN? A dedicated high-speed network of servers and shared storage devices. Provides block-level data access.
What is a SAN? Resource consolidation – centralized storage and management. Scalability – theoretical limit: approx. 15 million devices. Secure access.
Fibre Channel Latest FC implementations support 8 Gb/s
Fibre Channel a high-speed network technology that runs on high-speed optical fiber cables (for front-end SAN connectivity)
Fibre Channel and serial copper cables (for back-end disk connectivity)
FC SAN Evolution
Components of SAN: A SAN has three basic components: servers, network infrastructure, and storage. These can be further broken down into the following key elements: node ports, cabling, interconnecting devices (such as FC switches or hubs), storage arrays, and SAN management software.
Components of SAN: Node ports. Examples of nodes – hosts, storage, and tape libraries. Ports are available on: – the HBA in a host – front-end adapters in storage. Each port has a transmit (Tx) link and a receive (Rx) link. HBAs perform low-level interface functions automatically to minimize the impact on host performance.
Components of SAN: Cabling. Copper cables for short distances. Optical fiber cables for long distances: – Single-mode: carries a single beam of light; distance up to 10 km – Multi-mode: can carry multiple beams of light simultaneously; distance up to 500 meters
Components of SAN: Cabling
Components of SAN: Cabling (connectors). Node connectors: SC duplex connectors, LC duplex connectors. Patch panel connectors: ST simplex connectors.
Components of SAN: Interconnecting devices – hubs, switches, and directors
Components of SAN: Storage array. Provides storage consolidation and centralization, with: – high availability/redundancy – performance – business continuity – multiple host connect
Components of SAN: SAN management software. A suite of tools used in a SAN to manage the interface between hosts and storage arrays. Provides integrated management of the SAN environment. Web-based GUI or CLI.
SAN Interconnectivity Options: FC-AL. Fibre Channel Arbitrated Loop (FC-AL): – devices must arbitrate to gain control – devices are connected via hubs – supports up to 127 devices
SAN Interconnectivity Options: FC-SW. Fabric connect (FC-SW): – dedicated bandwidth between devices – supports up to 15 million devices – higher availability than hubs
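The figure of roughly 15 million devices follows from the 24-bit address used in a switched fabric, which is split into domain, area, and port bytes. The sketch below assumes that 239 of the 256 domain values are available to switches (the rest being reserved); the names are hypothetical.

```python
def split_fc_address(fc_id):
    """Split a 24-bit fabric address into its domain, area, and port bytes."""
    if not 0 <= fc_id < 1 << 24:
        raise ValueError("a fabric address is 24 bits wide")
    return (fc_id >> 16) & 0xFF, (fc_id >> 8) & 0xFF, fc_id & 0xFF


USABLE_DOMAINS = 239  # domain IDs left for switches after reserved values
max_fabric_ports = USABLE_DOMAINS * 256 * 256
domain, area, port = split_fc_address(0x0A1B2C)
```

Under that assumption the fabric can address 15,663,104 ports, which is where the "15 million" on the slide comes from.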
Think "File Sharing"
What is NAS?
IP-based file-sharing device attached to a LAN. Server consolidation. File-level data access and sharing.
Why NAS? Dedicated to file serving
Benefits of NAS: Supports comprehensive access to information; improves efficiency and flexibility; centralizes storage; simplifies management; scalability; high availability through native clustering; provides security integration to the environment (user authentication and authorization)
CPU and memory; NICs; NAS OS; file sharing protocols; storage protocols (ATA, SCSI, or FC); IP network
Benefits: Increases performance throughput (service level) to end users; minimizes investment in additional servers; provides storage pooling; provides heterogeneous file serving; uses existing infrastructure, tools, and processes
Benefits: Provides continuous availability to files; heterogeneous file sharing; reduces cost for additional OS-dependent servers; adds storage capacity non-disruptively; consolidates storage management; lowers Total Cost of Ownership
Celerra Whiteboard Video
Driver for IP SAN: In an FC SAN, transfer of block-level data takes place over Fibre Channel. Emerging technologies provide for the transfer of block-level data over an existing IP network infrastructure.
Why IP? Easier management; existing network infrastructure can be leveraged; reduced cost compared to new SAN hardware and software; supports multi-vendor interoperability; many long-distance disaster recovery solutions already leverage IP-based networks; many robust and mature security options are available for IP networks
Block Storage over IP - iSCSI: SCSI over IP; IP encapsulation; Ethernet NIC; iSCSI HBA; hardware-based gateway to Fibre Channel storage; used to connect servers
Block Storage over IP - FCIP: Fibre Channel-to-IP bridge / tunnel (point to point); Fibre Channel end points; used in DR implementations
iSCSI? IP-based protocol used to connect host and storage. Carries block-level data over an IP-based network. Encapsulates SCSI commands and transports them as TCP/IP packets.
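Encapsulation can be sketched by wrapping a SCSI CDB in the 48-byte header of an iSCSI SCSI Command PDU, which is then carried in a TCP stream. This is a simplified illustration: it targets LUN 0, sends no data segment, and skips the login phase a real initiator must complete first. The function name, the 512-byte block size, and the sample values are assumptions.

```python
import struct

ISCSI_PORT = 3260           # default TCP port for iSCSI targets
OP_SCSI_COMMAND = 0x01      # iSCSI opcode for a SCSI Command PDU
FLAG_FINAL, FLAG_READ = 0x80, 0x40


def build_scsi_command_pdu(cdb, task_tag, expected_bytes, cmd_sn):
    """Wrap a SCSI CDB in a 48-byte iSCSI header (simplified: LUN 0, no data)."""
    if len(cdb) > 16:
        raise ValueError("longer CDBs need an additional header segment")
    return struct.pack(
        ">BB2xB3s8sIIII16s",
        OP_SCSI_COMMAND,
        FLAG_FINAL | FLAG_READ,
        0,                       # no additional header segments
        (0).to_bytes(3, "big"),  # data segment length: nothing follows the header
        bytes(8),                # LUN field, all zeros for LUN 0
        task_tag,                # initiator task tag
        expected_bytes,          # expected data transfer length
        cmd_sn,                  # command sequence number
        0,                       # expected status sequence number
        cdb,                     # 's' format pads the CDB with zeros to 16 bytes
    )


read10 = bytes.fromhex("28000000080000000800")   # READ(10): 8 blocks at LBA 2048
pdu = build_scsi_command_pdu(read10, task_tag=1, expected_bytes=8 * 512, cmd_sn=1)
```

The resulting bytes would be written to a TCP connection to the target (port `ISCSI_PORT` by default), so the SCSI command rides inside ordinary TCP/IP packets.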
Components of iSCSI: iSCSI host initiators – host computer using a NIC or iSCSI HBA to connect to storage; iSCSI initiator software may need to be installed. iSCSI targets – storage array with embedded iSCSI-capable network port; FC-iSCSI bridge. LAN for IP storage network – interconnected Ethernet switches and/or routers.
No FC components. Each iSCSI port on the array is configured with an IP address and port number. iSCSI initiators connect directly to the array.
Bridge device translates iSCSI/IP to FCP – standalone device, or integrated into an FC switch (multi-protocol router). The iSCSI initiator/host is configured with the bridge as its target. The bridge generates a virtual FC initiator.
Array provides FC and iSCSI connectivity natively. No bridge devices needed.
FCIP (Fibre Channel over IP)? FCIP is an IP-based storage networking technology. Combines the advantages of Fibre Channel and IP. Creates virtual FC links that connect devices in different fabrics. FCIP is a distance extension solution – used for data sharing over geographically dispersed SANs.
FCoE Whiteboard Video
Question 1 What was EMC's revenue in 2009? A. 60 Billion B. … Billion C. 14 Billion D. 9 Billion
EMC Corporation 2009 At a Glance: Revenues $14 billion; Net Income $1.9 billion; Employees ~41,500; Countries where EMC does business >80; R&D Investment ~$1.5 billion; Operating Cash Flow $3.3 billion; Free Cash Flow $2.6 billion; Founded 1979
IDC Digital Universe Study IDC – May 2010
Question 2 How much digital information was created worldwide in 2009? A. 846 Terabytes B. 686 Petabytes C. 0.8 Zettabytes D. … Exabytes
The Digital Universe: 2009: 0.8 ZB. 2020: 35.2 Zettabytes. Growing by a Factor of 44. One Zettabyte (ZB) = 1 trillion gigabytes. Source: IDC Digital Universe Study, sponsored by EMC, May 2010
1.2 ZB in 2010 is Equal to... 75 Billion Fully Loaded 16 GB iPads
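The figures on these two slides can be checked with a few lines of arithmetic, using decimal units (1 GB = 10^9 bytes) and the slide's definition of a zettabyte as 1 trillion gigabytes. The variable names are introduced here for illustration.

```python
GB = 10**9            # bytes in a gigabyte (decimal)
ZB = 10**12 * GB      # one zettabyte = 1 trillion gigabytes = 10**21 bytes

ipads_in_zb = 75 * 10**9 * 16 * GB / ZB   # 75 billion iPads with 16 GB each
projected_zb = 0.8 * 44                   # 0.8 ZB grown by a factor of 44
```

Both results agree with the slides: 75 billion 16 GB devices hold 1.2 ZB, and 0.8 ZB multiplied by 44 gives 35.2 ZB.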
What is Driving the Digital Explosion? Web 2.0 applications; ubiquitous content-generating devices; 3G/4G; longer data retention periods; secure collaboration. Regulation landscape: SEC 17a-4, Sarbanes-Oxley, HIPAA, Freedom of Information Act. [Figure: data center and remote site, showing local copies, backup copy, copy for archiving, and remote copies.]
Question 3 What percentage of the 0.8 zettabytes of digital information is created by individuals? A. 30% B. 50% C. 70% D. 90%
Individuals create data …companies manage it! The Digital Information World. Create: … of the digital universe will be created by individuals. Manage: … of the digital universe will be the responsibility of companies to manage and secure. Source: IDC Digital Universe Study, sponsored by EMC, May 2010
Question 4 How much storage capacity was available on the first Symmetrix 4200 that EMC shipped in 1990? A. 24 Gigabytes B. 240 Gigabytes C. 24 Terabytes D. … Exabytes
EMC's Tiered Storage Platforms: Broadest Range of Function, Performance, and Connectivity. [Figure: product families by tier and connectivity (SAN, NAS, CAS; iSCSI, Fibre Channel, IP, FICON): Symmetrix DMX, CLARiiON CX3 UltraScale Series and AX150, Celerra NS and NSX, Rainfinity Global File Virtualization, Invista, EMC Centera, EMC Disk Library, Connectrix, ADIC Scalar family; drive options from 73 GB Fibre Channel to 500 GB SATA.] The Symmetrix 4200 Integrated Cached Disk Array was introduced with a capacity of 24 gigabytes. Symmetrix V-Max systems are available with up to 2 petabytes of usable storage in a single system.
Managing Information Storage: Trends, Challenges and Options, EMC
Question 6 What is the number 1 challenge identified by IT and storage managers? A. Storage consolidation B. Designing & deploying multi-site environments C. Managing storage growth D. Making informed strategic / big picture decisions
Digital Information Storage Challenges: Most important activities/constraints identified as challenges by IT/storage managers: 1. Managing storage growth 2. Designing, deploying, and managing backup and recovery 3. Designing, deploying, and managing storage in a virtualized server environment 4. Designing, deploying, and managing disaster recovery solutions 5. Storage consolidation 6. Making informed strategic / big-picture decisions 7. Integrating storage in application environments (such as Oracle, Exchange, etc.) 8. Designing and deploying multi-site environments 9. Lack of skilled storage professionals. Source: Managing Information Storage: Trends, Challenges and Options; input from over 1,450 storage professionals worldwide
Building an Effective Storage Mgmt Organization Based on EMC study Managing Information Storage: Trends, Challenges & Options ( ) Hire an additional 22%+ storage professionals...
Where Managers Plan to Find Storage Expertise Based on EMC study Managing Information Storage: Trends, Challenges & Options ( )
Top IT Certifications by Salary Source: Certification Magazine, December 2009
Storage Role Across IT Disciplines: Leverage the functionalities of storage technology products to... Systems Architects/Administrators – maximize performance, increase availability, and avoid costly server upgrades. Network Administrators – maximize performance of your network and help you plan in advance. Database Administrators – maximize performance, increase availability, and realize faster recoverability of your database. Application Architects – increase the performance and availability of your application. IT Project Managers – plan and execute your IT projects that involve or are impacted by storage technology components.
EMC Academic Alliance
Key Pillars of IT: Over the last 20 years, businesses' IT perspective on the data center has focused on 4 pillars of information technology: operating systems, databases, networking, and software application development. Based on today's IT infrastructure, information storage is the 5th pillar of IT!
Question 7 What is the name of the EMC-authored book that was released in May 2009? A. Storage Area Networks for Dummies B. Storage Networks Explained C. Administering Data Centers D. Information Storage and Management
Information Storage and Management (ISM) Modules Section 1. Storage System Section 2. Storage Networking Technologies & Virtualization Section 3. Business Continuity Section 4. Storage Security & Management
Information Storage and Mgmt (ISM) Section 1. Storage System. KEY CONCEPT COVERAGE: Data and Information; Structured and Unstructured Data; Storage Technology Architectures; Core Elements of a Data Center; Information Management; Information Lifecycle Management; Host, Connectivity, and Storage; Block-Level and File-Level Access; File System and Volume Manager; Storage Media and Devices; Disk Components; Zoned Bit Recording; Logical Block Addressing; Little's Law and the Utilization Law; Hardware and Software RAID; Striping, Mirroring, and Parity; RAID Write Penalty; Hot Spares; Intelligent Storage System; Front-End Command Queuing; Cache Mirroring and Vaulting; Logical Unit Number (LUN); LUN Masking; High-end Storage System; Midrange Storage System. Open. Student Profiles: Experienced, Aspiring
Information Storage and Mgmt (ISM) Section 2. Storage Networking Technologies and Virtualization. KEY CONCEPT COVERAGE: Internal and External DAS; SCSI Architecture; SCSI Addressing; Storage Consolidation; Fibre Channel (FC) Architecture; Fibre Channel Protocol Stack; Fibre Channel Ports; Fibre Channel Addressing; World Wide Names (WWN); Zoning; Fibre Channel Topologies; NAS Device; Remote File Sharing; NAS Connectivity and Protocols; NAS Performance and Availability; MTU and Jumbo Frames; iSCSI Protocol; Native and Bridged iSCSI; FCIP Protocol; Fixed Content and Archives; Single-Instance Storage; Object Storage and Retrieval; Content Authenticity; Memory Virtualization; Storage Virtualization; Network Virtualization; In-Band and Out-of-Band Implementations; Server Virtualization; Block-Level and File-Level Virtualization. Key initiatives for all companies: Consolidation (Physical / Smaller Footprint) and Virtualization (Logical / Greater Flexibility)
Information Storage and Mgmt (ISM) Section 3. Business Continuity. KEY CONCEPT COVERAGE: Business Continuity; Information Availability; Disaster Recovery; BC Planning; Business Impact Analysis; Operational Backup; Archival; Retention Period; Bare-Metal Recovery; Backup Architecture; Backup Topologies; Virtual Tape Library; Data Consistency; Host-Based Local Replication; Array-Based Local Replication; Copy on First Access (CoFA); Copy on First Write (CoFW); Restore and Restart; Synchronous and Asynchronous Replication; LVM-Based Replication; Host-Based Log Shipping; Disk-Buffered Replication; Three-Site Replication. Always available / Never lost: maximize data availability, minimize chances of data loss. [Figure: customer / business data at the data center and remote site, showing local copies, backup copy, copy for archiving, and remote copies.]
Information Storage and Mgmt (ISM) Section 4. Storage Security and Management. KEY CONCEPT COVERAGE: Storage Security Framework; The Risk Triad; Security Domain; Infrastructure Right Management; Access Control; Alerts; Management Platform Standards; Internal Chargeback. Is my data secure? Data storage security considerations: Consolidated, Virtualized, and in the Cloud
EMC Academic Alliance: Developing tomorrow's Information Storage Professionals…today! Partnering with leading institutes of higher education worldwide to bridge the storage knowledge gap in industry. Providing EMC, customers, and partners with a source to hire storage-educated graduates. Hundreds of institutions globally, educating thousands of students. Offering a unique open course on Information Storage and Management. Focus on concepts and principles. Opportunity for EMC to give back as the industry leader. For the latest list of participating institutions and to introduce us to your alma mater, visit the EMC Academic Alliance website.
Becoming an Academic Partner: Required Steps... 1. Institution enrolls via the EAA online application. 2. Institution identifies faculty to teach the course and administer the program. 3. Institution identifies faculty to attend the 5-day ISM Faculty Readiness Seminar (FRS) and clear the ISM certification exam. 4. Institution accesses the secure Faculty website to download teaching aids such as chapter PowerPoints, quizzes, simulators, etc. 5. Institution promotes the ISM course to students. 6. Institution schedules and begins teaching the ISM course.
Summary: Information storage is one of the fastest growing sectors within IT. Information growth and complexity create challenges and career opportunities. Business and industry are looking for IT professionals who know all 5 pillars. Those who obtain the skills through formal education and industry qualification have an advantage.