Chapter 2: Introduction To Hardware


1 Chapter 2: Introduction To Hardware
Special Edition Redbook copy

2 Chapter 2 objectives
After completing this chapter, you will be able to:
- Discuss S/360 and System z hardware design.
- Explain processing units and disk hardware.
- Explain how mainframes differ from PC systems in data encoding.
- List some typical hardware configurations.
- Describe platform performance management features.
- Explain how Parallel Sysplex can achieve continuous availability.
- Explain dynamic workload balancing.
- Explain the single system image.

3 Key terms in this chapter

4 Introduction
Here we look at the hardware in a complete system, although the emphasis is on the processor "box". Terminology is not straightforward: ever since "boxes" became multi-engined, the terms system, processor, and CPU have become muddled. (Overlapping terms: CEC, box, CPU, processor.)

5 Terminology Overlap
In the early S/360 days a system had a single processor, which was also known as the central processing unit (CPU). The terms system, processor, and CPU were used interchangeably. However, these terms became confusing when systems became available with more than one processor. Today the mainframe has a rich heritage of terms, as illustrated here. zEnterprise uses the term "node" to refer to the entire box, also called the CEC, CPC, CPU, and so on.

6 Early system design
- System/360 was designed in the early 1960s.
- The central processor box contains the processors, memory, control circuits, and channel interfaces.
- Early systems had up to 16 channels, whereas modern systems have 1024 (256 * 4 Logical Channel Subsystems).
- Channels connect to control units.
- Control units connect to devices such as disk drives, tape drives, and communication interfaces.

7 Characteristics of early system design
Parallel channels are not available on the newest mainframes and are slowly being displaced on older systems. They are described here for completeness. Parallel channels have been replaced with Enterprise Systems CONnection (ESCON®) and FIbre CONnection (FICON®) channels. These channels are optical fibers that connect to only one control unit or, more likely, to a director (switch). Current mainframes can have over one thousand channels and use two hexadecimal digits as the channel portion of an address. Channels are generally known as channel path identifiers (CHPIDs) or physical channel identifiers (PCHIDs) on later systems, although the term channel is also correct. The channels are all integrated in the main processor box.

8 Simple Conceptual S/360
The central processor box contains the processors, memory, control circuits, and interfaces for channels. A channel provides an independent data and control path between I/O devices and memory. Early systems had up to 16 channels; the largest mainframe machines at the time of this writing can have over 1000 channels. A channel can be considered a high-speed data bus. Channels connect to control units. A control unit contains logic to work with a particular type of I/O device. A control unit for a printer would have much different internal circuitry and logic than a control unit for a tape drive, for example. Some control units can have multiple channel connections, providing multiple paths to the control unit and its devices. Today's channel paths are dynamically attached to control units as the workload demands; this provides a form of virtualizing access to devices. More on this later in the chapter. Control units connect to devices, such as disk drives, tape drives, communication interfaces, and so forth. The division of circuitry and logic between a control unit and its devices is not precisely defined, but it is usually more economical to place most of the circuitry in the control unit.

9 Device address
- In the early design, the device address was physically related to the hardware architecture.
- Parallel channels had large-diameter, heavy copper "bus and tag" cables.
- This addressing scheme is still in use today, only virtualized (see the sketch below).
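To make the classic scheme concrete, here is a small illustrative sketch (not an IBM utility) that splits an early-style three-hex-digit "cuu" device address into its channel, control unit, and unit digits:

```python
# Illustrative sketch: decode a classic S/360-style three-hex-digit device
# address of the form "cuu" -- channel, control unit, unit.
def decode_cuu(address: str) -> dict:
    if len(address) != 3 or any(c not in "0123456789abcdefABCDEF" for c in address):
        raise ValueError("expected three hexadecimal digits, for example '190'")
    return {
        "channel": int(address[0], 16),
        "control_unit": int(address[1], 16),
        "unit": int(address[2], 16),
    }

print(decode_cuu("190"))   # {'channel': 1, 'control_unit': 9, 'unit': 0}
```

Address 190 would thus name unit 0 on control unit 9, reached through channel 1; today the same style of number survives as a virtualized device number rather than a description of the physical routing.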

10 Current design
Current CPC designs are considerably more complex than the early S/360 design, although they are modular in their architecture to allow for easy maintenance and upgrades. This design includes:
- ESCON and FICON channels connect to only one device or one port on a switch.
- Most modern mainframes use switches between the channels and the control units. The switches are dynamically connected to several systems, sharing the control units and some or all of their I/O devices across all the systems.
- CHPID addresses are composed of two hexadecimal digits.
- Multiple partitions can sometimes share CHPIDs. This is known as spanning. Whether this is possible depends on the nature of the channel type and the control units used through the CHPIDs. In general, CHPIDs used for disks can be shared.
- An I/O subsystem layer exists between the operating systems in partitions (or in the basic machine if partitions are not used) and the CHPIDs.
- The largest machine today can support up to four Logical Channel Subsystems (LCSSs), each having a maximum of 256 channels (see the sketch below).
- InfiniBand (IFB) is used as the pervasive, low-latency, high-bandwidth interconnect; it has low processing impact and is ideal for carrying multiple traffic types.
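A minimal sketch of the arithmetic behind that channel numbering (the LCSS-qualified "lcss.chpid" label used here is only an illustrative convention):

```python
# Minimal sketch of the channel numbering described above: up to 4 Logical
# Channel Subsystems (LCSSs), each with 256 CHPIDs addressed as two hex digits.
LCSS_COUNT = 4
CHPIDS_PER_LCSS = 256            # two hexadecimal digits: 0x00 through 0xFF

print(LCSS_COUNT * CHPIDS_PER_LCSS)   # 1024 channels in total

def chpid_label(lcss: int, chpid: int) -> str:
    """Format an LCSS-qualified CHPID label, e.g. LCSS 1, CHPID 0x4A -> '1.4A'."""
    if not (0 <= lcss < LCSS_COUNT and 0 <= chpid < CHPIDS_PER_LCSS):
        raise ValueError("LCSS must be 0-3 and CHPID 00-FF")
    return f"{lcss}.{chpid:02X}"

print(chpid_label(1, 0x4A))           # 1.4A
```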

11 I/O and Channel Anatomy

12 Device addressing

13 PR/SM Architecture supports multiple operating systems

14 System Control and Partitioning
Support Elements (SEs)
There are many ways to illustrate a mainframe's internal structure, depending on what we wish to emphasize. Shown here are several of the functions of the internal system controls on current mainframes. The IBM mainframe can be partitioned into separate logical computing systems. System resources (memory, processors, I/O channels) can be divided or shared among many such independent logical partitions (LPARs) under the control of the LPAR hypervisor, which comes with the standard Processor Resource/Systems Manager (PR/SM) feature on all mainframes. The hypervisor is a software layer that manages multiple operating systems running in a single central processing complex; the mainframe uses a type 1 hypervisor. Each LPAR supports an independent operating system (OS) loaded by a separate initial program load (IPL) operation. For many years there was a limit of 15 LPARs in a mainframe; today's machines can be configured with up to 60 logical partitions. Practical limitations of memory size, I/O availability, and available processing power usually limit the number of LPARs to fewer than these maximums. Each LPAR is considered an isolated and distinct server that supports an instance of an operating system, which can be any version or release supported by the hardware. In essence, a single mainframe can support the operation of several different OS environments. Either the SE or the Hardware Management Console (HMC) can be used to configure the IOCDS.

15 Hardware Management Console (HMC)

16 Logical Partition Weight Controls

17 Characteristics of LPARs
Processors – Storage - Channels
Logical partitions are, in practice, equivalent to separate mainframes. Each LPAR runs its own operating system (OS).
- This OS can be any mainframe operating system; there is no need to run z/OS, for example, in each LPAR.
The installation planners may elect to share I/O devices across several LPARs, but this is a local decision.
The system administrator can assign one or more system processors for the exclusive use of an LPAR.
- Alternatively, the administrator can allow all processors to be used by some or all LPARs.
- Here, the system control functions (often known as microcode or firmware) provide a dispatcher to share the processors among the selected LPARs.
The administrator can specify a maximum number of concurrent processors executing in each LPAR.
The administrator can also provide weightings for different LPARs, for example, specifying that LPAR1 should receive twice as much processor time as LPAR2 (see the sketch below).
The operating system in each LPAR is IPLed separately, has its own copy of its operating system, has its own operator console (if needed), and so on. If the system in one LPAR fails or is taken down for maintenance, it has no effect on the other LPARs.
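A minimal sketch, with assumed weight values, of how relative LPAR weights translate into shares of the shared processor pool (each LPAR's share is its weight divided by the sum of all active LPARs' weights):

```python
# Hypothetical LPAR weights; LPAR1 is weighted twice as heavily as LPAR2,
# matching the example in the text above.
weights = {"LPAR1": 40, "LPAR2": 20, "LPAR3": 20}

total = sum(weights.values())
for lpar, weight in weights.items():
    print(f"{lpar}: weight {weight} -> {weight / total:.0%} of the shared pool")
# LPAR1: weight 40 -> 50% of the shared pool
# LPAR2: weight 20 -> 25% of the shared pool
# LPAR3: weight 20 -> 25% of the shared pool
```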

18 Two Models for System z10 – Physical Dimensions
[Diagram: physical dimensions of the Business Class (BC) and Enterprise Class (EC, model 2097) frames; air flow is front to rear in both; frame dimensions shown in inches (79.3, 71.1, 61.7, 30.9).]

19 Enterprise Class design
Innovative designs:
- The Business Class design is based on drawers: one drawer houses the Central Processing Complex (CPC), and one to four drawers house the I/O features.
- The Enterprise Class design is based on books: up to four books house the Central Processing Complex, and one to three cages house the I/O features.

20 Closer look at mainframe hardware internals

21 Consolidation of mainframes
There are fewer mainframes in use today than there were 20 years ago because of corporate mergers and data center consolidations. An additional reason for consolidation is that mainframe software (from many vendors) can be expensive, often costing more than the mainframe hardware. As an aid to consolidation, the mainframe offers software virtualization through z/VM. Mainframes require fewer staff when supporting hundreds of applications.

22 Processing units
- Central processor (CP)
- System Assist Processors (SAP)
- Integrated Facility for Linux (IFL)
- z/OS Application Assist Processor (zAAP)
- z/OS Integrated Information Processor (zIIP)
- Integrated Coupling Facility (ICF)
- Spare (Capacity On Demand)
- Subcapacity processors

23 Disk Devices
IBM 3390 disk drives are commonly used on current mainframes. The associated control unit (3990) typically has up to four fibre channels connected to one or more processors (probably with a switch), and the 3390 unit typically has eight or more disk drives. [Figure: a simple 3390 arrangement.]

24 Storage Management Hardware
Disk storage:
- Traditional DASD (such as 3380 and 3390)
- Enterprise Storage Server (ESS)
- DS6000 and DS8000
Today customers use tapes for such purposes as backup, archiving, or data transfer between companies. The following types of tape devices are used:
- Traditional tapes such as 3480 and 3490
- IBM Magstar® 3590 and 3592
- Automated tape library (ATL) 3494
- Virtual tape server (VTS)

25 Basic shared DASD

26 Channel To Channel Rings - Basic sysplex
A basic sysplex shares:
- Usage and locking information for data sets on disks. This allows the system to automatically prevent unwanted duplicate access to data sets. This locking is based on JCL specifications provided for jobs sent to the system.
- Job queue information, such that all the systems in the ring can accept jobs from a single input queue.
- Security controls that allow uniform security decisions across all systems.
- Disk metadata controls, so that RESERVE and RELEASE disk commands are not necessary.

27 Sysplex hardware overview and Server Time Protocol
Today's implementation uses the Server Time Protocol (STP), which is a server-wide facility implemented in the Licensed Internal Code (LIC). STP presents a single view of time to Processor Resource/Systems Manager™ (PR/SM), and is designed to provide the capability for multiple mainframe servers to maintain time synchronization with each other. NOTE: It is the follow-on to the Sysplex Timer. The Sysplex Timer distributes time to multiple servers in a star pattern; that is, the Sysplex Timer is the star, and its time signals radiate out from it to all attached servers. The signals from the Sysplex Timer are used to increment, or step, the TOD clocks in the attached servers.
- Unlike the Sysplex Timer, STP passes time messages in layers, or strata. The top layer (Stratum 1) distributes time messages to the layer immediately below it (Stratum 2).
- Stratum 2 in turn distributes time messages to Stratum 3, and so on.
In a timing network based on STP, a stratum is used as a means to define the hierarchy of a server in the timing network. A Stratum 1 server is the highest level in the hierarchy in the STP network (see the sketch below).
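An illustrative sketch (hypothetical server names; not STP itself) of the stratum hierarchy just described, where each server's stratum is one more than that of the server it takes its time from:

```python
# Stratum 1 is the time source; Stratum 2 servers synchronize to Stratum 1,
# Stratum 3 to Stratum 2, and so on.
class Server:
    def __init__(self, name, time_source=None):
        self.name = name
        self.time_source = time_source   # the server this one takes time from

    @property
    def stratum(self):
        return 1 if self.time_source is None else self.time_source.stratum + 1

cec1 = Server("CEC1")                    # Stratum 1 (the current time server)
cec2 = Server("CEC2", time_source=cec1)  # Stratum 2
cec3 = Server("CEC3", time_source=cec2)  # Stratum 3

for s in (cec1, cec2, cec3):
    print(f"{s.name} is a Stratum {s.stratum} server")
```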

28 Parallel Sysplex is like a symphony

29 What is a Coupling Facility (CF)?
A CF functions largely as a fast scratch pad. It is used for three purposes:
- Locking information that is shared among all attached systems
- Cache information (such as for a database) that is shared among all attached systems
- Data list information that is shared among all attached systems
(See the sketch below.)
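A minimal sketch (assumed structure names and contents; this is not the CF programming interface) of the three kinds of shared structures listed above:

```python
# The CF holds lock, cache, and list structures that every attached z/OS
# image can read and update.
coupling_facility = {
    "lock":  {},   # resource name -> owning system, used for serialization
    "cache": {},   # record key    -> shared copy of frequently used data
    "list":  [],   # ordered entries shared by all members, e.g. shared queues
}

coupling_facility["lock"]["DB2.TABLESPACE.A"] = "SYSA"        # SYSA holds it
coupling_facility["cache"]["CUST#000123"] = {"balance": 250}  # shared record
coupling_facility["list"].append({"from": "SYSB", "work": "queued request"})

print(coupling_facility)
```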

30 Why use a Coupling Facility
z/OS applications on different LPARs often need to access the same information, sometimes to read it and other times to update it. Linking a number of images together brings with it special considerations, such as how the servers communicate and how they cooperate to share resources. Implementing a sysplex significantly changes the way z/OS systems share data. Use of the Coupling Facility (CF) significantly improves the viability of connecting many z/OS systems in a sysplex to process work in parallel. Within a single z/OS system, the data management system keeps track of which piece of data is being accessed or changed by which application in the system. The information in the CF resides in very large memory structures. With the CF, a sysplex can:
- Dynamically balance workloads across systems with high performance
- Improve availability for both planned and unplanned outages
- Provide for system or application rolling maintenance
- Offer scalable workload growth both vertically and horizontally
- View multiple-system environments as a single logical resource

31 Clustering Technologies for the mainframe
Parallel Sysplex technology helps to ensure continuous availability in today's large systems environments. A Parallel Sysplex allows linking of up to 32 servers, with near-linear scalability, to create a powerful commercial processing clustered system. Every server in a Parallel Sysplex cluster can be configured to share access to data resources, and a "cloned" instance of an application might run on every server. A unique advantage of using Parallel Sysplex technology is the ability to perform hardware and software maintenance and installation in a nondisruptive manner. [Diagram: rolling maintenance across z/OS LPAR1 through LPAR4, deactivating one LPAR at a time.]

32 Parallel Sysplex workload balancing

33 Intelligent Resource Director
Intelligent Resource Director is not actually a product or a system component; rather, it is three separate but mutually supportive functions:
- WLM LPAR CPU Management: This function provides the means to modify an LPAR weight to a higher value, moving logical CPUs to an LPAR that is missing its service level goal (see the sketch below).
- Dynamic Channel-path Management (DCM): Dynamic Channel-path Management is designed to dynamically adjust the channel configuration in response to shifting workload patterns. DCM is implemented by exploiting functions in software components, such as WLM, I/O, and hardware configuration. This support for DASD controllers lets the system automatically manage the number of I/O paths available to disk devices.
- Channel Subsystem I/O Priority Queuing (CSS IOPQ): z/OS uses this function to dynamically manage the channel subsystem priority of I/O operations for given workloads, based on the performance goals for these workloads as specified in the WLM policy. Channel Subsystem I/O Priority Queuing works at the channel subsystem level and affects every I/O request (for every device, from every LPAR) on the CPC.
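A rough sketch of the first of these functions, with hypothetical LPAR names, weights, and decision logic (the real function is driven by WLM goal data): shift weight from a donor LPAR to the LPAR that is missing its goal, keeping the cluster's total weight constant so partitions outside the cluster are unaffected.

```python
# Hypothetical LPAR cluster weights.
weights = {"PROD": 40, "TEST": 30, "BATCH": 30}

def shift_weight(weights, receiver, donor, amount):
    """Move `amount` of weight from donor to receiver; the total is unchanged."""
    moved = min(amount, weights[donor])
    weights[donor] -= moved
    weights[receiver] += moved
    return weights

# PROD is missing its service level goal, so it receives weight from BATCH:
print(shift_weight(weights, receiver="PROD", donor="BATCH", amount=10))
# {'PROD': 50, 'TEST': 30, 'BATCH': 20} -- the cluster total is still 100
```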

34 Intelligent Resource Director (IRD)
IRD establishes an enhanced synergy between WLM and the hardware. IRD uses WLM to direct resources to the priority work, and amplifies the advantage of WLM components by pooling resources together as LPAR clusters. IRD has three major functions:
- Dynamic CPU Management
- Dynamic Channel Path Management
- Channel Subsystem Priority Queuing
The z/OS operating system running in a logical partition provides yet another level of heterogeneous workload management. Multiple workloads within a z/OS logical partition are given an even finer granularity of workload management via the WLM component of z/OS. The priority of the logical partition with respect to other logical partitions is managed by PR/SM. The priority of the individual workloads and applications within the logical partition is managed by WLM. IRD creates a synergy between the management done by PR/SM and the management done by WLM, producing a fully dynamic and automatic resource management capability that is unmatched in the industry. The detailed knowledge of WLM can be used to redirect the machine (PR/SM) resource allocations according to workload priorities across a set of logical partitions on a CPC called an LPAR cluster. The WLM component of z/OS is responsible for ensuring that customer policy goals are met for the set of diverse applications and workloads that a customer may run. This includes making changes or adjustments at the local sysplex member level, as well as redistributing work across members of a sysplex when needed, and is largely thought of as moving the work to the resources. With IRD, WLM augments the adjustments it makes to local sysplex members by managing the LPAR cluster. This can be thought of as moving the resources to the work. [Diagram: an LPAR cluster of z/OS images on one CPC connected to a CF, alongside z/OS LPAR members not participating in the cluster.]

35 What happens when LPARs get into unexpected resource trouble?
Resources are not consumed equally in each system. White space (unused resources) in one LPAR can be used by another LPAR to meet its service level objectives. [Diagram: four LPARs with weights 15, 25, 40, and 20.]

36 Benefits of IRD
At very low cost, LPAR clustering improves systems management by managing:
1. CPU resources
   A. Dynamic distribution of capacity within the LPAR cluster while protecting capacity outside the LPAR cluster
   B. Improves efficiency
   C. Uses upgraded capacity immediately
2. I/O resources
   A. Prioritizes work when I/O is constrained (CSSQ)
   B. Improves channel configuration efficiency (DCM)

37 Prioritizing Work Across Images in a Server – IRD
PR/SM, IRD, and WLM work together to ensure that the resources of the server are correctly balanced to enable work to complete within stated policy goals. [Diagram: a System z9 server with four logical partitions (z/VM running Linux, two z/OS production partitions running WAS, CICS, and DB2, and a z/OS testing partition running low-priority batch), each with its own weight, managed by Intelligent Resource Director and Processor Resource/Systems Manager and connected to a Coupling Facility; two partitions are flagged as needing more resources.] Processor resources, data bandwidth, and I/O queueing decisions are perfectly balanced across the server to manage diverse workloads within the parameters of stated business goals.

38 No operator intervention: Processor Resource/System Manager (PR/SM)
LPAR CPU weights can change dynamically, with no operator intervention, moving resources to where the workload is. [Diagram: a System z9 with three LPARs (WebSphere/SOA, traditional OLTP and batch, and Linux) managed by Intelligent Resource Director and PR/SM; as workload demand increases, the LPAR weights shift from 50/20/30 to 70/20/10.]

39 Dynamic Channel Management
Channels dynamically move to the workload. IRD modifies channel distribution to meet changing workloads so that SLOs continue to be met. [Diagram: an example at 9:30 AM, with FICON channels running through a director/switch to DB2 disk and other disk; by 9:45-9:46, as workload increases in the SOA LPAR, the channel subsystem redistributes channel paths among the WebSphere (SOA), traditional OLTP and batch, and Linux LPARs under Intelligent Resource Director and PR/SM on a System z9, and the LPAR weights shift as well.]

40 IRD – Channel Subsystem Priority Queuing
I/O Priority Queuing prioritizes I/O within an LPAR; Channel Subsystem Priority Queuing prioritizes I/O within an LPAR cluster.
It allows better channel resource management:
- High-priority work is given preferential access to the channel
- Can reduce channel requirements
- Managed within the System z server
This feature works with the System Assist Processor (SAP), a special engine native to the mainframe, which uses work queues to prioritize I/O and direct resources through priority (see the sketch below). [Diagram: an LPAR cluster with high-, medium-, and low-priority workloads.]
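A toy sketch of priority queuing (hypothetical priorities and requests; not SAP microcode): higher-priority I/O requests are started before lower-priority ones, which is the preferential treatment described above.

```python
import heapq

io_queue = []   # entries are (priority, sequence, request); lower number = more important
sequence = 0    # preserves first-in, first-out order within a priority

def queue_io(priority, request):
    global sequence
    heapq.heappush(io_queue, (priority, sequence, request))
    sequence += 1

queue_io(3, "batch read from a low-priority LPAR")
queue_io(1, "DB2 log write from a production LPAR")
queue_io(2, "CICS read from a production LPAR")

while io_queue:
    _, _, request = heapq.heappop(io_queue)
    print("start I/O:", request)
# The DB2 log write is started first even though it was queued second.
```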

41 Platform Performance Management with zEnterprise
The ability to define Platform Workloads, which are groupings of the virtual servers (AIX® partitions on POWER® blades, z/VM virtual machines, and PR/SM LPARs) that support business applications.
- Consider a three-tiered application with a web server running on a POWER blade, IBM WebSphere Application Server running in a Linux on System z z/VM guest, and DB2 in a z/OS partition. The workload would be the three virtual servers.
The ability to define a performance policy for a Platform Workload that is used to monitor and manage the virtual servers in that Platform Workload.
- This performance policy allows goal-oriented performance objectives and workload importance to be set for the virtual servers in the Platform Workload.
The ability to provide performance monitoring in the context of the Platform Workload (see the sketch below).
- This monitoring indicates whether the virtual servers in the Platform Workload are achieving their goals.
- If not, it helps determine which virtual servers are contributing to the performance problem.
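A minimal sketch of that idea with made-up server names, goals, and observed values (this is not the Unified Resource Manager interface): group the virtual servers behind one application, attach a performance policy, and report which members miss the goal.

```python
workload = {
    "name": "OnlineBanking",                       # hypothetical workload
    "policy": {"goal_response_seconds": 1.0, "importance": "high"},
    "virtual_servers": [                           # observed values are made up
        {"name": "web server (POWER blade)", "observed_response": 0.4},
        {"name": "WAS (Linux on System z)",  "observed_response": 1.6},
        {"name": "DB2 (z/OS LPAR)",          "observed_response": 0.3},
    ],
}

goal = workload["policy"]["goal_response_seconds"]
missing = [s["name"] for s in workload["virtual_servers"]
           if s["observed_response"] > goal]
print("Virtual servers contributing to the performance problem:", missing)
```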

42 Sample ensemble: Remaining virtual servers are defined to the “Administration” Workload

43 Typical mainframe system growth
Customer Initiated Upgrade (CIU): The CIU feature enables a customer to order permanent capacity upgrades rapidly and download them without disrupting applications already running on the machine.
- When extra processing power becomes necessary, an administrator simply uses a two-step process:
  a. Navigate to a special web-based link to order an upgrade.
  b. Use the Remote Service Facility on the Hardware Management Console (HMC) to download and activate preinstalled inactive processors (uncharacterized engines) or memory.
On/Off Capacity on Demand (On/Off CoD): This feature is ordered through CIU and provides temporary increases in processor capacity.
- With temporary processor capacity, customers can manage both predictable and unpredictable surges in capacity demands.
- They can activate and deactivate capacity quickly and efficiently, as the demands on their organization dictate, to obtain the additional capacity they need when they need it, and the machine keeps track of its usage.
Capacity Backup (CBU): Customers can use CBU to add temporary processing capacity to a backup machine in the event of an unforeseen loss of server capability because of an emergency.
- With CBU, customers can divert entire workloads to backup servers for up to 90 days.
(See the sketch below.)
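An illustrative sketch (hypothetical records and dates; not the CIU or On/Off CoD interface) contrasting the three offerings: a permanent CIU upgrade stays active, On/Off CoD capacity is tracked while it is active, and CBU capacity is limited to a 90-day emergency window.

```python
from datetime import date, timedelta

upgrades = [
    {"offering": "CIU",        "engines": 2, "permanent": True},
    {"offering": "On/Off CoD", "engines": 4, "permanent": False,
     "activated": date(2011, 5, 2), "deactivated": date(2011, 5, 9)},
    {"offering": "CBU",        "engines": 8, "permanent": False,
     "activated": date(2011, 5, 1),
     "deactivate_by": date(2011, 5, 1) + timedelta(days=90)},
]

for u in upgrades:
    if u["permanent"]:
        print(f"{u['offering']}: +{u['engines']} engines, permanent")
    elif "deactivate_by" in u:
        print(f"{u['offering']}: +{u['engines']} engines, must end by {u['deactivate_by']}")
    else:
        days = (u["deactivated"] - u["activated"]).days
        print(f"{u['offering']}: +{u['engines']} engines, active for {days} days (usage tracked)")
```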

44 Example of Upgrade Panel

45 Continuous availability of mainframes
(Short list)
- Parallel Sysplex technology is an enabling technology, allowing highly reliable, redundant, and robust mainframe technologies to achieve near-continuous availability.
- Hardware and software components provide concurrency to facilitate nondisruptive maintenance, such as Capacity Upgrade on Demand, which allows processing or coupling capacity to be added one engine at a time without disruption to running workloads.
- DASD subsystems employ disk mirroring or RAID technologies to help protect against data loss, and exploit technologies to enable point-in-time backup without the need to shut down applications.
- Networking technologies deliver functions such as VTAM® Generic Resources, Multi-Node Persistent Sessions, Virtual IP Addressing, and Sysplex Distributor to provide fault-tolerant network connections.
- I/O subsystems support multiple I/O paths and dynamic switching to prevent loss of data access and to improve throughput.
- z/OS software components allow new software releases to coexist with lower levels of those software components to facilitate rolling maintenance.
- Business applications are "data sharing-enabled" and cloned across servers to allow workload balancing and to prevent loss of application availability in the event of an outage.
- Operational and recovery processes are fully automated and transparent to users, and reduce or eliminate the need for human intervention.
- z/OS has a Health Checker to assist in avoiding outages. It uses "best practices" to identify potential problems before they impact availability, and produces output in the form of detailed messages with suggested actions to take. (Discussed in the next session.)

46 Systems management technologies
Workload management component: Makes a contract between the installation (user) and the operating system. The installation classifies the work running on the z/OS operating system into distinct service classes and defines goals for them that express the expectation of how the work should perform (see the sketch below).
Sysplex Failure Manager: Allows the installation to specify failure detection intervals and recovery actions to be initiated in the event of the failure of a system in the sysplex.
Automatic Restart Manager: Enables fast recovery of subsystems that might hold critical resources at the time of failure.
Cloning and symbolics: Cloning refers to replicating the hardware and software configurations across the different physical servers in the Parallel Sysplex (identical instances). System symbolics are used to help manage cloning.
z/OS resource sharing: The IBM Coupling Facility shared storage provides a medium for sharing component information for the purpose of multi-system resource management.
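A simplified sketch of that contract with made-up service classes, goals, and classification rules (this is not WLM policy syntax): incoming work is classified into a service class, and each class carries a goal and an importance.

```python
service_classes = {
    "ONLINE":  {"goal": "90% of transactions complete in under 0.5s", "importance": 1},
    "BATCHHI": {"goal": "execution velocity 50",                      "importance": 3},
    "BATCHLO": {"goal": "discretionary",                              "importance": 5},
}

# Classification rules: (subsystem, job name prefix) -> service class.
classification_rules = [
    ("CICS", "",    "ONLINE"),
    ("JES",  "PAY", "BATCHHI"),
    ("JES",  "",    "BATCHLO"),
]

def classify(subsystem, jobname):
    for sub, prefix, service_class in classification_rules:
        if subsystem == sub and jobname.startswith(prefix):
            return service_class
    return "BATCHLO"

sc = classify("JES", "PAYROLL1")
print(sc, service_classes[sc])   # BATCHHI {'goal': 'execution velocity 50', 'importance': 3}
```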

47 Application compatibility
Technology benefits: Scalability (even with nondisruptive upgrades), availability, and dynamic workload management are tools that enable an architect to meet customer needs in cases where the application plays a key role in the customer's business process.
Integration benefits: Because many applications are historically S/390- and z/OS-based, new applications on z/OS get performance and maintenance benefits, especially if they are connected to existing applications.
Infrastructure benefits: If there is already an existing Parallel Sysplex, little infrastructure work is needed to integrate a new application. In many cases, the installation does not need to add new servers; instead, it can use the existing infrastructure and make use of the strengths of the existing sysplex. With Geographically Dispersed Parallel Sysplex™ (GDPS®) connecting multiple sysplexes in different locations, the mainframe IT staff can create a configuration that is enabled for disaster recovery.

48 Notes:
1. z/OS R7 and z/OS R8 support require IBM Lifecycle Extension for z/OS (5637-A01 or 5638-A01). PTFs required for z/OS R8-R12; refer to the PSP.
2. Maximum of 1 TB per LPAR. Maximum supported by z/OS R7 is 512 GB. z/OS R8 and later are designed to support up to 4 TB per image.
3. z/OS R12 required.
4. z/OS R12, or R10 or later with PTFs, required.
5. Cryptographic Support for z/OS V1.10 through z/OS V1.12 Web deliverable with the PTF for APAR OA33260 required.

49 Summary
- Terminology is important.
- The classic S/360 design is important, as all later designs have enhanced it. The concepts are still relevant.
- New processor types are now available to reduce software costs.
- Clustering techniques and Parallel Sysplex.

50 2.15 Questions for review
To help test your understanding of the material in this chapter, answer the following questions:
1. Why does software pricing for mainframes seem so complex?
2. Why does IBM have so many models (or "capacity settings") for recent mainframe machines?
3. Why does the power needed for a traditional COBOL application not have a linear relationship with the power needed for a new Java application?
4. Multiprocessing means running several processors simultaneously (available to the operating system and applications). What does multiprogramming mean?
5. What are the differences between loosely coupled systems and tightly coupled systems?
6. What z/OS application changes are needed for it to work in an LPAR?
7. What z/OS application changes are needed to work in a Parallel Sysplex?
8. How do specialty processors help applications?
9. How do disaster recovery solutions benefit a global business?

51 2.16 Topics for further discussion
1. What are the advantages of a Parallel Sysplex presenting a single image externally? Are there any disadvantages?
2. Why is continuous availability required in today's marketplace?
3. How might someone justify the cost of the "redundant" hardware and the cost of the software licences required to build a Parallel Sysplex?

52 2.17 Exercises
Here are some exercises you can perform:
1. To display the CPU configuration:
   a. Access SDSF from the ISPF primary option menu.
   b. In the command input field, enter /D M=CPU and press Enter.
   c. Use the ULOG option in SDSF to view the command display result.
2. To display the page data set usage:
   a. In the command input field, enter /D ASM and press Enter.
   b. Press PF3 to return to the previous screens.
3. To display information about the current Initial Program Load (IPL):
   a. In the command input field, enter /D IPLINFO and press Enter.
   b. Use the ULOG option in SDSF to view the command display result.

53 End of May 2011 slides

