Chapter 2A: Hardware systems and LPARs

Presentation on theme: "Chapter 2A: Hardware systems and LPARs"— Presentation transcript:

1 Chapter 2A: Hardware systems and LPARs

2 Objectives In this chapter you will learn:
About S/360 and zSeries hardware design
Mainframe terminology
Hardware components
About processing units and disk hardware
How mainframes differ from PC systems in data encoding
Some typical hardware configurations

3 Introduction Here we look at the hardware in a complete system, although the emphasis is on the processor “box”.
Terminology is not straightforward: ever since “boxes” became multi-engined, the terms system, processor, and CPU have become muddled.

4 Terminology Overlap In the early S/360 days a system had a single processor, which was also known as the central processing unit (CPU). The terms system, processor, and CPU were used interchangeably. However, these terms became confusing when systems became available with more than one processor. Today the mainframe has a rich heritage of terms as illustrated here.

5 Early system design System/360 was designed in the early 1960s
The central processor box contains the processors, memory, control circuits and channel interfaces.
Early systems had up to 16 channels, whereas modern systems have 1024 (256 channels * 4 Logical Channel Subsystems).
Channels connect to control units.
Control units connect to devices such as disk drives, tape drives and communication interfaces.

6 Device address In the early design the device address was physically related to the hardware architecture.
Parallel channels had large-diameter, heavy copper “bus and tag” cables.
This addressing scheme is still in use today, only virtualized.
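The classic device address packed the physical path into three hex digits (channel, control unit, unit). A minimal sketch, assuming this simplified three-nibble layout; the function and field names are my own, not an official IBM definition:

```python
# Illustrative only: a classic S/360-style device address such as 0x190
# encodes the path as three hex digits -- channel, control unit, unit.

def split_device_address(addr: int) -> dict:
    """Split a 3-hex-digit device address into its path components."""
    return {
        "channel": (addr >> 8) & 0xF,       # first hex digit
        "control_unit": (addr >> 4) & 0xF,  # second hex digit
        "unit": addr & 0xF,                 # third hex digit
    }

print(split_device_address(0x190))  # channel 1, control unit 9, unit 0
```

Today the same style of device number survives, but it is virtualized rather than tied to physical cabling.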

7 Parallel Channel “Connectivity”
The maximum data rate of the parallel channel is up to 4.5 MB per second, and the maximum distance that can be achieved with a parallel channel interface is up to 122 meters (400 ft). These specifications can be limited by the connected control units and devices.

8 Conceptual S/360 The central processor box contains the processors, memory, control circuits, and interfaces for channels. A channel provides an independent data and control path between I/O devices and memory. Early systems had up to 16 channels; the largest mainframe machines at the time of this writing can have over 1000 channels. A channel can be considered a high-speed data bus. Channels connect to control units. A control unit contains logic to work with a particular type of I/O device. A control unit for a printer would have very different internal circuitry and logic than a control unit for a tape drive, for example. Some control units can have multiple channel connections, providing multiple paths to the control unit and its devices. Today’s channel paths are dynamically attached to control units as the workload demands. This provides a form of virtualizing access to devices; more on this later in the chapter. Control units connect to devices, such as disk drives, tape drives, communication interfaces, and so forth. The division of circuitry and logic between a control unit and its devices is not defined, but it is usually more economical to place most of the circuitry in the control unit.

9 Current design Current CEC designs are considerably more complex than the early S/360 design, although modular in their architecture to allow for easy maintenance and upgrades. This new design includes:
- CEC modular components
- I/O housing
- I/O connectivity
- I/O operation
- Partitioning of the system

10 Recent Configurations
Most modern mainframes use switches between the channels and the control units. The switches are dynamically connected to several systems, sharing the control units and some or all of their I/O devices across all the systems. Multiple partitions can sometimes share channels; this capability is known as spanning.

11 ESCON Connectivity ESCON (Enterprise Systems Connection) is a data connection created by IBM, commonly used to connect its mainframe computers to peripheral devices. ESCON is an optical fiber, half-duplex, serial interface. It originally operated at a rate of 10 MByte/s, which was later increased to 17 MByte/s; the current maximum distance is 43 kilometers. ESCON replaced the older, slower (4.5 MByte/s), copper-based, parallel Bus & Tag channel technology of earlier mainframes. ESCON channels use a director to support dynamic switching. Optical fiber is smaller in diameter and weight, and hence could save installation costs. Space and labor could also be reduced when fewer physical links were required, due to ESCON’s switching features. ESCON allows the establishment and reconfiguration of channel connections dynamically, without having to take equipment off-line and manually move the cables. ESCON supports channel connections using serial transmission over a pair of fibers. The ESCON Director supports dynamic switching (which could be achieved prior to ESCON, but not with IBM-only products). It also allows the distance between units to be extended up to 60 km over a dedicated fiber. “Permanent virtual circuits” are supported through the switch. ESCON switching has advantages over a collection of point-to-point links: a peripheral previously capable of accessing a single mainframe can now be connected simultaneously to up to eight mainframes, providing peripheral sharing.

12 ESCON Director ESCD ESCD
ESCON Director is an I/O switch capable of providing dynamic, nonblocking, any-to-any connectivity for up to 60 fiber optic links operating at 200 Mb/s.

14 Fiber Connectivity (FICON)
FICON (for Fiber Connectivity) was the next generation high-speed input/output (I/O) interface used for mainframe computer connections to storage devices. FICON channels increase I/O capacity through the combination of a new architecture and faster physical link rates, making them up to eight times as efficient as ESCON (Enterprise System Connection).

15 ESCON vs FICON ESCON
- 20 Mbytes / second
- Lots of “dead time”. One active request at a time.
- One target control unit
FICON
- 400 Mbytes / second, moving to 800
- Uses FCP standard
- Fiber optic cable (less space under floor)
- Currently, up to 64 simultaneous “I/O packets” at a time, with up to 64 different control units
- Supports cascading switches

16 System z I/O Connectivity
ESCON and FICON channels
Switches to connect peripheral devices to more than one CEC
The Channel Subsystem handles the channel scheduling
A CHPID is an 8-bit Channel Path Id: addresses are two hex digits (00–FF, so 256 per LCSS)
The Channel Subsystem maps the CHPID to the channel and device numbers, queues I/O requests and selects the available path
Multiple partitions (LPARs) can share CHPIDs
The channel subsystem layer exists between the operating system and the CHPIDs
MIF = Multiple Image Facility: share a CHPID across LPARs
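MIF-style sharing can be pictured as a mapping from two-hex-digit CHPIDs to the set of partitions using each path. This is a toy model for illustration only; the class and method names are my own, not any IBM API:

```python
# Toy sketch of MIF: one CHPID (a physical channel path) can be shared
# by several LPARs through the channel subsystem.

class ChannelSubsystem:
    MAX_CHPIDS = 256  # two hex digits: 0x00 - 0xFF per LCSS

    def __init__(self):
        self.chpid_to_lpars: dict[int, set[str]] = {}

    def share_chpid(self, chpid: int, lpar: str) -> None:
        if not 0 <= chpid < self.MAX_CHPIDS:
            raise ValueError(f"CHPID {chpid:#x} out of range 00-FF")
        self.chpid_to_lpars.setdefault(chpid, set()).add(lpar)

css = ChannelSubsystem()
css.share_chpid(0x4A, "LPAR1")  # same physical channel path ...
css.share_chpid(0x4A, "LPAR2")  # ... shared by a second partition (MIF)
print(sorted(css.chpid_to_lpars[0x4A]))  # ['LPAR1', 'LPAR2']
```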

17 MIF Channel Consolidation - example
[Diagram: some channels statically assigned, others dynamically assigned]

18 I/O Connectivity Addressing and Definitions
The I/O control layer uses a control file, the IOCDS, that translates physical I/O addresses into device numbers that are used by z/OS. Device numbers are assigned by the system programmer when creating the IODF and IOCDS, and are arbitrary (but not random!). On modern machines they are three or four hex digits; for example, FFFF = 64K devices can be defined. With dynamic addressing, a theoretical maximum of 7,848,900 devices can be attached. Today’s mainframe (z990): up to 64,512 subchannels per LCSS maximum. With 15 LPARs per LCSS = 967,680 subchannels per LCSS maximum. That said, we can’t multiply that by 4 LCSSes because the z990 is limited to 30 LPARs. No matter how the 30 LPARs are placed in 2, 3 or 4 LCSSes, the maximum number of subchannels is 1,935,360. (Decide for yourself whether you want to consider 1K to be 1,000 or 1,024 for this discussion; when we publish 63K, the definition is 1,024.) System z9 EC: up to 65,280 subchannels in Set 0 (768 more than the z990) plus 65,535 in Set 1 per LCSS = 130,815 subchannels per LCSS maximum. With 15 LPARs per LCSS = 1,962,225 subchannels per LCSS maximum. We CAN multiply this by 4 LCSSes because the z9 EC supports 60 LPARs, so the maximum number of subchannels is 7,848,900, more than 4 times as many as the z990. The corresponding number for the z9 BC is half the number for the z9 EC, because the BC supports only 30 LPARs. Note that these numbers are theoretical maximums. How many a given system will actually use, even if all 60 LPARs are defined, is determined by the HCD/IOCP definition. This complex topic is discussed in the “IOCP User’s Guide” for each machine. Also, the subchannels in Set 1 on the z9 do NOT represent more devices, because they are used only for parallel access volumes (PAVs) for disk devices defined in Set 0. So: more I/O access, not more devices. This may allow larger disk volumes to be defined, so it is fair to say that more subchannels enable access to more data.
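The subchannel maximums quoted above can be re-derived directly from the per-LCSS figures and LPAR limits:

```python
# Re-deriving the slide's subchannel maximums arithmetically.
z990_per_lcss = 64_512          # subchannels per LCSS on z990
z990_max = z990_per_lcss * 30   # z990 is limited to 30 LPARs total
assert z990_max == 1_935_360

z9_per_lcss = 65_280 + 65_535   # Set 0 plus Set 1 per LCSS on z9 EC
assert z9_per_lcss == 130_815
z9_max = z9_per_lcss * 60       # z9 EC supports 60 LPARs across 4 LCSSs
assert z9_max == 7_848_900
print(z990_max, z9_max)         # 1935360 7848900
```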

19 Channel Subsystem Relationship to Channels, Control Units and I/O Devices (z10)
Channel Subsystem: controls queuing, de-queuing, priority management and I/O identification of all I/O operations performed by LPARs.
Partitions: support the running of an OS and allow CPs, memory and subchannels access to channels.
Subchannels: each represents an I/O device to the hardware and is used by the OS to pass an I/O request to the channel subsystem.
Channels: the communication path from the channel subsystem to the I/O network and the connected control units and devices.
[Diagram: Partitions -> Subchannels -> Channels -> Control Units (CU) -> Devices (disk, tape, printers)]
The role of the Channel Subsystem is to control communication of internal and external channels to control units and devices. The configuration definitions of the CSS define the operating environment for the correct execution of all system Input/Output (I/O) operations. The CSS provides the server communications to external devices via channel connections. The channels permit transfer of data between main storage and I/O devices or other servers under the control of a channel program. The CSS allows channel I/O operations to continue independently of other operations within the central processors. The building blocks that make up a Channel Subsystem are listed here. The structure provides up to four Channel Subsystems. Each CSS has from one to 256 CHPIDs, and may be configured with up to 15 logical partitions that relate to that particular Channel Subsystem. CSSs are numbered from 0 to 3, and are sometimes referred to by the CSS Image ID (CSSID 0, 1, 2, and 3). Note: The z9 and z10 EC provide for up to four Channel Subsystems, 1024 CHPIDs, and up to 60 logical partitions for the total system.

20 Channel Spanning across LPARs (partitions)
Channel spanning extends the MIF concept of sharing channels across logical partitions to sharing channels across Logical Channel Subsystems and logical partitions. Spanning is the ability for the channel to be configured to multiple Logical Channel Subsystems. When defined that way, the channels can be transparently shared by any or all of the configured logical partitions, regardless of the Logical Channel Subsystem to which the logical partition is configured. A channel is considered a spanned channel if the same CHPID number in different LCSSs is assigned to the same PCHID in IOCP, or is defined as “spanned” in HCD. In the case of internal channels (for example, IC links and HiperSockets), the same applies, but there is no PCHID association; they are defined with the same CHPID number in multiple LCSSs. CHPIDs that span LCSSs reduce the total number of channels available on the z9 EC, z9 BC, z990, or the z890. The total is reduced, since no LCSS can have more than 256 CHPIDs. For a z890 that supports two LCSSs, a total of 512 CHPIDs are supported. If all CHPIDs are spanned across the two LCSSs, then only 256 channels can be supported. For a z9 EC or a z990 with four LCSSs, a total of 1024 CHPIDs are supported. If all CHPIDs are spanned across the four LCSSs, then only 256 channels can be supported.
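The spanning arithmetic above can be checked with a short sketch (illustrative only; the helper function is my own):

```python
# Spanning a CHPID consumes the same CHPID number in every LCSS, so a
# fully spanned configuration collapses back to 256 usable channels.
CHPIDS_PER_LCSS = 256

def max_chpids(lcss_count: int, all_spanned: bool) -> int:
    """Total usable CHPIDs for a machine with lcss_count LCSSs."""
    return CHPIDS_PER_LCSS if all_spanned else CHPIDS_PER_LCSS * lcss_count

print(max_chpids(2, False))  # 512  -- z890-style, two LCSSs
print(max_chpids(4, False))  # 1024 -- z9 EC / z990, four LCSSs
print(max_chpids(4, True))   # 256  -- every CHPID spanned across all four
```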

21 Physical-Channel Subsystem
The Mainframe I/O Logical Channel Subsystem Schematic
[Diagram: four Logical-Channel Subsystems (0-3) sit above the Physical-Channel Subsystem (cache, MBA, SAP), connecting through FICON and ESCON switches to control units and devices. One FICON channel shared by all LCSSs and all partitions is an MCSS-spanned channel path; one ESCON channel shared by all partitions configured to one LCSS is a MIF-shared channel path.]
The mainframe’s CSS comprises all of the hardware and firmware required to implement the System z CSS architecture, including all of the different types of channel paths provided by System z. The firmware in the system assist processors (SAPs) and I/O channel paths performs the bulk of the I/O instructions and I/O interrupt processing. There is also firmware in the CPs that initiates the I/O instructions and participates in the handling of I/O interruptions. The CSS directs the flow of information between I/O devices and main storage. The CSS uses one or more channel paths as the communication links in managing the flow of this information. As part of I/O processing, the CSS also performs channel-path selection and channel-path management functions, such as testing for channel-path availability, selecting an available channel path, and initiating execution of I/O operations over the selected channel path with the attached I/O devices. When I/O operations are completed, the CSS analyzes the resulting status and transmits it back to the program by use of I/O interruptions and I/O status information. To provide communication among the various elements in the CSS and to maintain information about the I/O configuration, a set of control blocks is allocated in the hardware system area (HSA), storage that is accessible only to the embedded software (firmware).

22 System z – I/O Configuration Support
Each Logical Channel Subsystem has a set of subchannels.
[Diagram: a System z processor with two Logical Channel Subsystems, each with its own partitions, 63K subchannels, and channels.]

23 System Control and Partitioning
There are many ways to illustrate a mainframe’s internal structure, depending on what we wish to emphasize. Shown here are several of the functions of the internal system controls on current mainframes. The IBM mainframe can be partitioned into separate logical computing systems. System resources (memory, processors, I/O channels) can be divided or shared among many such independent logical partitions (LPARs) under the control of the LPAR hypervisor, which comes with the standard Processor Resource/Systems Manager (PR/SM) feature on all mainframes. The hypervisor is a software layer that manages multiple operating systems running in a single central processing complex; the mainframe uses a type 1 hypervisor. Each LPAR supports an independent operating system (OS) loaded by a separate initial program load (IPL) operation. For many years there was a limit of 15 LPARs in a mainframe; today’s machines can be configured with up to 60 logical partitions. Practical limitations of memory size, I/O availability, and available processing power usually limit the number of LPARs to less than these maximums. Each LPAR is considered an isolated and distinct server that supports an instance of an operating system (OS). The operating system can be any version or release supported by the hardware. In essence, a single mainframe can support the operation of several different OS environments. The system is controlled through Support Elements (SEs) and the Hardware Management Console (HMC); either can be used to configure the IOCDS.

24 Logical Partitions (LPARs) or Servers
A system programmer can assign different operating environments to each partition, with isolation.
An LPAR can be assigned a number of dedicated or shared processors.
Each LPAR can have different storage (CSTOR) assigned, depending on workload requirements.
The I/O channels (CHPIDs) are assigned either statically or dynamically, as needed by server workload.
This provides an opportunity to consolidate distributed environments to a centralized location.
For many years there was a limit of 15 LPARs in a mainframe; today’s machines can be configured with up to 60 logical partitions or servers within a single CEC. Practical limitations of memory size, I/O availability, and available processing power usually limit the number of LPARs to less than these maximums. Each LPAR is considered an isolated and distinct server that supports an instance of an operating system (OS). The operating system can be any version or release supported by the hardware. In essence, a single mainframe can support the operation of several different OS environments. System administrators assign portions of memory to each LPAR; memory, also known as central storage (CSTOR), cannot be shared among LPARs. CSTOR, referred to in past literature as main storage, provides the system with directly addressable, fast-access electronic storage of data. Both data and programs must be loaded into central storage (from input devices) before they can be processed by the CPU.

25 Characteristics of LPARs
LPARs are the equivalent of a separate mainframe for most practical purposes
Each LPAR runs its own operating system
Devices can be shared across several LPARs
Processors can be dedicated or shared
When shared, each LPAR is assigned a number of logical processors (up to the maximum number of physical processors)
Each LPAR is independent

26 Shared CPs example

27 LPAR Logical Dispatching (Hypervisor)
1 - The next logical CP to be dispatched is chosen from the logical CP ready queue based on the logical CP weight.
2 - LPAR dispatches the selected logical CP (LCP5 of the MVS LP) on a physical CP in the CPC (CP0 in the visual).
3 - The z/OS dispatchable unit running on that logical processor (MVS2 logical CP5) begins to execute on physical CP0. It executes until its time slice (generally between 12.5 and 25 milliseconds) expires, or it enters a wait, or it is intercepted for some reason.
4 - In the visual, the logical CP keeps running until it uses all its time slice. At this point the logical CP5 environment is saved and control is passed back to LPAR, which starts executing on physical CP0 again.
5 - LPAR determines why the logical CP ended execution and requeues the logical CP accordingly. If it is ready with work, it is requeued on the logical CP ready queue and step 1 begins again.
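The dispatch loop above can be sketched as a tiny simulation. This is a highly simplified model of the idea, not PR/SM internals; the queue structure, names and time values are assumptions for illustration:

```python
# Sketch of LPAR logical-CP dispatching: logical CPs wait on a ready
# queue, the highest-weight one runs on a physical CP for a time slice,
# then is requeued if it still has work.
import heapq

def dispatch(ready_queue, time_slice_ms=12.5):
    # heapq is a min-heap, so weights are stored negated to pop the
    # highest-weight logical CP first.
    weight, name, remaining_ms = heapq.heappop(ready_queue)
    ran = min(time_slice_ms, remaining_ms)   # run on the physical CP
    remaining_ms -= ran
    if remaining_ms > 0:                     # still ready with work:
        heapq.heappush(ready_queue, (weight, name, remaining_ms))
    return name, ran

q = []
heapq.heappush(q, (-70, "MVS2-LCP5", 30.0))  # weight 70, 30 ms of work
heapq.heappush(q, (-30, "LNX1-LCP0", 10.0))  # weight 30
print(dispatch(q))  # ('MVS2-LCP5', 12.5) -- highest weight runs first
```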

28 LPAR Summary System administrators assign: Memory Processors
CHPIDs either dedicated or shared
This is done partly in the IOCDS and partly in a system profile in the CEC. Changing the system profile and IOCDS will usually require a power-on reset (POR), but some changes are dynamic.

29 Processor units or engines
Today’s mainframe can characterize workloads using different license engine types:
General Central Processor (CP) - Used to run standard application and system workloads
System Assist Processor (SAP) - Used to schedule I/O operations
Integrated Facility for Linux (IFL) - A processor used exclusively by a Linux LPAR under z/VM
z/OS Application Assist Processor (zAAP) - Provides for Java and XML workload offload
z/OS Integrated Information Processor (zIIP) - Used to optimize certain database workload functions and XML processing
Integrated Coupling Facility (ICF) - Used exclusively by the Coupling Facility Control Code (CFCC), providing resource and data sharing
Spares - Used to take over processing functions in the event of an engine failure
Multi Chip Module (MCM)
Note: Channels are RISC microprocessors and are assigned depending on I/O configuration requirements.

30 Capacity on Demand Various forms of Capacity on Demand exist
Additional processing power to meet unexpected growth or sudden demand peaks CBU – Capacity Back Up CUoD – On/Off Capacity Upgrade on Demand SubCapacity Licensing Charges LPAR CPU Management (IRD)

31 Disk Devices Current mainframes use 3390 disk devices
The original configuration was simple with a controller connected to the processor and strings of devices attached to the back end

32 Current 3390 Implementation

33 Modern 3390 devices The DS Enterprise Storage Server just shown is very sophisticated. It emulates a large number of control units and 3390 disks. It can also be partitioned, and can connect to UNIX and other systems as SCSI devices. It provides terabytes of disk space, up to 32 channel interfaces, gigabytes of cache, and 284 MB of non-volatile memory.

34 Modern 3390 Devices The physical disks are commodity SCSI- type units
Many configurations are possible but usually it is RAID-5 arrays with hot spares Almost every part has a fallback or spare and the control units are emulated by 4 RISC processors in two complexes.

35 Modern 3390 Devices The 2105 offers FlashCopy, Extended Remote Copy, Concurrent Copy, Parallel Access Volumes, and Multiple Allegiance. This is a huge extension of the original 3390 architecture and offers a massive performance boost. To the z/OS operating system these disks still appear as traditional 3390 devices, thus maintaining backward compatibility.

36 EBCDIC The IBM S/360 through to the latest zSeries machines use the Extended Binary Coded Decimal Interchange character set for most purposes This was developed before ASCII and is also an 8 bit character set z/OS Web Server stores ASCII data as most browsers run on PCs which expect ASCII data UNICODE is used for JAVA on the latest machines

37 Clustering Clustering has been done for many years in several forms
Basic shared DASD
CTC/GRS rings/Ethernet
Basic and Parallel sysplex
“Image” is used to describe a single z/OS system, which might be standalone or an LPAR on a large box

38 Basic shared DASD Limited capability
Reserve and Release against a whole disk limits access to that disk for the duration of the update. The capabilities of a basic shared DASD system are limited. The operating systems automatically issue RESERVE and RELEASE commands to a DASD before updating the volume table of contents (VTOC) or catalog. (The VTOC and catalog, discussed later, are structures that contain metadata for the DASD, indicating where various data sets reside.) The RESERVE command limits access to the entire DASD to the system issuing the command, and this lasts until a RELEASE command is issued. These commands work well for limited periods (such as updating metadata). Applications can also issue RESERVE/RELEASE commands to protect their data sets for the duration of the application. This is not automatically done in this environment and is seldom done in practice, because it would lock out other systems’ access to the DASD for too long. A basic shared DASD system is typically used where the operations staff controls which jobs go to which system and ensures that there is no conflict, such as both systems trying to update the same data at the same time. Despite this limitation, a basic shared DASD environment is very useful for testing, recovery, and careful load balancing.
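The whole-volume nature of RESERVE/RELEASE can be modeled in a few lines. This is a minimal sketch of the semantics only (the class and names are my own, not z/OS interfaces):

```python
# Sketch of RESERVE/RELEASE semantics: a RESERVE locks the entire volume
# for one system until its RELEASE, which is why long holds lock every
# other system out of the whole DASD.
class DasdVolume:
    def __init__(self, volser):
        self.volser = volser
        self.reserved_by = None

    def reserve(self, system):
        if self.reserved_by not in (None, system):
            return False          # another system holds the volume
        self.reserved_by = system
        return True

    def release(self, system):
        if self.reserved_by == system:
            self.reserved_by = None

vol = DasdVolume("PROD01")
assert vol.reserve("SYSA")        # SYSA updates the VTOC under RESERVE
assert not vol.reserve("SYSB")    # SYSB is locked out of the whole volume
vol.release("SYSA")
assert vol.reserve("SYSB")        # available again after RELEASE
```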

39 Next few slides introduce Sysplex and Parallel Sysplex
or Instructor can use Slides in Chapter02B for more details

40 Basic Sysplex Global Resource Sharing (GRS) used to pass information between systems via the Channel-To-Channel ring (Token ring) Request ENQueue on a dataset, update, then DEQueue Loosely coupled system A Systems Complex, commonly called a sysplex, is one or more (up to 32) systems joined into a cooperative single unit using specialized hardware and software. It uses unique messaging services to exchange status information and can share special file structures contained within coupling facility (CF) data sets. A sysplex is an instance of a computer system running on one or more physical partitions where each ‘can’ run a different release of a z/OS operating system. Sysplexes are often isolated to a single system, but Parallel Sysplex technology allows multiple mainframes to act as one. It is a clustering technology that can provide near-continuous availability.

41 Parallel Sysplex This extension of the Channel to Channel (CTC) ring uses a dedicated Coupling Facility(CF) to store ENQ data for Global Resource Serialization (GRS) This is much faster The CF can also be used to share application data such as DB2 tables Can appear as a single system Sysplexes are often isolated to a single system, but Parallel Sysplex technology allows multiple mainframes to act as one. It is a clustering technology that can provide near-continuous availability. A Parallel Sysplex is a symmetric sysplex using multisystem data-sharing technology. This is the mainframe’s clustering technology. It allows direct, concurrent read/write access to shared data from all processing servers in the configuration without impacting performance or data integrity. Each LPAR can concurrently cache shared data in the CF processor memory through hardware-assisted cluster-wide serialization and coherency controls.

42 Parallel Sysplex Attributes
Dynamically balances workload across systems with high performance
Improves availability for both planned and unplanned outages
Provides for system or application rolling maintenance
Offers scalable workload growth, both vertically and horizontally
Views multiple-system environments as a single logical resource
Uses the Server Time Protocol (STP) to sequence events between servers

43 Summary Terminology is important
The classic S/360 design is important as all later designs have enhanced it. The concepts are still relevant New processor types are now available to reduce software costs EBCDIC character set Clustering techniques and parallel sysplex

44 Extra Slides on Hardware Management and LPAR Creation

45 Hardware Management Console (HMC)

46 OS assignment (e.g. assign a profile to a Linux partition); storage assignment; allocate PUs

47

48

49

50

