
End of Study Thesis

Mainframes in IT environments

www.supinfo.com Last Name: Salsmann First Name: Flavien Campus Booster ID: 28622


Abstract
This end-of-study thesis focuses on Mainframe computer systems. Not very well known, these platforms are often the victims of prejudices whose veracity must be verified. Are they deprecated? Are they still used? Are they doomed to disappear? Will they be replaced by distributed GNU/Linux servers? What is their current state, and what will it become? To answer these questions, we will study the position of Mainframes in enterprises, as well as their legitimacy in big infrastructures. To better understand their importance, we will present several aspects of this platform, theoretical as well as technical.

First, after having defined the context of this thesis, we'll study the current position of Mainframes in the world, especially in companies. We'll try to understand what they really are and to take a fresh look at them. We'll then briefly present their evolution and study why they are still used despite the criticisms often made about their age. We'll identify the factors behind their continued existence, such as the need to capitalize on existing structures, notably in banks. We'll see that billions of dollars have been invested in them and that major consulting groups such as Gartner still believe in them. Then, we will present an overview of their qualities that are absent from distributed servers, or not efficient enough there for big infrastructures. After having studied the platform's overall strengths, its place in IT environments will be presented, as well as its market.

In a second step, we will present the technologies used in these environments, their efficiency and their legitimacy in a modern world. We will add system commands to the theoretical concepts in order to make their presentation concrete. Where possible, we will compare existing technologies on conventional systems such as Linux with Mainframe technologies, in order to see whether they are really obsolete or modern. Obviously, we will present the hardware used by these machines, as well as their proprietary IBM operating system called z/OS. As basic concepts such as the file system are very different from those customers are used to on other operating systems, we will briefly explain their specificities, advantages and defects. Several products will then be described, in order to better grasp the subtleties of this platform, such as security and workload management. Then, we'll see how it deals with needs such as disaster recovery and virtualization, thanks to technologies like Parallel Sysplex and z/VM.

Finally, we will attempt to define the future of Mainframes. To do so, we will present the role they can play in server consolidation projects. Then, we will describe why the platform could be interesting in Data Centers, notably for its TCO, in the light of newly emerging problems such as energy and floor-space costs. New applications scheduled for this platform will also be presented, such as Gameframe for online gaming, and the recent zSolaris contract signed with Sun. At last, we will project the broad market trends, and the position Mainframes could occupy in a few years. We'll try to determine whether they can be attractive to a wider range of businesses, and get out of their niche-market status.


Acknowledgement
First, I'd like to thank all the IBMers I met during my internship, especially Mr Alain Richard, Mr Frederic Rondeau, Bruno Paganel and Eric Gaillard, for their patience, advice and many courses. They spent a lot of time each day answering all my questions, teaching me many concepts, and giving me invaluable feedback from their experience in Mainframe environments. I've really been impressed by their IT culture. It's been a real pleasure to work with them. I would even say an honour.

I thank ESI SUPINFO and IBM for organizing the IBM zNextGen training. Thanks to it, I realized that I lacked many skills that could be interesting to acquire. It has changed my professional aspirations. I'd also like to thank all the people who read my thesis, shared their impressions and incomprehensions, and thus helped me to improve it.

I would like to thank my friends, who supported me during my studies at SUPINFO, and who made these years unforgettable, especially Louis Champion, Nathalie Rufin, Guillaume Sudrie, Florent Chambon, Brice Dekany, Rémi Vincent, Philippe Job, Jérôme Masse, Laurent Bodin, Mickael Desbois, Rémi Assailly, Selim Meskine, Gilles Dallemagne and Gaëtan Poupeney.

Finally, I would like to thank my parents and my sister, who helped me join SUPINFO, for all the support and love they gave me. This document would surely not have been written without them.


Table of Content
Abstract
Acknowledgement
Table of Content
Introduction
1/ Mainframe Computers: Myths and Realities
1.1 What's all this about old dinosaurs?
1.2 Who is mad enough to use it?
1.3 Why are they still running?
1.4 What is its place in IT environments?
1.5 The Mainframe market nowadays: as dead as the machine itself?
2/ Mainframe Today: Denver, the Last Dinosaur?
2.1 Impressively advanced hardware
2.2 Specialty Engines
2.3 z/OS: the IBM Operating System
2.4 A horrible user interface
2.5 z/OS file system
2.6 JCLs for batch processing
2.7 Jobs, performance, and network management
2.8 Transaction Servers and Databases: Middleware
2.9 RACF Security Server
2.10 DFSMS: Managing Data
2.11 Health Checker: auditing the system
2.12 Virtualization technologies
2.13 Solutions for high and continuous availability
3/ Mainframe in the future: Dead or Messiah?
3.1 Server Consolidation
3.2 An interesting total cost of ownership
3.3 A mature and credible platform
3.4 Emerging applications
3.5 SWOT and future market
Conclusion
References


Introduction
Nowadays, the IT market seems to be divided into two sectors, composed of either Linux or Microsoft Windows platforms. The advent of distributed servers during the 90s amplified this simple representation. However, other solutions were in use before the rise of the personal computer. Indeed, between the 60s and the 80s, most companies primarily used huge computers called Mainframes. The most popular were the IBM models, from the System/360 to the AS/400. Every modern infrastructure had a Mainframe, and it was used for most applications, such as bank transactions. As a result, most critical programs were written during that period, most of the time in the COBOL language. As these critical applications work perfectly and required heavy investments from companies, they are still running on Mainframes today. Most of them are thus still executed for historical reasons and are a must-have for many companies.

Yet, these systems are ignored or even unknown by the general public and by most IT specialists. They are often judged to be old machines doomed to disappear, and are compared to "dinosaurs" because they execute very old programs. Many people tend to say Mainframes have become totally obsolete, no longer meet modern market criteria, and will thus be irreversibly replaced by distributed servers defined as modern. However, major infrastructures continue to use them, and not only because they help capitalize on existing assets. Indeed, Mainframes are systems implemented in big companies to meet very specific needs. They propose very advanced technologies helping enterprises to better define their Business Recovery Plan, such as Parallel Sysplex for continuous and high availability, or Copy Services for high-level data replication. Therefore, they constitute an essential component of IT environments. In recent years, Mainframes have greatly evolved, notably through the System z9 range from IBM, which has impressive hardware capabilities and offers an uptime of about 99.999%. More and more IT managers find the Mainframe to be the only system able to effectively support very large workloads, such as bank transactions, and to meet their performance, security and reliability needs.

As virtualization technologies are more and more used to execute several instances of Linux systems, with solutions such as XenSource being very popular in companies, the Mainframe alternative could be seriously considered in many infrastructures. Indeed, Mainframes benefit from more than thirty years of experience in the virtualization domain, and could have a major place in server consolidation projects. They could then conquer a new market, usually reserved for x86 platforms. In the next few years, the Data Center crisis will explode, because of considerations that were not suitably taken into account, such as energy and floor-space costs due to the massive use of distributed servers. The Mainframe could be effective in solving these problems, as its TCO is not as high as prejudice suggests.


1/ Mainframe Computers: Myths and Realities


Mainframe computers are often seen as old and archaic systems. When you talk to average people about Mainframes and ask them to think about these machines, they will probably ask you if you're talking about those huge things which need so much space that a small room is not enough. Well, maybe that's a caricature... but give it a try and you'll see for yourself! If we were living in the 60s, they would probably be right. Mainframes were indeed systems hosted in huge room-sized metal boxes, needing an incredible amount of electricity, space and air-conditioning: about a thousand square meters, and up to 3,000, just to take place. But that time is over. A Mainframe is now like a big refrigerator, nothing more, taking the space of about two racks of standard x86 servers. The Mainframe evolved. That is what this thesis is all about.

Lots of people will also tell you Mainframe computers are dead, and that the few still in use will be replaced by grid computing technologies... Well... In fact, the truth is that while people's ideas about Mainframe technologies didn't evolve, the technologies themselves did. Another reality is that even though they have long been declared finished, they're still used. People don't really have a concrete idea of what a Mainframe is, even in most IT environments. It thus remains important to clarify what Mainframes are, which companies are using them, their real place in the world, and why they're still there despite violent criticism and jokes.


1.1 What's all this about old dinosaurs?


First of all, it seems important to point out that our modern world couldn't be what it is without Mainframes. Although Mainframe technologies are referred to as a legacy platform which is no longer strategic for companies, they play an essential and central role in the usual operations people perform every day. Whatever you do, if you deal with some kind of data, then you'll pass through a Mainframe for sure. Most Fortune 500 companies own one or more! Banks, finance, insurance, health care, government: every transaction of big infrastructures is treated by a Mainframe. It really is the heart of all great Data Centers. It's the only system which can handle so much data with such speed and reliability. The situation is very paradoxical: these machines are seen as old and creepy, but in reality they're the most technologically advanced. Here is a definition I found quite ironic and relevant.

An obsolete device still used by thousands of obsolete companies serving billions of obsolete customers and making huge obsolete profits for their obsolete shareholders.
The Devil's IT Dictionary
Behind all these concepts and passionate debates, what really is a Mainframe? Well, the most important point is that it's a machine which has been designed from the beginning to complete all its customers' workloads on time. It automates thousands of actions in order to reach consistent business objectives. This is the only system we expect NOT to stop, crash or fail. It requires unmatched qualities, such as security, availability and integrity. Supporting hundreds of thousands of I/O operations due to the number of simultaneous transactions, each potentially vital for its initiator, it simply has to be dependable. In people's minds, a machine crash is a normal thing: it can happen anytime, for any reason; you just have to reboot it, and that's it. A Mainframe executes so many critical applications that it cannot crash.

Fail, crash or slow-down is NOT an option


A Mainframe, then, is a device which can serve thousands of users at the same time without any errors. Customers who use Mainframes expect 24/7 uptime, as they can't allow a minute of downtime, since it means millions of dollars lost. As their own reputation is at stake, customers want the best machine to host their hottest applications. That's what Mainframes do. They're simply reliable machines, which other machines on the market are not, nothing less.

A Mainframe is a computer system designed to continuously run very large, mixed workloads at high levels of utilization, meeting user-defined service level objectives.
IBM

1.2 Who is mad enough to use it?


Even if Mainframe computers dominate the landscape of large-scale business, they remain obsolete in the eyes of many people and, as said before, are considered dinosaurs, while emerging technologies leap into the public eye with cool 3D effects, great look and feel... all that stuff. Then who keeps them alive? Well... To be honest, as you may have guessed, everyone uses a Mainframe at least once in his life. This is obvious. You used a Mainframe computer at one point or another. Let's take an easy example. Got an American Express or a Carte Bleue debit card? Then you've used a Mainframe to interact with your bank account. The same process applies when you use an ATM (Automated Teller Machine). In fact, the world's economy rests on Mainframe computers. So people just can't dismiss them and say they're dead; it would be totally wrong. This is, was, and continues to be the foundation of business. Just think about it: there are more transactions executed daily on Mainframes than web pages served! Most members of the Fortune 500 are running a Mainframe. Every big project involves a Mainframe in some part. Only big enterprises are aware of their existence and, more precisely, of their advantages.

1.3 Why are they still running?


Companies still use Mainframes for a number of reasons which people are not interested enough in.

Capitalize on existing IT infrastructure

You have to understand that Mainframes are the base of every big IT infrastructure. When big companies grew during the last decades, the only way to give them a proper means of dealing with their data was to use Mainframes. IBM machines were the only ones able to do it. Thus, every large batch job dealing with big services, such as general ledger and payroll processing, was running on these infrastructures. A lot of money has been spent on these applications, which are the bases of many modern structures.

Applications hosted on Mainframe systems represent an investment of about 1,500 billion dollars.
Gartner Group
The critical applications currently running in structures such as banks were written in COBOL. They've been tested, fixed, and run perfectly. Even if they're thirty years old, they work. Customers don't want to lose their past investments, and they know it will be less expensive to reuse or adapt existing applications than to rewrite them in a cool and hype new language. They don't care about that; they just want their programs to run correctly. Furthermore, it is more prudent to use something which has been validated: it reduces the risks related to new developments.


More than 200 billion lines of COBOL code are still in use, and 5 billion more are added each year.
Gartner Group
Companies also know that if they launch a huge project in a Mainframe environment, it will still work many years from now. Contrary to other platforms such as Windows or UNIX/Linux, IBM wants its platform to carry its entire legacy. It means customers are able to run their oldest applications on the latest z/OS and System z. Let's try that with another system, such as Windows, or even Linux: take an old application from Windows 3.1 and make it run on your new Windows Vista... If it works without you doing anything but trying to launch it, you're lucky. With Mainframes, customers know they have continuing compatibility; their capital is preserved. This compatibility across decades of changes and enhancements is the top priority of the Mainframe's hardware and software designers. That's why JCLs are still used: to preserve compatibility with older tasks, so that they can continue to be executed without any modification.

The ability of our system both to run software requiring new hardware instructions and to run older software requiring the original hardware instructions is extremely important for our customers. IBM
IBM Mainframes make it possible to reuse all the applications customers have invested in. That's a huge point, because big enterprises really live off their old, legacy applications.

Huge Workloads

Mainframes benefit from comfortable, huge hardware to process very significant workloads. A single system can scale up to process a billion transactions per day, and up to 13 billion for a clustered System z9, which represents more than a week's worth of transactions on the New York Stock Exchange! Mainframes support different kinds of workloads, which can be divided into two categories: basic batch processing (often old applications running at night, computing statistics and long jobs), and online transaction processing, the most used during the day. Batch jobs don't need any user interaction. They're often planned to be executed nightly, when all the machine's power is available. They have the advantage of being able to process huge amounts of data, terabytes at a time, to create valuable statistics; banks use them to produce important reports about their customers. You can see them as cron jobs defined in a UNIX crontab, but with advantages often lacking in distributed server environments, such as a huge available processor capacity and significant data storage to deal with. These jobs do not need an immediate response, but they have to complete within what we call a batch window, which is the maximum period during which they can run. A minimal sketch of such a job is shown below.
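To make the comparison concrete, here is a minimal sketch of the kind of JCL job presented in chapter 2.6, as a bank might schedule it inside its nightly batch window. All job, program and dataset names are hypothetical.

//NIGHTRPT JOB (ACCT123),'NIGHTLY STATS',CLASS=A,MSGCLASS=X
//* Nightly batch job: reads the customer master file and
//* writes a daily statistics report, unattended, inside
//* the batch window.
//STEP1    EXEC PGM=STATRPT
//CUSTIN   DD DSN=PROD.CUSTOMER.MASTER,DISP=SHR
//REPORT   DD DSN=PROD.STATS.DAILY,
//            DISP=(NEW,CATLG,DELETE),
//            SPACE=(CYL,(50,10)),UNIT=SYSDA
//SYSPRINT DD SYSOUT=*

The closest UNIX equivalent would be a crontab entry such as 0 2 * * * /prod/bin/statrpt; the difference lies less in the syntax than in the processor capacity, data volume and recovery guarantees standing behind the job.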


Online processing occurs interactively with users. Unlike batch processing, it has to be executed very fast, and response time is the most important thing along with, of course, data integrity. As these transactions often depend on the enterprise's core functions, each of them is critical and has to be treated with attention. When you withdraw money from an ATM, you want it to be fast. Every user running the same transaction at the same time wants the same thing. These transactions therefore have to be treated in fractions of a second. Immediate response is needed, which supposes high performance, integrity, and data protection. Many industries use Mainframes to be as fast as possible: banks with ATMs, travel companies with online ticket checking and reservation, governments for tax processing, etc. If customers use a distributed server infrastructure, the time needed to meet these needs, especially integrity, will be much greater. Indeed, even if distributed servers can effectively process the job, their I/O capacity cannot be compared with a Mainframe's. As the whole system runs on the same hardware, data checking and processing is far faster.

Mainframe systems also use advanced hardware and software technologies to improve the processing of huge workloads. IBM designed its machines as balanced systems: server components are balanced for processor, memory and I/O scalability. The machine is thus able to deal with the large quantities of data needed to support transactions. In the operating system, a manager called WLM (Workload Manager) allocates resources when and where needed, offering dynamic resource prioritization. WLM decides the level of resources to be applied to meet a particular service goal, in a particular part of the system for example. For instance, a policy might demand that 90% of ATM transactions complete in under half a second at the highest importance, while nightly statistics jobs get a discretionary goal; WLM continuously monitors the system and readapts processing to keep such goals met, so systems can run at 100% utilization. For really big infrastructures, EWLM (Enterprise WLM) lets you automatically monitor distributed, heterogeneous or homogeneous workloads across an IT infrastructure to better achieve defined business goals for end-user services.

Please note that as Linux can be executed in a Mainframe environment thanks to z/VM, workloads can be balanced and allocated as if you were on a distributed server infrastructure, for some kinds of needs (Apache web servers responding to large numbers of HTTP requests, for example). It also benefits from all the features of the System z partitioning system, such as HiperSockets for data exchange between the virtual operating systems: data flow then operates at memory speed. In other words, in this situation, a Mainframe can be an improved x86 cluster.

Reliability, Availability and Serviceability

These three concepts are also known as RAS. RAS is one of the most important things when you talk about a system or an infrastructure, as it covers many aspects of a computer and its applications, revealing their capacity to be in service at all times. In fact, we can characterize a system in seconds knowing its RAS level: the higher an infrastructure's RAS level, the more it may be trusted. We can then talk about a 24/7 service, which means no downtime is accepted, and we expect IT infrastructures with RAS characteristics to have full uptime. These features help a system stay fully operational for a very long period (months and even years for Mainframes) without reboot or crash.


The IBM Mainframe platform retains industry-leading availability characteristics even for single-system instances. For example, standard service availability commitments from tier one service providers in commercial Data Center outsourcing agreements suggest that the Mainframe is delivering 99.9% to 99.999% scheduled availability versus 99.5% for distributed server platforms in non-clustered configurations.
Forrester
It seems important to define each term of RAS precisely. As you may notice, these are hardware and software attributes which may also be found in distributed environments, but which are truly prized by Mainframe users. Here is the definition of each characteristic:

- Reliability: the ability to avoid faults and, when they are found, to fix them very quickly.
- Availability: deals with uptime, i.e. the amount of time a machine is running and fully operational, even if a problem occurs. For example, a system with continuous availability would stop a process causing a problem and carry on, without having to relaunch the other services after the failure.
- Serviceability: the ability of the system to diagnose itself. It can then detect faults before they happen and fix them. This avoids significant human intervention and downtime caused by maintenance.

RAS works as if each of its parts were a kind of layer, used by both hardware and software.


To illustrate the concept, let's take a very simple example: a CPU fails.

Customers should know that Mainframe technologies are very advanced in supporting all these features. For example, there is no SPOF (Single Point of Failure) in a Mainframe; every hardware component is redundant: CPs, memory, I/O channels, etc. You can even change hardware without having to stop the system; it has been designed to handle this kind of operation. Error detection is also used at all times: each instruction sent to a CP is mirrored and then double-checked. If the comparison does not give the same results, the CP is marked as unreliable and a spare is used to execute its workload. It's a fantastic feature to assure the integrity of every data processing operation. Other technologies are used to ensure data safety, integrity and backup, such as RAID (Redundant Array of Independent Drives) and cyclic redundancy check checksums. Last but not least, very modern technologies such as Parallel Sysplex enable scalable runtime execution with extremely high availability and reliability for companies. Thanks to these, Mainframes can run at about 99.999% uptime, with average unplanned downtime of under 5.3 minutes per year. They can also play a major part in disaster recovery solutions, as presented below.
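As a quick sanity check on that last figure: 99.999% availability leaves 0.001% of the year as unplanned downtime, i.e. 0.00001 x 365 x 24 x 60 ≈ 5.3 minutes per year, which is exactly the value quoted above. By comparison, the 99.5% figure Forrester quotes for non-clustered distributed servers corresponds to roughly 44 hours of downtime per year.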


Disaster Recovery

What would happen if a bank's production Data Center were the victim of a natural disaster? Can it say to its customers: "well, we're sorry, but we've lost all your data"? No, it can't; such structures must be able to fully recover from a disaster, even a catastrophic one. We can list a number of possible disasters, and they are not that rare, the most recent and famous being the terrorist attacks of September 11, the Indonesian tsunami, floods in Western Europe, fires in Greece, etc. There really is an upsurge in disasters, and companies shouldn't neglect their potential effects...

43% of American enterprises immediately file for bankruptcy after a disaster, and 29% more within about three years.
40% of American enterprises disappear within seventy-two hours of losing their IT and telecom equipment.
93% of enterprises which lost a significant part of their data have to stop their activities within five years.
U.S. Bureau of Labor, U.S. National Fire Protection Agency and Richmond House Group
To protect themselves, enterprises should have a BCP (Business Continuity Plan), a logical plan describing a practiced and validated methodology. It helps them to fully recover from disasters and to restore their critical functions partially or, even better, completely, so as to continue the business process. There are many ways to do it, but the most efficient is to have a full backup of the production Data Center: as if customers had a spare of their entire system, if you prefer. A distributed system would be very difficult to replicate exactly. Even if efficient cluster solutions exist, they remain long and complex to configure, all the more so if the number of machines is high. System configurations are one thing, but data is another. It is even more important for an enterprise, and should be replicated on another site. When you deal with terabytes, that's not so easy. Mainframe infrastructures offer advanced, tested and validated technologies which can help companies create their BCP efficiently, such as GDPS (Geographically Dispersed Parallel Sysplex) and Metro Global Mirror. They are able to manage the remote copy configuration and storage subsystems, to perform failure recovery from an SPC (Single Point of Control) and to automate operational tasks. If customers want it, they can also use XRC (eXtended Remote Copy), with a secondary backup site which can be thousands of kilometres away from the primary. This solution also allows enterprises to manage huge workloads across multiple sites. It supports both synchronous and asynchronous data replication for continuous availability, operations and data integrity. These technologies help companies to meet their RPO (Recovery Point Objective) and RTO (Recovery Time Objective). We'll describe them later... but they don't really seem obsolete, huh?
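To give a rough intuition before the detailed chapter: the RPO bounds how much committed data may be lost (a bank might tolerate only the last few seconds of transactions), while the RTO bounds how long the service may stay down (say, one hour). As an illustrative rule of thumb, synchronous replication keeps the RPO near zero but is limited to metropolitan distances, whereas an asynchronous solution such as XRC accepts an RPO of a few seconds in exchange for a virtually unlimited distance between sites.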


Security

In big infrastructures, especially in banks, security is a must-have. Systems hosting hot and sensitive data must be highly secured: customer lists and account details are among the most valuable resources of an enterprise. Access must be controlled as strictly as possible, and Mainframes use many technologies to do so, from the hardware to the applications, passing, of course, through the OS. Their legacy is impressive: they benefit from about forty years of unmatched security.

The System z9 has been built on top of the security platform that is the Mainframe. It boasts a range of updated and new security features that push system security to a whole new level. There is no doubt that the Mainframe remains the platform for secure computing. Bloor Research
LPAR systems: in Mainframe systems, every logical partition is isolated from the others, in an LPAR. If we had to draw a comparison with x86 architectures, it's like the partition concept in the Xen virtualization system. As a result, applications cannot overlay, write or even read code running on the other partitions. This doesn't mean they can't communicate with each other: if they're configured to do so, they use the HiperSockets technology, which offers a very fast (memory transfer rate) and highly secured way to communicate.

Certifications: the IBM Mainframe obtained a very high EAL (Evaluation Assurance Level) for most of its technologies. LPARs are certified EAL 5, and z/OS is EAL 4+, which is better rated than the other solutions available on distributed servers. RACF (Resource Access Control Facility), the main security system, is also EAL 4+ thanks to its LSPP (Labeled Security Protection Profile) and CAPP (Controlled Access Protection Profile) achievements. The platform also uses other technologies such as IDS (Intrusion Detection Services), a very advanced feature built into the software stack, defending the system against intrusion and detecting attacks according to special policies. It's the proof that these systems can be trusted, and it explains why they are used by government agencies.

APF system: APF (Authorized Program Facility) is a mechanism used by z/OS and MVS to explicitly specify which programs can run under the system storage protection key. In fact, some memory must only be used by the system. Its access is protected, as it contains critical data and can interact with serious parts of the OS. However, some programs need to be executed in that memory to interact directly with the system. Customers can thus select which products may do so. It prevents massive attacks or system modifications. As a library specified in APF can potentially allow users to bypass all RACF authorisations, it's very important to know exactly how many there are, who can access them, and who can update the APF libraries themselves.

Data encryption: Mainframes are designed to be secure, and they can use built-in functions to encrypt data. IBM has been one of the first enterprises to encrypt its data, with hardware cryptographic solutions such as DES (Data Encryption Standard).


It now uses services directly available via ICSF (Integrated Cryptographic Service Facility), which help customers encrypt their data on tape or other devices. Each general purpose processor in a Mainframe provides a cryptographic function called CPACF (CP Assist for Cryptographic Functions), offering a huge set of features which enhance encryption/decryption performance. It can be used for the popular SSL protocol.

Simplified security and audit: in a distributed server environment, you would have to configure each server to define your access policies. You would have to collect and aggregate all log records to have a concrete, global view of all data access. The more computers you have, the more time, energy and CP time is wasted, and the more human errors can happen. With a Mainframe, you only have to use a product such as RACF or Top Secret. As data is centralized, you can centralize your security as well. You only need to specify your policies on the current system: configure it once and for all, and that's it. It considerably reduces maintenance time and costs. Furthermore, these products can record very detailed logs, which can be analysed to measure your overall security. More secure and more simple... Could you possibly ask for more? A sketch of this centralized administration is given below.

No viruses and malware: the Mainframe architecture provides such a high level of protection and isolation that it prevents attacks. The hardware is also designed to avoid problems caused by programming errors such as buffer overflows. Note that the system will not have to be updated every month or week with patches just to be sure it is secure. It already is.
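To illustrate this centralized model ahead of chapter 2.9, here is a hedged sketch of how an administrator could protect a resource and grant a single user access to it, using standard RACF commands run in a batch TSO step (IKJEFT01 is the TSO/E terminal monitor program). The job, profile and user names are invented for the example.

//RACFDEF  JOB (ACCT123),'DEFINE ACCESS',CLASS=A,MSGCLASS=X
//* Define a FACILITY-class profile with no universal access,
//* grant READ to one user, then refresh the in-storage
//* profiles so the change takes effect.
//TSO      EXEC PGM=IKJEFT01
//SYSTSPRT DD SYSOUT=*
//SYSTSIN  DD *
  RDEFINE FACILITY PAYROLL.REPORTS UACC(NONE)
  PERMIT PAYROLL.REPORTS CLASS(FACILITY) ID(JSMITH) ACCESS(READ)
  SETROPTS RACLIST(FACILITY) REFRESH
/*

As for the APF list discussed earlier, an operator can audit it at any time from the console with the DISPLAY PROG,APF command.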

Scalability

The world changes, always, continuously, at a dramatic speed. If one thing is pretty sure about IT infrastructures, it's that they will always change. Customers should be aware of that fact. Their computers have to be ready to evolve with these infrastructures, to bring more power and more features, without an OS reinstall or new machines. This is the concept of scalability: the ability to handle growing amounts of work without having to be changed.

Scalability is the ability of a system to retain performance levels when adding processors, memory, and storage.
IBM

There are several dimensions to scalability. People often think about load scalability, the ability to accommodate higher workloads on the current system. But there's also geographic scalability, the ability to keep the same performance regardless of the system's geographical location. As a result, you must have approximately the same performance whether your machines are in the same room or spread over a wider area, such as a country, or even dispersed across the entire world.


Scalability can be assured in two ways, each having its advantages and defects.

Scale out, also called horizontal scaling: you simply add more nodes to your system. That's the way most companies follow, as they mostly use distributed server infrastructures. It means they add power by adding a new computer. For example, if their three database servers can't handle any more transactions because there are too many, they will add a fourth server to help the others. This seems to be a good solution, but over time it horribly complicates customers' infrastructures. Adding more and more machines is not efficient, because when they become obsolete, enterprises will have to renew most of them... Not the greatest way to invest money, huh?

Mario wants to be helped: scaling out!

Scale up, also called vertical scaling: you add the needed resources to a single node of your system. Most of the time, it's memory or CPs. The current system is then able to execute more processes, faster, etc. It simplifies your IT infrastructure, as it doesn't change. Mainframes are designed for that. Let's see why this method is interesting.

Mario wants to do it all by himself: the scale-up method!

In a distributed server architecture, scaling up is not that simple. Indeed, most computers must be shut down when you change their hardware, which means significant maintenance downtime, which means lost money. Furthermore, imagine you're in a big company which has planned about twenty million transactions per day during one week. Let's say you get more than planned, for example about twenty-five million: you have to react and add power, quickly! Even if you're able to do it, what will you do with the added CPs after that week? They won't be used much. You've spent money for only a few days. That wouldn't have been the case with a Mainframe.


On the System z, you can add and remove CPs on the fly; hot-plugging on these machines is very advanced. You can add power when needed, for permanent or temporary growth, with a maximum of 54 CPs! Need power? Just activate a CP with CUoD (Capacity Upgrade on Demand), which provides the capability to non-disruptively add general purpose processors, zAAPs, zIIPs, IFLs or even ICFs! Don't need the extra power anymore? Deactivate them. Could you imagine such features in a distributed server environment, with zero downtime and all the advantages it supposes? It's not possible. CUoD is the only solution allowing customers to use hardware capacity by the day, turning it on when demand rises, turning it off when it subsides, and paying only for the days it's been used. Scalability means processing power, but also I/O performance. With System z, customers can benefit from up to 512 GB of central processor storage to deal with large workloads, and up to four LCSS (Logical Channel Subsystems), each able to use up to 256 channel paths (4 x 256 = 1,024 paths in total) to support intense I/O operations. They don't have to worry anymore about their hardware becoming obsolete, as with x86.

With System z, you can dynamically increase or decrease machine capacity in a way that is transparent to active applications and users.
CA
Here is the representation of a well-known situation in banks: unplanned workloads. CUoD in action!

You might say that scaling out is also important and should be available too, and you'd be right. That's why z/VM is here, and it's very interesting for scalability. It helps customers accommodate growing workloads of varying performance characteristics in order to meet their business service needs.


The Mainframe offers the broadest performance range of any universal server. CA
With z/VM, customers can add z/Linux images on the fly to deal with additional new workloads and to offer fault tolerance (a rough sketch of such an image definition closes this section). It's as if you had a Data Center in a box, whose overall power can be changed, and where many problems, such as the network connections between servers, can be forgotten.

Migration Costs

Last but not least, even if big infrastructures wanted to migrate to a distributed server model, they couldn't do it. Most customers who tried quickly stopped these kinds of projects, mostly migrations from MVS to UNIX or even Linux operating systems. Rewriting programs, rebuilding them, and buying a whole new server farm represents too much cost. It's not interesting for them, and these kinds of projects, when they succeed at all, take years to be accomplished. Enterprises cannot take a chance on migration, as its repercussions are not certain.
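As announced above, here is a hedged sketch of the kind of z/VM directory entry behind adding a Linux image; the user ID, password, volume label and device numbers are all invented, and the real statements are presented in chapter 2.12.

USER LINUX07 SECRET 512M 2G G
* A new Linux guest: 512 MB of memory (expandable to 2 GB),
* one minidisk carved out of volume LNXU01, and a virtual
* network card coupled to an internal LAN.
 IPL 200
 MDISK 200 3390 0001 3338 LNXU01 MR
 NICDEF 600 TYPE QDIO LAN SYSTEM HIPER01

Once the entry is in the directory, an operator can bring the guest up without touching any physical hardware, typically with the CP command XAUTOLOG LINUX07.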

1.4 What is its place in IT environments?


The Mainframe is the heart of every big infrastructure. It hosts very critical applications, such as transaction engines, databases, etc. It's used as a reliable base for everything and offers many hot services needed by the enterprise's core functions. Then come systems such as UNIX and Linux as distributed servers, and finally desktops running operating systems such as Microsoft Windows XP.

Diagram summarizing the three IT infrastructure pillars: from huge servers to desktops

Our customers devote between 60 and 80% of their ICT budget to maintaining their Mainframe and its applications.
Gartner Group

If you've always worked in average companies, you may not understand this place. You might say it's wrong, that systems such as Red Hat work very well, as does Windows Server 2003 if it's well administered... and you're right; most companies don't need a Mainframe. You're also right that other operating systems, and hardware such as the newest BladeCenters, can be reliable. They can be enough for average companies, that's true. But not for the ones needing everything we talked about. You have to keep in mind that this thesis deals with extremely big infrastructures, or with enterprises which need a system having all the advantages we described. They can't rely on a system which needs to be rebooted to apply a patch, which doesn't have serious support able to solve a problem in minutes if it's really critical, etc. They also need a system able to run their old applications, their payroll, all that stuff. They can do nothing but use Mainframe computers again and again.

Mainframe computers are not only machines used for their hardware, for technologies enabling great Disaster Recovery Plans, or for their legacy side. They're also used in very modern projects, in particular server consolidation, a concept very much in vogue nowadays. We will talk about it in a few chapters, explaining why the Mainframe is so interesting today, for its ability to run hundreds of Linux instances at the same time on a single machine, for example. An incredible amount of money can be saved with Mainframes such as the IBM System z; that's why companies invest so much money in them. Here are the results of a study which asked big companies about their Mainframe strategy...

The results are enough to show that the Mainframe is still a strategic platform in which companies invest, and also that it's seen as a system which has its place in the future, as investments are growing.


1.5 The Mainframe market nowadays: as dead as the machine itself?


The Mainframe market was really important between the 60s and the 80s. That's quite understandable; it was the only way to have an IT infrastructure. Then came distributed server solutions, running Linux and Windows Server. A number of specialists said things like "it's over, the Mainframe is gonna die" or "this is the time of modern machines and operating systems". Well, maybe Mainframes were and continue to be seen as dinosaurs, but the fact is they're still there, and the market is surprisingly good. The market even grew by about 8% last year.

Mainframe hardware sales in the fourth quarter of 2006 were the largest that IBM has seen since the fourth quarter of 1998!
Bob Hoey, worldwide vice president of IBM zSeries sales
Sales are successful thanks to the new specialty engines, such as zAAP, zIIP and IFL, which we'll talk about in a few chapters. The new policy targeting enterprises which want to consolidate their servers onto a single robust machine is also very good at seducing new potential customers. As VMware and Xen become very popular, IBM wants to take back the virtualization market, and it can do it, because its systems have used this technology since the 70s with the S/370, tested and approved for years.

The Mainframe is the best solution to virtualize Linux servers. Nowadays, on a VMware machine, customers typically consolidate about twelve servers. With z/VM 5.3, it's about a hundred.
Ray Jones, worldwide vice president of IBM System z software
IBM also earns a lot of money from its installed MIPS, a very original way to invoice customers, absent from distributed server infrastructures. It's based on an on-demand model: customers pay only according to the power they actually use. This model is very effective.


Sun Microsystems, whose Sun Enterprise 10000 wasn't really successful, has found a partner in Fujitsu, which creates the processors for the new Sun Fire 15K and E25K. Sun agrees with IBM and thinks that Mainframes will still be used and will have a second life in consolidation projects.

Server consolidation projects on Mainframes are really important, and the DRP required by Basel II had a considerable impact on our revenues.
Jacques-Yves Pronier, Sun Microsystems marketing director
However, other historic vendors, such as Bull, no longer believe in the Mainframe. As a result, the famous GCOS 7 and 8 won't be on the market anymore, as Bull prefers to use standard x86 technologies.

We haven't used proprietary components for three years now; we use off-the-shelf Intel Xeon and Itanium processors in our new NovaScale server range.
François Bauduin, Bull sales director
As Bull now equips its machines with both Linux and Windows to set up SOA architectures, its customers are not really the same as those targeted by IBM. But it's very interesting to see that they don't follow the same path, whereas they shared the same market a few years ago.

The Mainframe market was quite disastrous at the beginning of the 21st century. But over the last few years it has grown impressively, especially the System z9 from IBM. With all the new security needs, the September 11 effect, and ever huger workloads coming, it could seriously be back. Server consolidation will surely play a major role in sales. The question is: will this market remain a niche, or will it turn its players into visionaries, leading to innovations and attracting less imposing customers than today?


2/ Mainframe Today: Denver, the Last Dinosaur?


Now that you've understood why Mainframes are still used and so important, you should be asking yourself what the technologies behind all the concepts we talked about are. You'll probably be disappointed, as some of them will really seem old and archaic, like JCL, while others will appear incredibly modern, such as Parallel Sysplex and Global Mirror.

2.1 Impressively advanced hardware


You should be aware that the IBM System z is the most advanced and fault-tolerant platform. Indeed, everything in a Mainframe is duplicated. As a result, each hardware element has a spare. For example, if a CP fails, another one will execute its workload, and this operation will be fully invisible. There are two System z9 models, the Business Class and the Enterprise Class. Mainframe software can be executed on both models; the difference is only in the hardware: CPs provided, memory, etc. As the Enterprise Class is the most interesting, we will base our study on its architecture.

IBM System z9 Hardware

First things first: Mainframes are used for their power and I/O capacity. To deal with so much data, the System z9 uses a CEC (Central Electronic Complex) cage. You can see it as a mother ship into which you can add, or from which you can remove, a "book". A System z9 can use up to four books. The books are interconnected by very high-speed internal communication links, and each has a refrigeration subsystem to cool itself.


Processor Book

A book is a piece of hardware which includes several elements.

MCM (MultiChip Module): contains the processors, also named PUs, for Processing Units. An MCM contains up to 12 (or 16 for the S54 model). However, they're not all used as CPs: some are just spares, and others serve as SAPs (System Assist Processors), dedicated I/O processors helping to improve performance and reduce the overhead of I/O processing. When customers install their Mainframe, they decide to install a specific number of books according to their needs. These books' CPs can then be activated or not, according to their capacity planning. Most customers first buy books and activate a few CPs later, when they really need the power. Why? The higher the model you buy (with more processors), the larger the discount IBM will offer. That's a good reason, huh? Better than installing hardware a few months later.

Model   Books   Min CPs   Max CPs   Standard SAPs   Standard spares
S08     1       1         8         2               2
S18     2       1         18        4               2
S28     3       1         28        6               2
S38     4       1         38        8               2
S54     4       1         54        8               2

Each processor can be specialized, as we'll see in the next chapter.

Memory: Clipper memory cards using DDR2 DRAM technology, up to 128 GB per book.
Physical memory per book   Memory card configuration
16 GB                      4 x 4 GB
32 GB                      4 x 8 GB
48 GB                      8 x 8 GB
64 GB                      8 x 8 GB
80 GB                      8 x 16 GB
96 GB                      8 x 16 GB
112 GB                     8 x 16 GB
128 GB                     8 x 16 GB

MBA (Memory Bus Adapter) fanout cards: up to 8 per book. Each can be connected to two different STIs (Self-Timed Interfaces), offering high availability.


I/O Connections
Processor books are directly connected to the I/O cages (up to three) via their MBA fanout cards. Each I/O cage can contain up to 28 I/O cards. There are four types of cards (a CHPID is a port on an I/O card):

- ESCON: up to 15 CHPIDs (16 MB/s)
- FICON: up to 4 CHPIDs (4 Gbps)
- OSA: up to 2 CHPIDs, for network connections
- Crypto Express: for data encryption, using coprocessors

With such a design, the System z9 has very high-availability I/O processing, and proposes a total system I/O bandwidth of about 172.8 GB/s (64 STIs x 2.7 GB/s)! To configure this hardware, administrators use the Support Elements, which are IBM ThinkPads: one is running, the other being its spare. They are also used to operate console commands, activate LPARs, define the network setup, schedule operations, inspect the system via an events monitor, or even IPL the machine. They are thus a very important part of the System z9, offering a nice Java-based interface.


Key Concepts
- Up to 512 GB of memory
- Up to 64 x 2.7 GB/s STIs
- Up to 336 FICON channels
- Up to 1024 ESCON channels
- Up to 54 processors, each of which can be activated remotely and temporarily if needed

As every component potentially has its own spare, this hardware offers the greatest high availability possible in an IT environment. Each element, from an entire processor book to the I/O cards, is hot-pluggable and never needs an IPL. These features offer optimum uptime, which is required as a Mainframe should never, ever stop, especially in banks. The IBM System z9 is the only system providing the ability to activate a processor on demand. It can be used in both of the following ways.

Customers can activate it permanently. To do so, one CP must be available on a processor book. If so, it will cost nothing if it's part of the contract. If not, customers will have to pay for a new processor book, and it will be far more expensive than if they had bought it with the Mainframe. One might ask why all processors are not activated from the beginning, and it would be a good question. The answer is quite simple: most software used in Mainframe environments is priced according to the number of activated processors. It's thus not interesting for customers to activate processors that are not really needed; that would cost too much.

Customers can also activate it temporarily, for one or more days. In big infrastructures such as banks, dealing with huge unplanned workloads, this can be very nice. To execute these workloads, customers activate a processor. This operation costs much more than a permanent activation, but customers don't care, as they only need it for a moment and don't want to permanently pay for more software licenses, as they would after a normal activation. This power proposed on demand is one of the greatest advantages of the IBM System z9.


2.2 Specialty Engines


zAAP
Java-based applications are more and more used in big IT infrastructures. This programming language has become very popular these last years for its reliability and its portability. Java applications run inside a JVM (Java Virtual Machine), using a JIT (Just-In-Time) compiler which converts the intermediate bytecode into machine code. They can thus be executed on many platforms, including z/OS. As many customers use Java software such as WebSphere, their general purpose CPs carry considerable workloads executing it. It would therefore be very interesting to have CPs that execute Java application code only. That's what zAAP processors propose.

zAAPs, IBM System z Application Assist Processors, are specialized and dedicated processors which provide a Java execution environment for z/OS, in order to exclusively run Java workload code.
zAAPs operate asynchronously with the other processors of the zSeries processor books to execute Java code under control of the exclusive z/OS IBM JVM. As a result, they help reduce the use of general purpose processors and make them available for other workloads. Capacity requirements thus become cheaper than they were before. zAAPs are designed to help free up general purpose CP capacity, which can then be used by more important workloads. zAAPs can help simplify and reduce server infrastructure and improve operational efficiency.

One of the most interesting things about these processors is that they never require customers to change their Java application code. All Java code executed on the JVM is directly and dynamically dispatched onto the zAAP processors. This function is entirely handled by the IBM JDK and PR/SM, which makes it completely invisible to the IT staff once configured. Please also note that z/OS XML System Services can now exploit zAAPs for eligible XML workloads. XML System Services is a new feature available since z/OS V1R8, which offers customers a system-level XML parser. This function supports either zAAPs or zIIPs in order to benefit from their advantages, such as the absence of software charges, which helps to save a lot of money!


How does it work?

In fact, the physical architecture of a zAAP is very similar to that of the other processors available on zSeries, such as IFLs, zIIPs and standard processors. Only the microcode differs, so that it executes Java code only. As a result, zAAPs can do nothing but execute Java code: they can't be used to run operating systems or to initiate an IPL (Initial Program Load), and they do not support manual operation controls. However, customers should not expect their Java performance to be improved. zAAPs offer a way to segregate Java workloads from the others, not to speed them up. They help save critical capacity on the general purpose processors. Even if the amount of general purpose processor workload saved varies with how much of the Java application code is actually executed on the zAAP, it's usually significant enough to be really interesting. It also depends on the zAAP execution mode used by the customer, sketched below. Note that zAAPs won't support Java software executed under Linux-based systems such as Red Hat Enterprise Linux, only under z/OS.
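The execution mode mentioned above is set in the IEAOPTxx member of PARMLIB. As a hedged sketch for z/OS V1R6 and later, the two options governing zAAP dispatch look like this; the values shown are illustrative, not a recommendation:

IFACROSSOVER=YES       /* zAAP-eligible work may cross over to    */
                       /* general purpose CPs when zAAPs are busy */
IFAHONORPRIORITY=YES   /* general purpose CPs honor the priority  */
                       /* of zAAP-eligible work versus CP work    */

With IFACROSSOVER=NO, Java work waits for a zAAP even if general purpose CPs are idle, which maximizes the software cost savings at the price of potential latency.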

A zAAP processor costs about $125k in the USA, less than a general purpose CP, and its maintenance price is also lower. It is thus interesting for customers running Java applications and doing significant XML parsing.


Limitations
As with every technology, zAAPs cannot be used without conditions. Customers should be aware that:
zAAPs require at least z/OS V1R6
zAAPs can only be configured within z/OS LPARs
The number of zAAPs may not exceed the number of general purpose CPs (active or inactive)
The z9 Business Class can handle a maximum of 3 zAAPs, the Enterprise Class 27
For each zAAP installed, one permanently purchased and installed general purpose CP is required

Why should customers use it? zAAPs enable customers to create a specialized and more cost-effective execution environment for z/OS Java workloads. Java applications which were once executed on general purpose CPs are dispatched to zAAPs. The fashionable XML format can also be treated by zAAPs during parsing operations, which likewise saves workload on general purpose processors. As this format is very popular and will be more and more used in big infrastructures such as banks (XML will be the new format for interbank exchanges, as defined in the SEPA project), this feature is welcome. Customers can then purchase additional processing power without affecting their current workloads. As IBM does not impose software charges on zAAPs, they help save money and decrease the TCO of the Mainframe, lowering the overall cost of Java-based applications through savings on hardware, maintenance (the zAAPs themselves) and software (MSU/MIPS used).

Who really needs it? In fact, most IT environments using Java products on z/OS could use zAAPs. However, it is not that easy to know whether they will really pay off in a given infrastructure. As their price is significant, the ROI (Return on Investment) must be worthwhile, and we must admit that the savings vary from one company to another. To project how much they could save and how their workloads would be dispatched, companies can use the zAAP Projection Tool for Java 2 Technology Edition, which reports what percentage of CP time is spent executing Java code, and how a zAAP would have absorbed the Java workload on a given system. It is therefore useful to predict how many zAAPs are necessary, and whether they would save money and improve the System z infrastructure.


Here is an example of the projection we can do in order to define the number of zAAPs we should use.
First, we have to use RMF reports to know how busy the CPs are.

In this example, the general purpose CPs are used at about 49%, and zAAPs would be used at about 30% if they were installed; in that case, this load would appear under the AAP parameter. With these values, we can study workloads which run for an extended period of time, such as an entire day, in a 24x7 environment. Let's use a case known at IBM, with a machine using ten processors.

All day long, Java applications are used, consuming an average of about 5 CPs. zAAP processors would clearly be an advantage here and would help save money. A first analysis would come to this conclusion: let's use 5 zAAPs and 5 general purpose CPs. This looks like a good solution but is, in fact, terribly wrong...


Let's have another look at the current situation.

Well, it seems quite different now. During the night, batches use about 8 CPs. The difference between the night and day workload types appears much more clearly on that chart, and the first solution now looks incredibly mediocre. Indeed, remember that zAAPs execute nothing but Java code; this is what most IT staffs forget when doing their capacity planning. With only 5 general purpose CPs, the night batches would be too slow and would never finish on time. You therefore simply HAVE to use a minimum of 8 general purpose CPs to match the night's power needs. Two more zAAPs will be used to handle the Java workload.

With that solution, the general purpose CPs remain available to support the normal z/OS work, as well as the part of the Java workload which exceeds the capacity of the two zAAPs.
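To make the trade-off explicit, here is a back-of-envelope check using only the averages assumed in this example (ten engines in total, Java at about 5 CPs by day, non-Java batches at about 8 CPs by night):

Naive sizing:   5 general purpose CPs + 5 zAAPs
  Night: ~8 CPs of non-Java batch work, but only 5 CPs able to run it -> batches late
Correct sizing: 8 general purpose CPs + 2 zAAPs
  Night: ~8 CPs of batch work on 8 general purpose CPs -> batches finish on time
  Day:   ~5 CPs of Java -> 2 CPs absorbed by the zAAPs, and the remaining ~3 CPs
         of Java overflow run on the general purpose CPs alongside the normal work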


zIIP

Since their beginnings, the most important asset of big infrastructures such as banks, along with security, has been their customer records. Without a structured collection of these records, they could not provide financial services, correct follow-ups, etc. In fact, everything in our world revolves around collections of data of all kinds. That is why databases are among the most used applications: enterprises simply have to use them. They can be executed on every platform, including, of course, z/OS. As many customers use databases such as DB2, their general purpose CPs handle considerable workloads to run them and to execute every SQL transaction; in a bank, the number of transactions can exceed one thousand per second. It would therefore be very interesting to have CPs which only care about DB2-eligible workloads. That is what zIIP processors are able to do. And that is not all! Since April 2007, zIIPs can also deal with network encryption workloads, such as the IPSec processing performed by the z/OS Communications Server. They are now doubly interesting!

zIIPs, for IBM System z Integrated Information Processors, are specialized, dedicated processors which run DB2 and network encryption eligible workloads.
zIIPs operate asynchronously with the other processors of the zSeries's processor books to execute DB2 workloads under control of z/OS (V1R6 to V1R9). As a result, they provide an economical environment for DB2 workload redirection, reduce the use of general purpose processors and make them available for other workloads; capacity requirements become cheaper than they were before. They are designed to free up general purpose CP capacity for more important workloads, and can thus help simplify and reduce the server infrastructure and improve operational efficiency. One of the most interesting things about these processors is that, as with zAAPs, they never require customers to change their DB2 installation. Every eligible DB2 workload is dynamically dispatched to a zIIP processor; this function is entirely handled by z/OS, which makes it completely invisible to the IT staff once the zIIP is configured. Concerning IPSec and other network encryption, the work is not entirely executed on the zIIP processors. As a result, if you used between 6 and 10 percent of a general purpose CP to perform IPSec operations, you will probably still use between 5 and 6 percent of it. The savings may not seem that big, but remember that IBM does not impose software charges on its specialty CPs, such as zIIPs! Please also note that z/OS XML System Services can also exploit zIIPs for eligible XML workloads. XML System Services, a feature available since z/OS V1R8, offers customers a system-level XML parser. This function supports either zIIP or zAAP in order to benefit from their advantages, such as the absence of software charges, which helps to save a lot of money!


How does it work? As with zAAPs, the physical architecture of a zIIP is very similar to that of the other processors available on zSeries; only its microcode differs, restricting it to DB2, network encryption and XML parsing workloads. As a result, zIIPs will do nothing but these specialized tasks: they cannot be used to run operating systems or to initiate an IPL (Initial Program Load), and they do not support manual operation controls. However, customers should not expect their DB2 performance to improve: zIIPs offer a way to segregate DB2 workloads from the others, not to speed them up. They help relieve critical capacity demands on general purpose processors. Even if the amount of general purpose processor workload saved varies with the DB2-eligible workload (the part running in SRB mode) actually executed on the zIIP, it is usually significant enough to be really interesting. It also depends on the zIIP execution mode used. Eligible DB2 workloads executed in SRB mode which can be dispatched to a zIIP also include applications (running on z/OS or Linux for System z) accessing a DB2 database hosted on zSeries. A zIIP processor costs about $125k in the USA, less than a general purpose CP, and its maintenance price is also lower. It is thus interesting for customers using DB2, network encryption such as IPSec, and significant XML parsing.

Limitations
As with zAAP processors, zIIPs cannot be used without conditions. Customers who want to use them should be aware that:
zIIPs can only be configured within z/OS LPARs
zIIPs require at least z/OS V1R6, with the adequate PTFs
The number of zIIPs may not exceed the number of general purpose CPs (active or inactive)
The z9 Business Class can handle a maximum of 3 zIIPs, the Enterprise Class 27
For each zIIP installed, one permanently purchased and installed general purpose CP is required; one zIIP, one zAAP and one IFL per general purpose CP is allowed

Why should customers use it? zIIPs enable customers to create a specialized and more cost-effective execution environment for DB2 and network encryption workloads. Transactions which were once executed on general purpose CPs are dispatched to zIIPs. As said before, zIIPs can also treat the XML format during parsing, with the same effectiveness as zAAPs. These dispatches save workload on general purpose processors. Customers can then purchase additional processing power without affecting their current workloads, which helps them improve their resource optimization. As IBM does not impose software charges on zIIPs, they may help lower costs and decrease the TCO of the Mainframe, lowering the overall cost of applications using DB2 access and network encryption through this new hardware.


Other Engines
There are three other engines available on the IBM System z9. As with zAAP and zIIP, their main advantage is to relieve critical capacity demands on general purpose processors, without any effect on software licenses based on the number of activated processors and the daily MIPS used.

SAP
The SAP (System Assist Processor) is a mandatory processor of the System z9, dedicated to I/O handling. It helps improve efficiency and reduce average I/O processing time for every operating system running on the machine, from z/OS to zLinux. Customers can add one or more SAPs to improve their I/O workloads.

IFL
IFLs (Integrated Facility for Linux) are processors dedicated to Linux workloads. With a very attractive price of about $95k, IFLs enable customers to purchase additional processing capacity exclusively for their Linux partitions. As an IFL does not deal with the other usual workloads, it does not raise any software charges. It is supported both by LPAR zLinux partitions and z/VM zLinux partitions. Also note that Linux systems can use both HiperSockets technology, to communicate with the other operating systems running on the same System z, and IFLs, to execute their workloads. This thesis will come back to these processors in the next chapters, in particular regarding server consolidation projects.

ICF
ICFs (Internal Coupling Facility) are processors dedicated to Coupling Facility workloads. A Coupling Facility is a major component of Parallel Sysplex, a high availability technology allowing several LPARs running z/OS to share, cache, update and balance data access. An ICF is not a prerequisite to use the Coupling Facility and Parallel Sysplex, but it allows Internal Coupling (IC) links, which help eliminate the need for external CF links. These are complex concepts, which will be treated in the next chapters.

Key Concepts
Specialty engines allow customers to lower their cost of ownership, as they decrease the amount of specific workloads treated on general purpose processors. Furthermore, their price is really attractive: from about $95k to $125k. They complement each other: IFLs run Linux workloads, zIIPs DB2 workloads and zAAPs Java workloads. Customers should also remember that these CPs cannot deal with the usual workloads, and will not be interesting in every infrastructure; it depends on the software used.


2.3 z/OS: the IBM Operating System


z/OS is the operating system IBM provides for its Mainframes. As it benefits from more than forty years of innovations, its design offers a very stable and secure environment, with an average up-time of 99.999%. This availability percentage means that z/OS tolerates only 5.26 minutes of down-time per year, not more. z/OS is said to stand for "Zero down-time OS" and claims to be the most reliable system.

z/OS is designed for the z/Architecture and thus natively runs in 64-bit mode. However, it can still execute old MVS (Multiple Virtual Storage) or OS/390 applications, whether they use 24-bit or 31-bit addressing. This helps customers capitalize on their existing applications while using 64-bit addressing for the new ones. As it has included a built-in UNIX system called OMVS since MVS, z/OS is fully POSIX compliant. It also supports both the TCP/IP and SNA network protocols, and thus offers high compatibility with old and new applications, whatever protocol they use. z/OS packages over 70 functions, such as z/OS XML System Services, which enhances zAAP and zIIP exploitation during XML parsing, and z/OS Communications Server, which provides network encryption such as IPSec; it also supports many popular languages such as Java, PHP, Perl, etc.

Key Characteristics
Currently at V1R9
Implements a built-in UNIX with OMVS
Has been designed for security from day one
Controls a large number of users and terminals (several thousands)
Manages a large number of jobs (multiprocessing, multiprocessors)
Manages workloads with automatic balancing (based on task priorities)
Support is planned for up to 4 TB of real memory on a single z/OS image
Manages a heavy I/O load and many connections (providing backup and restore capabilities)
Offers system- and software-level recovery (preserving integrity and restart, easing maintenance)


2.4 A horrible user interface


Here is the worst part of the IBM Mainframe system: the user interface. Most users might think they are missing something the first time they log on to a System z9. Indeed, the interface seems archaic, and people will surely think: "What is that thing?! It looks thirty years old... Give me my MS Windows back please, I wanna click!" Well, it seems so old because it IS that old! The first operating systems on Mainframes were designed for batch processing jobs, able to run with little or no human interaction. Administrators increasingly needed to interact with the system, that is, to issue commands and get a fast, direct answer from it. And then the user interface came.

Time Sharing Option


The first one proposed was TSO (Time Sharing Option), an optional feature on OS/360 which became standard with MVS, on the S/370 family. TSO is like a UNIX shell, without auto-completion. It allows users to directly access some MVS functions, such as logon, data set allocation, etc. Thanks to it, a JCL (Job Control Language) job is not necessary for each simple request. TSO also allows several users to connect to the system at the same time, each dealing with his own interface, as if each user had his own MVS. It is a line-by-line interpreter; as a result, it does nothing but wait for user commands and display responses. This interface is not much used directly nowadays. However, here are some important commands:

Command     Effect
ALLOCATE    Allocate a data set, with or without attributes
COPY        Copy a data set
DELETE      Delete a data set
EDIT        Create, edit, and/or execute a data set
LIST        Display a data set
LISTCAT     Display the user's catalogued data sets
LISTDS      Display data set attributes
RENAME      Rename a data set
LOGOFF      End the terminal session
LOGON       Start a terminal session
SEND        Send a message to an operator/user
ISPF        Launch the Interactive System Productivity Facility
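As a quick illustration, here is what a short TSO session could look like; the data set names and the user ID are hypothetical:

LISTDS 'FLAVIEN.TEST.CNTL'
RENAME 'FLAVIEN.TEST.CNTL' 'FLAVIEN.PROD.CNTL'
SEND 'JCL LIBRARY RENAMED' USER(RACFMST)
LOGOFF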


Interactive System Productivity Facility


Most users prefer ISPF (Interactive System Productivity Facility) to TSO. Indeed, this interface provides access to many of the functions users need most frequently, as TSO does, but through nice menus and panels. It thus offers a more user-friendly interface and assists users: ISPF makes it easier for people with various levels of experience to interact with the system. In fact, many administrators work exclusively with ISPF, as it increases speed and productivity. It includes a text editor and a file browser allowing users to list and locate specific data sets and perform utility functions such as renaming, deleting, etc.

Nowadays, every z/OS product uses ISPF panels to offer users an easy interface, each one proposing online help through the F1 shortcut key. This interface is rudimentary, but remains the most common way to administer a z/OS. That is why this system is seen as old and archaic: this user interface is far from what is proposed on other platforms, such as Microsoft Windows or Debian GNU/Linux. There are no windows, no mouse clicks and no auto-completion. But it consumes few resources, and administrators who work under z/OS are accustomed to it. That is why it never changed.


z/OS UNIX interactive interface


As z/OS contains a built-in UNIX system called OMVS (Open MVS), it offers dedicated interactive interfaces allowing administrators to use it easily. z/OS OMVS can be managed via two different interfaces, ISHELL and the standard OMVS shell, used according to the user's experience.

ISHELL (UNIX System Services ISPF Shell): this interface relies on ISPF panels. It is intended for Mainframe users who are familiar with TSO and ISPF, who do not know UNIX commands but need to use UNIX services. Many operations can be performed through this interface, such as mounting file systems, creating files, browsing, etc.


z/OS shell: based on the UNIX System V shell. It is used by people who are familiar with UNIX systems and want to use their commands. This shell provides an environment offering most of the functions and capabilities a user would find in a standard UNIX.
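For instance, after entering the shell with the TSO OMVS command, a user can work almost as on any other UNIX; the paths below are hypothetical:

$ cd /u/flavien
$ ls -l            # list the files of the working directory
$ df -k /tmp       # display file system usage
$ exit             # leave the shell and return to TSO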

How do administrators connect to a Mainframe? One can easily understand that it is difficult to be physically connected to a Mainframe. Like every system, z/OS can be managed remotely, which requires a 3270 display device. Administrators therefore use a connection providing access to an IBM zSeries host over a TCP/IP network, through a TN3270 or TN3270E interface. It can support the Service Location Protocol, the SSL V3 and TLS 1.0 secure layers, and can also be used to connect to an IBM zSeries host through a firewall which supports NVT terminals.

They are then connected much as UNIX administrators using Telnet or SSH to work remotely, or Windows administrators using remote desktops, for example.


2.5 z/OS file system


File system concepts are not that simple in Mainframe environments. Indeed, they are not based on the technology used by distributed systems. As the z/OS file system is very special and complex, it could be the subject of a whole thesis; however, we will try to briefly describe it. Contrary to traditional systems such as UNIX or Windows, which use a unique hierarchical file system, z/OS uses several types of data sets, very different from each other in their access methods and structures, in order to be more efficient according to their use. Usual operating systems use a byte stream file system, while z/OS uses a record-oriented file system. In a byte stream file system, every file is sequential: a huge stream of bytes, split into records by a special character. This special character can be made visible with programs such as Notepad++ on Windows: if the program is configured to show it, users will see the CR/LF characters (meaning Carriage Return and Line Feed) after each line. z/OS does not only use sequential files accessed as byte streams, and it does not use a hierarchical structure. Instead, it uses a catalogued file system, referencing each data set available on the system.

Data Set
Data set is the name given to z/OS files. As they are record-oriented, administrators need to reserve space for them before being able to write data into them. This may sound archaic, but by explicitly defining the attributes of a data set's records, the administrator helps save system resources, as the system does not have to scan for CR/LF characters each time it reads data: when it opens a file, it already knows how the file is formatted. As a result, performance is very good. There are several types of data sets, but the three most used are the following:

Sequential Data Set: also called PS (Physical Sequential), this is a very simple type, which can be seen as a plain file, constituted by a sequence of one or more records.

Partitioned Data Set: also called PDS, they can be seen as folders, as they are collections of sequential data sets. A PDS is composed of two elements:
o Members, which are the PS included in the PDS, like files contained in a folder
o A directory, which lists every member available in the PDS, like a list of pointers

Partitioned data sets offer a number of great features, such as the possibility to process all PDS members as a unit, or to concatenate multiple PDS to form huge libraries. However, their use has some disadvantages, such as wasted space. Indeed, when a member is deleted, its pointer in the directory is also deleted, and there is no mechanism to reuse its space unless the PDS is compressed with a utility such as IEBCOPY. These disadvantages are not present in the evolution of the PDS called PDSE (Partitioned Data Set Extended), which automatically reuses free space for new members. PDSEs also extend other PDS limits, such as the maximum number of members.
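Here is a minimal sketch of such a compress job, written with the JCL syntax presented later in this chapter; the library name FLAVIEN.SOURCE.PDS is hypothetical, and specifying the same DD name for INDD and OUTDD makes IEBCOPY compress the library in place:

//COMPRESS JOB 1,'COMPRESS PDS',CLASS=A,MSGCLASS=W,NOTIFY=&SYSUID
//STEP1    EXEC PGM=IEBCOPY
//SYSPRINT DD SYSOUT=*
//MYPDS    DD DSNAME=FLAVIEN.SOURCE.PDS,DISP=OLD
//SYSIN    DD *
  COPY OUTDD=MYPDS,INDD=MYPDS
/*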


VSAM: Virtual Storage Access Method. This term designates both a special data set organization and the associated access method. Thanks to their structure, VSAM files greatly improve read access performance; DB2 and IMS use them, for example. This is one of the most complex data set types, containing four subtypes, each with its own characteristics:
o KSDS (Key Sequenced Data Set): probably the most used VSAM type. Each record is associated with a key value; as a result, each record can be accessed through this key index, making read access very efficient. We can see it as a tiny database.
o ESDS (Entry Sequenced Data Set): records are accessed sequentially, without any key.
o RRDS (Relative Record Data Set): allows records to be accessed by number: first record, second record, etc. It is like a numbered index.
o LDS (Linear Data Set): a recent and rarely used VSAM type, which is a byte-stream data set used like a traditional sequential file.

Other types of data sets are available, such as GDGs (Generation Data Groups), which are collections of historically related data sets arranged in chronological order. They can be seen as a feature like shadow copies in MS environments, or Time Machine from Apple... but with thirty years of experience. GDG data sets use sequentially ordered relative numbers following their name to represent their age: 0 refers to the latest generation, -1 to the next-to-latest, and so on. GDGs are often used to store statistics. For example, the data set IBM.ZSTAT(0) is the most recent generation, IBM.ZSTAT(-1) the second most recent, etc. Administrators can also use the (+1) value in their jobs to specify that they want to create a new generation. z/OS can also use byte-stream file systems such as HFS (Hierarchical File System) or zFS (zSeries File System), which are containers holding a whole UNIX directory tree that can be mounted into the z/OS UNIX environment.
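To illustrate the GDG concept, here is a sketch in which the GDG base is first defined with the IDCAMS utility, after which any job can catalog a new generation using the (+1) relative number; all names and attributes are hypothetical:

//DEFGDG   JOB 1,'DEFINE GDG',CLASS=A,MSGCLASS=W,NOTIFY=&SYSUID
//STEP1    EXEC PGM=IDCAMS
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DEFINE GENERATIONDATAGROUP -
         (NAME(IBM.ZSTAT) LIMIT(5) SCRATCH)
/*

A later job would then create and catalog a new generation with a DD statement such as this one, pushing the existing generations down to (-1), (-2), etc.:

//NEWGEN   DD DSNAME=IBM.ZSTAT(+1),DISP=(NEW,CATLG,DELETE),
//            UNIT=3390,SPACE=(TRK,(10,3))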

PDS Limitations
As in every system, files cannot be named freely; some special rules must be followed. To better understand how a data set name works, let's look at an example: POP.HOME.JAVA. This name:
Is composed of three qualifiers, also called segments: POP, HOME and JAVA, each representing a level of qualification. Each segment is limited to eight characters and must begin with an alphabetic character (A-Z) or a special one (# $ @); the other characters can be alphabetic, numeric or special. Segments are separated by periods.
Has an HLQ (High Level Qualifier): POP
Has an LLQ (Low Level Qualifier): JAVA

The most important rules to remember are:
A data set name can have a maximum of 22 segments
IBM advises using names with three level qualifiers
A data set name must not exceed 44 characters, including all segments and periods


Allocating a Data Set


As the z/OS file system is record-oriented, administrators have to allocate data sets before being able to write data into them. They must therefore define their characteristics at allocation time, in order to set the data set's internal structure, which tells the system how to read it. Here is an example of an interface allowing administrators to allocate a data set: the ISPF panel.

Several attributes are not mandatory, such as those used by DFSMS (Data Facility Storage Management System), which we will present later. The most important attributes are the following:

Volume Serial: the name of the disk or tape volume where the data set will be created.
Example: DASD01 (always a six-character name)

Device Type: the model of the disk device used. Nowadays, it is almost always the 3390 model.

Space Units: the unit used to measure the allocation: blocks, disk tracks or cylinders, or more understandable ones such as KB, MB or bytes. It defines the units used for the allocation.

Primary Quantity: the number of space units initially allocated to the data set.
Example: 10

Secondary Quantity: the number of space units added each time the data set outgrows its current allocation. It can be seen as an extension quantity; the system can take several such secondary extents (traditionally up to 15 per volume).
Example: 3 (the system will extend the data set by 3 space units at a time)

Directory Blocks: the number of directory blocks for a PDS. A non-zero value causes a partitioned data set to be created, while a zero value causes a physical sequential one to be created. The higher this value, the more members administrators can create in the PDS: the number of potential members directly depends on the directory blocks value, as the directory is the index of the PDS members.


These attributes define how the data set is allocated: on which DASD, with how much space, etc. But they do not define its internal structure. To do so, administrators use three parameters:

Record Format: records have either a fixed or a variable length in a data set. The format defines how the data set records are structured; there are five types of records:
o F (Fixed): every block and record has the same size. As a result, a physical block on disk is one logical record. This is the simplest record format.
o FB (Fixed Blocked): there can be multiple records in one block, providing good space utilization. This is the most used format, typically with a record length of 80.
o V (Variable): there is one record per block, and each record can have a different length. As the system must know how the data set is formatted before reading it, this format uses an RDW (Record Descriptor Word) of 4 bytes describing the record and its length.
o VB (Variable Blocked): uses the same RDW mechanism as the Variable record format, but here multiple records can be placed in one physical block.
o U (Undefined): blocks and records have no defined structure. It is used for executable modules, and should not be used for other applications.

Record Length: the length (number of characters) of each record. Also called LRECL, the record length is the logical record size (F and FB record formats) or the maximum allowed logical record size (V and VB record formats) of the data set. Format U records have no LRECL.

Block Size: also called BLKSIZE, it is always a multiple of the record length. It is the physical block size written on disk for the Fixed and Fixed Blocked record formats. For Variable, Variable Blocked and Undefined records, it is the maximum physical block size that can be used for the data set. The system can be configured to calculate the most efficient BLKSIZE.

Summarization of Record Format and Block concepts
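Putting these attributes together, here is a sketch of the TSO ALLOCATE command creating a fixed-blocked library; the name and sizes are hypothetical:

ALLOCATE DATASET('POP.HOME.JAVA') NEW CATALOG -
         SPACE(10,3) TRACKS DIR(5) -
         RECFM(F,B) LRECL(80) BLKSIZE(27920)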


Using Catalogs to locate Data Sets


As z/OS does not use a hierarchical system and its file system has no root concept, it needs another way to locate data. That is what the catalog provides. A catalog describes data set attributes, including their location; you can see it as a database used by the system to know where its resources are. As a result, when a data set is catalogued, it can be accessed by its name, without having to specify where it physically resides. A data set can thus be catalogued to be visible to most users, or uncatalogued to become invisible to them unless they know where it is physically stored. Users do not need a catalog to physically access their data: if they know where the data sets are and specify their name and volume (the DASD used), they can reach them, unless RACF security policies explicitly prevent it. A catalog only allows users to access their data by name, wherever the data is. Even if one catalog could be enough to reference every data set, enterprises often use several catalogs, to avoid too many accesses to a single one and thus get better performance. There are two types of catalogs:
Master Catalog: where system data sets, such as critical load modules, are referenced
User Catalog: where the other data sets are referenced, according to the enterprise's infrastructure and policies. These catalogs are themselves catalogued in the Master Catalog.
A standard z/OS often uses one master catalog for its critical data sets, such as the ones used during the IPL (Initial Program Load) procedure, and then several user catalogs, to split data access references. When a user searches for a data set, the Master Catalog is asked first whether it knows where the data set is; if not, it passes the request to the adequate user catalog. This catalog concept can be very powerful. Indeed, administrators can for example have two different master catalogs: the first one referencing load modules on a specific DASD, and the second one referencing the same load modules on another DASD, but configured differently. With such a configuration, administrators can use a different system without having to change anything but the master catalog used during the IPL. As it is all about file pointers, their use is very convenient.
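From TSO, the catalogs can be queried with the LISTCAT command seen earlier; for instance (the FLAVIEN qualifier is hypothetical):

LISTCAT LEVEL(FLAVIEN)
  Lists every catalogued data set whose high level qualifier is FLAVIEN
LISTCAT ENTRIES('SYS1.PARMLIB') VOLUME
  Shows the catalog entry of this data set, including the volume where it resides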


The z/OS file system is not easy to understand for UNIX or Windows users. Here is a brief analogy with a usual hierarchical file system, which will help to appreciate the concepts we have talked about.

To clarify these concepts, there is nothing better than examples; here is one that is easy to understand.


2.6 JCLs for batch processing


JCL (Job Control Language) is a scripting language instructing the system how to execute a program. Indeed, a JCL job is usually the description of a batch program: its parameters, its input and output resources, etc. If one had to make an analogy with distributed systems, one could say that JCL is like a sophisticated shell script, using libraries to launch several programs.

JCLs Syntax
Writing JCL is not really hard but, like every scripting language, it has a very particular syntax and rules. Administrators have to write it with great attention and rigor, and be aware that:
Every line must begin with //
Every line beginning with //* is treated as a comment
As z/OS is not case-sensitive, every character has to be uppercase
JCL instructions must stay within columns 1-71; any character in column 72 or beyond causes an error
If an instruction has to exceed column 71, its first line is ended by a comma, and it continues on the next line between columns 4 and 16

JCLs Statements
Any JCL job has at least three statements, each one having several parameters:

JOB: the first JCL instruction, providing a name for the job and its accounting information. The job name must be at most eight alphanumeric characters. Its parameters allow the user who submits the job to control its operation:
o REGION: memory resources allocated to the job
o TIME: defines the maximum total CPU usage of the job
o MSGLEVEL: defines which system messages are to be received
o CLASS: defines the input queue used by the job, and thus its priority
o NOTIFY: the user to be notified of the job's result, in particular its return code
o USER: defines the user under which the job runs, allowing it to inherit his authorities
o MSGCLASS: defines the output queue used by the job output (tape, printer, etc.)

JOB Statement example:


//JOBCOPY1 JOB 1,'IEBCOPY',CLASS=A,MSGCLASS=W,TIME=1440,
//             MSGLEVEL=(1,1),NOTIFY=&SYSUID,REGION=0M


EXEC: defines a step in the job, running a particular program. It must be the first statement after the JOB one. It identifies the program to use and how it will be run. A job can comprise up to 255 steps. Its parameters define the conditions under which the program must run:
o PARM: allows data to be passed to the program, as sub-parameters
o COND: defines the condition under which the step must run. Constructs such as IF, THEN and ELSE can also be used within an EXEC statement
o TIME: defines the maximum total CPU usage of the step

EXEC Statement example, executing IEBCOPY:


//STEP1 EXEC PGM=IEBCOPY

DD: defines the input and output resources needed by the program. Each DD card is associated with a particular EXEC statement, and thus with a particular step of the job. It is the most complex JCL statement, as it takes many parameters which define how data is accessed:
o VOL=SER: serial number of the volume used
o UNIT: type of the disk used (3380, 3390, etc.)
o LIKE: defines the data set attributes as being those of an existing data set
o DSNAME: the data set name to be used as an I/O resource
o SPACE: the allocation needed when a new data set has to be created
o DISP: the data set disposition: whether it exists, whether it has to be created, catalogued, etc.

DD statement example, defining the data set SYS1.IPLPARM to be used on the 3390 disk named SYSZ8B:

//SYSUT1   DD DSNAME=SYS1.IPLPARM,UNIT=3390,VOL=SER=SYSZ8B,
//            DISP=SHR

JCLs Example
The following example is a JCL job copying the data set SYS1.IPLPARM from the SYSZ8B DASD into a new data set, also named SYS1.IPLPARM, on the TARG00 DASD, using the IEBCOPY utility. As it reuses the previous statement examples, it should be easy to understand.

//JOBCOPY1 JOB 1,'IEBCOPY',CLASS=A,MSGCLASS=W,TIME=1440,
//             MSGLEVEL=(1,1),NOTIFY=&SYSUID,REGION=0M
//STEP1    EXEC PGM=IEBCOPY
//SYSPRINT DD SYSOUT=A
//SYSUT1   DD DSNAME=SYS1.IPLPARM,UNIT=3390,VOL=SER=SYSZ8B,
//            DISP=SHR
//SYSUT2   DD DSNAME=SYS1.IPLPARM,UNIT=3390,VOL=SER=TARG00,
//            LIKE=SYS1.IPLPARM,DISP=(NEW,KEEP)
/*


Using SDSF to check JCLs execution


Once a JCL job is submitted, administrators can check its operation via an ISPF panel called SDSF (System Display and Search Facility). This program provides a nice way to monitor every program running on the system, like the process monitor on Windows or the ps aux command on UNIX systems.

Thanks to SDSF, administrators can check in real time the I/O resources used by a specific job, its CP time, and so on. It also provides a way to read the output messages generated by jobs, helping administrators to understand why some of them crashed. SDSF allows administrators to:
Display job output
Control the job order
Issue system commands
Monitor jobs while they are running
View the whole system log and search it for any string
Control job processing (hold, release, cancel and purge jobs)

When a job has finished, administrators have to check its RC (Return Code), which indicates whether the program ended well or errors occurred. An administrator named in the NOTIFY parameter of the JOB statement does not need SDSF to learn the result; otherwise, SDSF lets him check the job output, and thus its return code. A job which finished normally has an RC of 0. Any other value means there has been a problem, such as the value 12 for critical errors. JCL is very old and still has a syntax which seems archaic nowadays, and which does not make sense to many people. However, it remains the universal way to run a program on a Mainframe, and software such as SDSF simplifies its management, like a process monitor.
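As a sketch of how return codes can drive a job, the COND parameter presented earlier lets a step run only if a previous one ended well; the step and program names below are illustrative:

//STEP1    EXEC PGM=IEBCOPY
//*          (DD statements omitted)
//STEP2    EXEC PGM=IEBCOPY,COND=(0,NE,STEP1)
//* COND=(0,NE,STEP1) means: bypass STEP2 if 0 is not equal to the return
//* code of STEP1, so STEP2 only runs when STEP1 ended with RC=0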


2.7 Jobs, performance, and network management


Several other products integrated in z/OS allow the whole system to work. The most important are the ones dealing with batch, performance and network resources: JES2, WLM and the Communications Server.

JES2
JES2 (Job Entry Subsystem 2) is a collection of programs used by z/OS to handle batch workloads. It receives every submitted job, schedules them, and deals with their input/output. It manages the job queues, which hold jobs in different states: already running, waiting for execution, waiting for their output to be generated, waiting to be purged, etc. Thus, the JCL output messages readable with SDSF are managed by JES2. It is also this program which verifies that every I/O resource is well defined, read, and eventually written.

Once the JCL is handled by JES2, it is passed to the z/OS initiator, which verifies that there are no access conflicts on data sets and makes sure the devices used are correctly allocated. It also locates the executable program used in the JCL, such as IEBCOPY for example. JES2 manages and controls every step of a job's life through the system, from its submission to its purge. This flow is quite simple to understand and is composed of five steps:
1. Input step: JES2 accepts the job and gives it a unique job identifier, which will be visible in products such as SDSF.
2. Conversion step: JES2 converts the job's JCL into a format usable by itself, but also by the z/OS initiator. During this translation, it checks for syntax errors; if any are found, the job skips the execution step and goes directly to the output one.
3. Execution step: the job is executed by the initiator and keeps running until it ends, according to its parameters, such as TIME.
4. Output step: every output stream, such as the system messages generated by the job, or output having to be written or processed locally or at a remote location, is controlled by JES2, which analyses their characteristics according to their class.
5. Purge step: once the job and its output have been processed and checked, JES2 purges it, releasing the spool space that had been assigned to it.

To summarize, JES2 handles the workload and the execution of JCL jobs. It must always be running: without it, z/OS could not work, as it could not deal with its programs and their output messages.


WLM
WLM (Workload Manager) is a z/OS component used to manage the system resources. It works in collaboration with every program, checking their performance, response time, resources used, etc. It manages the system resources, such as processors, memory and storage, to achieve program priority goals. Workload management is used to achieve business goals, also called goal achievement; these are the most important objectives. Each workload has a different importance, and thus a different weight (priority): some workloads are more critical than others. WLM deals with this concept and helps administrators define what the system MUST do. It also uses the hardware resources as efficiently as it can; this is called throughput. That is why Mainframes always run at about 90% of their capacity: they are constantly solicited. Basically, administrators set a list of policies in WLM, defining each workload's goal, such as a required response time, and its weight. In companies, these policies are based on an SLA (Service Level Agreement), which is the quality of service promised to customers and users for every application. An SLA could be, for example, that your bank promises to treat any transaction in less than two seconds. WLM is used to match the system capacities with the defined SLAs. To do so, it works in collaboration with JES2: WLM checks everything on the system (CP time consumed, I/O resources used, etc.), compares it with the goals, and then tells JES2 how to reorder the job queue and readapt the jobs' resources.

WLM manages every system resource in order to reach these goals. For example, if several batches are running and one specific job needs to be finished quickly for business reasons (critical transactions), WLM dynamically readjusts the system to give it more power. To make an analogy with distributed systems, it is as if a Windows administrator manually redefined the process priorities thousands of times per day: WLM does the same thing, but automatically, according to its policies. As a result, we could see z/OS as a motorway network, JES2 being the roads, each job a vehicle, and WLM the speed limits, each road having its own limit, which can change at any time: its role is only to check the overall performance and grant more or fewer resources.


To summarize, WLM is used to meet the required performance, achieve business goals, and get the most out of the installed hardware and software platform.

Communication Server
z/OS CS (Communications Server) is used for network communications. It is composed of a set of many programs allowing the system to use many different protocols. Communications Server deals with two major protocols:
o SNA (Systems Network Architecture), an old protocol developed by IBM and still used in many infrastructures for critical applications. It is handled by VTAM (Virtual Telecommunications Access Method), which can also support other LAN technologies such as Token Ring and SDLC (Synchronous Data Link Control).
o TCP/IP (Transmission Control Protocol/Internet Protocol), the most used communication protocol, delivered with every modern system. CS benefits from all its features, as well as from well-known commands such as PING, NETSTAT, TRACERTE, etc.

Thanks to Communications Server, administrators can also benefit from other great features. Indeed, VTAM can be configured to use APPN (Advanced Peer-to-Peer Networking) and HPR (High-Performance Routing), permitting z/OS to send SNA data through existing TCP/IP network equipment. This allows big infrastructures to use SNA over an intranet or the Internet. Communications Server is also available on other systems such as Microsoft Windows or Linux, in order to benefit from the TCP/IP and SNA functions which can be interesting there.
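From a TSO session, the usual TCP/IP diagnostic commands are available in a very familiar form; the host name below is hypothetical:

PING MVS1.EXAMPLE.COM
  Checks that the remote host is reachable
NETSTAT HOME
  Lists the IP addresses assigned to the local TCP/IP stack
TRACERTE MVS1.EXAMPLE.COM
  Displays the route taken to reach the host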


2.8 Transaction Servers and Database: Middleware


The subsystems of z/OS are part of what one calls its middleware. It is composed of several applications, such as transaction servers and databases, and of products such as MQ (Message Queue) and WebSphere.

Transaction Servers
In big infrastructures such as banks, transaction servers are very important, as they directly address business needs. Mainframe environments propose two major transaction servers: CICS and IMS TM.

"A transaction is a collection of operations on the physical and abstract application state" (Jim Gray and Andreas Reuter)
In France, most customers use CICS, which is why I will only present this transaction server. However, IMS TM (Information Management System Transaction Manager) is very close to it, and is usually used for a high number of transactions of different kinds. The notable difference is that CICS treats one transaction at a time, whereas IMS TM can deal with several transactions simultaneously. Basically, IMS is composed of various transaction server regions, which can handle different types of transactions according to their characteristics (weight, volume, etc.). It is also interesting to note that IMS can work with its own database: IMS DB. CICS (Customer Information Control System) is an online transaction processing system which controls information by providing system management, database and network communication functions. It provides an interface between application programs and the operating system. It runs as a single z/OS batch job and allows hundreds of users to interactively access several applications. Like IMS, it is a core system for about 490 of the Fortune 500 companies. It is a must-have for any financial system (ATMs, credit cards, etc.), stock trading system, insurance company, etc. Most transactions processed each day are handled by CICS: when you buy something with a credit card, for example.

CICS represents over 30 years and $1 trillion invested in applications. Its used for more than $1 trillion business transaction per day! IBM
CICS helps to ensure that every transaction is ACID, meaning:
Atomicity: all or nothing: either all related changes are done, or none
Consistency: the actions performed do not violate any integrity constraints
Isolation: transactions do not interfere with each other and are not aware of each other's presence
Durability: committed transactions must survive any failure that could occur


Every step of a transaction handled by CICS must be verified. You would not want to make an important transaction at your bank and discover it has not been properly committed, right? CICS is also available on other platforms such as Windows, Solaris or Linux, but it is then known as TXSeries. It offers nearly the same features as its z/OS version. However, it is mostly used under z/OS, because of the usual strengths of this environment: scalable, performant, secure and reliable. Furthermore, CICS is optimized for this environment, as it has been developed on MVS for years.

Database
Transaction servers often deal directly with databases. The best known in Mainframe environments are without any doubt DB2, IMS DB and Oracle for z/OS. The latter is used by some customers, but most of them use IBM DB2 UDB (Universal Database), because it is really optimized for this system. A database centralizes data used by several applications; multiple programs can then access the same data simultaneously, using SQL (Structured Query Language), and data integrity is always checked. DB2 is a very efficient relational database, and it is more interesting to use it in a z/OS environment than on a distributed system. Indeed, under z/OS it uses VSAM files; as a result, its performance is better than if it used standard byte-stream files. Its tablespaces can be up to 16 TB. One of the most interesting things about the latest version of DB2 is its XML integration. Indeed, XML documents can be stored in a CLOB column, in their native format, or in multiple columns (which is called shredding). Customers can still use SQL, as DB2 manages the XPath parsing and retrieves the data and their XML result. As this generates XML workloads as well as usual DB2 workloads, part of the processing can be offloaded to zIIP and zAAP processors, which can help customers save money (no software charges). Both SQL/XML and the XQuery language can be used with DB2, which makes it a very interesting product for banks.
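As an illustration of this XML integration, here is a sketch of a query mixing SQL and XPath as DB2 9 allows it; the table and column names are hypothetical:

-- Return the name of every French customer, extracted from the
-- XML column INFO of the hypothetical table BANK.CUSTOMER
SELECT XMLQUERY('$i/customer/name' PASSING INFO AS "i")
FROM BANK.CUSTOMER
WHERE XMLEXISTS('$i/customer[country="FR"]' PASSING INFO AS "i")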


2.9 RACF Security Server


Security is a must-have in critical systems. As z/OS hosts applications dealing with sensitive and confidential data, such as bank accounts, it needs a very reliable application to restrict all accesses. Multiple products allow customers to deal with security in a Mainframe environment; the best known are RACF, Top Secret and ACF2. As RACF is probably the most used, it seems important to present its main functionalities. RACF, for Resource Access Control Facility, is a security program which controls what users can do on a z/OS. It can also be used to define security policies on zLinux. It is a very reliable solution, as it is placed on the EPL by the NCSC at a B1 level of trust. It can thus withstand massive attack attempts.

"On a typical day, the security team logs 38,000 attempts by unauthorized individuals or automated probes to access the state's networks" (Dan Lohrmann, Michigan Chief Information Security Officer)
RACF provides user identification and verification, resource authorization and logging capabilities. RACF is not a single product: it also includes tools which simplify system administration. It can, for example, create security reports summarizing every access attempt and every failed RACF command, or help you erase a user identifier along with all its correspondences in the RACF database.

Defining users in RACF


As a software product controlling accesses, RACF must first identify the user who is trying to access a resource, and then verify that this user really is who he claims to be. To perform this operation, RACF checks a user ID and its password, which is encrypted by the system. First things first: administrators initially have to define the users accessing the system. When a user is created, his password is temporary; as a result, he will have to change it during his first logon. With that method, the user is the only one to know his password, unless he divulges it to someone. Add a user:
ADDUSER TOTO

This user, whose ID is TOTO, initially has no password, but will have to create one during his first logon. This user definition is really minimalist; here is a more complete one:
ADDUSER FLAVIEN OWNER(RACFMST) NAME(F. SALSMANN) PASSWORD(KIKOOLOL) TSO (ACCTNUM(000011) PROC(IKJACCNT)) WHEN(DAYS(WEEKDAYS) TIME(0700-1800))

This command creates a user whose ID is FLAVIEN, and whose profile is managed by the RACFMST user. He will be able to use TSO, as he has a TSO account number and a defined logon procedure. However, he will only be able to log on Monday through Friday, from 7:00 to 18:00.


Every parameter has specific syntax rules. The user ID must be at most eight characters long, must not begin with a number, and must be unique in the system. The password is also at most eight characters long. Please note that the ADDUSER command can be far more complex on a production system: there can be more than twenty parameters, each, as said before, obeying precise rules. Of course, you can change the information in a user profile, or temporarily revoke a user ID. To do so, you use the ALTUSER command.
ALTUSER TOTO PASSWORD(TATA)
ALTUSER TOTO REVOKE

The first command changes the TOTO user's password, and the second one revokes this user ID. You can also delete a profile with the DELUSER command; it will clear its correspondences in RACF as well.

Defining groups in RACF


RACF also uses the group concept, well known in UNIX or even Microsoft Active Directory environments. As in these other systems, groups simplify security administration and help avoid human errors when defining new security rules, as they handle many users automatically. Indeed, when administrators have to manually define the same security policies for hundreds of users, they can easily forget a few of them; this can be a long and tedious task. With groups, administrators can apply security models to many users, in seconds and with the same efficiency. To create groups, we use the ADDGROUP command.
ADDGROUP SALSMANN SUPGROUP(SYS1) UNIVERSAL

Here, we define a group named SALSMANN (eight characters maximum) whose superior group is SYS1. As a result, SALSMANN is a subgroup of SYS1, which is also a group. If SUPGROUP is not specified, the current group of the user issuing the command is used instead. UNIVERSAL is used for groups which will have a high number of users, potentially unlimited. As with users, you can edit group information with the ALTGROUP command, and delete groups with the DELGROUP command. Once administrators have created their users and groups, they have to link them together. To do so, they use the CONNECT command. Here is an example:
CONNECT (FLAVIEN) GROUP(SALSMANN) OWNER(RACFMST) AUTHORITY(CONNECT)

With that command, FLAVIEN is placed in the SALSMANN group, with the CONNECT authority on that group. Let's have a look at this concept of group authority:
USE: allows the user to access the resources to which the group is authorized. It is the default authority
CREATE: allows the user to create RACF data set profiles for the group (we will see that concept)
CONNECT: allows the user to connect other users to the group
JOIN: allows the user to add new subgroups or users to the group, as well as to assign group authorities to new members. It is a kind of mini-administrator, or admin delegation, if you prefer.


When administrators have finished defining users and groups and linking them together, they can review their global RACF definitions using the LISTUSER and LISTGRP commands. Please also note that administrators can define the group a user belongs to directly at creation time, using the DFLTGRP parameter. Administrators can also define the UNIX UID and GID, so that RACF users can work in the OMVS environment which is part of z/OS. To do so, they can use these commands:
ALTUSER FLAVIEN OMVS(UID(10))
ALTGROUP SALSMANN OMVS(GID(110))

Defining Data Set Profile in RACF


Defining security policies is not that easy. A security administrator must clearly collect all the rules needed by every team manager before doing anything. Each policy must have a priority, and must be tested before being approved and validated. With RACF, administrators can set permissions on file name patterns. As a result, files can be secured according to their name, and the associated permissions can be defined even before their creation. In a company which uses naming conventions, this is very useful. Also note that when RACF is installed, all data is protected by default. In Windows environments, the default security is everyone/full control; in RACF, it is everyone/none, so administrators have to define each access. It takes far longer, but it is also far more secure! Let's study some examples defining data set accesses for some users. First of all, if we want to use enhanced generic naming for data sets, we have to activate this function:
SETROPTS GENERIC(DATASET)

Now we can create our data set profiles. These allow administrators to secure accesses using the data set name: they create rules for data sets which apply to all users. Data set profiles have specific rules: the data sets covered must have at least two qualifiers, and the first one (called the high level qualifier) has to correspond to a user or a group. A data set profile contains:
A data set name pattern
An owner: by default, the data set profile creator
A UACC: the Universal Access authority, which is the default access level
Etc. (auditing information, for example)
There are two kinds of data set profiles:
Discrete: covering a unique data set which needs unique security requirements
Generic: protecting data sets with a similar naming structure, using wildcard characters
Generic data set profiles are, of course, the most used, because they are far simpler and more powerful. Furthermore, a discrete profile is directly tied to a physical volume: if you move the data set to another volume, the protection is no longer effective.


Let's create two generic data set profiles:

ADDSD (**.FB*) UACC(NONE)
ADDSD (FLSA**.FB*) UACC(UPDATE)

The first rule specifies that any data set whose second qualifier begins with FB will have a universal access of NONE. Examples: SYS1.FB89, FAC.FBP, etc. The second rule specifies that any data set whose second qualifier begins with FB AND whose first qualifier begins with FLSA will have a UACC of UPDATE. Example: FLSA00.FB98. Thus, a data set called FLSB00.FB80 falls under NONE, and FLSA00.FB80 under UPDATE. Administrators should know the generic profile rules, especially for enhanced generic naming:
% matches any single character in a data set name
* matches as follows:
o As a character at the end of a data set profile name (for example, FLSA.FB*): zero or more characters until the end of the name, zero or more qualifiers until the end of the data set name, or both
o As a qualifier in the middle of a profile name (for example, FLSA.*.FA): any one qualifier in a data set name
o As a qualifier at the end of a profile name (for example, FLSA.FB.*): one or more qualifiers until the end of the data set name
o As a character at the end of a qualifier in the middle of a profile name (for example, FLSA.OP*.DA): zero or more characters until the end of the qualifier in a data set name
You can delete a rule with the DELDSD command, and list rules with the LISTDSD command.

Giving special permissions to Users and Groups


When you need a user or a group to bypass the UACC, you have to specify it in RACF. To do so, we use the PERMIT command, as shown here:
PERMIT (**.FB*) ID(FLAVIEN) ACCESS(UPDATE)

This profile previously had a UACC of NONE. With that command, the user FLAVIEN will have UPDATE permission on all data sets matching the naming structure **.FB*. You can then delete this special access list entry whenever you want, with the command:
PERMIT (**.FB*) ID(FLAVIEN) DELETE

Flavien will no longer have special permissions once this command has been executed.


There are six different levels of permission:
- NONE: should be the default UACC for all your data sets! Does NOT allow users any access.
- EXECUTE: allows users to load and execute a library, but not to read or copy it.
- READ: allows users to read the data set; they can copy it.
- UPDATE: allows users to read from, copy from, or write to the data set.
- CONTROL: for VSAM data sets, equivalent to the VSAM CONTROL password, allowing users to perform improved control interval processing. For non-VSAM data sets, CONTROL is equivalent to UPDATE.
- ALTER: allows users to read, update, delete, rename and move the data set.

RACF permissions, from most limited to most permissive

Special authorities

There are three special authorities in RACF which allow administrators to operate this security product. As a result, they must be used with caution:
- AUDITOR: analyses logs and is aware of access violations
- OPERATIONS: allowed to bypass the UACC
- SPECIAL: creates all the rules; it is some kind of root in RACF
Usually, once security needs are met, companies don't use the SPECIAL profile anymore. It is too powerful and can potentially be a security hole. Administrators should use it to define their rules and, once it's done, not use it anymore. You should note that thanks to these key roles, delegation is possible in RACF.
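As a brief hedged sketch (the user IDs are invented), these authorities are simply attributes assigned to users:

ALTUSER AUDIT01 AUDITOR
ALTUSER OPER01 OPERATIONS
ALTUSER SECADM1 SPECIAL

Each ALTUSER command grants the named system-wide authority to an existing user; the same keywords can also be given at ADDUSER time.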


Permission priority considerations


As in every security product, different security policies can apply to a data set for a user, its group, and the others. Administrators need to know the priority of each level. RACF obeys a simple rule: the more precise a profile is, the higher its priority.

RACF will only consider the most precise profile. As a result, administrators can secure their system without having to focus on concepts such as inheritance. However, they must not forget that a user owning the OPERATIONS authority will have ALTER-level access by default.
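For instance (profile names invented, following the earlier examples), suppose both of these profiles exist:

ADDSD (**.FB*) UACC(NONE)
ADDSD (FLSA00.FB*) UACC(READ)

A data set called FLSA00.FB80 matches both, but RACF protects it with the second, more precise profile, so its UACC is READ. You can verify which profile applies with LISTDSD DATASET('FLSA00.FB80') GENERIC.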


Interaction with other products


Every product of z/OS can directly interact with RACF to comply with its security policies. To do so, they use an interface called SAF (System Authorization Facility). It provides an interface between a product, or any component requesting access to a resource in the system, and the current security product. This concept is very important, because SAF can also be used with security products other than RACF, such as Top Secret. In our example, we use RACF, so SAF and RACF will work together to determine the access permissions to a resource. We can approximately see it as a universal API.
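Concretely, products call SAF through the RACROUTE assembler macro. As a minimal hedged sketch (the entity field and work area names are invented, and a real invocation needs complete parameter lists), an authorization check for READ access to a data set could look like this:

*        ASK SAF WHETHER THE CURRENT USER MAY READ THE DATA SET
*        WHOSE NAME IS IN THE FIELD DSNAME
         RACROUTE REQUEST=AUTH,CLASS='DATASET',ENTITY=DSNAME,          X
               ATTR=READ,WORKA=SAFWA

SAF routes the request to the active security product (here RACF), which returns a code telling the caller whether the access is allowed.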

RACF is thirty years old. So what?


RACF seems old and archaic, but as with everything, and particularly in Mainframe environments, old means tested, approved and working. RACF has steadily acquired new features over the years:
- Multilevel security
- Enhanced PKI Services support
- Very advanced password recycling detection
- Support for the Network Authentication Service
- Access control lists (ACLs) for z/OS UNIX System Services
- IBM Health Checker support for RACF, helping diagnose configuration errors
- Reception of user and password updates from LDAP, and notification of user and password changes
- Many other features!
Note that this presentation is far from exhaustive and only shows basic RACF commands.


2.10 DFSMS: Managing Data


DFSMS, for Data Facility Storage Management Subsystem, is a software suite which allows customers to manage their data from creation to deletion. It provides tools able to control space allocation, the performance and availability levels needed by specific data sets, backups, etc.

Why do we have to manage data?


In Mainframe environments, when you allocate a data set or a member, you have to define the space it will use, as well as other information such as its target volume. System programmers thus waste time indicating where and how their files must be created. Furthermore, a number of programming mistakes can occur due to human error. As a result, devices can be full although they don't contain much data. Indeed, the allocation space requested for a data set is reserved space: you can reserve a huge space for a data set which will nevertheless stay very small. It's a huge loss of space; in big infrastructures, this can be measured in megabytes or even gigabytes of losses! Other errors can occur: for example, if a programmer tries to allocate a file on a volume which is completely full, it won't work, and the file should be placed on another volume.

It would be nice to use a program which could deal with these considerations, according to specific rules and templates. That's what DFSMS can do. It's designed to determine object placement and to automatically manage data set backup, movement, and space. Another advantage of DFSMS is to avoid rewriting old JCL when changing device type. Indeed, if a customer replaces his old 3380 volumes with 3390s, he would normally have to change many parameters in his JCL programs, such as UNIT (3380/3390) or SPACE, whose effect directly depends on the attached device, as it uses track and cylinder concepts. When SMS is installed, you can use the AVGREC parameter in JCL, which lets space be specified in absolute units: bytes (U), kilobytes (K), or megabytes (M). As this won't change over time, we talk about Device Independence.

With DFSMS, usual user operations are automated and optimized, avoiding errors and a number of tedious checks. It allows system administrators to define a default value for each parameter that was once required, using template classes called SMS constructs.
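As a hedged illustration of device independence (the data set name and sizes are invented), an SMS-managed allocation can be written without any reference to device geometry:

//NEWDS    DD DSN=FLSA00.FB.DATA,DISP=(NEW,CATLG),
//            SPACE=(80,(500,100)),AVGREC=K,
//            RECFM=FB,LRECL=80

With AVGREC=K, the system reserves space for 500K records of 80 bytes on average, plus a 100K-record secondary (the K multiplier is 1024), whatever the device type behind it. No UNIT or VOL parameter is needed, since the storage group chooses the volume.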


Defining SMS Constructs


DFSMS establishes classes called SMS constructs, defining how data sets will use device resources to meet user requirements and to automate operations such as allocations. SMS constructs define the parameters used as defaults when you deal with data sets. They are defined in the ISMF panel: Data Class (optional): defines allocation defaults. It supplies information on how data will be created. It thus includes space parameters, attributes, data set types, etc.

Any values explicitly specified in programs always override values specified in a data class. This prevents the system from modifying the intent of your allocation.


Storage Class: defines different levels of performance and availability services for the data sets. Thanks to these parameters, you can define the level of service needed by the data sets which will use that class. SMS will then decide where to allocate a data set in order to meet the required performance. You can also supply information such as dynamic cache management, sequential data set striping, and concurrent copy.


Management Class (optional): defines a list of backup and retention values for DASD data sets. It specifies how data will be managed after its creation. It also allows you to supply information such as expiration attributes or migration attributes.


Storage Group: represents the physical devices on which the data sets reside. There are six types of storage groups:
- Pool: contains SMS-managed DASD volumes managed as a single entity
- Dummy: contains volume serials of volumes no longer connected to the system which are treated as SMS-managed; allows existing JCL to function unchanged
- VIO: contains no volumes; allocates data sets to paging storage which simulates the activity of a DASD volume
- Object: contains optical volumes and DASD volumes used for objects
- Object Backup: contains optical volumes used for backup copies of objects
- Tape: contains SMS-managed private tape volumes

The storage group also defines whether or not automatic migration, backup, and dump are allowed within the pool.

SMS constructs are rule templates. Like every template, they have to be applied to something. Let's guess: to what could SMS constructs be applied? To data sets, of course. Data sets allocated with SMS are called SMS-managed data sets. When allocating one, we can manually specify the classes it uses. However, administrators use SMS to automate actions as much as possible: it would be quite absurd to assign SMS constructs manually, except in some precise cases. That's why we can use ACS routines to define which classes will be applied to a data set, according to its name. Remember the enhanced generic naming concept from RACF? Well, ACS routines use the same idea: applying a class template to a data set name pattern.


Assigning SMS constructs to data sets using ACS Routines


An ACS routine is a mechanism used under DFSMS to help system administrators automate storage management using data set name patterns. It's a sequence of instructions using parameters to have the system assign SMS classes and groups to data sets. There are many such parameters: although we often use the data set name as the reference to choose the SMS constructs, ACS can also match the volume serial number, the data set size, the job name, etc. It evaluates all these parameters to assign classes to data sets. There are four types of ACS routines, and an SMS configuration can use one of each. They are read by DFSMS in this order:
- Data Class routine (optional): assigns the data class, used to simplify allocations by providing default values for JCL parameters or DCB attributes, such as UNIT, SPACE, etc.
- Storage Class routine: assigns the storage class, used to deal with performance considerations
- Management Class routine (optional): only read if a storage class was assigned to the data set. Assigns the management class, used to define backup, retention, etc.
- Storage Group routine: although it is only read if a storage class was assigned to the data set, this ACS routine is required. It assigns the storage group.
ACS routines may be a bit confusing without examples; here is a template for a data class routine:
PROC 1 DATACLAS
  IF &DATACLAS = '' THEN
    DO
      IF &HLQ = 'FLSA00' && &LLQ = FB*
        THEN SET &DATACLAS = 'FLSADC'
        ELSE SET &DATACLAS = ''
    END
END

This example uses three ACS variables: DATACLAS, HLQ and LLQ. Of course, there can be many more. In this data class ACS routine, we first check whether a data class has already been manually assigned. If not, we check whether the high level qualifier equals FLSA00 and the low level qualifier begins with FB. If so, we assign the FLSADC data class (one of our SMS constructs). It thus matches FLSA00.FB80 and FLSA00.FBA, but not FLSA01.FB80 or FLSA00.FAB. This is a really basic example, which helps to better understand what can be done with ACS routines. They are extremely powerful, and allow administrators to properly define their data set allocations according to different characteristics. As ACS routines use conditions such as IF, ELSE or OTHERWISE, and numbers of variables such as LLQ, HLQ, USER or even GROUP, it's very easy to customize them. However, they often look like the one presented.
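Following the same pattern, here is a hedged sketch of a storage group routine (the class and group names FLSASC, SGPOOL1 and SGDEF are invented); remember this routine is required once a storage class has been assigned:

PROC 1 STORGRP
  IF &STORCLAS = 'FLSASC'
    THEN SET &STORGRP = 'SGPOOL1'
    ELSE SET &STORGRP = 'SGDEF'
END

Data sets given the FLSASC storage class land in the SGPOOL1 pool; everything else falls back to a default pool.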


Applying DFSMS configuration


Once the configuration is done, administrators have to apply it. DFSMS uses a special VSAM file, called the ACDS (Active Control Data Set), containing your ACS routines. You first have to create it:
//STEP     EXEC PGM=IDCAMS
//SYSUDUMP DD SYSOUT=*
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  DEFINE CLUSTER(NAME(SMS.ACDS1.ACDS) LINEAR VOL(SMSV02) -
         TRK(6 6) SHAREOPTIONS(3,3)) -
         DATA(NAME(SMS.ACDS1.ACDS.DATA) REUSE)
/*

Then, in ISMF, you specify which ACS routines the ACDS will use.

After having validated your ACDS (option 4), nothing remains to be done but to activate it (option 5).

Note that administrators can back up the ACDS into SCDS files (Source Control Data Set). This allows them to keep several potential ACDS templates, without having to link the ACS routines several times. To do so, we allocate the SCDS file (same JCL as before, except for the data set name and SHAREOPTIONS(2,3)), then we copy the currently used ACDS into it:

SETSMS SAVESCDS(SMS.SCDS1.SCDS)

That's it, the basic DFSMS configuration is finished!


Remember that DFSMS is a software suite composed of several products. We only saw a few concepts of DFSMSdfp (Data Facility Product), which also handles Copy Services functions not described here. Note that DFSMS also proposes optional features such as DFSMShsm (Hierarchical Storage Manager), DFSMSdss (Data Set Services), DFSMSrmm (Removable Media Manager) and DFSMStvs (Transactional VSAM Services), which offer several backup, recovery, migration, and management functions.


2.11 Health Checker: Auditing the system


IBM Health Checker is advanced software helping administrators to identify problems in their system before they impact the business. It checks several products' settings and compares them with the values suggested by IBM, or with those defined by administrators as mandatory. If the settings do not match, Health Checker outputs detailed messages letting administrators know about the potential problem, along with a detailed suggested solution. You can see it as a Nagios for z/OS, or more precisely as a Microsoft Operations Manager: a product using templates to check your system and providing known solutions. Note it won't change anything; it just warns you about suspicious parameters. With such a solution, z/OS management is simpler. A number of significant errors can be avoided:

- Configurations which are less than optimal
- Threshold levels approaching their upper limits
- Single points of failure in a configuration, in products such as Parallel Sysplex
- Changes in IPL parmlib members which could be disastrous once the machine is rebooted

Configuring Health Checker


This software should be installed and configured in every big infrastructure, as it helps to optimize product settings and to avoid many human errors. It can be done very easily. First, we have to allocate the HZSPDATA data set, which is used to save data required between IBM Health Checker restarts. To do so, we use this JCL:
//HZSALLCP JOB
//*
//HZSALLCP EXEC PGM=HZSAIEOF,REGION=4096K,TIME=1440
//HZSPDATA DD DSN=SYS1.HZSPDATA,DISP=(NEW,CATLG),
//            SPACE=(4096,(100,400)),UNIT=3390,VOL=SER=TRGVOL,
//            DCB=(DSORG=PS,RECFM=FB,LRECL=4096)
//SYSPRINT DD DUMMY

Then, we create a new HZSPRMxx member, or we use HZSPRM00, the default PARMLIB member for Health Checker. It includes policy statements and logger parameters. Once done, you can start it:
//HZSPROC  JOB JESLOG=SUPPRESS
//HZSPROC  PROC HZSPRM='00'
//HZSSTEP  EXEC PGM=HZSINIT,REGION=0K,TIME=NOLIMIT,
//            PARM='SET PARMLIB=&HZSPRM'
//HZSPDATA DD DSN=SYS1.HZSPDATA,DISP=OLD
//         PEND
//         EXEC HZSPROC


Launching Health Checker


To start this task, we use the command:
S HZSPROC,HZSPRM=00

It will be started using the HZSPRM00 member and will issue the HZS0103I system message. Administrators will then be able to consult every exception, and should be especially concerned about the red ones, since they represent high-severity exceptions. These can be consulted via the SDSF panel, using the CK command; the severity and the check interval are also displayed there.

Administrators can also use the HZSPRINT utility to generate a report summarizing all the system exceptions raised by Health Checker. Here is a sample exception check output, with explanations provided.
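As a hedged sketch (the check filter is illustrative; HZSPRINT also accepts specific owner/name pairs), a minimal HZSPRINT job writing the output of every check to SYSOUT could look like this:

//HZSPRINT JOB
//PRINT    EXEC PGM=HZSPRNT,PARM='CHECK(*,*)'
//SYSOUT   DD SYSOUT=*

Checks can also be listed and run on demand from the console, for example with F HZSPROC,DISPLAY,CHECKS and F HZSPROC,RUN,CHECK=(IBMRACF,*).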

Note that Health Checker also allows administrators to write their own checks, which can be very useful to standardize product configurations. It can also rely on RACF to control who may consult exceptions, define checks, etc.


Checking Health Checker Data


Exception messages can have three different severities: high, medium and low. Every product can be checked with IBM Health Checker, but the ones with the most interesting templates are IBM products such as RACF, CICS or IMS. Even if you can consult exceptions with SDSF, IBM also provides a nice solution available under Microsoft Windows: zMC (z Management Console). This program is free, but needs a JDK to run, as it's written in Java. It also needs the free edition of DB2 for Windows, and local administrator authority. On the z/OS side, Tivoli Enterprise Portal has to be configured. This interface is pleasant and allows administrators to quickly check their system.

Screenshots from NCACMG Health Checker User Experience Presentation


2.12 Virtualization technologies


Virtualization has become a widely used term in IT environments. Nowadays, it often refers to server virtualization, which consists of hosting multiple operating systems, independent of each other, on a single host machine. It offers more convenient administration, savings in space and machine costs, etc. But it can have other meanings: there is more than one kind of virtualization, and it is not only about operating systems or software, as we'll see.

Virtualization is not as modern as people often think. In fact, IBM used this process about fifty years ago! Virtualization really began in the 60s with the System/360 Model 67 Mainframe. On this machine, all hardware interfaces were virtualized through a VMM (Virtual Machine Monitor), later called the Supervisor. Finally, when the ability to run operating systems on top of others came in the 70s, it was renamed the Hypervisor.

To virtualize something means to change its form and make it appear differently. For example, virtualizing a single computer can make it appear as multiple computers. Conversely, it can also mean making many computers appear as a single one. This is often used in clustering; we then talk about Server Aggregation or Grid Computing.

Some kinds of Virtualization


Hardware Virtualization

This is an interesting and complex virtualization solution, in which entire hardware configurations are emulated. You can thus have very different virtual hardware on the same machine.

The main problem is performance: it's slow. Very slow. Indeed, every instruction must be simulated on the physical hardware. This configuration demands a very powerful machine, and is not advised. It's only used in a few situations such as development. Its advantage is that you can directly use unmodified OS: with this solution, you can run an OS built for the PowerPC architecture on an ARM processor. According to specialists, it can also be used during the development of hardware firmware: even if the real hardware is not yet available, developers can test their code on a virtual one. Definitely not a good solution for most users, but it can be convenient. The Bochs solution uses it.


Processor Virtualization

During the 60s, another kind of virtualization was used to process BCPL (Basic Combined Programming Language), a simple typeless language created by Martin Richards. The source code was first compiled into a kind of intermediate machine code called O-Code. In a second step, this code was processed by an O-Code virtual machine, which produced native code for the target machine. The same process was used for the Pascal language in the 70s, with the P-Code machine (pseudocode): Pascal was first compiled into P-Code, which was then executed on a P-Code machine generating native code. These virtual machines and this way of producing program code were really interesting and modern, as they allowed programmers to write highly portable applications and to run them anywhere a P-Code or O-Code machine was available. This way of doing portable apps is still used: the Java language based its Java Virtual Machine on the P-Code model. It allows a wide distribution of Java programs, and Java's success relies on this ability. But Sun didn't invent anything; it improved and made good use of this concept.

Instruction Set Virtualization

The most recent kind of virtualization is instruction set virtualization, also called binary translation. It is used to dynamically translate a virtual instruction set into a physical instruction set. To better understand this concept, we can have a look at the Code Morphing technology used in the Crusoe CPU by Transmeta. It allows any instruction set from any architecture to be used on a single one. For example, if a program is compiled for the x86 instruction set, it can be launched on a PowerPC: Code Morphing will translate the x86 instructions into their PPC equivalents. In fact, Crusoe uses VLIW (Very Long Instruction Word) instructions, and simply translates every instruction into that kind of instruction. This is the same concept as game system emulators such as MAME or ZSNES: it is all about instruction translation. It's also interesting to note that the DAISY (Dynamically Architected Instruction Set from Yorktown) project from IBM uses the VLIW architecture. It seems to be the future of instruction set usage, in a world in which standards and processes are more and more important.

Operating System Virtualization on x86


OS virtualization is the best-known kind of virtualization, and also the most used and interesting. It has been more and more democratised over the last years, thanks to solutions such as VMware or Virtual PC on the x86 architecture. This architecture was not conceived to run several OS at the same time; OS virtualization uses several techniques to allow customers to install multiple OS on a single machine, as if each were running on its own machine. Their administration is thus much simpler, and server costs are lower. There are many ways to virtualize operating systems; we can count four virtualization methods, each with its strengths and weaknesses.


Virtual Machine Virtualization

This is the best-known virtualization solution, as it is the easiest to use and implement. VMware and Microsoft Virtual PC use this technique, which is very simple: software runs on a host system, adding a virtualization layer. Guest OS run on this software, and don't directly interact with the main system hardware.

Every I/O instruction, for example, is executed and translated through the virtual machine. The advantage is that most OS can run under these products. However, performance is not that good, as another layer is added.

Operating System-level Virtualization

This solution allows customers to create secure and isolated virtual environments on a single physical machine. It thus allows admins to use the whole machine power, with fewer performance penalties.

This kind of virtualization works at the kernel layer: you can create a number of virtual servers which will act as isolated machines. These partitions are called VE (Virtual Environments) or VPS (Virtual Private Servers). The solution will theoretically ensure that applications won't conflict with each other, but it's not always true. Each virtual server performs and executes applications like an independent server, with its own memory, configuration files, users and applications, and each one can be rebooted independently. We can see it as an extension of the chroot procedure. Its advantage is its performance: certainly one of the best we can find among virtualization solutions, as it's based on the same hardware and executed at the kernel layer. Moreover, it can be used on standard x86 architectures, which are inexpensive; it can be very interesting for small businesses. However, as it's based on the OS kernel, it won't be able to run different operating systems (which use different kernels). You can't run a Windows server on this architecture, only other systems based on the same kernel. Moreover, it needs strong security, because a DoS (Denial of Service) can be launched against a partition from another one. OpenVZ, the Virtuozzo solutions and Linux-VServer use this technology.

Paravirtualization

This solution uses a Hypervisor (VMM) that exposes an interface quite similar to the real physical hardware. The most notable thing in paravirtualization is that all guest OS have to be modified to integrate a kind of virtualization-awareness code: they must know they are virtual operating systems. Thus, if you use it, you have to select OS that have been ported to run under a VMM.

This solution is more and more used in big infrastructures. The famous Xen server uses this technology. As said, you'll have to use custom OS, so some of your favourites won't be able to run on this infrastructure. Moreover, some paravirtualization solutions need special hardware configurations. This solution is not easy to install, but once it's done, it is quite easy to manage, and its performance is very good. However, as guest OS are modified to use the Hypervisor, they may have to be updated when the Hypervisor is. Note that VMware Workstation uses that kind of virtualization, but has a large compatibility list. Xen still remains one of the most interesting virtualization solutions on x86, as its performance is really impressive. But it's not really a surprise, as the project has been helped by IBM engineers specialized in virtualization.

"IBM is a major contributor to the Xen Project"
Dr. Ian Pratt, Xen project leader and XenSource founder

Full Virtualization

This solution is quite similar to paravirtualization. It also uses a virtual machine (called the Hypervisor) which mediates between the physical hardware and the different guest OS.

As guest OS are unmodified, they are not aware they are virtual. Thus, the Hypervisor has to protect some hardware instructions, because the hardware is not owned by a unique system but shared by many of them. Its job is to trap these instructions (often I/O instructions) and handle them. The aim is to manage the whole instruction set and the whole hardware on behalf of all the guest OS. It has proven its reliability and security for years.

Operating System Virtualization on System z9


According to x86 microcomputer market actors, virtualization concepts are based on very modern technologies. Most people say it's the future, and that IT infrastructures will benefit as never before from its advantages. Even if they're right on this point, they forget that virtualization has been present in IBM Mainframes for years. Indeed, it first appeared in the 70s, on the S/370 model family. Today, it thus profits from decades of work and improvements. As a result, the most advanced virtualization technologies run under System z.

PR/SM

PR/SM (Processor Resource/Systems Manager) is a Hypervisor provided with System z. Like every Hypervisor, it transforms physical resources into virtual resources made available to each guest OS running in logical partitions called LPARs. PR/SM enables partitions to share I/O resources, but also processors, memory and network cards. As a result, every logical partition shares the same hardware yet operates like an independent system: unlike paravirtualization, guest OS are not modified and are not aware of being virtual. This partitioning system is EAL5 certified, which means that each partition is like a server without any connection to the others, physically as well as logically, unless explicitly defined in their configuration. Administrators can define up to 60 LPARs on an IBM System z9, each one able to run z/OS, z/OS.e (a light version of z/OS, e for express), z/VM, TPF (Transaction Processing Facility), z/VSE and CFCC (Coupling Facility Control Code). Each partition has dedicated virtual resources.


With PR/SM, administrators can dynamically modify the virtual resources of each partition, adding or removing them, without having to shut down the affected LPAR. They can thus dynamically redefine all available system resources to reach optimum capacity for each partition. This system is based on weights, which express the priority of each logical partition.

PR/SM also benefits from great features such as the Intelligent Resource Director, which attributes virtual resources to guest OS according to their workloads and priorities.

z/VM

z/VM (z/Virtual Machine) is a Hypervisor emulating and distributing physical hardware resources to several machines. With it, you can create numbers of virtual machines which will be contained in one logical partition. Each virtual machine is independent and shares the physical resources with the others, without knowing of their existence. Unlike the LPAR system, administrators can define an unlimited number of guest operating systems: it only depends on the available hardware resources. The more powerful the machine, the more guests you can define. Note that z/VM can host a guest running another z/VM. As a result, administrators can run several z/VM inside another z/VM, without any limitation but their resources.
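As a hedged illustration (the user name, password, LAN name and disk layout are invented), each z/VM guest is described by a user directory entry; a minimal one could look like this:

USER LINUX01 PASSW0RD 512M 2G G
 CPU 00 BASE
 CONSOLE 0009 3215
 NICDEF 0600 TYPE QDIO LAN SYSTEM FLSALAN
 MDISK 0201 3390 0001 3338 LNXVL1 MR

This sketch defines a guest with 512 MB of storage (extendable to 2 GB), one virtual CPU, a console, a virtual network card coupled to a guest LAN, and one minidisk carved out of the 3390 volume LNXVL1.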


What are the differences between z/VM and PR/SM?

                          PR/SM                  z/VM
Max number of partitions  60                     Unlimited
License cost              Free                   Paid (depends on installed CPs)
Partition adding          Needs LPAR shutdown    Dynamic
Best use case             Static environments    Test environments needing changes, adding/removing servers on the fly

Virtual Network Management


When using z/VM and LPARs, administrators can use virtual networks. They consist of virtual devices and adapters, which are physical resources shared among several virtual systems. As a result, virtual machines won't need equipment such as physical routers or switches, because their functions are virtualized under z/VM. Each virtual machine will then belong to a virtual guest LAN. Guest LANs are closed virtual TCP/IP LANs running under a z/VM environment. They simplify internal virtual machine communication. Security is also reinforced, as each virtual machine will only be able to communicate with other virtual machines belonging to the same guest LAN. Of course, several guest LANs can communicate with each other, but only if they are explicitly configured to do so.

Communications initiated within a guest LAN use HiperSockets. This is an IBM technology allowing IP data to use the memory bus. Virtual systems can thus communicate with each other at memory speed without using any processor cache, which minimizes contention with other I/O activity. Furthermore, since they operate at memory speed, the bandwidth offered by this technology is far greater than any LAN technology, and overall reliability and availability are far better, as no network equipment which could potentially fail is involved. Security is also improved, as transferred IP data can no longer be sniffed on the network with software such as Ethereal.

Virtualization on System z is really mature, and IBM is a pioneer in that domain. Every great x86 solution is based on concepts from the LPAR system, and the hyped Xen project took IBM technologies as an example. The most advanced projects use paravirtualization, while IBM has used full virtualization for years. It's interesting to realize that the technologies seen as the most modern have been used for decades on the machines seen as the oldest. Seems quite ironic, huh?
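As a minimal hedged sketch (LAN name and device numbers invented), a HiperSockets guest LAN is created and joined with z/VM CP commands such as:

DEFINE LAN FLSALAN OWNERID SYSTEM TYPE HIPERS
DEFINE NIC 0600 TYPE HIPERS
COUPLE 0600 TO SYSTEM FLSALAN

The first command creates the guest LAN, the second defines a virtual HiperSockets adapter at device 0600 in the guest, and the third couples that adapter to the LAN; from then on, the guest's TCP/IP traffic to its neighbours travels through memory.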


2.13 Solutions for high and continuous availability


The concepts of high and continuous availability have become more and more important these last years. Indeed, companies want their information systems to be operational at all times. It's even more important for huge structures such as banks, whose systems directly support their business and customers. Most people don't know the distinction between high and continuous availability, and it seems important to clarify their meanings:
- Continuous availability deals with the software: if a program crashes, such as a database, another one should be available to handle its workload. In traditional distributed system environments, clustering is a good solution to offer continuous availability.
- High availability deals with the hardware: for example, if a processor crashes, another one should be available as its spare. Nowadays, most servers offer high availability for their power supplies, but other elements are seldom redundant.
The most important thing to remember is that the goal of these two concepts is to avoid SPOFs (Single Points of Failure). SPOFs are not acceptable in huge productions which have to run under any conditions. A SPOF is an easy concept: suppose you have a unique modem in your company. If it crashes, you cannot use external resources anymore, and your customers cannot access your website anymore either. In such a situation, the modem is a SPOF. To make it simple, a SPOF is any single piece of equipment that, if it fails, can stop a whole part of your business; in our example, every access to the Internet, which would be disastrous for most companies. It just can't happen: serious companies have to check whether any SPOF exists in their infrastructure, and must solve the problem urgently. Then, they will have continuous and high availability.

As we saw in previous parts, Mainframe environments, especially System z, offer really great high availability to their customers, as every component of their hardware is redundant. In x86 distributed environments, many continuous availability solutions exist, which are relatively stable and efficient. The best-known solution is the cluster: a group of coupled computers which work together so closely that they can be seen as a single one. As a result, if a node (a computer being part of a cluster) fails or crashes, it won't be dramatic, as the others will take care of the workload it would have executed. This solution considerably decreases the number of SPOFs. Microsoft proposes its own solution, MSCS (Microsoft Cluster Server), allowing machines to work together as a single one and providing failover/failback features; it's a good cluster system with products such as SQL Server or Exchange. Linux systems also propose their own solutions, such as Linux Virtual Server or openMosix, offering efficient load balancing features. Clustering services are very successful in distributed environments. However, it is interesting to study the System z solution for continuous availability: that's the purpose of Parallel Sysplex.


Parallel Sysplex
Available since the 90s on MVS/ESA, a Sysplex (meaning SYStem comPLEX) is a collection of several z/OS systems or logical partitions able to cooperate. Multiple systems can be linked, even if they are part of different machines. The main idea of a Sysplex is to handle multiple system images as a single one; standard clusters on x86 distributed servers use the same concept. However, Sysplex benefits from decades of innovation and offers a very advanced clustering system. Coupled with one or more CFs (Coupling Facilities), this system aggregate becomes a Parallel Sysplex. Parallel Sysplex provides the highest level of application availability on the System z platform. It implements advanced data sharing and dynamic workload balancing (called load balancing in distributed environments). It also includes features such as physical resource sharing. In fact, Sysplex was first used to benefit from the power of several zSeries machines, as a single one was not powerful enough to handle some particular workloads; the concern then evolved from power to system availability. In a Parallel Sysplex infrastructure, each node can share all kinds of resources with the other systems being part of the same Sysplex. Nodes then optimize their shared resources to efficiently handle the workloads to be executed. Furthermore, just as WLM does in a single image, Parallel Sysplex checks every partition's available capacity, and workloads are directed according to these availabilities. As a result, every partition and its resources are used efficiently. Parallel Sysplex also allows concurrent read and write access to the same shared data from all nodes of the same Sysplex. This feature doesn't impact data integrity and doesn't significantly decrease system performance. Each node can then work on the same workload, in parallel processing. This speeds up requests and overall performance, as a workload is split into several parts, each one processed by a different LPAR (logical partition) of the Sysplex.

The technology allowing multiple LPARs of a Parallel Sysplex to share all resources, such as catalogs, disks or even system logs, is the CF (Coupling Facility). There can be one or more CFs in a Parallel Sysplex but, as for everything in a Mainframe environment, it is highly advisable to have at least two. A CF is just a logical partition running a microcode called CFCC (Coupling Facility Control Code). It doesn't need to be IPLed, as its system is automatically loaded when the partition is activated, and it must be managed from the HMC (Hardware Management Console). A Coupling Facility includes pieces of data cache called structures. These are where shared data are buffered and accessed by every partition of the Parallel Sysplex. Structures can thus be seen as huge shared memories.


To make it simple, a Coupling Facility:
- Is a normal partition running the CFCC microcode
- Must be linked with EVERY partition being part of the Parallel Sysplex; there are three ways to do so:
  o IC (Internal Coupling): logical links within the same machine
  o ICB (Integrated Cluster Bus): to link the CF with another z9 located less than 7 meters away
  o ICL (ESCON/FICON cable): to link the CF with another z9 located more than 7 meters away
- Can be executed on a specialty engine (ICF: Integrated Coupling Facility)
Customers should duplicate their Coupling Facilities, even if it's not a prerequisite. They can use their CFs in two ways. In duplex mode, every CF is duplicated: if one of them crashes, there are no consequences. In standard mode, if one crashes, its data is transferred to the other CFs, but if these are full, the customer loses data. In most cases, companies use non-duplexed CFs and size their memory as if one of them had suddenly crashed: each CF must have sufficient processor power and memory to handle the data of another one. However, duplex mode remains the best solution, as it avoids any potential SPOF. A Parallel Sysplex needs other elements, such as a Sysplex Timer to synchronize the clocks of all systems (a Server Time Protocol, STP, can also be used). Couple data sets are also needed, to define the available Coupling Facilities, the Sysplex state, its WLM policies as well as its structure definitions.
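As a rough hedged sketch (sysplex and data set names invented), the sysplex and its couple data sets are described to each z/OS image in a COUPLExx parmlib member, and the running configuration can be checked with an operator command:

COUPLE SYSPLEX(PLEX1)
       PCOUPLE(SYS1.XCF.CDS01)
       ACOUPLE(SYS1.XCF.CDS02)
DATA   TYPE(CFRM)
       PCOUPLE(SYS1.CFRM.CDS01)
       ACOUPLE(SYS1.CFRM.CDS02)

D XCF,COUPLE

PCOUPLE and ACOUPLE name the primary and alternate couple data sets, the CFRM data set holds the Coupling Facility and structure definitions, and D XCF,COUPLE displays the sysplex couple data set configuration on the console.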


Parallel Sysplex is a very advanced clustering solution, and is used in most big infrastructures for its:
- Continuous application availability
- Single point of control, reducing administration costs
- Performance, data sharing and workload balancing

Geographically Dispersed Parallel Sysplex


Every huge company should be aware of the possible effects of natural or human-induced disasters. As a result, most structures like banks have a business continuity plan which defines what to do in order to recover from a disaster: this is naturally called disaster recovery. Many disaster recovery solutions exist in distributed environments, but they remain quite difficult and long to set up. The best way to handle disaster recovery is to have two sites for the production center. In such a situation, if an entire site crashes, in a plane crash, an earthquake, or something else, production won't stop, as the second site will handle the business workloads. We've seen that Parallel Sysplex offers a great clustering solution in a local area. GDPS (Geographically Dispersed Parallel Sysplex) relies on the same technology, but over a wide area: up to 100 km in synchronous mode, and even more in asynchronous mode (theoretically an infinite distance). GDPS is thus IBM's automated high availability and disaster recovery solution. GDPS offers several types of implementation, based on the Copy Services we'll present in this chapter:
- GDPS/PPRC: based on the Metro Mirror replication system, used for continuous availability and disaster recovery. It works synchronously, over a limited distance. It can also support the HyperSwap technology, allowing an application-transparent swap of storage devices, which is quite convenient in multiple-site environments.
- GDPS/XRC: XRC (eXtended Remote Copy) is an asynchronous mirroring technology, also known as z/OS Global Mirror. It is close to Global Mirror, but works directly on z/OS and not on ESS (Enterprise Storage Servers). This solution is used for disaster recovery, supports any distance and is very similar to GDPS/Global Mirror.
GDPS is thus designed to ensure data consistency and integrity, with no or minimal data loss.

Copy Services
Parallel Sysplex and GDPS are very interesting technologies, but if a site crashes, customers will also need their data. Data is the most important thing in an IT environment, as it directly supports the business. If a company loses its machines, that can be OK: it will just have to buy new ones. But if it loses its data, there's nothing to be done about it. It is therefore very important to save data which, in Mainframe environments, contains very important information such as bank accounts, customer profiles, confidential studies, etc. Conscious of these problems, IBM proposes multiple services in its enterprise storage servers in order to simplify data backup and synchronization. These solutions are directly used by GDPS to maintain two identical production/backup sites, as we'll see.


There are several technologies included in Copy Services; here are the most used and interesting ones:

Flash Copy: used within the same site, this technology is also known as PITC (Point-in-Time Copy). It allows customers to create an immediate copy of one or more logical volumes. To do so, it first establishes a bitmap between the source volumes and the target volumes, describing the state of the copy process; in this bitmap, each volume track is represented by a bit. This operation takes roughly three seconds, depending on the volumes saved. After that, source and target volumes can both be read and written. Target objects are then exact copies of the source objects, but they are empty, physically speaking. When a user needs to access an object, Flash Copy reads the corresponding target volume bitmap: if the resource is accessed in read mode, it is read from the source volume if it has not yet been copied to the target. But if the object is accessed in write mode, Flash Copy first backs up the source data to the target volume, and then the user modifies the data on the source volume. Flash Copy can operate in two ways:
o In NOCOPY mode: only modified data is written to the target volume; data accessed read-only keeps being read from the source volume. With this option, performance is boosted, since the source volume is not entirely saved: only modified data is backed up to the target volume.
o In COPY mode: it acts like NOCOPY, but also uses background processes which copy every file of the source volume to the target volume. With this option, the target volume becomes a real backup of the source one as it was when Flash Copy was initiated.
Flash Copy is not that simple to understand, as it's not a usual way to back up files. In distributed environments, most administrators just back up files or even entire volumes without using any special technology. These schemas will help you to understand.
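As a hedged illustration (the volume serials are invented), a full-volume FlashCopy can be requested through the DFSMSdss utility ADRDSSU; FASTREPLICATION(REQUIRED) insists that the fast replication hardware feature be used rather than a normal copy:

//FCOPY    JOB
//STEP1    EXEC PGM=ADRDSSU
//SYSPRINT DD SYSOUT=*
//SYSIN    DD *
  COPY FULL INDYNAM((SRC001)) OUTDYNAM((TGT001)) -
       FASTREPLICATION(REQUIRED) COPYVOLID
/*

The logical copy returns in seconds; the physical background copy (or the absence of it, in NOCOPY-style processing) is handled by the storage server.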


Metro Mirror: also known as PPRC (Peer to Peer Remote Copy), this technology is used to mirror one or more volumes to a remote site; it thus works across two different sites. It's a good solution for disaster recovery, as an entire site can be copied to another one, avoiding a very long data recovery process before usual operations can be restored.

Once the Metro Mirror relationship is established between two volumes, each one on a different site, both of them are updated simultaneously. Indeed, the Metro Mirror technology is based on synchronous copy: each piece of data written to a source volume on the primary site is also written to a target volume on the recovery site. In such a configuration, an I/O is not seen as complete as long as its record has not been written to both volumes; data on the primary and backup sites are therefore always identical. Since it's based on the microcode of the enterprise storage servers, it has no impact on the host systems. However, as it's a synchronous technology, its effectiveness depends on the distance between the two sites. Metro Mirror can be used over a distance of up to 300 kilometers, but each I/O would take more than 3.5 ms in such an infrastructure. In practice, this technology is often used for sites approximately 10 kilometers from each other.

Global Mirror: a combination of Global Copy, also known as PPRC XD (Peer to Peer Remote Copy Extended Distance), and Flash Copy, presented above. Global Copy, like Metro Mirror, is a feature included in the microcode of the enterprise storage servers; as a result, it has no impact on the host systems. The main difference between PPRC (Metro Mirror) and PPRC XD (Global Copy) is that PPRC XD is asynchronous. It is typically used to mirror one or more volumes to a remote site at a significant distance. As it's asynchronous, the local enterprise storage servers don't have to wait for the write acknowledgment from the backup site.


The primary storage system uses a bitmap describing the changed data, and keeps the changes until it is able to send them to the backup site. Data migrated to the backup site in this way is not consistent, since the mirror is asynchronous. This is a problem, because inconsistent data cannot be used for disaster recovery. As a result, Global Copy periodically switches to synchronous mode, when response time delays are acceptable, to fully synchronize the data between the primary and backup sites. Then, the volumes on the backup site are FlashCopied to a tertiary set of volumes, providing a consistent set of data for Disaster Recovery or Business Continuity. With such a configuration, the Recovery Point Objective can be a few seconds.

Metro Global Mirror: a combination of Metro Mirror and Global Mirror. In such a configuration, there are three sites: the first, primary one is connected to the second with a Metro Mirror link, and the second is connected to the third with a Global Mirror link. There are two backup sites in that kind of configuration, which is really appreciated in banks, in particular to fulfill the requirements of Basel II.

Being sure to have a consistent backup of their data is critical for customers. As we have seen, several technologies exist: Metro Mirror, Global Mirror and Metro Global Mirror, combined with solutions such as Geographically Dispersed Parallel Sysplex. Furthermore, these Copy Services also offer features to handle data from Open Systems distributed environments, as they are independent of the host systems. They thus effectively meet the needs of large infrastructures and propose advanced solutions for high availability, continuous availability and disaster recovery.


3/ Mainframe in the future: Dead or Messiah?


In the 90s, Mainframes were seen as obsolete machines. Their death had been predicted for years, even by prestigious magazines, consultants, and economists.

"I predict that the last Mainframe will be unplugged on 15 March 1996"
Stewart Alsop, 1991
This famous sentence seems quite funny nowadays, as Mainframes are still present in most IT infrastructures. Now that distributed server environments begin to show their limits and defects, the press is discovering once again the qualities of the Mainframe. Mainframe technologies still remain the most advanced, and it's not surprising: IBM invests more than $1 billion in each Mainframe generation, to offer the most advanced hardware and system. There is not much marketing about System z and other IBM platform products; only 20% of the budget is dedicated to it. The goal is to offer nothing but innovation. Many consulting groups, such as Gartner or IDC, believe that System z is going to be the rebirth of the Mainframe. Without any doubt, Mainframes will still be used for the same reasons they are currently used. But what is new in our decade is that Mainframes are surely going to conquer new markets. Indeed, the zLinux killer feature included in System z, allowing hundreds of Linux systems to run on a single machine and combining the qualities of an open OS with Mainframe performance, seduces more and more customers and helps them to face the new challenge of IT infrastructures: Server Consolidation.

3.1 Server Consolidation

Server consolidation's goal is to combine workloads from separate machines and/or applications into a smaller number of systems and/or applications. More and more used in many enterprises, it helps to use server resources efficiently and to reduce the total number of machines. It thus supposes a reorganization of the IT infrastructure, which will reduce its total cost of ownership and improve control over its resources. Indeed, too many computers mean too much cost.

One server to rule them all


First, there are the direct costs. Data Centers, and even small and medium IT infrastructures, encounter a lot of problems when they have to deal with many machines. More and more computers mean more and more problems: multiple applications doing the same thing (such as email servers or databases), under-utilised servers, space and energy needs, etc. The number of x86 computers has dramatically increased these last years, and so did their total cost. Secondly, there are what we can call the hidden costs, which are not always considered, and which can nevertheless be huge. There are a number of them, but we can define two main hidden costs:

Utilization: according to well-known auditors such as D.H. Brown Associates or IBM, the majority of servers often run at about 20% of their capacity, which is quite disastrous from a return-on-investment perspective. It's not interesting to buy a machine which won't really be used. Unused power can thus be considered a hidden cost.

People: the more physical machines you own, the more potential hardware problems you'll have. That induces people costs which are not negligible. The same goes for the IT people who deal with the IT architecture: the more machines a Data Center contains, the more difficult it is to keep a correct topology of it. Server consolidation will be one of the main drivers to cut unnecessary costs and to maximize the Return on Investment (ROI). Most big structures will have to do it. Here are the results of a recent Gartner Group study of about 520 enterprises.

It clearly shows that Server Consolidation is and will be one of the most popular projects in IT infrastructure. There are two ways to do Server Consolidation, and it's very important to separate them. Most people usually think that Server Consolidation is all about virtualization. They are wrong: even if it can effectively be done with virtualization, it can also be done in another way.


The main goal is to combine small workloads from separate computers and/or applications into a smaller number of computers and/or applications. You can thus:

Combine them on a single larger computer. There will thus be fewer OS running in your IT infrastructure, both logically and physically. It's dangerous because, by doing that, all resources, even the operating systems themselves, are centralized. If it crashes, this is a disaster.

Use server virtualization technologies. There will be fewer physical computers running, but the logical number of running OS will be the same as before, in order to keep resource-sharing possibilities, which will avoid disasters thanks to technologies such as clustering.

Please also note that Blade Centers are seen as part of the Server Consolidation concept because they save much space. Combined with virtualization technology, they can be great too.

As virtualization gives many more advantages than the other solution, it is often seen as the best way to do Server Consolidation, and has even become its synonym in most documentation. Thanks to it, you can use fewer physical machines and run them at nearly full capacity. A well-done Server Consolidation project means:

- A reduced number of computers
- Improved server administration thanks to standardization
- Reduced essential costs, such as server, energy and space costs
- Reduced hidden costs, as your computers will be better used and run at nearly full capacity
- Reduced management costs, as you have fewer processes and physical machines to deal with

Moreover, in addition to helping you achieve a great Server Consolidation, virtualization will help you build new environments very quickly. If you need more servers in production, for instance backup virtual machines for critical applications such as firewalls or web servers, you'll just have to copy existing virtual machines to new ones. And that's it: no need to buy a new machine, configure it, etc. It gives your IT team much more time to work on more important tasks than a boring server install. With rationalized processes, virtualization will make your IT team more efficient.


Mainframes helping Server Consolidation


Linux has become, these last years, the new hyped and cool OS to use, even in companies. Its rapid adoption by the developer community, its open standards and its growth on servers make it an evident long-term player in the OS market. Linux runs on thousands of servers in big IT infrastructures, and reorganizing such big structures is quite difficult; however, it's inevitable. Customers can use Mainframes to efficiently plan their Server Consolidation, in particular for Linux-based systems. Although IBM is known for its proprietary technologies, it also leads the industry in promoting compliance with open standards and interoperability with systems such as Linux. When IBM announced its decision to make it possible to run Linux systems on its Mainframes, it wasn't really understood. People used to be ironic and thought it was just an announcement to say: Hey, please remember guys! Look at us! We're still alive. However, the fact is that the possibility to run multiple Linux OS on a single huge system is the new strength of Mainframes. I would even say it's their chance to reappear as the most modern system. Even if its performance was criticized when it was launched, Linux on zSeries is now powerful and effective. According to IBM, more than 1800 customers run Linux on their Mainframes.

Using Linux on System Z


First of all, we have to clarify the terms used. People often talk about Linux on zSeries without really knowing what it's about. Here is a summary of common terms and their meanings.

Used Term                      Applies to
Linux on S/390                 S/390 systems specifically
Linux on IBM System z9         z9 (Enterprise and Business) systems specifically
Linux on eServer zSeries       z990, z890, z900, z800 systems specifically
Linux on IBM System z          All systems above

Linux on IBM System z refers to ports of the usual Linux to the System z9, S/390 and zSeries architectures. It benefits from the known strengths of IBM servers, such as reliability and security, while preserving all the Linux qualities such as openness and stability.


Why is zSeries interesting for customers?


First of all, operating system virtualization on IBM large servers benefits from more than 35 years of innovation in that domain. Indeed, the Virtual Machine (VM) technology used by IBM was created back in the System/370 days (VM/370)! It is thus known as a robust computing platform. z/VM helps customers quickly create as many virtual machines as they want, and benefit from all the famous zSeries advantages such as security and robustness. This solution lets administrators easily share the available physical resources among all their virtual machines, each of which consists of virtualized processors, storage, networking and I/O resources. The zSeries system can then be run at full capacity, as it is used to. If customers quickly need a test environment identical to their production one, they just have to copy the files representing each virtual machine, et voilà! Much time saved. As time is money, virtualization saves a lot of dollars during Server Consolidation, and even afterwards, as you have seen.

Communications between virtual machines are much faster and more secure using System z
Furthermore, using Linux on z/VM makes it possible to use the HiperSockets (HS) technology, which allows high-speed communications between partitions. It provides in-memory TCP/IP connections between all the OS running under z/VM. Using HiperSockets greatly increases overall performance, as every transfer between two OS running on the same machine is handled at memory speed. There is nothing special to do on the Linux guest OS to use HS, so it is simple and intuitive. Please also note that it also greatly increases security, as the traffic never leaves the machine and is thus not exposed and vulnerable to sniffers.

As it runs under z/VM, a Linux on zSeries can also benefit from all the great backup and recovery features available on these platforms, such as:

- Capacity BackUp, a robust disaster recovery solution which can activate reserved capacity (additional CPs) in case of unplanned situations.
- Parallel Sysplex, and even Geographically Dispersed Parallel Sysplex, for disaster recovery.
- IBM Tivoli Storage Manager, which helps to reduce the risks associated with data loss by storing backups and archives of all Linux-based OS images.

Linux on System z provides security advantages not available on other platforms

- They include exclusive code patches which allow them to use security features only available on System z servers.
- The technologies used to run Linux on System z, such as LPAR and z/VM, have earned Common Criteria certifications up to Evaluation Assurance Level (EAL) 5. The Linux distributions advised by IBM are EAL4 certified.


- Virtual LANs can be configured, and only specified OS can access these networks. As they rely on HiperSockets, they are insulated from other networks and data cannot be sniffed.
- They can use the cryptographic acceleration hardware available on System z, such as Crypto Express 2 (CEX2) for clear-key RSA.
- They can also benefit from the CP Assist for Cryptographic Functions (CPACF) instructions available on IBM System z9 Enterprise and Business Class, which provide hardware instructions for AES, SHA and DES to both user-space and kernel-space applications through special libraries. This speeds up every application using cryptography.
- They can use the famous and secure RACF (Resource Access Control Facility) for user authentication; one just needs to use the appropriate Pluggable Authentication Module (PAM).

In addition to these points, by using IBM Mainframe virtualization technologies instead of distributed servers, you will no longer need to buy a new machine when you want to add a new server. Even if you use virtualization with an x86 solution such as Xen, you will never be able to run as many servers on the same hardware. The costs of machines, energy and space are saved.

IBM System z Integrated Facility for Linux (IFL)


The IFL is an optional processor, only available on IBM System z, dedicated to adding processing capacity exclusively for Linux workloads, running under z/VM or in an LPAR. Although it is almost always used in infrastructures running Linux for System z, it is not required, as Linux can run on standard CPs. As the entry price of a System z9 Business Class is about $100k and an IFL costs about $95k, an enterprise can buy an IBM z9 with an IFL for less than $200k. IFLs are interesting because once you have paid for one, you will not have to pay anything more: each upgrade is free, so your Linux performance keeps growing over time. It saves a lot of money and avoids buying new machines to add power, as with standard x86 servers. Moreover, software licenses based on running CPs are very interesting if you use an IFL. Indeed, a single IFL running numerous Linux images is seen as one CP. Thus, every environment, such as test, development or quality assurance, can run on a single IFL without the price rising. Please also note that customers can choose to use only IFL engines on their IBM System z; we then talk about a dedicated IBM System z Linux Server, often used in small and medium-sized companies. Customers can also use On/Off Capacity on Demand for these IFL CPs, as for the others. This feature gives customers the ability to use CP capacity by the day: they can turn engines off for months and use them only when needed, paying only for the days of actual use.
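To make the licensing point concrete, here is a minimal sketch. Only the machine (~$100k) and IFL (~$95k) prices come from the text; the per-CP yearly license fee and the two-CPU-per-server assumption are purely illustrative values, not figures from any real price list:

```python
# Illustrative effect of per-CP software licensing on consolidation.
# The yearly per-CP license fee below is a made-up example value.
LICENSE_PER_CP = 4_000  # hypothetical yearly software fee per licensed CP

def distributed_license_cost(servers: int, cpus_per_server: int = 2) -> int:
    # Every x86 server contributes its own CPUs to the license count.
    return servers * cpus_per_server * LICENSE_PER_CP

def ifl_license_cost(linux_images: int) -> int:
    # However many Linux images run on the IFL, it counts as a single CP.
    return 1 * LICENSE_PER_CP

print(distributed_license_cost(30))  # 30 servers -> $240,000 per year
print(ifl_license_cost(30))          # 30 zLinux on one IFL -> $4,000 per year
```

Whatever the exact fee, the point stands: the distributed bill scales with the number of servers, while the IFL bill does not.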


Which distributions can customers use on zSeries?

The most famous distributions, such as Debian, Slackware or even Gentoo, can run on zSeries; in fact, any distribution that conforms to the requirements of the zSeries architecture will run. However, IBM advises its customers to use Novell SUSE or Red Hat, because of their great software support. Moreover, if you use these distributions, you can sign support contracts with IBM which include full-time coverage in case of problems. Thus, these distributions are nearly always used:

- SUSE Linux Enterprise Server
- Red Hat Enterprise Linux Advanced Server

So what's the deal with Linux on System z?


Well, Linux on System z offers really great features to customers. Combining the legendary security, scalability and reliability of IBM Mainframes with all the known advantages of Linux systems, such as the rapid innovation from the Linux and Open Source communities, it can be the foundation of your IT infrastructure and a nice choice for your Server Consolidation projects. As we have seen, Linux on System z brings customers a number of advantages, such as:

- The best of both worlds
- An IFL processor dedicated to Linux workloads
- A mature and efficient virtualization technology
- A very interesting total cost of ownership (TCO)
- Overall security with secure layers such as RACF
- Fast and secure communications using HiperSockets
- Costs needed for distributed servers saved (energy, space)
- Hardware equipment saved (machines, FICON adapters, etc.)
- Very advanced backup and fail-back/fail-over technologies
- An IT environment that grows and adapts quickly to satisfy new business needs

Furthermore, it is important to know that IBM is really involved in improving Linux performance and works hard with important distribution developers such as Red Hat.


3.2 An interesting total cost of ownership


Many specialists used to say that Mainframes are far too expensive, especially compared to distributed environments. They are right if one only considers the hardware price: a System z9, even in Business Class, is much more expensive than any other server. But as a result of stopping there, many customers fail to consider direct and hidden costs when they buy a machine. Indeed, energy, floor space and cooling costs may be very significant and thus should be considered. We will study each of them, as they are the new Data Center challenges.

Energy Costs Considerations


Every industry and human activity has to control its energy consumption, because of the new ecological policies as well as its costs. Beyond the financial aspects, it is becoming more and more difficult for companies to get reliable power: in some areas with a high concentration of Data Centers, notably in the United States, the energy suppliers are unable to cope with the quantitative and qualitative requirements of these major consumers. Energy and power are cited as leading concerns of Data Center managers, whereas a few years ago they were hardly an issue.

"In 2009, energy costs will become the second highest operating expense for 70% of Data Centers." (Gartner Group)

This becomes even more of a problem as energy prices themselves are increasing at a rate of about 3% per year, and some specialists expect this rate to increase with time. The energy cost increase in Data Centers can be explained both by the rise of energy prices and by the rise of electricity consumption. But why has energy cost become such a problem in so little time?

- The power of processors keeps increasing. As they represent between 50 and 60% of a computer's energy consumption, one can easily understand their effect on the energy bill of a Data Center. Processor manufacturers are really concerned about reducing consumption, such as AMD, which recently presented its projects to deal with this problem. But much progress remains to be done to significantly improve performance per watt.
- The proliferation of systems such as BladeCenters causes overheating. As a result, heat dissipation and regulation systems, which also consume electricity, are more and more solicited.
- Many Data Centers do not meet recent standards that would reduce their overall energy consumption, especially regarding their thermal dissipation systems.
- IT occupies a much larger place in the world than before and is really crucial now. Critical applications have multiplied and must operate 24/7. Furthermore, they require more and more computing power and storage devices, which also consume energy.
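To give an idea of what a steady 3% yearly increase means, here is a small sketch compounding a base rate. The 3% figure comes from the text; the 9.37 cents per kWh base is the average U.S. price from the rate table later in this chapter:

```python
# Compound a 3% yearly increase of the electricity price.
def project_price(cents_per_kwh: float, years: int, growth: float = 0.03) -> float:
    # Price after `years` of compound growth at rate `growth`.
    return cents_per_kwh * (1 + growth) ** years

base = 9.37  # average U.S. cents/kWh, from the rate table below
for years in (1, 5, 10):
    print(f"After {years:2d} years: {project_price(base, years):.2f} cents/kWh")
# After 1 year: 9.65; after 5 years: 10.86; after 10 years: 12.59
```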


This vision of the situation may seem pessimistic, but it is clear that energy is a major component of Data Center costs. According to analysts from IDC, 50% of investments in computer equipment are devoted to covering energy consumption needs, and this should increase to 71% within four years. To convince you, a recent study from the EPA (Environmental Protection Agency) presents the reality of a dramatic situation and quantifies the consumption of Data Centers in the U.S. The conclusions of this report are impressive:

- American Data Centers consumed more than 60 billion kilowatt-hours in 2006. It represents more than 1.5% of the total electricity consumption in the U.S.
- The energy consumed by Data Centers has doubled over the past 5 years and will certainly double again in the next 5, to reach about 100 billion kilowatt-hours. It represents an annual cost of $7.4 billion.
- Existing technologies and strategies could reduce this energy consumption by 25%.

The chart from the EPA report compares the historical trends scenario with a state of the art scenario, in which Data Center managers take the right decisions to limit energy costs. The difference between the two is about 85 billion kilowatt-hours, representing an overall economy of more than $5.5 billion! This amount cannot reasonably be ignored.

"In 2008, 50% of existing Data Centers will not be able to meet the demands of power and heat dissipation of high-density equipment such as blade servers." (Gartner Group)

"Power will be the number one issue for most large-company IT executives to address in the next 2-4 years." (Robert Frances Group)

This situation may appear exaggerated; however, it is already a problem for some Data Centers!

"The Data Center energy crisis is inhibiting our clients' business growth as they seek to access computing power. Many Data Centers have now reached full capacity, limiting a firm's ability to grow and make necessary capital investments." (Mike Daniels, Senior Vice President, IBM Global Technology Services)
Energy cost considerations have reached a critical point. Energy has become the nightmare of most Data Center infrastructure managers, as it represents a significant part of their budget.


Data Center managers have to choose machines which consume less electricity, as well as where their Data Centers should be located. Indeed, electricity does not have the same price everywhere in the world, and its price even changes between each state of the U.S.

Some Data Centers cannot be installed in certain areas of the world because of their energy costs. It is thus a very important choice when companies outsource their production sites. For example, a recent outsourcing was quite disastrous, because hidden costs such as electricity had not been planned.

"We thought our construction in Bangalore (India) was going well, until we found out that the land ownership was not clear." (Confidential report of a global communication technology provider)
A recent study from IDC clearly shows that rack servers will be the most used in big companies' infrastructures. Standard tower servers seem to be doomed to disappear, but are still more used than blades.


We will base our study on the average energy price in the U.S., which varies from state to state, and on the two most used architectures according to IDC: tower and rack servers.
Electricity cost in the USA, April 2007 (cents per kWh)

Minimum   4.85 (Idaho)
Maximum   20.38 (Hawaii)
Average   9.37

The most used tower servers in IT infrastructures are without any doubt the PowerEdge from Dell. Their consumption is really significant: one year of electricity can represent up to 70% of the base hardware price!

Dell Model         Power (W)   Cost/Day   Cost/Month   Cost/Year   Minimal Cost
PowerEdge 840      420         $0.94      $28.33       $340        $950
PowerEdge 1900     800         $1.79      $53.97       $647        $1,450
PowerEdge 2900     930         $2.09      $62.74       $752        $2,300
PowerEdge 6800     1,570       $3.50      $105.00      $1,271      $5,000
PowerEdge SC1430   750         $1.68      $50.59       $600        $900

Rack servers are more and more used in companies. Although their consumption is more reasonable than that of tower servers, one year of electricity can still represent up to 35% of the base hardware price!

Dell Model         Power (W)   Cost/Day   Cost/Month   Cost/Year   Minimal Cost
PowerEdge 860      345         $0.77      $23.10       $277.20     $1,110
PowerEdge 1950     670         $1.50      $45.20       $542.41     $2,100
PowerEdge 2900     930         $2.09      $62.74       $752.89     $4,600
PowerEdge 2950     750         $1.68      $50.59       $600.00     $2,300
PowerEdge 6850     1,470       $3.30      $99.17       $1,190      $5,700
PowerEdge 6950     1,570       $3.50      $105.00      $1,271      $7,600
PowerEdge SC1435   600         $1.34      $40.47       $485.74     $1,500
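As a sanity check, all the figures in these tables (and in the System z9 table further below) follow from a single formula: power times hours times the electricity rate. Here is a minimal sketch, assuming continuous operation at the listed power; the 30-day months and 360-day years are inferred from how the tables round:

```python
# Reproduce the cost tables: power (W) x hours x electricity rate ($/kWh).
RATES = {"idaho": 0.0485, "hawaii": 0.2038, "us_average": 0.0937}

def energy_cost(watts: float, rate: float, hours: float = 24.0) -> float:
    # Electricity cost in dollars for `hours` of operation at `watts`.
    return watts / 1000.0 * hours * rate

# A PowerEdge 840 (420 W) and a z9 EC S08 (12,100 W) at the average rate:
for name, watts in (("PowerEdge 840", 420), ("z9 EC S08", 12_100)):
    day = energy_cost(watts, RATES["us_average"])
    # The tables appear to use 30-day months and 360-day years.
    print(f"{name}: ${day:.2f}/day, ${day * 30:.2f}/month, ${day * 360:.0f}/year")
```

This prints $0.94/day and $340/year for the PowerEdge 840, and $27.21/day and about $9,796/year for the z9 EC S08, matching the tables up to rounding.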

An interesting fact about x86 distributed environments is that a usual server at 10% of its computing capacity consumes almost as much energy as if 100% of its power were used.


These costs may seem very high, but they remain realistic. Here is another example of energy consumption, for x86 processors only. It gives an idea of how much they really cost once bought.

Electricity costs are based on the Annual Electric Power Industry Report presented above: the low kWh cost is the energy price in Idaho, and the high one the price in Hawaii.

Scenario                                Cost/Day   Cost/Month   Cost/Year
Worst hardware and low kWh cost         $0.136     $4.00        $49.70
Worst hardware and high kWh cost        $0.572     $17.16       $208.87
Best hardware and low kWh cost          $0.332     $9.98        $121.50
Best hardware and high kWh cost         $1.362     $40.86       $497.20
Worst hardware and average kWh cost     $0.263     $7.89        $96.04
Best hardware and average kWh cost      $0.643     $19.29       $234.75
Average hardware and average kWh cost   $0.400     $12.00       $146.10


Here are the System z electrical consumptions. They may appear very high, but they represent less than 1% of the total cost of the machine. Keep in mind that a System z9 can run hundreds of zLinux, contrary to x86 servers, which can run at best about six operating systems at the same time using the virtualization solutions presented in previous chapters, such as XenSource.

z9 EC Model   Power (W)   Cost/Day   Cost/Month   Cost/Year
S08           12,100      $27.21     $816.31      $9,795
S18           14,700      $33.06     $991.72      $11,900
S28           16,900      $38.00     $1,140.00    $13,681
S38 and S54   18,300      $41.15     $1,234.59    $14,815

If one looks solely at hardware capabilities and consumption, without taking x86 virtualization solutions such as XenSource or VMware into account, the System z9 is far more interesting, as a recent IBM study shows.

It would seem unfair to compare these technologies without using virtualization. So let's take an example: a hundred zLinux on a System z9, and five Linux on each x86 server. This last hypothesis would already be a great performance in most production infrastructures. Also consider that the results below do not take into account software prices such as VMware licenses (thousands of dollars).

Model           Average Cost/Year   Average Cost per virtual machine/Year
Tower servers   $730                $150
Rack servers    $730                $150
System z9       $12,550             $125
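The per-VM figures follow directly from dividing the yearly energy cost by the consolidation ratio; a minimal sketch, using the ratios assumed above (the table's values are rounded):

```python
# Yearly energy cost per virtual machine, using the consolidation
# ratios assumed in the text (5 guests per x86 box, 100 zLinux per z9).
platforms = {
    # name: (average yearly energy cost in $, virtual machines hosted)
    "Tower server": (730, 5),
    "Rack server": (730, 5),
    "System z9": (12_550, 100),
}

for name, (yearly_cost, vms) in platforms.items():
    print(f"{name}: ${yearly_cost / vms:.0f} per VM per year")
# Tower server: $146; Rack server: $146; System z9: $126
```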


Data Center managers should not count on racks to solve their energy problems. Indeed, although many manufacturers make efforts to improve electrical consumption, rack consumption will continue to increase at a dramatic speed, as shown in an IDC study (June 2007).

Mainframes then seem to be a good alternative: although some blade technologies offer virtualization capabilities (such as hypervisors, virtual I/O, etc.), none of them offers the maturity of the virtualization provided by System z, which benefits from nearly 35 years of experience. As a result, Mainframe workloads often run near 100% utilization, whereas distributed servers run at a very low utilization level, from 10% to 30% for the most used. Customers want to pay for what they can do with their machines; a server with so much white space is not interesting, as it is not profitable. The energy used by machines is significant, but the infrastructure they require is even more important: according to a study from the EPA, non-IT equipment (cooling, ventilation, pumps, etc.) represents an average of 60% of a Data Center's electricity consumption.


Heat in Data Centers


The major reason why electricity has become a problem in Data Centers is the fact that IT equipment generates much more heat than before. Therefore, the new problems for managers are about heat removal. A Data Center should be at about 19°C, whereas most of them are at about 26°C.

Heat load per footprint evolution (Source: IBM Journal of Research and Development)
Every big company has to install its own air-conditioning system, in order to keep the components of the electronic equipment within the manufacturer's specified temperature and humidity range. Servers in a confined space generate a lot of heat, and their reliability is reduced if they are not adequately cooled. It can be disastrous for production.

"When a BladeCenter is fully populated with a classic layout, the servers in the middle and at the top get so hot that their failure rate makes them unusable." (Bertrand Buxman, Cooling Director, Emerson Network Power)

BladeCenters are very interesting equipment because they typically need only about 2 kilowatts of power per rack (except High Density Blades, which require more than 20 kW per rack), but they generate much more heat than other servers. The energy saved on hardware is passed on to the cooling system's consumption. According to an EPA study, rack servers are expected to require an additional 20-25 kW of power for the cooling and power conversion equipment that supports them.


Despite the advice provided by organizations such as ASHRAE, the American society specialized in advanced heating, refrigerating and air-conditioning technologies, heat in Data Centers remains a problem. The number of best practice guides keeps increasing, and how to place servers has become very important to optimize floor cooling in Data Centers.

Even with such practice guides, hot aisles will always remain. The use of alternating hot and cold aisles is a method of configuring servers in which the racks are arranged in parallel rows. Naturally, hot aisles are at a much higher temperature than cold aisles, and when warm room air mixes with colder air, the result tends to be very difficult to keep at a precise temperature.

"We estimate that in 2006, $29 billion was spent on powering and cooling IT systems." (IDC)
To preserve their machines and prevent failures due to overheating, companies must use cooling systems, but their costs are incredibly high, particularly since they do not really contribute anything to the company: they are an obligation and nothing more. These facilities will never bring in money, and even with rigour and attention, hot spots will still be present. These are areas which generate much heat; even if they usually occupy a limited floor space, they result in wasted Data Center space. Indeed, when some machines generate much heat, companies usually isolate them, leaving a significant space between them and the rest of the machines. The goal is not to heat the other machines more than they already are, and to control each area according to its heat output. The space used by machines also remains a very big problem for companies and their Data Centers: it is not rare to deal with Data Centers of thousands of square meters. Mainframes tend to be a solution to both space and electricity problems, as one can easily replace hundreds of tower and dozens of rack servers, without their physical constraints.


A recent IBM study compares the power and space consumption needed by usual Intel x86 servers and by a System z9. It tends to show that Mainframes could avoid many of the problems presented above.

Some key numbers from the Wall Street Journal Online show that hidden costs are not that hidden:

- Air conditioning: cooling units cost $25k to $50k
- Electrical system: a diesel generator costs $50k to $200k
- Floor space: most companies build new facilities for their Data Centers, at $250 to $1,500 per square foot, with design and deployment costs of $30k to $75k

Equipment and people considerations


Other costs imposed by distributed environments come from network equipment such as routers and switches, which allow machines to communicate with each other. This equipment is not necessary in Mainframe environments, since network communications between virtual machines go through the HiperSockets technology presented above. Significant savings are then possible (hundreds of dollars per piece of network equipment, such as Cisco products), and performance is even boosted, as HiperSockets run at memory speed.

Network equipment needed for hundreds of servers, based on 24-port switches (such as the Cisco Catalyst 2950)
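The order of magnitude behind that figure is simple arithmetic. A small sketch, assuming one access port per server and ignoring uplink ports and redundancy (which would raise the count):

```python
import math

# Access switches needed for a farm of servers, one port per server.
def switches_needed(servers: int, ports_per_switch: int = 24) -> int:
    return math.ceil(servers / ports_per_switch)

for n in (100, 200, 500):
    print(f"{n} servers -> {switches_needed(n)} x 24-port switches")
# 100 -> 5, 200 -> 9, 500 -> 21. HiperSockets makes this whole access
# layer unnecessary for guest-to-guest traffic on a single System z.
```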


People costs also vary greatly according to the chosen platform. The more machines you have, the more people you need for maintenance. Distributed server environments thus need more people than Mainframe environments. As people costs are the main source of expense, companies should seriously think about the System z9 alternative.

"The number of operators and system programmers required per Mainframe MIPS has fallen ten-fold in the past seven years, and is expected to at least halve again in the next five years." (Arcati, The Dinosaur Myth)

What about Blade Servers?


The most fashionable servers nowadays are without any doubt Blade Servers. Indeed, they have great qualities, but also defects, as seen before, especially their heat output. Still, it remains interesting to see why they are so appreciated in IT infrastructures.


The results of this recent poll from NWC are quite ironic, as the main drivers for choosing Blade Servers are the very ones which make the strength of the new Mainframe, as seen above! This study clearly shows that the Mainframe could attract many people who wish to benefit from all the Blade Server qualities, which are even more interesting in Mainframe environments, and without their defects.

Big Green
This approach may appear far too optimistic, but it is revealing of the new reality of Mainframe servers. IBM recently launched a new project called Big Green, in reference to its nickname Big Blue, and will redirect more than $1 billion per year to mobilize the company's resources and dramatically increase the level of energy efficiency in IT.

To begin its project, IBM consolidated about 3,900 distributed servers onto only 33 Mainframes, thanks to the z/VM technology. This new environment will consume 80% less energy than the current configuration and will also allow IBM to realize significant savings (energy, software and hardware support) in the next 5 years. The floor space used will also be reduced by 85%. The replacement of real servers by virtual servers allows IBM to significantly reduce operating costs:

- The energy saved represents the annual electricity consumption of a small town.
- Software is often billed according to the installed processors, and the 33 Mainframes contain far fewer processors than the 3,900 servers they replace.
- The project will release technical personnel and assign them to projects with higher value.

This infrastructure, capable of handling 350,000 users, serves as a perfect illustration of the Mainframe transformation and a perfect showcase for customers. With this consolidation, IBM wants to prove that the Mainframe is the best solution to meet customer requirements in terms of infrastructure cost reduction and optimal energy management.

Mainframes seem to be the perfect platform for Server Consolidation: they provide various significant cost savings (power, space, software and people). Furthermore, they have qualities not available on other platforms, such as very advanced security, dynamic allocation of computing power, hardware with redundant components, etc.


Effects on the market
Customers seem to be sensitive to the new Mainframe qualities, especially the possibility to run several zLinux on a single system. A chart from IBM presenting the evolution of MIPS growth shows that IFL usage is very popular, since no software charges apply to it.

IBM System z Mainframes are experiencing a great resurgence of interest around the world. According to IDC, in 2006 the revenue growth of IBM Mainframes was superior to that of Windows platforms. The high utilization of Linux servers and big Server Consolidation projects explain this situation.


3.3 A mature and credible platform


Mainframes can conquer a new part of the market with their virtualization capabilities. But they are also very interesting for all their old but efficient technologies. They will thus still be used for the reasons they are used today, which we have presented in the chapters above. Moreover, many companies' critical applications are hosted on Mainframes and cannot be migrated.

"More than 75% of professional transactions pass at least once through Mainframe applications written in COBOL." (Gartner Group)

"It is in the 60s that banks and major companies developed their historic business applications. They have never ceased to be improved. Their replacement or rewrite has become too expensive." (Evelyn Bernard-Thewes, ECS Mainframe Director)
For these reasons, the Mainframe will keep its niche place in the very-high-quality server market. It is also the only platform able to answer the big Business Continuity Plans imposed on banks by prudential regulations designed to prevent banking risks, such as Basel II. Parallel Sysplex and GDPS do not really have equivalents in distributed server environments. Serious companies need to be sure of their IT infrastructure, as breakdowns can be disastrous.

Combined with its advanced hardware, z/OS is the only operating system offering a system availability of 99.999% while being EAL5 certified. Together they provide reliability, availability and serviceability.


Mainframes benefit from decades of experience in big infrastructures, and big companies need them. Indeed, almost all enterprises from the Fortune 500 use Mainframes, even those which do not use z/OS but only z/VM, such as some banks in Japan, which use Linux systems consolidated on System z9.

"There couldn't be any migration from Mainframes to UNIX in banks. Very few banks are created today, but even the newest choose the Mainframe, such as La Banque Postale." (Stéphane Deliry, Overlap President)
With Mainframes, customers are sure to invest in strong hardware and to capitalize on their IT infrastructure. In distributed server environments, updating the infrastructure can be a real nightmare, as it involves thousands of servers; it is far too complex. x86 configurations change dramatically with time and thus are not reliable, which is not the case with Mainframes. If one has to remember five reasons why the Mainframe is going to grow on the market, we should say:

1. Security and high availability
2. Investment protection and overall operating costs
3. Scalability: scale out and scale up thanks to its hardware and virtualization capabilities
4. Handling of old and new workloads (COBOL, Java) with great performance
5. Emergency management: procedures have been documented for years, so customers are serene

Finally, a recent study from Arcati presents the average cost per end user in 2010, taking into account the various IT infrastructure parameters we presented. It appears that the Mainframe will be the most interesting architecture on a five-year cost basis.


3.4 Emerging applications


When IBM decided to open its Mainframes to Linux systems, many specialists did not understand its strategy and thought it was a huge error. Today, zLinux is the new strength of the System z9 and is crucial for IBM, notably in all Server Consolidation projects. But the innovations are not over, and Mainframes are going to discover new horizons.

Gameframe
Indeed, a new kind of System z9 will appear in a few months, integrating the Cell Broadband Engine. This machine, called Gameframe, is designed to support MMORPGs and virtual communities. The project is born of a partnership between IBM and a Brazilian game developer, Hoplon Infotainment, and aims to create systems able to host massively multiplayer online games.

"As online environments increasingly incorporate aspects of virtual reality, including 3D graphics and lifelike, real-time interaction among many simultaneous users, companies of all types will need a computing platform that can handle a broad spectrum of demanding performance and security requirements." (Jim Stallings, IBM System z General Manager)
The IBM System z9 will add a great level of realism to visual interactions in addition to gaming, as well as much security, thanks to its EAL5 certification. It could also be used to enhance the scalability and performance of existing virtual worlds such as Second Life. Many consultants think it is just a gadget announcement, but they should reconsider the online game market, which is exploding, especially since its democratization through World of Warcraft.

In my opinion, IBM is aiming at a very promising market with its Gameframe systems.


zSolaris
Gameframe is not the only innovation promised in the Mainframe environment. Indeed, after having opened its systems to Linux, IBM will now open them to Solaris.

zSolaris will be available in a few months, according to a recent agreement between IBM and Sun. As Solaris 10 is a very stable and reliable system, its combination with the known qualities of Mainframes will surely interest many people, notably web hosting companies, which use the Sun operating system for the complete fault and security isolation offered by Solaris Containers.

z/OS Simplification
The worst thing about z/OS is its interface, which is more than thirty years old and considerably reduces productivity in some cases. IBM is aware of the problem and will launch a huge project, representing an investment of $100 million, to make System z easier to use for a greater number of IT specialists. It particularly aims at zNextGen members, who are more efficient with graphical interfaces. The goal is to enable administrators to manage their Mainframe systems more easily, with automated configuration checking, a modernized user interface, and development environments with visual tools available on microcomputers. IBM thus demonstrates that Mainframes can be flexible to use.

"IBM aims for user-friendly Mainframes" (CNET Networks, Inc.)

Other consoles are available under zMC, allowing administrators to configure RACF, WLM, DB2 and much more with a graphical interface. These innovations will give the Mainframe a new life.


3.5 SWOT and future market


Mainframes are in a niche market, but for the first time in years, they can conquer new markets which were once only addressed by x86 server retailers. Here is, in my opinion, the System z9 SWOT. SWOT analysis is a strategic tool used to evaluate the Strengths and Weaknesses of a product, as well as the Opportunities and Threats existing on the targeted market. It helps to have an overall view of the situation.

In my opinion, IBM can conquer new markets and destabilize many actors, on the hardware market as well as on the software market, the Business Class System z9 equipped with IFL engines being very competitive. zLinux and zSolaris will surely be the salvation of Mainframes, and many of them will surely be sold in the next years only for their incredible virtualization capabilities. I think the Mainframe market will split and address two distinct types of customers. On the one hand, we will have the usual Mainframe customers, who will use z/OS as well as z/VM. On the other hand, we will have new Mainframe customers, who will surely only use the z/VM capabilities, in order to execute thousands of zLinux and zSolaris: these customers will be web hosting companies needing many servers based on the same template, and customers quickly needing test environments. In any case, the future of the Mainframe on the market looks very good, for all the reasons I have presented. I would be very surprised if it did not take back a prominent place in a few months.


Conclusion
Mainframes are often seen as old dinosaurs doomed to disappear. However, we have seen throughout this thesis that this simplistic vision is largely incorrect.

Mainframes are machines running programs written 30 years ago, and that is what makes them so interesting: with this platform, companies can capitalize on their existing infrastructure and do not lose the money invested over many years, notably in old critical COBOL applications. At the same time, they can use it for recent programs written in Java, thus benefiting from both old and modern applications. The System z9 is still the preferred machine in major infrastructures for its reliability, availability, serviceability and security, and the world still needs it. Companies know they can count on this platform in case of Disaster Recovery, which would not be the case with other technologies. In addition, a migration to UNIX systems would be far too expensive, in terms of hardware as well as software. The early death of Mainframes is thus a utopia.

We have seen that the hardware of the System z9 meets high requirements and is the only one capable of providing an availability of 99.999%. In addition, its specialized processors not only save money but also improve the distribution of the various workloads according to their nature (Java, DB2, XML, etc.). The technologies used under z/OS are far from obsolete: Parallel Sysplex, GDPS and Copy Services offer very advanced features which have no equivalent in distributed server environments. Older products such as RACF benefit from decades of innovation, making them stable and effective (EAL5 certification). The file system, which appears at first sight completely archaic, is actually very interesting, because it provides very fast read and write access, as the system has known its format since its allocation. Overall system performance is also extremely good, since a System z is often used at more than 90% by its various tasks, whose priorities are managed by WLM.

Virtualization is the new hype technology in IT environments, and System z has very significant advantages there, as it benefits from years of experience, particularly through z/VM. The Mainframe therefore seems to be the ideal platform to run Linux servers, and it is evident that it will have a decisive importance in Server Consolidation projects. We have also shown that the TCO of a Mainframe is more interesting than that of distributed servers, especially considering hidden costs such as energy and the space infrastructures need, and other considerations such as network equipment or people costs. New applications available on Mainframes, such as zSolaris, make the platform very credible, and the administration simplification may have a very positive impact on small and medium enterprises. Today, Mainframes have the ability to penetrate new markets, and their Business Class ranges can easily attract customers who never thought they could buy a Mainframe. We can therefore say that the future of the Mainframe appears to be bright.


References
Online web resources:
- www.01net.com
- http://en.wikipedia.org/wiki/Virtualization
- Interviews

Studies:
- Data Centers Challenges (IDC)
- The State of the Mainframe (Gartner)
- Online Game Market Forecasts 2007 (DFC Intelligence)
- Power Plant Report (Energy Information Administration)
- Meeting the Data Center Power and Cooling Challenge (Gartner)
- Financial and Functional Impact of Computer Outages on Businesses (University of Texas)
- Power Conservation Inside and Outside the Box: A Systemic Approach to Energy Efficient Information Management (Pund-IT)

IBM documentation and Redbooks:
- Confidential study cases
- Getting Started with SMS
- Positioning z/OS and Linux for zSeries
- IBM Health Checker for z/OS User's Guide
- Security Server RACF Security Administrator's Guide
- Introduction to z/OS and the Mainframe Environment
- Mainframe Computing and Power in the Data Center
- Why the IBM Mainframe Is an Effective Choice for Banks
- GDPS Family: An Introduction to Concepts and Capabilities
- Clustering Solutions Overview: Parallel Sysplex and Other Platforms
- IBM TotalStorage Productivity Center for Replication on Windows 2003
