------------------------------------------------------
Week 10 Notes for DAT2330 - Ian Allen - idallen@ncf.ca
------------------------------------------------------

Remember - knowing how to find out the answer is more important than
memorizing the answer.  Learn to fish!  RTFM!  (Read The Fine Manual)

====================================
MVS - O/S 390 - Job Control Language
====================================
-IAN! idallen@ncf.ca
Winter 2003

These notes are largely based on lectures by Harold Smith:

- DAT2330 - Harold Smith Class - Thu Jan 20, 2000
- DAT2330 - Harold Smith Class - Wed Jan 26, 2000

Intro to MVS JCL
----------------

Bring your MVS textbook to all classes and labs
- only read enough from the book to do the homework!  (Don't read it all!)
- book is a reference manual; not a tutorial
- note which things are important for this course, and which are not
- see the file MVSweekly.txt for the statements and parameters each week
- I'm not teaching in class everything you need to know
- I concentrate on things that are difficult to understand
- you must learn the basic things (e.g. continuation rule in Chapter 4)

See the chapter1-4guide.txt file before reading Chapters 1-4
- You must study Chapters 1-4 to learn the precise meanings of the terms
  as they come up in the JCL.
See the guide and read these pieces:
- Introduction (Chapter 1)
- MVS Concepts and Vocabulary (Chapter 2)
- JCL Within a Job (Chapter 3)
- JCL Statement Formats and Rules (Chapter 4)
- Don't worry about terms not covered in the lectures, labs, and homework

MVS vs OS/390
- MVS - old name dating back to the 1970s
- OS/390 is the new name for MVS

MVS - The SHOCK of JCL (page 1)
- JCL dates from an era where machines were expensive and people cheap
- MVS is still the O/S of the future for big mainframes

Examine the JCL on page 1
- author objects to amount of JCL used to copy a file
- not all the complexity is superfluous - MVS has more control
- MVS has good reasons for the complexity - you get very fine control
  - multi-user, need to bill to particular programmers/users
    - cost recovery
  - special input/output requirements
  - detailed (and thus complex) control over resource allocation
  - priority relative to other jobs (can't do in Windows)
- give a good name to the job, e.g. SMITHPAY instead of RT83838 (8 char)
  - need to name jobs when there are many of them
- give a readable job title in quotes
- execute a program - might have many parameters on this statement
  - suppose I submit a program that might not run properly
  - I'm not watching it (it's a batch job)
  - it might eat the machine
    - e.g.
      suppose it loops for hours and bankrupts my account
  - Solution: specify a parameter to EXEC to limit cpu time
  - all these parameters make the JCL look complex
- MVS calls files datasets
- SYSOUT=A controls the type of printing (spool) queue
  - Windows has very primitive spool control (in system tray)
  - MVS JCL can specify to decollate and burst carbon-copy fan-fold
- DISP=(SHR,PASS) ; can specify whether or not to share the dataset
- DISP=(NEW,CATLG) ; you can catalog everything, including offline storage;
  you can have a pointer to every file on every tape and disk
- UNIT=SYSDA, UNIT=SYSSEQ ; what type of output medium (disk or tape)
- Blocking (necessary to avoid wasting space on output device):
  - bit->byte(character)->field->record->dataset(file)->database
  - Can put multiple logical records into one physical disk block.
  - Gaps between blocks on disk/tape may be several hundred bytes long
    - poor usage of space
  - MVS lets you adjust block size
    - logical record vs. physical record
    - blocking factor (number of grouped logical records)
    - how many logical records per physical block?
      - if your record is 80 bytes, how many fit in a 4K block?
      - if your record is 100 bytes, how many fit in a 4K block?
- SPACE= ; how to allocate space on output device
  Suppose I know that a file will grow to some size; I can't reserve space
  in Windows - files fragment.  MVS will let me reserve size; if the job
  overflows, I specify how much new space to allocate.  We usually care
  about allocation, since that's why we use MVS.  (You really care about
  allocation when you're allocating tens or hundreds of millions of data
  items!)
- the MVS IDCAMS utility can copy a file (among other things)
  - IDCAMS has half a dozen operations!
  - IDCAMS is a "DEBE - Did Everything But Eat" program
  - Unix would have separate commands for the functions in IDCAMS
- need a way to pass program options to programs such as IDCAMS
  - Windows: can specify moves and copies depending on shift and ctrl keys!
    - behaviour also depends on destination!
    - is that intuitive and user-friendly?
  - Unix: uses command-line flags
  - DOS: uses command-line switches
  - OS/390: IDCAMS reads its options from a file (dataset)
    - that "file" can be read from the JCL input stream itself, similar to
      a Unix shell "Here Document", hence "instream" data
    - the /* indicates the end of the instream data (like EOF in Unix)
    - the SYSIN DDname specifies the dataset that IDCAMS will read to get
      its control statements
    - REPRO - specifies what thing IDCAMS will do (of the many)
      - REPRO must be indented one or more spaces (see text)
      - uses the JCL DDnames of the input file and the output file,
        instead of using the file names themselves
      - you can choose your own REPRO DDnames as long as they match your
        JCL DDnames (follow the syntax and length restrictions)
- commas at end of third field continue lines to next card
  - Rules for Continuation: see section 4.5 on p.57
  - embedded blanks either cause JCL errors or cause trailing items to
    be taken as "comments" and ignored
  - BE CAREFUL WITH BLANKS!
  - un-learn your English habit of putting blanks after commas!
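
Sketch of a copy job (NOT the textbook's page 1 JCL): the points above
(DDnames, instream control statements ended by /*, continuation after a
comma with no embedded blanks) might fit together roughly as shown below.
The job name COPYJOB, the DDnames INDD and OUTDD, the dataset names, and
the SPACE and DCB numbers are all invented for illustration; the unit name
SYSDA and output class A are defined locally, so check with your shop.

```jcl
//COPYJOB  JOB (11347),'COPY A FILE',CLASS=A,MSGLEVEL=(1,1)
//*  ONE STEP: RUN IDCAMS TO COPY ONE DATASET TO ANOTHER
//COPY     EXEC PGM=IDCAMS
//*  IDCAMS MESSAGES GO TO THE CLASS "A" SPOOL QUEUE
//SYSPRINT DD SYSOUT=A
//*  INPUT DATASET (HYPOTHETICAL NAME), SHARED WITH OTHER JOBS
//INDD     DD DSN=SMITH.COPY.INPUT,DISP=SHR
//*  OUTPUT DATASET: EACH LINE ENDS IN A COMMA, SO IT CONTINUES
//OUTDD    DD DSN=SMITH.COPY.OUTPUT,DISP=(NEW,CATLG),
//            UNIT=SYSDA,SPACE=(TRK,(5,1)),
//            DCB=(RECFM=FB,LRECL=80,BLKSIZE=4080)
//*  INSTREAM CONTROL STATEMENTS; REPRO IS INDENTED, /* ENDS THE DATA
//SYSIN    DD *
  REPRO INFILE(INDD) OUTFILE(OUTDD)
/*
//
```

Note that REPRO names the DDnames INDD and OUTDD, not the dataset names;
renaming the DD statements means renaming them in the REPRO line as well.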

Text: Virtual Storage (p.17 has an error; fix the error)
- program cannot be loaded from disk to disk; must go through memory
- part of the program will page to disk if memory fills up
- disk is not the same as virtual storage
- could run a 5MB program in 2MB, with 3MB on swap
- bringing missing pages into memory is a Page Fault
  - Windows loses track of pages - invalid page fault
- Windows swap file name: win386.swp
- the swap file in MVS is named the: PDS - Page Data Set
  - PDS is hierarchical: stepped on fast and slow disk

--------------------
IBM Mainframe Jargon
--------------------

- SYSGEN - system generation, installation, and tailoring (naming things)
  - a very complex process done *ONCE* when first installing OS/390
- main storage (memory)
- DASD - direct access storage device (hard disk)
  - "hard disk" not used because mainframes only had hard drives
- "random access devices"
  - "random" is a poor business word, says IBM
  - call them "direct access" instead
- "floppy disk"
  - "floppy" isn't a good business word, says IBM
  - call them "diskettes" instead
- Application Programmers (you) vs System Programmers vs Operators
  - you write applications (e.g. COBOL), not system programs (e.g. Assembler)
  - you may have to ask the System Pgmr who installed MVS about some things
  - NOTE: in Windows and Unix we teach you some system programming.
    In MVS we do not attempt to teach you any system programming;
    not installation of MVS, nor tuning, nor debugging
  - MVS system Operators watch the system constantly, responding to
    requests from all the JCL
    - put paper in printer; fetch tape; cancel job; adjust forms, etc.
- DSORG - DataSet ORGanization - how data is stored on storage media
  - you must know: physical sequential, fixed, fixed with blocking
    (IBM has many others, e.g. VSAM, ISAM, partitioned, etc.)
  - not well supported by O/S in Windows or Unix (mostly sequential)
- EBCDIC vs ASCII
  - MVS uses the EBCDIC character set!
  - ASCII and EBCDIC files have no printable characters in common
- SMS - Storage Management Subsystem - added extension to manage storage
  - if SMS is installed, it affects how the JCL is coded
  - may allow you to have more "defaults" and specify less
  - e.g. with SMS you can say BLKSIZE=0 to get the "best" blocking factor
  - we will assume SMS is *NOT* available and we will code everything

---------------------------------------------------------
The Compile Link and Go (CLG) process (p.44) - Figure 3.5
---------------------------------------------------------

- study well the critical Compile, Link-Edit, GO (C.L.G.) sequence
- memorize the important diagram on p.44 (edit, compile, link, go)
- you should be somewhat familiar with this process
  - how much did your program development environment hide?
  - MVS "PROCs" (canned JCL procedures) hide some of it; but, you have
    to understand the process to code JCL for it
- MVS program development cycle: edit, compile, link, and GO; repeat
  - this task will make up most of your future MVS job
  - 1) compile source into object code
  - 2) link-edit the object and subroutine libraries into a load module
    - why have link libraries?
      - could supply *all* the source and eliminate the linkage editor?
      - don't reinvent the wheel (VSAM access code is a subroutine)
      - compiling all that extra code is expensive
      - compiled subroutines are already translated, ready to link
  - 3) load and run your compiled program
    - using extensive JCL resource limits and specifications
- we will learn to code the CLG process under MVS

Review these O/S components used when building a program:
  boot, shell, task mgmt, memory, CPU, i/o, security, net, utilities, help

- need bootstrap loader to first load your O/S (IBM calls it IPL)
  - don't load all of O/S at boot; load KERNEL or NUCLEUS
  - might not include all device drivers in kernel (might not have choice!)
- need "shell" (command interpreter) to find and start up the compiler
- need memory management (virtual memory) to allocate memory for the
  compiler
- O/S needs to allocate some CPU cycles to run the job
  - task manager will schedule CPU cycles for your job
- compiler calls on O/S to do I/O to read the program to compile
  - device drivers let O/S access physical devices
- need linkage editor (linker) to link your compiled code with already
  existing code in libraries
- after linking, the "shell" loads the linked program itself and executes it
  - this is called the "GO" step

Review these O/S terms used when testing a program:
  kernel, batch vs. interactive processing, spooling, batch files (procs)

------------
MVS Job Flow - what is it like to be an MVS programmer?
------------

- see MVS Job Flow diagram ("MVS Job Flow" on course home page)
- Example: update an existing COBOL program
- sit at a PC (emulating a 3270 terminal) and connect to MVS via TSO/ISPF
  (Time-Sharing Option, Interactive System Productivity Facility)
- edit COBOL program on screen (interactive; but, not graphical!)
- prepare JCL and maybe the COBOL source "in stream"
- send job JCL from TSO to JES (Job Entry Subsystem)
  - JES is the doorway from the interactive to the batch world of MVS
  - JES checks the JCL and queues it based on resources and priority
  - your job goes in a queue
  - you wait milliseconds or hours for the resources you requested to
    become available and allocated to you
- your JCL may include a predefined "PROC" (JCL Procedure)
  - system's stored JCL procedures come from dataset SYS1.PROCLIB
  - you can "pull a PROC" and merge your JCL with it
  - you only specify the essential things that differ from the PROC
  - less code is better code!
  - see example progtstjcl.txt for a one-step job using a PROC
- JES loads your program into an Application Program Region (APR)
  - virtual storage space
  - you can pull in system programs from dataset SYS1.LINKLIB
- your program can access tape and disk (datasets)
- your output is not sent directly to the printer; it is queued in a
  "SPOOL" queue (Simultaneous Peripheral Operations On Line)
  - JES decides when resources are available to print your output
  - your output may be routed back to your TSO screen for viewing
- Summary:
  0) Edit your source on-screen via TSO
  1) pull a JCL "PROC" to do the traditional CLG steps
  2) CLG PROC pulls compiler, linkage editor from SYS1.LINKLIB
  3) the "GO" step pulls my program off DASD and executes it (need a loader)
  4) my output goes to the output spool queues that I have chosen
  5) spool queue may lead to printer, or back to the TSO screen

Learning JCL
- p.4: look at job streams and imitate (copy existing JCL)
- front of textbook p.xi has all the parameters
  - this table is condensed and very easy to use (much faster than index)
- JOB statement is always first; it starts the JOB stream
- EXEC indicates a new STEP - a step is an execution of a program
  - a job may have many EXEC and run many programs (like a shell script)
- DD statements attach I/O devices to the program named by EXEC
  - the name to left of keyword "DD" is the Data Definition name (DDname)
  - MVS programs do not open real file names; they open DDnames instead
    and depend on the JCL to map the DDname to a real file name
  - DD names are chosen by the programmers who wrote the programs
    - you will find them mentioned in the program source
    - e.g. fopen("SYSPRINT","w")
    - thus, the DDnames in the JCL *must match what is in the program*
  - each part of a DDname is limited to 8 characters; but, you may have
    several parts, e.g. GO.SYSIN, COBOL.SYSPRINT, etc.
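
Sketch of the JOB / EXEC / DD structure described above: one JOB statement,
two steps (two EXEC statements), and DD statements that map each program's
DDnames to real datasets.  Everything specific here is hypothetical: the
job name SMITHRPT, the step names, the programs RPTPGM and PRTPGM (assumed
to open the DDnames shown and to be available in a system library), and
the dataset name.  It only illustrates the layout.

```jcl
//SMITHRPT JOB (11347),SMITH,CLASS=A
//*  STEP 1: RPTPGM IS ASSUMED TO OPEN DDNAMES INFILE AND SYSPRINT
//MAKERPT  EXEC PGM=RPTPGM
//SYSPRINT DD SYSOUT=A
//INFILE   DD DSN=SMITH.SALES.DATA,DISP=SHR
//*  STEP 2: A DIFFERENT PROGRAM; DDNAME SYSPRINT IS REUSED HERE,
//*  AND THE SAME DATASET IS READ UNDER A DIFFERENT DDNAME (RPTIN)
//PRTRPT   EXEC PGM=PRTPGM
//SYSPRINT DD SYSOUT=A
//RPTIN    DD DSN=SMITH.SALES.DATA,DISP=SHR
//
```

The dataset name appears only in the JCL; the programs know only their
DDnames, which is why the same dataset can be read as INFILE in one step
and as RPTIN in the next.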

Flow Diagrams
-------------

- you may find it helpful to draw "flow diagrams" before coding the JCL
- flow diagrams use boxes to show each step (each EXEC) in the JOB
- they show how the steps work together
  - one step's output may become next step's input
- you use arrows to show each input and output stream for each step
- show the DDnames of each input and output by putting the DDnames on the
  arrows connecting the boxes to the storage devices (disk, tape)

Flow Diagram for PAYROLL - a Three Step Job (no PROC used)
----------------------------------------------------------

Example: Flow Diagram (payrollflow.jpg) of "PAYROLL" JOB (11347),SMITH

- the JOB name is PAYROLL
  - Box: put job name PAYROLL as the title of this flow diagram
- this JOB has three steps (three EXEC statements) that will be
  represented by three boxes on this diagram page
- draw a solid box for each of the three EXEC steps:
  - programs (PGM=) get solid line boxes
  - cataloged procedures (PROC=) get dotted boxes (see PROGTST, below)
  - compare files payrollflow.jpg and progtstflow.jpg
- arrows between items in diagram specify the I/O done by this step
  - arrows connect boxes (programs) to dataset symbols (disk, tape, output)
  - direction of the arrow indicates Input (arrow leads toward box) or
    Output (arrow leads away from box)
  - labels on the arrows correspond to the JCL DDnames of those datasets
    (the DDnames must match the names inside the program used in this step)

- the first of the three EXEC steps in this job is named CRETAP
  - draw a solid line box for this job step
  - put the step name CRETAP in the top left of the box
  - put the name of the program being executed (PGM=) in this step (IDCAMS)
    in the middle of the box
  - Arrows: specify the I/O done by this step
    - four DD statements mean four arrows, one for each DDname
    - two input arrows labelled with DDnames SYSIN and SYSUT1
    - two output arrows labelled with DDnames SYSPRINT and SYSUT2
  - DDname SYSPRINT:
    - very common DDname for printed output
    -
      likely for IDCAMS error messages
    - goes to SYSOUT in the "A" output class
      - what output class "A" means is defined locally
      - see your sysadmin for the list of classes
  - DDname SYSUT2:
    - has DISP=(NEW,PASS) therefore it is an output DDname
    - the dataset name is given as DSN=PAYDATA
    - in the real world, the name will likely be much more complex,
      requiring your account name and other details, e.g.
      ER93754.GRP8824.PRJ92.PAYDATA
  - DDname SYSIN:
    - IDCAMS reads control records from SYSIN to know what to do
      - e.g. to choose the REPRO function (a copy operation)
    - instructions to IDCAMS are short and usually embedded directly
      into the JCL as "instream" data using the "DD *" syntax
    - Oval: an oval in a Flow Diagram indicates instream data
      - label this oval as "control statements"
  - DDname SYSUT1:
    - more "instream" data included right in the JCL stream
    - instream data has a restrictive, punch-card-like format
    - Oval: this data is also represented as an oval
      - label it as "payroll data"
        (you have to know what the job step does to label its data)

- the second of the three EXEC steps in this job is named PRTCHKS
  - Box: put the step name in box along with name of what program or
    procedure is being executed in this step (PAYROLL1, here)
  - Arrows: specify the I/O done by this step
    - three DD statements mean three arrows
    - one input arrow labelled with DDname PAYIN
    - two output arrows labelled with DDnames SYSPRINT and CHKOUT
  - DDname SYSPRINT:
    - goes to SYSOUT
    - is it the paycheques being printed(?)
      - no - the next DD statement would seem to be the cheques
      - SYSPRINT DDname is probably for errors or a log file; not cheques
    - can re-use same DDname in different steps
  - DDname CHKOUT:
    - SYSOUT=B - goes to the output spool queue named "B"
    - this is not the usual "A" queue with cheap paper; queue "B" is
      probably a special printer with special paper for the cheques
    - you must consult your local print queue definitions to find out
      how your MVS system was configured at SYSGEN time
  - DDname PAYIN:
    - DD statement mentions as input DSN=PAYDATA, the same dataset that
      was output in the previous (CRETAP) step
    - each DSN must be unique (so the names are usually much longer!)

- the third of the three EXEC steps in this job is named PRTREG
  - print the payroll register created in step 1?
  - Arrows: specify the I/O done by this step
    - three DD statements mean three arrows
    - one input arrow labelled with DDname PAYIN
      - can use the same DDname in different steps; because, different
        steps run different programs and the programs may or may not
        use the same DDnames internally
    - two output arrows labelled with DDnames SYSPRINT and REGOUT
  - DDname SYSPRINT:
    - this would appear to be an error listing?
    - can re-use same DDname in different steps
  - DDname REGOUT:
    - probably the payroll register, sent to yet another spool queue "C"
    - a limit is placed on the number of print lines
      - if you generated a million lines of output, you would get it,
        and you would be billed for it!
  - DDname PAYIN:
    - this is, again, the same dataset generated by step 1
    - we read it a second time, using a different program this time
    - the DISP indicates the dataset exists (OLD) and that we should
      put it into the system catalog when the step ends
- the end of the JCL is indicated by //

----------------------------------------------------------
Flow Diagram for "PROGTST" - a "one-step" JOB using a PROC
----------------------------------------------------------

JCL allows "macros" of stored JCL called "Procedures" to be called up
from system libraries in EXEC statements using the "PROC=" syntax.
The PGM= syntax calls a single executable program; PROC= calls up a
canned procedure that may contain many internal program steps.

Using pre-defined JCL in your shop:
- a procedure (PROC) is used for long and/or complex JCL
  - the system's COBOL CLG PROC is maybe 50-500 lines of JCL!
- use the existing PROC in your shop - don't re-invent it!
- see the end of Chapter 3 for an example of system PROC usage

These procedures ("PROC"s) may contain many internal steps (many internal
EXEC statements).  You may or may not be told about all the internal
steps hidden inside a PROC.

Example: Flow Diagram (progtstflow.jpg) of "PROGTST" JOB (11348),SMITH

In the PROGTST job example, a PROC named "COBCLG" is being called up.
This PROC has three internal steps; but, in the particular JCL we are
given we can only detect the names of two of the steps: "COBOL" and "GO".

Use a dotted box to draw a PROC step.  (Use a solid box for a PGM step.)
DDnames for PROC steps must also be labelled with the internal step name
inside the PROC in which the DDname is used - more on this later.

Mnemonic Rule: Dotted PROC boxes need dotted DDnames on the arrows!
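
Sketch of what a "one-step" job calling the COBCLG PROC might look like.
This is a guess at the general shape only, not the contents of
progtstjcl.txt: the step name TESTRUN and the DDname GO.SYSIN are
assumptions (the DDnames needed in the GO step depend on the program being
run), and the indented lines between "DD *" and "/*" are placeholders for
real instream data.

```jcl
//PROGTST  JOB (11348),SMITH
//*  ONE EXEC STATEMENT, BUT THE PROC RUNS SEVERAL INTERNAL STEPS
//TESTRUN  EXEC PROC=COBCLG
//*  DOTTED DDNAME: INTERNAL STEP NAME "COBOL", DDNAME "SYSIN"
//COBOL.SYSIN DD *
      ... COBOL SOURCE PROGRAM GOES HERE ...
/*
//*  DOTTED DDNAME: INTERNAL STEP NAME "GO", DDNAME "SYSIN"
//GO.SYSIN DD *
      ... INPUT DATA FOR THE RUNNING PROGRAM GOES HERE ...
/*
//
```

Only the step names COBOL and GO show up in the job's own JCL, which
matches the remark above that the third internal step cannot be detected
from the JCL we are given.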

--------------------------------------------------------------------------

Don't study the whole JCL book - too much stuff
- do the homework questions each week; they relate to the tests
- see the course outline for the list of things we learn each week
  - see the file MVSweekly.txt under the JCL Notes button
- go through each sample job stream, identifying parameters that you
  don't recognize and LOOK THEM UP in your textbook

Review: Compile, Link, and GO (from last class)

------------------------------------------------
JCL Syntax Practice: Find the errors in TESTPROG
------------------------------------------------

- UPPER CASE ONLY for JCL
- learn JCL by looking at existing JCL
- teach yourself JCL syntax by finding the errors
- style:
  - order of DD in a step doesn't matter (except when over-riding a PROC)
  - put output statements first (output never has instream data)
  - put input statements last (input DD may have instream data)
  - put instream data very last in the step, when possible

JOB statement for TESTPROGRAM:
- remove leading blanks
- shorten job name to 8 chars or less
- remove the blanks after commas
- put two positional parameters first
  - move CLASS=B after the positional parameters (positional always first)
- MESGLVL spelled wrong: MSGLEVEL
- Flow Diagram: title top of diagram TESTPROG

Step ONE - first EXEC statement: PGM=IDCAMS
- shorten step name to 8 characters
- Style: use meaningful 8-character job and step names, not "STEP1"
- Flow Diagram: make this a solid box containing MAKETAPE, IDCAMS
- four DD statements (after EXEC, before next EXEC in next step):
  - SYSPRINT DD
    - label the flow arrow going out to print spool queue A
      - some queues may route print to screen, not to printer
    - label SYSPRINT print sheet "error messages" from the IDCAMS program
    - "IDC AMS": Access Method Services (IDC - IBM prefix)
  - IN DD DSN=TESTDATA
    - error: DDName "IN" is used twice in same job step
    - this is a "NEW" dataset; therefore it must be output
    - rename it to OUT
      (look at the IDCAMS input stream REPRO line)
    - label arrow OUT going to TAPE
    - unit name "TAPE" is an MVS sysgen parameter (SYSSQ, CART, etc.)
      - this unit name is different for each MVS shop; ask your operator
    - next continuation line must start with slash slash blank
      - indent continuation over to line up with third column
      - but stop before column 16!
    - VOL=SER=127536 is correct syntax
      - a serial number never changes for a tape
      - DSN may give it a new name
  - SYSIN DD *
    - this is called "in-stream data"
    - input for this dataset comes from JCL input stream
    - put instream data last, so that it doesn't space the rest of the
      JCL down the page
    - use oval shape on flow diagram, label incoming arrow SYSIN
    - this SYSIN data is for IDCAMS, telling it what to do
    - label the oval "IDCAMS control statements"
      - these are equivalent to DOS flags, switches, UNIX options
      - having to put all these flags on the EXEC statement would make
        the statement too long, so we use a separate SYSIN data stream
        to control IDCAMS
  - IN DD *
    - also instream data
    - draw an oval, label it as "test data"
    - look down to next step (next EXEC) to see how data will be used
      - looks like test data for input to our COBOL program
  - Use "/*" to end instream data; must start in first column
  - JES *always* splits jobs on //XXX JOB
    - you cannot ever have //XXX JOB in instream data

Step TWO - second EXEC statement: PGM=IDCAMS
- change comment to have //* instead of //
- remove blanks around "="
- Flow Diagram: use a square, solid box titled PRTTAPE, IDCAMS
- four DD statements:
  - SYSPRINT DD
    - use this label on flow arrow going out to print spool queue A
    - remove blanks around "="
    - label this print output as "error messages" from the IDCAMS program
  - WRITE DD
    - another output to SYSOUT=A
    - draw a second flow arrow going out to another print spool queue A
    - remove blanks around "="
    - label as "tape records" (look at what IDCAMS is copying)
  - SYSIN DD *
    - make sure there is a space before the asterisk
    - draw an oval
      for instream data "IDCAMS control statements"
    - REPRO statement (IDCAMS) cannot start in column 1
    - missing /* must follow REPRO line
    - names must match DDnames: change IN to READ and OUT to WRITE
      (or, change DDname WRITE to OUT and DDname READ to IN)
  - READ DD
    - label on arrow coming from passed tape from previous step
    - DSN=TEST DATA - no blanks, 8 characters for each "qualified" name
      - qualified names can contain dots to separate components
        - similar to slashes in a directory hierarchy
      - maximum length of 5 times 8 chars in qualified names, e.g.
        DSN=ABCDE.FGH.IJKL.MNOPQ.RSTUVWXY
      - usually need to have qualified name - ask the sysadmin
        - simple name not permitted in system catalog
      - Windows limits full pathnames to 260 bytes from root to end!
      - Unix limits pathnames to 1024 bytes
    - DISP=(,PASS) must have leading comma to select positional default
      for first parameter; but, prefer to use (OLD,PASS) for clarity
      - see chapter 4 for rules on commas in parameters

Step THREE - third EXEC statement: PROC=COBOLCLG
- procedure COBOLCLG is three programs - Compile, Link, Go (see p.44)
- change comment to have //* instead of //
- remove blanks around "="
- Flow Diagram: PROC's get a dotted box, not a solid box
  - Remember: "dotted boxes need dotted DD names" on their arrows
  - dotted DD names are of form //stepname.DDname DD ...
    - stepname is the name of the internal step inside the PROC
    - DDname is the DDname used in the program executed in that step
  - label this dotted box: COMLKTST COBOLCLG
- four DD statements:
  - one DD statement is for the COBOL step (the compiler step of C.L.G.)
  - three DD statements are for the GO step (my running program)
  - nothing needed for the LINK step (use the defaults in the PROC)
  - COBOL.SYSIN DD *
    - COBOL is the stepname of the step in the PROC "COBOLCLG"
    - this DD statement is followed by instream COBOL program, ending in /*
    - IMPORTANT:
      - DD statements for a PROC must appear in same order as the steps in
        the PROC (find out the order from the manual or a sysadmin)
    - oval for instream data with arrow into PROC box labelled COBOL.SYSIN
    - oval itself is labelled "COBOL source program"
  - GO.PRTOUT DD
    - GO is the name of the "go" step in the PROC
    - GO step is where my COBOL program runs and accesses its DDNames
    - how do you find out the GO step name?
      - try something and look at JCL errors, or ask someone, or look it up
    - fix the continuation (can't continue after "="; must split after comma)
      - put OUTLIM=2000 together on the next line, or move 2000 up
    - why specify OUTLIM=2000 ?
      - because nobody is watching the spool queue and if your program
        loops and prints a million lines, MVS will print it and send
        your boss the bill!
    - arrow leading to print spool queue B
    - label the arrow with its DDname: GO.PRTOUT
    - label print queue "test output"
  - GO.DSKOUT DD
    - this must be an output dataset because of DISP=(NEW,....)
    - arrow leading out from proc labelled GO.DSKOUT
    - destination is a disk dataset, name the dataset TESTOUT
    - fix the unbalanced parentheses on SPACE
      - you control space growth requirements of the dataset
      - allocate a bit more space than you need
      - more details on the SPACE parameter in later examples
    - cannot indent continuation cards past column 16
    - must stop before column 72 (71 is last usable column)
  - GO.TPIN DD
    - arrow from TESTDATA tape leading into PROC dotted box
    - label the arrow with the DDname GO.TPIN
    - this TPIN is the input data for the running program
    - fix spelling: CATALOG --> CATLG
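
Sketch of how Step THREE might look once the corrections above are applied.
It is a fragment (the JOB statement and the first two steps are omitted)
and a reconstruction, not the answer key.  The notes give only the names
COMLKTST, COBOLCLG, the four DDnames, SYSOUT queue B, OUTLIM=2000, and the
dataset names TESTOUT and TESTDATA; the remaining values (the second DISP
subparameter on GO.DSKOUT, UNIT=SYSDA, the SPACE numbers, the first DISP
subparameter on GO.TPIN) are assumptions made for illustration.

```jcl
//*  STEP THREE: COMPILE, LINK AND RUN THE COBOL TEST PROGRAM
//COMLKTST EXEC PROC=COBOLCLG
//*  COBOL STEP DD COMES FIRST: SAME ORDER AS THE STEPS IN THE PROC
//COBOL.SYSIN DD *
      ... COBOL SOURCE PROGRAM GOES HERE ...
/*
//*  CONTINUATION SPLITS AFTER A COMMA, NEVER AFTER "="
//GO.PRTOUT DD SYSOUT=B,
//            OUTLIM=2000
//*  NEW DISK DATASET; PARENTHESES ON SPACE ARE BALANCED
//GO.DSKOUT DD DSN=TESTOUT,DISP=(NEW,CATLG),
//            UNIT=SYSDA,SPACE=(TRK,(10,5))
//*  TAPE DATASET PASSED FROM THE EARLIER STEPS; CATLG, NOT CATALOG
//GO.TPIN   DD DSN=TESTDATA,DISP=(OLD,CATLG)
//
```

Each continuation line starts with // and a blank and resumes before
column 16, and there are no blanks after the commas or around "=".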