NESUG '92 Proceedings
Host Systems and Environments

LIBERATING DATA IMPRISONED IN A VSAM GULAG

George P Sharrard, Ph.D.
Burton, Greene, Smolka, & Associates

ABSTRACT

VSAM files fill a well defined and well understood niche in the data processing world. VSAM files are the place all important corporate data is sent, never to be seen again! The dreaded 'data police' will see to it that every bit of data your marketing, advertising and strategic planning departments could ever want is, euphemistically speaking, saved. Should you ever ask for some of it back, your name will be entered into a file (generally called 'the systems enhancement request list') and you will be monitored in the future. Repeated requests to get your data back will only produce blank stares and reference to some 'COPYLIB' somewhere or, worse yet, to a particular CICS screen.

Somewhere along the way you will be interrogated as to why the hundreds of standard reports the current system provides will not meet your needs. ('Because it's the wrong information' is not an acceptable reason.)

Don't give up! SAS can set your data free!!!

This paper will present three concrete examples of how SAS can access VSAM files:

    Sequential read
    Keyed read
    Keyed sequential read

Major points as to what to look for in a COPYBOOK will also be presented.

GETTING STARTED

Some important things you need to be aware of before you enter the world of VSAM.

1. VSAM is an indexed file structure that allows you to access specific records without having to read through the entire file (just like an index on a SAS dataset).

2. Key = index.

3. There can be more than one key on a file. An additional key is sometimes called the alternate index or secondary key.

4. The VSAM files you want may be 'owned' by an on-line system. CICS or a specific application (M&D or CA are well known companies whose applications read/write VSAM files) may have an exclusive lock on the files your data is in. If this is true, you can only get at the data when the on-line system is down (or the file you want is 'closed' to the on-line system).

5. There are different kinds of VSAM files. The SAS Guide to VSAM Processing, Version 5 Edition (pub number 5605), Chapter 2, 'Overview of VSAM Concepts', has a very readable discussion of the different types of VSAM files.

6. In the COBOL world, the codebook/machine readable dictionary/file description for a VSAM file is called a copybook. Copybooks generally are saved in a PDS called COPYLIB. While you may think that the existence of a codebook will make your search for data easier, beware!! Copybooks are the stuff nightmares are made of.

7. There is nothing magic about VSAM files. After all, you usually find them in COBOL shops.

We will now look, in detail, at three very useful methods of getting at data held in a VSAM structure.

Situation 1

You work for a multinational pharmaceutical company. 10 years ago, they bought a canned application to do AP (accounts payable). This package can generate reports based on predefined fields (defined 10 years ago) and perform simple sorts. A new type of report is needed. The current staff only knows how to run the canned reports, and it is suggested that what you are requesting is a 'major systems enhancement' (lots of money, takes a long time). With an 'enhanced' level of incredulity, you request the file name(s), stating that you will do the report yourself using SAS. The manager of the AP department is at a loss. Authority has been challenged, ignorance exposed. Casting what is hoped will be a trump card, the AP manager states: 'Yes, but it's a VSAM file allocated exclusive to M940.'

so....

1. Find out when M940 is brought down. Usually, this happens after all the data entry personnel go home for the night.

2. You need every record in the AP file and don't care about its keys. A simple sequential read will do it.

3. The only difficulty is to figure out the offset (column location) of the data elements you are interested in. For the sake of this example let's say you have this information (it was in the manual).

The following code is all you need.

SEQUENTIAL READ OF VSAM FILE USING SAS

    FILENAME EXPNSIN 'LIVE.PRJV.EXP';
    DATA EXPNS;
       INFILE EXPNSIN VSAM MISSOVER LRECL=200 LINESIZE=200;
       INPUT FIRMNUM    1-4
             INVOICE    5-9
             RECNUM   $ 10-16
             RECNUM2    14-16
             BORS     $ 17
             DESC     $ 18-37
             SKU      $ 52-59
             @62 PPUNIT   PD7.2     /* packed decimal, 2 implied decimals */
             @75 TOTUNITS PD6.0;    /* packed decimal, no decimals        */
    RUN;

On the INFILE statement it is necessary to specify VSAM. If you omit the VSAM option, the MISSOVER option will not work. As it is reasonable to assume that you will encounter variable length records when reading VSAM files, it is imperative that you have access to the MISSOVER option.

Situation 2

You work at a mail order clothing company. Your marketing department conducted a telephone survey with 1,000 current customers and 500 former customers, asking about everything except the specifics of past purchases. After all, that's on the company's customer database. Why use up valuable telephone time asking about purchase information that we already store in the purchase history file (a file where each customer has one record with number of orders and $ amount spent for each of the 16 clothing groups in our catalog)?

Here's what you have:

1. The survey data has been entered and saved as a SAS file where the variable CUSTNUM (numeric length 9) is the unique customer identifier on the survey and customer database.

2. Your customer database.
It's huge (3,000,000+ records), so you do not want to merge the survey file with the customer file, as in:

    data both;
       merge survey (in=surin) housefil;
       by custnum;
       if surin;

Consider the following:

KEYED VSAM READ USING SAS

    LIBNAME SURVEY 'MKT001.GEORGE.SUR92';
    FILENAME HOUSEFIL 'LIVE.PRJV.HSF';
    DATA BOTH;
       SET SURVEY;
       KEYVAR=CUSTNUM;                      /* key value for this lookup      */
       INFILE HOUSEFIL VSAM KEY=KEYVAR FEEDBACK=SASRC;
       INPUT ORDERS 19-21 @25 (MTYPE1-MTYPE16) (6.2);
       IF SASRC = 4 OR SASRC = 16 THEN DO;
          _ERROR_=0;                        /* reset so the step can continue */
          IF SASRC=4 THEN STOP;
          ELSE DO;
             SASRC=0;
             PUT 'NO RECORD WITH KEY= ' KEYVAR;
             RETURN;
          END;
       END;
    RUN;

KEY=KEYVAR: this is how we make use of the index on the data file. Instead of reading each record sequentially and testing to see if it is one that we want, we go directly to each record that is a match for a customer in our survey. We avoid doing 3,000,000 reads and 3,000,000 customer number compares. A keyed lookup on 1,500 survey respondents takes 1,500 reads.

FEEDBACK=SASRC: we make a new variable (SASRC) and set it equal to the value of the system generated return code (FEEDBACK). SASRC's value is set after each lookup. If SASRC = 4 or 16, this is very bad! 4 is terminal and you have to STOP. 16 means your lookup failed, as the key you used was not found. In either case, reset _ERROR_ to 0 so you can continue. If SASRC is not 4 or 16, you have your data.

Situation 3

While the above code will work for a lookup where each survey matches only one record in the purchase history file, what do we do when we need to look at each order over the past 2 years (up to 9 orders for any single customer) to see the duration of time between orders? We need to retrieve DOF-ORD-DT for each order in the detailed order file for the 1,000 surveyed customers and 500 surveyed former customers. The way to get to the DETAIL-ORDER-FILE is to use the CUSTNUM to read the HOUSE-FILE.
In the HOUSE-FILE there is a variable SEGMENT-CODE that, when concatenated to the CUSTNUM, creates a partial key to the DETAIL-ORDER-FILE.

KEYED SEQUENTIAL VSAM READ

    LIBNAME SURVEY 'MKT01.JIM.SUR92';
    FILENAME HOUSEFIL 'LIVE.PRJV.HSF';
    FILENAME PURHIST 'LIVE.PRJV.PURH';
    DATA BOTH;
       SET SURVEY;
       /* Lookup 1: keyed read of the house file to get the segment code */
       KEYVAR=CUSTNUM;
       INFILE HOUSEFIL VSAM KEY=KEYVAR FEEDBACK=SASRC;
       INPUT SEGCODE $ 19-23;
       IF SASRC = 4 OR SASRC = 16 THEN DO;
          _ERROR_=0;
          IF SASRC=4 THEN STOP;
          ELSE DO;
             SASRC=0;
             PUT 'NO RECORD WITH KEY= ' KEYVAR;
             SEGCODE='XXXXX';
             RETURN;
          END;
       END;
       /* Lookup 2: partial-key read of the detail order file */
       KEYCUS=CUSTNUM || SEGCODE;
       INFILE PURHIST VSAM KEY=KEYCUS GENKEY SKIP FEEDBACK=SASRC;
       FORMAT KEYCUS2 $14.;
       INPUT VARKEY $ 41-54 SEQNUM 55-56 ORDDATE $ 227-232;
       IF SASRC = 4 OR SASRC = 16 THEN DO;
          _ERROR_=0;
          IF SASRC=4 THEN STOP;
          ELSE DO;
             SASRC=0;
             PUT 'NO RECORD WITH KEY= ' KEYCUS;
             RETURN;
          END;
       END;
       KEYCUS2=VARKEY;
       /* Read sequentially while the partial key is unchanged */
       DO WHILE (KEYCUS2=VARKEY);
          SELECT (SEQNUM);
             WHEN (1) ORD1=ORDDATE;
             WHEN (2) ORD2=ORDDATE;
             WHEN (3) ORD3=ORDDATE;
             WHEN (4) ORD4=ORDDATE;
             WHEN (5) ORD5=ORDDATE;
             WHEN (6) ORD6=ORDDATE;
             WHEN (7) ORD7=ORDDATE;
             WHEN (8) ORD8=ORDDATE;
             WHEN (9) ORD9=ORDDATE;
             OTHERWISE;                  /* ignore unexpected sequence numbers */
          END;
          INPUT KEYCUS2 $ 1-14 SEQNUM 55-56 ORDDATE $ 227-232;
          IF SASRC = 4 OR SASRC = 16 THEN DO;
             _ERROR_=0;
             IF SASRC=4 THEN STOP;
             ELSE DO;
                SASRC=0;
                PUT 'NO RECORD WITH KEY= ' KEYCUS2;
                RETURN;
             END;
          END;
       END;
    RUN;

Here we have 2 keyed lookups. First, a quick stop at the house file to get SEGMENT-CODE. SEGCODE is concatenated to CUSTNUM to form a 'partial key' to the DETAIL-ORDER-FILE (DOF). The full key to the DOF is:

    CUSTNUM || SEGNUM || SEQNUM

where SEQNUM takes values 01 - 09.

GENKEY is specified so as to allow SAS to use the partial key to find the first matching record in the DOF. SKIP tells SAS to stop using direct keyed reads and now process sequentially.

We continue to read records until a new partial key value is encountered (that's the DO WHILE part).
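One detail worth checking in the listing above is how the partial key is built. Situation 2 describes CUSTNUM as numeric; when a numeric variable is concatenated with `||`, SAS converts it to character with the BEST12. format, which right-aligns the digits and pads with leading blanks. The sketch below shows one way to build a fixed-width key instead. It assumes, purely for illustration, that the DOF key stores the customer number as 9 character digits with leading zeros followed by the 5-byte segment code; the sample values are hypothetical, and the real key layout must be confirmed against the copybook or the data.

```sas
/* Sketch only: the key layout (9 zero-filled digits + 5-byte segment code) */
/* and the sample values are assumptions, not taken from the paper.         */
DATA _NULL_;
   LENGTH SEGCODE $ 5 KEYCUS $ 14;
   CUSTNUM = 1234567;                        /* hypothetical customer number */
   SEGCODE = 'AB123';                        /* hypothetical segment code    */
   KEYCUS  = PUT(CUSTNUM, Z9.) || SEGCODE;   /* Z9. keeps leading zeros      */
   PUT KEYCUS=;                              /* KEYCUS=001234567AB123        */
RUN;
```

Declaring KEYCUS with an explicit length of 14 keeps the partial key the same width as the 14-byte VARKEY field read from the DOF, so the GENKEY comparison is made on the intended bytes.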
NEW THINK AND DOUBLE SPEAK: THE LANGUAGE OF THE COBOL COPYBOOK

COBOL boosters like to say that COBOL code is self documenting and that clear/accurate data definition is built into the copybook structure. Here are two common copybook 'features' that can confuse and mislead.

ONE DATA FIELD - MANY MEANINGS

    05  UMX-PAYMENT-ELEMENT-1             PIC X(12).
    05  UMX-ORDER-PAYMENT-1 REDEFINES UMX-PAYMENT-ELEMENT-1.
        10  UMX-PAYMENT-SEC-CNTRY-1       PIC XX.
        10  UMX-PAYMENT-SEC-1             PIC X(9).
        10  UMX-PAYMENT-SEC-CHK-1         PIC X.
    05  UMX-CASH-PAYMENT-1 REDEFINES UMX-PAYMENT-ELEMENT-1.
        10  UMX-PAY-CURRENCY-CODE-1       PIC XXX.
        10  UMX-PAY-CURRENCY-FILL-1       PIC X(9).
            88  UMX-PAY-IS-CASH-1         VALUE ' CURRENCY'.

Each 'REDEFINES' is reading the same data but giving it different meanings. It would be incorrect to count each 'field name' when establishing the offset (range of columns) for each of the variable locations.

The numbers at the far left of each line are called level numbers. They represent the hierarchy of the data structure being defined. That is, the 05 level is made up of all the level 10s that come under it and before the next 05. Level 10s can be subdivided into level 15s, 15s into 20s, and so on.

In the above example, there are only 12 columns being described. These 12 columns may be referenced as one 12 byte field (UMX-PAYMENT-ELEMENT-1), as 2 bytes - 9 bytes - 1 byte (first REDEFINES), or as 3 bytes - 9 bytes (second REDEFINES). BUT, there are only 12 bytes being described. It is as if you wrote your SAS INPUT statement to keep rereading the same columns over and over again.

Most computer systems using VSAM files have some editor which allows you to edit/browse a VSAM file. FILE-AID is a well known and extremely useful tool for doing this on an IBM mainframe. Using FILE-AID and the 'COL' line command you can often figure out your offsets without looking at the COPYBOOK (if the fields are obvious). Or 'MAP' the file, i.e.
edit the data telling FILE-AID the name of the COPYBOOK. This gives you a PROC FSEDIT like view of the file. Using the MAP option, position the field you are interested in at the top of the screen, and read the column indicator (usually in the upper right corner). If you are really adventuresome, FILE-AID option 3.8 will allow you to 'compile' the map. Once compiled, the starting and ending columns for each field are displayed.

IT'S NOT WHAT YOU THINK

    ****************************************
    ** UTABLEX - COPYBOOK FOR INDEX
    ** ENTRY TABLE - START
    ****************************************
     01  IXE-INDEX-ENTRY.
         05  IXE-KEY.
    *        FIRM NUMBER
             10  IXE-ID-FIRM        PIC S9(04) USAGE COMP.
    *        OFFICE NUMBER
             10  IXE-ID-OFC         PIC X(06).
    *        TABLE NAME (LOGICAL FILE NAME)
             10  IXE-NM-TBL-INDX    PIC X(16).
    *        ENTRY NAME (LOW VALUES FOR
    *        CONTROL ENTRY)
             10  IXE-NM-ENTR        PIC X(16).
         05  IXE-DATA.
             10  FILLER             PIC X(06).

When one checks the field IXE-NM-ENTR (use FILE-AID with MAP=ON):

    IXE-NM-ENTR 16/AN X'01F54040404040404040404040404040'

This definitely is not what is commonly described as a 'character' field. This is some type of binary stuff, which you cannot read with a $ informat. It turns out that this is a 'general purpose' file where each field means many things, depending on the value of certain other fields. For the data I wanted, I was to read only the first two bytes of IXE-NM-ENTR (hex 01F5), and I was told that these two bytes were 'signed binary'!!!??? (It turned out to be PIB2.)

COPYBOOK field descriptions must be verified. Just because this is listed as a PIC X(16) does not mean that the field contains standard printable characters.

COMP-3

    05  TOT-COST    PIC S9(5)V9(10) COMP-3.

COMP-3 is used in order to save big numbers in small spaces (it also has a number of advantages if you are going to do mathematical COMPutations on these fields). But this takes extra effort to read in SAS. First, the INFORMAT. In the IBM 370 world this would be read as:

    PD8.10

You get 8.10 by adding the (5) + (10). This makes 15; add one for the sign, which makes 16. (If adding 1 for the sign made an odd number, add one more.) Divide the 16 by 2 (as a packed field stores 2 digits in one 'column') and we find the field to be 8 columns wide. The V9 part is the implied decimal: V9(10) means 10 digits follow the decimal point, hence the .10.

REFERENCES

SAS Guide to VSAM Processing, Version 5 Edition. SAS Institute, Cary, NC, 1985.

CONTACT INFORMATION

George P Sharrard
Burton, Greene, Smolka & Associates
14 Sunwich Road
Rowayton, CT 06853
E-Mail [email protected]
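As a supplementary sketch, the two copybook checks described in the COMP-3 and 'general purpose file' discussions can be tried in a small DATA step that needs no VSAM file. The variable names (INTDIG, DECDIG, NIBBLES, BYTES, INFMT, ENTRY) are hypothetical and not from the paper. The S370FPIB2. informat is used here on the assumption that the code may be tested off the mainframe, since it reads unsigned big-endian binary on any host; on the mainframe itself PIB2. reads the same bytes the same way.

```sas
/* Sketch only: variable names are illustrative, not the author's code. */
DATA _NULL_;
   /* COMP-3 width for PIC S9(5)V9(10) */
   INTDIG  = 5;                        /* digits before the implied decimal      */
   DECDIG  = 10;                       /* digits after the implied decimal       */
   NIBBLES = INTDIG + DECDIG + 1;      /* one half-byte per digit, plus the sign */
   BYTES   = CEIL(NIBBLES / 2);        /* an odd count rounds up to a whole byte */
   INFMT   = CATS('PD', BYTES, '.', DECDIG);
   PUT INFMT=;                         /* INFMT=PD8.10 */

   /* First two bytes of IXE-NM-ENTR, X'01F5', read as unsigned binary */
   ENTRY = INPUT('01F5'X, S370FPIB2.);
   PUT ENTRY=;                         /* ENTRY=501 */
RUN;
```

Rounding up with CEIL covers the 'add one more' case in the rule of thumb: a field with an even number of digits has an odd half-byte count once the sign is added, and still occupies a whole number of bytes.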