| Q. I’ve heard of this tool called DFSORT. What is DFSORT? Is it any good? |
| DFSORT is a utility program supplied by IBM. The main job of DFSORT is to sort(arrange) the records in a file in some order. Apart from sorting, you can do a lot more things with DFSORT. It turns out, that DFSORT is a very useful utility. DFSORT can be used to filter out a subset of records from a dataset, and then only sort these records. Thus, DFSORT gives you to flexibility to include or omit records. You can even re-format the records - access field level data, add, edit, or delete columns. You can insert whitespace, zeros, date and time information etc. DFSORT can also do group(aggregate) operations. For example, you might want to find the department-wise break of the sum of salary of Employees in an organisation. EmpName DeptID Salary --------------------------------------- Andy 10 1000 | Bob 10 2000 |--Total of Dept 10 = 6000 Cathy 10 3000 | Dominic 20 4000 |--Total of Dept 20 = 9000 Edward 20 5000 | Frank 30 6000--Total of Dept 30 = 6000
So, in other words, you can do analysis of a given set of raw input data, produce findings or conclusions and present them in the form of a report to the top-level managers. Hence, DFSORT is not just a sort utility, it is a complete analysis and reporting tool from IBM. You can convert VB Datasets to FB and vice versa.
|
| Q. Can you show Simple JCL/Batch to Sort the records in Input Dataset using DFSORT Utility? |
| Consider the following Input Dataset CAA0176.DEMO.SORTIN. | ----+----1----+----2----+----3----+--- QUASAR 23423423434534344 1000 10 RAHUL 13342345345345345 2000 20 VEERA 34535345325354324 3000 20 JAYAN 98347385385933987 4000 30 MADHAV 67575789769876785 5000 30 MEGHA 50830948530859340 6000 30 SNEHAL 30495849572938495 7000 40 JAINES 98347385385933987 8000 40 | Suppose we want to sort the records in this input dataset by the NAME Field. The NAME Field starts at position 1 and it has length = 6. The data we want to sort is alphabetic(CH), and we want the records to be sorted in increasing/ascending order. So, SORT FIELDS Control Card should be SORT FIELDS=(1,6,CH,A). The batch JOB/JCL for sorting the records in ascending order of NAME will be
| ----+----1----+----2----+----3----+----4----+----5----+----6----+----7-- //JOB01 JOB A123,QUASAR,NOTIFY=&SYSUID //STEP01 EXEC PGM=SORT,REGION=1024K //SORTIN DD DSN=CAA0176.DEMO.SORTIN,DISP=SHR //SORTOUT DD DSN=CAA0176.DEMO.SORTOUT,DISP=(OLD,KEEP,KEEP) //SYSIN DD * SORT FIELDS=(1,6,CH,A) // | The unarranged input records are in the dataset CAA0176.DEMO.SORTIN The output records are in the dataset CAA0176.DEMO.SORTOUT(which I've already created through ISPF) This is how the Output Dataset CAA0176.DEMO.SORTOUT shall look -
| ----+----1----+----2----+----3----+ JAINES 98347385385933987 8000 40 JAYAN 98347385385933987 4000 30 MADHAV 67575789769876785 5000 30 MEGHA 50830948530859340 6000 30 QUASAR 23423423434534344 1000 10 RAHUL 13342345345345345 2000 20 SNEHAL 30495849572938495 7000 40 VEERA 34535345325354324 3000 20 | |
| Q. Hey that was a piece of cake! You’ve put up the title as SORTING by Multiple Fields. What’s that? How does it work? |
A simple way to understand this is as follows – Let’s say you sorting a file as in the above example. What happens if two people have the same identical first-names? In other words, the field/key by which sorting is done could be same for two or more rows. In this case, how we arrange the rows. Well, the simple solution is to specify yet another field/key – Secondary Key. When the primary key field is the same for two rows R1 and R2, then the secondary key field comes into play. The order of the rows R1 and R2 in the output will then be decided by the secondary key field/value. Consider the following input dataset AgentID NAME Country State City Sales | ----+----1----+----2----+----3----+----4----+----5----+----6----+ AG10001 RAMESH CHANDRA GUHA INDIA MAHARASHT MUMBAI 1000 AG10002 QUASAR SHABBIR CHUN INDIA MAHARASHT PUNE 2000 AG10003 VEERAMANI DARYAGANJ INDIA MAHARASHT NAGPUR 3000 AG10005 DELPHAN FERANDAZ INDIA TAMIL NAD CHENNAI 4000 AG10005 PADMAVATHI SUNDARAM INDIA TAMIL NAD SALEM 5000 AG10006 NEELESHWARI IYER INDIA TAMIL NAD ERNAKULAM 6000 AG10007 AJAY NANUBHAI PARSA INDIA WEST BENG KOLKATA 7000 AG10008 MEHUL POPATLAL SUTA INDIA WEST BENG KHARAGPUR 8000 AG10009 KRITHIKA RAMANUJAM INDIA WEST BENG SUNDARBAN 9000 AG10010 ANKUR CHOUDHARY INDIA PUNJAB AMRITSAR 10000 AG10011 HIREN NITIN SHETH INDIA PUNJAB JALANDHAR 11000 AG10012 DARSHAN SHAH INDIA HARYANA CHANDIGAR 12000 | This is the Input Dataset. The above data depicts the sales of Insurance Agents in different cities. For a given state around 2-4 agents cover all major cities. The top-level manager has asked for a report that shows the sales of different agents arranged in the alphabetical order of the State. We would write the SORT FIELDS Control Statement as, SORT FIELDS=(40,10,CH,A). However, there are several records(Agents) in the same state. Then, we can sort them by their CITY. SORT FIELDS=(40,10,CH,A,50,10,CH,A) The Batch JOB/JCL for sorting the records by State, City is
| ----+----1----+----2----+----3----+----4----+----5----+----6----+----7-- //JOB02 JOB A123,QUASAR,NOTIFY=&SYSUID //STEP01 EXEC PGM=SORT,REGION=1024K //SORTIN DD DSN=CAA0176.DEMO.SORTIN,DISP=SHR //SORTOUT DD DSN=CAA0176.DEMO.SORTOUT,DISP=(OLD,KEEP,KEEP) //SYSIN DD * SORT FIELDS=(40,10,CH,A,50,10,CH,A) // | Upon submitting the above the Batch JOB/JCL, we get the following Sorted Output in the file CAA0176.DEMO.SORTOUT => | ----+----1----+----2----+----3----+----4----+----5----+----6----+ AG10012 DARSHAN SHAH INDIA HARYANA CHANDIGAR 12000 AG10001 RAMESH CHANDRA GUHA INDIA MAHARASHT MUMBAI 1000 AG10003 VEERAMANI DARYAGANJ INDIA MAHARASHT NAGPUR 3000 AG10002 QUASAR SHABBIR CHUN INDIA MAHARASHT PUNE 2000 AG10010 ANKUR CHOUDHARY INDIA PUNJAB AMRITSAR 10000 AG10011 HIREN NITIN SHETH INDIA PUNJAB JALANDHAR 11000 AG10005 DELPHAN FERANDAZ INDIA TAMIL NAD CHENNAI 4000 AG10006 NEELESHWARI IYER INDIA TAMIL NAD ERNAKULAM 6000 AG10005 PADMAVATHI SUNDARAM INDIA TAMIL NAD SALEM 5000 AG10008 MEHUL POPATLAL SUTA INDIA WEST BENG KHARAGPUR 8000 AG10007 AJAY NANUBHAI PARSA INDIA WEST BENG KOLKATA 7000 AG10009 KRITHIKA RAMANUJAM INDIA WEST BENG SUNDARBAN 9000 | |
| Q. Well that was quite a comprehensive example.. What do you mean by Merging two datasets? Is is the same as just concatenating them together? |
| Suppose we have 3 input files which are sorted. We would like to merge them, in such a way, that resultant output dataset is also sorted. You can merge two input datasets, and produce a single output dataset, such that the output is sorted using DFSORT tool. For merging, instead of using SORT FIELDS Control card, we will now use MERGE FIELDS Control Card. Remember that, for merging the records in datasets, the records in the Input Datasets should already be sorted(they should be in the right sequence). (If it is not so, then DFSORT throws ABEND Code 16) INPUT DATASET 1=>CAA0176.DEMO.SORTIN01 | ----+----1----+----2----+----3 00005 QUASAR CHUNAWALLA 00008 RAMESH CHANDRA GUHA | INPUT DATASET 2=>CAA0176.DEMO.SORTIN02 | ----+----1----+----2----+----3 00004 AHILYABAI SALVI 00009 RAMABAI PESHWA | INPUT DATASET 3=>CAA0176.DEMO.SORTIN03 | ----+----1----+----2----+----3 00003 AMEYA BOBADE 00007 A.R. RAHMAN | We can merge the above records, and taking the key field as EMPID, which starts from column 1 and has length 5. So, the MERGE FIELDS control statement would be : MERGE FIELDS=(1,5,CH,A) The Batch JOB/JCL for merging the above files looks like this : | ----+----1----+----2----+----3----+----4----+----5----+----6----+----7-- //JOB04 JOB A123,QUASAR,NOTIFY=&SYSUID //STEP01 EXEC PGM=SORT,REGION=1024K //SORTIN01 DD DSN=CAA0176.DEMO.SORTIN01,DISP=SHR //SORTIN02 DD DSN=CAA0176.DEMO.SORTIN02,DISP=SHR //SORTIN03 DD DSN=CAA0176.DEMO.SORTIN03,DISP=SHR //SORTOUT DD DSN=CAA0176.DEMO.SORTOUT,DISP=(OLD,KEEP,KEEP) //SYSPRINT DD DSN=CAA0176.DEMO.JOBLOG,DISP=(OLD,KEEP,KEEP) //SYSIN DD * MERGE FIELDS=(1,5,CH,A) // | Upon submitting the above Batch JOB/JCL, we get the following output sorted file : | ----+----1----+----2----+----3 00003 AMEYA BOBADE 00004 AHILYABAI SALVI 00005 QUASAR CHUNAWALLA 00007 A.R. RAHMAN 00008 RAMESH CHANDRA GUHA 00009 RAMABAI PESHWA |
|
| Q. How do you extract a subset of records from an Input Dataset using DFSORT? Like take out all records having sales > 5000 into another separate file. |
| For Records Selection, we have to use the INCLUDE/OMIT Control Statements of the DFSORT Utility. Inside the INCLUDE/OMIT Control cards, we can filter out particular rows, by specifying various conditions. Thus, you can first extract a subset of records using the INCLUDE COND or OMIT COND Control Statement and then sort only these particular records using SORT FIELDS, instead of the entire file. Suppose we have the following input Dataset, from which we would like to filter out records. | ----+----1----+----2----+----3----+----4----+----5----+----6----+----7-- AG10001 RAMESH CHANDRA GUHA INDIA MAHARASHT MUMBAI 01000 AG10002 QUASAR SHABBIR CHUN INDIA MAHARASHT PUNE 02000 AG10003 VEERAMANI DARYAGANJ INDIA MAHARASHT NAGPUR 03000 AG10004 DELPHAN FERANDAZ INDIA TAMIL NAD CHENNAI 04000 AG10005 PADMAVATHI SUNDARAM INDIA TAMIL NAD SALEM 05000 AG10006 NEELESHWARI IYER INDIA TAMIL NAD ERNAKULAM 06000 AG10007 AJAY NANUBHAI PARSA INDIA WEST BENG KOLKATA 07000 AG10008 MEHUL POPATLAL SUTA INDIA WEST BENG KHARAGPUR 08000 AG10009 KRITHIKA RAMANUJAM INDIA WEST BENG SUNDARBAN 09000 AG10010 ANKUR CHOUDHARY INDIA PUNJAB AMRITSAR 10000 AG10011 HIREN NITIN SHETH INDIA PUNJAB JALANDHAR 11000 AG10012 DARSHAN SHAH INDIA HARYANA CHANDIGAR 12000 | Suppose we want to filter only those Agents, whose Sales are greater than 5000. Thus, we can code the INCLUDE COND as follows : INCLUDE COND=(start-col(s),length(l),format(f),operator(o),constant(c)) -or- INCLUDE COND=(field1-col,length,format(f),operator(o),field2-col,length) Now, we would like to filter all the agents based on their SALES Figure. The SALES Column begins from position s=60 and is of length l=5. The Sales field, we assume is of type textual/character. We shall talk about numeric comparisions later. Since, we want Agent-SALES > 5000, Operator o = Greater than(GT), and constant c=5000. Character(string) constants are given as c’<constant>’. So, the constant would be c’05000’. If you are wondering why we couldn’t write simply c’5000’ has to do with the type of comparision. Okay, so the INCLUDE COND Control statement for the above filter would be - INCLUDE COND=(60,5,CH,GT,C'05000') The Batch JOB/JCL for the above task of filtering out records would be like this - | ----+----1----+----2----+----3----+----4----+----5----+----6----+----7-- //CAA0176H JOB A123,QUASAR,NOTIFY=&SYSUID //STEP01 EXEC PGM=SORT,REGION=1024K //SORTIN DD DSN=CAA0176.DEMO.SORTIN,DISP=SHR //SYSPRINT DD SYSOUT=A //SORTOUT DD DSN=CAA0176.DEMO.SORTOUT,DISP=(OLD,KEEP,KEEP) //SYSIN DD * SORT FIELDS=COPY - INCLUDE COND=(60,5,CH,GT,C'05000') // | Upon submitting the above Batch JOB/JCL, we get the following output - | ----+----1----+----2----+----3----+----4----+----5----+----6----+ AG10006 NEELESHWARI IYER INDIA TAMIL NAD ERNAKULAM 06000 AG10007 AJAY NANUBHAI PARSA INDIA WEST BENG KOLKATA 07000 AG10008 MEHUL POPATLAL SUTA INDIA WEST BENG KHARAGPUR 08000 AG10009 KRITHIKA RAMANUJAM INDIA WEST BENG SUNDARBAN 09000 AG10010 ANKUR CHOUDHARY INDIA PUNJAB AMRITSAR 10000 AG10011 HIREN NITIN SHETH INDIA PUNJAB JALANDHAR 11000 AG10012 DARSHAN SHAH INDIA HARYANA CHANDIGAR 12000 | Note : Here CH specifies that the field SALES is a Character(textual) item. BI will be used if you want to compare it in Binary format. Similarly, ZD and PD are used Zoned Decimal and Packed Decimal.
|
0 comments:
Post a Comment