Bulletin Board

What’s new at MAINFRAMES 360
(Updates June '10)...
- I am writing articles on CICS. The effort would be to make the material of certain standard, practical and easy to comprehend.
- JCL needs fresh look, and more weightage needs to assigned to conventions and practises adopted in Production Environments at most sites
- A note of appreciation : Thanks to all our followers, for making Mainframes360 a success!

Mainframes360 Search

Loading

Quick Links

Jump to :  

Sunday, August 2, 2009

DFSORT – Sum(Aggregation), SUM FIELDS Control Statement


Q. What is Aggregation(Summation) of rows? How Summary Reports could help top-level executives/managers?
Aggregation/Summation is the process of taking a list of fields/items and adding them up to find the Sum Total. For example, if Employee details were stored in an Input Dataset  in the format – EmpName DeptID and Salary, aggregation can be done on Salary field of the Employee Records to find the total salary of all the Employees working in the Organisation. That is, using Aggregation, you could perform the following operation -
Salary(Employee 1) + Salary(Employee 2) + ...= Total Salary
i.e. Sum(Salary of all Employees) – This operation is called Aggregation.

All business organisations generate business data. For example, a product based company generates/keeps track of Customer data, Sales Data etc. All Data is of two types –
(i)Low-level Data(Transactional data) (ii) High-level data(Aggregate data).

Suppose a telecommunication company like Vodafone wants to launch a new subscriber-based plan. The company’s top level executives need to decide on the call-rates and sms-rates. This is a strategic decision. Vodafone has a huge database of past 5 years’ call-records. In order to aid the decision-making process of the managers, and help them decide, what call-rates would be best and appropriate, the company asks its’ team of Business Analysts(BA) to conduct an Analysis on the Huge Low-level Database – which stores past 5 years data. After analysing the data, and digging deeper, the business analysts come out with findings/results – High Level Summarised Data, and suggest that the appropriate call rate would be $0.50 to $1.00 per minute. Based on these findings/conclusions, the top-level managers give the go-ahead and green signal to their plan, and decide that call-rate would $0.75 per minute. 

The essence of the above example is that, the high-level summarised results helped the top-level executives in their strategic decision making. This is far important, since strategic decisions are long-term, and they involve long-term commitment of resources. Thus, Aggregation/Summarisation is the process of extracting KNOWLEDGE(High Level Summary) from RAW DATA(Low-level Transaction Data).

Aggregation/Summarisation is a Knowledge Discovery Process. Now-a-days, you need to dig(mine) knowledge from the data.


DFSORT is the IBM Supplied utility which may used to aggregate or summarise your data.

To perform a Sum/Aggregation, in DFSORT, we must use the SUM FIELDS Control Statement. To add to this, you could divide the employees into groups – e.g. Think of all employees working in HR Department is one group, all employees working in Marketing Department could be thought of as another group. If you not only wanna know the total Salary for all employees, but also department-wise salary, i.e. you would like to know total salary given to all employees of HR Dept, or the total salary given to all employees of Marketing department, this is possible using DFSORT. You can divide the Input Data into groups, and then perform group-wise summation(aggregation) to find group totals.

Thus, you can dig deeper, and perform aggregation or summarisation at different levels. For example, you could generate
1) Report of Total World-wide Sales
2) Report of Country-wise Sales Totals – to find which country accounts for major chunk of sales
3) Report of State-wise Sales Totals in the country United States of America -
to find out which state contributes to more sales.
4) Report of City-wise Sales Totals in the State of Wisconsin

Thus, the hierarchy in the above example is World->Country->State->City.
Q. How do you write a simple Batch JOB/JCL to perform aggregation/Summation using DFSORT?
To perform aggregation/summing, you need to used the SUM FIELDS Control Statement of DFSORT. The SUM FIELDS Control Statement has the following format -

SUM FIELDS=(start-position,length,format)

For example to sum a field which at position 60, having length=5 and numeric, that is format = Zoned Decimal(ZD), we must write -

SUM FIELDS=(60,5,ZD)

Consider the following Input Dataset -

----+----1----+----2----+----3----+----4----+----5----+----6----+
AG10001  RAMESH CHANDRA GUHA INDIA     MAHARASHT MUMBAI    01000
AG10002  QUASAR SHABBIR CHUN INDIA     MAHARASHT PUNE      02000
AG10003  VEERAMANI DARYAGANJ INDIA     MAHARASHT NAGPUR    03000
AG10004  DELPHAN FERANDAZ    INDIA     TAMIL NAD CHENNAI   04000
AG10005  PADMAVATHI SUNDARAM INDIA     TAMIL NAD SALEM     05000
AG10006  NEELESHWARI IYER    INDIA     TAMIL NAD ERNAKULAM 06000
AG10007  AJAY NANUBHAI PARSA INDIA     WEST BENG KOLKATA   07000
AG10008  MEHUL POPATLAL SUTA INDIA     WEST BENG KHARAGPUR 08000
AG10009  KRITHIKA RAMANUJAM  INDIA     WEST BENG SUNDARBAN 09000
AG10010  ANKUR CHOUDHARY     INDIA     PUNJAB    AMRITSAR  10000
AG10011  HIREN NITIN SHETH   INDIA     PUNJAB    JALANDHAR 11000
AG10012  DARSHAN SHAH        INDIA     HARYANA   CHANDIGAR 12000


Suppose we would like find the Sum Total Sales of all Insurance Agents. The SALES Field starts  at column 60, has length 5 and is numeric data(ZD). The batch JOB/JCL for generating the desired report is as follows :

----+----1----+----2----+----3----+----4----+----5----+----6----+----7--
//CAA0176I JOB A123,QUASAR,CLASS=G,MSGCLASS=Y,NOTIFY=&SYSUID           
//STEP01 EXEC  PGM=SORT,REGION=1024K                                   
//SORTIN DD    DSN=CAA0176.DEMO.SORTIN,DISP=SHR                        
//SYSPRINT DD  SYSOUT=*                                                
//SYSOUT DD    SYSOUT=*                                                
//SORTXSUM DD  DSN=CAA0176.DEMO.REPORT,DISP=SHR                        
//SORTOUT DD   DSN=CAA0176.DEMO.SORTOUT,DISP=SHR                       
//SYSIN  DD    *                                                       
  SORT FIELDS=(30,10,CH,A)                                             
  SUM FIELDS=(60,5,ZD)                                                 
//                    


After Submitting the above Batch JOB/JCL, this is the Output Report which should be displayed -

----+----1----+----2----+----3----+----4----+----5----+----6----+
AG10001  RAMESH CHANDRA GUHA INDIA     MAHARASHT MUMBAI    78000


Note that some of the fields like AGENT-ID, AGENT-NAME, STATE and CITY are not relevant in this report. These irrelevant/unnecessary field can be removed from the output report by using OUTREC FIELDS Control Statement. Below is the JOB Log from my System -

SYSIN :                                                        
  SORT FIELDS=(30,10,CH,A)                                              0006010
  SUM FIELDS=(60,5,ZD)                                                  0006020
WER276B  SYSDIAG= 14, 306472, 306472, 1155861                                 
WER164B  1,772K BYTES OF VIRTUAL STORAGE AVAILABLE, MAX REQUESTED,            
WER164B     0 BYTES RESERVE REQUESTED, 1,000K BYTES USED                      
WER146B  32K BYTES OF EMERGENCY SPACE ALLOCATED                               
WER108I  SORTIN   : RECFM=FB   ; LRECL=    80; BLKSIZE=  8000                 
WER110I  SORTOUT  : RECFM=FB   ; LRECL=    80; BLKSIZE=  8000                 
WER410B  744K BYTES OF VIRTUAL STORAGE AVAILABLE ABOVE THE 16MEG LINE,        
WER410B     0 BYTES RESERVE REQUESTED, 732K BYTES USED                        
WER036B  G=400                                                                
WER177I  TURNAROUND SORT PERFORMED                                            
WER110I  SORTOUT  : RECFM=FB   ; LRECL=    80; BLKSIZE=  8000                 
WER410B  744K BYTES OF VIRTUAL STORAGE AVAILABLE ABOVE THE 16MEG LINE,        
WER410B     0 BYTES RESERVE REQUESTED, 732K BYTES USED                        
WER036B  G=400                                                                
WER177I  TURNAROUND SORT PERFORMED                                            
WER045C  END SORT PHASE                                                       
WER055I  INSERT          0, DELETE         11                                 
WER211B  SYNCSMF  CALLED BY SYNCSORT; RC=0000                                 
WER416B  SORTIN   : EXCP'S=1,UNIT=3390,DEV=5035,CHP=(DBDCDDDE1B1C1D1E,1),VOL=SO
WER416B  SORTOUT  : EXCP'S=1,UNIT=3390,DEV=51A6,CHP=(DBDCDDDE1B1C1D1E,1),VOL=SO
WER416B  TOTAL OF 2 EXCP'S ISSUED FOR SORTING                                 
WER246I  FILESIZE 960 BYTES                                                   
WER054I  RCD IN         12, OUT          1                                    
WER072I  EQUALS IN EFFECT                                                     
WER169I  RELEASE 1.2 BATCH 0478 TPF LEVEL 3.1                                 
WER052I  END SYNCSORT - CAA0176I,STEP01,,DIAG=8200,40CB,BB31,24E6,CACE,6CE3,AA2

0 comments:

Post a Comment

Related Posts with Thumbnails

Quick Links

Jump to :  

Note :

© Copyright – Quasar Chunawalla, 2010.
Note : The copyrights of all the material, text and pictures posted in this website belong to the author. Any instance of lifting the material from this website, shall be considered as an act of plagiarism. For any clarifications, please mail at quasar.chunawalla@gmail.com
 
back to top