Mainframes 360
The one stop destination for System Z professionals

Sunday, August 2, 2009

DFSORT– SUM FIELDS


Q. What is Aggregation?
To aggregate, is to sum things up. When you add up values together, to find a total sum, its an Aggregation. For example, if a salesman is able to sell products worth Rs. 1000 in January, Rs. 4000 in February and Rs. 5000 in March, his total sales = 1000 + 4000 + 5000 = Rs.10,000. This operation is called aggregation.

A Fast Moving Consumer goods company(FMCG) like Unilever manufactures the beauty-soap Lux. Say, a manager at his disposal, possesses the sales data for the year 2011-2012, for the soap Lux, all over the Indian Sub-continent. The data runs into lakhs of records. It is of little help. You begin to aggregate the raw disjointed data, over the territory or city of sale and analyse the results. The Lux Soap Sales, city-wise distribution gives a better picture to the manager about the geography, where the brand is performing better, and the territories, where they need to focus and requires more promotion. If you aggregate the sales data over different age groups, you may figure out conclusively, as to which age group prefers Lux Soap over other Soap bars, is it the 20s something, or the 30s, or the middle-aged.

The essence of the above example is that, summary reports having aggregate data provides useful insight to the higher management in an organisation. Its a knowledge-discovery process. We are knee-deep in data, there's tonnes of data around us, but we need to mine, extract knowledge from it.


As you read further, you shall learn how easy it is to run SORT, the free software program supplied by IBM, to aggregate data.
Q. How to add up numbers using the free software program DFSORT?
Have you ever tried adding up numbers in your favourite spread-sheet application, Microsoft Excel? Just pick the column with the numbers you wish to add, select or highlight it, and click on the Auto-sum button(the one with the Greek alphabet Sigma on it). The numbers in the Excel column are added up together, and the sum-total is displayed.
 
image
IBM's free software program SORT offers the SUM operation, to add numbers up. The SUM operation takes a parameter FIELDS. FIELDS parameter specifies the column of numeric values, you wish to add. 

image

To refer to any column, you must know the position and the length. Take a look at the above picture. The name of the Sales Representative starts at position 01, and has a length 25 characters. Thus, the Sales-Representative column is (01,25). The territory column is (26,12). The country column will be referred to as (38,06). The sales figure in Rs. column would be (44,05).

I would like to sum up the sales(in Rs.) in the country India. I wrote the SORT Control card like this -

image

The SORT software program performs a SUM operation on the (44,05) column. It adds up the data in the Sales(in Rs.) column. The SORT Software program also needs to be told the data-format. Since we are adding a bunch of numbers, the appropriate data-type is numeric, ZD. 

Often managers need data with different levels of summarization, to make intelligent decisions. For example, the global sales manager, would study the sales by country, and understand how their products are faring in India vis-a-vis other Asian countries. Lower in the organisational hierarchy, the India sales manager would take a look at the territory-wise sales. While summarizing and adding things up, you may form groups or sets. For example, if I consider each country as a single large group, and add things up, I'd get the total-sales in India, the total-sales in China, the total-sales in Bangladesh and so on.. This would be a country-wise sales report. You may group data by territory to produce a sales(in Rs.) territory-wise distribution report.

Adjusting the level-of-detail, forming groups or subsets of data is done by coding the SORT operation. I am interested to know the total sales(in Rs.) in the country India. The SORT software program should treat each country, as a separate group, while adding up the sales. Hence, I have referred to the Country (38,06), as a criterion to group data. Country is said to be the controlling field.

When you run the SORT Software Program, with above Control-Card, the Output Dataset shall look like this. The total sales in India is Rs.15,000.

image

On the same lines, the break-up of sales by territory, could be done by treating each region or state as a single group. You must code the SORT operation, and group by the territory (26,12). I wrote the SORT Control card like this -

image
 
When you run the SORT Software Program, with above Control-Card, the Output Dataset shall look like this. The total sales in Gujarat state is Rs.7000, in Maharashtra Rs. 1000, in Tamil Nadu Rs.5000, and in the state of Uttarakhand Rs.2000.

image
Q. Is it possible to add numbers, on two or more control fields?
Patients do not have knowledge of medicine. Hence, they approach a doctor/general-pracitioner, who prescribes them drugs. Glaxo Smithkline(GSK), a leading pharmaceutical company employs many field representatives to promote and pass on knowledge about their drugs(prescription drugs and not OTC drugs), to doctors. Shri. Quasar S Chunawala works for GSK, as a sales rep in the Eastern states, Shri. Prasanna Limaye handles the western states, Smt. Neha Laxman Rawat is covering Northern states of Uttaranchal, Himachal Pradesh and Punjab, whereas Shri. Sai Kiran Valisetty is a field rep in South India.

The below table is the sales data for Glaxo Smithkline Pharma company in India, for the year 2011-12.

image
At the end of the year, it is time to evaluate the performance of the sales representatives. I would like to look at the sales totals of each sales rep, territory-wise, and further state-wise.

In this sales analysis, the software program DFSORT allows you to divide this cosmos(universe) of data into galaxies(clusters) and further drill them down into smaller sub-groups(Solar-systems), to calculate totals and sub-totals. For example, you may form groups of Sales data by territory, say Eastern region sales(blue colored), Western region sales(red colored), sales up in the North(green colored) and sales down south(sky-blue colored).

You may further dissect it by state. So, the total sales in the Northern territory, constitutes the sales in Uttaranchal, Himachal Pradesh and Punjab.

image 

I would like to sum up the Sales Target(in Units) referred to as (46,03) column, the Sales Actual(in Units) referred to as (50,03) column. To do so, I must code the SUM operation. I began writing my SORT control card like this -

image

I would like to control the Sales-analysis by territory (01,06), and look at the sales totals by territory. I coded the SORT operation, to group the data by territory.

When you run the SORT software program using the above control-Card, the output-dataset looks like this. Shri. Quasar Chunawala has sold 806 units, against a quota of 965 units in Eastern region. Smt. Neha Laxman Rawat has sold 890 units in the North, Shri. Sai Kiran Valisetty has sold 760 units in the South and Shri. Prasanna Limaye was able to sell 840 units in the Western territory.

image

To calculate the sales sub-totals for each state (08,18), the state must be the second controlling field. This will form state-wise sub-groups within each territory group. I modified the SORT control-card as follows.

image

When you run the SORT software program using the above control-Card, the output-dataset looks like this.

image

To many people who are thrown to work at a mainframe computer on their first job, they feel lost. Mainframe people seem to speak a completely different language and that doesn't make life easy. What's more, the books and manuals are incredibly hard to comprehend.

"What on earth is a Mainframe?" is an absolute beginner's guide to mainframe computers. We'll introduce you to the hardware and peripherals. We'll talk about the operating system, the software installed on a mainframe. We'll also talk about the different people who work on a mainframe. In a nutshell, we'll de-mystify the mainframe.

Readers based in India, can buy the e-book for Rs. 50 only or the print book. International readers based in the US and other countries can click here to purchase the e-book.