Mainframes 360
The one stop destination for System Z professionals

Wednesday, September 12, 2012

Know your variables


Have you tried coffee at Starbucks? Starbucks offers many coffee sizes: Demi, Short, Tall, Grande (pronounced Grawn-de), Venti and Trenta. A variable is just a coffee cup. It's a container in memory. It holds something. Variables come in different sizes and types.

Declaring a variable

COBOL cares about the type of the variable. You can't do something bizarre and dangerous like storing giraffe data in a rabbit variable. It's risky to put floating-point data in an integer variable. You can't put alphabetic data into an integer. What if someone tries to add this alphabetic data? To ensure type-safety, you must declare the type of your variable. Is it character, numeric, fixed-point or floating-point?

Q. What is a Variable? What are Literals?
A Computer Program takes Data as Input, performs processing on the Input Data, and produces an Output. You would like to store the Input Data and Output Results in Computer Memory, so that you can retrieve them later.

Let's take a look at what Computer Memory looks like. Just as people live in houses along a street, Computer Memory is organised as a series of Cells. These cells do not house people; instead they house Data.

You can visualize a picture of computer memory like the one below:

Suppose you have stored the number 2, at some Memory Location in the Computer Storage. Next time, you want to retrieve the contents of this cell. How to go about it? In Computer Memory, how do you refer to a particular Cell or Memory Location? In the real world, houses on a street have different names.

In a similar fashion, what you have to do is, you need to assign a name to the Computer Memory Location or House, say MY-NUMBER. And then, you can access the contents of this Memory Location, using the name MY-NUMBER. Next time, you simply have to say, "DISPLAY the contents of MY-NUMBER", and you'll get the Output=2.

Let me tell you a little more about the data (contents) stored in a Computer Memory Location. If the data stored in a Memory Location can change, for example if the value in MY-NUMBER can be modified to 3, that Memory Location is called a Variable. The word Variable reflects the fact that the contents of such a Cell can vary. A Literal, on the other hand, is constant data: a value such as 2 or 'HELLO' written directly in the program.

A Variable, then, is a Computer Storage Area where data is kept, identified by a unique name. A Literal has no name; it simply stands for its own fixed value.
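To make the MY-NUMBER story concrete, here is a minimal sketch of a complete program. The program name SHOWVAR and the PIC 9 declaration (one digit, explained further down) are assumptions made for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SHOWVAR.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * MY-NUMBER names a storage cell that holds one digit.
       01  MY-NUMBER              PIC 9.
       PROCEDURE DIVISION.
      * 2 is a literal; MY-NUMBER is a variable.
           MOVE 2 TO MY-NUMBER
           DISPLAY 'MY-NUMBER = ' MY-NUMBER
      * The contents of a variable can change.
           MOVE 3 TO MY-NUMBER
           DISPLAY 'MY-NUMBER = ' MY-NUMBER
           STOP RUN.
```

The first DISPLAY should show the value 2 and the second the value 3, because the second MOVE overwrites the contents of the same cell.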
Q. What are the rules for naming Variables?
Every Variable (Computer Storage Area) must have a name. It's a good idea to assign relevant and meaningful names to variables. For example, PRINCIPAL, NUMBER-OF-YEARS and RATE-OF-INTEREST are self-explanatory names.

Spaces are not allowed in a name. If a Variable name has multiple words, separate them by Hyphens, like AREA-OF-CIRCLE.
Q. What is declaration of Variables? What is DATA DIVISION?
You can't directly store data in a Variable(Mainframe Computer Storage Area). First, you must declare or announce the Variable.

Declaration causes the Mainframe Computer to set aside Storage Space for your Data. During declaration, you must specify exactly how many Bytes of Space you need to store your Data: one Byte, two Bytes, three Bytes, or more. The Mainframe Computer honours your request and reserves (books) that Storage Space exclusively for you. Now you can go ahead and store some data in it.

Remember that, the Mainframe Computer is like a miser. It does not give away any Bytes for free. Even if you want a single byte of Memory Space, you must first ask the Mainframe Computer for it. Telling the Mainframe Computer, how much space you need to store Data, is called Declaration.

DATA DIVISION contains the declaration(list) of all the Variables(Computer Storage Areas), you want to use in the COBOL Program.
Q. How to declare Variables in the DATA DIVISION?
Declaring a Variable in COBOL is very easy. In COBOL, you code the data-name of the Variable, followed by Data-type. Suppose, I want to calculate Simple Interest on Rs. 1000, for 5 years, at 20 percent Interest.

The assumption is that you need 1 Byte to store one character. To store the Input Data-Items on the Mainframe Computer, I shall need three Variables: PRINCIPAL (4 Bytes large), NUMBER-OF-YEARS (1 Byte) and RATE-OF-INTEREST (2 Bytes). How do you declare these variables in COBOL? You always code the Data-Name followed by the Data-Type.
Q. What are data-types? What are the basic data-types in COBOL?
Data-type indicates whether a variable can hold Alphabetic Characters such as 'A', 'B', 'C', etc. or numeric data such as 123, -500, 6159, etc. The data that you can store in a COBOL Variable falls into one of these classes:


Alpha-numeric data consists of alphabetic characters, digits and special characters. For example, 'HELLO 123 @$' and 'INDIA IS THE 3RD LARGEST ECONOMY' are alpha-numeric strings. In COBOL, the Symbol X implies Alpha-numeric.

Numeric Data refers to numbers which are used in arithmetic computation. 123, 3.14159 and -642.70 are examples of Numeric Data. In COBOL, the Symbol 9 implies Numeric.

Alphabetic Data consists only of letters and spaces. 'HELLO HOW DO YOU DO' and 'NOT BAD' are examples of pure Alphabetic Strings. In COBOL, the Symbol A implies Alphabetic.
Q. What is PICTURE Clause?
PICTURE Clause specifies the Data-Type and Size(in Bytes) of a Variable. PICTURE Clause is coded after the Data-Name. You may code PICTURE or simply PIC.


The PRINCIPAL Variable is defined as PIC 99999. The 9 means PRINCIPAL can hold Numeric Data. As the 9 is coded five times, PRINCIPAL occupies 5 Bytes of Storage Space. Generally, you can store one character in a Byte. So, in 5 Bytes of Storage Space, you can store a number up to five digits long.

NUMBER-OF-YEARS Variable is defined as PIC 9. NUMBER-OF-YEARS occupies 1-Byte of Storage-Space. In 1-Byte Space, you can store a Single-Digit Number.

RATE-OF-INTEREST Variable is defined as PIC 99, which means it is numeric and 2 Bytes big. In 2 Bytes of Space, you can store a number up to 99.

FIRST-NAME Variable is specified as PIC XXXXXX. X stands for Alpha-numeric, so FIRST-NAME can hold alpha-numeric Data. Further, as it's PIC XXXXXX (X six times), you can store an alpha-numeric word up to six characters long in it. The Size or Length of FIRST-NAME is six.

LAST-NAME Variable is specified as PIC XXXXXXXXXX. This means you can store an alpha-numeric string up to ten characters long in the LAST-NAME Variable.

You can code PIC XXXXXXXXXX in short as PIC X(10). Similarly, you may code PIC 99999 in short as PIC 9(05).
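Putting the five declarations above together, the DATA DIVISION entries might look like the following sketch. The original listing is not reproduced here, so the 01 level number in front of each name (level numbers are covered in a later question) and the column layout are assumptions.

```cobol
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Numeric items: one digit per 9.
       01  PRINCIPAL              PIC 9(05).
       01  NUMBER-OF-YEARS        PIC 9.
       01  RATE-OF-INTEREST       PIC 99.
      * Alphanumeric items: one character per X.
       01  FIRST-NAME             PIC X(06).
       01  LAST-NAME              PIC X(10).
```

PIC 9(05) and PIC 99999 describe exactly the same item; the bracketed repeat count is only a shorter way to write it.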

Q. What are group and Elementary Data Items?
COBOL provides the facility to provide a detailed-breakup of a Variable(Computer Storage Area). A field or variable can be broken down further into smaller sub-fields. Consider the name of an Employee stored in the EMPLOYEE-NAME variable. It is declared as follows -


The Employee's name contains his First-Name, Middle-Name and his Last-Name, all put together. Therefore, it is possible to divide the EMPLOYEE-NAME into three parts EMPLOYEE-FNAME, EMPLOYEE-MNAME and EMPLOYEE-LNAME.


I have broken down the 30-character EMPLOYEE-NAME field into the fields EMPLOYEE-FNAME (10 characters), EMPLOYEE-MNAME (10 characters) and EMPLOYEE-LNAME (10 characters). EMPLOYEE-NAME is called a Group Data-Item. The subordinate data-items under it, EMPLOYEE-FNAME, EMPLOYEE-MNAME and EMPLOYEE-LNAME, are called Elementary or Simple Data-Items. A visual representation of the EMPLOYEE-NAME Area and its break-up is shown below.
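In COBOL source, this break-up could be written roughly as follows. It is a sketch built from the sizes stated above; the level numbers 01 and 02 are assumed here and explained in a later question.

```cobol
      * Group item: it has no PICTURE clause of its own.
       01  EMPLOYEE-NAME.
      * Elementary items: each one carries a PICTURE clause.
           02  EMPLOYEE-FNAME     PIC X(10).
           02  EMPLOYEE-MNAME     PIC X(10).
           02  EMPLOYEE-LNAME     PIC X(10).
```

The three elementary items sit side by side in storage, so referring to EMPLOYEE-NAME gives you all 30 characters at once.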

In a similar fashion, the Address of an Employee generally consists of a Street, a City and Pin-Code. Therefore, in the COBOL Program you may specify a detailed break-up of Address like this. 


The Group Data-item Address is composed of Street, a 10-Byte Alphanumeric Field, City being a 10-byte alphanumeric field again and pin-code, a 6-digit numeric field. The Address Field has a sum-total size of 10 Bytes + 10 Bytes + 6 Bytes = 26 Bytes. Pictorially it may be represented as follows. 

Likewise, the phone-number of an Employee would consist of a Country-Code, a City-Code and the actual number. Look at how I've coded the EMPLOYEE-PHONE-NO Group-Item in COBOL.


The Full-Name of the Employee, the Address of the Employee and his Phone-No. together represent an Employee's Data. COBOL allows aggregation, that is, putting data-items together under a higher-level data-item. So, I have clubbed EMPLOYEE-NAME, EMPLOYEE-ADDRESS and EMPLOYEE-PHONE-NO under one roof, EMPLOYEE-DATA.


Take a look at the above picture. I'll just quickly run you through the EMPLOYEE-DATA Item's structure. EMPLOYEE-DATA is used to hold or store the data of an Employee. The Group Item EMPLOYEE-DATA is broken down into EMPLOYEE-NAME, EMPLOYEE-ADDRESS and EMPLOYEE-PHONE-NO. 
EMPLOYEE-NAME is a group-item in turn, consisting of EMPLOYEE-FNAME, EMPLOYEE-MNAME and EMPLOYEE-LNAME Elementary Items.

EMPLOYEE-ADDRESS is a group-item internally made up of EMPLOYEE-STREET, EMPLOYEE-CITY and EMPLOYEE-PINCODE Elementary Items.

The COUNTRY-CODE, CITY-CODE and LOCAL-NUMBER together constitute the Group-Item EMPLOYEE-PHONE-NO.

The EMPLOYEE-DATA Item can be represented with the help of an inverted, hierarchical, tree-like picture, as shown above. This is the structure of EMPLOYEE-DATA. In this manner, COBOL allows you to specify the format or structure of the data, by creating Group and Elementary Data-Items.
Q. What are Level-Numbers in COBOL?
In the Indian Military, the General delegates his authority through the Lieutenant General, who heads the Major General, Brigadier, Colonel, and so on. These officers are at different ranks in the Military. In a similar fashion, when you describe a data structure or format in COBOL, every item holds a rank in the structure.

With reference to the example of the EMPLOYEE-DATA Item, take a look at the structure. This tree-structure has several Levels. Each level can be numbered 01, 02, 03 and so on.

At the top of the Hierarchy is EMPLOYEE-DATA. This is the root level or the top-most level. The highest level of data is at Level 01. EMPLOYEE-DATA is said to be a 01-Level Data-Item.

EMPLOYEE-NAME, EMPLOYEE-ADDRESS and EMPLOYEE-PHONE-NO are sub-ordinates of EMPLOYEE-DATA. So, they are said to be at Level 02. Similarly, EMPLOYEE-FNAME, EMPLOYEE-MNAME, EMPLOYEE-LNAME, EMPLOYEE-STREET, EMPLOYEE-CITY and so on... are at Level 03.

In COBOL, you indicate where a Data-Item stands in the Hierarchy(Structure-Tree) by coding the Level-Number. You write the Level-Number, followed by the data-item name. For example, you should code 01 EMPLOYEE-DATA, 02 EMPLOYEE-NAME and 03 EMPLOYEE-MNAME.

I have written the COBOL Code, for the above EMPLOYEE-DATA Tree Structure as follows -
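The original listing is not available here, so what follows is a reconstruction from the names and sizes given in this post, not the author's exact code. It uses the level numbers 01, 02 and 03 described above.

```cobol
       01  EMPLOYEE-DATA.
           02  EMPLOYEE-NAME.
              03  EMPLOYEE-FNAME    PIC X(10).
              03  EMPLOYEE-MNAME    PIC X(10).
              03  EMPLOYEE-LNAME    PIC X(10).
           02  EMPLOYEE-ADDRESS.
              03  EMPLOYEE-STREET   PIC X(10).
              03  EMPLOYEE-CITY     PIC X(10).
              03  EMPLOYEE-PINCODE  PIC 9(06).
           02  EMPLOYEE-PHONE-NO.
              03  COUNTRY-CODE      PIC 9(03).
              03  CITY-CODE         PIC 9(03).
              03  LOCAL-NUMBER      PIC 9(08).
```

Only the level-03 items carry a PICTURE clause; the level-01 and level-02 entries are group items that simply gather what lies beneath them.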

It is considered good practice to code Level-numbers as 01, 05, 10, 15, 20 and so forth, in multiples of 5. The highest level of data is given Level 01, the second level is given 05, and each level thereafter goes up by 5. The gaps leave room to insert a new level later without renumbering the existing ones.
Q. What is the size of the group-item EMPLOYEE-NAME? It doesn't have a PICTURE Clause.
Elementary-items like EMPLOYEE-FNAME, EMPLOYEE-MNAME and EMPLOYEE-LNAME must have a PICTURE Clause. The first, middle and last names are ten bytes each. The EMPLOYEE-FNAME, EMPLOYEE-MNAME and EMPLOYEE-LNAME are grouped together under a high-level data-item EMPLOYEE-NAME. The size of the group-item, 10 bytes + 10 bytes + 10 bytes = 30 bytes is implied.

Likewise, EMPLOYEE-ADDRESS is 10 bytes + 10 bytes + 06 bytes = 26 bytes large. The group-item EMPLOYEE-PHONE-NO is 03 bytes + 03 bytes + 08 bytes = 14 bytes large. The size of the root-level 01 data-item EMPLOYEE-DATA is the sum of the lengths of EMPLOYEE-NAME (30 bytes), EMPLOYEE-ADDRESS (26 bytes) and EMPLOYEE-PHONE-NO (14 bytes), which equals 70 bytes.
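If you would rather have the compiler confirm this arithmetic, one possible check is the LENGTH intrinsic function, available in compilers that support intrinsic functions. The sketch below is an illustration only: the program name SHOWLEN and the work field WS-SIZE are hypothetical, and the structure is rebuilt from the sizes given in this post.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. SHOWLEN.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  EMPLOYEE-DATA.
           02  EMPLOYEE-NAME.
              03  EMPLOYEE-FNAME    PIC X(10).
              03  EMPLOYEE-MNAME    PIC X(10).
              03  EMPLOYEE-LNAME    PIC X(10).
           02  EMPLOYEE-ADDRESS.
              03  EMPLOYEE-STREET   PIC X(10).
              03  EMPLOYEE-CITY     PIC X(10).
              03  EMPLOYEE-PINCODE  PIC 9(06).
           02  EMPLOYEE-PHONE-NO.
              03  COUNTRY-CODE      PIC 9(03).
              03  CITY-CODE         PIC 9(03).
              03  LOCAL-NUMBER      PIC 9(08).
      * Hypothetical work field that receives a size in bytes.
       01  WS-SIZE                  PIC 9(03).
       PROCEDURE DIVISION.
           COMPUTE WS-SIZE = FUNCTION LENGTH(EMPLOYEE-NAME)
           DISPLAY 'EMPLOYEE-NAME     : ' WS-SIZE
           COMPUTE WS-SIZE = FUNCTION LENGTH(EMPLOYEE-ADDRESS)
           DISPLAY 'EMPLOYEE-ADDRESS  : ' WS-SIZE
           COMPUTE WS-SIZE = FUNCTION LENGTH(EMPLOYEE-PHONE-NO)
           DISPLAY 'EMPLOYEE-PHONE-NO : ' WS-SIZE
           COMPUTE WS-SIZE = FUNCTION LENGTH(EMPLOYEE-DATA)
           DISPLAY 'EMPLOYEE-DATA     : ' WS-SIZE
           STOP RUN.
```

The four sizes reported should be 30, 26, 14 and 70 bytes, matching the sums worked out above.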
Q. While writing code, how do I align high and lower-level data-items?
The highest level data-item is given Level 01. The rule of thumb is to code 01-Level items in Area-A (Positions 8-11) of the COBOL Program. Lower-level data items like EMPLOYEE-NAME or EMPLOYEE-FNAME should be coded in Area-B (Positions 12-72).

Indentation is good practice. In COBOL, indentation follows the level: indent each lower-level data-item 3 positions to the right of the item above it. This is an extremely good coding standard.

Q. How do I initialize a variable?
While writing software code, as a best practice, initialize all variables before you use them. When a COBOL program starts execution, you can assign initial values to the variables in the program. To initialize a variable in COBOL, you code the VALUE keyword in the definition of the data-item. Here are a few examples.


I have initialized the variable WS-PRINCIPAL = 1000, WS-NO-OF-YEARS = 05 and WS-MY-CAR = 'CHEVROLET'. Enclose alpha-numeric values in single-quotes.

Similarly, the variables WS-EMPLOYEE-FNAME, WS-EMPLOYEE-MNAME and WS-EMPLOYEE-LNAME have been assigned the initial values 'QUASAR', 'SHABBIR' and 'CHUNAWALA'.

COBOL also permits initialising a group item. Initialising a group-item at a higher level in turn initialises all the lower-level items subordinate to it. Say I initialise WS-EMPLOYEE-PHONE = '912228941365'. Effectively, WS-COUNTRY-CODE has room for 2 bytes and receives the value 91. WS-AREA-CODE is 2 bytes large and receives 22. WS-LOCAL-PHONE has length 08 and receives 28941365.
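The initialisations described in this answer might be coded as in the sketch below. The PICTURE sizes, the group name WS-EMPLOYEE-NAME and the level numbers are assumptions, since the original listing is not reproduced. The phone literal is written with all 12 digits so that it fills the 2 + 2 + 8 byte layout exactly.

```cobol
       WORKING-STORAGE SECTION.
       01  WS-PRINCIPAL           PIC 9(04) VALUE 1000.
       01  WS-NO-OF-YEARS         PIC 9(02) VALUE 05.
       01  WS-MY-CAR              PIC X(10) VALUE 'CHEVROLET'.
       01  WS-EMPLOYEE-NAME.
           05  WS-EMPLOYEE-FNAME  PIC X(10) VALUE 'QUASAR'.
           05  WS-EMPLOYEE-MNAME  PIC X(10) VALUE 'SHABBIR'.
           05  WS-EMPLOYEE-LNAME  PIC X(10) VALUE 'CHUNAWALA'.
      * VALUE on the group item fills its subordinates left to right.
       01  WS-EMPLOYEE-PHONE      VALUE '912228941365'.
           05  WS-COUNTRY-CODE    PIC 9(02).
           05  WS-AREA-CODE       PIC 9(02).
           05  WS-LOCAL-PHONE     PIC 9(08).
```

Note that the group-level VALUE is a quoted literal even though the items below it are numeric, because a group item is treated as one alphanumeric string. The subordinate items carry no VALUE clause of their own.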
Q. What are the different PICTURE Characters?
The declaration of an elementary item contains the level-number, the data-item name and the picture string.

The PICTURE String specifies the data-type and size(in bytes) of the variable. The PICTURE string has characters such as 9 to indicate a number, X to indicate alpha-numeric data. There are various other characters like A P S and V, you may supply in the PICTURE string.

The PICTURE string of a variable can also be edited for good visual appearance on a printed report or on the mainframe terminal. There are various edit characters, like B Z 0 / , . + - CR DB * and $, that may be used to edit the picture clause.
Q. What do the PICTURE Characters A P S and V signify? 
The picture character X stands for alpha-numeric data. Picture character A is for alphabetic data. In the below example, the variable WS-MY-CAR is alphabetic and has a length of 10 bytes. So, if you own a 'CHEVROLET', you may store it in WS-MY-CAR. Alphabetic variables can contain only letters (A-Z) or spaces.


But what if you go home in a 'MARUTI 800'? The last three positions might contain letters or digits. In that case, WS-MY-CAR can be defined as PIC AAAAAAAXXX: 7 alphabetic positions followed by 3 alphanumeric positions.
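Here is a rough sketch of both declarations side by side. The second data-name, WS-MY-OTHER-CAR, is hypothetical; it is used only so that the two definitions can coexist in one program.

```cobol
      * Ten alphabetic positions: letters A-Z and spaces only.
       01  WS-MY-CAR              PIC A(10) VALUE 'CHEVROLET'.
      * Hypothetical second item: 7 alphabetic + 3 alphanumeric.
       01  WS-MY-OTHER-CAR        PIC A(07)X(03) VALUE 'MARUTI 800'.
```

'MARUTI ' (six letters and a space) fills the seven A positions, and '800' lands in the three X positions.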

In the COBOL language, you may work with both counting numbers such as 1, 2, 3 and decimal numbers such as 1.50 or 0.0000000623, which have a fractional part. The PICTURE character V marks the implied position of the decimal point.

Say, you are writing a program to calculate the area of a circle and require the mathematical constant pi 3.14159. 

I defined a COBOL variable WS-PI with a picture string PIC 999999 and initialized it with the number 314159.


To indicate the position of the decimal point, I shall insert the character V in the picture string at the appropriate position. In the math constant pi, 3.14159, there is one digit to the left and five digits to the right of the decimal point. So, I will define the variable WS-PI as PIC 9V99999, and store the initial value 3.14159 in it.

The position of the decimal point is implied. The decimal point is not stored in computer memory; the variable WS-PI just holds the plain digits 314159. On displaying the contents of WS-PI on the terminal, or printing its value on a report, I'd get 314159. WS-PI is still six bytes large, one byte for each digit. The COBOL compiler merely assumes a decimal point at 3^14159, the position indicated by the marker ^. It remembers that position and treats 3^14159 just like 3.14159 in any arithmetic.
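To see the implied decimal point at work in arithmetic, consider the sketch below, which computes the area of a circle. Everything apart from WS-PI is an assumption added for illustration: the program name CIRCAREA, the fields WS-RADIUS and WS-AREA, and the edited field WS-AREA-OUT, which uses the . edit character to show a real decimal point.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. CIRCAREA.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * V marks the implied decimal point; it takes no storage.
       01  WS-PI                  PIC 9V99999 VALUE 3.14159.
      * Hypothetical fields for the radius and the result.
       01  WS-RADIUS              PIC 9(02)   VALUE 2.
       01  WS-AREA                PIC 9(05)V9(05).
      * Edited field with an actual decimal point, for display.
       01  WS-AREA-OUT            PIC ZZZZ9.99999.
       PROCEDURE DIVISION.
           COMPUTE WS-AREA = WS-PI * WS-RADIUS * WS-RADIUS
           MOVE WS-AREA TO WS-AREA-OUT
           DISPLAY 'AREA = ' WS-AREA-OUT
           STOP RUN.
```

With a radius of 2, the arithmetic works out to 3.14159 x 4 = 12.56636, which shows that the compiler lined up the decimal point correctly even though none is stored in WS-PI.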
Q. How do computers store data internally?
Computers are digital. They work with the binary digits 0 and 1 (bits). All data in a computer is represented by bits. Every character, A, B, C, D up to Z, 0, 1, 2 up to 9, @, $, ; and so on, is represented by grouping a number of bits to form a bit-pattern. Each character has a unique eight-bit pattern.

What happens when you store the character 'A' in computer memory? The letter 'A' has the bit-pattern 1100 0001. The computer interprets 'A' as 1100 0001, and stores this bit-pattern in its memory.


The bit-pattern 1100 0001 transformed into decimal is the number 193, and in hexadecimal is the number x'C1'.

IBM has decided the bit-pattern(codes) for each character. Here is the code-table by IBM. 

[Images: IBM character code table]

It is difficult to memorize bit-patterns like 1100 0001 for A, 1100 0010 for B and so forth. Instead, look at the Hex equivalents of these bit-patterns. They are easier to commit to memory. The letters fall into three ranges: A to I are represented by the codes x'C1' to x'C9', J to R by x'D1' to x'D9', and S to Z by x'E2' to x'E9'. Likewise, the digits 0, 1, 2 through 9 are represented by the codes x'F0', x'F1' through x'F9'.

Take a look at what happens when you store the word 'HELLO' on the computer. The character 'H' is represented as 1100 1000, 'E' as 1100 0101, 'L' as 1101 0011, 'L' again as 1101 0011 and 'O' as 1101 0110.
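In hex, those five bit-patterns read C8, C5, D3, D3 and D6. One way to check this from a program is to compare the stored word with a hexadecimal literal, as in the sketch below. The program name HEXHELLO and the field WS-WORD are made up for illustration, and the comparison is expected to succeed only on a system that uses IBM's EBCDIC codes, such as the mainframe.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. HEXHELLO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-WORD                PIC X(05) VALUE 'HELLO'.
       PROCEDURE DIVISION.
      * X'C8C5D3D3D6' spells H E L L O in IBM's EBCDIC codes.
           IF WS-WORD = X'C8C5D3D3D6'
               DISPLAY 'STORED AS C8 C5 D3 D3 D6 (EBCDIC)'
           ELSE
               DISPLAY 'DIFFERENT CHARACTER CODES ON THIS SYSTEM'
           END-IF
           STOP RUN.
```

On a machine that uses another character set, such as ASCII on a PC, the ELSE branch would run instead, because the same letters are stored with different bit-patterns there.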


On the Mainframe, the ISPF file-editor has a HEX mode. Try turning it on by typing the HEX ON command, to see how the data is actually stored or represented on the computer.

In COBOL, the USAGE specifies how data is internally represented and stored in computer-memory. The USAGE is written next to the PICTURE Clause in the definition of the data-item. Code USAGE IS clause to specify the mode.


All data-items(variables) in COBOL are assumed to be in DISPLAY mode, unless an explicit USAGE clause is specified in the definition. Omitting the USAGE, defaults to DISPLAY mode.
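As a small illustration, the two declarations below both end up in DISPLAY mode, one explicitly and one by default. The data-names WS-QUANTITY and WS-CITY are hypothetical.

```cobol
      * Explicit DISPLAY usage: one byte per digit or character.
       01  WS-QUANTITY            PIC 9(04) USAGE IS DISPLAY.
      * No USAGE clause coded: DISPLAY is assumed.
       01  WS-CITY                PIC X(10).
```

Here WS-QUANTITY occupies 4 bytes and WS-CITY occupies 10 bytes, one byte for each position in the PICTURE string.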


DISPLAY mode is meant for strings such as 'HELLO WORLD', 'HARLEY DAVIDSON', or numbers like 168, 155.02 which are not frequently used in arithmetic. Internally, the data is stored in computer-memory the same way it is DISPLAY'ed on the screen, as-is.

During compilation, the COBOL compiler works out how many bytes of memory the data needs (1 byte, 2 bytes, 4 bytes, and so on) by looking at the PICTURE string and the usage.
