Sunday, August 9, 2009

COBOL Tutorial – DATA DIVISION : Part I


Contents

  1. Online Systems v/s Batch Systems
  2. Data Division and its Sections
  3. Basic Data-types in COBOL
    1. Rules for Valid Data names
    2. Level Numbers
    3. PICTURE Clause
    4. Value Clause
  4. Literals and Figurative Constants
  5. Group and Elementary Data Items
  6. Rules governing COBOL Data-Items
  7. More on FD File Descriptor Paragraph 

Q. What are the two broad types of Applications/Systems?

Broadly, all applications/systems fall into two types:
(i) Online Systems (Transaction Processing Systems)
(ii) Batch Processing Systems

When you run a program like the CMD command prompt, you type in some input, get some output, type in more input, get more output, and so on. This cycle continues until you have your final results. This ongoing, interactive input-process-output mode of processing is called Online Processing or Transaction Processing. Each input produces its output immediately, so response times are very short.

On the other hand, when the volume of input data is huge and processing can take a long time, you supply all the input data upfront, before the program starts. You then start the program and wait while it processes the data and produces its output. This is the Batch Processing mode.

In batch processing, the input data of a COBOL program is read from an Input Dataset, and the results of the program are written to an Output Dataset.

Here are some examples of Online Systems and Batch Systems. Systems such as ATMs, online banking portals and the IRCTC railway reservation website provide information in real time. These are online systems.

However, tasks such as taking backups, archiving data, or generating monthly credit card statements for account holders are batch jobs.

Q. What is the DATA DIVISION? What are the different sections in the DATA DIVISION?

The DATA DIVISION describes the input-output storage areas and the temporary storage areas. In other words, it describes the structure of the data stored in the input and output files, as well as in any temporary storage.

Structure is the format, or generalised layout, of the data. For example, for Employee data, every employee may have the following details:

<Employee>
   <Name>
      <FName></FName>
      <MName></MName>
      <LName></LName>
   </Name>
   <Address>
      <Street></Street>
      <City></City>
      <Pincode></Pincode>
   </Address>
   <Phone>
      <Phone1></Phone1>
      <Phone2></Phone2>
   </Phone>
</Employee>

So, every Employee has a name, address and Phone. The Name consists of First Name, Middle Name and Last Name. The Address consists of Street, City, Pincode. The Phone consists of Phone1 and Phone2. This is called the structure of the data.

In Batch Processing Systems, all the input data is stored in Input Datasets in the form of records. Similarly, all the output data is stored in Output Datasets as output records. All data records have a format, or structure. To describe the structure of the records in input and output files, we use the File Descriptor (FD) paragraph in the DATA DIVISION. Thus, the FD paragraph describes the input and output record formats.

FD paragraphs are placed in the FILE SECTION of the DATA DIVISION. We must write an FD paragraph for every file-name declared in the INPUT-OUTPUT SECTION of the COBOL program. In other words, every SELECT statement in the INPUT-OUTPUT SECTION should have a corresponding File Descriptor (FD), which describes the record format of the respective file.
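
The placement can be sketched as a small skeleton program. This is only an illustration under assumptions: the file names (EMP-IN-FILE, EMP-OUT-FILE), the external names EMPIN and EMPOUT, and the 80-character record length are hypothetical and not taken from the original screen. The program declares the files but never opens them, so it should compile and run without any datasets present.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FDDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
      * One SELECT per file used by the program (names are made up).
           SELECT EMP-IN-FILE  ASSIGN TO EMPIN.
           SELECT EMP-OUT-FILE ASSIGN TO EMPOUT.
       DATA DIVISION.
       FILE SECTION.
      * One FD per SELECT, each followed by its record layout.
       FD  EMP-IN-FILE.
       01  EMP-IN-RECORD           PIC X(80).
       FD  EMP-OUT-FILE.
       01  EMP-OUT-RECORD          PIC X(80).
       WORKING-STORAGE SECTION.
      * Temporary storage areas are described here.
       01  WS-RECORD-COUNT         PIC 9(5) VALUE ZERO.
       PROCEDURE DIVISION.
           STOP RUN.
```

Each file-name in a SELECT statement appears again after FD, which is how the compiler ties a file to its record description.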


Q. What are data-types? What are the basic data-types in COBOL?

A data-type declares to the compiler what kind of data you are going to store in a storage location, so that the right amount of storage can be allocated and reserved exclusively for your data. There are three main data-types, or classes, in COBOL. They are :
NUMERIC Symbol         -> 9
ALPHABETIC Symbol      -> A
ALPHANUMERIC Symbol    -> X

To store data in a storage area, or to describe a record format, we must use a data-name followed by the proper data-type.

The rules for valid data-names are :
1. 1-30 characters
2. Letters (A-Z), digits (0-9) and hyphens allowed.
3. Blanks not allowed
4. Should not begin or end with hyphen
5. Should not be a COBOL reserved keyword
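
The rules above can be tried out with a minimal sketch like the following. All the data-names in it are hypothetical examples. The invalid names are kept as comment lines (an asterisk in column 7) so that the program still compiles.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NAMEDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Valid: letters, digits and embedded hyphens, 1-30 characters.
       01  WS-EMP-NAME             PIC X(10).
       01  TOTAL-2009              PIC 9(5).
       01  A1                      PIC X.
      * Invalid names, shown only as comments:
      *   -EMP-NAME     begins with a hyphen
      *   EMP-NAME-     ends with a hyphen
      *   EMP NAME      contains a blank
      *   MOVE          is a COBOL reserved word
       PROCEDURE DIVISION.
           STOP RUN.
```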

Any data-item in the DATA DIVISION looks like this :

01        WS-EMP-NAME        PIC X(10)          VALUE 'QUASAR'.
Level No.  Data-name     Data-type and size      Initial value

Level nos. specify the hierarchy. A file contains records, and records contain fields. Level 01 is used for records and independent items. Levels 02-49 are used for fields within a record. Level 66 is used for the RENAMES clause, level 77 for independent data-items, and level 88 for condition names.
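
As a rough illustration of these level nos. working together, here is a small sketch. The record WS-EMPLOYEE and all of its fields are hypothetical names chosen for this example.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. LVLDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * 77: an independent item that belongs to no record.
       77  WS-COUNTER              PIC 9(3) VALUE ZERO.
      * 01: a record; 02-49: the fields inside it.
       01  WS-EMPLOYEE.
           02  WS-EMP-ID           PIC 9(4).
           02  WS-EMP-NAME         PIC X(10).
           02  WS-EMP-STATUS       PIC X.
      * 88: condition names for values of WS-EMP-STATUS.
               88  EMP-ACTIVE      VALUE 'A'.
               88  EMP-RETIRED     VALUE 'R'.
      * 66: a second name for a run of adjacent fields.
       66  WS-EMP-KEY RENAMES WS-EMP-ID THRU WS-EMP-NAME.
       PROCEDURE DIVISION.
           MOVE 'A' TO WS-EMP-STATUS
      * The condition name is true when the field holds 'A'.
           IF EMP-ACTIVE
               DISPLAY 'EMPLOYEE IS ACTIVE'
           END-IF
           STOP RUN.
```

The 88-level entries reserve no storage of their own. They only give names to particular values of the field declared just above them.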

PICTURE Clause : It gives the data-type and the size of the data-item. It specifies whether the storage location holds numeric, alphabetic or alphanumeric data, and how wide the data-item is.
Examples
PICTURE 999 -> Stores an unsigned 3-digit no. (0 to 999)
PICTURE S999 -> Stores a signed 3-digit no. (-999 to +999)
PICTURE XXXX -> Stores a string of 4 characters
PICTURE 99V99 -> Stores an unsigned real from 0 to 99.99 (V marks the implied decimal point)
PICTURE S9V9 -> Stores a signed real from -9.9 to +9.9

Shorthand Notation : You can abbreviate 9999 as 9(4), XXXX as X(4), 999V99 as 9(3)V9(2), S99999 as S9(5) etc.
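
Here is a minimal sketch of these PICTURE clauses in a program. The data-names are hypothetical. It relies on the intrinsic function LENGTH, which assumes a compiler that supports intrinsic functions, to show that the V in 99V99 takes up no storage.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PICDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  WS-QTY                  PIC 999.
       01  WS-BALANCE              PIC S9(3).
       01  WS-DEPT                 PIC X(4).
       01  WS-PRICE                PIC 99V99.
       01  WS-LEN                  PIC 9(2).
       PROCEDURE DIVISION.
           MOVE 123    TO WS-QTY
           MOVE -45    TO WS-BALANCE
           MOVE 'ABCD' TO WS-DEPT
           MOVE 12.34  TO WS-PRICE
      * WS-PRICE holds the digits 1234; V is only an implied point.
           COMPUTE WS-LEN = FUNCTION LENGTH(WS-PRICE)
           DISPLAY 'DEPT            : ' WS-DEPT
           DISPLAY 'BYTES IN 99V99  : ' WS-LEN
           STOP RUN.
```

The second DISPLAY reports 04, since 99V99 stores four digits and nothing for the decimal point.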

VALUE Clause : We can assign an initial value to the elementary data-items/variables by using the VALUE clause. It is an optional clause.

01 WS-EMP-SALARY    PIC 9(4)V99      VALUE 1000.52.
01 WS-EMP-JDATE     PIC X(10)        VALUE '26-06-09'.

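Literals and Figurative Constants :
A literal is a constant value written directly in the program, such as 1000.52 or 'QUASAR'. A figurative constant is a reserved word that stands for a fixed value. Neither changes while the program runs.

ZERO, ZEROS or ZEROES - Its value is 0.
SPACE(S) - Represents one or more blanks.
HIGH-VALUE(S) - Represents the highest value in the collating sequence
LOW-VALUE(S) - Represents the lowest value in the collating sequence
QUOTE(S) - Represents the quotation mark (single or double, depending on compiler settings)
ALL literal - Fills the data-item with repetitions of the literal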

Example of Figurative Constants
01 GROSS-PAY PIC 9(5)V99 VALUE 13.5.
MOVE ZEROS TO GROSS-PAY.

Before : 00013|50
After : 00000|00
(| marks the position of the implied decimal point.)

01 STUDENT-NAME PIC X(10) VALUE 'MIKE'.
MOVE ALL '-' TO STUDENT-NAME.
Before : MIKEBBBBBB
After : ----------
(B denotes a blank.)
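
The two examples above can be put into one small runnable sketch. The square brackets in the DISPLAY statements are an addition of this sketch, used only to make the trailing blanks visible.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. FIGDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  GROSS-PAY               PIC 9(5)V99 VALUE 13.5.
       01  STUDENT-NAME            PIC X(10)   VALUE 'MIKE'.
       PROCEDURE DIVISION.
      * Initial value: MIKE followed by six blanks.
           DISPLAY '[' STUDENT-NAME ']'
      * ALL repeats the literal until the field is full.
           MOVE ALL '-' TO STUDENT-NAME
           DISPLAY '[' STUDENT-NAME ']'
      * SPACES fills the whole field with blanks.
           MOVE SPACES TO STUDENT-NAME
           DISPLAY '[' STUDENT-NAME ']'
      * ZEROS sets every digit of the numeric field to 0.
           MOVE ZEROS TO GROSS-PAY
           IF GROSS-PAY = ZERO
               DISPLAY 'GROSS-PAY IS ZERO'
           END-IF
           STOP RUN.
```

The three bracketed lines show MIKE padded with blanks, then ten hyphens, then ten blanks.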

Q. What are group and Elementary Data Items?

In COBOL, one or more elementary data-items can be grouped together. For example, we can group EMP-NAME, EMP-SALARY and EMP-JDATE as EMPLOYEE-RECORD. In COBOL, we say that a group item is a data item, which has several low-level(elementary) data-items. A group item is declared using a level number and a data name. It does not have a PICTURE clause.

A group-item may also contain other group-items or a combination of a group-item and an elementary data-item. So, in COBOL, you can form a hierarchy(tree) of data-items. Let me give you a simple example.

Let’s say we want to store the details of an Employee. Suppose the length of EMPLOYEE-RECORD is 47 characters, and the data is alphanumeric. We can declare this data-item as :

01 EMPLOYEE-RECORD PIC X(47).

However, every Employee has a name, a date-of-joining and a salary. So, we can break up EMPLOYEE-RECORD into EMP-NAME, EMP-DOJ and EMP-SALARY. The three fields add up to the same 47 characters (30 + 10 + 7).

01 EMPLOYEE-RECORD.
   02 EMP-NAME PIC X(30).
   02 EMP-DOJ PIC X(10).
   02 EMP-SALARY PIC S9(5)V99.

However, upon further analysis, it is found that the name consists of a first name, a middle name and a last name. Also, EMP-DOJ consists of the date DD, the month MM and the year YYYY. So, we can break up EMP-NAME and EMP-DOJ as :

01 EMPLOYEE-RECORD.
   02 EMP-NAME.
      03 EMP-FIRST-NAME PIC X(10).
      03 EMP-MIDDLE-NAME PIC X(10).
      03 EMP-LAST-NAME PIC X(10).
   02 EMP-DOJ.
      03 EMP-DD PIC X(2).
      03 EMP-MM PIC X(2).
      03 EMP-YYYY PIC X(4).
   02 EMP-SALARY PIC S9(5)V99.

Note that DD, MM and YYYY add up to 8 characters, so EMP-DOJ as a group item is now 8 characters long rather than 10.

                                         EMPLOYEE-RECORD(01)
                                                 |
                                                 |
                    -----------------------------------------------------------
                    |                            |                             |
               EMP-NAME(02)                  EMP-DOJ(02)               EMP-SALARY(02)
                    |                            |
                    ------------------           ----------------------
                    |         |      |           |          |          |
                   FNAME   MNAME   LNAME(03) EMP-DD      EMP-MM     EMP-YYYY

The top-level group entry has level no. 01. Lower-level data-items have levels 02, 03 ... and so on.
Using group items and level nos. to form a hierarchy is how we describe the structure (format) of a record. It applies the principle of nesting: one elementary or group item can be nested inside another group-item. The higher the level no., the deeper the data-item sits in the hierarchy.
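
To see the nesting at work, here is a minimal sketch that fills the elementary items and then displays the group items. The field names follow the example above, with a four-character year field (EMP-YYYY, as in the tree diagram). The sample values are made up.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GRPDEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  EMPLOYEE-RECORD.
           02  EMP-NAME.
               03  EMP-FIRST-NAME  PIC X(10).
               03  EMP-MIDDLE-NAME PIC X(10).
               03  EMP-LAST-NAME   PIC X(10).
           02  EMP-DOJ.
               03  EMP-DD          PIC X(2).
               03  EMP-MM          PIC X(2).
               03  EMP-YYYY        PIC X(4).
           02  EMP-SALARY          PIC S9(5)V99.
       PROCEDURE DIVISION.
      * Values go into the elementary (leaf) items.
           MOVE 'JOHN'  TO EMP-FIRST-NAME
           MOVE SPACES  TO EMP-MIDDLE-NAME
           MOVE 'SMITH' TO EMP-LAST-NAME
           MOVE '26'    TO EMP-DD
           MOVE '06'    TO EMP-MM
           MOVE '2009'  TO EMP-YYYY
           MOVE 1000.52 TO EMP-SALARY
      * A group item behaves like one alphanumeric field made up
      * of its children laid end to end.
           DISPLAY EMP-DOJ
           DISPLAY '[' EMP-NAME ']'
           STOP RUN.
```

The first DISPLAY shows 26062009, the three date fields side by side. The second shows all 30 characters of the name, including the blank middle name.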

Some Important Rules :
1. All the data-items at the same level in the tree must bear the same level no. EMP-NAME and EMP-DOJ, for example, are not allowed to have different level nos.
2. All 01-level entries must start in Area A.
3. Only atomic items (leaf nodes) have a PICTURE clause. Group items never have a PIC clause.

Consider the tree as a family tree, with EMPLOYEE-RECORD as the parent and EMP-NAME, EMP-DOJ and EMP-SALARY as its children. Data-items that have no children, e.g. EMP-FIRST-NAME, EMP-MIDDLE-NAME or EMP-SALARY, are called leaf-nodes. Leaf-nodes are always atomic.

Q. Can you elaborate on how to write the File Descriptor FD Paragraph?

As mentioned before, we must describe the input file record format and the output file record format in File Descriptor (FD) paragraphs under the FILE SECTION. Remember that the FD paragraph is always coded in Area A.

Suppose the file we are reading from contains Employee details, with every employee having a name, an address and a phone. The name consists of a first name, a middle name and a last name. The address has a street, a city and a pin-code. The phone consists of two phone nos., home-phone and work-phone.
The FD paragraph for this file names the file and is followed by the record description, EMPLOYEE-RECORD, in which the name, address and phone are group items.
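
A minimal sketch of such an FD paragraph is given below. Only EMPLOYEE-RECORD and the overall shape of the record come from the description above. The file name EMPLOYEE-FILE, the external name EMPIN, the field names and every field width are assumptions made for this illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. EMPFD.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT EMPLOYEE-FILE ASSIGN TO EMPIN.
       DATA DIVISION.
       FILE SECTION.
      * FD starts in Area A; the record layout follows it.
      * Field widths below are illustrative guesses.
       FD  EMPLOYEE-FILE.
       01  EMPLOYEE-RECORD.
           02  EMP-NAME.
               03  EMP-FIRST-NAME  PIC X(10).
               03  EMP-MIDDLE-NAME PIC X(10).
               03  EMP-LAST-NAME   PIC X(10).
           02  EMP-ADDRESS.
               03  EMP-STREET      PIC X(20).
               03  EMP-CITY        PIC X(15).
               03  EMP-PINCODE     PIC X(6).
           02  EMP-PHONE.
               03  EMP-HOME-PHONE  PIC X(10).
               03  EMP-WORK-PHONE  PIC X(10).
       PROCEDURE DIVISION.
           STOP RUN.
```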

Note : When you read input data from a sequential (PS) file, the records are placed in the input record area one at a time, i.e. EMPLOYEE-RECORD holds the details of the first employee, then the next employee, and so on. Since the values in input-output areas (like EMPLOYEE-RECORD) are supplied by the input file/dataset, we should not use the VALUE clause to initialize these fields. The VALUE clause cannot be used to give initial values to records described in the FILE SECTION (it is allowed there only on 88-level condition names).
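
The record-by-record behaviour described in this note can be sketched as a simple read loop. This is an illustration under assumptions: the file name EMP-FILE, the external name EMPIN, the 80-character record and the working-storage names are all hypothetical, and the program expects a matching input file to be available when it runs.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. READDEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT EMP-FILE ASSIGN TO EMPIN
               ORGANIZATION IS SEQUENTIAL
               FILE STATUS IS WS-FILE-STATUS.
       DATA DIVISION.
       FILE SECTION.
      * No VALUE clause here: each READ fills the record area.
       FD  EMP-FILE.
       01  EMPLOYEE-RECORD         PIC X(80).
       WORKING-STORAGE SECTION.
      * VALUE is fine in WORKING-STORAGE.
       01  WS-FILE-STATUS          PIC XX.
       01  WS-EOF-FLAG             PIC X VALUE 'N'.
           88  END-OF-FILE         VALUE 'Y'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT EMP-FILE
           IF WS-FILE-STATUS NOT = '00'
               DISPLAY 'OPEN FAILED, STATUS ' WS-FILE-STATUS
               STOP RUN
           END-IF
      * Every READ overwrites EMPLOYEE-RECORD with the next record.
           PERFORM UNTIL END-OF-FILE
               READ EMP-FILE
                   AT END
                       MOVE 'Y' TO WS-EOF-FLAG
                   NOT AT END
                       DISPLAY EMPLOYEE-RECORD
               END-READ
           END-PERFORM
           CLOSE EMP-FILE
           STOP RUN.
```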

 

© Copyright – Quasar Chunawalla, 2010.
Note : The copyrights of all the material, text and pictures posted in this website belong to the author. Any instance of lifting the material from this website, shall be considered as an act of plagiarism. For any clarifications, please mail at quasar.chunawalla@gmail.com
 