Mainframes 360
The one stop destination for System Z professionals

Thursday, November 4, 2010

Comparing COBOL and Assembler Programs

Q. What are the different types of Assembler Statements?
There are three types of Assembler Statements – (i) Instructions : These are actual Machine Instructions which, when executed by the Mainframe, perform some operation on the Data, for example MVC (Move Characters) and AP (Add Packed). (ii) Assembler Directives : These are directions, commands or orders meant for the Assembler Software (ASMA90) rather than the Mainframe Computer, for example SPACE (to space out lines on the listing, like SKIP in COBOL) and END (to mark the end of the Assembler Program). (iii) Macros : These are instructions which the Assembler expands (substitutes) to produce more Instructions, for example OPEN (for opening a file), GET (for reading a record) and PUT (for writing a record).

In COBOL, words are grouped together to form a Clause. A COBOL Clause is just like an Assembler Statement. Several clauses form a Sentence. Many sentences make up a Paragraph, many paragraphs form a SECTION, and many sections form a DIVISION.
Q. What's the syntax of the Assembler Language like?
An Assembler Instruction is made up of 5 different Parts.
1. An Optional-Name or Label – You can assign a name to every Assembler Instruction. This corresponds to a COBOL File-Name, data-name or Paragraph-Name in the DATA and PROCEDURE DIVISION. For example, in the below snap PRINTER is the Internal File-Name.


2. A required Operation-Code – The Operation-Code, or OPCODE, represents the Assembler Statement that you want to execute. It may be a Machine Instruction, an Assembler Directive or a Macro. By convention, you start coding the Opcode from Column 10. In the above snap, DCB is the operation-code.

3. Optional Operands – Any Assembler Instruction can carry one or more Operands, separated by commas. In a Macro such as DCB, each Operand follows the keyword format Parameter=Value; Machine Instructions take positional Operands instead. DDNAME=OUTFILE,DEVD=DA,MACRF=(PM), are all operands. You normally start coding the first operand from Column-16.

4. Optional Comments – You can write anything you want describing the code as Comments. Comments can begin anywhere after the Operands, but there must be at least one blank space between the Last Operand and the beginning of the Comment.

5. An Optional Continuation-Indicator – When you want to continue an Assembler Instruction onto the next line, you code a continuation character in Position 72 of the Assembler Statement. The continuation character can be anything other than a blank (white-space). Continued lines normally begin in Position 16.
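To make the five parts concrete, here is a rough Python sketch (not part of the Assembler toolchain) that splits one source line into those parts by position. The function name split_statement and the remark text OUTPUT FILE are made up for illustration, and the sketch assumes the operands contain no embedded blanks and that the line is not a comment line.

```python
def split_statement(line: str) -> dict:
    """Split one fixed-format source line into name, opcode, operands, remarks."""
    card = line.ljust(80)[:80]            # treat the line as an 80-column card image
    continued = card[71] != " "           # position 72: any non-blank means "continued"
    body = card[:71].rstrip()             # positions 1-71 hold the statement itself

    has_name = body[:1] not in ("", " ")  # a name, if coded, starts in position 1
    fields = body.split(None, 3 if has_name else 2)
    if not has_name:
        fields.insert(0, "")              # no label on this statement
    fields += [""] * (4 - len(fields))    # pad any missing trailing fields
    name, opcode, operands, remarks = fields
    return {
        "name": name,
        "opcode": opcode,
        "operands": operands,
        "remarks": remarks,
        "continued": continued,
    }


# Hypothetical line built from the operands quoted above; the remark is invented.
stmt = "PRINTER  DCB   DDNAME=OUTFILE,DEVD=DA,MACRF=(PM)   OUTPUT FILE"
print(split_statement(stmt))
```

Splitting on blanks is a simplification. The real Assembler also has to cope with quoted strings that contain blanks, which this sketch ignores.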
Q. Is there any restriction on the length of File-names, data-names or paragraph names in Assembler?
It is most likely that you are using High-Level Assembler (HLASM) or Assembler H at your shop. These allow labels up to 63 characters. In certain cases only names of up to 8 characters are allowed, for example program names or sub-routine names. Many old Assembler programmers continue to use the older limit of 8 characters out of habit. I personally feel it's wise to stick to the 8-Character Limit.
Q. What are the important positions in an Assembler Statement?
Position 01 is called the Begin-Column and Position 71 is called the End-Column. You code Assembler statements between the Begin-Column and the End-Column. If the instruction is longer, you put a continuation character in column 72, known as the continuation-indicator field. Position 16 is called the Continue-Column.
Q. Do I need to watch out for anything?
Absolutely, there's no room for errors. A goof-up could be costly, and unlike JCL or COBOL, sometimes the errors would not even show up. I would like to warn you that when an Assembler statement continues onto the next Line, you need to put a comma at the end of the first Line. In the snap below, forgetting to put a comma after MACRF=(PM) makes the Assembler think RECFM, LRECL, BLKSIZE and DSORG are comments. As seen, they’ve been highlighted in Turquoise.

Another typo that you are likely to commit when you start coding Assembler is putting in extra commas.
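The missing-comma mistake is easy to check for mechanically. Below is a rough Python sketch of such a check. It is only a heuristic of my own, not something the Assembler provides: the helper names card, operand_field and suspect_continuations are hypothetical, and the values FB and 80 are invented. It flags any line that has a continuation character in Position 72 but whose operand field does not end in a comma.

```python
def card(text: str, continued: bool = False) -> str:
    """Build a 72-column line, with 'X' in position 72 when it is continued."""
    return text.ljust(71) + ("X" if continued else " ")


def operand_field(line: str, is_continuation: bool) -> str:
    """Return the operand field: it ends at the first blank after it starts."""
    body = line.ljust(80)[:71]
    if is_continuation:
        fields = body[15:].split()        # continued operands start in position 16
        return fields[0] if fields else ""
    fields = body.split()
    index = 2 if body[0] != " " else 1    # name present: operands are the 3rd field
    return fields[index] if len(fields) > index else ""


def suspect_continuations(lines: list) -> list:
    """Return the 1-based numbers of continued lines whose operands lack a comma."""
    suspects = []
    is_continuation = False
    for number, line in enumerate(lines, start=1):
        continues = line.ljust(80)[71] != " "
        if continues and not operand_field(line, is_continuation).endswith(","):
            suspects.append(number)
        is_continuation = continues
    return suspects


good = [card("PRINTER  DCB   DDNAME=OUTFILE,DEVD=DA,MACRF=(PM),", True),
        card(" " * 15 + "RECFM=FB,LRECL=80")]
bad = [card("PRINTER  DCB   DDNAME=OUTFILE,DEVD=DA,MACRF=(PM)", True),
       card(" " * 15 + "RECFM=FB,LRECL=80")]
print(suspect_continuations(good), suspect_continuations(bad))
```

A continued line without a trailing comma is not always wrong (a long comment can legitimately be continued), so treat the result as a list of lines worth a second look.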
Q. How do you form a valid Label in Assembler?
A Label in Assembler is a character-string made up of A-Z, 0-9, @, $ and #. Assembler H also supports the underscore sign '_'. Generally, in COBOL Programs, you find paragraph names such as 300-HOUSEKEEPING, 2500-PRIMING-READ, 2450-CHECK-RC etc. Paragraph-names start with digits because the numbering tells the COBOL Programmer in which direction to look for the paragraph in the COBOL Program Listing.

In Assembler, labels cannot start with a Digit. Assembler-Programmers also have no need for a numeric prefix in the Label to tell them where the Label is located, because the Assembler Listing tells them that automatically. More on this later.
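The label rules above can be written down as a small check. This is a Python sketch under the rules stated in this post only (upper-case letters, digits, @, $, #, underscore, no leading digit, at most 63 characters); the function name is_valid_label is hypothetical, and your Assembler's own manual remains the authority.

```python
import re

# First character: anything allowed except a digit; then up to 62 more characters.
LABEL_PATTERN = re.compile(r"[A-Z@$#_][A-Z0-9@$#_]{0,62}")


def is_valid_label(label: str) -> bool:
    """Check a label against the rules described in this post."""
    return LABEL_PATTERN.fullmatch(label) is not None


for candidate in ("SAVEAREA", "REC_NUM", "300-HOUSEKEEPING", "2450CHECK"):
    print(candidate, is_valid_label(candidate))
```

The COBOL-style name 300-HOUSEKEEPING fails twice over: it starts with a digit and it contains a hyphen, which the Assembler would read as a minus sign.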

The Assembler puts the address of each label referred to in a Branch Instruction in the ADDR1 Column. By looking at this value and comparing it with the current LOC Value, you can determine whether the Label precedes or follows the current Line.

To be a productive Assembler Programmer, you have to get into the habit of looking at the addresses in the ADDR1 and ADDR2 Columns of the Assembler Listing to locate Labels.
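The comparison is nothing more than hexadecimal arithmetic. Here is a tiny Python sketch of it; the function name branch_direction and the sample LOC and ADDR1 values are made up for illustration.

```python
def branch_direction(loc: str, addr1: str) -> str:
    """Compare a statement's LOC with the ADDR1 of the label it refers to."""
    here = int(loc, 16)      # both columns are printed in hexadecimal
    target = int(addr1, 16)
    if target > here:
        return "forward"     # the label is further down the listing
    if target < here:
        return "backward"    # the label is earlier in the listing
    return "same"


# Hypothetical values: a statement at LOC 00001A referring to a label at 0000DC.
print(branch_direction("00001A", "0000DC"))
```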
Q. A COBOL Program has a definite Layout IDENTIFICATION, ENVIRONMENT, DATA and PROCEDURE DIVISION. What about Assembler?
Assembler does not follow a strict, rigid layout for writing Programs. It is far more relaxed than COBOL's organization of Programs. Below, I have listed what each DIVISION of a COBOL Program does.

What I do is, I write Assembler Programs in a sequence, which roughly approximates the above structure.

Actual Instructions
File I-O Areas, working-storage areas
File-names and Data Control Blocks (DCB)
External–record and data descriptions(in dummy sections-DSECTS)
Q. How do you decode the Assembler Listing?
I am going to walk through the Assembler Listing that was produced for the Program ASPROG02. The complete Assembler Listing can be found by clicking here. The first page of your Assembler Listing should be titled External-Symbol Dictionary, commonly abbreviated as "ESD".


1. The SYMBOL Column shows any Control-Section(CSECT) names and ENTRY Names inside your Program.
2. The ADDR Column shows where the name under the SYMBOL Column is located. In this case ASPROG02 begins at Address 0 within the Program.
3. The LENGTH Column tells the length of the item named in the SYMBOL Column. In this case, the size of Program ASPROG02 is hexadecimal X'3CC' (972 Bytes).
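If you want to double-check a length such as X'3CC' without doing the conversion by hand, a couple of lines of Python will do it. The helper name hex_length is hypothetical.

```python
def hex_length(value: str) -> int:
    """Convert a hexadecimal value, with or without the X'...' wrapper, to decimal."""
    text = value.strip().upper()
    if text.startswith("X'") and text.endswith("'"):
        text = text[2:-1]            # drop the X' prefix and the closing quote
    return int(text, 16)


# 3 * 256 + 12 * 16 + 12 = 972
print(hex_length("X'3CC'"))
```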

The next Page, and several subsequent Pages, contain the listing of the instructions in your Program. This is the same information provided in a COBOL PMAP or CLIST.


LOC – This contains the location, or hexadecimal Displacement of the Statement from the beginning of the Program.
Object Code – This contains 3 columns. Not all the 3 columns are always filled. This is the Actual Machine Instruction generated by the Assembler, which you would find in memory in the Dump.
ADDR1 – This column will contain the Hexadecimal address of the first operand in the instruction that you have coded. It is the same value that is printed in the LOC Column on the line where the Label is actually coded.
ADDR2 – This column will contain the Hexadecimal address of the second operand in the Instruction.
STMT – This column will contain the Line-Number or Statement-Number.
SOURCE STATEMENT – This is the actual 80-Byte Assembler Source Statement.

Understanding the Assembler Listing is very important to be a good Assembler-Programmer.
1. Take a look at Statement 15.
For the second operand SAVEAREA, the assembler has substituted the hexadecimal address ADDR2=X'DC'. Now find the hexadecimal Address X'DC' in the LOC Column. That's where SAVEAREA is defined.


2. Locate the Statement
Both ADDR1 and ADDR2 Columns should have a printed value there. RECNUM is substituted by ADDR1=X'304' and ONE is substituted by ADDR2=X'309'. Scan the Program until you find these addresses.


You should try practicing this a few more times, till you've mastered it.


Following the Assembler Instruction Listing, there will be the Relocation-Dictionary or "RLD". The RLD will be used by the Loader when it brings your program into Memory for Execution. For the moment, I’ll skip the RLD. Next, you shall find the Cross-Reference Listing.


SYMBOL – The name of the field, or Instruction-Label, just like the name in a COBOL Cross-Reference XREF Listing.
LEN – The Assembler’s computed Length for that symbol. How much storage space does it occupy in Bytes?
VALUE – This contains the Hexadecimal address of the location of the Symbol.
DEFN – The Statement number where the Symbol was defined. If you accidentally define a Symbol twice, the Assembler will list it in the Cross-Reference and flag the second definition as a duplicate.
References – The Statement numbers where the Symbol is referred to.

When the Assembler-Program contains errors, the Assembler provides diagnostic information about each error in two places. First, a line containing
* * * ERROR * * *
is printed following the statement with the error. Secondly, after the Cross-Reference, a summary of all the statements that were flagged in error is printed under Statistics.
