Dated: Aug. 12, 2004
COBOL Programming
- History of Cobol
- Who Uses Cobol
- Is Cobol a Best Practice
- Was It a Best Practice in 1959?
- Re-engineering Cobol into C
- Let the Game Begin
- Starts with the Basics of Cobol
- Coding Areas
- Data Division
- Identification Division
- Procedure Division
- Boolean Data
- Initializing the Data
- Printing and Writing the Data
COBOL is a high-level programming language of the procedural type. That is, it is not a functional, logic-oriented or object-oriented language. Like most programming languages, it is used primarily in the implementation phase of software development, although low-level designs are sometimes written in a COBOL-like pseudo code. A COBOL programmer writes a program as keyed text, because the most common human-to-computer medium at the time of the language's invention was the punch card.
The name COBOL is an acronym that expands to Common Business Oriented Language. It was designed to be used in business applications. Today, we would say Information Systems. It continues to be most commonly used in Information Systems applications.
History of Cobol
In 1959, the U.S. Defense Department created a group called the Short-Range Committee, which, over a period of a few months, defined the COBOL language; the first COBOL compiler was implemented in 1960. The committee was organized by the Conference on Data Systems Languages (CODASYL). A midrange committee was also organized, and a long-range committee was defined but never created. The Short-Range Committee consisted of representatives from the National Bureau of Standards, the US Air Force, RCA, Burroughs, the Navy, Sylvania and Sperry Rand. Its members were selected to combine computer manufacturers, which were expected to implement COBOL, with expected users of COBOL.
These committee members were practitioners, with almost no representation of academically oriented researchers. The Short-Range Committee worked largely from knowledge of the existing Flow-Matic language, developed by Grace Hopper, and IBM's specification of its planned Commercial Translator. (Sammet, 1978) (IEEE, 1999)
The committee intended the language to have good file support, including support for sequential, key-indexed and direct access files. They wanted the language to be readable, even by the programmers' managers. Several design principles guided the design of COBOL.
- separate data & procedures
- machine dependent statements in one place
- ease of transcription to required media (cards at the time)
- effectiveness of problem structure
- ease of implementation (for compiler writers)
- physically available character set (printable)
- noise words
- long data names (Sammet, 1978)
Even for a standardized language, implementers add extensions to the language to gain a market advantage or to make use of features in their hardware. For example, IBM includes pointer variables and the "address of" clause to set a pointer. It also offers the option to turn on subscript range checking at run time. (IBM, 1993) Several implementers offer editors sensitive to the COBOL syntax and even a complete interactive development environment.
COBOL continues to progress, with work being done on the Object-Oriented COBOL standard. The ISO/IEC issued Committee Draft 1.8 in February of this year. (ISO/IEC, 2000) Even before the standard is finalized, Hitachi, IBM, Micro Focus (now part of Merant) and others are offering object-oriented COBOL compilers.
Who Uses Cobol
COBOL has had four decades, since it was defined, to become entrenched in the business computing community. As a programming language, COBOL is intended to be used, and is primarily used during software implementation. However, sometimes low-level designs are written in a COBOL-like pseudo code.
Although estimates of the extent of COBOL's use vary, they all indicate that COBOL is very widely used. Robert Glass (1997) estimates that two-thirds of all the world's programmers use COBOL. Estimates of the number of COBOL programmers range from 2.5 million (Keuffel 2000) to 3 million (Fussichen, 1990). A recent voluntary response survey by Carr and Kizior (2000) on the future of COBOL education indicates that more than 87% of the business respondents currently develop and maintain COBOL applications, as illustrated in figure 5. More than 45% of the respondents also said they would continue to use COBOL through the next decade, although 30% thought their use of COBOL would decline.
Estimates of the amount of COBOL code in productive use vary widely, but they do not contradict each other. Arranga (2000) estimates between 18 billion and 200 billion lines of COBOL code are running production applications worldwide. Estimates of 80 billion by Fusichen (1990), 150 to 175 billion lines by Wheatley in (Arranga, 1999) and 180 billion by Hardgrave and Doke (2000) fall within that range. The last authors also estimate that 5 billion lines of new COBOL code are written each year. Addressed from the perspective of software applications written, Arranga and Price (2000) say that 35% of all new business application development is written in COBOL.
Is Cobol a Best Practice
To determine if COBOL really is a best practice among software engineering practices, we must divide the question into two parts. Was COBOL a best practice when it was first introduced? Is COBOL still a best practice today? Since four decades have elapsed since its introduction, and both COBOL and software engineering have changed over time, we must ask this two-part question. We also note that a good programming language must be readable to make programs maintainable and it must include modern language constructs.
Was It a Best Practice in 1959?
Was COBOL a best practice when it was introduced in 1959? Like every high-level language, COBOL was intended to be further from the machine language and assembler language that were in common use at the time. This distance brought it closer to human natural language. It was only the second high-level language, after FORTRAN, to become and stay popular. Unlike FORTRAN, COBOL was intended to be readable by business programmers and even by their managers. It is probably the only programming language that can be read aloud and make sense. Because it is readable, it is also more easily written. Programmers can easily remember the verbs and required structure, because they combine to make readable sentences. A COBOL statement is even called a sentence, and it usually ends with a period. The only drawback to readability is that COBOL can be wordy. Language readability is a prerequisite to maintainability. This is important in the Information Systems industry, where programs get modified and enhanced for twenty or more years without being replaced.
The development of COBOL made many contributions to the field of high level programming languages, some of which have carried over to other languages. Its development was a joint effort that was clearly successful. It separated the data description from the procedural statements.
Re-engineering Cobol into C
Joiner & Tsai (1994) report on a tool for re-engineering a COBOL application into the C language. The paper lists five conclusions about the COBOL programs that they studied. Four of the conclusions imply that the existing COBOL programs are well structured and easy to understand.
COBOL applications are constructed of many small programs.
“Existing record variables make good candidates for instance variables since many represent the state of entities in the application domain.”
“Existing paragraphs make good candidates for methods since they are generally small and define few variables.”
“Minimal ‘hand’ re-engineering is needed because the programs use few messy constructs such as GOTOs, ALTERS or fall-throughs.”
According to the report, translation of COBOL into C is desirable because COBOL lacks local variables and cannot pass parameters to called procedures. The second statement has never been true, because COBOL has always had the CALL statement, which can invoke an external procedure and pass parameters to it. Moreover, ANSI 85 COBOL completely eliminated both deficiencies with the introduction of nested programs. Any number of whole subprograms can be nested inside a main program, and those subprograms can themselves contain whole subprograms, in a single compilation unit. These subprograms are invoked with a CALL statement that passes parameters. Thus, the main program and each subprogram can own local variables whose scope is limited to that subprogram. Joiner and Tsai present only those two reasons for changing to C, leaving COBOL programmers familiar with nested programs no good reason to change.
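The nested-program mechanism described above can be sketched as follows. This is a minimal illustration, not code from Joiner and Tsai or from any standard text: the program names MAIN-PROG and CALC-TOTAL and all data names are hypothetical, and the source is laid out in fixed format with the sequence area left blank.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. MAIN-PROG.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Owned by MAIN-PROG; the nested program only sees these
      * items because they are passed as parameters.
       01  ORDER-QTY          PIC 9(3)  VALUE 12.
       01  UNIT-PRICE         PIC 9(3)  VALUE 5.
       01  ORDER-TOTAL        PIC 9(6)  VALUE ZERO.
       PROCEDURE DIVISION.
       MAIN-PARA.
           CALL 'CALC-TOTAL' USING ORDER-QTY UNIT-PRICE ORDER-TOTAL
           DISPLAY 'Order total: ' ORDER-TOTAL
           STOP RUN.

       IDENTIFICATION DIVISION.
       PROGRAM-ID. CALC-TOTAL.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Visible only inside CALC-TOTAL.
       01  WORK-TOTAL         PIC 9(6).
       LINKAGE SECTION.
      * Descriptions of the parameters received from the caller.
       01  L-QTY              PIC 9(3).
       01  L-PRICE            PIC 9(3).
       01  L-TOTAL            PIC 9(6).
       PROCEDURE DIVISION USING L-QTY L-PRICE L-TOTAL.
       CALC-PARA.
           MULTIPLY L-QTY BY L-PRICE GIVING WORK-TOTAL
           MOVE WORK-TOTAL TO L-TOTAL
           EXIT PROGRAM.
       END PROGRAM CALC-TOTAL.
       END PROGRAM MAIN-PROG.
```

Both programs sit in one compilation unit, and WORK-TOTAL cannot be referenced from MAIN-PROG, which is the scoping behaviour the paragraph above attributes to nested programs.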
Let the Game Begin
The best way to learn to program, or to learn a new language, is to write code and run it on a computer. Consequently, you really need a computer (probably a PC), a text editor (Notepad or WordPad will do) to write the code in, and, most importantly, a COBOL compiler, which will check your code and then convert it into something the computer can understand and execute. I use the Fujitsu COBOL85 ver3.0 compiler, which can be downloaded for free.
In COBOL source code, the position of each coding element on the line matters to the compiler. Essentially, the first six positions are ignored by the compiler and are usually used by the programmer for line numbers. These numbers are not the same as those in BASIC, where the line number is used as part of the logic (e.g. GOTO 280, sending the logic to line 280).
The seventh position is called the continuation area. Only certain characters ever appear here, these being:
* (asterisk), / (solidus or forward slash), or - (hyphen).
The asterisk is used to precede a comment, i.e. all that follows is ignored by the compiler. The solidus is used to indicate a page break when printing coding from the compiler, but it too can be used as comment since the rest of the line is ignored by the compiler. The hyphen is used as a continuation marker, i.e. when a quoted literal needs to be extended over to the next line. It is not for continuing a statement onto the next line (this is unnecessary*) and also cannot be used to continue a COBOL word. (*You can write any COBOL statement over as many lines as you like, so long as you stay in the correct coding region and don't split strings.)
000200*Here is a comment.
000210/A new line for printing and a comment.
000340     DISPLAY 'This might be a very long string that
000350-    'needs to be continued onto the next line'
Positions 8 to 11 and 12 to 72 are called area A and area B, respectively.
Identifier names
User-defined names must conform to the following rules:
- Must only consist of alphabetic and numeric characters and/or hyphens
- The name must contain at least one alphabetic character
- Must be no more than 30 characters
- When using hyphens, they must not appear at the beginning or end of the name
Some examples of legal names:
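As an illustration, the short program below declares a few names that satisfy the rules above and lists some that do not. All of the names are hypothetical and chosen only for this sketch.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. NAME-DEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
      * Letters, digits and embedded hyphens only.
       01  CUSTOMER-NAME       PIC X(20) VALUE 'Joe Bloggs'.
       01  LINE-2-TOTAL        PIC 9(4)  VALUE 25.
       01  A1                  PIC 9(2)  VALUE 7.
      * Not legal under the rules above (kept as comments so
      * that the program still compiles):
      *   -RECORD-COUNT    starts with a hyphen
      *   RECORD-COUNT-    ends with a hyphen
      *   12345            contains no alphabetic character
      *   TOTAL$           contains a character that is not a
      *                    letter, digit or hyphen
       PROCEDURE DIVISION.
       MAIN-PARA.
           DISPLAY CUSTOMER-NAME
           DISPLAY LINE-2-TOTAL
           DISPLAY A1
           STOP RUN.
```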
As with all COBOL code, the compiler does not distinguish between upper and lower case letters in names (except within quotes). Lastly, COBOL has a large list of reserved words that cannot be used as identifier names.
The full stop (period) is the most important punctuation mark used, and its use will be detailed later. Generally, every line of the IDENTIFICATION, ENVIRONMENT, and DATA DIVISION ends in a period.
Quotation marks, either single or double, are used to surround quoted literals (and when calling a sub-program). However, don't mix them when surrounding the literal, e.g.
" This is bad '
" but this is ok "
Commas and semi-colons can also be used to separate the items in a list of identifiers, e.g.
MOVE 2 TO DATA-ITEM-1, DATA-ITEM-2, DATA-ITEM-3
A space must follow the comma or semi-colon. These separators are optional, and a space alone would suffice, but they do add clarity.
Since COBOL was developed in the USA, the spelling of words is American, e.g. INITIALIZE or ORGANIZATION (using Z rather than S). Brits be warned!
In many cases, abbreviations and alternative spellings are available (see reserved word list), e.g. ZERO ZEROS ZEROES all mean the same thing. Likewise, LINE and LINES, PICTURE and PIC, THROUGH and THRU.
Hello World Program
As is traditional for all introductory lessons for a programming language, here's a 'Hello World' program
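A minimal version might look like the sketch below. The program name HELLO-WORLD and paragraph name MAIN-PARA are placeholders for this illustration. Division headers and the paragraph name start in area A (position 8), the statements start in area B (position 12), and the sequence area is left blank.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. HELLO-WORLD.
       PROCEDURE DIVISION.
       MAIN-PARA.
           DISPLAY 'Hello World'
           STOP RUN.
```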
The different pieces of data need to be defined so that the program can read a record at a time, placing each piece of information into the right area of memory (which will be labelled by an identifier).
COBOL program code is divided into four basic divisions: the IDENTIFICATION, ENVIRONMENT, DATA, and PROCEDURE divisions. The identification division is required, but in theory the others are not strictly necessary (although you won't have much of a program without any procedures or data!).
The data division is where memory space in the computer is allocated to data and identifiers that are to be used by the program. Two important sections of this division are the FILE SECTION and the WORKING-STORAGE SECTION. The file section is used to define the structure, size and type of the data that will be read from or written to a file.
Suppose the 'input.dat' file (the file assigned to INPUT-FILE in the environment division) contains a series of records about a company's customers, giving details of name, address, and customer number. If you were to open 'input.dat' with a text editor you would see each record on a new line like this:
Joe Bloggs  20Shelly Road  Bigtown      023320
John Dow    15Keats Avenue Nowheresville042101
Jock MacDoon05Elliot Drive Midwich      100230
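A FILE SECTION entry for records like these could be sketched as follows. The field widths (12, 2, 13, 13 and 6 characters) are assumptions read off the three sample rows, not figures given in the text, and the record, field and flag names are hypothetical. The program expects 'input.dat' to exist in the working directory, and LINE SEQUENTIAL is a compiler extension rather than standard COBOL 85.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. READ-CUSTOMERS.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT INPUT-FILE ASSIGN TO 'input.dat'
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  INPUT-FILE.
      * Assumed layout: one fixed-width field after another.
       01  CUSTOMER-RECORD.
           03  CUST-NAME         PIC X(12).
           03  CUST-ADDRESS.
               05  HOUSE-NUMBER  PIC 99.
               05  STREET        PIC X(13).
               05  TOWN          PIC X(13).
           03  CUST-NUMBER       PIC 9(6).
       WORKING-STORAGE SECTION.
       01  EOF-FLAG              PIC X VALUE 'N'.
           88  NO-MORE-RECORDS   VALUE 'Y'.
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN INPUT INPUT-FILE
      * Each READ places one line of the file in CUSTOMER-RECORD.
           PERFORM UNTIL NO-MORE-RECORDS
               READ INPUT-FILE
                   AT END
                       MOVE 'Y' TO EOF-FLAG
                   NOT AT END
                       DISPLAY CUST-NUMBER ' ' CUST-NAME ' ' TOWN
               END-READ
           END-PERFORM
           CLOSE INPUT-FILE
           STOP RUN.
```

Because the record is described field by field, each READ drops every piece of information into its own named area of memory without any further parsing.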
The environment division tells the computer what the program will be interacting with (i.e. its environment), such as printers, disk drives, other files, etc. It has two important sections: the CONFIGURATION SECTION (which defines the source and object computer) and the INPUT-OUTPUT SECTION (which defines the printers and files that may be used and assigns identifier names to these external features).
000260 ENVIRONMENT DIVISION.
000270 CONFIGURATION SECTION.
000280 SOURCE-COMPUTER. IBM PC.
000290 OBJECT-COMPUTER. IBM PC.
000300 INPUT-OUTPUT SECTION.
000310 FILE-CONTROL.
000320     SELECT INPUT-FILE ASSIGN TO 'input.dat'
000330         ORGANIZATION IS LINE SEQUENTIAL.
000340     SELECT PRINT-FILE ASSIGN TO PRINTER.
The identification division tells the computer the name of the program and supplies other documentation concerning the program's author, when it was written, when it was compiled, who it is intended for...etc. In fact, only the program name is required by the compiler.
000100 IDENTIFICATION DIVISION.
000110 PROGRAM-ID. EXAMPLE-1-PROG.
000120 AUTHOR. TIM R P BROWN.
000130 INSTALLATION. XYZ GROUP.
000140 DATE-WRITTEN. 17/5/00.
000160 SECURITY. LOCAL GROUP.
The procedure division is where the logic of the program is actually found. COBOL is a modular language, in that a program is usually broken up into units described as paragraphs.
000900 PROCEDURE DIVISION.
000920     PERFORM READ-DATA-FILE
000930     PERFORM CALCULATE-PRICES
000940     PERFORM PRINT-PRICE-REPORT
000950     STOP RUN.
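To show how the PERFORM statements in that fragment relate to paragraphs, here is a complete sketch in which each performed paragraph is given a small stand-in body. The paragraph names follow the fragment (with CALCULATE spelled in full); MAIN-PARA, the data items and the paragraph contents are assumptions made only for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. PARA-DEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  UNIT-PRICE            PIC 9(3)V99 VALUE 12.50.
       01  ITEM-QTY              PIC 9(3)    VALUE 4.
       01  TOTAL-PRICE           PIC 9(6)V99 VALUE ZERO.
       01  TOTAL-OUT             PIC ZZZ,ZZ9.99.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Control returns here after each paragraph finishes.
           PERFORM READ-DATA-FILE
           PERFORM CALCULATE-PRICES
           PERFORM PRINT-PRICE-REPORT
           STOP RUN.

       READ-DATA-FILE.
      * Stand-in for real file input.
           DISPLAY 'Reading data...'.

       CALCULATE-PRICES.
           MULTIPLY UNIT-PRICE BY ITEM-QTY GIVING TOTAL-PRICE.

       PRINT-PRICE-REPORT.
      * TOTAL-OUT is an edited item that formats the number.
           MOVE TOTAL-PRICE TO TOTAL-OUT
           DISPLAY 'Total price: ' TOTAL-OUT.
```

Each paragraph starts with its name in area A and ends at the period before the next paragraph name, so the main paragraph reads as a summary of the program's steps.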
Boolean data is either TRUE or FALSE. This kind of data is useful for flags, using so-called condition-name conditions.
A simple example:
- When the number entered (line 250) is greater than 1000, a 'Y' character is moved to the level 01 item NUMBER-SIZE. The effect of this is to give the level 88 item BIG-NUMBER a TRUE condition. This is what level 88 is for in COBOL.
- Line 240 initially sets BIG-NUMBER to false by moving an 'N' character into NUMBER-SIZE, although any character (other than 'Y') would have the same effect.
- IF BIG-NUMBER THEN... is like saying "IF BIG-NUMBER is true THEN..."
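The example those notes describe can be sketched as follows. NUMBER-SIZE and BIG-NUMBER come from the notes; the program name, the item DATA-NUMBER and the displayed messages are assumptions. The MOVE 'N' statement plays the role of what the notes call line 240 and the ACCEPT that of line 250. How ACCEPT converts typed digits into a numeric item can vary between compilers.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. BIG-NUMBER-DEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  NUMBER-SIZE           PIC X.
      * BIG-NUMBER is true whenever NUMBER-SIZE contains 'Y'.
           88  BIG-NUMBER        VALUE 'Y'.
       01  DATA-NUMBER           PIC 9(6).
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Start with the condition false.
           MOVE 'N' TO NUMBER-SIZE
           DISPLAY 'Enter a number: '
           ACCEPT DATA-NUMBER
           IF DATA-NUMBER > 1000
               MOVE 'Y' TO NUMBER-SIZE
           END-IF
      * Tests the level 88 condition name, not the 01 item.
           IF BIG-NUMBER
               DISPLAY 'That is a big number'
           ELSE
               DISPLAY 'That is not a big number'
           END-IF
           STOP RUN.
```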
Multiple level 88 items can be defined for a single data item, and a single level 88 item can list more than one value that will set its condition to true.
01 THIRTY-DAY-MONTHS PIC X VALUE SPACE.
   88 SEPTEMBER VALUE 'S'.
   88 APRIL VALUE 'A'.
   88 JUNE VALUE 'J'.
   88 NOVEMBER VALUE 'N'.
01 MONTHS-CHECK PIC X.
   88 SHORT-MONTH VALUE 'S' 'A' 'J' 'N' 'F'.
01 GRADES-CHECK PIC 999.
   88 A-GRADE VALUE 70 THRU 100.
   88 B-GRADE VALUE 60 THRU 69.
   88 C-GRADE VALUE 50 THRU 59.
   88 FAIL-GRADE VALUE 0 THRU 49.
GRADES-CHECK uses THRU (or THROUGH) to allow a range of numeric values to be tested.
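A short sketch of how such ranges might be tested is shown below. It reuses the GRADES-CHECK definition; the program name, the mark of 65 and the messages are assumptions chosen for illustration.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. GRADE-DEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  GRADES-CHECK          PIC 999.
           88  A-GRADE           VALUE 70 THRU 100.
           88  B-GRADE           VALUE 60 THRU 69.
           88  C-GRADE           VALUE 50 THRU 59.
           88  FAIL-GRADE        VALUE 0 THRU 49.
       PROCEDURE DIVISION.
       MAIN-PARA.
           MOVE 65 TO GRADES-CHECK
      * The first condition name that is true selects the branch.
           EVALUATE TRUE
               WHEN A-GRADE    DISPLAY 'Grade A'
               WHEN B-GRADE    DISPLAY 'Grade B'
               WHEN C-GRADE    DISPLAY 'Grade C'
               WHEN FAIL-GRADE DISPLAY 'Fail'
               WHEN OTHER      DISPLAY 'Mark out of range'
           END-EVALUATE
           STOP RUN.
```

With a mark of 65 the B-GRADE condition is true, so the program displays 'Grade B'. The WHEN OTHER branch catches values such as 101 that fall outside every range.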
A useful verb to use is SET. Rather than having to use the line:
MOVE 'Y' TO NUMBER-SIZE
as in the code example above, you can simply set the boolean variable to true by coding:
SET BIG-NUMBER TO TRUE
This means that you don't have to worry about what the value of the level 01 item has to be in order to make the associated level 88 true (notice that it is the level 88 item name that is set to true and NOT the level 01 item). Of course, you might also code
SET BIG-NUMBER TO FALSE
although many compilers accept this only when the level 88 entry also names a value to use for false (the WHEN SET TO FALSE phrase of later COBOL standards).
Initializing the Data
During a program run it is often necessary to reset an item, or group of items, back to zero (or another value), or back to a certain literal. Often the program also requires data to be set to a certain value (or literal) at the beginning of a run. For example, an item may be used to count the number of records that have been read by the program, with the following line executed each time a record is read:
COMPUTE REC-COUNT = REC-COUNT + 1
Obviously, the first time REC-COUNT is encountered, it would need to have a value (probably zero). This could be achieved in the data division:
01 REC-COUNT PIC 9(4) VALUE ZERO.
Alternatively, early in the procedure division, the command
MOVE ZERO TO REC-COUNT
would have the same effect. If, however, you wished to set a group of items to zero (to zeroize) and/or set other alphanumeric items in that group to spaces then you could use the INITIALIZE verb. For example:
000200 DATA DIVISION.
000210 WORKING-STORAGE SECTION.
000220 01 DATA-GROUP.
000230     03 REC-COUNTER PIC 9(4).
000240     03 REC-TYPE PIC X(2).
000250     03 REC-DATE PIC 9(6).
000260     03 FILLER PIC X(14) VALUE 'Record details'.
And in the procedure division:
000400     INITIALIZE DATA-GROUP
The effect of this is that, whatever the contents of the level 03 items were prior to the INITIALIZE statement, REC-COUNTER will now contain zero, as will REC-DATE, and REC-TYPE will contain spaces. However, FILLER (the last item) is actually a reserved word and refers to an unused area. The word 'FILLER' can actually be omitted (i.e. 03 PIC X(14) VALUE 'Record details'.). As you will see in the Printing/writing data part of the next section, a literal can be assigned to this. Following initialization the filler will remain unchanged (and not space-filled).
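The behaviour just described can be checked with a small self-contained sketch. The group definition is the one from the text; the program name and the sample values moved into the items are assumptions.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. INIT-DEMO.
       DATA DIVISION.
       WORKING-STORAGE SECTION.
       01  DATA-GROUP.
           03  REC-COUNTER       PIC 9(4).
           03  REC-TYPE          PIC X(2).
           03  REC-DATE          PIC 9(6).
           03  FILLER            PIC X(14) VALUE 'Record details'.
       PROCEDURE DIVISION.
       MAIN-PARA.
      * Put some arbitrary data into the named items.
           MOVE 1234   TO REC-COUNTER
           MOVE 'AB'   TO REC-TYPE
           MOVE 170500 TO REC-DATE
           DISPLAY 'Before: ' DATA-GROUP
           INITIALIZE DATA-GROUP
      * Numeric items are reset to zero and REC-TYPE to spaces;
      * the FILLER text is not touched.
           DISPLAY 'After:  ' DATA-GROUP
           STOP RUN.
```

Displaying the whole group before and after makes the effect visible in one line: the second DISPLAY should show zeros, two spaces, more zeros, and then the unchanged 'Record details' text.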
The following code illustrates how a printed report is defined in the data division. If writing to a file it would be virtually identical.
If you wished to print a report in the form of a table then you would first have to assign an identifier name to the printer in the environment division using the select clause.
- The printout would have the following format:
- The printer was assigned to PRINT-FILE (the FD level) with the level 01 called REPORT-OUT
- There are four groups used to define each main part of the printout: PRINT-HEADERS (for the title and column heads), PRINT-LINE (for the actual data from the records), P-FOOTER (for the totals at the end of the table), and P-BATCH which appears after the main table and lists various totals
- To define text, fillers are used with a VALUE of what the text is to be, e.g.
001090     03 COL-HEAD-1 PIC X(31)
001100         VALUE ' PART CUST/ DATE QUANT'.
This is the first line of the column header, with COL-HEAD-2 giving the next line.
- Spaces between the titles are produced by defining a PIC X size that is larger than the text, since the extra positions will be space-filled
- Spaces between data are achieved by the use of fillers with a VALUE SPACES for the desired PIC X size.
- Data and strings to be printed are first moved to the appropriate item of the print group and then the entire group is written to REPORT-OUT, which is defined as PIC X(80). For example:
003220     MOVE PAGE-NO TO P-PAGE-NO
003230     WRITE REPORT-OUT FROM P-TITLE AFTER PAGE
Here the page number is moved to the P-TITLE sub-group member (of PRINT-HEADERS) P-PAGE-NO. The following line effectively means:
MOVE P-TITLE TO REPORT-OUT
WRITE REPORT-OUT AFTER PAGE
(AFTER PAGE instructs the printer to start a new page)
- It is in the data groups involved in printing (or writing to a file) that data editing (such as zero-suppression) is performed
- Simply changing SELECT PRINT-FILE ASSIGN TO PRINTER to SELECT PRINT-FILE ASSIGN TO 'report.txt' and adding ORGANIZATION IS LINE SEQUENTIAL would be all that was required to produce the same report in a file called 'report.txt', although the AFTER PAGE and AFTER ... LINES phrases would have no effect
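The points in the list above can be drawn together in a reduced sketch that writes a three-line report to 'report.txt'. PRINT-FILE, REPORT-OUT, PRINT-HEADERS, P-TITLE, P-PAGE-NO, COL-HEAD-1, PRINT-LINE and PAGE-NO are names taken from the text; the field sizes, the heading text, the sample values and the remaining names are assumptions. This is not the full report layout the list refers to, and AFTER PAGE is left out because the output goes to a file.

```cobol
       IDENTIFICATION DIVISION.
       PROGRAM-ID. REPORT-DEMO.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT PRINT-FILE ASSIGN TO 'report.txt'
               ORGANIZATION IS LINE SEQUENTIAL.
       DATA DIVISION.
       FILE SECTION.
       FD  PRINT-FILE.
       01  REPORT-OUT            PIC X(80).
       WORKING-STORAGE SECTION.
       01  PAGE-NO               PIC 9(3) VALUE 1.
       01  PART-NO               PIC 9(6) VALUE 123.
       01  QUANT                 PIC 9(5) VALUE 42.
       01  PRINT-HEADERS.
           03  P-TITLE.
      * Fillers carry the fixed text; the PIC size is larger
      * than the text so the rest is space-filled.
               05  FILLER        PIC X(20) VALUE 'PARTS REPORT'.
               05  FILLER        PIC X(5)  VALUE 'PAGE '.
               05  P-PAGE-NO     PIC ZZ9.
           03  COL-HEAD-1.
               05  FILLER        PIC X(10) VALUE 'PART'.
               05  FILLER        PIC X(10) VALUE 'QUANT'.
       01  PRINT-LINE.
           03  P-PART-NO         PIC 9(6).
      * A filler of spaces separates the two columns.
           03  FILLER            PIC X(4)  VALUE SPACES.
      * ZZZZ9 suppresses leading zeros when data is moved in.
           03  P-QUANT           PIC ZZZZ9.
       PROCEDURE DIVISION.
       MAIN-PARA.
           OPEN OUTPUT PRINT-FILE
           MOVE PAGE-NO TO P-PAGE-NO
           WRITE REPORT-OUT FROM P-TITLE
           WRITE REPORT-OUT FROM COL-HEAD-1
           MOVE PART-NO TO P-PART-NO
           MOVE QUANT   TO P-QUANT
           WRITE REPORT-OUT FROM PRINT-LINE
           CLOSE PRINT-FILE
           STOP RUN.
```

Each WRITE ... FROM copies one print group into the 80-character REPORT-OUT record before writing it, which is the same MOVE-then-WRITE pairing explained above.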
The preponderance of evidence clearly supports naming COBOL a best practice. It is widely used, and it is taught at many universities. It contains most of the features and constructs required of a modern programming language and it continues to progress into the object-oriented world. Finally, COBOL has features required of a language used for business applications that other languages lack. Unless or until a better language is invented, perhaps a non-hybrid object-oriented business language, some version of COBOL will certainly be used for a long time. COBOL is a software engineering best practice.
Thanks for asking Alessandro!
First of all, as a non-COBOL programmer, I can tell you this from my experience with other programming languages: where there is a will, there is a way. COBOL can be a bit tricky at the start but, like any other programming language, the learning curve drops drastically after you learn some of the basic stuff. And then, regarding the mainframe question...
I'm certain that there may be software out there that gives you a mock mainframe environment on a PC, similar to how a lot of folks learn Linux and Unix on a PC.
Recommended books: Murach's Mainframe COBOL