Structured programming in Basic; part 4: ANSI Basic, Macintosh Basic, and True Basic. Arthur Luehrmann.
Structured Programming In Basic
Part 4: ANSI Basic, Macintosh Basic, and True Basic
The first three articles in this series (May, June, and July, 1984) introduced the main ideas of structured programming: (1) the top-down method of planning a program and (2) the use of three types of formal control blocks to handle all problems of program logic. These powerful ideas were developed there for users of the dialects of Basic currently available on nearly all personal computers. The final two articles in the series show how these same structured programming concepts can be expressed even more simply in the new generation of Basics just beginning to appear on personal computers.
Basic: A Blessing and a Curse
It is amazing how often in popular computer magazines one reads statements claiming that Basic first appeared in the early 70's, that its design was limited by the small memories available in the first microcomputers, and that it is inherently an interpreted, rather than compiled, language. None of these claims is true.
Basic celebrated its 20th birthday on May 1 of this year. John Kemeny and Thomas Kurtz, then and now professors of mathematics at Dartmouth College, aided by a small group of undergraduates, planned and implemented Basic in the early 60's, when the microcomputer was undreamed of and the minicomputer was still years in the future. Basic was designed to run on the only thing around: the mainframe computer. The problem back then was not so much limited memory as limited time. The Dartmouth team had also created the first educational time-sharing system.
Time-sharing works well only if there is time to share--that is, if each user needs only a small amount of time to run a program. Basic, therefore, had to be fast. Since compiled programs run about ten times faster than interpreted ones, Dartmouth Basic was designed from the beginning as a language that would be easy to compile.
So much for ancient history. The remarkable thing about Basic is that, despite its time-sharing mainframe roots, the language has been among the easiest to implement on general purpose minicomputers, laboratory computers, and now microcomputers. The reason for this is both the blessing and the curse of Basic.
To make Dartmouth Basic easy to compile, Kemeny, Kurtz, et al. were forced to make each Basic statement be very similar to one or two of the built-in instructions that the processor understood. The machine language of every processor contains an unconditional jump instruction; so Basic had a GOTO statement with the same effect. Every machine language has conditional jumps; so Basic had an IF statement. Every machine language has a subroutine jump and a subroutine return instruction; so Basic had GOSUB and RETURN statements. And so forth.
It is this strong similarity to machine language that has been the blessing and the curse of the Dartmouth Basic of 20 years ago. On the blessing side of the ledger, this low level nature of Basic has made it very easy for hundreds of programmers to write compilers and interpreters for the language. The curse is more subtle; at first it looks like another blessing. Since each Basic statement is extraordinarily simple and easy to understand without reference to any surrounding statements, one might think that programs made up of these simple statements should also be easy to understand. Sad experience has shown that this is not the case.
Just as one can understand all the parts in a wind-up clock and still not understand how the clock works, or all the words in a paragraph and still not understand the meaning of the paragraph, so it is with computer programs. One needs to understand the parts, but one also needs to see how the parts work together to make up the whole. One needs to see how the parts are organized. One needs to see large scale structures without being bogged down by a mass of detail.
These discoveries about programs began to appear in the mid-60's, mainly as a result of growing experience with another low level language, Fortran. As computer memories got bigger, so did the Fortran programs being stuffed into them by a growing army of professional programmers. And as the programs got longer, they took disproportionately more time to write and debug. Far worse than that, long programs were vastly more expensive to maintain than shorter ones.
Out of this experience came a great deal of deep thinking about the things that make a program complex. In the late 60's, the ideas we now know as structured programming began to appear in journal articles. Soon after that, new programming languages such as Pascal incorporated these structuring ideas in the form of specific tools for handling complexity and taming it.
The Evolution of Basic
To look at Basic as implemented on nearly every personal computer today, one could easily conclude that the structured programming wave had simply washed over Basic and left it far out at sea. As a practical matter, that is indeed the case. The Basic that comes built into all the millions of Apple, IBM, and Radio Shack computers is, apart from numerous special features, little different from the Dartmouth Basic of 20 years ago. In fact, most of the programs I wrote at Dartmouth in the 60's would run unmodified on these current Basics. Like the Dartmouth original, these Basics force programmers to solve their problems by means of unstructured jumps of control: the GOTO, IF, GOSUB, and RETURN statements.
Nevertheless, it is wrong to say that the Basic language has been left behind by structured programming ideas. Nearly 15 years ago, Dartmouth Basic added a CALL statement and other statements allowing a Basic programmer to create subprograms (procedures) with names, two-way parameter passing, and local variables. Shortly after that, it became possible to create a separate file of subprograms, compile them, and establish a link between the file and a user's program. About eight years ago, a new version of Dartmouth Basic offered a complete set of formal control blocks for loops and branches, thus eliminating the need for the unstructured GOTO and IF statements and for line numbers.
Unfortunately, most of this creative work at Dartmouth has somehow been kept a deep, dark secret from the rest of the world. None of these structured programming tools has found its way into the Basics to which most people have access today. In the early 70's when Bill Gates wrote the first Basic interpreter for a microprocessor, his model, alas, was the 1964 version of Dartmouth Basic. Neither he nor his company nor his successors have seen fit to update that model. As a result, hundreds of thousands of personal computer programmers have been forced to learn and use a language that by all rights should be considered an interesting fossil.
ANSI Basic
The sorry state of personal computer Basics is about to change. For nearly a decade, Committee X3J2 of the American National Standards Institute (ANSI) has been at work on developing standards for the Basic language. X3J2 is made up of about 30 volunteers, who come from universities, schools, and, for the most part, the computer industry. The Committee does not set any standard; rather it proposes a standard and distributes it for public comment. A parent committee, X3, makes the final decision.
X3J2's first few years were spent (in hindsight, some might say "wasted") on standardizing what amounts to the original 1964 Dartmouth Basic. This was necessary, however, since so many implementors kept getting simple things wrong. For example, the FOR loop is incorrectly handled in nearly all microcomputer Basics prior to the IBM PC version. Consider the following program:
10 PRINT "HOW MANY STARS DO YOU WANT";
20 INPUT N
30 FOR J = 1 TO N
40 PRINT "* ";
50 NEXT J
When Kemeny and Kurtz designed Basic, they thought hard about programs like this. A user, they believed, would expect to see N stars, for any value of N including 0. Therefore, if N is zero, the body of the loop should not be performed at all. In other words, the loop exit test should be made at the beginning of the loop, not the end. That is how the FOR loop has worked in Dartmouth Basic for 20 years. Not so with Gates's first interpreter and most of the Basics that have descended from it.
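To make the entry test concrete, here is a sketch of the same star-printing program with the FOR loop spelled out by hand in Minimal Basic. It assumes a step of 1, as in the program above; the line numbers and the REM text are illustrative only and are not taken from the standard.

```basic
10 PRINT "HOW MANY STARS DO YOU WANT";
20 INPUT N
30 LET J = 1
40 REM EXIT TEST COMES FIRST, SO N = 0 PRINTS NO STARS
50 IF J > N THEN 90
60 PRINT "* ";
70 LET J = J + 1
80 GOTO 50
90 END
```

A Basic that makes its only test at the NEXT statement would print one star here even when N is 0.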
So the first several years of Committee X3J2's life were spent defining a standard called ANSI Minimal Basic, which tidies up such matters as the FOR loop. (A copy of the Minimal Basic standard, X3.60-1978, is available from ANSI, 1430 Broadway, NY, NY 10018.)
The members of X3J2 recognized that Minimal Basic was more a toy than an actual language. To make it useful, implementors would be forced to enhance, almost certainly in a nonstandard way. Therefore, the Committee turned its attention next to the definition of another language to be called ANSI Basic. It would, for reasons of compatibility, be a superset of Minimal Basic; but it would also contain programming elements thought to be essential for writing serious applications.
After many years of quarterly week-long meetings, a draft proposal for ANSI Basic has been agreed to and sent to the public for comment. If things go about as expected, a formal standard should take effect in about a year. The draft standard is strongly influenced by the structured version of Basic that has been developed at Dartmouth over the past decade and user-tested by tens of thousands of people. It contains all the elements needed for modular, top-down design and for specifying control structures.
Already, the draft standard is having an effect that personal computer users can see. Several member organizations of X3J2 have created new microcomputer versions of Basic that are strongly influenced by the draft standard. At the time of this writing, Apple Computer is in the final throes of implementing Macintosh Basic, which is modeled closely upon the ANSI draft.
Another company, True Basic, Inc., founded by Kemeny, Kurtz, and a small team of Dartmouth programmers, is also at work on a product called True Basic, which will run on both the IBM PC and the Apple Macintosh. True Basic, which will be distributed by Addison-Wesley Publishing Company, conforms extremely closely to the ANSI draft. By the time this article appears, Macintosh Basic and True Basic should be available for purchase. Other implementations of ANSI and ANSI-like Basics are probably under development by others.
Top-Down, Modular Design
The rest of this article and the one next month present the elements of ANSI Basic that make it easy to use when following the guidelines of structured programming. The best way to understand these new elements is through an example. The third article in this series (July, 1984) showed how to apply the methods of structured programming to write a game-playing program.
The computer was to get a secret word from a third party and then to ask the main player to guess the word. If the guess was wrong, the computer was to prompt the player as to whether the secret word was earlier or later in the dictionary than the guess, and then to ask for another guess. The program below was the result. (Later in the July article the program was improved a bit, but this version will serve our present purposes.)
100 'PROGRAM GUESSING GAME
110 GOSUB 200 'SECRET WORD
120 GOSUB 400 'GUESS WORDS
130 GOSUB 600 'WRAP UP
140 END
190 '
200 'SUB SECRET WORD
210 CLS
220 PRINT "WHAT'S THE SECRET WORD";
230 INPUT S$
380 RETURN
390 '
400 'SUB GUESS WORDS
410 'LOOP
420 PRINT "WHAT'S YOUR GUESS";
430 INPUT G$
440 IF G$ = S$ THEN 500
450 GOSUB 800 'HINT
490 GOTO 410
500 'END LOOP
580 RETURN
590 '
600 'SUB WRAP UP
610 PRINT "YOU GOT IT!!!"
620 PRINT "THE WORD WAS"; S$
780 RETURN
790 '
800 'SUB HINT
810 IF S$ < G$ THEN 850
820 'FALSE
830 PRINT "LATER THAN"; G$
840 GOTO 870
850 'TRUE
860 PRINT "EARLIER THAN"; G$
870 'END IF
980 RETURN
With minor exceptions (for example, the abbreviation of REM by an apostrophe), this program conforms to the ANSI Minimal Basic Standard. You can enter it and run it on almost any computer. Although using Minimal Basic, the program is written in a highly structured form, as described in detail in the earlier articles.
First, it adheres to the principle of top-down design: There is a main routine and a set of subroutines. Second, all problems of control are handled by formal loop and branch blocks. A loop block appears in lines 410-500, and a branch block appears in lines 810-870. Both blocks are built up from Minimal Basic REM, GOTO, and IF statements.
Without further discussion, let's see how this program might be written in ANSI Basic.
Program GuessingGame
  Call SecretWord (s$)
  Call GuessWords (s$)
  Call WrapUp (s$)
End

External sub SecretWord (secret$)
  Clear
  Print "What's the secret word";
  Input secret$
End sub

External sub GuessWords (secret$)
  Do
    Print "What's your guess";
    Input guess$
    If guess$ = secret$ then exit do
    Call Hint (secret$, guess$)
  Loop
End sub

External sub WrapUp (secret$)
  Print "You got it!!!"
  Print "The word was"; secret$
End sub

External sub Hint (secret$, guess$)
  If secret$ < guess$ then
    Print "Earlier than"; guess$
  Else
    Print "Later than"; guess$
  End if
End sub
The first thing to notice is that the overall form of these two programs is essentially the same. This is true because both versions follow the guidelines of structured programming. In both, we find a main part and four subparts. Furthermore, the bodies of all the parts consist only of action blocks, loop blocks, and branch blocks. There are a few new Basic keywords in the second version, but the general shape and structure is the same as in the first version.
Some differences are also striking. The first version has line numbers, while the second has none. Actually, the ANSI standard requires line numbers, but only for compatibility with Minimal Basic. The new control structures in ANSI Basic eliminate the need for line numbers. Every implementation of ANSI Basic is expected to make line numbers optional to the programmer. This is true of Macintosh Basic and True Basic.
Another obvious difference is the use of lowercase letters and long variable names in the second version. Uppercase and lowercase letters are treated as equivalent when used anywhere in ANSI Basic except as string constants. It is up to the programmer to develop a consistent style of capitalization. Long variable names allow the programmer to use meaningful names for the data to be processed. There is a slight penalty for allowing long names: They must be delimited from Basic keywords by a space. In older Basics, the two statements
Let a = 5
Leta = 5
have exactly the same effect. In ANSI Basic, the first statement would assign 5 to the variable a, while the second statement would assign 5 to the variable Leta and assume that the Let keyword had been left out.
From the point of view of top-down design and program modularity, the important features to note in the ANSI Basic version are the five parts separated from one another by blank lines. (The final article in this series will deal with the content of the parts. For now, the focus is on the relation among the parts.) These five parts are examples of program units. The first part is the main unit. It begins with the Program statement and ends with the End statement. Every ANSI Basic program must have such a unit. The four parts that follow it are examples of external units. Each one begins with an External-sub statement containing the name of the unit and each ends with an End-sub statement.
In ANSI Basic, each program unit is a separate world. Variable names introduced in one program unit have local significance only and are unknown to other program units. If by chance the same name is used for two different variables in different program units, no problems will arise. Likewise, each program unit has its own separate sequence of Data statements; a Read statement in one unit will refer to Data values in that unit alone.
This situation is very different from the one in Minimal Basic. There, all variable names are known globally, throughout the entire program. Likewise, all the Data statements in a program define a single sequence of items that may be read by any Read statement in the program. In effect, an entire Minimal Basic program is like the main unit in an ANSI Basic program.
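A small sketch may make the contrast concrete. It is written in the draft-standard style used for the listings in this article; the unit names LocalNames and ShowCount and the variable count are invented for illustration, and a particular implementation may differ in detail.

```basic
Program LocalNames
  Let count = 3
  Call ShowCount
  Rem The main unit's count was never touched by ShowCount.
  Print "Main unit has count ="; count
End

External sub ShowCount
  Rem This count belongs to ShowCount alone.
  Let count = 99
  Print "ShowCount has count ="; count
End sub
```

If the rules are as described above, the main unit should still report 3 after the call, since the assignment in ShowCount touches only that unit's own count. In Minimal Basic, where a subroutine shares every variable with the rest of the program, the same assignment would have changed the one and only count.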
Parameter Passing
If all variables are local, how then do the various program units communicate with one another in an ANSI Basic program? The answer is that they must pass data back and forth in the form of parameters. In the example, the main unit contains the statement
Call SecretWord (s$)
and the corresponding external unit looks like this:
External sub SecretWord (secret$)
  Clear
  Print "What's the secret word";
  Input secret$
End sub
The Call statement in ANSI Basic serves the same function as the GOSUB statement in Minimal Basic. Both statements transfer control to another part of the program; when that part is finished, control normally returns to the statement after the Call or GOSUB. In addition, the above Call statement identifies the variable s$ as a parameter, through which data may be sent to a program unit or, as in the present case, received from it.
The job of SecretWord is to get someone to enter a secret word into the computer. The Input statement accepts the word and assigns it to secret$. Notice that the name secret$ also appears in parentheses right after the name of this program unit. This indicates that secret$ is the name of a parameter through which data (in this case, the secret word) may be passed back to the caller. Several such parameters may be specified this way, with the comma used as a separator.
Notice that the calling unit does not have to know what name the called unit used for the data to be sent back. The caller invents its own name--s$ in the present case. Once the Call statement is performed, the value of s$ in the main unit is identical to the value secret$ had in the external unit. They are simply two names for the same thing.
The second Call statement in the main unit looks like this:
Call GuessWords (s$)
This statement tells the computer to perform the statements in the body of GuessWords. But this time, the value of s$ is being sent into the external unit, not retrieved from it. Parameter passing is a two-way street in ANSI Basic. Such parameters are known as reference parameters. (Pascal programmers know them as VAR parameters.)
The remaining external units in the ANSI Basic example work much the same way as GuessWords. In each case, one or two parameters are passed into the unit by means of a Call statement. The names of the parameters in all four external units happen to be the same: secret$ and guess$. As stated above, this is not necessary. Each program unit can have its own "private name" for the parameters to be sent back and forth. This is especially important when programs are developed by a team of writers. The members of the team must agree only on what the program units must do and what parameters must be passed, but not on parameter names nor on the names of local variables.
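The two-way nature of reference parameters shows up most clearly in a subprogram that both receives and returns data through the same parameters. The following sketch, in the draft-standard style of the listings above, exchanges the values of two string variables; the names SwapDemo, Swap, first$, second$, a$, b$, and temp$ are invented for illustration.

```basic
Program SwapDemo
  Let first$ = "pear"
  Let second$ = "apple"
  Call Swap (first$, second$)
  Print first$; " comes before "; second$
End

External sub Swap (a$, b$)
  Rem a$ and b$ are the caller's first$ and second$ under private names.
  Rem temp$ is local to Swap and unknown to the main unit.
  Let temp$ = a$
  Let a$ = b$
  Let b$ = temp$
End sub
```

Because a$ and b$ are simply other names for the caller's variables, the main unit should find "apple" in first$ and "pear" in second$ after the call.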
Three Kinds of Procedure
As stated earlier, an ANSI Basic program is a collection of fairly independent program units. There must be a main unit. In addition, there may be several external units. Units communicate with one another by passing parameters back and forth. Within a unit, all variable names are local to that unit.
So far, we have seen only one kind of external unit. Recall the keyword sub in the first line of each of the four external units in the program example. This word indicates that the unit is a Basic subprogram. A subprogram is one of three different kinds of procedures available in ANSI Basic. The other two are functions and pictures. Like subprograms, functions and pictures have names and may use parameters to communicate. Here is an example of a simple external function unit:
External function Randomint (first, last)
  Let n = last - first + 1
  Let Randomint = first + int (n * rnd)
End function
There are two differences between the appearance of functions and that of subprograms. First, the keyword function replaces the keyword sub in the first and last lines. Second, somewhere in the body of a function, the name of the function must appear on the left side of a Let statement; this is how the function returns a value to the program unit that calls it. In fact, the parameters in functions are unavailable for two-way communication. Function parameters may only send data into the function, not return data to the caller. Such one-way parameters are called value parameters.
The value returned by the above function is a random integer in the range between the values of first and last, inclusive. As with Basic built-in functions, such as int and rnd, a user-defined function is called simply by using its name in an expression. After the function unit is performed, the single value it returns replaces the name of the function in the expression from which the function was called.
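As a sketch of such a call, here is a main unit that uses Randomint twice in one expression to simulate a roll of two dice. The names RollDice and roll are invented for illustration. The Declare statement is an assumption: the article does not say how a calling unit announces an external function, and implementations may word this differently or not require it at all.

```basic
Program RollDice
  Rem Assumption: the caller names the external function before using it.
  Declare external function Randomint
  Randomize
  Rem Each call returns one value, which takes its place in the sum.
  Let roll = Randomint (1, 6) + Randomint (1, 6)
  Print "You rolled"; roll
End

External function Randomint (first, last)
  Let n = last - first + 1
  Let Randomint = first + int (n * rnd)
End function
```

Since each call yields a whole number from 1 to 6, roll should always fall between 2 and 12.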
The third kind of procedure in ANSI Basic is the picture, which is used to define some graphic object, such as a circle. Graphics is beyond the scope of this series of articles. Suffice it to say that a picture is just like a subprogram except for the way it is called. The caller uses a Draw statement instead of a Call statement. Furthermore, the Draw statement allows the caller to specify a location, size, and orientation angle for the picture. Thus a picture which is defined as a circle can be called as an ellipse, rotated 45 degrees, and placed at a new origin--all with a single Draw statement.
Incidentally, all three ANSI Basic procedures may be called recursively. That is, a statement in the body of a given procedure may call that procedure itself. This is often a useful way to conceptualize certain otherwise complex programming problems.
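A minimal sketch of a recursive subprogram, again in the draft-standard style of this article, is a star-printing unit that prints one star and then calls itself to print the rest. The names StarRow, Stars, and count are invented for illustration.

```basic
Program StarRow
  Print "How many stars do you want";
  Input n
  Call Stars (n)
  Print
End

External sub Stars (count)
  Rem Print one star, then let a fresh call handle the remaining count - 1.
  Rem When count reaches 0 nothing is printed and the chain of calls unwinds.
  If count > 0 then
    Print "* ";
    Call Stars (count - 1)
  End if
End sub
```

Each call deals with a smaller count than the one before, so the calls must eventually stop. An entry of 0 prints no stars at all.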
Internal Procedures
External procedures--subprograms, functions, and pictures--give the programmer powerful tools for dividing a large programming task into a number of distinctly separate pieces that communicate with one another by passing specific pieces of data back and forth. Since these units are largely independent of one another, they can be developed and tested without fear of accidental interactions.
It often happens that a given program unit becomes larger and larger during the process of development. At some point, the unit may become so large that it is hard to read, understand, or change. The solution is to divide it into smaller units. However, one is then faced with a different kind of confusion: too many external units to keep track of. A better approach is to divide a lengthy program unit into a small number of subunits, all within the same unit. In ANSI Basic, this is done by means of internal procedures.
As an example, here is how program GuessingGame would look if it were written as a single main unit containing its own internal procedures.
Program GuessingGame
  Call SecretWord
  Call GuessWords
  Call WrapUp

  Sub SecretWord
    Clear
    Print "What's the secret word";
    Input secret$
  End sub

  Sub GuessWords
    Do
      Print "What's your guess";
      Input guess$
      If guess$ = secret$ then exit do
      Call Hint
    Loop
  End sub

  Sub WrapUp
    Print "You got it!!!"
    Print "The word was"; secret$
  End sub

  Sub Hint
    If secret$ < guess$ then
      Print "Earlier than"; guess$
    Else
      Print "Later than"; guess$
    End if
  End sub
End
Note first that the rest of the procedures now appears before the End statement, which is now the last statement. That is, the procedures are now inside the main unit, whereas before they were outside. Note also that the keyword External no longer appears in the first line of each procedure. Finally, note that parameter passing is no longer needed here.
The main difference between internal and external procedures has to do with the scope of variable names. Internal procedures in a given program unit all share the same set of variable names. In the example here, secret$ first appears in procedure SecretWord. Later references to secret$ in the other procedures all refer to the same data value. That is why parameter passing was unnecessary in this case. (It is legal to pass parameters to internal procedures, however, and often very useful.)
The scope rule in ANSI Basic is very simple: a given variable name is known (1) everywhere within a program unit, and (2) nowhere outside that unit unless the variable is passed as a parameter to another unit. To put it another way, variable names are global within a program unit and local to that unit. The same is true of items in Data statements. It is also true of file channels, though that subject is beyond the scope of these articles.
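Here is a sketch of an internal subprogram that uses both means of communication at once: a shared variable and a parameter. The names Tally, Add, total, and amount are invented for illustration.

```basic
Program Tally
  Let total = 0
  Call Add (5)
  Call Add (7)
  Print "Total is"; total

  Sub Add (amount)
    Rem total is shared with the enclosing unit.
    Rem amount arrives as a parameter.
    Let total = total + amount
  End sub
End
```

Because Add is internal, it updates the main unit's own total, and the program should report 12. Had Add been an external unit, its total would have been a separate local variable, and the running sum would have had to be passed as a second parameter.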
The preceding program example showed only one particular kind of internal procedure: the internal subprogram. There can also be internal functions and internal pictures. The same scope rules apply to them.
Coming Next Month
This article has introduced the elements of ANSI Basic that are useful for dividing a complex problem into simpler, relatively independent parts. External procedures, which have local variables and communicate via parameters, are the main modules out of which a long program is built. Internal procedures, which share variables with one another and with the program unit in which they are contained, provide subunits within the main modules.
Now we leave the topic of top-down design and program modularity. In the final article in this series we shall turn to the other main component of structured programming: the use of formal control structures to handle all problems of program logic. The June and July articles in this series showed how to build these loop and branch structures out of REM, IF, and GOTO statements. In ANSI Basic, these structures come ready-made. It is that fact which allows the programmer to avoid all those wild jumps that can easily turn a simple program into a tangle of spaghetti code.