Brian Buckley's CS 441 page

Assignment One: Resource Pages

How many programming languages can there possibly be? Far more than one would expect. Developed for business, academia or profit, thousands of programming languages are in use today. The site http://sk.nvg.org/lang/lang.html lists many of them, gives a brief explanation of each, and links to further reading for those who want to learn more. A similar but indexed list can be found at http://www.wikipedia.org/wiki/Alphabetical_list_of_programming_languages. That list is more extensive, but it does not provide a description for each language. The knowledgeable reader, however, may submit a description or edit an existing one.

Why do people need all of these languages? Aren't they all basically the same? The answer to this obviously loaded question is no. People need languages to perform myriad tasks, and the languages, in turn, are diverse and plentiful. The site http://www.cs.waikato.ac.nz/~marku/languages.html#ool contains a brief description of object-oriented, functional and logic languages. Each has a different use and a different implementation.

No matter what the language, there must be some definitive way to recognize what is data and what is not. Therefore, programming languages develop data types. These data types are created, modified and/or developed through code. The code must take a certain form and have very distinct syntax. The following site, http://scom.hud.ac.uk/scomtlm/book/node252.html, provides a brief but helpful explanation of one data type, the queue, as it applies to one language, Miranda. For a more general discussion of the semantics and form of programming languages, especially the mathematical side, try http://www.csse.monash.edu.au/~lloyd/tilde/Semantics/.

While the study of programming languages and the intricacies therein is fascinating, there is a terrible downside. There seems to be a definite relationship between the time one spends studying programming languages and the level of introverted, antisocial weirdness, or as I call it, the nerd level. The following link provides names and links to several programming language and computer science experts: http://users.erols.com/ziring/dopl.html. These folks have obvious brilliance, but rank equally high on the nerd chart. When the nerdiness becomes uncontrollable, their wacko sense of creativity leaks its way onto the World Wide Web. See the surreal Simpson/Dali illustrations at the K-State site: http://www.cis.ksu.edu/~schmidt/group.survey.html. Be careful not to let the madness melt you.



Assignment Two: Discuss Readability, Writability, Reliability

*All page number references are from the Sebesta text

1. What common programming language statement, in your opinion, is most beneficial to readability and why do you think that?

If one statement defines readability, then it must be the “if…then” statement. The “if…then” statement resembles spoken language in both syntax and semantics. Its meaning is simple and straightforward: if something is true do this, otherwise do something else. Programmers do not overload the “if…then” statement, which adds to its simplicity and universality. Not only is it simple, the “if…then” statement best represents orthogonality. After the “if” is a condition that is either true or false. If it is true, one statement happens. There is no maybe. There is no exception to the rule. This is about as straightforward as programming gets.

The “if…then” is simple, orthogonal and intuitive. It is the first programming statement many aspiring programmers learn. The “if…then” statement occurs, in one form or another, in almost all programming languages. Because of its general appeal, similarity to spoken grammar and simple, orthogonal design, the “if…then” is the most readable programming statement.
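As a small illustration of how closely the statement reads like a spoken sentence, here is a minimal Java sketch. The class name, method name and the 30-degree threshold are all made up for this example and come from neither the assignment nor the text.

```java
// Hypothetical example: IfThenDemo and describe are illustrative names only.
final class IfThenDemo {
    // Labels a temperature with a plain if/else: one branch or the other runs.
    static String describe(int degreesCelsius) {
        if (degreesCelsius > 30) {
            return "hot";      // runs only when the condition is true
        } else {
            return "not hot";  // runs in every other case
        }
    }

    public static void main(String[] args) {
        System.out.println(describe(35));
        System.out.println(describe(10));
    }
}
```

Read aloud, the method is nearly the English sentence "if the temperature is above 30, say hot, otherwise say not hot."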

 

2. How does distinguishing between upper- and lowercase identifiers affect the three criteria?

Readability: Distinguishing between lower- and uppercase diminishes readability. The number of allowable letters doubles, thereby increasing the number of basic components. “A language that has a large number of basic components is more difficult to learn.” (9) Not only are case-distinguishing languages less simple, they are less orthogonal. Orthogonality involves a high degree of regularity in the design with few exceptions to the rules. A programmer using a case-sensitive language might spell a word right and use it appropriately in the syntax, but still create problems because of a misused capital letter. This creates both irregularity in the syntax and exceptions to the rule (the rule being if you spell it right, it works). Take the word “and.” There is only one way to spell the word in a non-case-sensitive language. However, in a language that distinguishes between capitals and lowercase, there are eight, because each of the three letters can take either case (and, And, aNd, anD, ANd, AnD, aND, AND).
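The count of eight is 2 to the power of 3, one factor of two per letter. A rough Java sketch of that arithmetic follows. The class and method names are invented for illustration, and the code assumes the word contains only letters.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: CaseVariants is an illustrative name only.
final class CaseVariants {
    // Builds every upper/lowercase spelling of a word made of letters.
    static List<String> of(String word) {
        List<String> variants = new ArrayList<>();
        String lower = word.toLowerCase();
        int combinations = 1 << lower.length(); // 2^n spellings for n letters
        for (int mask = 0; mask < combinations; mask++) {
            StringBuilder spelling = new StringBuilder();
            for (int i = 0; i < lower.length(); i++) {
                char c = lower.charAt(i);
                // Bit i of the mask decides whether letter i is capitalized.
                spelling.append((mask & (1 << i)) != 0 ? Character.toUpperCase(c) : c);
            }
            variants.add(spelling.toString());
        }
        return variants;
    }

    public static void main(String[] args) {
        List<String> spellings = of("and");
        System.out.println(spellings.size() + " spellings: " + spellings);
    }
}
```

A four-letter identifier would already have sixteen spellings, which is why the problem grows quickly with identifier length.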

Writability: Distinguishing between upper- and lowercase identifiers both hinders and helps writability. Because they are less readable, less simple and less orthogonal, case-sensitive languages hurt writability. (15-16) However, expressivity is greatly enhanced, as is support for abstraction. In C/C++, an identifier in all caps is usually a global variable or macro. Structs and classes often start with a capital letter. These are not rules, but are more akin to common law among programmers. Having these understood rules makes writing easier. Identifiers gain a measure of abstraction because explicit tags, as in “glbl_Myglobal” or “tag_struct_Mystruct,” do not need to be included. Case-sensitive syntax also allows the programmer more expressivity. One can declare an identifier with

            Myclass myclass

Here myclass is an instance of the class Myclass. This kind of expressivity would not be possible without case-sensitive syntax.
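The conventions described above can be sketched in Java, which is also case sensitive. Everything beyond the names Myclass and myclass is an assumption added for illustration: MAX_USES, uses, use() and NamingDemo do not appear in the assignment.

```java
// Hypothetical sketch: only the names Myclass/myclass come from the text above.
final class NamingDemo {
    static int countAllowedUses() {
        Myclass myclass = new Myclass(); // instance: same word as the type, lowercase
        int allowed = 0;
        for (int i = 0; i < 5; i++) {
            if (myclass.use()) {
                allowed++;
            }
        }
        return allowed;
    }

    public static void main(String[] args) {
        System.out.println(countAllowedUses());
    }
}

final class Myclass {                  // type name: leading capital
    static final int MAX_USES = 3;     // constant: all caps, no "glbl_" tag needed
    int uses = 0;                      // ordinary field: lowercase

    // Allows a use until the limit is reached.
    boolean use() {
        if (uses >= MAX_USES) {
            return false;
        }
        uses++;
        return true;
    }
}
```

The case of each identifier tells the reader whether it is a type, a constant or a variable without any prefix.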

Reliability: Case-sensitive languages are less reliable than those that do not differentiate between cases. A programmer can easily type a lowercase letter instead of a capital, and the compiler will not catch the discrepancy, especially if both spellings have been declared. Just as a language that does not type check floats and integers can cause problems, so can a language that does not check capitalization. For example, if I declared

            Myclass myclass

and used Myclass where I meant to use myclass, some compilers would not recognize the problem. This produces less reliable code. Also, because case-sensitive languages are less readable, they are in turn less reliable.
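A variation on the same slip, sketched in Java with two variables instead of a class and its instance. The names total, Total and CaseSlipDemo are hypothetical. The point is only that both spellings are legal, so the compiler accepts the mistake.

```java
// Hypothetical sketch of a capitalization slip that compiles cleanly.
final class CaseSlipDemo {
    // Meant to sum the prices, but the loop updates the wrong identifier.
    static int buggySum(int[] prices) {
        int total = 0;
        int Total = 0;       // differs from 'total' only by case
        for (int price : prices) {
            Total += price;  // slip: the programmer meant 'total'
        }
        return total;        // still 0, and no compiler error was raised
    }

    // The same method with the intended identifier used throughout.
    static int fixedSum(int[] prices) {
        int total = 0;
        for (int price : prices) {
            total += price;
        }
        return total;
    }

    public static void main(String[] args) {
        int[] prices = {1, 2, 3};
        System.out.println(buggySum(prices));
        System.out.println(fixedSum(prices));
    }
}
```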

 

3. How do type declaration statements for simple variables affect the readability of a language?

The more closely a type resembles spoken language, the more readable it makes a programming language. “For example, suppose a numeric type is used for an indicator flag because there is no Boolean …timeout = 1.” (13) That syntax does not make a lot of sense: timeout really means true, not one. In a language that supports Boolean types, the statement can be written “timeout = true.” (13) Type declaration statements are more readable when they mirror written language. They are less readable when they substitute one type where another would be better, as when an int stands in for a bool.
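The contrast between the two flags can be sketched in Java. The variable name timeout comes from the textbook quotation. The class, the method names and the millisecond parameters are assumptions made for the example.

```java
// Hypothetical sketch: only the flag name 'timeout' comes from the quoted text.
final class TimeoutFlagDemo {
    // Numeric flag: the reader has to know that 1 stands for "timed out".
    static int numericFlag(long elapsedMs, long limitMs) {
        int timeout = 0;
        if (elapsedMs > limitMs) {
            timeout = 1;
        }
        return timeout;
    }

    // Boolean flag: the declaration itself says the value is true or false.
    static boolean booleanFlag(long elapsedMs, long limitMs) {
        boolean timeout = elapsedMs > limitMs;
        return timeout;
    }

    public static void main(String[] args) {
        System.out.println(numericFlag(1500, 1000));
        System.out.println(booleanFlag(1500, 1000));
    }
}
```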

 

4. Write an evaluation of Java and C, using the criteria described in this chapter. Please be thorough and provide a reason/justification for your evaluation.

Java is more readable, easier to write and far more reliable than C. Java’s use of object-oriented design brings the language more closely in tune with the adjective-noun syntax of spoken language. Java and C have similar data types and structures. However, C has “two kinds of structure data types, arrays and records … records can be returned from functions but arrays cannot.” (11) This is a lapse in orthogonality, and it diminishes readability. Java and C both enjoy control statements such as the “for” and “while” loops, which increase readability. Java and C are similar in how they read, but Java is a bit more orthogonal and its object orientation makes it far more readable.

Java is a little easier to write than C. Both share common writability features. For example, both allow abstraction through functions. One need only write a function once and then call it multiple times. Both allow expressivity by conveniently allowing multiple ways to do things. “The notation count ++ is more convenient and shorter than count = count +1.” (17) Sometimes “for” loops are more convenient than “while” loops even though both can be written to do the same thing. Java does not have the confusing pointer syntax that C has. Also in C, one must allocate and deallocate memory by hand with the malloc() family of functions and free(), whose names and usage are far from intuitive. C++ remedied this with the “new” and “delete” keywords, but Java is easier to write than either C or C++ due to garbage collection.
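The two points about expressivity, count++ against count = count + 1 and “for” against “while,” can be seen side by side in a short Java sketch. The class and method names are invented for the example. The same loops would look almost identical in C.

```java
// Hypothetical sketch: two equivalent ways to add the numbers 1 through n.
final class LoopFormsDemo {
    // Compact form: a for loop with the count++ shorthand.
    static int sumWithFor(int n) {
        int sum = 0;
        for (int count = 1; count <= n; count++) {
            sum += count;
        }
        return sum;
    }

    // Longer form: a while loop with count = count + 1 written out.
    static int sumWithWhile(int n) {
        int sum = 0;
        int count = 1;
        while (count <= n) {
            sum = sum + count;
            count = count + 1;
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumWithFor(10));
        System.out.println(sumWithWhile(10));
    }
}
```

Both methods compute the same result. The first simply keeps the counter's setup, test and update on one line.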

Java is far more reliable than C. First, Java has strong type checking, which identifies type mismatches before they create logical errors. Failure to type check "has led to countless program errors ... in the original C language." (17) Java has excellent exception handling, which “take(s) corrective measures, and then continue(s).” (17) While C++ has exception handling, C does not, which greatly reduces reliability. Part of C’s power is its ability to access memory directly with pointers. However, the same memory location can be accessed through different pointers, thereby creating aliasing, which is another detriment to reliability.
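A minimal sketch of what "take corrective measures, and then continue" can look like in Java. The scenario (parsing a number and falling back to a default) and the names RecoverDemo and parseOrDefault are assumptions chosen for illustration. Integer.parseInt and NumberFormatException are standard library features.

```java
// Hypothetical sketch of recovering from an error instead of crashing.
final class RecoverDemo {
    // Parses text as an integer; on bad input, returns the fallback instead.
    static int parseOrDefault(String text, int fallback) {
        try {
            return Integer.parseInt(text);
        } catch (NumberFormatException e) {
            // Corrective measure: substitute a safe value and carry on.
            return fallback;
        }
    }

    public static void main(String[] args) {
        System.out.println(parseOrDefault("42", 0));
        System.out.println(parseOrDefault("forty-two", -1));
    }
}
```

In C, a comparable function would have to signal failure through a return code or a global error value that the caller is free to ignore.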

It is easy to see that Java has many advantages over C. First, Java is more readable. It has all of the control statements, data types and simplicity of C without much of the confusing syntax. Java and C share similar grammar, although C’s memory allocation and pointers make it less writable than Java. Java is far superior to C in reliability because of its exception handling, type checking and lack of pointer aliasing. So why would anyone use C over Java? See the next question.

 

5. Evaluate both Java and C with respect to the ultimate cost (as discussed in Chapter 1 of the Sebesta text). Again, please be thorough and provide a reason/justification for your evaluation.

Java is far superior to C in almost all regards. It is simpler, easier to write and far more reliable. The only benefit C has over Java is cost. The cost of a language has many dimensions. There is the cost of training programmers, the cost of writing programs in the language, the cost of compiling and executing, the cost of the implementation system and the cost of maintenance. (18-19)

Many programmers are trained in Java. However, the C/C++ language is extremely popular in academia and most programmers are at least familiar with it. Many people choose Java because of the variety of free compilers. C has just as many free compilers, if not more. Java is often chosen because of its portability to different platforms. C is even more portable than Java. C is used on platforms from UNIX to Windows and has been standardized longer than Java has been a language (ANSI C circa 1989, Java circa 1994). (39)

But the most noticeable cost hit in Java is the very long compilation, debugging and execution time. Java's excellent type checking, garbage collection and object-oriented design improve the language, but they come at the cost of time. For example, “Java … demands that all references to array elements be checked to ensure that the index or indices are in their legal ranges.” (24) C does not have any sort of index checking. It is common that “C programs execute faster than semantically equivalent Java programs” (24) due to the lack of checking and the low-level access with pointers and dynamic memory. C usually has fewer hardware requirements. Because Java uses more physical memory, one must have more memory available, potentially increasing the cost of hardware.
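The index check quoted above can be observed directly. In this Java sketch, the class and method names are invented for the example. ArrayIndexOutOfBoundsException is the standard exception the runtime raises when an index is out of range.

```java
// Hypothetical sketch: the runtime checks every array index before use.
final class BoundsCheckDemo {
    // Reads one element, reporting instead of failing when the index is illegal.
    static String read(int[] data, int index) {
        try {
            return "value " + data[index];
        } catch (ArrayIndexOutOfBoundsException e) {
            return "index " + index + " rejected";
        }
    }

    public static void main(String[] args) {
        int[] data = {10, 20, 30};
        System.out.println(read(data, 1));
        System.out.println(read(data, 3)); // one past the last legal index
    }
}
```

Each check costs a little time on every access, which is the trade the paragraph describes. The equivalent C access would be unchecked and its behavior undefined.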

Sometimes, C is simply not a feasible option. C is not designed for user interface applications, whereas Java has several different toolkits and libraries for interface design. C is not designed for abstract, high-level programmers. Java plays to those with diverse, but not necessarily advanced, programming skills who want to develop rapid, cross-platform applications. When C is taken out of its element and used to program something better suited for Java, its cost will escalate dramatically. Trying to use C to create a front end to an Oracle database would border on insanity. The maintenance cost and hits to reliability would probably render any attempt inoperable. However, crunching numbers for a GPS unit with 2 MB of total memory would be a ridiculous, perhaps impossible task for Java, whereas C would do just fine.

Most of the time, a programmer would never have to choose between C and Java because the two languages are made for different uses. However, if some circumstance occurred where both were viable options, Java would be easier and C would be cheaper. 

Assignment Three: Create a table of the major programming language developments

Development: Machine Code
Language: Machine Language
Readability: True machine code consists of only two digits, 0 and 1. Machine code is not readable without documentation. Even with documentation, it is almost unreadable.
Writability: Because it only has 1s and 0s, machine code is impossible to write without documentation. Programmers who wrote machine code often used switches, levers and knobs on computers the size of buildings.
Reliability: Humans who enter machine code directly are prone to error. Manually entered machine code is extremely unreliable.
Source: Wikipedia

Development: Pseudocodes
Language: Pseudocode languages include John Backus' Speedcoding, UNIVAC's compiler and Short Code for the BINAC computer
Readability: While not as daunting as machine code, pseudocodes are still unreadable without documentation.
Writability: Pseudocodes were far easier to write than machine codes. Programs were shorter, more intuitive and more efficient. "Backus claimed that problems that could take two weeks to program in machine code could be programmed in a few hours using Speedcoding."
Reliability: Pseudocode is more reliable than machine code. Although Short Code took longer to execute than pure machine code, the result was less prone to error, easier to write, easier to read and easier to maintain.
Source: Class text p 42-43

Development: First High-Level Language
Language: Fortran I
Readability: Fortran is leaps and bounds more readable than any predecessor language. Variable names, input/output programming and algebraic operators made reading code much easier.
Writability: One of the key features of writability is support for abstraction. Fortran did allow subroutines and control statements that could be used over and over again.
Reliability: Fortran I was reliable because "the machine code produced by the compiler would be about as efficient as what could be produced by hand." (46) At the same time, the compiler removed the human error of manually entering machine code and pseudocode. This enhanced reliability beyond any other language at the time.
Source: Class text p 44-46

Development: Functional Programming
Language: LISP, FLPL, IPL
Readability: LISP's readability is quite straightforward. It is simple and orthogonal. In fact, it only has two types, an atom and a list. There is no need for loops, assignments or variables. LISP is very readable.
Writability: Although many would say that LISP is a writable language, the definition of writability on page 17 of the textbook would prove contrary. There is little support for abstraction because there are no subroutines or functions available to programmers. LISP is not expressive because there is almost no way of expressing computation beyond the simple atom-list form. While it is easy to write simple LISP programs, it is not, according to the textbook definition, a writable language.
Reliability: Is LISP reliable? The textbook defines reliability in terms of type checking, aliasing and exception handling. These are largely imperative language characteristics. LISP itself is simple and reliable, but loses reliability as it tries to become increasingly portable and cross-platform. "During the 1970s and 1980s, a large number of different dialects of LISP were developed and used. This led to the familiar problem of portability." (53) COMMON LISP, a standard, helps improve reliability but increases complexity due to more data types and structures.
Source: Class text p 51-54

Development: The failure of ALGOL
Language: ALGOL 60
Readability: ALGOL was to become the universal language. It became the universal language no one used. However, its failure fueled the success of languages for decades to come. ALGOL was amazingly readable. "The syntax of the language should be as close as possible to standard mathematical notation, and programs in it should be readable with little further explanation." (56)
Writability: The syntax of ALGOL influenced C, C++, Java, Pascal and almost every imperative language written thereafter. Block structures, passing parameters to subprograms and recursive procedures gave programmers writability that they never had before.
Reliability: One of the most interesting reliability issues with ALGOL is its cross-platform nature and its ability to "become, almost immediately, the only acceptable formal means of communicating algorithms in computing literature." (59) However, due to several problems, it was seldom implemented. ALGOL is so important because it failed. Its lack of input and output functions showed other language developers how important I/O really is. ALGOL did not allow users to define their own data types. It featured a pitiful type-checking system, which created the freedom to write lackadaisical programs. There is no call-by-reference, only call-by-value and call-by-name. Those who designed Pascal and C must have looked at ALGOL and thought, If only we could take all of this great stuff and just add this and that.

ALGOL was unreliable because it lacked some of the intricacies we now take for granted in programming languages. The most important failure, and subsequent lesson learned, is that ALGOL was never supported by IBM. No one today would develop an "international, commonly used" language that didn't support Windows. Why? Because ALGOL failed. It is largely due to ALGOL's beauty and ultimate failure that so many of its successors succeeded.
Source: Class text p 55-60 and this site

Development: Computerizing Business Records
Language: COBOL
Readability: COBOL is somewhat readable. The Department of Defense, which sponsored the development of COBOL, mandated the "use of English [which] would allow managers to read programs." (63) The use of English words does enable easier reading, especially with the data division and COBOL's allowance of 30-character names.
Writability: Writing COBOL can be constrictive, although easier than some other languages. "The language [is] easy to use, even at the expense of being less powerful, in order to broaden the base of those who could program computers." (62) Unfortunately, the simplicity of COBOL hinders the programmer's ability to write intricate code. For example, "versions of COBOL prior to the 1974 standard ... did not allow subprograms with parameters." (63)
Reliability: The longevity and durability of COBOL are a testament to its reliability. While COBOL was initially slow and cumbersome, fast processors, efficient compilers and cheap memory have made it useful and reliable.
Source: Class text p 61-65

Development: User time is more important than computer time
Language: BASIC
Readability: BASIC was designed to be easy to learn. Instead of int or float, the keyword numbers was used. The original BASIC only had 14 different statements. BASIC was and still is remarkably simple.
Writability: Due to its simplicity, BASIC is not writable except to novice programmers. Its lack of structure and little allowance for expressivity and abstraction make it ineffective and annoying even for mediocre programmers.
Reliability: BASIC is simply not reliable. It was never standardized, except for a simple, bare-bones standard named Minimal BASIC. Even recent upgrades to BASIC, including Microsoft's Visual BASIC and VB.NET, are simple, not standardized and inferior to competing languages such as Delphi, C# and Java.
Source: Class text p 66-70

Development: The merger of scientific and business applications in one language
Language: PL/I
Readability: PL/I was too big to work. One cannot merge the three most complex programming languages of the time (ALGOL, Fortran and COBOL) into one language and expect even remote simplicity. PL/I was not simple, not orthogonal and ultimately, not readable.
Writability: PL/I was massive. Furthermore, unless a programmer knew Fortran, COBOL and ALGOL, the data types, which included pointers, must have seemed foreign and obscure. PL/I was overwhelming even to computer gurus such as Edsger Dijkstra, who voiced harsh criticism concerning the complexity of the language.
Reliability: Poor exception handling, the attempt to hodgepodge several different languages into one, pointers, concurrently executing subprograms that didn't execute properly and almost universal disapproval made PL/I unreliable and unpopular.
Source: Class text p 71-73

Development: Programming Based on Logic
Language: Prolog
Readability: Prolog consists of statements, and only allows for a few kinds of statements that can be combined in complex ways. Therefore, it is simple, but not necessarily orthogonal. Prolog is a simple and readable language, although it is not as readable as written logic.
Writability: Prolog, like BASIC, is easy to write for new programmers, although advanced programmers will struggle with its simplicity and lack of data types. Prolog is somewhat writable, but its lack of imperative design limits its scope.
Reliability: Prolog is not reliable. It is easy to write endless loops due to its "left-to-right depth-first order of evaluation." (641) Also, Prolog does not recognize truth outside of its limited amount of internal information. If there is insufficient information in the database, Prolog will assume goals to be false even if they are really true. Also, there is no known efficient sorting algorithm in Prolog, adding to the already massive computation time.
Source: Class text p 85-86 and 640-645

Development: Object-oriented programming
Language: Smalltalk
Readability: Everything in Smalltalk is an object. Objects communicate with each other through messages. While initially convoluted and foreign to programmers more familiar with Fortran or COBOL, Smalltalk is relatively simple and quite readable once a programmer understands the object-oriented nature of the language.
Writability: Smalltalk is easy to write. As with readability, programmers who had never used an object-oriented language might cringe when first attempting to write a Smalltalk program. However, inheritance, objects, methods and subprograms empower the programmer and make writing code easier. Because Smalltalk is an entire development environment, the graphical interface makes it even easier to write.
Reliability: Programmers in 2003 who want to use an object-oriented language will choose something besides Smalltalk. The type checking, exception handling and overall reliability are far superior in a language such as Java. Although not as reliable as other object-oriented languages, Smalltalk's complete development environment and object-oriented design are overwhelmingly influential in the languages of today and those of tomorrow.
Source: Class text p 92-95

Development: Combining object-oriented and imperative languages
Language: C++, also Delphi and Eiffel
Readability: C++ and Delphi are readable insofar as they mimic languages that most programmers already knew, C and Pascal (respectively). However, neither C nor Pascal is known for simplicity. They are complex, albeit very powerful. C++ and Delphi are readable insofar as they are familiar, not because they are simple or orthogonal.
Writability: C++ is as powerful a language as any programmer needs. Unlike most languages, it allows operator overloading and loose type checking, and it supports procedure-oriented programming. Visual C++ and Delphi provide a graphical interface, like Smalltalk, to make them even more writable. At its time, C++ was probably the most writable language ever developed.
Reliability: C++ is not reliable. Due to its complexity and writability, C++ gives programmers the ability to make mistakes. Illogical overloading of operators (one can make the plus sign mean greater-than, for example), flip-flopping on types and general insecurity make C++ a very dangerous language that is unreliable even in the hands of competent programmers. Delphi, based on the safer Pascal, is far more reliable, but not nearly as expressive or writable.
Source: Class text p 96-99

Development: Language developed for the World Wide Web
Language: HTML
Readability: HTML is remarkably readable. It is simple. In fact, in just one class, even average students can start authoring web pages. All tags are standardized and one need know only a few simple tags and keywords to develop working web pages.
Writability: HTML was not originally writable. There was absolutely no support for abstraction, as all tags were previously defined. There was no support for expressivity either. However, with the introduction of XML, CSS and JavaScript, HTML can maintain its simplicity while using the aforementioned additions to gain greater writability.
Reliability: HTML is reliable now that the W3C has implemented a standard. All browsers, from IE to Opera, obey the W3C's standard to some degree. The result is a simple, reliable language that runs on any browser and any version. Non-standard elements, such as Netscape 4's layer tag, have presented reliability problems due to the lack of cross-browser support. However, standard HTML is very reliable and almost universally accepted.
Source: W3C's site and HTMLGoodies.com

Development: Imperative-based object-oriented language
Language: Java
Readability: Java is simpler and more elegant than the other major imperative/object-oriented language, C++. Java's primitive types are not objects based on classes. Java does not have pointers, enumeration types, stand-alone subprograms or multiple inheritance. Java was designed to be simpler than C++, and it absolutely is.
Writability: Although some would disagree, it is my opinion that no language (except perhaps C#) is as writable as C++. C++ has as much expressivity and support for abstraction as any programmer could possibly want. For example, Java will not implicitly coerce a float to an int, whereas in C and C++ the coercion is possible and often done. Java is writable because it is simpler than its competitor languages, but it is not as powerful and does not allow the programmer as much freedom, abstraction or expressivity.
Reliability: Java is absolutely more reliable than any of its predecessors. In fact, Java's reliability is its strength. By eliminating some of the features that make C++ more writable, Java becomes an unbelievably reliable language. It has strong type checking, garbage collection, no multiple inheritance and implicit dereferencing, and methods can only be called through a class or object. All of these features, and several not mentioned, make Java one of the most reliable languages, if not the most reliable, available to programmers today.
Source: Class text p 99-103

Development: Scripting languages for the web
Language: JavaScript, also PHP and JScript
Readability: To anyone familiar with C, Java or C++, JavaScript is almost immediately readable. However, because it is often embedded in HTML, it can be somewhat tricky to spot. Also, the line where JavaScript stops and HTML begins is hazy at best. Not necessarily simple or orthogonal, JavaScript is readable only so far as it is familiar. To someone who has no previous knowledge of C(++) or Java, JavaScript will look odd and confusing.
Writability: HTML is a very limited language (if you call it a language). JavaScript provides the expressivity and support for abstraction that HTML lacks. JavaScript uses functions and expressions, and provides a way for developers to empower web pages without the intricacies of Java applets. Compared with straight HTML, JavaScript is very writable.
Reliability: While it is commonplace and very popular, JavaScript is not reliable. The variety of browsers and browser versions makes writing JavaScript somewhat of a guessing game. Does IE 4 support my script? How about the new Opera or Netscape browser? Also, JavaScript array indices are not checked for validity, it is not strongly typed and it does not support inheritance or "dynamic binding of method calls to methods." (104) JavaScript is not reliable due to its own faults and varying browsers.
Source: Class text p 103-105
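The Java entry in the table mentions that a float cannot simply be assigned to an int the way it can in C and C++. A small Java sketch of that rule follows. The class and method names are hypothetical, and the commented-out line shows the assignment the compiler would refuse.

```java
// Hypothetical sketch: Java requires an explicit cast to narrow float to int.
final class NarrowingDemo {
    static int truncate(float value) {
        // int whole = value;   // would not compile: possible lossy conversion
        return (int) value;     // explicit cast; the fractional part is dropped
    }

    public static void main(String[] args) {
        System.out.println(truncate(3.9f));
        System.out.println(truncate(-3.9f));
    }
}
```

The cast makes the loss of precision visible in the source, which costs a few keystrokes of writability in exchange for reliability.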

Miscellaneous developments in programming languages

Development: Unpopular languages that gain influence over time
Language: Plankalkül and SIMULA
Discussion: Plankalkül was developed in the mid-1940s, but was largely unknown until its manuscript was published in 1972. The language featured "programs to sort arrays of numbers, test the connectivity of a given graph, carry out integer and floating-point operations...and perform syntax analysis on logic formulas." (40) If discovered in its time, Plankalkül would have been revolutionary. Its syntax included keywords that were not developed until Fortran.

SIMULA is an extension of ALGOL. It includes block structures, class instances and data structures. SIMULA introduced a key concept that is used in almost all object-oriented programming today. "Data structures and the routines that manipulate that data structure are packaged together ... distinct from a class instance, so a program can create and use any number of instances of a particular class." (76-77)

Plankalkül and SIMULA introduced concepts that were decades ahead of their time. Although neither language was implemented to any great extent, their influence is still felt today.
Source: Class text p 38-41 and 76-77

Development: Languages used primarily for academia
Language: BASIC and Pascal
Discussion: BASIC, as already discussed, was developed at Dartmouth College in New Hampshire as a way for liberal arts majors to learn the basics of programming. This trend continued with the development of Pascal. "Because Pascal was designed as a teaching language, it lacks several features that are essential for many kinds of applications." (80) Pascal is relatively safe, popular and easy to learn. However, it has a weak compiler, little support for expressive subprograms and nonstandard dialects. Nevertheless, languages such as BASIC and Pascal have empowered generations of students with the ability to write complex, working programs.
Source: Class text p 66-67, 79-81

Development: Current developments
Language: C# and XML
Discussion: In an effort to combine the reliability of Java with the writability of C++, Microsoft has developed the C# language. C# features a just-in-time compiler that bypasses the cumbersome virtual machine of Java. C# includes pointers, operator overloading, enum types and goto statements, all of which were left out of Java. Also, C# is XML-compatible.

XML is an abbreviation for Extensible Markup Language, and is both an improvement on and a supplement to HTML. XML is a way of documenting structured information. In HTML, the tags have fixed meaning. In XML, however, the developer is allowed to create his/her own tags and their meaning.

Both C# and XML attempt to define new standards in the programming industry. They both attempt to take the readability and writability of existing languages and improve them by adding features that enable businesses to develop both stand-alone and web-based applications more accurately and efficiently.
Source: Class text p 106-107, Microsoft's web site and XML.com

 

Brian Buckley
buckleyb@umkc.edu