addressed well before the emergence of computers, in the late 19th century, by logicians, mathematicians and philosophers, who introduced a range of different approaches to the theory of calculability.

      The Turing machine [TUR 95] is a mathematical model of computation introduced in 1936. This machine operates on an infinite memory tape divided into cells and has three instructions: move one cell to the right or left along the tape, read or write a symbol in a cell, and compare the contents of two cells. It has been formally proven that any “imperative” programming language, featuring assignment, a conditional instruction and a while loop, has the same power of expression as this Turing machine.
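      To make the model concrete, the following sketch, in C, simulates the step loop of one particular machine: a machine that complements a binary word and halts on the first blank. The tape is bounded here for simplicity (the mathematical model assumes an unbounded tape), and all names (Rule, TAPE_LEN, the rule table) are ours, not the book's.

      #include <stdio.h>

      /* Sketch of a Turing-machine step loop. Each rule, chosen from the
         current state and the symbol under the head, writes a symbol,
         moves the head one cell left or right and selects the next state. */

      #define TAPE_LEN 16
      #define HALT     -1

      typedef struct {
        int  next_state;   /* state to switch to            */
        char write;        /* symbol to write in the cell   */
        int  move;         /* +1: move right, -1: move left */
      } Rule;

      int main(void) {
        char tape[TAPE_LEN + 1] = "1011____________";   /* '_' stands for a blank cell */
        int head = 0, state = 0;                        /* single state, numbered 0    */

        while (state != HALT && head >= 0 && head < TAPE_LEN) {
          char c = tape[head];
          Rule r;
          if (c == '0')      r = (Rule){0, '1', +1};    /* flip 0 to 1, move right */
          else if (c == '1') r = (Rule){0, '0', +1};    /* flip 1 to 0, move right */
          else               r = (Rule){HALT, c, 0};    /* blank: halt             */
          tape[head] = r.write;
          head += r.move;
          state = r.next_state;
        }
        printf("%s\n", tape);                           /* prints 0100____________ */
        return 0;
      }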

      Several other models of the notion of algorithmic computation were introduced over the course of the 20th century, and have been formally proven to be equivalent to the Turing machine. One notable example is Kleene’s recursion theory [KLE 52], which underpins the “pure functional” languages, built on the notion of (potentially) recursive functions; these languages therefore also have the same power of expression as the Turing machine. Pure functional and imperative languages have developed in parallel throughout the history of high-level programming, leading to different programming styles.

      1.2.2. High-level languages

      Broadly speaking, the execution of a functional program carries out a series of function calls that lead to the result, with intermediate values stored exclusively in the registers. The execution of an imperative program carries out a sequence of modifications of memory cells named by identifiers, the values in the cells being computed during execution. The most widespread high-level languages include both functional and imperative features, along with various possibilities (modules, object features, etc.) to divide source code into pieces that can be reused.
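      As an illustration of the two styles in C (the functions fact_rec and fact_loop are ours, chosen only for this sketch), the first version below builds its result through a chain of function calls without modifying any variable, while the second repeatedly overwrites the memory cells named acc and i:

      #include <stdio.h>

      /* Functional style: the result is obtained by a series of function calls;
         no previously defined variable is ever modified. */
      long fact_rec(long n) {
        return (n <= 1) ? 1 : n * fact_rec(n - 1);
      }

      /* Imperative style: the result is accumulated by assigning new values to
         memory cells named by identifiers. */
      long fact_loop(long n) {
        long acc = 1;
        for (long i = 2; i <= n; i++)
          acc = acc * i;
        return acc;
      }

      int main(void) {
        printf("%ld %ld\n", fact_rec(5), fact_loop(5));   /* prints 120 120 */
        return 0;
      }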

      Whatever the style of programming used, any program written in a high-level language needs to be translated into binary language to be executed. This translation is performed either every time the program is executed, in which case the translation program is known as an interpreter, or just once, storing the produced binary code, in which case the translator is known as a compiler.

      The C program

      #include <stdio.h>

      int main () {
        int i = 0;
        printf("%d\n", i++);
        return (0);
      }

      will print 0, but if i++ is replaced with ++i, the same program will print 1.

      There are a number of concepts that are common to all high-level languages: value naming, organization of namespaces, explicit memory management, etc. However, these concepts may be expressed using different syntactic constructs. The field of language semantics covers a set of logico-mathematical theories that describe these concepts and their properties. Constructing the semantics of a program makes it possible to formally verify whether the program possesses all of the required properties.

      1.2.3. From source code to executable programs

      The transition from the program source to its execution is a multistep process. Some of these steps may differ in different languages. In this section, we shall give an overview of the main steps involved in analyzing and transforming source code, applicable to most programming languages.

      The source code of a program is made up of one or more text files. Indeed, to ease software architecture, most languages allow source code to be split across several files, known as compilation units. Each file is processed separately prior to the final phase, in which the results of processing are combined into one single executable file.
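      As a hypothetical example, a small C program could be split into the following compilation units (the file names counter.h, counter.c and main.c are illustrative); each .c file is compiled separately, and a final link step combines the resulting object files into a single executable:

      /* ---- counter.h: interface shared between the compilation units ---- */
      #ifndef COUNTER_H
      #define COUNTER_H
      int next_id(void);
      #endif

      /* ---- counter.c: first compilation unit ---- */
      #include "counter.h"

      int next_id(void) {
        static int id = 0;
        return ++id;
      }

      /* ---- main.c: second compilation unit ---- */
      #include <stdio.h>
      #include "counter.h"

      int main(void) {
        printf("%d %d\n", next_id(), next_id());   /* prints 1 2 */
        return 0;
      }

      /* A typical build compiles each unit on its own and then links them,
         for example: cc -c counter.c ; cc -c main.c ; cc counter.o main.o -o program */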

      1.2.3.1. Lexical analysis

      Lexical analysis is the first phase of translation: it converts the sequence of characters making up the source file into a sequence of words, assigning each one to a category. Comments are generally deleted at this stage. Consider, for example, the following text, presumed to be written in C:

      /* This is a comment. */ if [x == 3 int +) cos ($v)

      Lexical analysis may be seen as a form of “spell check”: each recognized word is assigned to a category (keyword, constant, identifier), with no concern yet for whether the words are arranged in a meaningful order. These words are referred to as tokens.
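      As a rough illustration, the following hand-written sketch of a lexical analyzer in C cuts a character string into tokens and prints a category for each one. It handles only a tiny subset of the language, does not group multi-character operators such as == into a single token, and every name in it is ours; production lexers are usually generated from regular-expression descriptions.

      #include <ctype.h>
      #include <stdio.h>
      #include <string.h>

      static const char *keywords[] = { "if", "else", "int", "while", NULL };

      static int is_keyword(const char *w) {
        for (int i = 0; keywords[i]; i++)
          if (strcmp(w, keywords[i]) == 0) return 1;
        return 0;
      }

      int main(void) {
        const char *src = "if (x == 3) x = x + 1;";
        for (const char *p = src; *p; ) {
          if (isspace((unsigned char)*p)) { p++; continue; }   /* skip blanks */
          if (isalpha((unsigned char)*p)) {                    /* keyword or identifier */
            char word[32]; int n = 0;
            while (isalnum((unsigned char)*p) && n < 31) word[n++] = *p++;
            word[n] = '\0';
            printf("%-8s %s\n", is_keyword(word) ? "KEYWORD" : "IDENT", word);
          } else if (isdigit((unsigned char)*p)) {             /* integer constant */
            char num[32]; int n = 0;
            while (isdigit((unsigned char)*p) && n < 31) num[n++] = *p++;
            num[n] = '\0';
            printf("%-8s %s\n", "CONSTANT", num);
          } else {                                             /* punctuation or operator */
            printf("%-8s %c\n", "SYMBOL", *p);
            p++;
          }
        }
        return 0;
      }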

      1.2.3.2. Syntactic analysis

      Every language follows a grammar. For example, in English, a sentence is generally considered to be correctly formed if it contains a subject, a verb and a complement in an understandable order. Programming languages are no exception: syntactic analysis verifies that the phrases of a source file conform to the grammar of their language. For example, in C, the keyword if must be followed by an expression in parentheses, an instruction must end with a semicolon, etc. Clearly, the source text given above, in the discussion of lexical analysis, does not respect the syntax of C.

      Technically, the syntactic analyzer is in charge of the complete grammatical analysis of the source file. It calls the lexical analyzer every time it requires a token to progress through the analyzed source. Syntactic analysis is thus a form of grammar verification, and it also builds a representation of the source file as a data structure, usually a tree, known as the abstract syntax tree (AST). This data structure is used by all the following phases of compilation, up to the point of execution by an interpreter or the creation of an executable file.
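      The sketch below shows, in C, what such a data structure might look like for simple arithmetic expressions; the type Expr and the helpers lit, binop and eval are illustrative, not those of any particular compiler. Later phases (here, a small evaluator) work by walking the tree rather than the source text:

      #include <stdio.h>
      #include <stdlib.h>

      /* An AST node is either an integer literal or a binary operator
         applied to two subtrees. */
      typedef struct Expr {
        enum { LIT, BINOP } kind;
        int value;                  /* used when kind == LIT   */
        char op;                    /* used when kind == BINOP */
        struct Expr *left, *right;  /* used when kind == BINOP */
      } Expr;

      static Expr *lit(int v) {
        Expr *e = malloc(sizeof *e);
        e->kind = LIT; e->value = v; e->left = e->right = NULL;
        return e;
      }

      static Expr *binop(char op, Expr *l, Expr *r) {
        Expr *e = malloc(sizeof *e);
        e->kind = BINOP; e->op = op; e->left = l; e->right = r;
        return e;
      }

      static int eval(const Expr *e) {   /* a later phase walking the tree */
        if (e->kind == LIT) return e->value;
        return e->op == '+' ? eval(e->left) + eval(e->right)
                            : eval(e->left) * eval(e->right);
      }

      int main(void) {
        /* AST of "1 + 2 * 3": the grammar, not the textual order alone,
           dictates that '*' binds tighter than '+'. */
        Expr *tree = binop('+', lit(1), binop('*', lit(2), lit(3)));
        printf("%d\n", eval(tree));      /* prints 7 */
        return 0;
      }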

      1.2.3.3. Semantic analyses

      The first two analysis phases of compilation only concern the textual structure of the source. They do not concern the meaning of the program, i.e. its semantics. Source texts that pass the syntactic analysis phase do not always have meaning. The phrase “the sea eats a derivable rabbit” is grammatically correct, but is evidently nonsense.

      The best-known semantic analysis is typing analysis, which prohibits the combination of elements that are incompatible in nature. Thus, in the previous phrase, “derivable” could apply to a function, but certainly not to a “rabbit”.
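      In C, for instance, the typing analysis accepts the first call below but would reject the second (left commented out so that the sketch compiles), since a character string is not compatible with the double expected by sqrt:

      #include <math.h>
      #include <stdio.h>

      int main(void) {
        double d = sqrt(2.0);        /* well typed: sqrt expects a double          */
        /* double e = sqrt("rabbit");   ill typed: rejected by the typing analysis */
        printf("%f\n", d);
        return 0;
      }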

      1.2.3.4.
