Introduction
The General Problem of Describing
Syntax
Formal Methods of Describing Syntax
Attribute Grammars
Describing the Meanings of
Programs: Dynamic Semantics
Semantics
The meaning of the expressions, statements, and
program units, that is, what language constructs *do*
Example: the syntax of a Java while statement
is
while (boolean_expr) statement
Steps in a compiler:
2. Generators
A language generator is a device that
can be used to generate the
sentences of a language.
Context-Free Grammars
Developed by Noam Chomsky in the
mid-1950s
Language generators, meant to
describe the syntax of natural
languages
Define a class of languages called
context-free languages
Context-Free Grammar
In linguistics and computer science, a context-free
grammar (CFG) is a formal grammar in which every
production rule is of the form
V → w
where V is a non-terminal symbol and w is a string
consisting of terminals and/or non-terminals.
The term "context-free" expresses the fact that the nonterminal V can always be replaced by w, regardless of
the context in which it occurs.
A formal language is context-free if there is a context-free grammar that generates it.
A context-free grammar (CFG)
is a set of recursive rewriting rules (or productions) used to
generate patterns of strings.
A CFG consists of the following components:
- a set of terminal symbols, which are the characters of the
alphabet that appear in the strings generated by the
grammar, e.g. {0, 1}
- a set of nonterminal symbols, which are placeholders for
patterns of terminal symbols that can be generated by the
nonterminal symbols, e.g. {q, f}
- a set of productions, which are rules for replacing (or
rewriting) nonterminal symbols (on the left side of the
production) in a string with other nonterminal or terminal
symbols (on the right side of the production)
- a start symbol, which is a special nonterminal symbol that
appears in the initial string generated by the grammar
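As a minimal sketch of these four components, the Python below encodes a small grammar over the terminals {0, 1} and derives its sentences. The nonterminal `S`, its two productions, and the helper `generate` are illustrative assumptions, not taken from the slides.

```python
from collections import deque

# Hypothetical grammar (an assumption for illustration):
#   S -> 0 S 1 | 0 1      generates 0^n 1^n for n >= 1
TERMINALS = {"0", "1"}
NONTERMINALS = {"S"}
PRODUCTIONS = {"S": [["0", "S", "1"], ["0", "1"]]}
START = "S"

def generate(max_len):
    """Return every sentence of at most max_len symbols, shortest first."""
    sentences = []
    queue = deque([[START]])          # begin with the start symbol
    while queue:
        form = queue.popleft()
        # Locate the leftmost nonterminal of the sentential form.
        idx = next((i for i, s in enumerate(form) if s in NONTERMINALS), None)
        if idx is None:               # only terminals left: a sentence
            sentences.append("".join(form))
            continue
        for rhs in PRODUCTIONS[form[idx]]:
            new_form = form[:idx] + rhs + form[idx + 1:]
            # No production shrinks a form, so longer forms can be dropped.
            if len(new_form) <= max_len:
                queue.append(new_form)
    return sorted(sentences, key=len)

print(generate(6))
```

Each step rewrites one nonterminal regardless of its neighbours, which is the "context-free" property described earlier.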
Derivation of a program
Ambiguity
A grammar that generates a
sentential form for which there are
two or more distinct parse trees is
said to be ambiguous
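To make the definition concrete, the sketch below counts the distinct parse trees of a token sequence under a deliberately ambiguous grammar. The grammar `E -> E + E | E * E | id` and the function `count_parse_trees` are assumptions chosen for illustration.

```python
from functools import lru_cache

# Hypothetical ambiguous grammar: E -> E + E | E * E | id
OPERATORS = {"+", "*"}

def count_parse_trees(tokens):
    """Count the distinct parse trees of tokens under the grammar above."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(lo, hi):
        # Number of ways tokens[lo:hi] can be derived from E.
        if hi - lo == 1:
            return 1 if tokens[lo] == "id" else 0
        total = 0
        # Try every operator as the root of the tree.
        for i in range(lo + 1, hi - 1):
            if tokens[i] in OPERATORS:
                total += count(lo, i) * count(i + 1, hi)
        return total

    return count(0, len(tokens))

print(count_parse_trees(["id", "+", "id", "*", "id"]))
```

A count greater than one for any sentence means the grammar is ambiguous.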
Operator Precedence
How to remove the ambiguity?
By enforcing operator precedence
An operator with higher precedence
should be placed lower in the tree
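One common way to enforce precedence is to give each precedence level its own nonterminal. The recursive-descent sketch below assumes the grammar `expr -> term {+ term}`, `term -> factor {* factor}`, with any non-operator token as a factor; the grammar and the function `parse` are illustrative assumptions.

```python
def parse(tokens):
    """Build a nested-tuple parse tree in which * binds tighter than +."""
    pos = 0

    def factor():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] in ("+", "*"):
            raise SyntaxError(f"operand expected at position {pos}")
        tok = tokens[pos]
        pos += 1
        return tok

    def term():                       # term -> factor {* factor}
        nonlocal pos
        node = factor()
        while pos < len(tokens) and tokens[pos] == "*":
            pos += 1
            node = ("*", node, factor())
        return node

    def expr():                       # expr -> term {+ term}
        nonlocal pos
        node = term()
        while pos < len(tokens) and tokens[pos] == "+":
            pos += 1
            node = ("+", node, term())
        return node

    tree = expr()
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return tree

print(parse(["a", "+", "b", "*", "c"]))
```

In the resulting tree the `*` node is a child of the `+` node, so the multiplication sits lower in the tree and is evaluated first.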
Leftmost derivation
Rightmost derivation
Associativity of Operators
Left Associativity
Left recursive grammar rule
e.g., a + b + c = (a + b) + c
e.g., A = B + C + A
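The grouping chosen by the grammar matters whenever the operator is not associative. The sketch below uses subtraction (an assumption made so the difference is visible; the slide example uses +) and compares the tree shape of a left-recursive rule with that of a right-recursive rule. All function names are illustrative.

```python
from functools import reduce

def left_assoc(operands):
    """Tree shape produced by a left-recursive rule: expr -> expr - term | term."""
    return reduce(lambda left, right: ("-", left, right), operands)

def right_assoc(operands):
    """Tree shape produced by a right-recursive rule: expr -> term - expr | term."""
    return reduce(lambda right, left: ("-", left, right), reversed(operands))

def evaluate(tree):
    """Evaluate a nested ('-', left, right) tuple whose leaves are numbers."""
    if not isinstance(tree, tuple):
        return tree
    _, left, right = tree
    return evaluate(left) - evaluate(right)

left_tree = left_assoc([10, 4, 3])     # (10 - 4) - 3
right_tree = right_assoc([10, 4, 3])   # 10 - (4 - 3)
print(left_tree, evaluate(left_tree))
print(right_tree, evaluate(right_tree))
```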
What languages do
Smalltalk: no operator precedence
APL: right-associativity with no operator precedence
- Programmers typically add parentheses when things are
difficult to read
if (a < x < b)
Extended BNF
Three extensions are commonly
included in the various versions of
EBNF.
The first of these denotes an optional
part of an RHS, which is delimited by
brackets. For example, a C if-else
statement can be described as
<if_stmt> → if (<expression>) <statement> [else <statement>]
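The second extension uses braces in an RHS to
enclose a part that can be repeated. For example,
a list of identifiers can be described by the RHS
<identifier> {, <identifier>}
This is a replacement of the recursion
by a form of implied iteration; the
part enclosed within braces can be
repeated indefinitely or left out
altogether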
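The implied iteration of the braces maps directly onto a loop in a hand-written parser. The sketch below parses the RHS `<identifier> {, <identifier>}`; the function name `parse_ident_list` and the use of pre-split tokens are assumptions for illustration.

```python
def parse_ident_list(tokens):
    """Parse  <identifier> {, <identifier>}  and return the identifier names."""
    pos = 0

    def identifier():
        nonlocal pos
        if pos >= len(tokens) or not tokens[pos].isidentifier():
            raise SyntaxError(f"identifier expected at position {pos}")
        name = tokens[pos]
        pos += 1
        return name

    names = [identifier()]
    # The braces {...} become a loop that runs zero or more times.
    while pos < len(tokens) and tokens[pos] == ",":
        pos += 1
        names.append(identifier())
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token at position {pos}")
    return names

print(parse_ident_list(["a", ",", "b", ",", "c"]))
```

An optional part written in brackets would become an `if` in the same style.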
Attribute Grammar
CFGs cannot describe all of the syntax of
programming languages
An attribute grammar is a device used to
describe more of the structure of a
programming language than can be
described with a context-free grammar.
An attribute grammar is an extension to a
context-free grammar
Primary value of attribute grammars (AGs)
Static semantics specification
Compiler design (static semantics checking)
Syntax in Lisp?
- No expressions in LISP, but function
applications!
- Everything is parenthesized.
- No associativity, No operator precedence.
- Compare with the 25 pages of the Java manual
- This is also why writing a LISP interpreter is not
very difficult
- writing a pure Java interpreter would be
extremely difficult
- writing a bytecode interpreter is not that hard
Attribute grammars
are a formal approach both to describing
and checking the correctness of the
static semantics rules of a program.
Dynamic semantics
is the meaning of expressions,
statements, and program units
Basic concepts
Attributes, which are associated with
grammar symbols (the terminal and nonterminal symbols), are similar to variables in
the sense that they can have values assigned
to them.
Attribute computation functions,
sometimes called semantic functions, are
associated with grammar rules. They are used
to specify how attribute values are computed.
Predicate functions, which state the static
semantic rules of the language, are associated
with grammar rules to check for attribute
consistency.
3. Predicate functions
- Boolean expressions on the attribute set {A(X0), ..., A(Xn)}
Intrinsic attributes
Intrinsic attributes are synthesized
attributes of leaf nodes whose values
are determined outside the parse
tree.
Examples of attribute
grammars
The string attribute of <proc_name>, denoted by <proc_name>.string,
is the actual string of characters that were found immediately following
the reserved word procedure by the compiler.
Syntax rule:
<proc_def> → procedure <proc_name>[1] <proc_body> end <proc_name>[2];
Predicate:
The predicate rule states that the name string attribute of the
<proc_name> nonterminal in the subprogram header must match the
name string attribute of the <proc_name> nonterminal following the
end of the subprogram:
<proc_name>[1].string == <proc_name>[2].string
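A minimal sketch of how such a predicate could be checked is shown below. It assumes the procedure has already been split into tokens and treats the body as opaque; the function `check_proc_def` and the dictionary representation of attributes are illustrative assumptions.

```python
def check_proc_def(tokens):
    """Check  procedure <proc_name>[1] <proc_body> end <proc_name>[2] ;

    Returns True when the predicate on the two name strings holds.
    """
    # Syntax rule: fixed keywords at the expected positions.
    if (len(tokens) < 5 or tokens[0] != "procedure"
            or tokens[-3] != "end" or tokens[-1] != ";"):
        raise SyntaxError("not a <proc_def>")
    # Intrinsic attribute 'string' of each <proc_name> leaf.
    proc_name_1 = {"string": tokens[1]}
    proc_name_2 = {"string": tokens[-2]}
    # Predicate: <proc_name>[1].string == <proc_name>[2].string
    return proc_name_1["string"] == proc_name_2["string"]

print(check_proc_def(["procedure", "sort", "body", "end", "sort", ";"]))
print(check_proc_def(["procedure", "sort", "body", "end", "merge", ";"]))
```

The syntax rule alone accepts both inputs; only the predicate rejects the second one, which is the kind of static semantic check a CFG cannot express.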
Evaluation
Checking the static semantic rules of a language is an
essential part of all compilers.
Even if a compiler writer has never heard of an attribute
grammar, the fundamental ideas of attribute grammars are still needed to
design the checks of static semantics rules of the compiler.
One of the main difficulties in using an attribute grammar
to describe all of the syntax and static semantics of a real
contemporary programming language is the size and
complexity of the attribute grammar.
The large number of attributes and semantic rules required
for a complete programming language make such
grammars difficult to write and read.
In addition, the attribute values on a large parse tree are costly to
evaluate.
Semantics
There is no single widely acceptable notation or formalism for
describing semantics
A semantics description tool is useful for:
- Better understanding the statements of a language
- Developing more effective compilers
- Program correctness proofs
Three approaches
Operational
Axiomatic
Denotational
Operational Semantics
Describe the meaning of a program by executing its statements on a machine,
either simulated or actual.
The change in the state of the machine (memory, registers, etc.) defines the
meaning of the statement
Basic Process
The first step is to design an appropriate intermediate language,
whose primary desired characteristic is clarity.
Every construct of the intermediate language must
have an obvious and unambiguous meaning.
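The idea can be sketched with a tiny interpreter for a made-up intermediate language. The four instructions (`assign`, `incr`, `if_eq_goto`, `goto`) and the function `run` are assumptions for illustration; the meaning of a program is the state change the interpreter produces.

```python
def run(program, state, max_steps=1000):
    """Execute a list of instructions and return the final machine state."""
    state = dict(state)               # do not modify the caller's state
    pc = 0                            # program counter
    steps = 0
    while pc < len(program):
        steps += 1
        if steps > max_steps:
            raise RuntimeError("step limit exceeded")
        op, *args = program[pc]
        if op == "assign":            # x = y, where y is a variable or an int
            target, source = args
            state[target] = state[source] if isinstance(source, str) else source
            pc += 1
        elif op == "incr":            # x = x + 1
            state[args[0]] += 1
            pc += 1
        elif op == "if_eq_goto":      # if x == y goto label
            left, right, label = args
            pc = label if state[left] == state[right] else pc + 1
        elif op == "goto":            # goto label
            pc = args[0]
        else:
            raise ValueError(f"unknown instruction {op!r}")
    return state

# y = 0; while y <> x do y = y + 1 end, translated by hand
PROGRAM = [
    ("assign", "y", 0),
    ("if_eq_goto", "y", "x", 4),
    ("incr", "y"),
    ("goto", 1),
]
print(run(PROGRAM, {"x": 3}))
```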
Evaluation
Good if used informally (language manuals, etc.)
Extremely complex if used formally, e.g., the Vienna
Definition Language (VDL), which was used for
describing the semantics of PL/I. (PL/I stands for
"Programming Language One". PL/I predates
the C programming language,
which largely displaced it as a general-purpose
programming language.)
Useful for language users and implementors
Based on algorithms, rather than mathematics
Denotational semantics
It is an approach to formalizing the
meanings of programming languages
by constructing mathematical objects
(called denotations) that describe the
meanings of
expressions from the languages
The method is named denotational
because the mathematical objects
denote the meaning of their
corresponding syntactic entities.
Denotational vs operational
semantics
Example
Denotational semantics:
The change in the values of the program's variables
The state change is defined by mathematical functions
Expressions
Semantic function: maps expressions to
integer value (or error)
Assignment Statements
Semantic function: maps state to state
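These two semantic functions can be sketched as ordinary Python functions. The names `M_expr` and `M_assign`, the `ERROR` marker, and the tuple encoding of expressions are assumptions for illustration; a state is modelled as a dictionary from variable names to integers.

```python
ERROR = "error"

def M_expr(expr, state):
    """Map an expression to an integer value, or to ERROR."""
    if isinstance(expr, int):                 # integer literal
        return expr
    if isinstance(expr, str):                 # variable reference
        return state.get(expr, ERROR)         # undefined variable is an error
    op, left, right = expr                    # binary expression
    lval, rval = M_expr(left, state), M_expr(right, state)
    if lval == ERROR or rval == ERROR:
        return ERROR
    if op == "+":
        return lval + rval
    if op == "*":
        return lval * rval
    return ERROR                              # unknown operator

def M_assign(var, expr, state):
    """Map a state to a new state, or to ERROR."""
    if state == ERROR:
        return ERROR
    value = M_expr(expr, state)
    if value == ERROR:
        return ERROR
    return {**state, var: value}              # new state, the old one is untouched

print(M_assign("a", ("+", "b", 1), {"b": 4}))
```

No machine is simulated here: each construct is given meaning purely by a function from its parts and the current state to a value or a new state.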
Evaluation of denotational
semantics:
When a complete system has been defined
for a given language, it can be used to
determine the meaning of complete
programs in that language. That is, it provides a
rigorous way to analyze programs
Can be used to prove the correctness of
programs
Can be an aid to language design and
compiler generation
But too complex for language users
Axiomatic Semantics
Axiomatic semantics is an approach
based on mathematical logic to proving
the correctness of computer programs
Goal: program correctness proof
That is, rather than directly specifying the
meaning of a program, axiomatic semantics
specifies what can be proven about the
program.
Approach
Specify constraints for each statement
by logical expressions called assertions (or
predicates)
Pre-condition:
an assertion before a statement
Describes the constraints on the variables at that
point
Post-condition:
an assertion following a statement
Describes the new constraints on those variables
Weakest precondition:
The least restrictive precondition that will
guarantee the postcondition.
Pre-post form Notation: {P} S {Q}
S: program statement
P: constraints on variables BEFORE statement
execution called a "precondition"
Q: constraints on variables AFTER statement
execution called a "postcondition"
Example: a = b + 1 {a > 1}
One possible precondition: {b > 10}
Weakest precondition: {b > 0}
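The difference between a sufficient precondition and the weakest one can be checked by brute force over a finite range of integers. This is only a test over sample values, not a proof; the helper names `holds` and `is_weakest` and the chosen range are assumptions.

```python
def holds(pre, stmt, post, values):
    """True if {pre} stmt {post} holds for every tested initial value of b."""
    return all(post(stmt(b)) for b in values if pre(b))

def is_weakest(pre, stmt, post, values):
    """True if pre accepts exactly the tested values from which post is reached."""
    return all(pre(b) == post(stmt(b)) for b in values)

stmt = lambda b: b + 1        # a = b + 1, returns the new value of a
post = lambda a: a > 1        # postcondition {a > 1}
values = range(-100, 101)     # finite test range (an assumption)

print(holds(lambda b: b > 10, stmt, post, values))       # sufficient
print(is_weakest(lambda b: b > 10, stmt, post, values))  # but not weakest
print(is_weakest(lambda b: b > 0, stmt, post, values))   # weakest
```

`{b > 10}` guarantees the postcondition but also rejects values such as b = 5 that would satisfy it, so it is more restrictive than necessary.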
An inference rule has the general form
S1, S2, . . . , Sn
------------------
S
This rule states that if S1, S2, . . . , and Sn are true, then
the truth of S can be inferred.
The top part of an inference rule is called its antecedent;
the bottom part is called its consequent.
An axiom is a logical statement that is assumed to be true.
Therefore, an axiom is an inference rule without an
antecedent.
Assignment Statements
- Example:
a = b / 2 - 1 {a < 10}
the weakest precondition is:
P = {b / 2 - 1 < 10}, which reduces to {b < 22}.
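The computation above substitutes the right-hand side of the assignment for the assigned variable in the postcondition. A minimal sketch of that substitution, with predicates modelled as Python functions over a state dictionary, is given below; `wp_assign` is a hypothetical helper, and real-number division is assumed.

```python
def wp_assign(var, expr, post):
    """Weakest precondition of 'var = expr' for postcondition post.

    Evaluating post in the state where var holds the value of expr
    has the same effect as substituting expr for var inside post.
    """
    return lambda state: post({**state, var: expr(state)})

expr = lambda s: s["b"] / 2 - 1       # right-hand side: b / 2 - 1
post = lambda s: s["a"] < 10          # postcondition {a < 10}
pre = wp_assign("a", expr, post)      # behaves like {b / 2 - 1 < 10}

print(pre({"b": 21}))
print(pre({"b": 22}))
```

Over the tested integers the computed precondition agrees with the reduced form {b < 22}.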
Sequences
The weakest precondition for a sequence of
statements cannot be described by an axiom,
because the precondition depends on the particular
kinds of statements in the sequence.
So the precondition can only be described with an
inference rule.
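Let S1 and S2 be adjacent program statements. If S1
and S2 have the following pre- and post-conditions
{P1} S1 {P2}
{P2} S2 {P3}
then the inference rule for the sequence is
{P1} S1 {P2}, {P2} S2 {P3}
--------------------------
{P1} S1; S2 {P3}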
Selection
Example
while y <> x do y = y + 1 end {y = x}
Note:
In assertions, the equal sign means mathematical
equality;
outside assertions, it means the assignment operator.
4. Loop termination
From the previous example, the question is
whether the loop
{y <= x} while y <> x do y = y + 1 end
{y = x} terminates.
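A bounded simulation can illustrate the termination question, although it is only evidence and not a proof. The function `run_loop` and its iteration limit are assumptions for illustration; x and y are taken to be integers.

```python
def run_loop(x, y, max_iterations=10_000):
    """Run  while y <> x do y = y + 1 end  for at most max_iterations steps.

    Returns (terminated, final_y).
    """
    for _ in range(max_iterations):
        if y == x:                # loop condition y <> x is false: exit
            return True, y
        y += 1                    # loop body
    return y == x, y

print(run_loop(5, 2))                       # starts with y <= x
print(run_loop(2, 5, max_iterations=100))   # starts with y > x
```

When y <= x initially, each iteration moves y one step closer to x, so the loop exits with y = x. When y > x, y only moves further away, which is why the precondition {y <= x} is needed.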
Program Proofs
Evaluation
Developing axioms or inference rules for
all of the statements in a language is
difficult
It is a good tool for correctness proofs,
and an excellent framework for reasoning
about programs, but its usefulness in
describing the meaning
of a programming language is limited for
language users or compiler writers
Summary