SourceForge.net Logo BNF for Java: About BNF


Hello! A Mini-Grammar in Action

Let's start the discussion with a mini-grammar: we'll parse the word "Hello!".

Here is the BNF program:

hello = "H", "e", "l", "l", "o", end of hello;  (* how to spell "Hello!"                 *)
end of hello = "." | "!";                       (* a period or exclamation ends the text *)

Here is the source text, probably in a simple Java String:

     Hello.  or   Hello! 

That's all! This metalanguage syntax expects one thing only: a series of characters that spells H, e, l, l, o, !! ( or Hello.)
No other sequence of characters will succeed with this definition.


The Compiler's Architecture

Here is the architecture of a complete compiler. The data flow is left to right. Logic and processing activity is controlled by the Definition block -- the meta-language.

                       +-----------------------+
                       | Language Definition   |
                       | Meta-Program (in BNF) |
                       +-----------------------+
                           /        |       \
                          /         |        \
                         /          |         \
    (source)     (scanner)      (parser)       (parser)       (emitter)     (target)
    (reader)                                                                (writer)
                +----------+   +----------+   +----------+   +----------+
    source      | lexical  |   |  syntax  |   | semantic |   |  code    |     object
    program  -->| analysis |-->| analysis |-->| analysis |-->|generation|--> program
                +----------+   +----------+   +----------+   +----------+

This diagram is modified from Watson, "High-Level Languages and Their Compilers, figure 4.1, page 99


The Compiler's Action

hello = "H", "e", "l", "l", "o", end of hello;  (* how to spell "Hello!"                 *)
end of hello = "." | "!";                       (* a period or exclamation ends the text *)

The first statement in our "Hello" mini-grammar has a name: hello which acts as the meta-identifier, for the syntax expression that follows the = definition operator. The statement defines the syntax, as well as content, that the grammar requires. So, this meta-identifier also serves as a statement type, or context name, for the expression.

The statement is composed of a sequence of terms. In this simple case, all of the terms are terminals that represent characters that the grammar requires from the source.

Each of the terms in the statement will be analyzed by the compiler, in the order given, from beginning to end. A terminal is parsed by reading the next character from the source, and comparing it to the syntax terminal. The scanner matches the syntax terminal H, to the source character H (a single letter).

When each term succeeds, the parser/analyzer proceeds to the next term in the statement. Every one of the syntax terms must be parsed, until at the end of the statement, the entire statement, and its meta-identifier, and its context, and the grammar, have succeeded, indicating that the source was indeed a Hello.

The last term in the definition of "Hello" is a non-terminal. It is a meta-identifier, and it calls the named statement elsewhere in the grammar.

The second statement, "end of hello", is a single term, a choice which in turn provides the two alternates, "." or "!". Each of these terminals will be parsed to match the scanner's next character. The one that succeeds, succeeds for the statement. In this way, the content of the "end of hello" statement is the character, and the context is "end of hello".

However if any one of the terms fails, possibly because one of the source characters did not match the syntax terminals, then the entire statement, and its meta-identifier, and its context, have failed. At that point, the grammar rejects the parse statement, the scanner backtracks to the point before the statement was tried, and parsing may continue with an alternate statement. In this case, there are no alternates, so the entire grammar fails, indicating the we don't know what the source is.


Books

High-Level Languages and Their Compilers
by Des Watson, 1989
Addison-Wesley Publishing
ISBN 0-201-18489-3
Dewey 005.13 WAT

Chapter 4 has a good outline for a compiler.
I used the description to outline the package and module
architecture of my compiler.

Formal Specification of Programming Languages: A Panoramic Primer
by Frank G. Pagan
Prenice-Hall Inc, 1981
ISBN 0-13-329052-2
Dewey 001.64'24

Chapter 2 has a good description of BNF syntax.
I used the ideas to outline the interactions of my compiler's architecture.

ISO/IEC 14977:1996(E) International Standard
Information Technology/Syntactic MetaLanguage/Extended BNF

distributed on http://www.iso.ch/iso/en

I have tried to conform as closely as possible to the syntax in Sections 4,5,6,7
and to the example in Section 8.

    Valid XHTML 1.1!     Valid CSS!