BNF for Java Example: Canadian Postal Code Example


MetaLanguage Extensions in this compiler

In the International Standard's Introduction, the section titled Limitations and extensions discusses "extending the metalanguage".
In this Compiler Project, we use two terms:
implementation extensions, to refer to architectural features such as the Parse Tree and Emitter, and
metalanguage extensions, to refer to a syntax term in your grammar that calls an external object.

Example: Analyzing the Canadian Postal Code

Suppose we are parsing a form that contains a Canadian address. Here is a BNF definition for the Postal Code:

Canadian Postal Code

(* Postal Code eg: T2W 1W5 *)
postal code = letter, digit, letter, space, digit, letter, digit;

This definition works because the parser attempts to match each term, in sequence, against the text in the input document. If the input matches the definition, the definition succeeds, the token is added to the Parse Tree, and the parser proceeds. If the input does not match, the definition fails, and the parser must backtrack in the document and attempt an alternate definition, if there is one.
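The succeed-or-backtrack behaviour can be imitated by hand in Java. This is only a minimal sketch, not the compiler's own code: the class PostalCodeFormat and the method matchAt are hypothetical names, and "letter" and "digit" are assumed to mean ASCII upper-case letters and decimal digits.

```java
// Hypothetical stand-alone check; not part of the BNF for Java compiler.
class PostalCodeFormat {

    // Assumption: "letter" means an ASCII upper-case letter.
    static boolean isLetter(char c) { return c >= 'A' && c <= 'Z'; }

    // Assumption: "digit" means an ASCII decimal digit.
    static boolean isDigit(char c) { return c >= '0' && c <= '9'; }

    /**
     * Tries to match: letter, digit, letter, space, digit, letter, digit
     * starting at pos. Returns the position just after the match, or -1 on
     * failure. On failure the caller still holds its old position, so it can
     * "backtrack" and try an alternate definition from the same place.
     */
    static int matchAt(String in, int pos) {
        if (pos < 0 || pos + 7 > in.length()) {
            return -1;
        }
        boolean ok = isLetter(in.charAt(pos))
                  && isDigit(in.charAt(pos + 1))
                  && isLetter(in.charAt(pos + 2))
                  && in.charAt(pos + 3) == ' '
                  && isDigit(in.charAt(pos + 4))
                  && isLetter(in.charAt(pos + 5))
                  && isDigit(in.charAt(pos + 6));
        return ok ? pos + 7 : -1;
    }

    public static void main(String[] args) {
        System.out.println(matchAt("T2W 1W5", 0));   // 7  (definition succeeds)
        System.out.println(matchAt("T2W1W5 ", 0));   // -1 (no space at index 3)
    }
}
```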

Example: Validating the Canadian Postal Code

Analyzing the format is not the same as validating that our Post Office actually uses the code. To discover whether the source string is actually a postal code for a postal region in Canada (or in Alberta, or in Calgary), you can run your own search on a lookup table, named "Postal Code" here, in your corporate database. The lookup returns a simple yes if the code is found, or no if it is not.

Canadian Postal Code, with validation

(* Postal Code eg: T2W 1W5 *)
postal code = letter, digit, letter, space, digit, letter, digit, lookup("Postal Code");

Canadian Postal Code, with more validation

(* Postal Code eg: T2W 1W5 in Calgary only *)
postal code = letter, digit, letter, space, digit, letter, digit, lookup("Postal Code", "Calgary" );

For this example, the metalanguage extension does not alter any data. It adds a truth value to the definition, the same way an ordinary parser term has a truth value.
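To illustrate how the extension contributes only a truth value, here is a rough Java sketch. Everything in it is an assumption made for illustration: the class LookupSketch, its methods, and the in-memory table that stands in for the corporate database (the second row is invented sample data). The definition succeeds only when the format matches and the lookup answers yes.

```java
import java.util.Map;
import java.util.Set;

// Hypothetical illustration; the real lookup would query a database.
class LookupSketch {

    // Stand-in for the "Postal Code" lookup table: city -> known codes.
    // "T2W 1W5" comes from the example above; the other row is invented.
    static final Map<String, Set<String>> TABLE = Map.of(
        "Calgary",  Set.of("T2W 1W5"),
        "Edmonton", Set.of("T5J 0N3"));

    // Like lookup("Postal Code"): yes if the code is known for any city.
    static boolean lookup(String code) {
        return TABLE.values().stream().anyMatch(codes -> codes.contains(code));
    }

    // Like lookup("Postal Code", "Calgary"): yes only within one city.
    static boolean lookup(String code, String city) {
        return TABLE.getOrDefault(city, Set.of()).contains(code);
    }

    // letter, digit, letter, space, digit, letter, digit
    // (assuming ASCII upper-case letters and decimal digits)
    static boolean formatOk(String s) {
        return s.matches("[A-Z][0-9][A-Z] [0-9][A-Z][0-9]");
    }

    // The whole definition is true only if every term is true.
    static boolean postalCode(String s, String city) {
        return formatOk(s) && lookup(s, city);
    }

    public static void main(String[] args) {
        System.out.println(postalCode("T2W 1W5", "Calgary"));   // true
        System.out.println(postalCode("T2W 1W5", "Edmonton"));  // false
    }
}
```

Note that neither lookup method changes the input or the table; each only reports success or failure, just as an ordinary term would.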

Syntax of the Metalanguage Extension

(* Metalanguage Extension *)
metalanguage extension
  = meta identifier,              (* same as a syntax name *)
    start extension symbol,       (*  '('    *)
    extension parameter list,     (*   ...   *)
    end extension symbol;         (*  ')'    *)

start extension symbol   = '(';
end extension symbol     = ')';
extension parameter item = terminal string;     (* eg: 'Calgary' or "Calgary" *)

extension parameter list
  = extension parameter item,                         (* eg: lookup("Postal Code")            *)
    { concatenate symbol, extension parameter item }  (* eg: lookup("Postal Code", "Calgary") *)
  | ;                                                 (* eg: lookup()                         *)
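A small hand-written reader for this syntax might look like the following Java sketch. It is not taken from the compiler: the class ExtensionSyntax and its parse method are hypothetical, the concatenate symbol is assumed to be ',', and the meta identifier is simply taken to be the text before the opening parenthesis.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical reader for:  meta identifier '(' [ item { ',' item } ] ')'
class ExtensionSyntax {

    static int skipSpaces(String s, int i) {
        while (i < s.length() && s.charAt(i) == ' ') {
            i++;
        }
        return i;
    }

    /** Returns [name, item1, item2, ...] or throws IllegalArgumentException. */
    static List<String> parse(String text) {
        String t = text.trim();
        int open = t.indexOf('(');
        if (open < 1 || !t.endsWith(")")) {
            throw new IllegalArgumentException("not an extension: " + text);
        }
        List<String> result = new ArrayList<>();
        result.add(t.substring(0, open).trim());      // meta identifier
        int end = t.length() - 1;                      // index of the final ')'
        int i = skipSpaces(t, open + 1);
        if (i == end) {
            return result;                             // empty list: lookup()
        }
        while (true) {
            char quote = t.charAt(i);                  // terminal string: '...' or "..."
            if (quote != '\'' && quote != '"') {
                throw new IllegalArgumentException("expected a terminal string at " + i);
            }
            int close = t.indexOf(quote, i + 1);
            if (close < 0 || close >= end) {
                throw new IllegalArgumentException("unterminated string at " + i);
            }
            result.add(t.substring(i + 1, close));
            i = skipSpaces(t, close + 1);
            if (i == end) {
                return result;                         // reached the closing ')'
            }
            if (t.charAt(i) != ',') {
                throw new IllegalArgumentException("expected ',' at " + i);
            }
            i = skipSpaces(t, i + 1);
        }
    }

    public static void main(String[] args) {
        System.out.println(parse("lookup(\"Postal Code\", \"Calgary\")"));
        // [lookup, Postal Code, Calgary]
    }
}
```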

Note: In Dan's implementation, the extension is a type of syntactic primary (* see 4.10 in the Standard *).

(* see 4.10 *) syntactic primary
  = optional sequence                (* [ ... ]                *)
  | repeated sequence                (* { ... }                *)
  | grouped sequence                 (* ( ... )                *)
  | meta identifier                  (*  postal code           *)
  | terminal string                  (* 'Postal' or "Code"     *)
  | special sequence                 (* ? ... ?                *)
  | empty sequence                   (* nothing                *)
  | metalanguage extension;          (* lookup("Postal Code")  *)

Specifying Your Extension

The extension's program is a Java class that you write and provide to the Compiler. The Compiler loads your class at run time. The Parser Engine instantiates an object when the syntactic primary is parsed, passing the parameter list (a simple array of Strings) to its constructor. It then calls your parse() method, which has access to the Parse Tree and must return a boolean: true or false. Later, when the Emitter is unraveling the Parse Tree, your object is revisited for a second-pass output.
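Loading a class by name at run time is normally done in Java with reflection. The sketch below shows one way this could work; it is an assumption, not the Parser Engine's actual code. ParseTreeNode, Emitter and ExtensionObject are minimal stand-ins for the bnf types, HasArgs is a made-up extension, and the engine is assumed to call a public three-argument constructor.

```java
import java.lang.reflect.Constructor;

// Hypothetical illustration of run-time loading; not the real Parser Engine.
class ExtensionLoaderSketch {

    // Minimal stand-ins for the bnf package types.
    static class ParseTreeNode { }
    static class Emitter { }

    static abstract class ExtensionObject {
        protected ParseTreeNode context;
        protected Emitter emitter;
        protected String[] args;

        public ExtensionObject(ParseTreeNode context, Emitter emitter, String[] args) {
            this.context = context;
            this.emitter = emitter;
            this.args = args;
        }

        public abstract boolean parse();   // Pass 1
        public abstract String emit();     // Pass 2
    }

    // A made-up extension: succeeds when at least one parameter is given.
    public static class HasArgs extends ExtensionObject {
        public HasArgs(ParseTreeNode c, Emitter e, String[] a) { super(c, e, a); }
        public boolean parse() { return args.length > 0; }
        public String emit()   { return String.join(",", args); }
    }

    // Loads the class by name and instantiates it with the parse environment.
    static ExtensionObject load(String className, ParseTreeNode ctx,
                                Emitter em, String[] args) {
        try {
            Class<?> cls = Class.forName(className);
            Constructor<?> ctor = cls.getConstructor(
                ParseTreeNode.class, Emitter.class, String[].class);
            return (ExtensionObject) ctor.newInstance(ctx, em, args);
        } catch (ReflectiveOperationException e) {
            throw new IllegalArgumentException("cannot load " + className, e);
        }
    }

    public static void main(String[] argv) {
        ExtensionObject ext = load(HasArgs.class.getName(),
            new ParseTreeNode(), new Emitter(),
            new String[] { "Postal Code", "Calgary" });
        System.out.println(ext.parse());   // true
        System.out.println(ext.emit());    // Postal Code,Calgary
    }
}
```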

The ExtensionObject Base Class

Extend this class to create your Extension Object Class

package bnf;

public abstract class ExtensionObject
{
    protected ParseTreeNode context;    // postal code = letter, digit, letter, space, digit, letter, digit;
    protected Emitter emitter;          // an output stream
    protected String args[];            // [ "Postal Code", "Calgary" ]

    /**
     * The base constructor sets up the environment for the Extension Object
     */
    public ExtensionObject( ParseTreeNode context, Emitter emitter, String args[] )
    {
        this.context = context;     // the BNF ParseTreeNode for "postal code",
        this.args = args;           // the parameter list, an array of Strings
        this.emitter = emitter;     // your output stream
    }

    /**
     * The parse method is aliased to your extension for Pass 1.
     */
    public abstract boolean parse();

    /**
     * The emit method is aliased to your extension for Pass 2.
     */
    public abstract String emit();

}

The User-Supplied ExtensionObject Class

This Java code follows the above example of a Postal Code validator

package myExtensions;
import java.util.*;             // List, Iterator
import java.sql.*;              // Statement, ResultSet

public class PostalCode
    extends bnf.ExtensionObject
{
    private String value;             // "T2W 1W5"
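    private boolean valid;            // must return true/false to Parser

    /**
     * Passes the environment to the base constructor (the base class has no default constructor).
     * Assumption: the Parser Engine calls this three-argument constructor.
     */
    public PostalCode( bnf.ParseTreeNode context, bnf.Emitter emitter, String args[] )
    {
        super( context, emitter, args );
    }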

    /**
     * The parse method is aliased to the "lookup()" extension for Pass 1.
     * Overrides ExtensionObject.java
     */
    public boolean parse()
    {
        // Outline only; each step is left for you to implement.
        // 1. collect the tokens from the context, eg: a List of Strings, [ "T","2","W"," ","1","W","5" ]
        // 2. format them if necessary             eg: value = "T2W 1W5";
        // 3. create an SQL statement              eg: select * from "Postal Code" where code = ?
        // 4. convert the SQL return to boolean    eg: valid = resultSet.next();      // any rows at all?
        // 5. return valid to the Parser
        return valid;
    }

    /**
     * The emit method is aliased to the "lookup()" extension for Pass 2.
     * Overrides ExtensionObject.java
     */
    public String emit()
    {
        return value;        // "T2W 1W5"
    }

}

Installing Your Extension

Your extension is a Java class, and can be located anywhere. The command line for the Compiler application provides the alias. You can add as many extensions as you need; each one is listed in a separate alias.

Specifying an Alias for your extension object

dan:>bnf -grammar:postal-code -source:example.txt -alias:"lookup"="myExtensions.PostalCode.class"
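How the Compiler reads the -alias option is not shown here, so the following Java sketch is a guess at one reasonable approach. The class AliasOptions is hypothetical, and two behaviours are assumptions: surrounding double quotes are stripped if the shell left them in place, and a trailing ".class" is removed so that the name can be used as a class name.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical option reader; not the Compiler's own command-line code.
class AliasOptions {

    static final String PREFIX = "-alias:";

    // Removes one pair of surrounding double quotes, if present.
    static String unquote(String s) {
        if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
            return s.substring(1, s.length() - 1);
        }
        return s;
    }

    /** Maps each alias (eg: lookup) to a class name (eg: myExtensions.PostalCode). */
    static Map<String, String> parse(String[] argv) {
        Map<String, String> aliases = new LinkedHashMap<>();
        for (String arg : argv) {
            if (!arg.startsWith(PREFIX)) {
                continue;                       // -grammar:, -source:, etc.
            }
            String body = arg.substring(PREFIX.length());
            int eq = body.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("missing '=' in " + arg);
            }
            String alias = unquote(body.substring(0, eq));
            String cls = unquote(body.substring(eq + 1));
            if (cls.endsWith(".class")) {       // assumption: suffix is dropped
                cls = cls.substring(0, cls.length() - ".class".length());
            }
            aliases.put(alias, cls);
        }
        return aliases;
    }

    public static void main(String[] args) {
        String[] argv = {
            "-grammar:postal-code",
            "-source:example.txt",
            "-alias:\"lookup\"=\"myExtensions.PostalCode.class\""
        };
        System.out.println(parse(argv));   // {lookup=myExtensions.PostalCode}
    }
}
```

Because each -alias option becomes one map entry, several extensions can be registered by repeating the option.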