Parsing

From ESEwiki

(Draft)

The grammar

With the parse library, a grammar is written this way:

  {PARSE_TABLE <<
     -- atoms
  >> }

where atoms are pairs made of a STRING and a PARSE_ATOM object (either a PARSE_NON_TERMINAL or a PARSE_TERMINAL). The STRING is the name of the atom, and the PARSE_ATOM is its definition.

For example, "KW end of file" is the name of an atom; the rules of a PARSE_NON_TERMINAL refer to the atom by that name.

The Atoms

Non-terminals

You define a non-terminal like this:

   {PARSE_NON_TERMINAL <<
      -- rules
   >> }

where rules are pairs:

  • the first element is a TRAVERSABLE[STRING], usually defined as a manifest FAST_ARRAY[STRING], each STRING being the name of one atom
  • the second element is an agent used when reducing the grammar (see "Grammar reduction" below).
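The structure described above (a table mapping names to atoms, with non-terminals holding rules that refer to other atoms by name) can be modelled in a few lines. What follows is a minimal sketch in Python, not the Eiffel library itself: Terminal, NonTerminal and undefined_names are hypothetical stand-ins introduced only for illustration. It shows why names matter: every name used in a rule must be defined somewhere in the table.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Sequence

# Hypothetical Python stand-ins; these are NOT the Eiffel parse library classes.

@dataclass
class Terminal:
    parser: Callable             # recognises the token in the input
    reducer: Optional[Callable]  # called at reduction time, may be None

@dataclass
class NonTerminal:
    # Each rule is a pair: (sequence of atom names, reduction action or None).
    rules: Sequence[tuple]

def undefined_names(table):
    """Return the atom names used in some rule but absent from the table."""
    missing = set()
    for atom in table.values():
        if isinstance(atom, NonTerminal):
            for names, _action in atom.rules:
                missing.update(n for n in names if n not in table)
    return sorted(missing)

# A toy table: one non-terminal whose single rule names two tokens.
table = {
    "Program": NonTerminal(rules=[(["number", "end"], None)]),
    "number": Terminal(parser=lambda buf: None, reducer=None),
    "end": Terminal(parser=lambda buf: None, reducer=None),
}
```

Calling undefined_names(table) on this toy table returns an empty list; dropping the "end" entry would make it report that name as missing.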

Tokens

You define terminal atoms, aka tokens, like this:

   create {PARSE_TERMINAL}.make(parser, reducer)

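where:

  • parser is an agent function that takes the buffer being read (a MINI_PARSER_BUFFER) and returns an image of the token when it recognizes one (see the end-of-file example below)
  • reducer is the agent used when reducing the grammar; it may be Void, as in the end-of-file example below.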

Grammar reduction

Once the source text has been completely parsed, ESE_PARSER calls all the reduction agents. Reduction can be used, for example, to write an interpreter (see the calc tutorial, in either tutorial/parse or tutorial/yepp) or to build an AST (abstract syntax tree), as tools like yepp and tuf do.
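The idea of reducing only after a complete parse can be illustrated with a toy interpreter. This is a minimal sketch in Python under stated assumptions: parse_sum and reduce_all are hypothetical names, and recording the actions in a list is one possible way to defer them, not necessarily how ESE_PARSER does it internally.

```python
def parse_sum(text):
    """Parse 'n + n + ...' without evaluating anything.

    Returns (stack, pending) on success, or None on a syntax error.
    The reduction actions are only recorded here, never run.
    """
    stack = []    # value stack used later by the actions
    pending = []  # reduction actions recorded during the parse

    def push(n):
        return lambda: stack.append(n)

    def add():
        right = stack.pop()
        left = stack.pop()
        stack.append(left + right)

    for i, part in enumerate(text.split("+")):
        part = part.strip()
        if not part.isdigit():
            return None  # syntax error: no action has been run
        pending.append(push(int(part)))
        if i > 0:
            pending.append(add)
    return stack, pending

def reduce_all(stack, pending):
    """Run the recorded actions in order and return the final value."""
    for action in pending:
        action()
    return stack[-1]
```

Because the actions run only after the whole input has been accepted, a syntax error near the end of the text leaves no half-built result behind.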

An example: the end-of-file token

The end-of-file token may be named whatever you want, as long as an atom with that name is defined in the grammar.

In the EIFFEL_GRAMMAR class, the end-of-file keyword is named "KW end of file"; it is a token defined this way:

   create {PARSE_TERMINAL}.make(agent parse_end, Void)

where parse_end is defined a little further down in the class text:

   parse_end (buffer: MINI_PARSER_BUFFER): EIFFEL_IMAGE is
      do
         skip_blanks(buffer)
         if buffer.end_reached then
            create Result.make(once "", last_blanks.twin)
         end
      end

The calc example may be simpler. In that class (in tutorial/parse), the end-of-file token is simply named "end", and its parser agent function is defined like this:

   parse_end (buffer: MINI_PARSER_BUFFER): CALC_IMAGE is
      do
         if buffer.end_reached then
            Result := keyword
         end
      end

(what `keyword' is does not matter here)

The important part of the parser function is that it checks buffer.end_reached and returns a PARSER_IMAGE when it is True. Otherwise Result keeps its default value, Void.
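The same contract can be sketched outside Eiffel. The Python below is only an illustration: Buffer is a hypothetical stand-in for MINI_PARSER_BUFFER, the image is modelled as a plain tuple, and restoring the read position on failure is an assumption of this sketch (the text above does not say who is responsible for that).

```python
class Buffer:
    """Hypothetical stand-in for MINI_PARSER_BUFFER: text plus a read position."""

    def __init__(self, text):
        self.text = text
        self.index = 0

    @property
    def end_reached(self):
        return self.index >= len(self.text)

    def skip_blanks(self):
        """Advance past whitespace and return the blanks that were skipped."""
        start = self.index
        while not self.end_reached and self.text[self.index].isspace():
            self.index += 1
        return self.text[start:self.index]

def parse_end(buffer):
    """Return an image at end of input, else None (Void in Eiffel)."""
    start = buffer.index
    blanks = buffer.skip_blanks()
    if buffer.end_reached:
        return ("", blanks)  # empty token text plus the blanks before it
    buffer.index = start     # assumption: undo the skip when the token does not match
    return None
```

As in the Eiffel versions, the function succeeds only when nothing but blanks remains in the buffer.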
