Discussion:
shift/reduce conflict
Eric Lavigne
2007-05-27 20:22:55 UTC
Permalink
Menhir (similar to ocamlyacc) indicates that my grammar file has a
shift/reduce conflict. If I remove the definition of "box" from my
grammar file, then it works correctly and Menhir produces no warnings.
I have attached my grammar file at the end of this email in hopes that
someone will explain what I did wrong.

The following site explains that shift/reduce is a type of ambiguity,
but I do not see how this applies to my situation.
http://plus.kaist.ac.kr/~shoh/ocaml/ocamllex-ocamlyacc/ocamlyacc-tutorial/sec-shift-reduce-conflicts.html

This parser is designed to parse an array of numbers, which may be a
0-D, 1-D, 2-D, or 3-D array, as shown in examples below.

0-D (scalar):

1

1-D (row):

1 2 3 4 5

1-D (column):

1
2
3

2-D (matrix):

1 2 3
4 5 6

3-D (box):

1 2 3
1 2 3

4 5 6
4 5 6

The grammar file:

%{
open Tree
%}

/* Declarations */
%token NEWLINE EOF
%token <float> NUMBER
%start main
%type <float Tree.t> main

%%
/* Grammar rules */
ending:
| EOF {}
| NEWLINE ending {};
scalar:
| NUMBER NEWLINE {Leaf $1};
row:
| NUMBER NUMBER NEWLINE {[(Leaf $1); (Leaf $2)]}
| NUMBER row { (Leaf $1) :: $2 };
column:
| NUMBER NEWLINE NUMBER NEWLINE {[(Leaf $1);(Leaf $3)]}
| NUMBER NEWLINE column {(Leaf $1) :: $3};
matrix:
| row row {[(Node $1);(Node $2)]}
| row matrix { (Node $1) :: $2 };
box:
| matrix NEWLINE matrix {[(Node $1);(Node $3)]}
| matrix NEWLINE box {(Node $1) :: $3};
main:
| scalar ending { $1 }
| row ending { Node $1 }
| column ending { Node $1 }
| box ending { Node $1 }
| matrix ending { Node $1 };

%%
Eric Lavigne
2007-05-27 23:57:01 UTC
Permalink
I turns out that there is a conflict between box and ending. If my
generated parser sees two newlines it must decide whether those two
newlines are followed by a NUMBER or by an EOF. Solving this problem
requires looking two tokens ahead, so my grammar is not ambiguous, but
also not LR(1).

One solution is to define ending as EOF rather than NEWLINE* EOF. This
is not a nice solution from a user's perspective because trailing
newlines are invisible in many text editors. I'm going to keep
thinking about this problem for a bit longer before I give up and
change the file format.
Post by Eric Lavigne
Menhir (similar to ocamlyacc) indicates that my grammar file has a
shift/reduce conflict. If I remove the definition of "box" from my
grammar file, then it works correctly and Menhir produces no warnings.
I have attached my grammar file at the end of this email in hopes that
someone will explain what I did wrong.
...
%{
open Tree
%}
/* Declarations */
%token NEWLINE EOF
%token <float> NUMBER
%start main
%type <float Tree.t> main
%%
/* Grammar rules */
| EOF {}
| NEWLINE ending {};
| NUMBER NEWLINE {Leaf $1};
| NUMBER NUMBER NEWLINE {[(Leaf $1); (Leaf $2)]}
| NUMBER row { (Leaf $1) :: $2 };
| NUMBER NEWLINE NUMBER NEWLINE {[(Leaf $1);(Leaf $3)]}
| NUMBER NEWLINE column {(Leaf $1) :: $3};
| row row {[(Node $1);(Node $2)]}
| row matrix { (Node $1) :: $2 };
| matrix NEWLINE matrix {[(Node $1);(Node $3)]}
| matrix NEWLINE box {(Node $1) :: $3};
| scalar ending { $1 }
| row ending { Node $1 }
| column ending { Node $1 }
| box ending { Node $1 }
| matrix ending { Node $1 };
%%
Eric Lavigne
2007-05-28 01:24:16 UTC
Permalink
Solved.

I was trying to solve this problem from within Menhir, but an ocamllex
solution was very easy. Instead of asking Menhir to recognize two
NEWLINE tokens in a row, I instead asked ocamllex to represent such
occurances as DOUBLENEWLINE tokens. Similarly, instead of asking
Menhir to recognize an arbitrary number of NEWLINE tokens followed by
an EOF token, I asked ocamllex to redefine EOF to represent an
arbitrary number of blank lines followed by the end of the file. Now
that ocamllex takes care of these issues, my grammar meets the LR(1)
condition. Here is my modified ocamllex input file:

{
open Parser
let line = ref 1
}

(* How to recognize a floating point number? *)
let digit = ['0'-'9']
let integer = ['-' '+']? digit+
let exponent = ['e' 'E'] integer
let num = (integer "."? integer?) | ("." integer) exponent?

(* uninteresting space *)
let space = [' ' '\t']

(* Converting the input file into a sequence of tokens. *)
rule token = parse
| space {token lexbuf}
| num as x { NUMBER (float_of_string x) }
| '\n' { incr line; NEWLINE }
| '\n' space* '\n' { incr line; incr line; DOUBLENEWLINE }
| '\n'* space* eof { line := 1; EOF }
| _ { let endingline = !line in
line := 1;
failwith ("Lexing error on line "^string_of_int endingline) }
Post by Eric Lavigne
I turns out that there is a conflict between box and ending. If my
generated parser sees two newlines it must decide whether those two
newlines are followed by a NUMBER or by an EOF. Solving this problem
requires looking two tokens ahead, so my grammar is not ambiguous, but
also not LR(1).
One solution is to define ending as EOF rather than NEWLINE* EOF. This
is not a nice solution from a user's perspective because trailing
newlines are invisible in many text editors. I'm going to keep
thinking about this problem for a bit longer before I give up and
change the file format.
Post by Eric Lavigne
Menhir (similar to ocamlyacc) indicates that my grammar file has a
shift/reduce conflict. If I remove the definition of "box" from my
grammar file, then it works correctly and Menhir produces no warnings.
I have attached my grammar file at the end of this email in hopes that
someone will explain what I did wrong.
...
%{
open Tree
%}
/* Declarations */
%token NEWLINE EOF
%token <float> NUMBER
%start main
%type <float Tree.t> main
%%
/* Grammar rules */
| EOF {}
| NEWLINE ending {};
| NUMBER NEWLINE {Leaf $1};
| NUMBER NUMBER NEWLINE {[(Leaf $1); (Leaf $2)]}
| NUMBER row { (Leaf $1) :: $2 };
| NUMBER NEWLINE NUMBER NEWLINE {[(Leaf $1);(Leaf $3)]}
| NUMBER NEWLINE column {(Leaf $1) :: $3};
| row row {[(Node $1);(Node $2)]}
| row matrix { (Node $1) :: $2 };
| matrix NEWLINE matrix {[(Node $1);(Node $3)]}
| matrix NEWLINE box {(Node $1) :: $3};
| scalar ending { $1 }
| row ending { Node $1 }
| column ending { Node $1 }
| box ending { Node $1 }
| matrix ending { Node $1 };
%%
Archives up to November 11, 2006 are also downloadable at http://www.connettivo.net/cntprojects/ocaml_beginners/
The archives of the very official ocaml list (the seniors' one) can be found at http://caml.inria.fr
Attachments are banned and you're asked to be polite, avoid flames etc.
Yahoo! Groups Links
Loading...