Theory, Part II

Let's extend the previous example to include additional operators.

Input

stack + - * / ^ M f p c , ( ) $

+ r r s s s s s s s r s r r

- r r s s s s s s s r s r r

* r r r r s s s s s r s r r

/ r r r r s s s s s r s r r

^ r r r r s s s s s r s r r

M r r r r r s s s s r s r r

f r r r r r r r r r r s r r

p r r r r r r r r r r s r r

c r r r r r r r r r r s r r

, r r r r r r r r r r r r e4

( s s s s s s s s s s s s e1

) r r r r r r e3 e3 e3 r e2 r r

$ s s s s s s s s s e4 s e3 a

	Input
stack	`+`	`-`	`*`	`/`	`^`	`M`	`f`	`p`	`c`	`,`	`(`	`)`	`$`
`+`	r	r	s	s	s	s	s	s	s	r	s	r	r
`-`	r	r	s	s	s	s	s	s	s	r	s	r	r
`*`	r	r	r	r	s	s	s	s	s	r	s	r	r
`/`	r	r	r	r	s	s	s	s	s	r	s	r	r
`^`	r	r	r	r	s	s	s	s	s	r	s	r	r
`M`	r	r	r	r	r	s	s	s	s	r	s	r	r
`f`	r	r	r	r	r	r	r	r	r	r	s	r	r
`p`	r	r	r	r	r	r	r	r	r	r	s	r	r
`c`	r	r	r	r	r	r	r	r	r	r	s	r	r
`,`	r	r	r	r	r	r	r	r	r	r	r	r	e4
`(`	s	s	s	s	s	s	s	s	s	s	s	s	e1
`)`	r	r	r	r	r	r	e3	e3	e3	r	e2	r	r
`$`	s	s	s	s	s	s	s	s	s	e4	s	e3	a

The above table incorporates the following operators:

"M", for unary minus.
"^", for exponentiation. 5 ^ 2 yields 25.
"f", for factorial. f(x) returns the factorial of x.
"p", for permutations. p(n,r) returns the number of permutations for n objects taken r at a time.
"c", for combinations. c(n,r) returns the number of combinations for n objects taken r at a time.

The following operator precedence is implied, starting at the highest precedence:

unary minus
exponentiation, right-associative
multiplication/division, left-associative
addition/subtraction, left-associative

The following errors are detected:

error e1: missing right parenthesis
error e2: missing operator
error e3: unbalanced right parenthesis
error e4: invalid function argument

Parentheses are often used to override precedence. This is easily accomodated in the operator precedence table using the following algorithm:

input     action
 '('      shift
 ')'      while opr[tos] != '('
             reduce
          shift ')'
          reduce '()'

On encountering a left parenthesis, we shift it to the opr stack. When we input a right parenthesis, the stack is popped until the matching left parenthesis is found. Then we shift the right parenthesis, and reduce by popping both parentheses off the stack.

For function calls, the function reference is shifted to the stack. Each comma-separated argument is also shifted to the stack. On encountering a comma, the operator stack is reduced until the opening parenthesis of the function call is visible, leaving the function parameter on the value stack. Then the comma is shifted to the stack, and popped on the next reduction. When the closing parenthesis of the function call is encountered, it will be shifted to the stack and reduced. This will expose the function call for subsequent reduction.

Operator classes may be used to minimize the size of the precedence table. For example, a single generic token representing a function call may be pushed on the operator stack. An ordinal, defining which function is referenced, is pushed on the value stack. For this purpose, you may wish to implement each element on the value stack as a union to allow for different datatypes.

Error-checking is not bullet-proof. For example, omitting commas between function arguments will be accepted, and work properly. Even more bizarre, reverse-polish notation, where operators follow operands, is also acceptable. This phenomenon occurs because operators are applied to the top elements of the value stack, and the order that operators and values are pushed is not significant. Major errors, such as omitting an operator, will be found by the final reduction, as the stacks will not be empty. More rigorous error-checking may be done by defining a boolean follow matrix. The current and previous tokens are used as indices into the matrix to determine if one can follow the other.

The table is implemented in the next section, using ANSI-C. Note that there are some differences between my solution and Aho's:

Aho specifies three precedence relations, whereas I've only specified two (shift and reduce). You can use three if you wish; I'm the last one to stifle creativity!
Aho uses one stack for both values and operators. I've used separate stacks, as the implementation is typically easier.