Yesterday afternoon I finished my latest PCPlus article on generating all possible arithmetic expressions with four operators. The article explored several algorithms, such as evaluating all full binary trees with a certain number of internal nodes (mine was four), evaluating an expression tree, and the like, and for the sidebar I slipped in a quick bit about RPN (Reverse Polish Notation) and how succinct it is for describing arithmetic expressions (there’s no operator precedence or parentheses to worry about). Equally important is the absolute ease with which you can evaluate an RPN expression compared to an algebraic one (that is, an expression using the standard infix notation).
I sent it off but then started to think about how to convert an RPN expression (say, 1 2 3 4 5 + × - ÷) into its algebraic equivalent (1 ÷ (2 - 3 × (4 + 5)), or 0.04). The reason for doing this is to keep and use the speedy succinct RPN form internally, but to display any result in the normal format. There’s lots of information online about doing the reverse operation — that is, converting infix to postfix, for example Dijkstra’s Shunting-yard algorithm, which I first implemented at university using FORTRAN (shudder) — but virtually none about doing the opposite (and some of them I’ve seen even assume that the postfix expression has parentheses, for heaven’s sake).
So, as I said, I thought about it for a while and came up with this answer.
The easy algorithm is to read the RPN expression as if you were going to evaluate it but build up an algebraic expression instead.
Let’s show this with 2 3 + 4 ×. We read the 2 and push it onto the stack. We do the same with the 3. The next token is +, which we process by popping the right-hand operand (call this the rhs for right hand side), the left hand operand (lhs), forming an algebraic expression with parentheses: (2 + 3) and then pushing it onto the stack. Push the next token, the 4. The next token is ×. We pop the rhs (the 4), the lhs (the bracketed expression), form a new expression as before: ((2 + 3) × 4), and push it onto the stack. There’s no more RPN expression left, and so the answer is the final string on the stack. Not too bad, expect for the superfluous parentheses around the entire answer.
The problem with this simple algorithm is that it pays no attention to precedence and to avoid any problems with it surrounds each sub-expression with parentheses. For example, 2 3 × 4 + would be converted to ((2 × 3) + 4), instead of the more succinct, yet unambiguous 2 × 3 + 4.
So how can we improve on the situation? First of all we assign a precedence number to each of the four arithmetic operators: add and subtract get 1, and multiply and divide get 2. That way we can easily say that, for example, multiply is of greater precedence than addition (that is, given a choice between multiply and divide, we do the multiplication first).
We’re going to build up an expression tree from the RPN expression instead of a simple string as in the previous algorithm. We proceed as before but now the stack will store nodes of the expression tree.
Let’s show this with 2 3 + 4 × again. We read the 2, create a node to hold it, and push that node onto the stack. We do the same with the 3. The next token is +, which we process by popping the rhs, the lhs, and form a mini tree with a new root node to hold the +.

We push the root node, the operand node, onto the stack. We then read the 4, so create a node for it, and push that. The next token is ×. For this we pop the rhs, then the lhs, and form a new tree with the × operator as root:

This new root gets pushed onto the stack. As before, we’ve run out of RPN expression and so the answer is the top item on the stack. Of course, we’re not quite there yet: an expression tree is not an algebraic expression, but we’re close.
We now walk the tree using an inorder traversal, paying attention to precedence. In fact we use a rule that says if we go from an operator with a higher precedence to one with a lower precedence, we’ll output parentheses to surround the subexpression containing the lower precedence operator. If the precedence number is equal or lower, we don’t output any parentheses. Here’s the pseudo code:
string Visit(node, priorPrecedence) {
if node is number
return number.ToString()
result = Visit(left, node.precedence) +
node.Operator +
Visit(right, node.precedence)
if node.precedence < priorPrecedence
result = '(' + result + ')'
return result
}
There are some assumptions here but they’re pretty obvious (for example, there’s some way to distinguish a number node from an operator node, operator nodes have left and right children, and so on).
So, a two-step process, but nothing too difficult once you’re used to the stack-based approach to evaluating an RPN expression. I quite like this idea of reconstructing an expression tree and then traversing it in a different order than before.
Now playing:
Sweetback - Round And Round
(from Stage (2))
Share it: