Posts tagged with 'expression'

Postfix to infix, part 2: adding the parentheses

Once upon a time (all right, it was in May 2010), I wrote an article for PCPlus about generating all possible arithmetic operations with the standard four operators. You can read the article here. After I’d written it, I wrote a blog post about how easy it was to convert the RPN form (Reverse Polish Notation) of the expressions I was generating into the standard algebraic or infix form. You can read that post here. (Note that this post will make more sense if you read these two articles first to get some background.)

Unfortunately, there was a bug in the pseudo-code I’d written for that latter post. If you’d started with an RPN expression like 2 3 4 ÷ ÷, you’d get the following infix expression: 2 ÷ 3 ÷ 4. Unfortunately, if you evaluate the RPN expression you get 8/3 but if you evaluate the algebraic expression you’d get 1/6. What went wrong?

The problem is that we’re missing some parentheses. In the absence of parentheses, a series of equal-precedence operators is evaluated from left to right. My example RPN expression comes from a tree that looks like this:

Expression tree for 2 3 4 / /

So the RPN expression should be converted to 2 ÷ (3 ÷ 4). Instead we are converting it to an implicit (2 ÷ 3) ÷ 4, which is a very different tree:

Expression tree for 2 3 / 4 /

The problem then is the left-associativity of operators when they have the same precedence, and only when they are subtract or divide (the left-associativity problem doesn’t matter with add and multiply). Looking at the tree images above, this is only a problem when we have a right child whose operator is the same as this node’s operator and is either - or ÷. So we must do something slightly different when we recursively visit the right child than we do with the left child.

Here’s my original pseudo-code:

string Visit(node, priorPrecedence) {
  if node is number 
    return number.ToString()
  result = Visit(node.left, node.precedence) +
           node.operator +
           Visit(node.right, node.precedence)
  if node.precedence < priorPrecedence
    result = '(' + result + ')'
  return result
}

Since we have to do something different about parentheses if we visit left than we do if we visit right, we might as well bring that decision up into the code that operates on the current node instead. In the following pseudo-code I make use of two helper routines, one to determine if we need parentheses for the left child and the other similarly for the right child.

boolean NeedParensOnLeft(node) {
  return (node.left is operator) and
         (node.left.precedence < node.precedence)
}

boolean NeedParensOnRight(node) {
  if node.right is number 
    return false
  if node.operator is '+' or node.operator is '*'
    return (node.right.precedence < node.precedence)
  return (node.right.precedence <= node.precedence)
}

string Visit(node) {
  if node is number 
    return number.ToString()

  lhs = Visit(node.left)
  if NeedParensOnLeft(node)
    lhs = '(' + lhs + ')'

  rhs = Visit(node.right)
  if NeedParensOnRight(node)
      rhs = '(' + rhs + ')'

  return lhs + node.operator + rhs
}

For grins (and to make sure I had got it right this time), here’s some complete JavaScript code to convert an RPN expression to a properly-parenthesized algebraic expression. It first converts the RPN expression into an expression tree, and then reads that tree to produce an infix expression as discussed above:

var makeNumberNode = function (number) {
    return {
        kind  : "number",
        value : number
    };
};
    
var makeOpNode = function (op, left, right) {
    var precedence = 1;
    if ((op === "*") || (op === "/")) {
        precedence = 2;
    }
    return {
        kind       : "operator",
        operator   : op,
        precedence : precedence,
        left       : left,
        right      : right
    };
};
    
var convertRPN2Tree = function (rpnExpr) {
    var stack = [],
        i,
        ch,
        rhs,
        lhs;
        
    for (i = 0; i < rpnExpr.length; i++) {
        ch = rpnExpr.charAt(i);
        if ((ch >= "0") && (ch <= "9")) {
            stack.push(makeNumberNode(ch));
        }
        else {
            rhs = stack.pop();
            lhs = stack.pop();
            stack.push(makeOpNode(ch, lhs, rhs));
        }
    }        
        
    return stack.pop();
};
    
var needParensOnLeft = function (node) {
    return ((node.left.kind === "operator") &&
            (node.left.precedence < node.precedence));
};
    
var needParensOnRight = function (node) {
    if (node.right.kind === "number") {
        return false;
    }
    if ((node.operator === "+") || (node.operator === "*")) {
        return (node.right.precedence < node.precedence);
    }
    return (node.right.precedence <= node.precedence);
};
    
var visit = function (node) {
    if (node.kind === "number") {
        return node.value;
    }
        
    var lhs = visit(node.left);
    if (needParensOnLeft(node)) {
        lhs = '(' + lhs + ')';
    }
        
    var rhs = visit(node.right);
    if (needParensOnRight(node)) {
        rhs = '(' + rhs + ')';
    }
        
    return lhs + node.operator + rhs;
};
  
var convertRPN2Infix = function (rpnExpr) {
    var tree = convertRPN2Tree(rpnExpr);
    var infixExpr = visit(tree);
    alert(rpnExpr + " ==> " + infixExpr);
};
    
convertRPN2Infix("234//"); // => 2/(3/4)
convertRPN2Infix("23/4/"); // => 2/3/4
convertRPN2Infix("234**"); // => 2*3*4
convertRPN2Infix("23*4*"); // => 2*3*4

(A couple of notes are in order. First I made the RPN expression deliberately simple; it is assumed to contain only single-digit numbers. Second there is no error-checking at all: don’t pass in invalid RPN expressions. Third, I made sure the code passes JSLint; you should also do this as a matter of course. Oh, that makes three notes, oh well.)

Album cover for In EmptinessNow playing:
Eastern Sun - World Soul
(from In Emptiness)


PCPlus 297: Arithmetic with cards

This particular article explored how to generate arithmetic expressions using a card game as a basis for discussion. It appeared in August 2010, and came about because of me reading two blog posts on entirely different topics within a week or so of each other. The combination triggered an Ah ha! moment, and I wrote it up. The first was for a card game called Krypto and appeared on The Daily WTF; the second was about generating all binary trees of a particular size and appeared on Eric Lippert’s blog, Fabulous Adventures in Coding.

PCPlus logoIn essence, for the card game you strip out all the court cards and jokers, shuffle the remainder and deal out five cards plus one to the side. The object of the game is to think of an arithmetic expression that links the values of the five cards (in the original sequence, mind) to equal the value of the sixth. You can use the four normal operators in any combination plus as many parentheses as you need. It’s not a bad game to exercise your arithmetic abilities and can be quite difficult for certain combinations of cards. Where enumerating all binary trees come in is to try and write a computer program to list out all possible expressions that satisfy the card layout.

In the article, I used Reverse Polish Notation (RPN) because it’s a damn sight easier than infix notation, and I glibly tossed off an assertion saying it’s easy to convert from RPN to infix. I then wrote a blog post about how easy it was, only to be brought up short, because it damn well isn’t. Sigh.

This article first appeared in issue 297, August 2010.

You can read the PDF here.

(I write a monthly column for PCPlus, a computer news-views-n-reviews magazine in the UK (actually there are 13 issues a year — there's an Xmas issue as well — so it's a bit more than monthly). The column is called Theory Workshop and appears in the Make It section of the magazine. When I signed up, my editor and the magazine were gracious enough to allow me to reprint the articles here after say a year or so.)

Now playing:
Ace - How Long?
(from Groove Armada’s Late Night Tales)


PostFix to Infix: converting RPN to algebraic expressions

Yesterday afternoon I finished my latest PCPlus article on generating all possible arithmetic expressions with four operators. The article explored several algorithms, such as evaluating all full binary trees with a certain number of internal nodes (mine was four), evaluating an expression tree, and the like, and for the sidebar I slipped in a quick bit about RPN (Reverse Polish Notation) and how succinct it is for describing arithmetic expressions (there’s no operator precedence or parentheses to worry about). Equally important is the absolute ease with which you can evaluate an RPN expression compared to an algebraic one (that is, an expression using the standard infix notation).

Growing treesI sent it off but then started to think about how to convert an RPN expression (say, 1 2 3 4 5 + × - ÷) into its algebraic equivalent (1 ÷ (2 - 3 × (4 + 5)), or 0.04). The reason for doing this is to keep and use the speedy succinct RPN form internally, but to display any result in the normal format. There’s lots of information online about doing the reverse operation — that is, converting infix to postfix, for example Dijkstra’s Shunting-yard algorithm, which I first implemented at university using FORTRAN (shudder) — but virtually none about doing the opposite (and some of them I’ve seen even assume that the postfix expression has parentheses, for heaven’s sake).

So, as I said, I thought about it for a while and came up with this answer.

The easy algorithm is to read the RPN expression as if you were going to evaluate it but build up an algebraic expression instead.

Let’s show this with 2 3 + 4 ×. We read the 2 and push it onto the stack. We do the same with the 3. The next token is +, which we process by popping the right-hand operand (call this the rhs for right hand side), the left hand operand (lhs), forming an algebraic expression with parentheses: (2 + 3) and then pushing it onto the stack. Push the next token, the 4. The next token is ×. We pop the rhs (the 4), the lhs (the bracketed expression), form a new expression as before: ((2 + 3) × 4), and push it onto the stack. There’s no more RPN expression left, and so the answer is the final string on the stack. Not too bad, expect for the superfluous parentheses around the entire answer.

The problem with this simple algorithm is that it pays no attention to precedence and to avoid any problems with it surrounds each sub-expression with parentheses. For example, 2 3 × 4 + would be converted to ((2 × 3) + 4), instead of the more succinct, yet unambiguous 2 × 3 + 4.

So how can we improve on the situation? First of all we assign a precedence number to each of the four arithmetic operators: add and subtract get 1, and multiply and divide get 2. That way we can easily say that, for example, multiply is of greater precedence than addition (that is, given a choice between multiply and divide, we do the multiplication first).

We’re going to build up an expression tree from the RPN expression instead of a simple string as in the previous algorithm. We proceed as before but now the stack will store nodes of the expression tree.

Let’s show this with 2 3 + 4 × again. We read the 2, create a node to hold it, and push that node onto the stack. We do the same with the 3. The next token is +, which we process by popping the rhs, the lhs, and form a mini tree with a new root node to hold the +.

First step in creating the expression tree

We push the root node, the operand node, onto the stack. We then read the 4, so create a node for it, and push that. The next token is ×. For this we pop the rhs, then the lhs, and form a new tree with the × operator as root:

Second step in creating the expression tree

This new root gets pushed onto the stack. As before, we’ve run out of RPN expression and so the answer is the top item on the stack. Of course, we’re not quite there yet: an expression tree is not an algebraic expression, but we’re close.

We now walk the tree using an inorder traversal, paying attention to precedence. In fact we use a rule that says if we go from an operator with a higher precedence to one with a lower precedence, we’ll output parentheses to surround the subexpression containing the lower precedence operator. If the precedence number is equal or lower, we don’t output any parentheses. Here’s the pseudo code:

string Visit(node, priorPrecedence) {
  if node is number 
    return number.ToString()
  result = Visit(left, node.precedence) +
           node.Operator +
           Visit(right, node.precedence)
  if node.precedence < priorPrecedence
    result = '(' + result + ')'
  return result
}

There are some assumptions here but they’re pretty obvious (for example, there’s some way to distinguish a number node from an operator node, operator nodes have left and right children, and so on).

So, a two-step process, but nothing too difficult once you’re used to the stack-based approach to evaluating an RPN expression. I quite like this idea of reconstructing an expression tree and then traversing it in a different order than before.

Album cover for Stage (2)Now playing:
Sweetback - Round And Round
(from Stage (2))


PCPlus 265: Evaluating expressions

I write a monthly column for PCPlus, a computer news-views-n-reviews magazine in the UK (actually there are 13 issues a year — there's an Xmas issue as well — so it's a bit more than monthly). The column is called Theory Workshop and appears in the back of every issue. When I signed up, my editor and the magazine were gracious enough to allow me to reprint the articles here after say a year or so. After all, the PDFs do appear on each issue's DVD after a couple of months. When I buy the current issue, I'll publish the article from the issue a year ago.

PCPlus logo Obviously I was on a parsing kick when I wrote this, given that the previous article was on Regular Expressions and this is on expression evaluation. And, equally obviously, I hadn't learned my lessons from a few articles back as again I tried to show some code. Anyway, this article was all about evaluating arithmetic expressions: taking a a string representing an expression like (1+2)×3, converting it into some form that would be more easily evaluated and then evaluating it. The intermediary form is RPN (Reverse Polish Notation) and evaluating an RPN expression is most easily done with a stack.

There's nothing particularly difficult about the article; I think I do a good job in describing how to parse the expression (and how to note errors) to produce an RPN "string" and then how to evaluate the RPN expression.

I thought I'd nailed the images some time ago, but for some reason the illustration showing how to evaluate an RPN expression using a stack "lost" the multiplication sign. Since it was just an asterisk (I checked), it's not like it was some bizarre font issue.

Here's the full code, including a quick test program. It includes the code printed in the article. Note that this code treats numbers only as single digits.

using System;
using System.Collections.Generic;
using System.Text;

namespace ExpressionEvaluator {

  public class ExpressionParser {

    internal interface IParseResult {
      bool Succeeded { get; }
      bool Failed { get; }
      string RpnExpression { get; }
    }

    internal class SuccessfulParse : IParseResult {
      private string rpnExpression;
      public SuccessfulParse(string rpnExpression) {
        this.rpnExpression = rpnExpression;
      }
      public bool Succeeded {
        get { return true; }
      }
      public bool Failed {
        get { return !Succeeded; }
      }
      public string RpnExpression {
        get { return rpnExpression; }
      }
    }

    internal class FailedParse : IParseResult {
      public bool Succeeded {
        get { return false; }
      }

      public bool Failed {
        get { return !Succeeded; }
      }

      public string RpnExpression {
        get { return null; }
      }
    }

    internal class ParserState {
      private string expression;
      private int position;
      private char current;
      public char Current {
        get { return current; }
      }
      public void Advance() {
        position++;
        if (position >= expression.Length)
          current = '\0';
        else
current = expression[position]; } public ParserState(string expression) { this.expression = expression; position = -1; Advance(); } } public static string ParseExpression(string expression) { ParserState state = new ParserState(expression); IParseResult result = ParseExpression(state); return result.RpnExpression; } private delegate IParseResult ParseDelegate(ParserState state); private static IParseResult ParseBinaryExpression(ParserState state, ParseDelegate parseOperand, ParseDelegate parseOperator) { IParseResult operandParse = parseOperand(state); if (operandParse.Failed) return operandParse; string rpn = operandParse.RpnExpression; IParseResult operatorParse = parseOperator(state); while (operatorParse.Succeeded) { operandParse = parseOperand(state); if (operandParse.Failed) return operandParse; rpn += operandParse.RpnExpression + operatorParse.RpnExpression; operatorParse = parseOperator(state); } return new SuccessfulParse(rpn); } private static IParseResult ParseExpression(ParserState state) { return ParseBinaryExpression(state, ParseTerm, ParseAdd); } private static IParseResult ParseTerm(ParserState state) { return ParseBinaryExpression(state, ParseFactor, ParseMultiply); } private static IParseResult ParseFactor(ParserState state) { IParseResult result = ParseNumber(state); if (result.Failed) { result = ParseParenthesizedExpression(state); } return result; } private static IParseResult ParseParenthesizedExpression(ParserState state) { if (state.Current != '(') return new FailedParse(); state.Advance(); IParseResult result = ParseExpression(state); if (result.Failed) return result; if (state.Current != ')') return new FailedParse(); state.Advance(); return result; } private static IParseResult ParseNumber(ParserState state) { char current = state.Current; if (char.IsDigit(current)) { state.Advance(); return new SuccessfulParse(current.ToString()); } return new FailedParse(); } private static IParseResult ParseAdd(ParserState state) { char current = state.Current; switch (current) { case '+': case '-': state.Advance(); return new SuccessfulParse(current.ToString()); default: return new FailedParse(); } } private static IParseResult ParseMultiply(ParserState state) { char current = state.Current; switch (current) { case '*': case '/': state.Advance(); return new SuccessfulParse(current.ToString()); default: return new FailedParse(); } } } class RpnEvaluator { private static double Convert(char token) { return (int)token - (int)'0'; } private static double Evaluate(char op, double rhs, double lhs) { switch (op) { case '+': return lhs + rhs; case '-': return lhs - rhs; case '*': return lhs * rhs; case '/': return lhs / rhs; default: throw new ArgumentException("unknown arithmetic operator", "op"); } } public static double Evaluate(string rpn) { Stack<double> stack = new Stack<double>(); foreach (char token in rpn) { if (char.IsDigit(token)) stack.Push(Convert(token)); else
stack.Push(Evaluate(token, stack.Pop(), stack.Pop())); } return stack.Pop(); } } class Program { static void Evaluate(string expr) { string rpn = ExpressionParser.ParseExpression(expr); Console.WriteLine(String.Format("{0} => {1} = {2}", expr, rpn, RpnEvaluator.Evaluate(rpn))); } static void Main(string[] args) { Evaluate("(1+2)*3"); Evaluate("1+2*3"); Evaluate("1+2/3"); Evaluate("1/2-3"); Evaluate("1+2*3-4"); Evaluate("(1+2)*(3-4)"); Evaluate("((1+2)*(3-4))-5"); Evaluate("1+2+3+4"); Evaluate("1*2*3*4"); Console.ReadLine(); } } }

This article first appeared in issue 265, February 2008.

You can download the PDF here.

Now playing:
Dido - Worthless
(from Café del Mar, Vol. 8)

Search

About Me

I'm Julian M Bucknall, the M because it's my middle initial and because I and the other Julian Bucknall (the movie guy) would like to differentiate ourselves.

I'm a programmer by trade, an actor by ambition, and an algorithms guy by osmosis. I write articles for PCPlus in my spare time, not that there's much of that.

Julian M Bucknall Apart from that, an ex-pat Brit, atheist, microbrew enthusiast, Pet Shop Boys fanboy, slide rule and HP calculator collector, amateur photographer, Altoids muncher.

DevExpress

I'm Chief Technology Officer at Developer Express, a software company that writes some great controls and tools for .NET and Delphi. I'm responsible for the technology oversight and vision of the company.

Validation

Validate markup as HTML5 (beta)     Validate CSS

Bottom swirl

Archives

February 2012 (5)
SMTWTFS
« Jan  
1234
567891011
12131415161718
19202122232425
26272829

Like this Archive Calendar widget? Download it here.

Social networking

Google ads

The OUT Campaign

The OUT Campaign

My Tweets

  • From #devexpress Live Chat today: someone tried to interest us in an idea for a new line of clothing. #makeschangestoroadmap
  • Handy hint: if you are sending your unsolicited résumé to a company, ensure the company's name is spelled correctly in said résumé. #ffs
  • Oooh, pretty. Sudden 8-bit pixellation: an NVIDIA video driver crash. Time to reboot methinks.
Bottom swirl