Local variables are evil (or just irritating)

Every now and then, I reread parts of Refactoring by Martin Fowler. The main reason is of course to look for refactoring ideas for DevExpress' Refactor! Pro product, but I also like to take a particular refactoring in there, tease it out, and really understand it.

Such a refactoring is Replace Temp with Query (120). The preamble to the refactoring goes like this:

You are using a temporary variable to hold the result of an expression.

Extract the expression into a method. Replace all references to the temp with the expression. The new method can then be used in other methods.

The motivation for the refactoring is very interesting because it essentially says "don't use local variables":

The problem with temps is that they are temporary and local. Because they can be seen only in the context of the method in which they are used, temps tend to encourage longer methods, because that's the only way you can reach the temp. By replacing the temp with a query method, any method in the class can get at the information. That helps a lot in coming up with cleaner code for the class.

There's actually something more to consider about local variables or temps and that is using them can make a method harder to simplify through repeated uses of Extract Method (110). If a local is read-only (say a parameter passed into the method) and the code-to-be-extracted makes use of it, then the local will become a parameter to the extracted method.

If however the local is both read and assigned to in the code-to-be-extracted (the variable is said to be mutable), then the new value will have to be returned by the new extracted method (as well as the local becoming a parameter). Not to shabby if there's only one such local, but if there are more, things get hairy quite quickly. (Aside: in Refactor! Pro demos, we like to show this off, since the "hairiness" is magnified by the number of input and output arrows in the preview.)

Let's take an example, and, to be fair, let's take an example I wrote. One of the more popular articles I authored for my previous blog on www.boyet.com is the one where I discuss how to calculate the ISO week number for a given date. I'll repeat the complete code here, but it is instructive to read the whole article.

  public class WeekCalculator {

    private static DateTime GetIsoWeekOne(int year) {
      // get the date for the 4-Jan for this year
      DateTime dt = new DateTime(year, 1, 4);

      // get the ISO day number for this date 1==Monday, 7==Sunday
      int dayNumber = (int)dt.DayOfWeek; // 0==Sunday, 6==Saturday
      if (dayNumber == 0) {
        dayNumber = 7;
      }

      // return the date of the Monday that is less than or equal
      // to this date
      return dt.AddDays(1 - dayNumber);
    }

    public static int GetIsoWeek(DateTime dt) {
      DateTime week1;
      int IsoYear = dt.Year;
      if (dt >= new DateTime(IsoYear, 12, 29)) {
        week1 = GetIsoWeekOne(IsoYear + 1);
        if (dt < week1) {
          week1 = GetIsoWeekOne(IsoYear);
        }
        else {
          IsoYear++;
        }
      }
      else {
        week1 = GetIsoWeekOne(IsoYear);
        if (dt < week1) {
          week1 = GetIsoWeekOne(--IsoYear);
        }
      }

      return (IsoYear * 100) + ((dt - week1).Days / 7 + 1);
    }
  }

If you peruse that, you'll see lots of local variables. In fact, as I look at it now, I can see Extract Method screaming at me from all the comments in the GetIsoWeekOne() method, so let's start there.

Ignoring the year parameter, there are two mutable local variables: dt and dayNumber. Let's first get rid of the latter of those two using Replace Temp with Query. The code for GetIsoWeekOne() now looks like this:

    private static int GetIsoDayNumber(DateTime date) {
      if (date.DayOfWeek == DayOfWeek.Sunday)
        return 7;
      return (int)date.DayOfWeek; 
    }
    private static DateTime GetIsoWeekOne(int year) {
      // get the date for the 4-Jan for this year
      DateTime dt = new DateTime(year, 1, 4);

      // return the date of the Monday that is less than or equal
      // to this date
      return dt.AddDays(1 - GetIsoDayNumber(dt));
    }

That return expression is calling out for some help:

    private static DateTime GetPreviousMonday(DateTime date) {
      return date.AddDays(1 - GetIsoDayNumber(date));
    }
    private static DateTime GetIsoWeekOne(int year) {
      // get the date for the 4-Jan for this year
      DateTime dt = new DateTime(year, 1, 4);
      return GetPreviousMonday(dt);
    }

And now we can get rid of the temp completely. There's an obvious way, but I decided to make it a little cleaner still. Here's the fully refactored code:

    private static int GetIsoDayNumber(DateTime date) {
      if (date.DayOfWeek == DayOfWeek.Sunday)
        return 7;
      return (int)date.DayOfWeek; 
    }
    private static DateTime GetPreviousMonday(DateTime date) {
      return date.AddDays(1 - GetIsoDayNumber(date));
    }
    private static DateTime Get4thOfJanuary(int year) {
      return new DateTime(year, 1, 4);
    }
    private static DateTime GetIsoWeekOne(int year) {
      return GetPreviousMonday(Get4thOfJanuary(year));
    }

The nice thing about this code is that there's lots of opportunities for the .NET JITter to inline the code at run-time.

OK, let's move onto the next method, GetIsoWeek, the public one we call. Here there are three locals (dt, week1, IsoYear), the latter two of which are temps. In fact IsoYear is that most malignant of all temps: it's mutable (check out the IsoYear++ and --IsoYear expressions). Extracting a method out of this is not going to be fun, but I'm inclined to make the effort because the original code is so nasty looking.

In essence, there are three cases we need to look at. The date we're given is:

If we didn't try and do the premature optimizations I was so intent on doing last time, the code would simplify to this by removing the week1 temp:

    public static int GetIsoWeek(DateTime dt) {
      int IsoYear;
      if (dt < GetIsoWeekOne(dt.Year))
        IsoYear = dt.Year - 1;
      else if (dt >= GetIsoWeekOne(dt.Year + 1))
        IsoYear = dt.Year + 1;
      else
        IsoYear = dt.Year;
      return (IsoYear * 100) + ((dt - GetIsoWeekOne(IsoYear)).Days / 7 + 1);
    }

Now we're cooking with gas. Let's extract that return expression.

    private static int CalculateIsoWeek(DateTime date, int isoYear) {
      return (isoYear * 100) + ((date - GetIsoWeekOne(isoYear)).Days / 7 + 1);
    }
    public static int GetIsoWeek(DateTime dt) {
      int IsoYear;
      if (dt < GetIsoWeekOne(dt.Year))
        IsoYear = dt.Year - 1;
      else if (dt >= GetIsoWeekOne(dt.Year + 1))
        IsoYear = dt.Year + 1;
      else
        IsoYear = dt.Year;
      return CalculateIsoWeek(dt, IsoYear);
    }

Heh, now we can get rid of the final temp altogether:

    public static int GetIsoWeek(DateTime date) {
      if (date < GetIsoWeekOne(date.Year))
        return CalculateIsoWeek(date, date.Year - 1);
      if (date >= GetIsoWeekOne(date.Year + 1))
        return CalculateIsoWeek(date, date.Year + 1);
      return CalculateIsoWeek(date, date.Year);
    }

Looks good, but there's some implied repeated calculations of the date of ISO week 1 for various years in there and my performance bump is throbbing. Let's see if we can't do better. The essence of the design is we're trying to place the given date in one of three ranges: less than this year's start of week 1, in between this year's week 1 and next year's week 1, and greater than or equal to the start of next year's week 1. So let's explicitly write it that way and make use of previous calculations of the dates of week 1:

    private static int CalcIsoWeek(DateTime date, DateTime weekOne) {
      return (weekOne.Year * 100) + ((date - weekOne).Days / 7 + 1);
    }
    private static int GetIsoWeekUsingRange(DateTime date, DateTime thisWeekOne, DateTime nextWeekOne) {
      if (date < thisWeekOne)
        return CalcIsoWeek(date, GetIsoWeekOne(date.Year - 1));
      if (date < nextWeekOne)
        return CalcIsoWeek(date, thisWeekOne);
      return CalcIsoWeek(date, nextWeekOne);
    }
    public static int GetIsoWeek(DateTime date) {
      return GetIsoWeekUsingRange(date, GetIsoWeekOne(date.Year), GetIsoWeekOne(date.Year + 1));
    }

With this new code, we only calculate two week 1 dates every time, and the third only if needed.

The whole code now looks like this:

  public class WeekCalculator {
    private static int GetIsoDayNumber(DateTime date) {
      if (date.DayOfWeek == DayOfWeek.Sunday)
        return 7;
      return (int)date.DayOfWeek; 
    }
    private static DateTime GetPreviousMonday(DateTime date) {
      return date.AddDays(1 - GetIsoDayNumber(date));
    }
    private static DateTime Get4thOfJanuary(int year) {
      return new DateTime(year, 1, 4);
    }
    private static DateTime GetIsoWeekOne(int year) {
      return GetPreviousMonday(Get4thOfJanuary(year));
    }
    private static int CalcIsoWeek(DateTime date, DateTime weekOne) {
      return (weekOne.Year * 100) + ((date - weekOne).Days / 7 + 1);
    }
    private static int GetIsoWeekUsingRange(DateTime date, DateTime thisWeekOne, DateTime nextWeekOne) {
      if (date < thisWeekOne)
        return CalcIsoWeek(date, GetIsoWeekOne(date.Year - 1));
      if (date < nextWeekOne)
        return CalcIsoWeek(date, thisWeekOne);
      return CalcIsoWeek(date, nextWeekOne);
    }
    public static int GetIsoWeek(DateTime date) {
      return GetIsoWeekUsingRange(date, GetIsoWeekOne(date.Year), GetIsoWeekOne(date.Year + 1));
    }
  }

Which is a lot clearer, despite there being many more methods. Each method is much smaller and easier to comprehend on its own and, as I've said, present more opportunities for inlining at run-time.

So, from this little anecdotal experiment, I quite haven't shown that local variables are necessarily evil, just that you can clean code up and refactor it quite dramatically when you remove them.

Now playing:
Vangelis - Dreams of Surf
(from Reprise: 1990-1999)

Loading similar posts...   Loading links to posts on similar topics...

6 Responses

 avatar
#1 Mike Reith said...
26-Nov-09 8:03 PM

Julian,

Very nice post. I was just reminded that enumerations are also problematic. Specifically, enumerations are often used in switch statements which are rife opportunities for code duplication (page 82 of Refactoring).

Both of these points are forcing me to rethink how I develop. Great stuff!

Thanks,

Mike

 avatar
#2 Jeremy Gray said...
27-Nov-09 10:54 PM

Nice post, but I thought I should take a moment to point out a missed opportunity that would likely result in better-factored code.

If you rewind back to when GetIsoWeek declared the temp IsoYear variable and then if/else if/else'd its way through a bunch of IsoYear assignments, the best way to get rid of the IsoYear local variable isn't taking the IsoYear assignments and replacing them with returns of the CalculateIsoWeek method. The best (IMHO) way is to instead take the entire IsoYear declaration and all of the if/else if/elses that assign to it and extract them into a new method that gives a name to the logic currently hidden in that block of code. From there, the resulting GetIsoWeek method can then be tidied up a bit further to really distill everything down.

Give it a try and I think you'll quickly see the improvement, as it calls out a concept hidden in your current code and does so without generating the duplication seen in your most recent GetIsoWeekUsingRange method.

 avatar
#3 Pato Moschcovich said...
11-Dec-09 7:54 AM

Julian I always read your articles and enjoy them.

I wanted to share a suggestion.

Right where you say:

"Heh, now we can get rid of the final temp altogether:"

I would have written the method like this at first:

public static int GetIsoWeek(DateTime date) {
  int offset = 0;
  if (date < GetIsoWeekOne(date.Year))
    offset = -1;
  else if (date >= GetIsoWeekOne(date.Year + 1))
    offset = 1;
  return CalculateIsoWeek(date, date.Year + offset);
}

And later like this:

public static int GetYearOffset(DateTime date) {
  if (date < GetIsoWeekOne(date.Year))
    return -1;
  else if (date >= GetIsoWeekOne(date.Year + 1))
    return 1;
  return 0;
}
public static int GetIsoWeek(DateTime date) {
  return CalculateIsoWeek(date, date.Year + GetYearOffset(date));
}

I think it's a lot cleaner.

julian m bucknall avatar
#4 julian m bucknall said...
11-Dec-09 9:21 AM

Pato, Jeremy: Thanks for the refactoring alternatives and the feedback. I think your replies show that there many ways to refactor the original code (which I note everyone agrees was pretty ugly), each as good as each other, each starting from a different point.

Cheers, Julian

 avatar
#5 Anurag said...
18-Dec-12 2:16 PM

Like all answers in Software Engineering, the correct answer should be "It depends". I can think at least one case where a local variable can actually work as compared to an instance member. A DAL Layer which passes some argument to webservice and then webservice is waiting for the stored proc to execute. If another request comes in, the instance variable in the DAL will be changed while waiting for the DB to finish doing its job.

 avatar
#6 Rob said...
15-Jan-13 7:24 AM

This is great stuff thanks Julian :)

I originally came across your post on your old site (www.boyet.com) and needed to modify it ever slightly to fit in the VB.NET project I was working on. I had to modify the GetIsoWeek(DateTime dt) call to only return the week (rather than YYYYWW). This worked fine although I started to have some obscurities where the week being returned was 1 too many .

The (dt - GetIsoWeekOne(IsoYear)).Days / 7 + 1 part of the process was returning 3.71 (when the week was 3) and therefore rounding up to 4. I wrapped it in a Math.Floor() so it correctly returned 3 for me.

It all seems to be working pretty consistently now for me.

Thanks again!

Leave a response

Note: some MarkDown is allowed, but HTML is not. Expand to show what's available.

  •  Emphasize with italics: surround word with underscores _emphasis_
  •  Emphasize strongly: surround word with double-asterisks **strong**
  •  Link: surround text with square brackets, url with parentheses [text](url)
  •  Inline code: surround text with backticks `IEnumerable`
  •  Unordered list: start each line with an asterisk, space * an item
  •  Ordered list: start each line with a digit, period, space 1. an item
  •  Insert code block: start each line with four spaces
  •  Insert blockquote: start each line with right-angle-bracket, space > Now is the time...
Preview of response