Posts tagged with 'binary-tree'

PCPlus 297: Arithmetic with cards

This particular article explored how to generate arithmetic expressions using a card game as a basis for discussion. It appeared in August 2010, and came about because of me reading two blog posts on entirely different topics within a week or so of each other. The combination triggered an Ah ha! moment, and I wrote it up. The first was for a card game called Krypto and appeared on The Daily WTF; the second was about generating all binary trees of a particular size and appeared on Eric Lippert’s blog, Fabulous Adventures in Coding.

PCPlus logoIn essence, for the card game you strip out all the court cards and jokers, shuffle the remainder and deal out five cards plus one to the side. The object of the game is to think of an arithmetic expression that links the values of the five cards (in the original sequence, mind) to equal the value of the sixth. You can use the four normal operators in any combination plus as many parentheses as you need. It’s not a bad game to exercise your arithmetic abilities and can be quite difficult for certain combinations of cards. Where enumerating all binary trees come in is to try and write a computer program to list out all possible expressions that satisfy the card layout.

In the article, I used Reverse Polish Notation (RPN) because it’s a damn sight easier than infix notation, and I glibly tossed off an assertion saying it’s easy to convert from RPN to infix. I then wrote a blog post about how easy it was, only to be brought up short, because it damn well isn’t. Sigh.

This article first appeared in issue 297, August 2010.

You can read the PDF here.

(I write a monthly column for PCPlus, a computer news-views-n-reviews magazine in the UK (actually there are 13 issues a year — there's an Xmas issue as well — so it's a bit more than monthly). The column is called Theory Workshop and appears in the Make It section of the magazine. When I signed up, my editor and the magazine were gracious enough to allow me to reprint the articles here after say a year or so.)

Now playing:
Ace - How Long?
(from Groove Armada’s Late Night Tales)


Best case, Worse case, Amortized case

A couple of weeks ago, Hacker News pointed to an article about how bad hash tables could be in a worse case scenario.

In essence, the thrust of the argument was that, in a worse case scenario, hash tables degenerate from a very nice O(1) for insertion and deletion of items to O(N). The reason is that hash tables’ quoted run-time efficiency is an amortized value and not an absolute result. If you like, on average over very many insertions and deletions, hash tables behave in a O(1) manner, but there can be cases where a single insertion can be O(N), generally when the table is grown because its load factor grows above a certain value.

The author of the piece then goes on to say that because of this he always uses a balanced binary tree for a dictionary or associative array, because its run-time efficiency is O(log N) for insertions and deletions, even in the worse case. That is, a balanced binary tree, such as a red-black tree, offers a worse-case guarantee that insertions and deletions will not take longer than O(log N).

This is all quite true, but I would venture to guess that the worse case scenario does not occur for the vast majority of applications. For example, let us take the spell checker scenario where we store a wordlist in a hash table. (Ignore for the moment the fact that a trie or a ternary tree might be a better structure than either a hash table or a balanced binary tree for this.) One would assume that the wordlist size would not change drastically during the run of the application with violent swings in the number of words. In which case, the developer would optimize the dictionary, for example, by tweaking the hash function or by initializing the number of elements in the table correctly. In such an example, there would be no such worse case. (I shall ignore the worse case of the crappy developer, as one does in all discussions of run-time efficiency.)

Another example is associative arrays in dynamic languages like JavaScript or Ruby. There you are pretty much forced to use hash tables because they’re, well, part of the infrastructure. Presumably performance testing will reveal problems in your code (again, ignoring the issue of the crap developer who doesn’t test) that you can then fix, if possible.

My viewpoint here that, unless you are aware of wildly varying numbers of name-value pairs in your application, it’s very likely that a hash table will suffice for your dictionary needs. Instead of faster efficiency the vast majority of time, by using a balanced binary tree you are slower all the time.

After all, every implementation of quicksort will have a sequence of input items that will force an O(n^2) run-time. Using random selection of the pivot can alleviate this to a certain extent (that is, make it extremely unlikely to happen). Yet, how many people worry about that? Indeed, how many people know where quicksort is being used in their run-time library?

Mind you, all this is very academic. The real rule is: measure your own code’s performance. Don’t take big-Oh numbers as gospel: for your application and your data, the real world may be very different.

Search

About Me

I'm Julian M Bucknall, the M because it's my middle initial and because I and the other Julian Bucknall (the movie guy) would like to differentiate ourselves.

I'm a programmer by trade, an actor by ambition, and an algorithms guy by osmosis. I write articles for PCPlus in my spare time, not that there's much of that.

Julian M Bucknall Apart from that, an ex-pat Brit, atheist, microbrew enthusiast, Pet Shop Boys fanboy, slide rule and HP calculator collector, amateur photographer, Altoids muncher.

DevExpress

I'm Chief Technology Officer at Developer Express, a software company that writes some great controls and tools for .NET and Delphi. I'm responsible for the technology oversight and vision of the company.

Validation

Validate markup as HTML5 (beta)     Validate CSS

Bottom swirl

Archives

February 2012 (5)
SMTWTFS
« Jan  
1234
567891011
12131415161718
19202122232425
26272829

Like this Archive Calendar widget? Download it here.

Social networking

Google ads

The OUT Campaign

The OUT Campaign

My Tweets

  • From #devexpress Live Chat today: someone tried to interest us in an idea for a new line of clothing. #makeschangestoroadmap
  • Handy hint: if you are sending your unsolicited résumé to a company, ensure the company's name is spelled correctly in said résumé. #ffs
  • Oooh, pretty. Sudden 8-bit pixellation: an NVIDIA video driver crash. Time to reboot methinks.
Bottom swirl