Posts filed under the 'PCPlus' category

Ray tracing image from June 2010’s PCPlus

I’ve just sent off June 2010’s article for PCPlus to my editor, just a smidgeon late. A couple of days is all. It’s on ray tracing, something I’ve wanted to discuss and play around with for a while. I downloaded POV-Ray, an open-source ray tracing renderer for Windows, OSX, and Linux to use as a test-bench, and spent some fun hours with it.

For the article I had to create an original image. Well, not ‘had to’ exactly, but I thought it only right that I show something that didn’t come from Wikipedia or some other ray tracing enthusiast’s site. I certainly didn’t want to show the standard reflective ball hovering over a checkerboard image, although I admit to snagging the sphere code from Christoph Hormann’s site. I decided to go for an image showing a 6×10 pentomino solution, since the previous article was about pentominoes and how to solve geometric puzzles with them.

Here’s the final image, after I’d spent entirely too much time this morning messing around with various options instead of completing the ruddy article.

Raytraced pentomino solution

In essence I wanted to show off most of the topics I discussed in the article in one image. The pentominoes are translucent, so the shadow is colored. There are two light sources, a main white one and a slightly reddish-tinged dim one. The spheres reflect each other, the solution, and the shadows.

If you’ve downloaded POV-Ray and want to generate this image yourself, here’s the code. If you want to read the article, buy PCPlus’ June issue when it hits the newsstands, or wait until June 2011 when I’ll publish it here on this blog.

Now playing:
Enya - Marble Halls
(from Shepherd Moons)




PCPlus 280: Writing a spellchecker

I write a monthly column for PCPlus, a computer news-views-n-reviews magazine in the UK (actually there are 13 issues a year — there’s an Xmas issue as well — so it’s a bit more than monthly). The column is called Theory Workshop and appears in the Make It section of the magazine. When I signed up, my editor and the magazine were gracious enough to allow me to reprint the articles here after say a year or so. What I’ll do is publish the article from a year ago or so here when I purchase the current issue.

Since I picked up April 2010’s issue when I was in England just over a week ago, I’m publishing this article from April 2009 slightly earlier than I usually do (which means May 2009’s article will be posted in June or something).

The topic is pretty interesting: how does a spell-checker give you alternatives to a badly-spelled word? The first step — finding out if a word is misspelled — is simple enough as a basic algorithm: just look it up in a big ol’ list o’ words. The next step can seem daunting though: how can you generate a valid list of possible corrections? The article talks about a few possibilities: the Levenshtein distance, the Damerau algorithm, and the Soundex method.

The Levenshtein distance is a bit impractical for a large list of words, although there are certain techniques you can use to speed things up. It essentially calculates the edit distance (the number of single-character insertions, deletions, and substitutions needed to go from one word to another) between the misspelled word and every word in the list. The words with the smallest edit distance are your correction candidates. As I said, possibly too slow for a large dictionary, although it’s a great algorithm for creating diff tools.
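A minimal sketch of the idea, not the code from the article: the usual dynamic-programming form of the edit distance, plus a brute-force pass over the word list. The names `levenshtein` and `closest_words` are made up for this illustration, and ties are broken alphabetically purely to keep the output stable.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance counting insertions, deletions and substitutions."""
    previous = list(range(len(b) + 1))      # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]                       # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,            # delete char_a
                current[j - 1] + 1,         # insert char_b
                previous[j - 1] + cost,     # substitute (or keep) the character
            ))
        previous = current
    return previous[-1]


def closest_words(word, dictionary, limit=3):
    """Brute force: score every dictionary word, keep the nearest few."""
    return sorted(dictionary, key=lambda w: (levenshtein(word, w), w))[:limit]


print(closest_words("speling", ["spelling", "spewing", "apple"], limit=2))
```

Every lookup walks the whole word list, which is exactly why this approach struggles with a large dictionary.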

The Damerau algorithm essentially calculates all the words that can be formed from the misspelled word by a single typing mistake (one letter inserted, deleted, substituted, or swapped with its neighbour) and checks those in the dictionary. Those that are present are valid correction candidates. Single mistakes are common, whereas two or more mistyped letters in a single word are fairly rare, so the algorithm does pretty well (as the article shows).

The Soundex algorithm is a “phonetic” algorithm: it finds candidate words that “sound like” the misspelled word.
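For illustration, here is a compact sketch of the commonly described American Soundex rules: keep the first letter, code the remaining consonants as digits, collapse runs of the same code, and pad to four characters. This is an assumption about which Soundex variant is meant; the article's own implementation may differ in details.

```python
# Consonant groups that share a Soundex digit; vowels, h, w and y get no code.
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for letter in letters:
        CODES[letter] = str(digit)


def soundex(word):
    letters = [c for c in word.lower() if "a" <= c <= "z"]
    if not letters:
        return ""
    result = letters[0].upper()
    last = CODES.get(letters[0], "")
    for c in letters[1:]:
        if c in "hw":
            continue                 # h and w do not separate equal codes
        code = CODES.get(c, "")
        if code and code != last:
            result += code
        last = code                  # a vowel resets the run, so a repeat counts again
    return (result + "000")[:4]


print(soundex("Robert"), soundex("Rupert"))
```

Words that produce the same code are treated as sounding alike, so a misspelling can be matched against dictionary words grouped by code.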

Of course the article doesn’t go into deeper, harder algorithms for space reasons. For example, Bloom filters are great for compressing the dictionary at the expense of a few false positives and can be used to replace the article’s hash table for the word list. Different dictionaries could be employed that just store root words and then describe possible suffixes, prefixes, and circumfixes that can go with each word (for example, Hunspell). Better phonetic algorithms now exist than Soundex (for example, Metaphone). Nevertheless, I think the article hangs together quite well and I’m pretty pleased with it.

This article first appeared in issue 280, April 2009.

You can download the PDF here.

Now playing:
Armstrong, Craig - Revelations
(from Plunkett & Macleane [Original Score])



PCPlus 279: JPEG compression


One of the topics I wanted to write up back when I had a two-page article was how JPEG compression worked, but I didn’t think I could cover it adequately in such a small space. So for March 2009 I tried with my new three-page allowance, but found that it was just as difficult. Trouble is, there’s so much to talk about: colour spaces, DCTs, downsampling, Huffman encoding, and so on and so forth. So in the end, it turned into more of a layman’s discussion than any kind of deeper/broader article that laid down the foundations of why JPEG compression works, and why, sometimes, it doesn’t work very well.

I also tried to show, with an image, what the conversion of an RGB image to the YCbCr colour space would look like, and completely ignored the fact that, although our screens use the RGB colour space, printers use the CMYK colour space. I was expecting it all to get translated from RGB to CMYK properly and the inks to cooperate as they were laid onto paper, etc. Even looking at the PDF of the article, that “decomposition” image looks weird. This is what it should look like:

RGB to YCbCr conversion

I actually created each supplementary image in code by using the RGB-YCbCr conversion equations on the original image and then stitched them together.

    // WinForms click handler; needs System and System.Drawing.
    private void button1_Click(object sender, EventArgs e) {
      Bitmap input = new Bitmap(@"D:\Users\Julian M Bucknall\Pictures\Group.jpg");

      Bitmap lumaImage = new Bitmap(input.Width, input.Height, System.Drawing.Imaging.PixelFormat.Format24bppRgb);
      Bitmap crImage = new Bitmap(input.Width, input.Height, System.Drawing.Imaging.PixelFormat.Format24bppRgb);
      Bitmap cbImage = new Bitmap(input.Width, input.Height, System.Drawing.Imaging.PixelFormat.Format24bppRgb);

      for (int c = 0; c < input.Width; c++) {
        for (int r = 0; r < input.Height; r++) {
          Color color = input.GetPixel(c, r);
          // RGB to YCbCr. Cb and Cr are left signed here (no +128 offset),
          // so they lie in roughly -128..128.
          int Y  = (int) Math.Round(( 0.2990 * color.R) + (0.5870 * color.G) + (0.1140 * color.B));
          int Cb = (int) Math.Round((-0.1687 * color.R) - (0.3313 * color.G) + (0.5000 * color.B));
          int Cr = (int) Math.Round(( 0.5000 * color.R) - (0.4187 * color.G) - (0.0813 * color.B));

          // Luma: a plain greyscale pixel.
          color = Color.FromArgb(Y, Y, Y);
          lumaImage.SetPixel(c, r, color);

          // Cr: doubled to boost contrast, then clamped because a pure red
          // pixel would otherwise give 256 and make FromArgb throw.
          Cr = Math.Max(-255, Math.Min(255, Cr * 2));
          if (Cr >= 0)
            color = Color.FromArgb(Cr, 0, 0);       // positive Cr shown as red
          else
            color = Color.FromArgb(0, -Cr, -Cr);    // negative Cr shown as cyan
          crImage.SetPixel(c, r, color);

          // Cb: same treatment, blue for positive and yellow for negative.
          Cb = Math.Max(-255, Math.Min(255, Cb * 2));
          if (Cb >= 0)
            color = Color.FromArgb(0, 0, Cb);
          else
            color = Color.FromArgb(-Cb, -Cb, 0);
          cbImage.SetPixel(c, r, color);
        }
      }
      lumaImage.Save(@"D:\Users\Julian M Bucknall\Pictures\GroupLuma.jpg", System.Drawing.Imaging.ImageFormat.Jpeg);
      crImage.Save(@"D:\Users\Julian M Bucknall\Pictures\GroupCr.jpg", System.Drawing.Imaging.ImageFormat.Jpeg);
      cbImage.Save(@"D:\Users\Julian M Bucknall\Pictures\GroupCb.jpg", System.Drawing.Imaging.ImageFormat.Jpeg);
    }

Well, OK, there is a little bit of futzing around in there so that you can more easily visualize the Cr and Cb images. I can’t remember now why I chose the code I did; only that there was quite a bit of experimentation behind it so that the results would look good when printed, albeit inaccurate.

Although I’m not 100% happy with how the article turned out when printed, and possibly three pages is still too few to do the subject justice, I did enjoy the research and the playing around that went into it. It is a fascinating topic; one that has more than its fair share of voodoo (where do those constants in the RGB-YCbCr conversion equations come from again?).
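On that last question, a partial answer that can be checked numerically: the luma weights 0.299, 0.587 and 0.114 are the ITU-R BT.601 values that JFIF adopts, and the chroma rows follow from scaling B − Y and R − Y so that each spans −0.5 to 0.5. The sketch below assumes that derivation and simply recomputes the constants; the function name is made up for the example.

```python
KR, KB = 0.299, 0.114        # BT.601 luma weights for red and blue
KG = 1.0 - KR - KB           # green takes the remainder, about 0.587


def ycbcr_coefficients():
    """Return the (R, G, B) weights for Y, Cb and Cr."""
    y = (KR, KG, KB)
    # Cb = (B - Y) / (2 * (1 - KB)): blue minus luma, scaled to -0.5..0.5
    cb = tuple((b - w) / (2 * (1 - KB)) for b, w in zip((0, 0, 1), y))
    # Cr = (R - Y) / (2 * (1 - KR)): red minus luma, scaled to -0.5..0.5
    cr = tuple((r - w) / (2 * (1 - KR)) for r, w in zip((1, 0, 0), y))
    return y, cb, cr


for name, row in zip(("Y", "Cb", "Cr"), ycbcr_coefficients()):
    print(name, [round(v, 4) for v in row])
```

Rounded to four places, the chroma rows match the constants used in the conversion code above. Where the three luma weights themselves come from is a longer story about how bright the eye perceives each primary to be.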

This article first appeared in issue 279, March 2009.

You can download the PDF here.

(Quick aside: PCPlus used to put part of their archive as PDFs on the DVD in the back of the magazine. They’ve now moved to a CD instead of a DVD, presumably to save on costs, and the archive is no longer on there. I hear they’re going to publish it online instead, sometime in the near future.)


PCPlus 278: Rainbow tables


February 2009’s article was a ‘commission’ in the sense that Martin Cooper, the Editor of PCPlus, wrote to me asking what I knew about rainbow tables and wouldn’t it be a good idea if I wrote an article on them. I don’t know about you, but when the Head Honcho says wouldn’t it be a good idea, you take it as do it now. So I did.

Actually I didn’t know much about them to begin with and the research proved interesting and pleasurable. In essence, rainbow tables are a technique using large pre-computed tables that help you crack hashed passwords. The way it works is to use a class of functions called reduce functions that calculate a contender password from the hash. These reduce functions are used alongside the hash functions, applied in a chain: hash followed by reduce followed by hash. You end up with a candidate password and a final hash, but that chain covers all the intermediary passwords that were also hashed. Given enough reduce functions (thousands of them) and enough time, you’d create a large table of initial passwords and final hashes.

To crack a password, you get its hashed value and check whether that hash is in your table. If it is, reproduce the chain until you get to the point where you can read off the password. If not, reduce the given hash with the final reduce function, hash the result, and check whether that new hash is in the table. Continue this cycle, starting one reduce function earlier each time, until you find a match between the computed hash and an entry in the table (and therefore the password from regenerating the chain), or run out of reduce functions. For a more complete description, read the article.
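To make the chain mechanics concrete, here is a toy sketch under loudly stated assumptions: the "passwords" are 4-digit PINs, the hash is SHA-256, and the family of reduce functions is a simple modulo that depends on the step number. None of this comes from the article, and a real rainbow table would use far longer chains and a far larger password space.

```python
import hashlib

CHAIN_LENGTH = 20    # number of reduce steps in each chain (toy value)
SPACE = 10_000       # toy password space: the PINs 0000..9999


def hash_password(password):
    return hashlib.sha256(password.encode()).hexdigest()


def reduce_hash(digest, step):
    """Hypothetical reduce function number `step`: map a hash back to a PIN."""
    return f"{(int(digest, 16) + step) % SPACE:04d}"


def build_table(start_passwords):
    """Store only the final hash of each chain, keyed back to its start."""
    table = {}
    for start in start_passwords:
        digest = hash_password(start)
        for step in range(CHAIN_LENGTH):
            digest = hash_password(reduce_hash(digest, step))
        table[digest] = start
    return table


def crack(target, table):
    # Guess that the target sits at `position` in some chain, latest first.
    for position in range(CHAIN_LENGTH, -1, -1):
        digest = target
        for step in range(position, CHAIN_LENGTH):
            digest = hash_password(reduce_hash(digest, step))
        start = table.get(digest)
        if start is None:
            continue
        # Regenerate the chain from its start and look for the target hash.
        password = start
        for step in range(CHAIN_LENGTH + 1):
            if hash_password(password) == target:
                return password
            if step < CHAIN_LENGTH:
                password = reduce_hash(hash_password(password), step)
        # Falling through means a false alarm; try an earlier position.
    return None


table = build_table([f"{n:04d}" for n in range(0, SPACE, 25)])
print(crack(hash_password("0000"), table))
```

Only chain starts and final hashes are stored, which is the space saving; the price is the recomputation inside `crack`. A password that no chain happens to cover comes back as `None`.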

This article first appeared in issue 278, February 2009.

You can download the PDF here.


Now playing:
Bowie, David - I Took a Trip on a Gemini Spaceship
(from Heathen)



PCPlus 277: Dictionaries and hash tables


January 2009 was an important change for me. It seems that the Editor was pleased enough with my pieces (and I presume so were the readers) that my commission each month expanded to three pages instead of the prior two (or, if you’re counting, 2000 words instead of 1300). Not quite a 50% increase in pay to go along with the 50% increase in surface area (ha!), but to be quite honest I didn’t particularly care, and still don’t: I really enjoy writing for them and the money they pay is pretty good anyway. The change in word count meant that I could start to do my topics in more depth. Before I would sometimes be struggling to contain the topic in the space, but now the extra room allowed me to cover more detail.

The first article in this new expanded section had to be a good one. I decided to cover one of my favourite data structures: the hash table or dictionary.

Even with the extra space, all I could cover was the basic hash table together with linear probing as a collision resolution mechanism (an example of open addressing) and the problems of clustering. A hash table with open addressing is still one of my favourite ways of implementing a dictionary, and so writing the article was fairly quick.
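A minimal sketch of open addressing with linear probing, for illustration only (it is not the article's code). The class name, the starting capacity of 8, and the 2/3 load limit are choices made for this example, and deletion is left out because it needs tombstones or re-insertion to keep probe sequences intact.

```python
class LinearProbingTable:
    def __init__(self, capacity=8):
        self._slots = [None] * capacity     # each slot is None or (key, value)
        self._count = 0

    def _probe(self, key):
        """Index holding `key`, or the empty slot where it would be stored."""
        index = hash(key) % len(self._slots)
        while self._slots[index] is not None and self._slots[index][0] != key:
            index = (index + 1) % len(self._slots)   # collision: try the next slot
        return index

    def put(self, key, value):
        if (self._count + 1) * 3 > len(self._slots) * 2:   # keep the table at most 2/3 full
            self._grow()
        index = self._probe(key)
        if self._slots[index] is None:
            self._count += 1
        self._slots[index] = (key, value)

    def get(self, key, default=None):
        slot = self._slots[self._probe(key)]
        return default if slot is None else slot[1]

    def _grow(self):
        old = self._slots
        self._slots = [None] * (len(old) * 2)   # double, then rehash everything
        self._count = 0
        for slot in old:
            if slot is not None:
                self.put(*slot)


table = LinearProbingTable()
table.put(2, "first")
table.put(10, "second")     # 10 % 8 == 2, so this collides and lands in slot 3
print(table.get(2), table.get(10))
```

Because colliding keys spill into neighbouring slots, runs of occupied slots tend to merge as the table fills, which is the clustering problem mentioned above.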

I liked doing the figures too, although Figure 2 is a bit bizarre without some explanation (time is meant to be read from top to bottom, first we insert a record whose hash resolves to index 2, then one at 17, then one at 11, etc; the more we add records, the more collisions there are and they tend to cluster). The figure cries out for animation, as I discovered recently when I slipped in a couple of images of hash tables to my CTO video on seeing things in black and white. There I wanted to show how disruptive it can be when a hash table grows, and it too needed animation, especially in a video like that. I’ve since found out that Illustrator can produce Flash animations from a series of layers, etc, but I haven’t had a chance to play around with that as yet.

Double-take: a hash table “disruptive”? Indeed, yes, under certain situations. Much is made of the fact that insertion and search in a hash table is O(1), that is, it’s constant time whether there are 10 items in the table or 10,000. Not strictly true, as it happens, it’s more of an amortized O(1) over many insertions or searches. The reason is that, during an insertion, there may be a point when the load factor is too high (as explained in the article, for open addressing that’s assumed to be about 2/3 full), and the hash table has to be grown. This requires allocating a new array (traditionally it’s set to twice the size of the original), and then rehashing and inserting all of the current items into the new array. This is an O(n) operation and it happens every n items, so, amortized, it smears out to a constant addition factor over all n items. Whoopee for the big-Oh notation, but in practice, if you have a hash table containing a huge number of items, the time taken for this growth may be noticeable by the user or by a real-time process that’s not expecting it.
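The gap between "cheap on average" and "occasionally very expensive" is easy to see by counting. The sketch below does not time a real hash table; it only simulates the bookkeeping, assuming a table that starts at 8 slots, doubles when it would pass 2/3 full, and re-inserts every stored item on each growth.

```python
def simulate_growth(item_count, initial_capacity=8, max_load=2 / 3):
    """Return (average re-insertions per insert, items moved by the worst insert)."""
    capacity = initial_capacity
    total_moves = 0
    worst_growth = 0
    for stored in range(item_count):
        if stored + 1 > capacity * max_load:     # this insert triggers a growth
            total_moves += stored                # every stored item is rehashed
            worst_growth = max(worst_growth, stored)
            capacity *= 2
    return total_moves / item_count, worst_growth


average, worst = simulate_growth(1_000_000)
print(f"average extra moves per insert: {average:.2f}")
print(f"items moved by the single worst insert: {worst}")
```

The average stays below two extra moves per insertion, yet one unlucky insertion has to move hundreds of thousands of items on its own, and that is the pause a user or a real-time process would notice.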

This article first appeared in issue 277, January 2009.

You can download the PDF here.



PCPlus 276: Space, the final frontier

I write a monthly column for PCPlus, a computer news-views-n-reviews magazine in the UK (actually there are 13 issues a year — there's an Xmas issue as well — so it's a bit more than monthly). The column is called Theory Workshop and appears in the back of every issue. When I signed up, my editor and the magazine were gracious enough to allow me to reprint the articles here after say a year or so. After all, the PDFs do appear on each issue's DVD after a few months. When I buy the current issue, I'll publish the article from the issue a year ago. I popped over to B&N this lunchtime and bought the Christmas issue, so here's Christmas 2008's article.

This article was about caching data to make algorithms faster. I used a couple of examples to prove my point: Fibonacci numbers using the recursive algorithm and primality testing.
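For the Fibonacci case, the effect of caching can be sketched in a few lines. This is an illustration rather than the article's code, and it leans on Python's standard `functools.lru_cache` to do the remembering.

```python
from functools import lru_cache


def fib_naive(n):
    """Plain recursion: recomputes the same values over and over."""
    return n if n < 2 else fib_naive(n - 1) + fib_naive(n - 2)


@lru_cache(maxsize=None)
def fib_cached(n):
    """Same recursion, but each result is computed once and then looked up."""
    return n if n < 2 else fib_cached(n - 1) + fib_cached(n - 2)


print(fib_naive(20), fib_cached(20))
print(fib_cached(50))
```

The naive version makes an exponentially growing number of calls as `n` rises, while the cached version computes each value up to `n` once, trading a little memory for a lot of time.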

However — I don't know why — it just didn't come out the way I wanted it to. Rereading it now, 15 months after writing it, it feels very under-researched, boring. Possibly I tossed it off in an afternoon under deadline. I also note that the images were added by my editor, they weren't mine (besides which, I only had one). A very disappointing article, I'm afraid, certainly not one of my best. Meh.

This article first appeared in issue 276, Christmas 2008.

You can download the PDF here (but, really, I wouldn't bother).

Now playing:
Madonna - Get Together
(from Confessions On A Dance Floor)


PCPlus 275: Ant Colony Optimization

I just now popped over to B&N and bought December's issue, so here's December 2008's article.

This particular article is about a fascinating optimization technique that, for some reason, I find easier to understand and implement than, say, genetic algorithms or simulated annealing. It also gave the PCPlus designer an opportunity to have a whopping big picture of an ant on the heading graphic.

Ant Colony Optimization (ACO) is a technique to solve NP-hard problems like the Travelling Salesman Problem (TSP). In essence, you use a model of an ant to wander randomly over the problem space. The ant will find a particular path to a solution and in doing so will leave a digital pheromone along his path. The longer the walk the smaller the pheromone density deposited, the shorter the distance the stronger the pheromone density. Launch a few hundred more ants over the space, and they will tend to follow higher concentrations of pheromone, but still investigate random walks. Eventually, you'll have a pheromone path to a solution that is likely to be fairly optimal. There are a few knobs to tweak in the algorithm, such as how quickly the digital pheromone evaporates.
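A small sketch of the loop described above, applied to a TSP over points in the plane. It is not the article's code: the parameter values are arbitrary, and weighting each choice by inverse distance as well as pheromone (the `beta` term) is borrowed from the commonly described Ant System rather than from the text.

```python
import math
import random


def tour_length(tour, dist):
    return sum(dist[tour[i]][tour[(i + 1) % len(tour)]] for i in range(len(tour)))


def ant_colony_tsp(points, ants=20, rounds=50, evaporation=0.5,
                   alpha=1.0, beta=2.0, seed=0):
    rng = random.Random(seed)
    n = len(points)
    dist = [[math.dist(p, q) for q in points] for p in points]
    pheromone = [[1.0] * n for _ in range(n)]
    best_tour, best_length = None, float("inf")
    for _ in range(rounds):
        tours = []
        for _ in range(ants):
            tour = [rng.randrange(n)]                 # each ant starts at a random city
            while len(tour) < n:
                here = tour[-1]
                choices = [c for c in range(n) if c not in tour]
                # Prefer edges with more pheromone and shorter length.
                weights = [pheromone[here][c] ** alpha / dist[here][c] ** beta
                           for c in choices]
                tour.append(rng.choices(choices, weights)[0])
            tours.append((tour_length(tour, dist), tour))
        for row in pheromone:                         # evaporation
            for j in range(n):
                row[j] *= 1.0 - evaporation
        for length, tour in tours:                    # shorter tours deposit more
            for i in range(n):
                a, b = tour[i], tour[(i + 1) % n]
                pheromone[a][b] += 1.0 / length
                pheromone[b][a] += 1.0 / length
            if length < best_length:
                best_tour, best_length = tour, length
    return best_tour, best_length


corners = [(0, 0), (0, 1), (1, 1), (1, 0)]
print(ant_colony_tsp(corners))
```

The `evaporation` argument is the knob mentioned above: a high value forgets old trails quickly and keeps the ants exploring, while a low value locks in early discoveries. The cities are assumed to be at distinct positions, since a zero distance would break the weighting.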

Because of the "path" aspect, ACOs are great for solving TSP-type problems, and to illustrate it I showed a map of England with 5 major cities on it and invited people to solve the TSP by hand. To emphasize the paths between the cities as being distinct paths I had to smudge Birmingham's position a little because it's directly on the way between London and Manchester, and put it somewhere on the Welsh Borders. Sorry, Ludlow, for dumping Birmingham on you; I should have chosen another city like Bristol instead of Manchester.

This article first appeared in issue 275, December 2008.

You can download the PDF here.

Now playing:
Jarre, Jean Michel - Ethnicolor
(from Zoolook)


PCPlus 274: Choosing random samples

I bought November's issue this lunchtime, so here's November 2008's article.

The premise of this article is pretty simple: for statistical purposes you have to select, at random, n items from a very large set of them. Then, presumably, you will make some deductions about the whole set from this much smaller sample. There are many applications of this and similar algorithms; the most obvious being political polling.

It's instructive to think about the issue before reading the article. Say you have to select exactly three items at random from a set of 10. How would you go about it such that you do not skew the selection? That is, such that every item has an equal probability of being selected? (An example I give in the article is to calculate a random number between 0 and 1 for every item in the set: if it's less than 0.3 select the item, stop when you have three. However, this is very biased to the earlier items, and, indeed, you might not even get three items at all.)
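Spoiler warning for anyone who wants to work it out first. One standard answer when the set size is known is selection sampling (Knuth's Algorithm S): walk the items once and pick each with probability "still needed" over "still remaining". The sketch below assumes that technique; whether the article presents exactly this method is not stated here.

```python
import random


def selection_sample(items, n, rng=random):
    """Pick exactly n of the items, each equally likely, in one pass."""
    chosen = []
    total = len(items)
    for seen, item in enumerate(items):
        needed = n - len(chosen)
        remaining = total - seen
        # Select with probability needed / remaining.
        if rng.random() * remaining < needed:
            chosen.append(item)
    return chosen


print(selection_sample(list(range(10)), 3))
```

When the items still needed equal the items still remaining, every remaining item is taken, so the sample always has exactly `n` entries. That fixes both flaws of the 0.3 approach: the bias towards early items and the risk of coming up short.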

If you manage that, consider the scenario where you don't know the total number of items at the outset, yet you must select 1000 of them such that each has an equal probability of being selected. Totally at random. Without counting them.
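Again a spoiler: the usual technique for an unknown total is reservoir sampling (often called Algorithm R). Keep the first n items, then let each later item replace a random one of the kept items with steadily shrinking probability. The sketch assumes that technique and a made-up function name.

```python
import random


def reservoir_sample(stream, n, rng=random):
    """Keep a uniform random sample of n items from a stream of unknown length."""
    reservoir = []
    for seen, item in enumerate(stream, start=1):
        if seen <= n:
            reservoir.append(item)           # fill the reservoir first
        else:
            slot = rng.randrange(seen)       # uniform over 0..seen-1
            if slot < n:                     # happens with probability n / seen
                reservoir[slot] = item
    return reservoir


# Works on any iterable, including a generator whose length is never asked for.
sample = reservoir_sample((x * x for x in range(100_000)), 1000)
print(len(sample))
```

The code does keep a running position for the current item, but it never needs the total in advance, and at every moment the reservoir is a fair sample of everything seen so far.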

Fascinating stuff.

This article first appeared in issue 274, November 2008.

You can download the PDF here.

Now playing:
The Alan Parsons Project - Eye In The Sky
(from The Definitive Collection)


PCPlus 273: Solitaire cryptography

I bought October's when I was in England at the beginning of the month, so here's October 2008's article.

This was a hoot to write. I armed myself with a brand new deck of playing cards, a copy of Neal Stephenson's Cryptonomicon, opened at the appendix written by Bruce Schneier, and sat at our patio table and tried to make sense of the Solitaire encryption algorithm described there (it's called Pontifex in the book itself).

It seems Stephenson wanted a non-electronic, but secure, encryption algorithm for two of his characters, something that a secret agent could use behind enemy lines without drawing suspicion. So he asked Schneier whether he had such a thing in his armory of algorithms. Schneier came up with a humdinger of an algorithm that just uses a deck of cards and a bit of time but that is extremely secure.

Drawing the figures was fun too. I managed to find a set of card images online and spent an agreeable couple of hours in Illustrator getting the main deck-cutting procedures into image form. It seems the magazine graphics designer had a field day too with the image at the top of the article. Why can't all algorithmic figures be as fun to draw?

This article first appeared in issue 273, October 2008.

You can download the PDF here.

Now playing:
The Dream Academy - Life In A Northern Town
(from The Dream Academy)


PCPlus 272: Generating gobbledygook

Since I've now got September's issue (and have had it for a couple of weeks), here's September 2008's article.

Pure fun this time: generating random text. The article shows that generating pure random text where every character has an equal probability of appearing next doesn't work particularly well. Enter Markov chains, where the probabilities of what comes next are skewed to what has just appeared. First we look at characters, so the next character depends on what the previous character was (order-1 Markov chain), all the way up to an order-10 Markov chain (the next character depends on the previous 10 characters). I particularly like the example text generated from War of the Worlds for this latter case:

BOOK ONE THE EVE OF THE WAR

No one would have left an abiding sense of smell, but it had a pair of very large dark eyes of a Martian from the Martians making their blue shirts, dark trousers, and singers.

I just love the idea of the Martians making their blue shirts and dark trousers.

Anyway, I also experimented with Markov chains that use previous words instead of characters, but in reality an order-10 Markov chain based on characters would work very well.
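A character-based order-k chain needs very little code. The sketch below is an illustration rather than the article's implementation: it records which character follows every k-character window of the source text, then walks those records at random. The function names are made up, and `order` must be at least 1.

```python
import random
from collections import defaultdict


def build_model(text, order):
    """Map each `order`-character window to the characters seen right after it."""
    model = defaultdict(list)
    for i in range(len(text) - order):
        model[text[i:i + order]].append(text[i + order])
    return model


def generate(model, order, seed_text, length, rng=random):
    """Extend seed_text one character at a time until it reaches `length`."""
    out = seed_text
    while len(out) < length:
        followers = model.get(out[len(out) - order:])
        if not followers:          # this window never appeared mid-text: stop
            break
        out += rng.choice(followers)
    return out


source = "the cat sat on the mat and the rat sat on the hat "
model = build_model(source, order=3)
print(generate(model, 3, "the", 60))
```

Storing repeated followers in a list means `rng.choice` picks each next character with the same frequency it had in the source. Raising `order` makes the output read more like the original, until it starts reproducing whole passages verbatim.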

This article first appeared in issue 272, September 2008.

You can download the PDF here.

Now playing:
Art of Noise - Beat Box (Diversion One)
(from The Best of the Art of Noise)


About Me

I'm Julian M Bucknall, the M because it's my middle initial and because I and the other Julian Bucknall (the movie guy) would like to differentiate ourselves.

I'm a programmer by trade, an actor by ambition, and an algorithms guy by osmosis. I write articles for PCPlus in my spare time, not that there's much of that.

Apart from that, an ex-pat Brit, atheist, microbrew enthusiast, Pet Shop Boys fanboy, slide rule and HP calculator collector, amateur photographer, Altoids muncher.

DevExpress

I'm Chief Technology Officer at Developer Express, a software company that writes some great controls and tools for .NET and Delphi. I'm responsible for the technology oversight and vision of the company.
