PCPlus 272: Generating gobbledygook

I write a monthly column for PCPlus, a computer news-views-n-reviews magazine in the UK (actually there are 13 issues a year, since there's an Xmas issue as well, so it's a bit more than monthly). The column is called Theory Workshop and appears in the back of every issue. When I signed up, my editor and the magazine were gracious enough to allow me to reprint the articles here after, say, a year or so. After all, the PDFs do appear on each issue's DVD after a few months. So when I buy the current issue, I publish the article from the issue a year earlier. Since I've now got September's issue (and have had it for a couple of weeks), here's September 2008's article.

Pure fun this time: generating random text. The article shows that generating pure random text, where every character has an equal probability of appearing next, doesn't work particularly well. Enter Markov chains, where the probabilities of what comes next are skewed by what has just appeared. First we look at characters: in an order-1 Markov chain the next character depends only on the previous character, and we go all the way up to an order-10 Markov chain, where the next character depends on the previous 10 characters. I particularly like the example text generated from War of the Worlds for this latter case:

BOOK ONE THE EVE OF THE WAR

No one would have left an abiding sense of smell, but it had a pair of very large dark eyes of a Martian from the Martians making their blue shirts, dark trousers, and singers.

I just love the idea of the Martians making their blue shirts and dark trousers.
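For the curious, here is a minimal Python sketch of a character-based Markov chain along the lines described above. It is not the code from the article: the names `build_model` and `generate` are made up for illustration, and seeding the output with the opening characters of the source text is an assumption on my part.

```python
import random
from collections import defaultdict


def build_model(text, order):
    """Map each run of `order` characters to the characters seen right after it."""
    model = defaultdict(list)
    for i in range(len(text) - order):
        model[text[i:i + order]].append(text[i + order])
    return model


def generate(text, order, length, rng=None):
    """Generate up to `length` characters from an order-`order` chain over `text`."""
    rng = rng or random.Random()
    model = build_model(text, order)
    out = text[:order]  # assumption: seed with the opening of the source text
    while len(out) < length:
        # Look up the last `order` characters generated so far.
        followers = model.get(out[len(out) - order:])
        if not followers:  # dead end: this context only occurs at the very end
            break
        # Repeated followers appear several times in the list, so a uniform
        # pick here reproduces the skewed probabilities of the source.
        out += rng.choice(followers)
    return out[:length]
```

Calling something like `generate(source_text, 10, 500)` on the text of a novel should produce output in the spirit of the sample above. The higher the order, the more the output simply quotes long stretches of the source.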

Anyway, I also experimented with Markov chains that use the previous words instead of the previous characters, but in practice an order-10 Markov chain based on characters already works very well.
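The word-based variant can be sketched the same way, with tuples of words as the context instead of runs of characters. Again, this is an illustrative sketch rather than the article's code: `generate_words` is a hypothetical name, and splitting on whitespace (so punctuation stays glued to its word) is a simplifying assumption.

```python
import random
from collections import defaultdict


def generate_words(text, order, count, rng=None):
    """Generate up to `count` words from an order-`order` word chain over `text`."""
    rng = rng or random.Random()
    words = text.split()  # assumption: whitespace-separated tokens count as words
    model = defaultdict(list)
    for i in range(len(words) - order):
        # Key on the previous `order` words, record the word that followed them.
        model[tuple(words[i:i + order])].append(words[i + order])
    out = words[:order]  # seed with the opening words of the source
    while len(out) < count:
        followers = model.get(tuple(out[len(out) - order:]))
        if not followers:  # dead end: context only occurs at the end of the text
            break
        out.append(rng.choice(followers))
    return " ".join(out[:count])
```

With words as the unit, even a low order such as 1 or 2 carries a fair amount of context, whereas characters need a much higher order to get the same effect.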

This article first appeared in issue 272, September 2008.

You can download the PDF here.

Now playing:
Art of Noise - Beat Box (Diversion One)
(from The Best of the Art of Noise)



About Me

I'm Julian M Bucknall, the M because it's my middle initial and because I and the other Julian Bucknall (the movie guy) would like to differentiate ourselves.

I'm a programmer by trade, an actor by ambition, and an algorithms guy by osmosis. I write articles for PCPlus in my spare time, not that there's much of that.

Apart from that, an ex-pat Brit, atheist, microbrew enthusiast, Pet Shop Boys fanboy, slide rule and HP calculator collector, amateur photographer, Altoids muncher.

DevExpress

I'm Chief Technology Officer at Developer Express, a software company that writes some great controls and tools for .NET and Delphi. I'm responsible for the technology oversight and vision of the company.
