
Extreme Improbability – a Practical Example (without using monkeys)

June 21, 2013

About a year ago, I decided to tackle something that has been on my mind as a “bucket list” item for a long time.  I decided to create an experiment that would serve as a practical example of extreme improbability, because I was thinking about the foundational principle behind the idea of macro-evolution – namely, that order (and even life) can come spontaneously out of disorder, if you just try enough times.

In my mind, I was thinking of the old adage about a thousand monkeys, hammering away on a thousand typewriters.  The old saying goes that if you let them bang on those keys for enough time, eventually they’ll reproduce Shakespeare’s Hamlet.

I definitely recognize that that’s not the same thing as molecules joining together under extraordinary circumstances to create the first strand of DNA, but to my way of thinking, the monkeys concept is a much simpler idea than producing life, and anyway I don’t have a particle accelerator.  I know what you’re thinking – Tony Stark built one in his garage using some rubber tubing and a monkey wrench; well, I’m sorry, I’m not Tony Stark.

Anyway, this is what I came up with.  I wrote a C# application that starts with a list of possible characters.  In this case, that was the alphabet (capital and lowercase), a space, a hyphen, apostrophe and quotation marks, colon and semi-colon, a period and a comma, for 60 characters in all.  The application selects one character at random, appends it to a new string, then checks whether that string matches the opening characters of a pre-determined target text.  If it does, it grabs another character and runs the same test.  If we get to the end of the target – that is, if we have a perfect match – then we’re done, but as soon as the program selects a random character that doesn’t fit the target, that attempt is counted as a failure and the next one starts from scratch.  The program keeps track of the number of attempts it has made, and it also records the longest run of characters it has managed to match.
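The procedure described above can be sketched in a few lines. The original was a C# application whose source isn't shown here, so this is only a minimal Python sketch under my own assumptions: the names `CHARSET`, `run_trial` and `run_experiment` are hypothetical, and a straight double quote stands in for "quotation marks".

```python
import random
import string

# Assumed 60-character set: 52 letters plus space, hyphen, apostrophe,
# double quote, colon, semicolon, period and comma.
CHARSET = string.ascii_letters + " -'\":;.,"


def run_trial(target, rng):
    """Draw random characters until one misses; return how many matched."""
    matched = 0
    while matched < len(target) and rng.choice(CHARSET) == target[matched]:
        matched += 1
    return matched


def run_experiment(target, attempts, seed=None):
    """Run repeated trials, tracking the attempt count and the best match."""
    rng = random.Random(seed)
    best = 0
    for attempt in range(1, attempts + 1):
        matched = run_trial(target, rng)
        best = max(best, matched)
        if matched == len(target):
            return attempt, best  # perfect match: stop early
    return attempts, best


if __name__ == "__main__":
    tries, best = run_experiment("Four score and seven years ago", 100_000, seed=1)
    print(f"{tries} attempts, longest match: {best} characters")
```

Each trial stops at the first miss, so most attempts cost only a single random draw, which is why a loop like this can get through billions of attempts if you leave it running long enough.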

The text I selected as my pre-set pattern was the Gettysburg Address, as depicted below:

Four score and seven years ago our fathers brought forth on this continent, a new nation, conceived in Liberty, and dedicated to the proposition that all men are created equal. Now we are engaged in a great civil war, testing whether that nation, or any nation so conceived and so dedicated, can long endure. We are met on a great battlefield of that war. We have come to dedicate a portion of that field, as a final resting place for those who here gave their lives that that nation might live. It is altogether fitting and proper that we should do this. But, in a larger sense, we cannot dedicate-we cannot consecrate-we cannot hallow-this ground. The brave men, living and dead, who struggled here, have consecrated it, far above our poor power to add or detract. The world will little note, nor long remember what we say here, but it can never forget what they did here. It is for us the living, rather, to be dedicated here to the unfinished work which they who fought here have thus far so nobly advanced. It is rather for us to be here dedicated to the great task remaining before us-that from these honored dead we take increased devotion to that cause for which they gave the last full measure of devotion-that we here highly resolve that these dead shall not have died in vain-that this nation, under God, shall have a new birth of freedom-and that government of the people, by the people, for the people, shall not perish from the earth.

Feels just like a trip back to third-grade history class, doesn’t it?  This exceptional speech is made up of ten sentences, 1,448 characters, 261 words.  By way of comparison, Hamlet has 29,551 words and 143,428 characters, so it has about 100 times as many characters to match.  DNA is so complex that, according to George Church of Harvard (http://hms.harvard.edu/news/writing-book-dna-8-16-12), it takes more than 70 terabytes of data storage to hold the information contained in the human genome.  I don’t know how many characters that would be, but my full text of Hamlet (in Word format) only takes up about 300 KB.

All of that is to say that my experiment here is extremely small-scale, and I’m very aware that there are other problems and flaws with it (for example, the character set I used is not really a good analog for the more than 100,000 different types of proteins in the human body), but as I mentioned earlier, I’m not trying to prove or disprove anything.  I just wanted to see a practical example of an attempt to spontaneously produce order out of disorder, based on the premise that all we need is the basic materials, and the patience to keep trying.  So, for my purposes, this was good enough.

Anyway, with all of that in place, I turned my app on and let it run for a while.  Well, a while turned into a few days, and then a few days turned into a few weeks.  At some point I got sick of watching it run, so I converted it into a Windows service and had it run in the background.

At that point, I basically forgot about it.  But a few months later I remembered to check my logs, and here’s what I found.  After 2,147,483,647 attempts, all that I had managed to match was “Four sc”.  The geeks out there might recognize that number – that’s the maximum value of a signed 32-bit integer (int.MaxValue) in C#.  In other words, I reached the limit of C#’s ability to count (at least in 32-bit integer form) before I had even managed to match two complete words.
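To put that counter limit in perspective, here is a small Python sketch (not the original C# code) that compares the 32-bit and 64-bit integer limits with the average number of attempts needed to match a prefix of a given length. It assumes a 60-character set with every character equally likely; `expected_attempts` and `longest_prefix_within` are hypothetical helper names.

```python
INT32_MAX = 2**31 - 1   # 2,147,483,647, same value as int.MaxValue in C#
INT64_MAX = 2**63 - 1   # same value as long.MaxValue in C#
CHARSET_SIZE = 60       # assumed size of the character set


def expected_attempts(prefix_length, charset_size=CHARSET_SIZE):
    """Average number of attempts before one matches the first prefix_length characters."""
    return charset_size ** prefix_length


def longest_prefix_within(limit, charset_size=CHARSET_SIZE):
    """Longest prefix whose expected attempt count still fits under limit."""
    length = 0
    while expected_attempts(length + 1, charset_size) <= limit:
        length += 1
    return length


if __name__ == "__main__":
    print(f"32-bit limit: {INT32_MAX:,}")
    print(f"Average attempts for 'Four sc': {expected_attempts(len('Four sc')):,}")
    print(longest_prefix_within(INT32_MAX), longest_prefix_within(INT64_MAX))
```

Switching the counter to a 64-bit integer would raise the ceiling enormously, but under these assumptions it would still only cover the average cost of matching the first ten characters or so.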

Eventually I started doing the math on the probability of these matches, and the numbers are pretty staggering.  Here’s what I found:

1 character: 1/60 or 1.667%

5 characters: 1/777,600,000 or 0.000000129%

10 characters: 1/604,661,760,000,000,000 or 0.000000000000000165%
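For anyone who wants to recompute these odds, here is a minimal Python sketch using exact integer arithmetic. It assumes each of the 60 characters is equally likely and each pick is independent; `match_odds` is a hypothetical helper name.

```python
from fractions import Fraction

CHARSET_SIZE = 60  # assumed size of the character set


def match_odds(n_chars, charset_size=CHARSET_SIZE):
    """Return (denominator, percent) for matching n_chars in a row."""
    denominator = charset_size ** n_chars          # exact, no rounding
    percent = float(Fraction(100, denominator))    # chance as a percentage
    return denominator, percent


if __name__ == "__main__":
    for n in (1, 5, 10):
        denominator, percent = match_odds(n)
        print(f"{n:>2} characters: 1/{denominator:,} or {percent:.3g}%")
```

Python integers have no fixed width, so the denominator stays exact no matter how many characters you ask for; only the percentage is rounded for display.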

I’ll let you draw your own conclusions from there, but for myself, I am convinced that there is a tipping point where a pattern becomes so complex that it will never emerge on its own, even if you have an ideal circumstance, with exactly the right components thrown into the mix, and the freedom and patience to try an infinite number of times.  And honestly, that tipping point isn’t very far down the road.  Case in point?  Look again at the odds for matching ten characters, and then remember that Hamlet has over 143,000 characters to match.  My calculator app literally crashed when I tried to calculate that probability.
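A calculator chokes on numbers that size, but logarithms sidestep the problem: instead of computing 60 raised to the number of characters, you compute how many powers of ten it contains. Here is a short Python sketch, again assuming 60 equally likely characters; `odds_exponent` is a hypothetical helper name, and the character counts are the ones quoted earlier in this post.

```python
import math


def odds_exponent(n_chars, charset_size=60):
    """Return log10(charset_size ** n_chars): the odds are about 1 in 10**exponent."""
    return n_chars * math.log10(charset_size)


if __name__ == "__main__":
    for label, n_chars in (("Gettysburg Address", 1_448), ("Hamlet", 143_428)):
        print(f"{label}: about 1 in 10^{odds_exponent(n_chars):,.0f}")
```

The result is the exponent itself, so the odds for Hamlet come out as a 1 followed by roughly a quarter of a million zeros.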

I hope some of you will find this interesting.  I know it doesn’t have much to do with my usual blog topics, but it was a satisfying journey, and it helped me put some of these ideas in perspective in a way I’d never really done before.  And on top of all of that, I got to check an item off of my bucket list.

Now I just need to decide what to tackle next.

2 Comments
  1. June 21, 2013 8:20 am

    Tony Stark wishes he was you =-) We are so lucky to have such a smart and curious man leading our technological endeavors! Thank you for seeing this through and drawing these fascinating comparisons.

  2. June 22, 2013 9:10 am

    Your idea that “order (and even life) can come spontaneously out of disorder, if you just try enough times” made me wonder if Chaos Theory might offer another explanation for the Gettysburg Address, e.g. could a monkey farting in Uganda in 1790 have caused Lincoln to write Gettysburg Address—somebody wouldn’t have run away from the smell into the hands of a slave trader, and so on.
    Chaos Theory might explain a causal chain after the fact, but to use it as a predictive tool would run into even longer odds than in your experiment. So much for that idea.
    Now I can’t get your image of “a thousand monkeys hammering away on a thousand typewriters” out of my head. Thanks.
