Since I've wasted most of today on this (oh feature creep. And an extra "-1".), I thought I'd share. Reading around on Pharyngula, I found myself wanting to make a more customizable, even-less-preprogrammed version of Richard Dawkin's WEASEL program. So, I present you with a Python 3.0+ version of Dawkin's "Methinks it is a weasel" program. It will take any nonzero length starting string, has a maximum generational cap for local maxima, with customizable rates for single-bit errors (like SNPs), duplication errors, number of offspring, weighting for approaching the target sequence, and accepted variability, and whether bad mutations are penalized or not.
The point of the program, as published in The Blind Watchmaker, is to show that with random mutations over time you can expect to see something like order pop out of a random test string. The version I have below takes the genetic modelling a bit further, starting with any string, regardless of length, and duplication / omission errors will trim down to the right solution. In debug mode, there's an additional completely random selection every 500 generations to help pull the program out of local maxima, but it is not in the primary program.
Click the fold to read the source, or download it here
You should be able to copy-paste this into any text editor and save it as a *.py file — but if I have a spacing error and the program doesn't work, just download it from the link above. The file is tab-spaced instead of single-spaced, so it may be easier to read.
# Dawkin's "METHINK IT IS LIKE A WEASEL" program. Latch free and highly configureable # Inspired by http://scienceblogs.com/pharyngula/2009/08/but_why_would_dawkins_want_to.php # CC-BY-SA_3.0 Philip Kahn 2009 import random, math, time, calendar #The starting string. In capitals. Random start is on by default. wstring = "A" # Can change to goal alphabet, such as including lower case, or GTAC. selector = "ABCDEFGHIJKLMNOPQRSTUVWXYZ ,.'-?;!:" # Target output. soln = "METHINKS" #"TO BE, OR NOT TO BE: - THAT IS THE QUESTION: - WHETHER 'TIS NOBLER IN THE MIND TO SUFFER THE SLINGS AND ARROWS OF OUTRAGEOUS FORTUNE, OR TO TAKE ARMS AGAINST A SEA OF TROUBLES, AND BY OPPOSING END THEM? - TO DIE: - TO SLEEP; - NO MORE; AND BY A SLEEP TO SAY WE END THE HEART-ACHE AND THE THOUSAND NATURAL SHOCKS THAT FLESH IS HEIR TO, - 'TIS A CONSUMMATION DEVOUTLY TO BE WISH'D. TO DIE, - TO SLEEP; - TO SLEEP! PERCHANCE TO DREAM: - AYE, THERE'S THE RUB; FOR IN THAT SLEEP OF DEATH WHAT DREAMS MAY COME?" # Mutation rates pointrate = .05 #Accepts to .001 percent. duprate = .01 #Accepts to .001 percent # Number of offspring the sample population will have, # starting from one individual, in the generation time. children = 100 generationtime = 6 # Generation time in years. For final output. # Evolutionary peak stopper. #Stops when reaches a local maxima for N generations. localpeak = 20000 ## Would reccommend 500 000 / generationtime, or ## a half million years, for physical resembelance; ## 250000/children for speed. # What degree of unoptomized individuals in a generation prosper (score variance). var_mult = 1 ## Strongly reccommend > 0.3 #How "valued" are correct characters? cstringweight = .3 ## 1 would be equally valued to everything; ## 2 (default) values correct characters over expansion of genome. ## Values of less than 1 selects for expansion more than correctness. ### Switches randstart = 1 #Sets the start string to something random of random length. penalty = 1 # If mismatches are penalized. 1 for yes, 0 for no. debug = 0 #Display debug code. 1 for yes, 0 for no. ################## End Configuration Here ################## starttime = time.gmtime() # Random start parameters if randstart == 1: stringlength = random.randrange(1,2*len(soln)) # No more than 2x starting length while len(wstring) < stringlength: wstring = wstring + selector[random.randrange(0,len(selector))] print("Random starting string selected. Beginning with string: "+wstring) charweight = (1/len(soln)) var = var_mult * charweight i = 0 #counter scoreold = 0 matchold = 0 nimp=0 lgchanges = 0 wstringn = wstring starts = wstring while wstring != soln: j = 0 wstringa =  while j < children: wstringa.append(wstring) #creates a $children number of copies # See if there is a bit replication wstringn = wstringa[j] ratecheck = random.randrange(1,10000)/10000 if ratecheck < duprate: num = math.floor(duprate/ratecheck) #Number of duplications that appear slength = len(wstringn) if slength !=1: bit = random.randrange(0,slength) else: bit = 0 #you can only duplicate bit 0 if there is one bit if random.random() > .5: #duplication or deletion? if bit !=0: wstringn = wstringn[:bit] + wstringn[bit-num:] #duplicate a random bit else: wstringn = wstringn + wstringn #fix for single character else: wstringn = wstringn[:bit+1] + wstringn[bit-num:] #delete a random bit ratecheck = random.randrange(1,10000)/10000 if ratecheck < pointrate: num = math.floor(pointrate / ratecheck) #number of SNPs (single-bit changes) nc = 0 while nc < num: #loop over it slength = len(wstringn) bit=random.randrange(0,slength,1) letter = selector[random.randrange(0,len(selector),1)] while letter == wstringn[bit]: # Don't replace with same letter letter = selector[random.randrange(0,len(selector),1)] if debug ==1: if nimp > 500: print(" Before: "+wstringn+" @ bit "+str(bit)+" with "+letter) if bit!=0: wstringn = wstringn[:bit] + letter + wstringn[bit+1:] #replace a random bit with a random letter else: wstringn = letter + wstringn[1:] nc +=1 if debug == 1: if nimp > 500: print(wstringn) # Mutations are done on this child. Replace the stored version with this one. wstringa[j]=wstringn j+=1 # Increment child number and repeat for next child # Score the children wstringscore=0 wstringscore =  n=0 matchhigh = 0 for x in wstringa: if len(soln) > len(x): lenweight = len(x) else: lenweight = 2*len(soln) - len(x) #for when it goes above the target length score = lenweight*charweight #Score for length # Check each bit, and score for correctness k = 0 scoremax = 0 match = 0 while k < len(x): if k < len(soln): if x[k] == soln[k]: # Score correct bits up to length of target score = score + charweight*cstringweight match += 1 elif penalty == 1: score = score - charweight #If penalties are on, penalizes wrong bits else: score = score - charweight*cstringweight #penalizes extra bits, longer than target k += 1 if match > matchhigh: # Checks matched bits matchhigh = match if matchhigh > matchold: if debug == 1: print("High match, "+str(matchhigh)+"|"+str(matchold)+": "+x) matchold = matchhigh if debug == 1: print(score,n) if debug == 1: print(wstringscore) if debug == 1: if nimp > 500: print(score) wstringscore.append(score) # add the score to the list n+=1 # Create a list of all items with max score, within variability tolerances m = 0 higharr=0 higharr =  while m < len(wstringscore): if wstringscore[m] > max(wstringscore)-var: # To allow variability higharr.append(m) m += 1 index = random.randrange(0,len(higharr)) # Selects random individual from most-fit population if wstringscore[higharr[index]] < max(wstringscore): if debug == 1: print("ERROR 1: Not most fit individual selected, or rounding error") # Debugging if debug == 1: time.sleep(1) # Debugging elif wstringscore[higharr[index]] < scoreold: if debug == 1: print("ERROR 2: Fittest individual is less fit than any in previous generation.") # Debugging if debug == 1: time.sleep(1) # Debugging else: improvement = wstringscore[higharr[index]] - scoreold ###Book-keeping for generational cap scoreold = wstringscore[higharr[index]]### if improvement == 0: if lgchanges != scoreold: lgchanges = scoreold lastnovel = time.gmtime() nimp += 1 ### else: nimp = 0 ### if debug == 1: print("Improvement: "+str(improvement)) # Debugging if improvement < 0: if debug == 1: print("ERROR 3: Fittest regression from peak.") # Debugging if debug == 1: time.sleep(1) # Debugging winner=higharr[index] # Selects the actual individual if debug == 1: print("Winner: "+str(higharr[index])) wstring = wstringa[winner] # Stores individual for next loop if debug == 1: if i % 500 == 0: wstring = wstringa[random.randrange(0,len(wstringscore))] # Every 500 generations we just pick a random individual i+=1 ## Output formatting, and remainders from debugging cnum=higharr[index] for good in higharr: if good == winner: wstring = wstringa[good] if cnum < 10: cnum = " "+str(cnum) else: cnum = str(cnum) lenmatch = len(wstring)/len(soln) charmatchp=matchhigh/len(soln) totalmatch = (1-lenmatch*charmatchp)*100 if debug ==1: scoreoutput = "Score: " + str(wstringscore[higharr[index]]) + " with " else: scoreoutput = "Currently with " ## output if len(soln) > 30: if i % len(soln) == 0 or wstring == soln or i == 1: print("Generation " + str(i) + ", child "+cnum+": " + wstring + ". "+scoreoutput+str(matchhigh)+" character matches and "+str(len(wstring)/len(soln))+" length match for "+str(totalmatch)+"% error.") else: print("Generation " + str(i) + ", child "+cnum+": " + wstring + ". "+scoreoutput+str(matchhigh)+" character matches and "+str(len(wstring)/len(soln))+" length match for "+str(totalmatch)+"% error.") if nimp > localpeak: wstring = soln # Escapes out from local peak #Final outputs if nimp < localpeak: duration = calendar.timegm(time.gmtime()) - calendar.timegm(starttime) print("Goal "+str(len(soln))+"-bit sequence achieved from source "+str(len(starts))+"-bit sequence in "+str(generationtime*i)+" years with a "+str(len(selector))+"-bit alphabet in "+str(duration)+" seconds.") try: truerand = math.pow(len(selector),len(soln)) statden = 100/truerand except OverflowError: truerand=str(len(selector))+"^"+str(len(soln)) statden = 0 individ = children * i statden = individ * truerand print("Wholly random probability (non-selective): 1 in "+str(truerand)+". Individuals processed: "+str(individ)+", or "+str(statden)+"% random-time.") else: duration = calendar.timegm(lastnovel) - calendar.timegm(starttime) print("Evolutionary local maxima reached in "+str((i-localpeak)*generationtime)+" years. Further improvement required too much regression (total evolution stopped after "+str(i*generationtime)+" years). Difference from goal was "+str(totalmatch)+"% in "+str(duration)+" seconds.") print("Run again, or consider increasing the local maxima variable, or decreasing the weight of correct letters.")
It's pretty startling how fast it converges (if it doesn't reach a local max). Here, it converges in as little as 6500 generations, which would be 13,000 years with the parameters I have. The performance hogging is all from printing it out to the screen, but that's half the fun.
Hey Dembski, is that good enough for you? If so, I'll take a copy of Dawkin's book, please :-D
Information and Links
Join the fray by commenting, tracking what others have to say, or linking to it from your blog.
Vitalita Derma Reviews
Vitalita Derma Reviews | 03/02/2015 00:10
Vitalita Derma Reviews
Vitalita Derma Reviews | 03/02/2015 00:07