[stack] Evolutionary Programming
2008-11-10 01:59:51 GMT
It's been a long time since I've participated in this forum, but it hasn't
been because of loss of interest. Mostly, I've been preoccupied with other
things. I'm very glad to see that this forum is still active.
I thought I'd report on an evolutionary programming project using Joy as the
language of the evolved programs. Several months ago, I started this project
but only worked on it for an hour or two, and then set it aside until the
day before yesterday.
The idea is to write a program that writes Joy programs, tests them for
fitness against samples of the task data, where the task is do something
like handwriting recognition. Those that produce the most nearly correct
results are saved and used as parents for the next generation, which will
consist mostly of variations on each of the parent programs. Because each
one will be tested in isolation against the same test data, there will be no
direct competition for resources (as there would be in Tom Ray's Tierra).
1. A program generator takes as input an existing program and returns a
small modification of it, such as adding, deleting, or replacing a single
atom or item in a list, etc. (I'll talk in Joy terms, since I'm not
sufficiently knowledgeable about any of the other languages discussed in
this forum.) Use this generator to generate a slew of offspring from each of
the programs, if any, that made it through fitness testing for the previous
3. Execute each generated program on the test data and measure fitness based
on the results. Save the programs with the highest fitness rating and
discard all the rest.
4. When completely successful code is produced, stop.
5. Otherwise go back to step, using the most successful programs so far as
parents of the next generation.
Why Use Joy?
I considered using Prolog (before I ever heard of Joy) for this kind of
thing, because prolog programs can effectively modify themselves in useful
ways (by adding and deleting rules, mainly), and because it has an automatic
solution-tree search as the core of the way Prolog programs work.
However, a modified Joy is a lot easier to work with because Joy has no
syntax to speak of (although Prolog's syntax is simple compared to that of,
say, C++), except at the lowest level (lists must have a bracket at each
end, numbers can't have two decimal points, etc.). This makes generation and
modification trivial, because the code doing either doesn't have to worry
about complex control constructs, and it doesn't have to worry about it in
making modifications. Once you have code for properly recognizing aggregate
data types (lists, strings, etc.), your work is largely done.
There is a problem, but it is relatively minor: Exceptions, such as dividing
by zero. Normally, these can crash a program, and code generated
unintelligently can easily lead to such errors.
The solution is a modified Joy that basically handles all errors by ignoring
them. In Visual Basic, this can be done with a "Try" block, which can be
used to cause execution to resume after the block if an error occurs in it.
It can also be done with two versions of an "On Error" statement: "On Error
Resume Next" and "On Error Goto <label>" (where <label> is a label at a
location in the code where you want execution to resume if an error occurs).
Once one has a version of Joy that won't crash on errors (simply by
modifying a bunch of the primitives), one can then generate truly arbitrary
string of code, as long as it is syntactically correct, and they will
"execute" (perhaps doing nothing much at all, but still terminating without
error (infinite loops are still possible in combinators that repeat chunks
of code, but even this can be counteracted by putting a fixed limit of, say,
one million, on the number of iterations).
Sticking closely to the evolutionary paradigm can mean that one must start
with truly simple versions of the task, such as distinguishing between two
values of a variable. If we start with samples of the full task, such as
recognizing full documents of text correctly, we won't be able to provide
any fitness tests that are both meaningful and useful, and it could be
billions of years before anything works. Narrowing the task down to
something completely simple allows for some fitness to be discovered early
on (then we increase the difficulty of the task a little).
So, I'm planning to start my own evolver on a single Boolean variable. When
the evolver finally produces one that can handle this task reliably, I can
switch to having it try to make code that correctly reports the values of
two Booleans. At some point, I should be able to add code for creating
simple images in memory, and start over with one-pixel images, and only one
distinction (white vs. non-white, perhaps).
Eventually, I should be able to use images with enough pixels to represent
symbols such as plus-signs and minus-signs. From there, I should be able to
start using actual image files, which will require scanning samples of my
handwritten material (over two hundred pounds of the stuff, written over the
past twenty-odd years) and cutting it into small, one-character image files.
This will also require that I provide the correct text for each sample. At
first, I will probably just include the text in the names of the files (and
make sure that the generated code has no access to this information, to
avoid evolving code that only reports text from the file names!).
For now, I'll be happy just to get my evolver working well enough to evolve
code to make some simple distinction reliably, even if I have to hand-tweak
mutation-rates and types of mutations allowed. I'm using what will be my
third implementation of Joy (more of a mini-Joy, in this case). (I've got
one implementation in Visual Basic (Visual Studio) and another in Visual
Basic for Applications (the macro language for the Microsoft Office programs
-- Microsoft Word in this case).
My Visual Studio version of Joy "compiles" to lists of what I call "code
objects," and this makes for a very generalized implementation and allows
for easy de-compilation, etc., but it's also slow, so my new version will
not use "objects," and this will eliminate at least one level of
indirection, in that it will use "delegates" (which are basically pointers
to functions, but with type-checking for parameters and return types). It
will also not "compile" code but will only process code into arrays of
strings (even lists with many elements will each just be a single string
until it is "executed" (pushed onto the stack), at which point it will be
made into an array of strings (one string for each element).
This version of Joy will also be the least faithful to Manfred's version.
I'm more interested, this time, in making a language that simply has a
suitable set of primitives for my evolutionary programming purposes. One
difference will be that executing code can define new functions (and
re-define them, thus making them into variables). This violates Joy's
variable-free nature, especially if I allow non-local variables (but I
haven't decided on this yet).
It's interesting that Joy code is more like the genetic code of DNA than
ordinary programming languages are, in that DNA doesn't have much in the way
of syntax, and it doesn't have any explicit looping or other typical control
constructs. DNA is constructed in a concatenative way, as long chains,
without loops or other interconnections. Of course, this parallelism is very
limited, but it suggests why I think Joy is a good programming language for
evolutionary programming: The only structuring is done by defining functions
and making chunks of code into lists. Both of these are forms of
modularization (and defining functions is basically just naming a list
(quoted program) so it can be used by reference rather than being included
bodily (which also requires an extra step to execute its contents as code).
I had hoped to have some evolving going on by now (Sunday evening), but I
took time out to write this post so I may not get to actual evolution for a
day or so, depending on available time. But, I'm close.
I'm working on the Joy source code preprocessor now. Since I've done it
twice before and since I'm using a simplified version of Joy, this shouldn't
I'll report on results, soon, I hope.