Evolutionary Computation and
Artificial Life – Assignment 1
Igal Shilman ID: 304369283 email:
shilman@cs.bgu.ac.il
Question 1
Here is the list of all the schemata that 11010 belongs to:
11010
1101*
110*0
110**
11*10
11*1*
11**0
11***
1*010
1*01*
1*0*0
1*0**
1**10
1**1*
1***0
1****
*1010
*101*
*10*0
*10**
*1*10
*1*1*
*1**0
*1***
**010
**01*
**0*0
**0**
***10
***1*
****0
*****
Question 2
The average fitness is:
where m(H) is the number of instances matches
H in the population.
a.
![]()
![]()
![]()
![]()
b.
![]()
c.

d.

e.
![]()
f.
![]()
g.
If
the assumption that all strings are present in the population were
dropped then
We will get less accurate approximation to the average fitness of the schemata
For instance:
![]()
But
if only the strings 00100, 00110 and 00010 were present then:
![]()
We are sampling fewer values therefore we
get less accurate average.
Question 3
1.
a. o(*****111) = 3 ,l(*****111) = 2 o(1*****1*) = 2 , l(1*****1*)=6
b. the probability for the survival of the schema *****111 after mutation is:
![]()
And the prob’ for 1*****1* to survive is:
![]()
c.
The probability that
crossover will destroy a schema is not greater then:
therefore the probability to survive is at
least: ![]()
So for *****111 the probability to survive is : 0.75
And for 1*****1* is : 0.27
Question 4 (the code is at /ass1/q4.scm)
The goal is to find the minimum of the real valued function g[x,y] in the domain [-2,2]
![]()

As stated in the question description we’ll use the following operators:
|
Parameters |
Kind |
Operator |
|
|
Fitness proportionate selection |
Selection |
|
With a
crossover rate of 0.7 |
Single point crossover |
Recombination |
|
Mutation rate of 0.001 |
Bitwise mutation |
Mutation |
Fitness function: ![]()
Population size:
100 genomes.
Generations:
1000
Representation:
The genome is a binary string of length 2n
representing two integers each of
bits.
The first
bits represent x
and the rest represent y
![]()
![]()
x and y are in the
range
And they should be mapped to real numbers in
the [-2,2] interval, therefore they must be scaled. It can be done via simple
transformation:
Which maps integers from
to reals in [-2,2].
The recombination and mutation will be
used without any special adaptations.
Experiments:
In the first experiment let’s define n =
10, thus it is possible to sample 1024 different points (in jumps of
)
and run execute the GA few times.
Run 1 - 3:
In all the runs the GA converged to the
point (-2/1023, 238/1023) which gave a
fitness value of 0.895956644650997 and g evaluated at this point to
0.116125435276537 which is pretty close to the minimum value that “Mathematica”
found.
, in all the runs the best individuals became
dominant quite early – in generations 34, 11 and 7 and quickly took over the
entire population in few generations using selection and recombination.
In run 3 (for example) we can see how the individual takes over
the population and how quickly the average fitness “catches up” with the best
fitness. The best individual appeared in the population in the 7th
generation. So we can see at the graph below, the best /average fitness up to
the 15th generation rising rapidly.

After the super individual took over the population the average
fitness is bounded between 0.84 and 0.9 and converges quickly to the best
fitness.
Best and average values after super individual took over

Run 4
This time we’ll set n = 16 thus we are able to sample 65536
different points (in jumps of
) so we’d except to see an higher precision
and to get a value that is closer to the minimum value in the domain which is
approximately ~ 0.116016
.
And indeed this run gave us the point (-2/65535 15238/65535) in
generation 345 which holds:
![]()
Summery
Despite the fact that this function has lots of local minimums in that range, the GA
succeed to converge to a good value, yet
the convergence usally happened in first (< 100 )generations of
evolution. so maybe a diffrent operators
or representation is needed here, for instance maybe rank selection that make
it harder for a supper individual to take over the population or maybe dynamic
mutation paramter that will favor exploration at the beginning and at later
phase it will favor explotation of good regions.
Question 5
If
after a certain number of generations fitness no longer increases then
it is possible that the algorithm had reached the optimum, but it is more
likely that the algorithm converged prematurely to a local optimum.
In that case either a different selection method should be considered like rank selection
That makes it harder for a super individual to take over the population and to destroy the variance in the population, or maybe higher mutation rate should be considered to allow the algorithm to escape from a local optimum.
Question 6 (the code is at /ass1/q6.scm)
The goal is to find a subset
(list of primes less then 100) which sums up
to 437.
Representation:
The genome is a binary string of length 25.
And the genotype to phenotype will be evaluated as follows:
Where iff genome[i] = 1 then the prime[i] will be included in subset S.
Fitness function: ![]()
Where ![]()
(
is the ith prime number.)
is
in the range [0,1060] therefore
is
in the range [0,623]
Plot of f(x) in [0,1060]

The maximum of
in
that range, is at,
(
)
So trivially if we’ll
find the maximum of
then the binary string will indicate the
subset of the primes that sums up to 437.
|
Parameters |
Kind |
Operator |
|
|
Fitness proportionate selection |
Selection |
|
With a
crossover rate of 0.7 |
Single point crossover |
Recombination |
|
Mutation rate of 0.001 |
Bitwise mutation |
Mutation |
Population size:
50 genomes.
Generations:
100
Experiments:
Runs 1-3:
All the first 3 runs succeeded finding a subset that’s sums up to 437, and
managed to do so in generations 4 , 5 and 15. This means that all of them
received the max fitness value: 623 Here is one genome for example: (1 0
1 1 1 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 1 1 1 0)
which defines the subset: {2 5 7 11 23 67 71 79 83 89}
that sums to 437.
Average of the 3 first runs

Another constraint
added: find the subset that sums to 437 and has
minimal length.
To handle this constraint we’ll change
the fitness function to:
![]()
we’ll test different values for
=
16,32 , 40
Now we’d except to maximize the number of
zeros and to minimize the difference between the subset sum and 437.
Results:
(the code is at /ass1/q6b.scm)
Table shows best solutions found per α
|
|
Best fitness |
generation |
sum |
length |
|
16 |
911 |
25 |
437 |
7 |
|
32 |
895 |
100 |
437 |
8 |
|
40 |
1341 |
6 |
435 |
7 |
We can see here that for big values for
(
> 40) the “left” side of the fitness functions takes over and the right side
which is responsible for the sum of the subset has smaller effect on the entire
fitness value.
We’d
except to see symmetric effect with small values of
, yet for this problem the algorithm
found genomes with 7 bits set to 1 even for small values of
such as ![]()