Ex 1 – GA

Arie ohana – 33771635,  arie.ohana@gmail.com

Lior Becker – 052690849, liorbecker@gmail.com , http://liorbecker.googlepages.com , http://liorbecker.googlepages.com/

 

Q1:

The bitstring 000 belongs to the following schemata:

000, 00*, 0*0, 0**, *00, *0*, **0, ***

 

A bitstring with a length of 3 has 2^3 schemata that it belongs to.

 

Q2:

Let f(x) = x^3+7.

a. 1***

Bitstring

1000

1001

1010

1011

1100

1101

1110

1111

Decimal

8

9

10

11

12

13

14

15

f(x)

519

736

1007

1338

1735

2024

2751

3382

 

          Average = (519 + 736 + 1007 + 1338 + 1735 + 2024 + 2751 + 3382)/8 = 1709.

 

          b. 0***

Bitstring

0000

0001

0010

0011

0100

0101

0110

0111

Decimal

0

1

2

3

4

5

6

7

f(x)

7

8

15

34

71

132

223

350

 

Average = (7 + 8 + 15 + 34 + 71 + 132 + 223 + 350)/8 = 105.

 

c. 1111

Bitstring

1111

Decimal

15

f(x)

3382

 

Average = 3382.

 

d. ****

Average =

 

 

e. 00*0

Bitstring

0000

0010

Decimal

0

2

f(x)

7

15

 

Average = (7 + 15)/2 = 11.

 

f. 11*1

Bitstring

1101

1111

Decimal

13

15

f(x)

2024

3382

 

Average = (2024 + 3382)/2 = 2703.

 

Q3.

          Problem desc:

                   Let f(x) = x * |tan(x)|,  0 <= x <= 256 (x is in radian).

                   Find the maximum value of f(x) within that range.

 

          Process of work:

                   The problem is solved using a GA.

                   Gene representation: a bit string of 15 bits. A gene with a bitstring of X has a phenotype of (The decimal value of X) * 256 / 2 ^ 15.

                   This gives us a resolution of 0.0078125, and the gene representation covers values between 0 and 256 in that resolution step.

                   Fitness function:

The fitness value of X is the assignment of X's phenotype in f(x).

Note that f(x) is always positive in the given range so the probabilities of selection are always positive.

 

                        Selection method:

                             Fitness proportion function, using roulette wheel sampling.

 

                   Population size = 100.

                   Num of generations = 1000.

 

                   Crossover:

Single point crossover, Pc = 0.7.

 

                   Mutation:

                             Bit flip mutation with Pm = 0.001.

 

                        Steps / Process:

1.      Initial population calculated randomly

2.      Fitness calculation.

3.      Selection.

4.      Crossover and mutation.

5.      Moving to next generation.

 

          Results:

 

                   Test 1: Best IND: value 243.5                  Fitness 9162.539870388147

Test 2: Best IND: value 149.2265625        Fitness 163723.50310612554

Test 3: Best IND: value 155.5078125        Fitness 151884.89414227678

Test 4: Best IND: value 149.2265625        Fitness 163723.50310612554

Test 5: Best IND: value 230.90625           Fitness 285055.71004662913

 

Pm = 0.001 Pc =0.7

Generations = 1000

Population size = 100 

 

 

                                                                   Test1 


Test2


Test3


Test4

 


 

 

Test5


 

 

All Test Results  Avg / Best

 

          Conclusions:

1.      The GA reaches a near optimal point in a relatively short time (app. 10 – 20 generations), and there is a slow progress from that stage onward.

2.      The fitness function caused the algorithm to locate a local maximum in several instances of run.

Causes:

Super individual - the fitness \ selection function caused the creation of a Super individual.

                         Solutions:

Using other fitness function (square root), using different fitness functions in different stages in the algo.

Using the tournament selection method.

3.      Since tan(x) evaluates to infinite in some points in that range, the algorithm returned several distinct points as maximum.

 Note that the searching range 2^15 is finite.

4.      The Larger the BitString (Gene size) the more effect it will have on the accuracy of the GA.

5.      Picking other Gene representation could increase the accuracy, for example reading the bitstring as float (sign, mantissa, exp).

6.      The problem is not so hard, relatively small searching area, that’s why a local max point was discovered fast.

7.      In the All Test Results  Avg / Best  plot we can see that the yellow, blue and green tests have found a local max point, whereas the purple test has found a much better result.

 

 

·         Due to the conclusions we decided to make some changes the and increase the Gene size to 500 bit.

The result is the plot below.

You can see the gradual improvement with the generations (until generation 450).

The Best individual here is 10,700,000 (much better then the last testing).

 


 

Q4.

          A GA can be made to run faster using the following techniques:

(a) Implementing a specific memory management for the algorithm to avoid the recurring creation and destruction of genes, so that in every iteration the memory for the newly created individuals will not need to be created again.

 

(b) Separating the population into several machines. Instead of running the algorithm on a single machine, we can use several machines, since working in parallel could cause the algorithm to work much faster.

 

(c) Minimizing the search space by choosing a sophisticated representation for the gene (with no elimination of possible solutions). E.g. the 8-queens problem shown in class.

 

Q5.

          Problem desc:

                   Find the best solution for the knapsack 0/1 problem with a given set of items, their weights and values, and a maximum weight of 625.

                  

          Process of work:

                   The problem is solved using a GA.

                   Gene representation: a bit string of 50 bits. Each bit representing an item. If bit i is 1, then the item is included, otherwise it is excluded from the solution.

 

                   Fitness function:

                                    The fitness value is the sum of all values of the items in the solution (sum of P(i)*X(i)). If the total weight exceeds 625, then the fitness is reset to 0.

 

                        Selection method:

                             Fitness proportion function, using roulette wheel sampling.

 

                   Population size = 100.

                   Num of generations = 500.

 

                   Crossover:

Single point crossover, Pc varies in each test.

·         In case the crossover produces an illegal solution (i.e. its total weight exceeds 625),

we randomly choose a set bit in the gene and set it to zero (so weight decreases).

 

                   Mutation:

                             Bit flip mutation with Pm varies.

 

                        Steps / Process:

6.      Initial population calculated randomly

7.      Fitness calculation.

8.      Selection.

9.      Crossover (and fix, if needed) and mutation.

10.  Moving to next generation.

 

          Results:

 

Fitness value

Weight

Gene

Pc

Pm

912

614

11111010110111011011111101010111111010011110110010

0.7

0.001

791

568

10111010100111010010111101010111110110110011110010

0.4

0.001

869

619

11111010110111001011011101011111110000111110101011

0.1

0.001

733

607

10111001111011011001100101000110011010110111011011

0.7

0.01

724

591

11100111000111011000010001010101111011111111100110

0.7

0.1

 

 

 

Test 1  Pc = 0.7 , Pm = 0.001


 

Test 2 Pc=0.4 , Pm= 0.001


Test 3 Pc= 0.1 , Pm= 0.001


Test 4 Pc=0.7, Pm= 0.01


Test 5 Pc=0.7, Pm=0.1


 

All Tests best Avg comparison


 

All tests best Fitness comparison


 

          Conclusions:

1.      In tests 4 and 5 it's clear that when the Pm parameter is too large, the population is not evolving and seems to act randomly.

In test 5 this characteristic is much more acute, in test 4 Pm=0.01 and in test 5 Pm= 0.1.

Due to this fact we can see that in test 4 we do have a little growth in the beginning, in contrast to test 5 where the growth seems to be absence.

To conclude: In this experiment it is best to use a Pm such as 0.001.

 

2.      The first three tests show that when Pc is large enough (such as 0.7, in the first test) there is a constant growth in the algorithm result (which is good), although the growth   is pretty slow. Tests 2 and 3 show that if Pc is too small the population still evolves but the evolution stops pretty fast.

To conclude: Using a large Pc parameter is preferable over a smaller one.

 

3.      The best result we came up with was 912, test 1, and that only encourages our conclusions.

4.      The best approximation we got to 625 was… 625.

Value: 849.0 Weight: 625.0 Gene: 11111100100001011111110101110110010010111111010111 (though that doesn’t necessarily mean it will produce the finest solution).

 

 

Code Links:

Ex 3 code

Ex5 code