Question about the rates of genetic operators and selection in GAs.

DXMM2007 101 May 30, 2007 at 19:18

Dear all,

As we know, in GAs there is selection plus a few main operators, e.g. crossover, mutation, and survival (a fixed number of the best individuals copy themselves).

I found some information from different websites about their rates. It can be summarized as follows.

Selection: the recommended method seems to be "Roulette Wheel selection". However, if I simply picked a fixed proportion of the population, e.g. the top 15 test cases in each iteration, would that be convincing? I feel that Roulette Wheel selection is a little hard to implement…

Crossover rate: around 0.7

Mutation rate: 0.001 (a very low value)

Survival: 1 - 0.7 - 0.001 = 0.299
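
To make my reading of those numbers concrete, here is a tiny C sketch of the interpretation above, where the three rates partition how each offspring slot is filled (I know mutation is often applied per gene on top of crossover instead, so this may not be the standard reading):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define CROSSOVER_RATE 0.7
#define MUTATION_RATE  0.001   /* survival takes the remaining 0.299 */

int main(void)
{
    int n_cross = 0, n_mut = 0, n_copy = 0;
    srand((unsigned)time(NULL));

    /* Decide, for 1000 offspring slots, which single operator fills each one. */
    for (int i = 0; i < 1000; ++i) {
        double r = rand() / (RAND_MAX + 1.0);
        if (r < CROSSOVER_RATE)
            ++n_cross;
        else if (r < CROSSOVER_RATE + MUTATION_RATE)
            ++n_mut;
        else
            ++n_copy;   /* a best individual would be copied unchanged */
    }
    printf("crossover %d, mutation %d, survival %d\n", n_cross, n_mut, n_copy);
    return 0;
}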

Do you have any comments on selection and the rates of the genetic operators?

Thanks in advance,

B.W

DXMM2007

10 Replies


TheNut 179 May 30, 2007 at 22:39

It depends on your application, but generally:

Mutation:
- Too low a value and you may end up with a stagnant population
- Too high a value and you’re better off just running a randomizer in a loop, waiting for it to magically stumble upon the answer.

Crossover:
- Too low a value and you reduce your genetic diversity
- Too high a value and you end up stagnating part of your population (wasteful)

The values you specified are the typical “defaults”, though really you must experiment. Sometimes I like to assign a high mutation rate, as it can get me to the result quicker, but at the cost of a sparse population (i.e. good for the short term, bad for the long term).

Roulette wheel, in my opinion, is not a good selection scheme. There’s a chance you end up keeping a bunch of worthless chromosomes that set you back a couple of generations, and that can happen fairly often.

It’s been a while, but I recall that using tournament selection mixed with elitism gave good results. It guarantees you keep at least some of your best chromosomes (elitism) while maintaining genetic diversity (tournament). With elitism, you pop off the best of the best from your list. With tournament, you loop N times, each time randomly picking a few chromosomes and keeping the best of them.
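
Roughly, the selection part could look like this in C (a minimal sketch; the Chromosome layout and fitness values are just placeholders, and the crossover/mutation of the selected parents is left out):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define POP_SIZE        20
#define ELITE_COUNT     2    /* best chromosomes copied straight through */
#define TOURNAMENT_SIZE 3    /* contestants per tournament               */

typedef struct { int genes[8]; double fitness; } Chromosome;

/* qsort comparator: highest fitness first */
static int by_fitness_desc(const void *a, const void *b)
{
    double fa = ((const Chromosome *)a)->fitness;
    double fb = ((const Chromosome *)b)->fitness;
    return (fb > fa) - (fb < fa);
}

/* Tournament: pick TOURNAMENT_SIZE chromosomes at random, keep the best. */
static Chromosome tournament_select(const Chromosome pop[], int n)
{
    Chromosome best = pop[rand() % n];
    for (int i = 1; i < TOURNAMENT_SIZE; ++i) {
        Chromosome c = pop[rand() % n];
        if (c.fitness > best.fitness)
            best = c;
    }
    return best;
}

int main(void)
{
    Chromosome pop[POP_SIZE], next[POP_SIZE];
    srand((unsigned)time(NULL));

    /* Placeholder population with random genes and fitness values. */
    for (int i = 0; i < POP_SIZE; ++i) {
        for (int g = 0; g < 8; ++g)
            pop[i].genes[g] = rand() % 2;
        pop[i].fitness = rand() / (RAND_MAX + 1.0);
    }

    /* Elitism: sort by fitness and copy the top ELITE_COUNT unchanged. */
    qsort(pop, POP_SIZE, sizeof pop[0], by_fitness_desc);
    for (int i = 0; i < ELITE_COUNT; ++i)
        next[i] = pop[i];

    /* Fill the rest of the next generation via tournament selection
     * (crossover/mutation of the selected parents would go here). */
    for (int i = ELITE_COUNT; i < POP_SIZE; ++i)
        next[i] = tournament_select(pop, POP_SIZE);

    printf("best fitness kept: %f\n", next[0].fitness);
    return 0;
}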

DXMM2007 101 May 31, 2007 at 14:35

Hey TheNut,

Thanks for your reply.

Actually, in my project I have to use pseudo-random numbers generated by the C library functions, e.g. srand(). The seed used to initialize the random number generator (srand(seed)) is therefore treated as the gene of a parent. However, I can’t figure out whether crossover can make the offspring inherit any behavior from the parents, since with different seeds the random numbers generated by C will be totally different (AFAIK).

Thus, in my case I don’t use crossover; the whole evolution relies on mutation, i.e. 100%, and I keep the population size fixed in each iteration, which may avoid the sparse population you mentioned. What’s your comment on that?

However, I may not have followed your point about mutation, e.g. “Too high a value and you’re better off just running a randomizer in a loop, waiting for it to magically stumble upon the answer.” Do you mean the problem can’t be solved in the end?

Another question is about the tournament selection mixed with elitism that you introduced. For the elitism part, do you mean to pick the best chromosomes from the first iteration / initial population and keep them for all remaining iterations, or to keep the best ones that survive for a certain number of iterations, e.g. 3 or 5?

Am I right to think that mutation explores the global solution space, whereas crossover homes in on a local optimum?

Sorry for so many questions,

Thanks in advance,

B.W.

DXMM2007

TheNut 179 May 31, 2007 at 16:55

I’m a little lost in your recent description. The chromosomes don’t have anything to do with seeding a randomizer. You have a population of N chromosomes. When you initially create them, you randomize their genetic sequences, but only once. Only through evolution can your chromosomes mature to solve the problem.

You should never use 100% mutation. That means every generation you’re going to end up with another random bag of chromosomes; there’s no way to keep the good ones or reject the bad ones. As I said, 100% mutation is no different from running a randomizer in a loop waiting for the magic sequence to pop up. That’s not what GAs are designed for. It is impossible to solve a problem with a GA and 100% mutation: even if by pure fluke you found a solution, it would automatically be trashed in the next generation.

“in my case I don’t use cross-over”
- You should. How else can a GA evolve? Pure mutation alone cannot do it. It’s like creating a clone of yourself over and over again. You need to pair up two chromosomes and share their attributes. The union can be beneficial or counterproductive, but that is the way of life.
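
For a plain bit-string encoding, crossover and per-gene mutation can be as simple as this (a minimal C sketch, not tied to your seed-based representation):

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define GENE_COUNT    16
#define MUTATION_RATE 0.001   /* per-gene probability */

typedef struct { int genes[GENE_COUNT]; } Chromosome;

/* One-point crossover: the child takes mom's genes up to a random cut
 * point and dad's genes after it, so it shares attributes of both. */
static Chromosome crossover(const Chromosome *mom, const Chromosome *dad)
{
    Chromosome child;
    int cut = rand() % GENE_COUNT;
    for (int i = 0; i < GENE_COUNT; ++i)
        child.genes[i] = (i < cut) ? mom->genes[i] : dad->genes[i];
    return child;
}

/* Mutation: flip each bit independently with a small probability. */
static void mutate(Chromosome *c)
{
    for (int i = 0; i < GENE_COUNT; ++i)
        if (rand() / (RAND_MAX + 1.0) < MUTATION_RATE)
            c->genes[i] = !c->genes[i];
}

int main(void)
{
    Chromosome mom, dad, child;
    srand((unsigned)time(NULL));

    for (int i = 0; i < GENE_COUNT; ++i) { mom.genes[i] = 0; dad.genes[i] = 1; }

    child = crossover(&mom, &dad);
    mutate(&child);

    for (int i = 0; i < GENE_COUNT; ++i)
        printf("%d", child.genes[i]);
    printf("\n");
    return 0;
}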

For elitism: every generation, pick the N chromosomes with the best fitness and keep them for the next generation (push them into your offspring pool). By doing this, you’re rejecting N other chromosomes that would probably have plagued your population through normal evolution (crossover, mutation, etc.).

“Can I realize the mutation will try to have a good global solution space, whereas the crossover will find the local optimum?”
Yes, but mutation does not guarantee “a good global solution space”. Its very nature is to explore the solution space, not to decide which part of it is better or worse; that’s what your fitness function does.

SochauX 101 Jun 07, 2007 at 09:24

Hello,

I am surprised by what you say, and I think you should consider the implications of everything discussed here:

  • First: elitism is not a really good solution; in fact, it can produce awful results. If you consider your fitness landscape as a Gaussian mixture (which is usually the case), then by always taking the best individuals you might quickly get stuck in a local optimum that may be a very bad solution, and only mutation can help you leave that area… but then, with a 0.0001 mutation rate, you might take ages to leave it, and perhaps only to reach another local optimum.

  • Still about selection: if you just use tournament selection, you have a bias. What if one or two individuals have vastly higher fitnesses than the others? They will almost always be selected, without giving the others a chance.

  • Higher is not always better. A lower-fitness solution might be nearer to the global optimum, so you may want to give other solutions a chance; if that were not the case, you would probably want to use something other than a genetic algorithm. A better approach is rank-based selection of the individuals (see the short sketch after this list).

  • About the crossover/mutation rates: crossover favors exploitation of the population, that is, combining individuals and searching in a nearby region. Mutation, if well designed (Gaussian mutation in a continuous space, for instance), allows both exploitation (looking nearby) and exploration (looking far away). So mutation might give better results.

  • So mutation is more complete, and crossover may not be applicable under all conditions. In fact, crossover does not always make sense in your search space, whereas mutation does.

  • Moreover, there are many crossover operators beyond simply splitting into two parts and recombining.

  • You might also consider changing the parameters during the search (if the population is converging, raise them; otherwise lower them).

  • Depending on your problem, if the solution space is continuous rather than bit strings, you might consider a more sophisticated approach such as Evolution Strategies, which give really good results in comparison while using only mutation. CMA-ES, for instance, determines the best mutation step in every dimension at each generation to optimize the search, and gives the best general results so far.

  • Finally, consider the GA as a random-based optimization procedure that includes many heuristics giving it direction (evaluation/selection/reproduction); it is not a random walk such as a Monte Carlo procedure.
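
To make the rank-based selection point concrete, here is a minimal C sketch of linear ranking (just one variant among many): an individual with a huge fitness is still the most likely pick, but only in proportion to its rank.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define POP_SIZE 10

typedef struct { double fitness; } Individual;

/* qsort comparator: worst fitness first, so index + 1 is the rank
 * (rank 1 = worst, rank POP_SIZE = best). */
static int by_fitness_asc(const void *a, const void *b)
{
    double fa = ((const Individual *)a)->fitness;
    double fb = ((const Individual *)b)->fitness;
    return (fa > fb) - (fa < fb);
}

/* Linear rank-based selection: probability is proportional to rank,
 * not to the raw fitness, so a few huge fitness values cannot dominate. */
static int rank_select(int n)
{
    int total = n * (n + 1) / 2;          /* 1 + 2 + ... + n       */
    int r = rand() % total;
    for (int i = 0; i < n; ++i) {
        r -= (i + 1);                     /* weight of rank i + 1  */
        if (r < 0)
            return i;                     /* index into sorted pop */
    }
    return n - 1;
}

int main(void)
{
    Individual pop[POP_SIZE];
    srand((unsigned)time(NULL));

    /* One individual with a huge fitness, the rest modest. */
    for (int i = 0; i < POP_SIZE; ++i)
        pop[i].fitness = rand() / (RAND_MAX + 1.0);
    pop[0].fitness = 1e6;

    qsort(pop, POP_SIZE, sizeof pop[0], by_fitness_asc);

    /* The outlier is still the most likely pick, but only by its rank. */
    for (int i = 0; i < 5; ++i)
        printf("selected fitness: %g\n", pop[rank_select(POP_SIZE)].fitness);
    return 0;
}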

Still, all these remarks have to take the fitness landscape into account. All these parameters have to be tuned for the problem at hand; good parameters for one problem might give worse results on another.

I hope my comments are clear; don’t hesitate to discuss.

DXMM2007 101 Jul 05, 2007 at 18:58

Dear all,

Thanks for your replies.

I have one more question: using heuristic approaches, whether Evolution Strategies, GAs, simulated annealing, tabu search or others, after several runs, should the results (one per run) be approximately equal to each other? Ideally they should be; otherwise the parameters of the method should be adjusted. Is that right?

Thanks,

B.W.

DXMM2007

SochauX 101 Jul 07, 2007 at 00:54

Hello,

If I understand correctly, you want to know whether all the methods should give roughly the same results? It depends on both the approach and the problem. Under some assumptions, the no-free-lunch theorem tells you that no algorithm is always better than another across all problems, so both the problem and the parameters of the different methods matter. If the problem is simple it does not matter much, but then there is no point in using a stochastic search. Moreover, these approaches use randomness, so you have to run them several times and look at the standard deviation to assess their performance on the problem.

In short, the parameters are important since they define how you explore or exploit the search space. Some methods will show better results depending on both the problem and the parameters.

DXMM2007 101 Jul 08, 2007 at 21:31

Hey SochauX.

Thanks for your reply.

I tested the performance of the algorithm using the same parameters, but it returned a different value each time, e.g. some better, some worse. From your explanation, I gather that I have to run it maybe 100 or 1,000 times and analyze the results with the help of the standard deviation, e.g. the smaller the standard deviation, the better the algorithm.

Is my understanding right?

Thanks!

DXMM2007

SochauX 101 Jul 09, 2007 at 21:14

Hello,

It is hard to tell; I would say it just depends on your needs, i.e. whether you need one general tuning that works across different problems or not. I would say that robustness is usually what we need, and that means a low variance/deviation, with parameters that are generally good rather than precisely tuned.

Also, don’t feel the need to run it a thousand times or more. Usually, 11 runs gives you a sufficient approximation with high confidence, since the standard deviation stays low. The value 11 is a statistically motivated trade-off between confidence and computation time… Of course, more is always better, but that is not always possible, especially with GAs, since they target difficult problems that are expensive to evaluate.
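
The bookkeeping for those runs is small; something like this, where run_ga() is only a stand-in for your own algorithm:

#include <math.h>
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define RUNS 11

/* Placeholder for your own algorithm: one full GA run returning the
 * best fitness it found.  Here it just returns a noisy dummy value. */
static double run_ga(void)
{
    return 0.9 + 0.1 * (rand() / (RAND_MAX + 1.0));
}

int main(void)
{
    double results[RUNS], mean = 0.0, var = 0.0;
    srand((unsigned)time(NULL));

    for (int i = 0; i < RUNS; ++i) {
        results[i] = run_ga();
        mean += results[i];
    }
    mean /= RUNS;

    for (int i = 0; i < RUNS; ++i)
        var += (results[i] - mean) * (results[i] - mean);
    var /= (RUNS - 1);                 /* sample variance */

    printf("mean %f, std dev %f over %d runs\n", mean, sqrt(var), RUNS);
    return 0;
}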

SochauX

DXMM2007 101 Jul 10, 2007 at 12:59

Hey SochauX,

Thanks for your reply!

It is really helpful. I will summarize all the above discussions when I am free!

Have a nice summer,

B.W.

DXMM

DXMM2007 101 Aug 30, 2007 at 09:35

Hey SochauX,

May I ask you one further question?

Where can I find evidence for the 11 runs you mentioned? I think that if mutation is the main operator, then my algorithm is something close to an Evolution Strategy. Do the 11 runs still apply in that case?

Thanks in advance,

DXMM2007