| <p>One of our foundational questions tied to the optimization of the GA has been "How many individuals should we simulate". Up to now, our minds were made up for us by the speed of arasim being great enough that the time cost of simulating individuals was great enough that the improvements made from having more were not enough to justify the slowdown. However, with the upgrade to the faster, more recent version of arasim, I decided to re-examine this. This was also spurred on by the fact that the last time I ran this test we were testing GA performance by final generation metrics rather than by how many generations it took to reach a benchmark. So in one of my optimization tests, I tracked this data. </p>
<p> </p>
<p>To start, using the same run proportions, using a .5 chi-squared benchmark, the average time across all 89 run types used in this run was 25.4 generations for 50 individuals as compared to 8.3 generations running the same test for 100. Furthermore, the minimum number of generations for 50 individuals was 4.8 while using 100 individuals yielded 2.4. So on average running 100 individuals was about 3 times fast at reaching this benchmark than with 50. And when comparing the best result regardless of run type, 100 individuals was still 2 times quicker than the min for 50 individuals. Finally, the run that yielded an average of 2.4 generations for 100 individuals took an average of 29.2 generations with 50 individuals or roughly 12 times the generations. </p>
<p> </p>
<p>For the test we will discuss, we ran 89 different run types that all used 60% rank, 20% roulette, and 20% tournament selection respectively. These test had the following ranges:</p>
<p>6-18% of individuals through reproduction (steps of 3%)</p>
<p>64-88% of individuals through crossover (steps of 12%)</p>
<p>0-10% mutation rate (steps of 5%)</p>
<p>1-5% sigma on mutation (steps of 1%)</p>
<p> </p>
<p>These tests also used our fitness scores with simulated error of .1 to imitate arasim's behavior and as such we used the chi-squared value to evaluate these scores as there is no error on those values. </p>
<p> </p>
<p>Comparing this same test with a tighter chi-squared benchmark of .25, we see similar results. On average 50 individuals took 37.1 generations to reach this point while 100 individuals took 16.0 generations. Similarly, the minimums amount of gens for 50 individuals was 15.4 while 100 individuals was 5. Finally, the corresponding run for the 5 generation min with 100 individuals took 41.8 generations with 50 individuals. These correspond to speed up's of 2.3, 3.08, and 8.36 respectively. </p>
<p> </p>
<p>This data implies that on average, independent of run type, we should expect to have to use 2-3 times fewer generations while running 100 individuals than we would running 50 individuals but we could see up to 8-12 times fewer generations to reach benchmarks. Another data set using a different set of selection methods was also tested for this and again yielded similar results, though overall the runs from the first batch were better across both 50 and 100 individuals and so those results are likely to be more indicative of the parameters we use in a true run. </p>
<p> </p>
<p>The data being examined in these results can be found here: https://docs.google.com/spreadsheets/d/1GlfnjQSO6VI8MuUGYTUcLkjwDZU98nyFFysgTTfVFOE/edit?usp=sharing </p> |