Entry  Fri Dec 11 17:47:48 2020, Ethan Fahimi, Friday Updates 
Ethan F Fixed the issue AREA was having with finding test_{ind}.txt. Next is to fix the problems with finding Veff, after which the project should be working.
   
   
   
   
   
   
Entry  Fri May 20 17:17:32 2022, Ryan Debolt, Fitness Functions Test 8x

Below are plots testing different fitness score functions and comparing them using a chi^2 score.

The functions used are as follows

Gaussian: e^(-2) (Red)

 

Inverse: 1/(1+(O-E)^2) (Purple)

 

Algebraic: 1/(1+chi^2)^(3/2) (Green)

 

Chi: 1/(1+chi^2) (Blue)
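
As a rough sketch, the Inverse, Algebraic, and Chi functions listed above could be written as follows. The function names are made up for this example, and it assumes that O and E are a single observed and expected value and that chi^2 is passed in as an already computed number. The Gaussian function is not included.

```python
def inverse_score(observed, expected):
    """Inverse: 1 / (1 + (O - E)^2)."""
    return 1.0 / (1.0 + (observed - expected) ** 2)


def algebraic_score(chi2):
    """Algebraic: 1 / (1 + chi^2)^(3/2)."""
    return 1.0 / (1.0 + chi2) ** 1.5


def chi_score(chi2):
    """Chi: 1 / (1 + chi^2)."""
    return 1.0 / (1.0 + chi2)
```

All three return 1 for a perfect match and fall toward 0 as the mismatch grows; the Algebraic form falls off faster than the Chi form for the same chi^2.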

These functions are plotted here:

Entry  Fri Aug 28 16:31:56 2020, Alex M, First Results Slides GENETIS_Meeting_Loop_Results_2020_08_24.odp

Here's a PowerPoint I showed today at the GENETIS meeting covering the two most recent runs we did. The first run was symmetric and we ended it at 15 generations. The second one is ongoing and is shown up through generation 9 in the slides.

Entry  Tue Apr 4 13:51:56 2023, Alex M, Final To Do List for PUEO Loop 

At the main GENETIS meeting yesterday, we collected the last few things that need to be worked out to get the PUEO loop in working order. Here's the list:

  • Skeletonize the xmacros (Alex)  04/10/23 In progress

    • This involves replacing some of the text with variables that the loop fills in for each generation.

    • Requires some slight modification of Part B1 in the loop.

  • We need to modify PUEOsim to read in different gain files (Dylan) Changes made 04/10/23  Testing to be done.

    • Currently, PUEOsim reads in files like vv_0, vh_el, etc. It needs to be modified to take in the gainfiles as arguments and read different ones for each individual.

    • Requires slight modification of the batch job scripts, but these changes are relatively small.

  • Slight modification on the fitness score read in script (Dylan)  Changes made 04/10/23  Testing to be done.

    • Now that PUEOsim is working and has been modified to output txt files with the effective volume, we just need to slightly amend the fitness score script to read from those files.

  • Fix Ezio's comments on XFintoPUEO.py and modify to read in outputs from two files (Jacob) 04/10/23 in progress, further work this week

    • Since we are running XF twice (once for each power source), we need the conversion script to read in from two files and print out to 8 files (vv_0, vh_el, etc)
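
To illustrate the skeletonizing step from the first item, here is a minimal sketch of filling placeholders in a skeleton file once per generation. The skeleton text, the placeholder names (GEN, NPOP, OUT_DIR), and the function name are all invented for this example; the real xmacros and the variables the loop substitutes will differ.

```python
from string import Template

# Hypothetical skeleton: the real xmacro text and variable names are not shown here.
SKELETON = Template("gen = $GEN\nnpop = $NPOP\noutput_dir = $OUT_DIR\n")


def fill_skeleton(skeleton, gen, npop, out_dir):
    """Return the skeleton text with every placeholder replaced.

    substitute() raises KeyError if a placeholder is left unfilled,
    which catches a skeleton and loop that have drifted apart.
    """
    return skeleton.substitute(GEN=gen, NPOP=npop, OUT_DIR=out_dir)
```

For example, `fill_skeleton(SKELETON, gen=3, npop=10, out_dir="/tmp/run")` returns the text with all three placeholders filled in for generation 3.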

Entry  Tue Nov 5 16:19:20 2019, Julie Rolla, Fall update 

We are off again with a new team. The new team consists of Alex Machtay, Alex Patton, Cade Sbrocco, Eliot Ferstl, Evelyn Shank, Mitchell Halley, Scott Janse. A spreadsheet of tasks and deadlines can be found here: https://docs.google.com/spreadsheets/d/1CkcdbediFbripdlwPw6UmhAHdAl5IdkL2efaseJkTTM/edit#gid=0

Updates:

We are now running on OSC. We have acquired a version of XF on OSC and have now transferred the loop to OSC. All of the xmacros for XF should be functioning, and we have managed to make the bash script auto import the xmacro scripts to XF when it boots. The loop exists on /fs/projects/PAS0654/BiconeEvolutionOCS. This is a shared copy so that everyone can run off of the same installation. This is important so that we can make sure everyone is pushing edits regularly, and so that everyone is always running with the most up-to-date edits. 

The loop has not yet been tested in its current state. Alex M. is testing it and we are waiting to hear back.

 

Tasks:

  • Check continuity of loop
    • Update: Alex M is testing this
  • Install Arasim to /fs/projects/PAS0654/BiconeEvolutionOCS to be edited and messed with
    • Update: this has been completed as of 11/5/19
  • Test Alex M's code for editing Arasim's Setup.txt variables
  • Clean bash script
    • Update: Mitchell is working on this.
  • Make version of bash script which writes .job scripts for Arasim
    • This must write a different .job for each individual so it parallelizes Arasim running for each individual
    • The number of .jobs written must depend on $NPOP
      • i.e., this bash script should write the .jobs from scratch each time, since (1) the number of them we need depends on $NPOP and (2) the names of the input and output files depend on $NPOP.
        • It should also delete the .jobs at the end, i.e., when all generations have completed, so that we don't have a random number of .jobs floating around. For example, say we run with 10 individuals one time and it writes 10 .jobs ($NPOP=10), and the next run has only $NPOP=5. We would then have 5 extra unused .jobs floating around unless we clean them all up at the end of the run in the bash script.
    • Update: Work on this has begun by Alex M, Cade, and Evelyn (Alex P has also been messing with Arasim).
  • Make XF be submitted as a job
    • Cade has ideas for this.
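
The .job handling described in the task list could look roughly like the sketch below. It is written in Python rather than bash purely for illustration, and the file names, directory layout, and job body are placeholders: the real scripts would contain the scheduler directives and the Arasim command for each individual.

```python
from pathlib import Path


def write_job_scripts(job_dir, npop):
    """Write one .job file per individual, from scratch, and return their paths."""
    job_dir = Path(job_dir)
    job_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for ind in range(1, npop + 1):
        path = job_dir / f"individual_{ind}.job"
        # Placeholder body: input/output names depend on the individual number.
        path.write_text(
            "#!/bin/bash\n"
            f"# job for individual {ind} of {npop}\n"
            f'echo "input_{ind}.txt -> output_{ind}.txt"\n'
        )
        paths.append(path)
    return paths


def clean_job_scripts(job_dir):
    """Delete every .job file in job_dir and return how many were removed."""
    removed = 0
    for path in Path(job_dir).glob("*.job"):
        path.unlink()
        removed += 1
    return removed
```

Calling `clean_job_scripts` once all generations have completed means a later run with a smaller $NPOP does not inherit leftover .jobs from an earlier, larger run.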


Students have been posting individual updates here: https://www.dropbox.com/home/GP_Antennas/Updates

 

Entry  Wed Jun 3 15:20:04 2020, Julie Rolla, Existing Info for Current Projects AREAthesis.pdf

Paperclip Antenna (Ryan and Evelyn working on it):

Asymmetric Bicone (Leo and Eliot working on it):

AREA (Ben working on it):

  • Github? We need to ask Steph where their past work exists.
  • See AREA thesis attached

Symmetric Bicone (Alex M. and Alex P. working on it):

Entry  Fri Jun 2 00:21:36 2023, Ryan Debolt, Error test results test_results(1).png

Attached is a plot containing bar graphs with error bars representing the average number of generations it took for the GA to achieve a chi-squared value of 0.25 (roughly equivalent to a fitness score of 0.8 out of a maximum of 1.0). Unlike the fitness scores used by the GA, these values do not have simulated error attached to them and are therefore a better measure of how well the GA is optimizing. These results were obtained by running 10 tests in the test loop for each combination of design, population, and error, then computing the average number of generations needed to meet our threshold and the standard deviation of those values. A few things immediately stand out; I will address the more obvious one later. Most important for this test, increasing the population size seems to have a more direct impact on the number of generations needed to reach our threshold than decreasing the error does, in both designs. My best guess is that the GA depends on diversity in its population to produce efficient growth, and a larger number of individuals contributes to this, allowing the GA to explore more options.

 

This leads me to the more easily spotted trend: PUEO is much slower than ARA. This is an anomaly, as it is the opposite of what we would expect from this test loop, since PUEO has fewer genes (7) to optimize than ARA (8). It is also important to note that no genes are being held constant in this test for either design, so both have their full range of designs available, provided they are within the constraints. With that in mind, my guess is that PUEO's constraints are much stronger than ARA's. Stronger constraints more heavily bound the possible solutions, which makes it harder for the GA to iterate on designs. It is possible that during an operation like crossover, the only valid combinations for a pair of children are identical to their parents, effectively performing reproduction. This could limit the genetic diversity in a population and therefore increase the number of generations needed to reach an answer. We could in theory test this by relaxing the constraints on PUEO and then running the test again to see how it compares.

 

Finally, you will notice that no AREA designs are shown. This is because, under current conditions, they never reached the threshold within 50 generations. However, Bryan and I think we know what is happening there. AREA has 28 genes, about four times as many as our other designs use. Our current test loop fitness measure depends on a chi-squared value given by: SUM | ((observed(gene) - expected(gene))^2) / expected(gene) |, so the more genes a design has, the harder it is for the sum to approach zero. For example, if each term of the sum equals 0.1, an ARA design would get a total value of 0.8 while AREA would get 2.8, which seems considerably worse despite each term being the same.

After thinking about it further, Bryan and I do not think that chi-squared is the best measure of fit we could be using in this context. Another thing we noticed is that we have negative expected values in some cases. We have skirted around this by using absolute values, but on reflection we believe this indicates a poor choice of metric. Chi-squared calculations seem better suited to positive, independent, normally distributed values than to the discrete values provided by our GA.

With this in mind, we propose changing to a Normalized Euclidean Distance metric to calculate our fitness scores moving forward. This is given by the calculation: d = sqrt((1/max_genes) * SUM (observed(gene) - expected(gene))^2). This accomplishes a few things. First, it keeps the same 0 -> infinity bounds that our current measure has, allowing our 1/(1+d) fitness score to be bounded between 0 and 1. Second, every term is non-negative, so we don't need to worry about negative values in the calculation. Third, this function is normalized by the number of genes present for any given design, making designs easier to compare than with our current measure. Finally, since our GA is technically searching an N-dimensional space for the location that maximizes the fitness score, it makes sense to give it the distance of a solution from that location as a measure of fit in our test loop.

We created a branch on the test loop repository to test this, and the results are promising: results from the three designs are much more comparable for the most part (though we still see some slowdown that we think is attributable to the constraints mentioned above). Some further input would be appreciated before we begin running tests like the one we have done in the plot below.
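
A small sketch of the two measures makes the gene-count effect concrete. It assumes max_genes is simply the number of genes in the design being scored, and the function names and the toy gene values are made up for this example; this is not the code on the test loop branch.

```python
import math


def chi_squared(observed, expected):
    """Current measure: SUM | (observed - expected)^2 / expected |."""
    return sum(abs((o - e) ** 2 / e) for o, e in zip(observed, expected))


def normalized_euclidean(observed, expected):
    """Proposed measure: sqrt((1 / n_genes) * SUM (observed - expected)^2)."""
    n_genes = len(expected)
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(observed, expected)) / n_genes)


def fitness(distance):
    """Map a distance in [0, infinity) to a score in (0, 1]."""
    return 1.0 / (1.0 + distance)


# Toy designs: every gene is off by the same amount, chosen so that each
# chi-squared term equals 0.1 when the expected value is 1.
step = math.sqrt(0.1)
ara_obs, ara_exp = [1.0 + step] * 8, [1.0] * 8
area_obs, area_exp = [1.0 + step] * 28, [1.0] * 28
```

With these toy inputs, `chi_squared` gives about 0.8 for the 8-gene design and about 2.8 for the 28-gene design, while `normalized_euclidean` gives the same value for both, so the resulting fitness scores are directly comparable.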

Entry  Tue Jun 28 12:49:53 2022, Dennis Calderon, Effective Volumes from AraSim: Curved Sides and Straight Sides (Paper Run) AraSim_Summary_Effective_Volumes.pdf

Summary of results for 3 million event simulations in AraSim with both the GENETIS version and a more recent (~11/2021) version.

Using errors for effective volume from the .out files.

Example shown below.

Radius: 3000 [m]
IceVolume : 8.4823e+10
test Veff(ice) : 6.50867e+09 m3sr, 6.50867 km3sr
test Veff(water eq.) : 5.96845e+09 m3sr, 5.96845 km3sr
And Veff(water eq.) error plus : 0.543588 and error minus : 0.543588
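
For reference, the water-equivalent effective volume and its errors can be pulled out of text like the example above with a couple of regular expressions. This is only a sketch: the patterns are written against the five lines shown here, and the function name, dictionary keys, and the assumption that the rest of a .out file does not contain similar lines are mine.

```python
import re

SAMPLE = """Radius: 3000 [m]
IceVolume : 8.4823e+10
test Veff(ice) : 6.50867e+09 m3sr, 6.50867 km3sr
test Veff(water eq.) : 5.96845e+09 m3sr, 5.96845 km3sr
And Veff(water eq.) error plus : 0.543588 and error minus : 0.543588
"""

VEFF_RE = re.compile(r"test Veff\(water eq\.\)\s*:\s*(\S+) m3sr,\s*(\S+) km3sr")
ERR_RE = re.compile(r"error plus\s*:\s*(\S+) and error minus\s*:\s*(\S+)")


def parse_veff(text):
    """Return the water-equivalent Veff (km3sr) and its plus/minus errors."""
    veff = VEFF_RE.search(text)
    err = ERR_RE.search(text)
    if veff is None or err is None:
        raise ValueError("Veff lines not found in output text")
    return {
        "veff_km3sr": float(veff.group(2)),
        "err_plus": float(err.group(1)),
        "err_minus": float(err.group(2)),
    }
```

Raising an error when the lines are missing, rather than returning zeros, keeps a failed simulation from silently entering a summary table.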


 

Entry  Sun Jun 18 21:32:03 2023, Dylan Wells, Default Toyon Antenna Simulation 248599981-8eacbc83-d42e-4da4-9746-bda05b2b4a38.png248599974-bee7a4db-34c6-4cfd-bde0-3682ff3ebaf7.png

To serve as our comparison to the evolved antennas when plotting, we ran a pueoSim simulation with 4,000,000 neutrinos for the measured Toyon gains found in /fs/ess/PAS1960/buildingPueoSim/pueoBuilder/components/pueoSim/data/antennas/measured

In order to run the jobs, I used the runJobs.sh script found in /fs/ess/PAS1960/buildingPueoSim which submits job runs of 40,000 neutrinos at a time, mirroring what we do in the actual loop.

The resulting data is now located in /fs/ess/PAS1960/HornEvolutionOSC/GENETIS_PUEO/BiconeEvolution/current_antenna_evo_build/XF_Loop/Evolutionary_Loop/Toyon_Outputs

Veff: 11750.0503856

Plus Error: 89.0051458227

Minus Error: 88.9007824246

The fitness score plots have been updated to read this value in as a comparison.

 

Entry  Mon Oct 5 21:06:29 2020, Everyone, Data Runs 

Machtay_20200831_Asym_Length_and_Angle    10 individuals
Machtay_20200911_Symmetric                             10 individuals, fewer neutrinos
Machtay_20200914_Asymmetric_50_Individuals  50 individuals, fewer neutrinos
Machtay_20200929_Asymmetric_test_2               50 individuals, fewer neutrinos, broaden parameter range

 

Entry  Fri Aug 7 14:16:01 2020, Alex M, Daily Update 8/7/20 
Name Update Plans for Monday
Alex M
Alex P
Eliot
Leo
Evelyn
Ryan
Ben
Ethan

   

 

Entry  Wed Aug 5 16:08:40 2020, Alex M, Daily Update 8/5/20 
Name Update Plans for Tomorrow
Alex M

Update: Continued running the loop up to gen 15. I also adjusted the LRTPlot script so that it slightly spreads the points along the generation axis, which makes overlapping points visible, but I still need to implement this in the loop.

Helped Ryan design the bash script for running through paperclip many times. It's working, and he ran it for a while to get ~1000 outputs checking the performance of different mutation sizes. Next he'll parse through them to find which sigma appears best.

Plans: Amy suggested that I make it possible for the roulette algorithm to record the mutations and parents for each individual so that we can get a better sense of the history of the evolution, so I'll work on that tomorrow. If it looks to be working, I'll start a run with an asymmetric length.
Alex P    
Eliot    
Leo    
Evelyn    
Ryan Update: Completed the bash script for automating runs and generated 1100 txt files and 1100 csv files of runs, to see how the ratio of roulette to tournament selection and the standard deviation in the mutation function affect the fitness scores after 100 generations. Plans: Write a program to go through the txt files, collect the highest fitness score (stored on line 6 of each one), and figure out which combination of parameters gives the highest average.
Ben    
Ethan    
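
A sketch of the kind of program Ryan describes is below. It assumes line 6 of each txt file holds only the highest fitness score as a number, and it takes a caller-supplied function that turns a file name into a parameter-set label, since the actual naming of the run files is not given here. All names are invented for this example.

```python
from collections import defaultdict
from pathlib import Path


def read_best_score(path):
    """Return the value stored on line 6 of a run's txt file."""
    lines = Path(path).read_text().splitlines()
    return float(lines[5])  # line 6, counting from 1


def average_by_parameters(run_dir, key_from_name):
    """Average the best score over all runs that share a parameter set.

    key_from_name maps a file name to a label for its parameter set
    (for example, a roulette/tournament ratio and a mutation sigma).
    """
    scores = defaultdict(list)
    for path in sorted(Path(run_dir).glob("*.txt")):
        scores[key_from_name(path.name)].append(read_best_score(path))
    return {key: sum(vals) / len(vals) for key, vals in scores.items()}
```

The parameter set with the highest average can then be found with `max(averages, key=averages.get)`.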

 

Entry  Tue Aug 4 17:30:09 2020, Alex M, Daily Update 8/4/20 
Name Update Plans for tomorrow
Alex M Update: I've continued the run I started yesterday. I ran into a bug where the LRT plots weren't being made, but it should be fixed now. The loop is on generation 10, but it hasn't really converged or shown a clear pattern. Plans: We'll probably keep this run going until at least gen 15. If we finish this run, I can start another run using the asymmetric version. The goal would be to do a similar number of generations using the asymmetric length first, followed by adding on the asymmetric opening angle.
Alex P    
Eliot    
Leo    
Evelyn    
Ryan    
Ben    
Ethan    

 

Entry  Mon Aug 31 18:32:21 2020, Alex M, Daily Update 8/31/20 
Name Update Plans for tomorrow
Alex M

Update: I started working on the slides for our meeting with Wolfgang this Friday. I'll share them via OneDrive/email/Slack tomorrow for comments from everyone.

Over the weekend I finished the asymmetric length run. I still have a couple of fixes to make to the plotting scripts, so I'll update those plots tomorrow after making the fixes. I also started a new run today using both an asymmetric length and an asymmetric opening angle. The plots aren't automatically uploading to Dropbox anymore because we hit our limit, so I'll have to add them manually. You can see the asymmetric run on Dropbox under Machtay_2020_08_27_Asymmetric_Length_Real_Run(SECOND REAL RUN).

Plans: I'll finish up and share the PowerPoint with everyone, then fix up the plotting scripts and continue the current run. I also need to try getting onto the UW supercomputer in advance of the ARA bootcamp on Wednesday/Thursday.
Alex P    
Eliot    
Leo    
Evelyn    
Ryan    
Ben    
Ethan Update: Worked further on documentation of the genetic algorithm. Ben is currently doing a big run to test the program. Plans: Once the big run is finished, we will look at the data and attempt to fix any issues that may arise.

 

Entry  Mon Aug 3 16:39:56 2020, Alex M, Daily Update 8/3/20 
Name Update Plans for Tomorrow
Alex M Update: I fixed the issues I outlined a little while ago that we needed to resolve before starting a real run. I started a real run with 10 individuals for up to 20 generations using the database for a symmetric bicone. The name of the run is Machtay_20200803_Master_Symmetric_Database_Real (whew!). Right now it's on gen 2 and plots are on Dropbox (see the link at the bottom). Plans: I'm going to keep the run going and hopefully be most of the way to 10 generations by the end of tomorrow. Julie and I also discussed some potential improvements that can be made to the loop. For one, the main directory needs some consolidation and cleaning up. We were also speculating about a way to quickly see when AraSim jobs fail and rerun them in real time, rather than waiting for all the jobs to finish before realizing we need to rerun.
Alex P    
Eliot    
Leo    
Evelyn    
Ryan Update: Continued working on the bash script to automate runs for paperclips. Plans: Finish the bash script, do the runs, and start combing through the results.
Ben    
Ethan    

Dropbox link: https://www.dropbox.com/home/GP_Antennas/Updates/DailyFitnessScoreImages/Machtay_20200803_Master_Symmetric_Database_Real

Entry  Mon Aug 10 16:12:28 2020, Alex M, Daily Update 8/10/20 
Name Update Plans for Tomorrow
Alex M Update: Over the weekend I tried making some fixes we discussed. I fixed the roulette algorithm (the symmetric version) so that the random number generator is only set once and then passed to functions. It will now print the values from the generator to files to be plotted. I tried running the loop over the weekend with these fixes, but the AraSim jobs kept failing. So today I consolidated the two loops and started working on a fix that will automatically check whether an AraSim job has failed, in real time while the loop is running, and resubmit it so that we don't get stuck waiting for the jobs to rerun. Plans: I'm going to finish the AraSim fix tomorrow and then try starting a run. I'm going to be out on Wednesday and Thursday, so I'll need someone else to keep the run going.
Alex P    
Eliot    
Leo    
Evelyn    
Ryan Update: Created a program that combed through all the files, got the average high score for each set of parameters, and saved that information in a master file for quickly finding which parameters worked best. Plans: Create the programs to make the graphs we wanted.
Ben    
Ethan    
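
The idea of setting the random number generator once and passing it to functions can be sketched as follows. This is a Python illustration only, not the roulette algorithm from the loop: the function names, the proportional selection rule, and the Gaussian mutation are assumptions made for the example.

```python
import random


def pick_parent(rng, fitness_scores):
    """Roulette selection: choose an index with probability proportional to fitness."""
    spin = rng.uniform(0.0, sum(fitness_scores))
    running = 0.0
    for index, score in enumerate(fitness_scores):
        running += score
        if spin <= running:
            return index
    return len(fitness_scores) - 1  # guard against floating-point round-off


def mutate(rng, gene, sigma):
    """Gaussian mutation using the shared generator."""
    return rng.gauss(gene, sigma)


def next_generation(seed, genes, fitness_scores, sigma):
    """Build one child per individual, drawing every random number from one generator."""
    rng = random.Random(seed)  # the only place a generator is created
    children = []
    for _ in genes:
        parent = pick_parent(rng, fitness_scores)
        children.append(mutate(rng, genes[parent], sigma))
    return children
```

Because every draw comes from the same generator, a run can be reproduced from its seed, and no helper function can accidentally reset the random sequence partway through a generation.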

 

Entry  Thu Jul 9 16:54:27 2020, Alex M, Daily Update 7/9/20 
Name Update Plans for Tomorrow
Alex M

Update: Finished running the loop for 15 generations. The last plots are on Dropbox under Length_Cutoff_Test_3. There are three things to fix before running this version of the loop again; see the list below the table.

We started running the next step, which is the database version.

Julie, Ben, Ethan, and I all met to talk about AREA. We started going through the code to understand it, but we're stuck because we don't know if AraSim Lite is actually in the directory or not.

I started rewriting the AREA section.

Plans: I'll keep the run going, but I might have to stop it and look for bugs if the evolution doesn't track with the previous run. We named this one Data_Base_Test_7_9.

Alex P    
Eliot    
Leo    
Evelyn    
Ryan    
Ben    
Ethan    

 

Three things to change before doing the next symmetric run (w/o database):

1. Change how the mutation is done. We have at least a partial solution to this: the standard deviations were all 1 due to type errors in C++. We already have a fix for this.

2. The bicone separation distance was too small (0.05 cm). We should increase it to 2-3 cm to represent an actual buildable bicone.

3. We kept the random number seeds in. This isn't a bug, but it's something we need to remember to change before we run this again.

Entry  Wed Jul 8 15:09:42 2020, Alex Patton, Daily Update 7/8/20 
Name Update for Today Plans for Tomorrow
Alex M

Update: Continued the run from yesterday. Alex P and I took the bash script changes we made to fix the simulation number bugs and applied them to the database version. Once we get 15 generations of data for this run we'll start testing the database version. Once that works we'll have Eliot and Leo run their version with only 3 genes, and if we get the same genes all around we'll be able to merge the asymmetric branch with the database branch and then merge with master. I also fixed a bug I found in one of the plotting programs that caused it to miss the first individual; it should be good now.

Ryan and I talked a bit about how to apply crossover and mutation to the paperclip. I'm not sure that crossover makes sense, but I think it's possible. He's going to start by doing crossover of the chromosomes and then, when that's working, he can try crossing over the genes (this is where I'm not sure things make sense, but who knows). Then he can implement mutation in that function.

Julie and I spoke with Ben about AREA and started helping him sift through all the files. We're going to meet with Ethan tomorrow to bring him up to speed and start reviewing the bash script that runs all of AREA.

Plans: Finish this run and test the database. I'll meet with Julie, Ben, and Ethan to talk about AREA. Also, Eliot and Leo found a potential issue in the roulette algorithm that I asked them to look at, so we might try making a fix there and see if that allows our mutations to be more reasonable. Alex and I are going to have to meet with Eliot and Leo soon so we can all work together on merging our branches and resolving merge conflicts.

 

Alex P Update: Made a version of my C++ programs that detect whether an individual matches the database and log that or write to the database, now with the number of genes as a variable so that it is compatible with new antenna designs. Then set out to fix the scripts that use the database so that they get the right simulation numbers, while continuing the current run so that we can test whether the database version gives the same results. After that we will start running the database version and make sure it is good before we merge. Plans: Run the database test and, if it shows any differences or bugs relative to the original, fix those before we merge. Also have Eliot and Leo run theirs with just three parameters so that we can confirm the results agree between all versions and that none has bugs before we merge.
Eliot    
Leo    
Evelyn    
Ryan    
Ben    
Ethan    
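
The database check with a variable number of genes can be illustrated with the sketch below. It is a Python stand-in for the idea only, not Alex P's C++ programs: the in-memory dictionary, the rounding used to decide that two individuals match, and all names are assumptions for this example.

```python
def gene_key(genes, decimals=6):
    """Turn a gene list of any length into a hashable lookup key."""
    return tuple(round(float(g), decimals) for g in genes)


class ResultDatabase:
    """Stores one fitness value per distinct individual."""

    def __init__(self):
        self._fitness_by_genes = {}

    def lookup(self, genes):
        """Return the stored fitness, or None if this individual is new."""
        return self._fitness_by_genes.get(gene_key(genes))

    def store(self, genes, fitness):
        self._fitness_by_genes[gene_key(genes)] = fitness


def evaluate(database, genes, simulate):
    """Return (fitness, was_cached), simulating only individuals not yet stored."""
    cached = database.lookup(genes)
    if cached is not None:
        return cached, True
    fitness = simulate(genes)
    database.store(genes, fitness)
    return fitness, False
```

Since the key is built from however many genes are passed in, the same lookup works for a 3-gene design and for designs with more genes without any change to the code.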

 

Entry  Tue Jul 7 14:39:12 2020, Alex Patton, Daily Update 7/7/20 
Name Update for Today Plans for Tomorrow
Alex M

Update: Found a bug while running the loop, fixed it, and started running again. Added a few things to the paper.

Plans: Alex and I will keep the loop running tomorrow. We're going to try to meet with Eliot and Leo to work on implementing the bug fixes we've made into their version of the loop, then merge it with our branch. We need to add it to the database version too, and after all of that we can merge with the master branch.
Alex P Update: Now that the length constraint is working properly, I ran a few generations and found a small error in the mutation calculation that allowed theta to go below zero. I fixed that and started a new run so that we have a bug-free run. Plans: Continue this run for 15 generations or so, then work on the merging process with Eliot and Leo's branch. Once that is working and we have confirmed the absence of bugs, merge with master.
Eliot    
Leo    
Evelyn    
Ryan    
Ben    
Ethan    
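
The entry does not say how the mutation was changed to keep theta from going below zero. One common approach, shown here only as an illustration, is to redraw the mutated value until it lands inside the allowed range. The function name, the retry limit, and the bounds used in the example call are hypothetical.

```python
import random


def mutate_within_bounds(rng, value, sigma, lower, upper, max_tries=1000):
    """Gaussian mutation that redraws until the result lies in [lower, upper].

    If no valid draw is found after max_tries, the gene is returned unmutated.
    """
    for _ in range(max_tries):
        candidate = rng.gauss(value, sigma)
        if lower <= candidate <= upper:
            return candidate
    return value
```

For example, `mutate_within_bounds(random.Random(1), 0.05, 1.0, 0.0, 90.0)` can never return a negative angle, even though the starting value sits close to the lower bound.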

 

Entry  Thu Jul 30 15:36:54 2020, Alex M, Daily Update 7/30/20 
Name Update Plans for Tomorrow
Alex M

Update: Tested the database setting for the symmetric loop. It ran well and gave correct results, so all that's left to test is the asymmetric loop (going to start that momentarily).

Julie and I helped Ben and Ethan set up a git repository for AREA and gave a brief review of using git.

Plans: Assuming that the asymmetric setting isn't too difficult to get working on the master branch, I'll be able to start collecting real data. I'll need to implement the fixes I outlined last week on Wednesday (see the ELOG post), but those should be quick changes. I'll start by using the database for a symmetric bicone with 10 individuals per generation.
Alex P    
Eliot    
Leo    
Evelyn    
Ryan    
Ben    
Ethan    

 
