Important Plots, Tables, and Measurements, Page 2 of 2
ID Date Author Type Category Subject Project
  13   Thu Mar 23 21:43:12 2017   Abdullah Alhag   Analysis   Analysis   The results from running Karoo on the inelasticity data

See attached file.

Attachment 1: The_result_of_using_Karoo_on_inelasticity_data.pdf
  17   Mon Apr 24 22:27:29 2017   Brian Clark   Analysis   Analysis   Estimate of ARA Station-Year / Livetime   ARA

In response to a request by Amy, I made an estimate of the number of deep "station-years" of data obtained by ARA so far. This means roughly (# deep stations) * (# months livetime). This is very approximate, and only counts days where ARA has data in the storage vault on cobalt. It doesn't verify that cal pulsers are running, or that we actually have data for every hour of every day, etc.

Not accounting for 2013 A1 data, I get the following estimate. All I did was "ls | wc -l" on all of the data directories to count the number of days.
ARA1: 285 days 2012 + 124 days 2014 + 29 days 2015 + 117 days 2016 + 0 days 2017 = 555 days total
ARA2: 211 days 2013 + 310 days 2014 + 345 days 2015 + 314 days 2016 + 109 days 2017 = 1289 days total
ARA3: 214 days 2013 + 303 days 2014 + 251 days 2015 + 292 days 2016 + 0 days 2017 = 1060 days total
So for the three stations that is 2904 days total, or ~8 station years of data.
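
The arithmetic above can be reproduced with a short Python sketch. The day counts are copied from the lines above; the names `DAYS`, `station_totals`, and `station_years` are made up for illustration, and the conversion assumes 365 days per year.

```python
# Day counts per station and year, copied from the estimate above.
DAYS = {
    "ARA1": {2012: 285, 2014: 124, 2015: 29, 2016: 117, 2017: 0},
    "ARA2": {2013: 211, 2014: 310, 2015: 345, 2016: 314, 2017: 109},
    "ARA3": {2013: 214, 2014: 303, 2015: 251, 2016: 292, 2017: 0},
}

DAYS_PER_YEAR = 365.0  # assumption: leap years are ignored


def station_totals(days):
    """Return the total number of days with data for each station."""
    return {station: sum(per_year.values()) for station, per_year in days.items()}


def station_years(days, days_per_year=DAYS_PER_YEAR):
    """Convert the summed day count over all stations into station-years."""
    return sum(station_totals(days).values()) / days_per_year


totals = station_totals(DAYS)
print(totals)
print(sum(totals.values()), "days total")
print(round(station_years(DAYS), 2), "station-years")
```

Like the estimate itself, this only sums directory counts and says nothing about data quality within a day.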
 
The spreadsheet with the calculation is attached, including which directories I searched over to make the count.
Attachment 1: Livetime_Estimate.pdf
Attachment 2: Livetime_Estimate.xlsx
  20   Fri Jul 28 17:43:20 2017   Abdullah Alhag   Analysis   Analysis   GP algorithms

In this post, I will point out the advantages and disadvantages of the GP algorithms I came across, in particular Eureqa and HeuristicLab.

Eureqa is by far the fastest genetic algorithm software I came across. It is very simple and easy to use. It has some built-in fitness functions, and with some adjustment of the function being solved for and a few other features, it is possible to write your own fitness function. Moreover, the software is available for free for academic use and for most platforms. Other features include the ability to normalize the data in different ways and to handle outliers and missing data. The program supports a large collection of functions, including trig functions and more complex ones.

On the other hand, HeuristicLab is much slower than Eureqa but still far faster than Karoo-GP. The latest version of the software was released a year ago, and support for the software is fairly slow. It is only supported on Windows; however, there are plans to adapt it to Linux systems. The software supports far more features than Eureqa or Karoo, including different regression and classification algorithms. You can also get the function ready to use in many programs, such as MATLAB, Excel, Mathematica, and more. Another cool feature is that it shows you a tree of the function and the weight of each node (operation or operand); greener means the node has more weight (see attached). Note that the software has a tendency to grow large trees, which can be fixed by changing the default max tree length and max tree depth. The software has a problem with the latest update of Windows 10: you will get a blue screen if you open too many windows, so be careful.

In both programs you can change how much of the data set goes to training and how much goes to testing. Be sure to shuffle the data in HeuristicLab, as it will otherwise split the data into training and testing non-randomly. By default, both programs show a plot of the current function, with the x axis being the data entries (row numbers) and the y axis being the target values, along with a curve of the function's estimated values.
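
To illustrate why shuffling matters, here is a minimal Python sketch of a shuffled training/testing split. It is not how either program implements the split internally; the function name `shuffled_split`, the 75% training fraction, and the fake rows are all assumptions for the example.

```python
import random


def shuffled_split(rows, train_fraction=0.75, seed=0):
    """Shuffle a copy of the rows, then cut it into training and testing parts.

    Without the shuffle, a data file sorted by x or y would put one region of
    the data in training and a different region in testing.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    shuffled = list(rows)                  # copy, so the input is left untouched
    random.Random(seed).shuffle(shuffled)  # fixed seed makes the split reproducible
    n_train = int(round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical rows of (x, target) values, sorted by x as a worst case.
rows = [(x, 2 * x) for x in range(20)]
train, test = shuffled_split(rows)
print(len(train), "training rows,", len(test), "testing rows")
```

Fixing the seed keeps the split the same between runs, which makes it easier to compare fits from different programs on the same rows.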

Below is a simple run of both programs on some fake data that Prof. Amy gave me; see out.txt for the data.

To start, using Eureqa I got a few functions with a mean R^2 of 0.98 or more, which is very good.

First function: frequency = (571953.335547372*y + 15786079*x^2*y^4 + 297065746*x*y^2*asinh(x))/factorial(7.44918899410674 + x + y)

The first has an R^2 of 0.99 (1 means perfect fit) and a mean absolute error of 2.98 (0 means perfect fit; data dependent, not normalized). See plot1 for a 1D plot of the function's estimated values vs. the target values.

 

Second function: frequency = (3569.91823898791*x*y - 149.144996501988 - 100.216589664235*x^2)/(5.26462242203202^x + 7.09216771444399^y*x^(1.3193170267439*x))

The second has an R^2 of 0.994 (1 means perfect fit) and a mean absolute error of 3.2 (0 means perfect fit; data dependent, not normalized). See plot2 for a 1D plot of the function's estimated values vs. the target values.
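
For reference, the two quality measures quoted here can be sketched in a few lines of Python. This assumes R^2 is the usual coefficient of determination (1 - SS_res/SS_tot); Eureqa may define its R^2 differently (for example, as a squared correlation coefficient), so treat this as an illustration and not as the program's exact calculation. The sample numbers are made up.

```python
def mean_absolute_error(targets, estimates):
    """Average of |target - estimate|; 0 means a perfect fit, in data units."""
    return sum(abs(t - e) for t, e in zip(targets, estimates)) / len(targets)


def r_squared(targets, estimates):
    """Coefficient of determination, 1 - SS_res / SS_tot; 1 means a perfect fit."""
    mean_target = sum(targets) / len(targets)
    ss_res = sum((t - e) ** 2 for t, e in zip(targets, estimates))
    ss_tot = sum((t - mean_target) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot


# Made-up target values and function estimates, just to exercise the functions.
targets = [1.0, 2.0, 3.0, 4.0]
estimates = [1.5, 2.0, 3.0, 3.5]
print("MAE:", mean_absolute_error(targets, estimates))
print("R^2:", r_squared(targets, estimates))
```

Because the mean absolute error is not normalized, it can only be compared between fits to the same data set.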

Also attached is a 2D plot of the function against the data. The function plotted is the second function, but all are very similar; see Eq23.

 

Using HeuristicLab, the function below has an R^2 of 0.987, a mean absolute error of 3.6, and a normalized mean squared error of 0.012.

The function is: (((EXP((-1.3681014170483*'y')) * ((((-1.06504220396658) * (2.16142798579652*'x')) * (3.44831687407881*'y')) / ((((1.57186418519305*'y') + (2.15361749794796*'y')) / ((1.6912208581006*'y') / (EXP((1.80824695345446*'x')) * 16.3366330774664))) / ((((2.11818004168659*'x') * (1.10362178478116*'y')) - ((((-1.06504220396658) * (2.16142798579652*'x')) * (3.44831687407881*'y')) / ((((2.11818004168659*'x') + (2.15361749794796*'y')) / ((10.9740866421104 + (1.8106235953875*'y')) - (2.15361749794796*'y'))) / (((-7.8798958167) + (-6.76475761634751)) + ((2.87007061579651*'x') + (2.15361749794796*'y')))))) - (((((-8.85637334631747) * ((1.9238243855142*'y') - (1.01219957177297*'y'))) + (((-6.37085286789103) * 5.99856391145622) - ((-12.9565240969832) - 2.84841224458228))) - ((2.11818004168659*'x') * (1.10362178478116*'y'))) - (((2.11818004168659*'x') * ((((0.197306324191089*'y') + (0.255996267596584*'y')) - (2.16142798579652*'x')) - (((-1.06504220396658) * (2.16142798579652*'x')) / (12.2597910897177 / (1.25729305246107*'y'))))) - (EXP((-1.3681014170483*'y')) - ((((-6.29806655709512) * 6.39744830364858) / (12.2597910897177 / (0.728256926023423*'x'))) - (1.10362178478116*'y'))))))))) * 179.42788632856) + 2.24688690162535)

 

As you can see, HeuristicLab tends to generate functions that are extremely large; this one has a depth of 15 and a length of 150.
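
To make "depth" and "length" concrete, here is a small Python sketch that measures both on an expression tree written as nested tuples. The tuple representation and the counting convention (length is the number of nodes, depth counts the root as 1) are assumptions for illustration; HeuristicLab's own counts may differ by a constant, for example if it includes extra start nodes.

```python
def tree_length(node):
    """Number of nodes (operations plus operands) in the expression tree."""
    if not isinstance(node, tuple):
        return 1  # a leaf: a variable name or a constant
    return 1 + sum(tree_length(child) for child in node[1:])


def tree_depth(node):
    """Number of levels from the root down to the deepest leaf (root counts as 1)."""
    if not isinstance(node, tuple):
        return 1
    return 1 + max((tree_depth(child) for child in node[1:]), default=0)


# Hypothetical small tree for (2.0 * x) + exp(y): the operator comes first in each tuple.
example = ("+", ("*", 2.0, "x"), ("exp", "y"))
print("length:", tree_length(example))
print("depth:", tree_depth(example))
```

Limits on these two numbers are what the max tree length and max tree depth settings constrain.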

 

See plot3 for a 1D plot of the function's estimated values vs. the target values (note that it looks different from before because the data is shuffled), and three.png for the tree representation of the function showing the weight of each node; greener means the node has more weight.

Attachment 1: plot1.png
Attachment 2: plot2.png
Attachment 3: plot3.png
Attachment 4: Eq23-0.png
Attachment 5: three.png
  22   Wed Sep 13 09:28:15 2017   Amy Connolly   Analysis   Analysis   Info on generating pseudoexperiments, calculating likelihoods from them and finding p-values   Other

Will point to a bunch of papers and stuff here.

  28   Tue Oct 10 11:04:05 2017   Brian Clark   Analysis   Analysis   Testbed Channel Mapping and Antenna Information   ARA

This is the Testbed polarization channel mapping, i.e. the polarization you get for each channel if you use the getGraphfromRFChan function:

/* Channel mappings for the testbed
Channel 0: H Pol
Channel 1: H Pol
Channel 2: V Pol
Channel 3: V Pol
Channel 4: V Pol
Channel 5: H Pol
Channel 6: V Pol
Channel 7: H Pol
Channel 8: V Pol
Channel 9: H Pol
Channel 10: V Pol
Channel 11: H Pol
Channel 12: H Pol
Channel 13: H Pol
Channel 14: Surface
Channel 15: Surface
*/
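
For use in analysis scripts, the same mapping can be written as a lookup table. This is a Python sketch only; the names `TESTBED_CHANNEL_POL` and `channels_with` are made up here and are not part of the ARA software.

```python
# Testbed RF channel -> polarization, copied from the mapping above.
TESTBED_CHANNEL_POL = {
    0: "H", 1: "H", 2: "V", 3: "V",
    4: "V", 5: "H", 6: "V", 7: "H",
    8: "V", 9: "H", 10: "V", 11: "H",
    12: "H", 13: "H", 14: "Surface", 15: "Surface",
}


def channels_with(pol, mapping=TESTBED_CHANNEL_POL):
    """Return the sorted list of channel numbers with the given polarization label."""
    return sorted(ch for ch, p in mapping.items() if p == pol)


print("V Pol:", channels_with("V"))
print("H Pol:", channels_with("H"))
print("Surface:", channels_with("Surface"))
```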

Also, the Testbed has a somewhat bizarre menagerie of antennas. Here's how to understand it:

Check out the table of antennas for the testbed (table 1 in both papers): https://arxiv.org/pdf/1105.2854.pdf , https://arxiv.org/pdf/1404.5285.pdf
 
Basically the testbed was weird. There are four bowtie slotted cylinders deployed at ~30 m (these are the "deep hpol") and four bicones deployed at ~30 m (these are the "deep vpol"). So 8 total there: four V, four H.
 
Then, there are two quad slotted cylinders deployed at ~30m, but because they are different from bowties, they are technically hpol, but aren't counted as deep hpol. That brings us to 10. Four V, six H.
 
Then, there are two discones at ~2m, which count as vpol, but because they are different than the bicones and deployed shallow, they aren't deep. That brings us to 12 total: six vpol, six hpol.
 
Then, there are two batwings at ~2m, which count as hpol, but because they are different than the bowties and the quad slots and are deployed shallow, they're in a class of their own. That brings us to fourteen: six vpol, eight hpol.
 
Finally, there are two fat dipoles right on the surface, which count as neither polarization, which brings us up to 16 total.
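
The running tally above can be checked with a short Python sketch. The antenna types, counts, and approximate depths are taken from the description; the list layout, the "Surface" label for the fat dipoles, and the function name `tally_by_polarization` are assumptions for illustration.

```python
from collections import Counter

# (antenna type, number deployed, approximate depth in m, polarization label)
TESTBED_ANTENNAS = [
    ("bowtie slotted cylinder", 4, 30, "H"),  # the "deep hpol"
    ("bicone", 4, 30, "V"),                   # the "deep vpol"
    ("quad slotted cylinder", 2, 30, "H"),    # hpol, but not counted as deep hpol
    ("discone", 2, 2, "V"),                   # shallow vpol
    ("batwing", 2, 2, "H"),                   # shallow hpol
    ("fat dipole", 2, 0, "Surface"),          # on the surface, neither polarization
]


def tally_by_polarization(antennas):
    """Sum the number of antennas for each polarization label."""
    counts = Counter()
    for _name, number, _depth_m, pol in antennas:
        counts[pol] += number
    return counts


counts = tally_by_polarization(TESTBED_ANTENNAS)
print(dict(counts), "total:", sum(counts.values()))
```

The result (eight hpol, six vpol, two surface) matches the number of channels of each type in the channel mapping.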