ELOG Updates and Results

Subject

Project

Thu Mar 23 21:43:12 2017

Abdullah Alhag

Analysis

The results from running Karoo on the inelasticity data

See attached file.

Attachment 1: The_result_of_using_Karoo_on_inelasticity_data.pdf

Mon Apr 24 22:27:29 2017

Brian Clark

Analysis

Estimate of ARA Station-Year/ Livetime

ARA

In response to a request by Amy, I made an estimate of the number of deep "station-years" of data obtained by ARA so far. This means roughly (# deep stations) * (# months livetime). This is very approximate, and only counts days where ARA has data in the storage vault on cobalt. It doesn't verify that cal pulsers are running, or that we actually have data for every hour of every day, etc.

Not accounting for 2013 A1 data, I get the following estimate. All I did was "ls | wc -l" on all of the data directories to count the number of days.
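For anyone who wants to repeat the count from a script, here is a minimal Python sketch of the same idea as "ls | wc -l". It assumes each data directory holds one entry per day of data; the function names are hypothetical, and the real directory paths are the ones listed in the attached spreadsheet.

```python
from pathlib import Path


def count_days(data_dir):
    """Count non-hidden entries in data_dir, like `ls | wc -l`."""
    return sum(
        1 for entry in Path(data_dir).iterdir()
        if not entry.name.startswith(".")
    )


def count_days_by_year(year_dirs):
    """Map each label (e.g. a year) to the entry count of its directory.

    year_dirs is a dict such as {"2014": "/some/path/2014"}; the paths
    are placeholders, not the actual locations on cobalt.
    """
    return {label: count_days(path) for label, path in year_dirs.items()}
```

Like the shell one-liner, this only counts directory entries; it does not check what is inside them.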

ARA1: 285 days 2012 + 124 days 2014 + 29 days 2015 + 117 days 2016 + 0 days 2017 = 555 days total

ARA2: 211 days 2013 + 310 days 2014 + 345 days 2015 + 314 days 2016 + 109 days 2017 = 1289 days total

ARA3: 214 days 2013 + 303 days 2014 + 251 days 2015 + 292 days 2016 + 0 days 2017 = 1060 days total

So for the three stations that is 2904 days total, or ~8 station years of data.
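The arithmetic above can be checked with a few lines of Python. The day counts are copied from this entry; the 365-day year used for the conversion is an assumption, since the entry does not say which year length was used.

```python
DAYS_PER_YEAR = 365.0  # assumption: not stated in the entry

# Days with data per station and year, copied from the counts above.
days_by_station = {
    "ARA1": {2012: 285, 2014: 124, 2015: 29, 2016: 117, 2017: 0},
    "ARA2": {2013: 211, 2014: 310, 2015: 345, 2016: 314, 2017: 109},
    "ARA3": {2013: 214, 2014: 303, 2015: 251, 2016: 292, 2017: 0},
}

# Sum over years for each station, then over stations.
station_totals = {s: sum(d.values()) for s, d in days_by_station.items()}
total_days = sum(station_totals.values())
station_years = total_days / DAYS_PER_YEAR

print(station_totals)  # {'ARA1': 555, 'ARA2': 1289, 'ARA3': 1060}
print(total_days, round(station_years, 2))  # 2904 7.96
```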

The spreadsheet with the calculation is attached, including which directories I searched over to make the count.

Attachment 1: Livetime_Estimate.pdf

Attachment 2: Livetime_Estimate.xlsx

Fri Jul 28 17:43:20 2017

Abdullah Alhag

Analysis

GP algorithms

In this post, I will point out the advantages and disadvantages of the GP algorithms I came across, particularly Eureqa and HeuristicLab.

Eureqa is by far the fastest genetic algorithm software I came across. It is very simple and easy to use. It has some built-in fitness functions, and with some adjustment of the function being solved for and a few other features, it is also possible to write your own fitness function. Moreover, the software is available for free for academic use and on most platforms. Other features that come with the software include the ability to normalize the data in different ways and even to handle outliers and missing data. The program supports a large collection of functions, including trig functions and more complex ones.

On the other hand, HeuristicLab is much slower than Eureqa but still far faster than Karoo-GP. The latest version of the software was released a year ago, and support for the software is fairly slow. It is only supported on Windows; however, there are plans to adapt it to Linux systems. The software supports many more features than Eureqa or Karoo, including different regression and classification algorithms. You can also get the function ready to use in other software such as MATLAB, Excel, Mathematica, and more. Another nice feature is that it shows a tree of the function and the weight of each node (operation or operand); greener means the node has more weight (see attached). Note that the software tends to grow large trees, which can be fixed by changing the default maximum tree length and maximum tree depth. The software also has a problem with the latest Windows 10 update: you will get a blue screen if you open too many windows, so be careful.

In both programs you can change how much of the data set goes to training and how much goes to testing. Be sure to shuffle the data in HeuristicLab, as it will otherwise assign the data to training and testing non-randomly. By default, both programs show a plot of the current function, with the x axis being the data entries (row numbers) and the y axis being the target values, overlaid with a curve of the function's estimated values.
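To illustrate why shuffling matters, here is a minimal Python sketch of a shuffled train/test split. It is not how either program does it internally; the function name, the 75% default and the seed are all illustrative assumptions.

```python
import random


def shuffled_split(rows, train_fraction=0.75, seed=0):
    """Shuffle rows, then split them into (training, testing) lists."""
    rows = list(rows)            # copy so the caller's data is untouched
    rng = random.Random(seed)    # fixed seed makes the split repeatable
    rng.shuffle(rows)
    n_train = int(round(train_fraction * len(rows)))
    return rows[:n_train], rows[n_train:]


# Example with row numbers standing in for data entries.
train, test = shuffled_split(range(100), train_fraction=0.8, seed=1)
print(len(train), len(test))  # 80 20
```

Without the shuffle, data that is sorted in the file (for example by x) would put one region of the data in training and another in testing.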

Below is a simple run of both programs on fake data that Prof. Amy gave me; see out.txt for the data.

To start, using Eureqa I got a few functions with an average R^2 of 0.98 or more, which is very good.

First function: frequency = (571953.335547372*y + 15786079*x^2*y^4 + 297065746*x*y^2*asinh(x))/factorial(7.44918899410674 + x + y)

The first has an R^2 of 0.99 (1 means a perfect fit) and a mean absolute error of 2.98 (0 means a perfect fit; the value is data dependent, not normalized). See plot1 for a 1D plot of the function's estimated values vs. the target values.
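For reference, the two fit metrics quoted here can be sketched in Python as follows. This assumes the standard definitions (coefficient of determination and mean absolute error); Eureqa's own definitions may differ in detail, and the function names are illustrative.

```python
def r_squared(targets, estimates):
    """Coefficient of determination: 1 means a perfect fit."""
    mean_t = sum(targets) / len(targets)
    ss_res = sum((t - e) ** 2 for t, e in zip(targets, estimates))
    ss_tot = sum((t - mean_t) ** 2 for t in targets)
    return 1.0 - ss_res / ss_tot


def mean_absolute_error(targets, estimates):
    """Average of |target - estimate|, in the units of the data."""
    return sum(abs(t - e) for t, e in zip(targets, estimates)) / len(targets)
```

Because the mean absolute error is in the units of the target, it can only be compared between fits to the same data set.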

Second function: frequency = (3569.91823898791*x*y - 149.144996501988 - 100.216589664235*x^2)/(5.26462242203202^x + 7.09216771444399^y*x^(1.3193170267439*x))

The second has an R^2 of 0.994 (1 means a perfect fit) and a mean absolute error of 3.2 (0 means a perfect fit; the value is data dependent, not normalized). See plot2 for a 1D plot of the function's estimated values vs. the target values.
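To reuse the two Eureqa results outside the program, they can be transcribed into Python roughly as below. The coefficients are copied from the formulas above. Two things are assumptions: that factorial() of a non-integer argument means gamma(n + 1), and the function names, which are made up for this sketch.

```python
import math


def eureqa_first(x, y):
    """First Eureqa function for frequency."""
    numerator = (571953.335547372 * y
                 + 15786079 * x**2 * y**4
                 + 297065746 * x * y**2 * math.asinh(x))
    # Assumption: factorial(n) for non-integer n is gamma(n + 1).
    return numerator / math.gamma(7.44918899410674 + x + y + 1.0)


def eureqa_second(x, y):
    """Second Eureqa function for frequency.

    x must be positive, because x is raised to a non-integer power.
    """
    numerator = (3569.91823898791 * x * y
                 - 149.144996501988
                 - 100.216589664235 * x**2)
    denominator = (5.26462242203202**x
                   + 7.09216771444399**y * x**(1.3193170267439 * x))
    return numerator / denominator
```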

Also attached is a 2D plot of the function against the data. The function plotted is the second function, but all are very similar; see Eq23.

Using HeuristicLab, the function below has an R^2 of 0.987, a mean absolute error of 3.6 and a normalized mean squared error of 0.012.

The function is: (((EXP((-1.3681014170483*'y')) * ((((-1.06504220396658) * (2.16142798579652*'x')) * (3.44831687407881*'y')) / ((((1.57186418519305*'y') + (2.15361749794796*'y')) / ((1.6912208581006*'y') / (EXP((1.80824695345446*'x')) * 16.3366330774664))) / ((((2.11818004168659*'x') * (1.10362178478116*'y')) - ((((-1.06504220396658) * (2.16142798579652*'x')) * (3.44831687407881*'y')) / ((((2.11818004168659*'x') + (2.15361749794796*'y')) / ((10.9740866421104 + (1.8106235953875*'y')) - (2.15361749794796*'y'))) / (((-7.8798958167) + (-6.76475761634751)) + ((2.87007061579651*'x') + (2.15361749794796*'y')))))) - (((((-8.85637334631747) * ((1.9238243855142*'y') - (1.01219957177297*'y'))) + (((-6.37085286789103) * 5.99856391145622) - ((-12.9565240969832) - 2.84841224458228))) - ((2.11818004168659*'x') * (1.10362178478116*'y'))) - (((2.11818004168659*'x') * ((((0.197306324191089*'y') + (0.255996267596584*'y')) - (2.16142798579652*'x')) - (((-1.06504220396658) * (2.16142798579652*'x')) / (12.2597910897177 / (1.25729305246107*'y'))))) - (EXP((-1.3681014170483*'y')) - ((((-6.29806655709512) * 6.39744830364858) / (12.2597910897177 / (0.728256926023423*'x'))) - (1.10362178478116*'y'))))))))) * 179.42788632856) + 2.24688690162535)

As you can see, HeuristicLab tends to generate functions that are extremely large; this one has a depth of 15 and a length of 150.

See plot3 for a 1D plot of the function's estimated values vs. the target values (note that it looks different from before because the data is shuffled), and three.png for the tree representation of the function showing the weight of each node; greener means more weight.
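As a rough illustration of what the depth and length of a function tree mean, here is a small Python sketch on a toy tree. It assumes length is the total number of nodes and depth is the number of levels counting the root as 1; HeuristicLab's exact conventions may differ. The nested-tuple representation is made up for this example.

```python
def tree_length(node):
    """Total number of nodes (operations and operands) in the tree."""
    if not isinstance(node, tuple):
        return 1  # a leaf: a constant or a variable
    # node[0] is the operation, node[1:] are its children.
    return 1 + sum(tree_length(child) for child in node[1:])


def tree_depth(node):
    """Number of levels in the tree, counting the root as level 1."""
    if not isinstance(node, tuple):
        return 1
    return 1 + max(tree_depth(child) for child in node[1:])


# Toy tree for 2.0 * x + exp(y); this is not the HeuristicLab result.
toy = ("+", ("*", 2.0, "x"), ("exp", "y"))
print(tree_length(toy), tree_depth(toy))  # 6 3
```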

Attachment 1: plot1.png

Attachment 2: plot2.png

Attachment 3: plot3.png

Attachment 4: Eq23-0.png

Attachment 5: three.png

Wed Sep 13 09:28:15 2017

Amy Connolly

Analysis

Info on generating pseudoexperiments, calculating likelihoods from them and finding p-values

Other

Will point to a bunch of papers and stuff here.

Tue Oct 10 11:04:05 2017

Brian Clark

Analysis

Testbed Channel Mapping and Antenna Information

ARA

This is the Testbed polarization channel mapping. This is the polarization you get for each channel if you use the getGraphfromRFChan function:

/* Channel mappings for the testbed
Channel 0: H Pol
Channel 1: H Pol
Channel 2: V Pol
Channel 3: V Pol
Channel 4: V Pol
Channel 5: H Pol
Channel 6: V Pol
Channel 7: H Pol
Channel 8: V Pol
Channel 9: H Pol
Channel 10: V Pol
Channel 11: H Pol
Channel 12: H Pol
Channel 13: H Pol
Channel 14: Surface
Channel 15: Surface
*/

Also, the Testbed has a somewhat bizarre menagerie of antennas. Here's how to understand it:

Check out the table of antennas for the testbed (table 1 in both papers): https://arxiv.org/pdf/1105.2854.pdf , https://arxiv.org/pdf/1404.5285.pdf

Basically the testbed was weird. There are four bowtie slotted cylinders deployed at ~30 m (these are the "deep hpol") and four bicones deployed at ~30 m (these are the "deep vpol"). So 8 total there: four V, four H.

Then, there are two quad slotted cylinders deployed at ~30 m. Because they are different from the bowties, they are technically hpol but aren't counted as deep hpol. That brings us to 10: four V, six H.

Then, there are two discones at ~2 m, which count as vpol, but because they are different from the bicones and deployed shallow, they aren't deep. That brings us to 12 total: six vpol, six hpol.

Then, there are two batwings at ~2 m, which count as hpol, but because they are different from the bowties and the quad slots and deployed shallow, they're in a class of their own. That brings us to fourteen: six vpol, eight hpol.

Finally, there are two fat dipoles right on the surface, which count as neither polarization, which brings us up to 16 total.
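The tally above can be cross-checked against the channel list with a short Python sketch. The mapping is copied from the channel list in this entry; the dictionary and helper names are hypothetical and are not part of any ARA software.

```python
from collections import Counter

# RF channel -> polarization, copied from the Testbed channel list.
TESTBED_RF_CHANNEL_POL = {
    0: "H", 1: "H", 2: "V", 3: "V", 4: "V", 5: "H", 6: "V", 7: "H",
    8: "V", 9: "H", 10: "V", 11: "H", 12: "H", 13: "H",
    14: "Surface", 15: "Surface",
}


def channels_with(pol):
    """Return the sorted RF channel numbers with the given polarization."""
    return sorted(ch for ch, p in TESTBED_RF_CHANNEL_POL.items() if p == pol)


counts = Counter(TESTBED_RF_CHANNEL_POL.values())
print(counts["H"], counts["V"], counts["Surface"])  # 8 6 2
print(channels_with("V"))  # [2, 3, 4, 6, 8, 10]
```

The counts agree with the description: eight hpol, six vpol and two surface antennas, 16 in total.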

Draft

Fri Dec 16 11:24:09 2016

Software Installation