Message ID: 32     Entry time: Mon Dec 17 21:16:31 2018
Author: Brian Clark 
Subject: Run over many data files in parallel 
Project:  

To analyze data, we sometimes need to run over many thousands of runs at once. To do this in parallel, we can submit a job for every run we want to do. This will proceed in several steps:

  1. We need to prepare an analysis program.
    1. This is demo.cxx.
    2. The program will take an input data file and an output location.
    3. The program will do some analysis on each event, and then write the result of that analysis to an output file labeled by the same number as the input file.
  2. We need to prepare a job script for PBS.
    1. This is "run.sh"; this is the set of instructions to be submitted to the cluster.
    2. The instructions say to:
      1. Source a shell environment
      2. Run the executable
      3. Move the output root file to the output location.
    3. Note that we're telling the program we wrote in step 1 to write to the node-local $TMPDIR, and then moving the result to our final output directory at the end. This is better for cluster performance, since each job touches the shared file system only once, when it copies out its finished output.
  3. We need to make a list of data files to run over
    1. We can do this on OSC by running ls -d -1 /fs/scratch/PAS0654/ara/10pct/RawData/A3/2013/sym_links/event*.root > run_list.txt
    2. This places the full path to the ROOT files in that folder into a list called run_list.txt that we can loop over.
  4. Finally, we need a script that will submit all of the jobs to the cluster.
    1. This is "submit_jobs.sh".
    2. This loops over all the files in our run_list.txt and submits a run.sh job for each of them.
    3. This is also where we define the $RUNDIR (where the code is to be executed) and the $OUTPUTDIR (where the output products are to be stored).
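
Step 4 can be sketched as a short loop. The attached submit_jobs.sh is the authoritative version; what follows is only an illustration, and it assumes the variables reach run.sh through qsub's -v option. The paths for $RUNDIR and $OUTPUTDIR are placeholders, and the QSUB variable and submit_all function are hypothetical names introduced here so the loop can be dry-run without submitting anything.

```shell
#!/bin/bash
# Hypothetical sketch of a submission loop; the attached submit_jobs.sh may differ.

# Placeholder paths: replace with your own code and output locations.
RUNDIR=${RUNDIR:-/path/to/your/code}
OUTPUTDIR=${OUTPUTDIR:-/path/to/your/output}
RUN_LIST=${RUN_LIST:-run_list.txt}

# Set QSUB="echo qsub" to print the commands instead of submitting them.
QSUB=${QSUB:-qsub}

# Submit one run.sh job per line of the run list given as $1.
submit_all() {
    local file
    while IFS= read -r file; do
        # Skip blank lines in the list
        [ -z "$file" ] && continue
        # -v passes a comma-separated list of variables into the job's environment.
        # $QSUB is left unquoted on purpose so "echo qsub" splits into two words.
        $QSUB -v "RUNDIR=$RUNDIR,OUTPUTDIR=$OUTPUTDIR,FILE=$file" run.sh
    done < "$1"
}

# Only submit when a run list is actually present.
if [ -f "$RUN_LIST" ]; then
    submit_all "$RUN_LIST"
fi
```

Running it once with QSUB="echo qsub" is a cheap way to check the variables before sending thousands of jobs to the queue.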

Once you've generated all of these output files, you can make plots and other summaries by running over the output files alone, without touching the raw data again.
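
Before plotting, it can help to confirm that every input file actually produced an output. The sketch below is one way to do that, not part of the original attachments. It assumes each output file is named <prefix><run number>.root, where the prefix "output_" is a made-up default; the real name is whatever demo.cxx writes, so adjust it to match. The function names run_number and list_missing are likewise hypothetical.

```shell
#!/bin/bash
# Sketch: find inputs in the run list that have no output file yet.

# Extract the run number from a path like .../event1000.root (gives 1000).
run_number() {
    local base
    base=$(basename "$1" .root)
    echo "${base#event}"
}

# Print every input file in the run list ($1) with no matching output in $2.
# ASSUMPTION: outputs are named <prefix><run number>.root; the default
# prefix "output_" is hypothetical and must match what your program writes.
list_missing() {
    local run_list=$1 outdir=$2 prefix=${3:-output_} file num
    while IFS= read -r file; do
        # Skip blank lines in the list
        [ -z "$file" ] && continue
        num=$(run_number "$file")
        if [ ! -e "$outdir/${prefix}${num}.root" ]; then
            echo "$file"
        fi
    done < "$run_list"
}
```

For example, list_missing run_list.txt "$OUTPUTDIR" > rerun_list.txt would collect the inputs that still need a job, in the same format as run_list.txt.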

 

Attachment 1: demo.cxx  1 kB  Uploaded Mon Dec 17 22:17:00 2018
Attachment 2: run.sh  704 Bytes  Uploaded Mon Dec 17 22:17:00 2018
#!/bin/bash
#PBS -l nodes=1:ppn=1
#PBS -l mem=4GB
#PBS -l walltime=00:05:00
#PBS -A PAS0654
#PBS -e /fs/scratch/PAS0654/shell_demo/err_out_logs
#PBS -o /fs/scratch/PAS0654/shell_demo/err_out_logs

# you should change the -e and -o to write your 
# log files to a location of your preference

# source your own shell script here
eval 'source /users/PAS0654/osu0673/A23_analysis/env.sh'

# $RUNDIR was defined in the submission script 
# along with $FILE and $OUTPUTDIR

# stop here if the run directory is missing, rather than running from the wrong place
cd "$RUNDIR" || exit 1

# $TMPDIR is scratch disk space local to this specific node
# the batch system sets it for us, so it's the only variable we didn't have to define

./bin/demo "$FILE" "$TMPDIR"

# after we're done
# we copy the results to the $OUTPUTDIR

pbsdcp $TMPDIR/'*.root' $OUTPUTDIR
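
The job script above relies on $RUNDIR, $FILE and $OUTPUTDIR being passed in by the submission script. If one of them is missing, the job fails in a confusing way. A minimal sketch of a guard that could sit near the top of run.sh follows; require_vars is a hypothetical helper, not part of the attachment.

```shell
#!/bin/bash
# Sketch: check that the variables run.sh expects were actually passed in.

# Return 1 (and report on stderr) if any named variable is unset or empty.
require_vars() {
    local name missing=0
    for name in "$@"; do
        # ${!name} is bash indirect expansion: the value of the variable named by $name
        if [ -z "${!name}" ]; then
            echo "ERROR: $name is not set" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

In run.sh this would be used as: require_vars RUNDIR FILE OUTPUTDIR || exit 1
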
Attachment 3: run_list.txt  1 kB  Uploaded Mon Dec 17 22:17:00 2018
/fs/scratch/PAS0654/ara/10pct/RawData/A3/2013/sym_links/event1000.root
/fs/scratch/PAS0654/ara/10pct/RawData/A3/2013/sym_links/event1001.root
/fs/scratch/PAS0654/ara/10pct/RawData/A3/2013/sym_links/event1002.root
/fs/scratch/PAS0654/ara/10pct/RawData/A3/2013/sym_links/event1004.root
/fs/scratch/PAS0654/ara/10pct/RawData/A3/2013/sym_links/event1005.root
/fs/scratch/PAS0654/ara/10pct/RawData/A3/2013/sym_links/event1006.root
/fs/scratch/PAS0654/ara/10pct/RawData/A3/2013/sym_links/event1007.root
/fs/scratch/PAS0654/ara/10pct/RawData/A3/2013/sym_links/event1009.root
/fs/scratch/PAS0654/ara/10pct/RawData/A3/2013/sym_links/event1010.root
/fs/scratch/PAS0654/ara/10pct/RawData/A3/2013/sym_links/event1011.root
/fs/scratch/PAS0654/ara/10pct/RawData/A3/2013/sym_links/event1012.root
/fs/scratch/PAS0654/ara/10pct/RawData/A3/2013/sym_links/event1014.root
/fs/scratch/PAS0654/ara/10pct/RawData/A3/2013/sym_links/event1015.root
/fs/scratch/PAS0654/ara/10pct/RawData/A3/2013/sym_links/event1016.root
/fs/scratch/PAS0654/ara/10pct/RawData/A3/2013/sym_links/event1017.root
/fs/scratch/PAS0654/ara/10pct/RawData/A3/2013/sym_links/event1019.root
/fs/scratch/PAS0654/ara/10pct/RawData/A3/2013/sym_links/event1020.root
/fs/scratch/PAS0654/ara/10pct/RawData/A3/2013/sym_links/event1021.root
/fs/scratch/PAS0654/ara/10pct/RawData/A3/2013/sym_links/event1022.root
/fs/scratch/PAS0654/ara/10pct/RawData/A3/2013/sym_links/event1024.root
Attachment 4: submit_jobs.sh  497 Bytes  Uploaded Mon Dec 17 22:17:00 2018