Using TREE-PUZZLE on Big Red at IU
On this page:
- Introduction
- General information about submitting jobs
- Using default options
- Using more than 4 processes
- Running for more than 2 hours
- Specifying options to TREE-PUZZLE
Introduction
TREE-PUZZLE builds evolutionary trees from molecular sequence data
using maximum-likelihood methods. On Big Red at Indiana
University, TREE-PUZZLE is installed in
/N/soft/linux-sles9-ppc64/tree-puzzle-5.2. Documentation
is available in
/N/soft/linux-sles9-ppc64/tree-puzzle-5.2/doc and
online. Both the single-process (serial) and multi-process
versions are installed. The single-process version is Puzzle, and you
can place it on your path using the command:
You can run the multi-process version using a script called
ppuzzlejob, which should be available from the command
line on Big Red by default. This script submits a job to the batch
queue. References to TREE-PUZZLE below are to the multi-process
(parallel) version.
General information about submitting jobs
The script ppuzzlejob submits a job to the queue on
Big Red. It should report that your job has been submitted. When your
job has finished, you will receive mail. To check the status
of your job, run the command:
Replace username with your username.
Using default options
If 4 processes, a 2-hour run time limit, and TREE-PUZZLE's default options suit your job, change to the directory that contains your data file and run the command:
ppuzzlejob my_data_file
Replace my_data_file with the name of the file that
contains your data.
Using more than 4 processes
Use the -p option to specify the number of
processes that you would like to use:
ppuzzlejob -p num_of_procs my_data_file
Replace num_of_procs with the number of processes that
you want, and my_data_file with the name of your data
file. For example, to use 64 processes on a file named
globin.a, run the command:
ppuzzlejob -p 64 globin.a
When specifying the number of processes, use a multiple of 4. Otherwise, your request is rounded up to the next multiple of 4.
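To see in advance what process count a request will turn into, the rounding rule can be sketched in the shell. This is only an illustration of the arithmetic, not code taken from ppuzzlejob; the function name round_up_to_4 is hypothetical.

```shell
# Hypothetical helper: round a requested process count up to the
# next multiple of 4, mirroring the rule described above.
# This is NOT part of ppuzzlejob; it only illustrates the arithmetic.
round_up_to_4() {
  # Integer division truncates, so adding 3 first rounds upward.
  echo $(( ($1 + 3) / 4 * 4 ))
}

round_up_to_4 30   # a request for 30 processes becomes 32
round_up_to_4 64   # 64 is already a multiple of 4 and is unchanged
```

Requesting a multiple of 4 directly avoids any surprise about how many processes the job actually uses.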
Published results show that doubling the number of processes halves
execution time up to at least 12 processes. How well TREE-PUZZLE
scales beyond 12 processes is unknown, although it probably scales
well given the nature of the problem it solves. In the queue to which
ppuzzlejob submits jobs by default, you can request at most 128
processes. Another queue is available that supports more processes.
For details, see the Unix man page for ppuzzlejob.
Running for more than 2 hours
Jobs are allowed to run for only 2 hours unless you request more
time. You can request at most 336 hours (14 days) from the default
queue to which ppuzzlejob submits jobs. (Other queues are
available that allow less time. For details, see the Unix man
page for ppuzzlejob.)
Use the -wallhours option to request more time, specified as a whole
number of hours. For example, to run the 64-process globin.a job from
the previous section for 42 hours, you
would use the command:
ppuzzlejob -p 64 -wallhours 42 globin.a
Specifying options to TREE-PUZZLE
TREE-PUZZLE accepts options from a file that is separate from your
data file. The options file contains two lines per option: the first
contains the option name and the second the value. The file ends with
a y to signal the end of options. For
example, to set the value of option t to 10,
you would place the following in the options file, with no leading
spaces:
t
10
y
When running ppuzzlejob, specify the options file using
the -f option:
ppuzzlejob -f optfile datafile
Replace optfile and datafile with the names
of your options and data files, respectively. For example, to run
TREE-PUZZLE with an options file named globin.opts, a data
file named globin.dat, and 32 processes, you would use:
ppuzzlejob -f globin.opts -p 32 globin.dat
For the available options and their meanings, consult the TREE-PUZZLE documentation described in the introduction.
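The options file format described above (a name line and a value line for each option, then a closing y) can also be generated from the shell rather than typed by hand. The following is a minimal sketch under that description; the function name write_opts is hypothetical, and the script only writes the file without submitting a job.

```shell
# Hypothetical helper: write a TREE-PUZZLE options file.
# Usage: write_opts FILE NAME VALUE [NAME VALUE ...]
# Each option becomes two lines (name, then value); a final "y"
# marks the end of the options, as described in the text above.
write_opts() {
  out=$1
  shift
  : > "$out"                          # create or truncate the file
  while [ "$#" -ge 2 ]; do
    printf '%s\n%s\n' "$1" "$2" >> "$out"
    shift 2
  done
  printf 'y\n' >> "$out"              # end-of-options marker
}

# Set option t to 10, matching the example in this section.
write_opts globin.opts t 10
cat globin.opts
```

Because printf writes each name and value at the start of its own line, the resulting file has no leading spaces.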
This document was developed with support from the National Science Foundation (NSF) under Grant No. 0503697 to the University of Chicago and subcontracted to Indiana University. Additional support was provided by IU through its participation in the TeraGrid, which is supported by the NSF under Grants No. 0833618, SCI451237, SCI535258, and SCI504075. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.
Last modified on May 23, 2008.