Indiana University
University Information Technology Services
  

Using TREE-PUZZLE on Big Red at IU


Introduction

TREE-PUZZLE builds evolutionary trees from molecular sequence data using maximum-likelihood methods. On Big Red at Indiana University, TREE-PUZZLE is installed in /N/soft/linux-sles9-ppc64/tree-puzzle-5.2. Documentation is available in /N/soft/linux-sles9-ppc64/tree-puzzle-5.2/doc; it's also available online. Both the single-process (serial) and multi-process versions are installed. The single-process version is Puzzle, and you can place it on your path using the command:

soft add +tree-puzzle

You can run the multi-process version using a script called ppuzzlejob, which should be available from the command line on Big Red by default. This script submits a job to the batch queue. References to TREE-PUZZLE below are to the multi-process (parallel) version.

General information about submitting jobs

The script ppuzzlejob submits a job to the batch queue on Big Red and should report that your job has been submitted. When your job finishes, you will receive an email message. To check the status of your job in the meantime, run the command:

llq -u username

Replace username with your username.

Using default options

If 4 processes, a run time of up to 2 hours, and TREE-PUZZLE's default options are sufficient, change into the directory that contains your data file and run the command:

ppuzzlejob my_data_file

Replace my_data_file with the name of the file that contains your data.

Using more than 4 processes

Use the -p option to specify the number of processes that you would like to use:

ppuzzlejob -p num_of_procs my_data_file

Replace num_of_procs with the number of processes that you want, and my_data_file with the name of your data file. For example, to use 64 processes on a file named globin.a, run the command:

ppuzzlejob -p 64 globin.a

When specifying processes, use a multiple of 4. If you do not, your request will be rounded up to the next multiple of 4; for example, a request for 30 processes becomes 32.
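To see in advance what process count a request will turn into, the rounding rule can be reproduced with shell arithmetic. This is only a sketch of the rule as described above, not the code ppuzzlejob actually uses; the function name round_up_to_4 is hypothetical, and it assumes that a request that is already a multiple of 4 is left unchanged.

```shell
# Round a requested process count up to the next multiple of 4.
# Assumption: a value that is already a multiple of 4 stays as it is.
round_up_to_4() {
    echo $(( ($1 + 3) / 4 * 4 ))
}

round_up_to_4 30   # prints 32
round_up_to_4 64   # prints 64
```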

Published results show that doubling the number of processes halves execution time up to at least 12 processes. How well TREE-PUZZLE scales beyond 12 processes is unknown, although the nature of the problem it solves suggests that it should scale well. In the queue to which jobs are submitted by default, you can request at most 128 processes. Another queue is available that supports more processes. For details, see the Unix man page for ppuzzlejob.
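Because scaling beyond 12 processes is not documented, you may want to time the same data file at several process counts. The following is a minimal sketch rather than a supported tool: the function name scaling_commands is hypothetical, and it only prints the ppuzzlejob commands (a dry run) for process counts that double from 4 up to the 128-process limit of the default queue.

```shell
# Dry run: print one ppuzzlejob command per process count, doubling from 4 to 128.
# Nothing is submitted; the function only echoes the commands.
scaling_commands() {
    data_file=$1
    procs=4
    while [ "$procs" -le 128 ]; do
        echo "ppuzzlejob -p $procs $data_file"
        procs=$((procs * 2))
    done
}

scaling_commands globin.a
```

Review the printed commands first, then run the ones you want by hand. Every count produced is a multiple of 4, so none will be adjusted when the job is submitted.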

Running for more than 2 hours

Jobs are allowed to run for only 2 hours unless you request more time. You can request at most 336 hours (14 days) from the default queue to which ppuzzlejob submits jobs. (Other queues are available that allow less time. For details, see the Unix man page for ppuzzlejob.)

Use the -wallhours option to request more time in integer hours. For example, to run the same job as above for 42 hours, you would use the command:

ppuzzlejob -p 64 globin.a -wallhours 42
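Since -wallhours takes whole hours and the default queue allows at most 336, it can help to check a value before submitting. The sketch below is an illustration only: valid_wallhours is a hypothetical helper, not part of ppuzzlejob, and the lower bound of 1 hour is an assumption.

```shell
# Succeed only if the argument is a whole number of hours between 1 and 336.
# Assumption: 1 is the smallest useful request; 336 is the default-queue maximum.
valid_wallhours() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;   # empty, or contains a non-digit such as "." or "-"
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 336 ]
}

hours=42
if valid_wallhours "$hours"; then
    echo "ppuzzlejob -p 64 globin.a -wallhours $hours"
else
    echo "wallhours must be an integer from 1 to 336" >&2
fi
```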

Specifying options to TREE-PUZZLE

TREE-PUZZLE accepts options from a file that is separate from your data file. The options file contains two lines per option: the first contains the option name and the second the value. The file ends with a y to signal the end of options. For example, to set the value of option t to 10, you would place the following in the options file, with no leading spaces:

t
10
y
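One way to create such a file from the shell is with printf, which writes each argument on its own line with no leading spaces. This is a sketch; the file name globin.opts is just the example name used on this page.

```shell
# Write an options file that sets option t to 10 and then ends with y.
# printf reuses its format for every argument, so each item lands on its own line.
opts_file=globin.opts
printf '%s\n' t 10 y > "$opts_file"

# Show the result so it can be checked before submitting a job.
cat "$opts_file"
```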

When running ppuzzlejob, specify the option file using the -f option:

ppuzzlejob -f optfile datafile

Replace optfile and datafile with the names of your options and data files, respectively. For example, to run TREE-PUZZLE with an option file named globin.opts, a data file named globin.dat, and 32 processes, you would use:

ppuzzlejob -p 32 -f globin.opts globin.dat

Consult the TREE-PUZZLE documentation, as described above, for the options and their meanings.

This document was developed with support from the National Science Foundation (NSF) under Grant No. 0503697 to the University of Chicago and subcontracted to Indiana University. Additional support was provided by IU through its participation in the TeraGrid, which is supported by the NSF under Grants No. 0833618, SCI451237, SCI535258, and SCI504075. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.

This is document awwe in domains all and tgrid-all.
Last modified on May 23, 2008.
