Using PCAP on Big Red at IU
PCAP is a software package for large-scale assembly of genomic sequences with quality values and with or without forward-reverse read pairs. At Indiana University, PCAP is available on Big Red.
Note: The related package, CAP3, is for small-scale assembly of EST sequences with or without quality values. For more, see Using CAP3 on Big Red at IU.
Request a Big Red account
Access to Big Red is provided to all IU faculty and graduate students, and faculty-sponsored undergraduates and staff. Instructional use is limited to courses that have been approved by the Director for Research Technologies. If you use Big Red, you need to know the Big Red usage policies.
To request a Big Red account, use the Account Management System (AMS); see At IU, if I already have some computing accounts, how do I get others? For more, see Getting started on Big Red.
Send in a license agreement
PCAP is restricted-use software. If you are at IU and want to use this package, see CAP3 and PCAP Assembly Programs for information on obtaining a license agreement. Then, email Bioinformatics Support agreeing to the license terms and requesting the use of the PCAP package on Big Red.
Set up SoftEnv and submit jobs
- To set up SoftEnv, add the line +pcap to your .soft file, and then execute the resoft command.
- Read the instruction files before using the package:
  - To learn to use PCAP, read /N/soft/linux-sles9-ppc64/PCAP-64/doc/Doc.
  - To learn to use pcap.rep, read /N/soft/linux-sles9-ppc64/PCAP-64/doc/Doc.rep.
  - To learn to use autopcap, read /N/soft/linux-sles9-ppc64/PCAP-64/doc/autopcap.doc.
- To see the options for PCAP, pcap.rep, and autopcap, enter the name of the program at the command line. You will see something like the following.

  For PCAP:

      jdoe@BigRed:~> pcap
      VersionDate: 06/07/05
      Usage: pcap File_of_file_names [options]
      File_of_file_names is a file of names of read files
      If File_of_file_names is named 'xyz',
      then the file of constraints must be named 'xyz.con'.
      Options (default values):
        -a N  specify band expansion size N > 10 (15)
        -b N  specify min distance between diag. bands N > 30 (65)
        -c N  specify base quality cutoff for clipping N > 5 (10)
        -e N  specify segment pair score cutoff N > 30 (40)
        -f N  specify chain score cutoff N > 60 (80)
        -g N  specify gap penalty factor N > 0 (6)
        -i N  specify max length of a read end to clip N > 50 (400)
        -j N  specify max sum of quality values to clip N > 1000 (3500)
        -k N  specify max sum of qv outside similarity N > 100 (400)
        -l N  specify min depth of coverage for repeats N > 20 (75)
        -m N  specify match score factor N > 0 (2)
        -n N  specify mismatch score factor N < 0 (-5)
        -o N  specify overlap length cutoff > 20 (30)
        -r N  specify directory name for base/quality files (null)
              Note: If base/quality files are in the current directory,
              then the -r option must not appear on the command line.
        -s N  specify overlap similarity score cutoff N > 100 (1000)
        -t N  specify number of segment pairs cutoff N > 10 (150)
        -w N  specify number of words cutoff N > 20 (500)
        -x N  specify prefix string for output file names (pcap)
        -y N  specify number of processors N > 0 (1)
        -z N  specify processor id N >= 0 (0)

  For pcap.rep:

      jdoe@BigRed:~> pcap.rep
      VersionDate: 06/07/05
      Usage: pcap.rep File_of_file_names [options]
      File_of_file_names is a file of names of read files
      If File_of_file_names is named 'xyz',
      then the file of constraints must be named 'xyz.con'.
      Options (default values):
        -a N  specify band expansion size N > 10 (15)
        -b N  specify min distance between diag. bands N > 30 (65)
        -c N  specify base quality cutoff for clipping N > 5 (10)
        -e N  specify segment pair score cutoff N > 30 (40)
        -f N  specify chain score cutoff N > 60 (80)
        -g N  specify gap penalty factor N > 0 (6)
        -i N  specify max length of a read end to clip N > 50 (400)
        -j N  specify max sum of quality values to clip N > 1000 (3500)
        -k N  specify max sum of qv outside similarity N > 100 (400)
        -l N  specify min depth of coverage for repeats N > 20 (75)
        -m N  specify match score factor N > 0 (2)
        -n N  specify mismatch score factor N < 0 (-5)
        -o N  specify overlap length cutoff > 20 (30)
        -q N  specify read index cutoff N > 1000 (50000)
        -r N  specify directory name for base/quality files (null)
              Note: If base/quality files are in the current directory,
              then the -r option must not appear on the command line.
        -s N  specify overlap similarity score cutoff N > 100 (1000)
        -t N  specify limit in millions for pruning overlaps N >= 0 (10)
        -u N  specify word length 11 < N < 13 (12)
        -v N  specify max no. of words in a superword 3 < N < 15 (9)
        -w N  specify max no. of superword occurrences N > 2 (100)
        -x N  specify prefix string for output file names (pcap)
        -y N  specify number of processors N > 0 (1)
        -z N  specify processor id N >= 0 (0)

  For autopcap:

      jdoe@BigRed:~> autopcap
      Usage: autopcap FileOfFileNames [options]
      FileOfFileNames is a file of file names
      Options (default value):
        -d N  specify stringent qual diff score cutoff N > 20 (130)
        -l N  specify min depth of coverage for repeats N > 20 (75)
        -m N  specify amount of available memory in GB N >= 1 (1)
        -p N  specify running pcap jobs in parallel N >= 0 (1)
        -s N  specify adjusted overlap score cutoff N > 100 (4500)
        -t N  specify overlap percent identity cutoff N > 75 (92)
        -v N  specify program type: 1 for PCAP; 0 for PCAP.REP (1)
        -y N  specify number of pcap jobs N >= 2 (2)

- If your job will run for fewer than 20 minutes, call PCAP or pre.pcap in the directory where your input files are. If your program will take more than 20 minutes to run, use the serialjob script to submit a batch job. To learn how to use the serialjob command, enter man serialjob.
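The SoftEnv step in the list above can be sketched as a small shell helper. This is a minimal sketch, not an official IU tool: the function name add_softenv_key is hypothetical, the location $HOME/.soft is an assumption based on the .soft file mentioned above, and the helper simply appends the key at the end of the file, which may not suit a .soft file that depends on a particular key order.

```shell
# Append a SoftEnv key to a .soft file only if it is not already there.
# Arguments: $1 = path to the .soft file, $2 = key to add (for example +pcap).
# add_softenv_key is a hypothetical helper name, not part of SoftEnv.
add_softenv_key() {
    soft_file=$1
    key=$2
    # Create the file if it does not exist yet.
    [ -f "$soft_file" ] || : > "$soft_file"
    # -x matches the whole line, -F treats the key as a literal string,
    # and -e keeps a key that starts with '-' from being read as an option.
    if ! grep -qxF -e "$key" "$soft_file"; then
        printf '%s\n' "$key" >> "$soft_file"
    fi
}

# Typical use on Big Red (assumed file location: $HOME/.soft):
#   add_softenv_key "$HOME/.soft" "+pcap"
#   resoft
```

Running the helper twice leaves a single +pcap line in the file, so it is safe to call from a setup script. The resoft command still has to be run afterwards for the change to take effect in the current session.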
Output that PCAP generates will be in the same directory as your input files.
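Because PCAP finds its inputs by name and writes its output next to them, it can help to confirm that everything listed in the file of file names is present before starting a run. The sketch below is one possible pre-run check. The function name check_pcap_inputs is hypothetical, and the sketch assumes the file of file names lists one read file per line, which you should confirm against the Doc file.

```shell
# Check the inputs named in a PCAP file of file names before a run.
# Arguments: $1 = file of file names,
#            $2 = directory holding the read files (default: current directory).
# Prints each missing file and returns non-zero if any read file is missing.
# Assumption: the file of file names holds one read file name per line.
check_pcap_inputs() {
    fofn=$1
    read_dir=${2:-.}
    missing=0
    if [ ! -f "$fofn" ]; then
        echo "missing file of file names: $fofn"
        return 1
    fi
    # The usage text says constraints for 'xyz' must be in 'xyz.con'.
    [ -f "$fofn.con" ] || echo "note: no constraint file $fofn.con"
    while IFS= read -r name || [ -n "$name" ]; do
        # Skip blank lines.
        [ -n "$name" ] || continue
        if [ ! -f "$read_dir/$name" ]; then
            echo "missing read file: $read_dir/$name"
            missing=1
        fi
    done < "$fofn"
    return "$missing"
}
```

For example, `check_pcap_inputs reads && pcap reads` would start PCAP only when every listed read file is found; here `reads` is a placeholder name for your own file of file names.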
Last modified on November 24, 2010.







