Quarry at Indiana University
Note: Following a system-wide upgrade in December 2012, Quarry now runs Red Hat Enterprise Linux version 6 (RHEL 6) and uses the Modules package (instead of SoftEnv) for manipulating user environments. For more, see Information about the 2012 upgrade to Quarry at IU. If you encounter any problems or have questions, email the High Performance Systems group.
On this page:
- Introduction
- System information
- Available software
- Requesting an account or software
- Connecting and logging in
- Managing local and remote files
- Using the vi editor
- Using the GNU Emacs editor
- Command files (shell scripts)
- Using Modules to set up your software environment
- File storage options
- Parallel computing and MPI
- Compiling programs and submitting batch jobs to TORQUE (OpenPBS)
- Compiling and submitting parallel jobs
- Usage policies
- Support
Introduction
Quarry (quarry.uits.indiana.edu) is Indiana University's
primary Linux cluster computing environment for research and research
instruction use. It also serves as a "condominium cluster" environment
for researchers, research labs, departments, and schools that want to
have computational nodes housed within the IU Bloomington Data Center
and managed by UITS Research Technologies staff. Additionally, Quarry
provides a Virtual
Machine hosting environment for the Extreme Science and
Engineering Discovery Environment (XSEDE), the National Science
Foundation's largest advanced cyberinfrastructure facility.
Quarry consists of IBM HS21 Bladeservers and IBM iDataPlex dx340 rack-mounted servers running Red Hat Linux. The Bladeservers run RHEL 4.8, and the dx340 servers run RHEL 5.6. Job management is provided by the TORQUE resource manager (also called PBS) and the Moab job scheduler. The Modules system is used to simplify application access and environment configuration.
If you are interested in hosting computational nodes within Quarry in a "condominium computing" approach, please see At IU, what physical and virtual options does the Data Center provide for housing departmentally owned IT equipment, and what other options are available? Condominium computing is what it sounds like intuitively, but without the condominium association fees.
If you or your IU sub-unit have money available to use for a computing cluster, you always have the option of buying and operating your own cluster, as well as managing it, backing it up, and securing it against hackers. Alternatively, you can purchase nodes that are compatible with IU's Quarry cluster, have them installed in the very secure IUB Data Center, have them available when you want to use them, and have them managed, backed up, and secured by UITS Research Technologies staff. With this option, you get access to your nodes within seconds of requesting their use. Additionally, when they are not in use, they become available to others in the IU community, thereby expanding the computing capability available to the IU community while conserving natural resources and energy. Since any piece of computing equipment has a relatively short useful life (about four years), and it takes considerable energy and a variety of metals to make a computer node, the least environmental impact is achieved by using computing equipment to its absolute maximum capability. Because of the benefits to the IU community, UITS generally hosts condominium computing nodes without charging maintenance or operations fees.
System information
| System configuration | Aggregate information | Per node |
|---|---|---|
| Machine type | Cluster computing, high-throughput computing, and condominium computing for research and research instruction | |
| Operating system | RHEL 4.8 (Bladeservers); RHEL 5.6 (dx340 servers) | |
| Memory model | Distributed and shared | |
| Processor cores | 2,960 | 8 |
| CPUs | Intel Xeon 5335 quad-core processors | |
| Nodes | 370 compute nodes; 2 image server nodes; 3 management nodes | |
| RAM | 1,536 GB | 8 GB or 16 GB |
| Network | Each Blade chassis has a 10-gigabit Ethernet connection that connects to the other IU research systems. 14 HS21 Blades, each with gigabit Ethernet adapters, share that connection. | |
| Local storage | 40.85 TB | 36 GB locally attached SAS disk (Bladeservers), 160 GB locally attached SAS disk (dx340) |
| RMax | 26 teraflops (8.96 Bladeservers + 17.1 dx340 servers) | |
| Storage information | | |
| File systems | Note: Replace username with your username. | |
| Home directory | /N/u/username/Quarry. User home directories are NFS-exported from a Network-Attached Storage (NAS) device; 10 GB, shared by accounts on Mason, Big Red, and Research Database Complex (if you have them) | |
| Local scratch space | 73 GB Serial Attached SCSI (SAS) | |
| Shared scratch space | Shared scratch space is hosted on the Data Capacitor. Note: Indiana University will soon replace its current Data Capacitor with Data Capacitor II, a high-speed, high-capacity storage facility for very large data sets. With 5 PB of storage, Data Capacitor II will support big data applications used in computational research. IU partnered with DataDirect Networks, Inc. (DDN) to develop Data Capacitor II, which is scheduled to be installed in the IU Data Center in spring 2013. For more about Data Capacitor II, see the November 8, 2012, press release. If you have questions about how the change to Data Capacitor II will affect your research, email the High Performance File Systems group. | |
| Backup and purge policies | Files older than 60 days are periodically purged, following user notification. | |
Available software
Software installed on Quarry is made available to users via Modules, an environment management system that lets you easily and dynamically add software packages to your user environment. For a list of software modules available on Quarry, see Quarry Modules in the IU Cyberinfrastructure Gateway.
For more on Modules, see Using Modules to set up your software environment below.
For more on the IU Cyberinfrastructure Gateway, see What is the IU Cyberinfrastructure Gateway?
Requesting an account or software
Requesting an account
Accounts on Quarry are available to IU undergraduate and graduate students, faculty, and staff. To request an account on Quarry, use the Account Management Service (AMS); see At IU, if I already have some computing accounts, how do I get others?
Note: Accounts remain valid only while the account
holder is a registered IU student, or an IU faculty or staff
member. On Big Red, Quarry, Mason, and the Research Database Complex,
accounts are disabled during the semester following the account
holder's departure from IU, and then are purged within six months. To
exempt a research systems account from disabling, email a request to
the Support Center's Academic Accounts team
(valid@indiana.edu). If the request is
approved, the account will remain activated for one calendar year
beyond the user's departure from IU. Then, at the end of the year, the
account will be purged. Extensions beyond one year for research
accounts are granted only for accounts involved in funded research and
having an IU faculty sponsor, or with approval of the Dean or Director
of Research and Academic Computing.
Requesting software
If you are at IU and have an account on one of IU's research systems, you can request software using the Research Systems Software Request form.
Connecting and logging in
If you are at IU, you can access Quarry using an SSH client; see At IU, what SSH/SFTP clients are supported and where can I get them?
Log into Quarry with your Network ID over
SSH, using a command similar to the following (replace
your_username with your Quarry username): ssh your_username@quarry.uits.indiana.edu
Make sure to read the message of the day (MOTD), as it contains news and information regarding the status of the cluster.
The default shell is bash. When you log in
for the first time, the changeshell program will prompt you to select
your preferred login shell.
If you use the Bash, Bourne, or Korn shells, the system will
automatically read and execute commands from the
/etc/profile file and your own ~/.profile
(and ~/.bash_profile, in the case of Bash). With the
csh and tcsh shells, the .login
and .cshrc (or .tcshrc) files are read. For
details, see In Unix, what is the shell? or the Unix man
pages.
Head, interactive, and compute nodes
Nodes on Quarry are labeled using qXXXX designations,
with XXXX ranging from 0141 to
0432:
- Head (login) nodes (q0141-q0144): When you log into Quarry as shown above, you will be logging into one of the four head (or login) nodes. The exact node you are assigned when you log in is determined on a round-robin basis. The head nodes have a 20-minute time limit on interactive use. If your interactive job requires more than 20 minutes of processor time, run it from one of the interactive nodes.
- Interactive nodes (q0145-q0148): Use these nodes to run interactive jobs that require more than 20 minutes of processor time. To access an interactive node, use SSH from the Quarry login node:

      ssh q0145

- Compute nodes (q0151-q0432): These nodes are for running jobs via the batch queuing system. They are accessible only from the head nodes (or from each other); you cannot use SSH to connect directly to the compute nodes from the outside world.
Notes
- Windows users: If you use an SSH client in Windows, you cannot open tools that need a graphical user interface (GUI), like the TotalView debugger. You'll need X Window emulation software, such as Cygwin. UITS recommends using XLiveCD, created by the Research Technologies division of UITS.
- X applications: To use the TotalView Debugger/Intel Trace Analyzer, or any other graphical application, from Quarry compute nodes, you must disable X forwarding by specifying the -x flag:

      ssh -x your_username@quarry.uits.indiana.edu

  To use common graphical applications, such as Emacs, from the head node, do not use the -x option; the default X forwarding must remain enabled.
- Intra-cluster logins: When you log into your Quarry account for the first time, passphrase-less SSH keys will be automatically created in your home directory. Those keys should enable you to log into compute nodes that you have gained access to through TORQUE without entering a passphrase. Thus, parallel jobs should run seamlessly on multiple compute nodes without any manual intervention.

  However, you may see the following error message when you try to access assigned compute nodes:

      Permission denied (publickey,password,keyboard-interactive)

  This indicates that the intra-cluster RSA key pair in your home directory is either not present or corrupted. If this happens, enter gensshkeys (in /opt/xcat/sbin) from any login node. That will generate a passphrase-less key pair, and allow you seamless intra-cluster logins between any nodes in the cluster assigned for your use by TORQUE.
- Forwarding email address for job-related messages: Quarry will send email about your jobs to the address specified in the ~/.forward file in your home directory. (Note the period [.] preceding the filename.) By default, this is the email address you provided when you requested your account.

  If you'd like to change this email address, enter a command similar to the following (replace username@host.com with your email address):

      jdoe@Quarry:~> echo "username@host.com" > ~/.forward

  Be sure to use a valid email address; if you do not, you will not be notified about the status of your jobs.
Managing local and remote files
Like other Unix systems, Linux uses a hierarchical, tree-structured directory system to organize files. Data on local disk resides in journaled file systems that contain a root directory from which associated files and subdirectories branch. Directories are catalogs of files that associate names with files, and are used to segregate files into related groups. To display information about currently mounted file systems, at the shell prompt, enter df .
Each file is described by an inode. An inode contains critical
information about the file, such as file type (directory, ordinary,
character special, block special, or pipe), ownership, access
permissions, group ID, file size, file creation, modification, and
access times, and a pointer to its data blocks. To list information
about the files within your current working directory, at the shell
prompt, enter ls -al . The fields in each line of
ls output are described below.
The first character in the first field indicates the file type
(e.g., - for ordinary file, d for
directory). The next nine characters in the first field indicate the
file access permissions: r for read, w for
write, and x for execute; a hyphen (-)
indicates that the corresponding read, write, or execute permission is
denied. Characters 2, 3, and 4 indicate the file owner's permissions,
characters 5, 6, and 7 indicate the group's permissions, and the last
three characters indicate all others' permissions. Execute permission
for a directory allows users to search it (that is, to access entries
in it by name); listing its contents requires read permission.
Field 2 indicates the number of links to the file. The next two
fields show the owner and the group associated with the file. Field 5
gives the size of the file in bytes. Field 6 is the time the file was
last modified. The last field is the name of the file. The
files named . and .. indicate the current
and the parent directory, respectively. For more about ls
and available options, enter man ls .
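As an illustration of the field layout described above, the following shell sketch splits the first field of an ls -l line into its parts. The function name describe_mode and the sample mode string are made up for this example; they are not part of the software installed on Quarry.

```shell
# describe_mode MODE: split a 10-character ls mode string (for example
# "drwxr-x---") into file type, owner, group, and others permissions.
describe_mode() {
    mode=$1
    # Character 1 is the file type.
    case $mode in
        -*) type_name="ordinary file" ;;
        d*) type_name="directory" ;;
        *)  type_name="other" ;;
    esac
    # Characters 2-4, 5-7, and 8-10 are the three permission triplets.
    owner=$(printf '%s' "$mode" | cut -c2-4)
    group=$(printf '%s' "$mode" | cut -c5-7)
    others=$(printf '%s' "$mode" | cut -c8-10)
    echo "type=$type_name owner=$owner group=$group others=$others"
}

describe_mode "drwxr-x---"
# prints: type=directory owner=rwx group=r-x others=---
```

Here the owner may read, write, and search the directory, members of the group may read and search it, and all others have no access.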
To change directories, use the cd command. You can use
an absolute or relative path name with the cd
command. For example, assume you are currently in the
/usr/local directory. To get to the
/usr/local/bin directory, you could enter either cd
/usr/local/bin (using the absolute path name) or cd
bin (using the path name relative to
/usr/local). To get back to /usr/local, you
could enter cd /usr/local or just cd .. (to
go up to the parent directory). Entering cd without an
argument will put you into your home directory. To determine what your
current working directory is, enter pwd .
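The cd examples above can be tried safely in a scratch directory tree. This is a minimal sketch; the temporary directory and the variable names are assumptions for illustration, with $base/local standing in for /usr/local.

```shell
# Build a small stand-in tree: $base/local plays the role of /usr/local.
base=$(mktemp -d)
base=$(cd "$base" && pwd -P)     # canonical path, in case of symlinks
mkdir -p "$base/local/bin"

cd "$base/local/bin"             # absolute path name
cd ..                            # up to the parent directory
after_parent=$(pwd -P)

cd bin                           # path name relative to the current directory
after_relative=$(pwd -P)

echo "after 'cd ..':  $after_parent"
echo "after 'cd bin': $after_relative"
```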
Following are some common commands for working with files:
- To view a file on your screen, use the cat command (for example, cat filename ). To view one screen at a time, enter cat filename | more .
- To copy a file to the same or a different directory, use the cp command. This command overwrites any existing file of the same name. For more information, enter man cp .
- To rename a file or to move it to a new directory, use the mv command. For more information, enter man mv .
- To delete a file, use the rm command. For more about deleting single or multiple files, enter man rm .
- To create a directory, use the mkdir command. To delete an empty directory, use the rmdir command. For more information, enter man mkdir or man rmdir .
You can also use the commands for managing local files to manage
remote NFS files, provided that the access permissions
allow it. For example, you cannot use rm on a file in a
remote file system that's mounted as read-only.
For access to remote files other than those already accessible to
Quarry, you may access a remote host using the Secure Shell
scp or sftp commands to copy files to or
from the remote system.
To see how many 1 KB blocks of disk space you are using, enter
du -k . To see the number of 1 KB blocks in your
disk quota, enter quota .
You may exceed your soft quota up to the limit of your hard quota. If you exceed your soft quota, you will receive a message to that effect, and a grace period of seven days begins. If you do not reduce your disk usage below your soft quota within those seven days, you will not be able to write to your disk until you do.
Using the vi editor
Before entering vi, make sure your environment variable
TERM is set correctly. When vi starts, it looks at TERM to see what
kind of terminal you are using. It then loads the proper terminal
control information from /etc/termcap (the terminal
capabilities file). Enter the appropriate commands below depending on
your shell:
- If you use the Bourne (sh) or Korn (ksh) shell:

      $ TERM=vt100
      $ export TERM
      $ vi filename

- If you use the C shell (csh):

      % setenv TERM vt100
      % vi filename
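If you use bash (the default shell on Quarry), the Bourne shell commands above also work; bash additionally lets you assign and export the variable in one step. A minimal sketch, using the same vt100 value as the examples above (use whichever value matches your terminal):

```shell
# Assign and export TERM in a single command.
export TERM=vt100

# Confirm the value that vi will see when it starts.
echo "TERM is set to $TERM"

# Then start the editor as usual, for example:
#   vi filename
```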
For information about common vi commands, see How do I use the vi text editor?, A quick reference list of vi editor commands, or In vi, how can I access deleted text?
You can find more information in Learning the vi Editor by Linda Lamb (O'Reilly and Associates, Inc., 1992).
Using the GNU Emacs editor
GNU Emacs is a version of Emacs written by the author of the original (PDP-10) Emacs, Richard Stallman. GNU Emacs retains all the functionality of other Emacs editors, but it is also customizable and extensible by modifying its Lisp editing commands. GNU Emacs is designed to work well with X Window systems, but it also functions with ASCII terminals.
The account creation process on Quarry copies a .emacs
file into the newly created home directory.
Because GNU Emacs uses the Ctrl-q and
Ctrl-s key sequences as commands, disable the use of
Ctrl-q and Ctrl-s as terminal output
start/stop signals before entering GNU Emacs. To do this, enter
stty -ixon . After completing your GNU Emacs
session, you may re-enable terminal start/stop signals by entering
stty ixon .
GNU Emacs has an extensive interactive help facility, but to use it
you must know how to manipulate Emacs windows and buffers. (For help
with the Emacs notation that follows, see In Emacs, how are keystrokes denoted?) To enter
Emacs help, with an Emacs window open, press
C-h . To access an interactive tutorial that teaches
the fundamentals of Emacs, press C-h t . To
find a command given its functionality, press
C-h a . The Help Character command
(C-h c) describes a given character's effect, and
the Help Function command (C-h f) describes a given
Lisp function specified by name.
For information about common Emacs commands, see GNU Emacs Quick Reference Guide.
Command files (shell scripts)
Shell scripts are command files interpreted by a shell. These
scripts may contain any statements valid in the shell in which they
are interpreted. You can use them to perform repetitive tasks, such as
setting up your environment, compiling and executing programs,
submitting batch jobs, and performing system management tasks.
Locally written shell script utilities are contained in
/usr/local/bin.
Shell programming constructs include variable assignment, argument handling, evaluation of integer expressions, conditional execution, flow control, file status checking, string and integer comparisons, reading of input, interrupt signal traps, and menu screen generation.
The following example of a C shell script illustrates some basic shell programming constructs.
C shell script

    # Scriptname: showuser
    # Purpose: display information about a user
    set USAGE = "Usage: showuser userid"

    # Test for argument
    if ($#argv > 1) then
        echo $USAGE
        exit 1
    endif
    if ($#argv == 1) then
        set USERNAME = $1
    else if ($#argv == 0) then
        echo "Enter the user's login name: "
        set USERNAME = $<
    endif

    set IFS = ":"    # internal field separator

    # Use sed to edit /etc/passwd, putting blank in empty field
    sed "s/$IFS/ /g" < /etc/passwd > $HOME/passwd
    echo "_________________________________________________________"
    echo "Userid UID GID Full Name Home Directory Login Shell"
    echo " "

    # Use awk to find the correct line and print it out
    awk "/$USERNAME/" $HOME/passwd
    echo "_________________________________________________________"
    rm $HOME/passwd
    exit

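For comparison, a similar lookup can be written for sh-compatible shells such as bash. This is only a sketch, not a replacement for the script above: the function name showuser_line and its optional second argument are assumptions introduced here. It matches the login name against the first field only, so it needs no temporary copy of /etc/passwd.

```shell
# showuser_line USERID [PASSWD_FILE]: print the user ID, UID, GID, full
# name, home directory, and login shell for one login name.
showuser_line() {
    if [ "$#" -lt 1 ] || [ "$#" -gt 2 ]; then
        echo "Usage: showuser_line userid [passwd_file]" >&2
        return 1
    fi
    # Split each record on ":" and compare the first field exactly,
    # so that "jdoe" does not also match "jdoe2".
    awk -F: -v user="$1" \
        '$1 == user { print $1, $3, $4, $5, $6, $7 }' \
        "${2:-/etc/passwd}"
}

echo "Userid UID GID Full Name Home Directory Login Shell"
showuser_line root
```

The exact comparison on the first field is a deliberate difference from the pattern match in the C shell script, which prints every line containing the search string.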
Using Modules to set up your software environment
The Modules environment package allows you to easily and dynamically customize your environment and specify which versions of installed software you use.
Some common Modules commands include:
| Command | Action |
|---|---|
| module avail | List all software packages available on the system. |
| module avail package | List all versions of package available on the system, for example: module avail openmpi |
| module list | List all packages currently loaded in your environment. |
| module load package/version | Add the specified version of the package to your environment, for example: module load intel/11.1 To load the default version of the package, omit the version: module load package |
| module unload package | Remove the specified package from your environment. |
| module swap package_A package_B | Swap the loaded package (package_A) with another package (package_B). This is synonymous with: module switch package_A package_B |
| module show package | Shows what changes will be made to your environment (e.g., paths to libraries and executables) by loading the specified package. This is synonymous with: module display package |
To make permanent changes to your environment, edit your
~/.modules file. For more, see In Modules, how do I save my environment with a .modules file?
For more about the Modules package, see the module manual page
and the modulefile manual
page. Additionally, see On IU's Mason and Quarry clusters, how do I use Modules to manage my software environment?
File storage options
You can store files in your home directory or in scratch space:
- Home directory: Your Quarry home directory disk space is allocated on a Network-Attached Storage (NAS) device. You have a 10 GB disk quota, which is shared with Big Red and the Research Database Complex (RDC) if you have accounts on those systems.
- Local scratch: Scratch disk space is available locally on each node in /scratch (19 GB). Files in /scratch are automatically deleted once they are 14 days old.
- Shared scratch: Shared scratch space is hosted on the Data Capacitor. The path to your scratch space is /N/dc/scratch/username (replace username with your username). Files older than 60 days are periodically purged, following user notification.
For more, see At IU, how much disk space is available to me on the research systems?
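To get a rough idea of which files in a scratch directory are old enough to be purge candidates, something like the following may help. It is a sketch under assumptions: the function name list_old_files is made up, and it judges age by modification time, which may not be the criterion the purge process actually uses.

```shell
# list_old_files DIR [DAYS]: print regular files under DIR that were
# last modified more than DAYS days ago (default: 60).
list_old_files() {
    find "$1" -type f -mtime +"${2:-60}" -print
}

# Check the shared scratch directory only if it exists on this system.
scratch_dir="/N/dc/scratch/${USER:-username}"
if [ -d "$scratch_dir" ]; then
    list_old_files "$scratch_dir"
fi
```

Passing a second argument, for example list_old_files /scratch 14, applies the same check with the 14-day limit of local scratch.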
Parallel computing and MPI
The Quarry cluster is not equipped with a proprietary low-latency interconnect. Parallel jobs can be run over gigabit Ethernet; the cluster is connected to a Force10 E1200 switch.
To see which versions of Open MPI and MPICH are currently
available, use the appropriate module avail command (for example, module avail openmpi or module avail mpich).
Note: Open MPI and MPICH compiled with the 64-bit Intel v10 compilers are available. The MPICH package is currently available compiled only with the Intel compiler suite.
To add either package to your environment:
- Add the necessary compiler suite to your environment:
  - GNU Compiler Collection (GCC): module load gcc
  - Intel Compiler Suite: module load intel
- Load the module for the appropriate package, compiler suite (<suite>), and version (<version>):
  - MPICH: module load mpich/<suite>/<version>
  - Open MPI: module load openmpi/<suite>/<version>
Briefly, use mpicc or mpif77 to compile
and link MPI-enabled software. Then, use mpirun to
execute the job on multiple nodes of the cluster. In practice, you
must submit your job through the TORQUE resource manager to ensure
fair and efficient use of the cluster.
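Inside a TORQUE job, the list of assigned processors is available in the file named by $PBS_NODEFILE, and the number of lines in that file is commonly passed to mpirun. The sketch below shows only the counting step, using a stand-in node file; the node names and the program name my_mpi_program are hypothetical.

```shell
# Stand-in for the file TORQUE provides through $PBS_NODEFILE:
# one line per assigned processor core (hypothetical node names).
nodefile=$(mktemp)
printf '%s\n' q0151 q0151 q0152 q0152 > "$nodefile"

# Number of MPI processes: one per line of the node file.
NP=$(wc -l < "$nodefile" | tr -d ' ')

# Number of distinct nodes the job runs on.
NODES=$(sort -u "$nodefile" | wc -l | tr -d ' ')

echo "processes: $NP on $NODES node(s)"

# In a real job script the corresponding launch line would be:
#   mpirun -np "$NP" -machinefile "$PBS_NODEFILE" ./my_mpi_program
```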
Note: On Quarry, Moab serves as the job scheduler for the TORQUE resource manager.
Compiling programs and submitting batch jobs to TORQUE (OpenPBS)
Two simple programs are listed below for your convenience:
C program

    /*
     * Copyright 2005, The Trustees of Indiana University.
     * Original author: Arvind Gopu (UITS-RAC-HPC group)...
     * . . . [ snip ] . . .
     */
    #include <stdio.h>
    #include <math.h>

    int main ()
    {
      double PI2=3.141592654/2.0, theta, sintheta;
      int i, N=4;

      for (i=0; i<=N; i++)
      {
        theta = i * (PI2/N);
        sintheta = sin (theta);
        printf (" Sin (%8.6lf) = %8.6lf \n", theta, sintheta);
      }
      return 0;
    }

Fortran program

    C
    C Copyright 2005, The Trustees of Indiana University.
    C Original author: Don Berry (UITS-RAC-HPC group);
    C . . . [ snip ] . . .
    C
          program sine
          real :: PI2=3.141592654/2.0
          integer :: N=4
          real x,s
          integer i
          do i=0,N
            x=i*(PI2/N)
            s=sin(x)
            write(6,"(f11.6 f11.6)") x,s
          end do
          end

You can compile the two example programs with the GNU compilers (gcc and
g77 for C and Fortran programs, respectively).
Submit job(s) to TORQUE
It's best to write your job in its own script file (for example,
submit_sine_c.sh for the example sine_c program), and tell TORQUE
to execute that.
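A job script for the example sine_c program might look like the following sketch, which writes the script to submit_sine_c.sh in the current directory. The resource request, walltime, and mail options are assumed values chosen for illustration, not Quarry defaults; the job name job_sine_c matches the output file names used later in this section.

```shell
# Create the job script; the quoted 'EOF' keeps $PBS_O_WORKDIR from
# being expanded now, so it is evaluated when the job runs.
cat > submit_sine_c.sh <<'EOF'
#!/bin/bash
#PBS -l nodes=1:ppn=1,walltime=00:05:00
#PBS -N job_sine_c
#PBS -m ea
# Optional: adding "#PBS -j oe" would merge the error file into the
# output file.

# TORQUE starts the job in your home directory; change to the
# directory from which qsub was run.
cd "$PBS_O_WORKDIR"

# Run the compiled example program.
./sine_c
EOF

echo "Wrote submit_sine_c.sh"
```

Lines beginning with #PBS are read by TORQUE as directives and are ordinary comments to the shell, so the same file remains a valid shell script.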
Use the script to submit your job to TORQUE:

    [jdoe@Quarry]$ qsub submit_sine_c.sh
    95.m2.quarry.teragrid.iu.edu

Check the job status with the qstat command.
For more about submitting and tracking jobs, see What is TORQUE, and how do I use it to submit and manage jobs on Quarry at IU?
Output and error files
Assuming your job runs to completion, you can find messages it
tried to print on the console in an output file. You can specify the
directories where the output file and the error file go using the
-o and -e flags,
respectively. The default file name of these files is of the format
job_name.osequence_number and
job_name.esequence_number, where job_name is
the name of the job (check out the -N option)
and sequence_number is the job number assigned when the
job is submitted.
For example, if TORQUE assigned job id
95.m2.quarry.teragrid.iu.edu to your job, with jobname
job_sine_c, the output file would be named
job_sine_c.o95 and the error file would be named
job_sine_c.e95. (Also see the use of the
-j TORQUE directive as explained in the
section above.)
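The naming rule above can be expressed as a small shell function. This is a sketch; the function name pbs_output_names is made up for illustration.

```shell
# pbs_output_names JOB_NAME JOB_ID: print the default output and error
# file names for a job, using the numeric part of the job ID.
pbs_output_names() {
    seq=${2%%.*}          # keep only the text before the first dot
    echo "$1.o$seq $1.e$seq"
}

pbs_output_names job_sine_c 95.m2.quarry.teragrid.iu.edu
# prints: job_sine_c.o95 job_sine_c.e95
```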
Check the output files:

    [jdoe@Quarry]$ cat job_sine_c.o348921
     Sin (0.000000) = 0.000000
     Sin (0.392699) = 0.382683
     Sin (0.785398) = 0.707107
     Sin (1.178097) = 0.923880
     Sin (1.570796) = 1.000000
    [jdoe@Quarry]$ cat job_sine_f.o348922
       0.000000   0.000000
       0.392699   0.382683
       0.785398   0.707107
       1.178097   0.923880
       1.570796   1.000000

Compiling and submitting parallel jobs
Here is a simple C program you can submit to use multiple processors:

    #include <stdio.h>
    #include <mpi.h>
    /* #include "VT.h" */ /* only needed for Vampir Trace */

    int main( int argc, char *argv[] )
    {
      /* Remember: We are programming according to the SPMD model! */
      /* This means that each and every processor executes this program. */
      /* It is convenient to reference the typical processor in the */
      /* first person, using "I" and "me" and "my." In other words, */
      /* read through this program as if you were one of the processors. */

      /* My ID number or my "rank": */
      int myrank, length;
      char myname[BUFSIZ];

      /* I initialize for myself the MPI API: */
      MPI_Init(&argc, &argv);

      /* Who am I? I request my ID number: */
      MPI_Comm_rank(MPI_COMM_WORLD, &myrank);
      MPI_Get_processor_name (myname, &length);
      //VT_symdef( 200, "for-loop", "Calculation" );

      /* I print the standard greeting along with my ID: */
      printf( "Hello, parallel worlds! This is processor %s and my rank is %d!\n", myname, myrank );

      /* Finally I close the MPI API: */
      MPI_Finalize();
      return 0;
    }

Here is the script file to submit the job:

    #! /bin/bash
    #PBS -l nodes=1:ppn=4,walltime=1:00
    #PBS -m ea
    #PBS -N My_mpi_job

    # Get the number of processors
    NP=`wc -l $PBS_NODEFILE | awk '{print $1}'`

    # print out the nodes where the job runs
    echo "Execute node list:"
    sort -u $PBS_NODEFILE

    # change to the working directory
    cd simple_quarry_jobs

    # run an mpi job:
    mpirun -np $NP -machinefile $PBS_NODEFILE helloWorlds

Compile and submit:

    [jdoe@Quarry]$ mpicc -o helloWorlds helloWorlds.c
    [jdoe@Quarry]$ qsub submit_parallel.sh
    97.m2.quarry.teragrid.iu.edu

Look at the output file:

    [jdoe@Quarry]$ cat job_helloWorlds.o348923
    LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University
    Hello, parallel worlds! This is processor bc27 and my rank is 1!
    Hello, parallel worlds! This is processor bc25 and my rank is 2!
    Hello, parallel worlds! This is processor bc27 and my rank is 0!
    Hello, parallel worlds! This is processor bc25 and my rank is 3!
    LAM 7.1.1/MPI 2 C++/ROMIO - Indiana University

Usage policies
Quarry serves multiple communities as a research cluster and as a general-purpose Linux environment. For information about home directories, CPU limits and batch jobs, and queue properties, see Quarry usage policies.
All IU students, faculty, staff, and affiliated researchers may request accounts on Quarry via the IU Account Management Service (AMS).
Note: The scheduled monthly maintenance window for Quarry is the first Tuesday of each month, 7am-7pm.
Support
Quarry is supported by the UITS High Performance Systems group. If you have system-specific questions about Quarry, email High Performance Systems. If you have questions about compilers, programming, scientific/numerical libraries, or debuggers, email Scientific Applications and Performance Tuning.
Announcements, downtime information, and documentation are available on the Quarry cluster home page.
Last modified on May 10, 2013.







