About the College Archive Tool (CAT)

The College Archive Tool (CAT) is a web-based application designed to help Indiana University research projects archive Geode-Project directories and store those archives on the Indiana University Scholarly Data Archive (SDA).

CAT file transfers are executed using Globus transfer nodes via GlobusAuth and GlobusTransfer APIs. Security is managed via SSH keys and GlobusAuth tokens stored in an enterprise HashiCorp Vault instance. Transfer jobs are orchestrated via Apache Airflow, and metadata is preserved in a MySQL instance on the IU Research Database Complex (RDC).

The Research Technologies division of UITS provides several systems and services that meet certain requirements established in the HIPAA Security Rule, thereby enabling their use for research involving data that contain protected health information (PHI). However, using a UITS Research Technologies resource does not fulfill your legal responsibilities for protecting the privacy and security of data containing PHI. You may use these resources for research involving data containing PHI only if you institute additional administrative, physical, and technical safeguards that complement those UITS already has in place.

Request access

Access to CAT is available to the storage administrators for IU College of Arts and Sciences projects with space on Geode-Project. To request access, email rds-admin@rtinfo.indiana.edu.

To use CAT, you must also create a Globus ID.

User interface

The CAT user interface consists of a set of panes and action buttons.

  • Panes:
    • Storage: List and select files and directories on Geode-Project.
    • Archives: List and select completed archives.
    • Contents: See the files contained in the folder selected in the Storage pane or the files contained in the archive selected in the Archives pane.
      You can only view files in the Contents pane. To manage files, use the Storage and Archives panes.
  • Action buttons:
    • Archive: Archive the directory selected in the Storage pane.
    • Restore: Transfer files from the archive selected in the Archives pane.
    • Search: Display archives that contain files or directories matching the given search string.

Archive a Geode-Project directory

To transfer a Geode-Project directory to an archive on the SDA:

  1. In the Storage pane, browse to find the directory to archive, and then click to select it.
  2. Click Archive.

The archive process is asynchronous. CAT creates manifests for GPFS file permissions, creates an archive file for the contents of the directory, moves the archive onto the SDA, and then removes the directory from Geode-Project. You'll be notified by email of the success or failure of each stage:

  • Pre-processing: preserve permissions, add metadata, create archive
  • Transfer from Geode-Project to SDA
  • Post-processing: clean-up on Geode-Project, complete the archive process
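
The three stages above can be illustrated with a minimal local sketch. This is not CAT's implementation: the function name archive_directory, the JSON manifest format, and the staging and destination directories are all hypothetical, and a local file move stands in for the Globus transfer to the SDA. The point it shows is the ordering: the source directory is removed only after the archive has reached its destination.

```python
import json
import shutil
import stat
import tarfile
from pathlib import Path


def archive_directory(source: Path, staging: Path, destination: Path) -> Path:
    """Archive `source`, move the archive to `destination`, then remove `source`.

    Hypothetical illustration only; `destination` is a local stand-in for the SDA.
    """
    staging.mkdir(parents=True, exist_ok=True)
    destination.mkdir(parents=True, exist_ok=True)

    # Pre-processing: record the permission bits of every entry in the directory.
    manifest = {
        path.relative_to(source).as_posix(): stat.S_IMODE(path.stat().st_mode)
        for path in sorted(source.rglob("*"))
    }
    manifest_path = staging / f"{source.name}.permissions.json"
    manifest_path.write_text(json.dumps(manifest, indent=2))

    # Pre-processing: create the archive file for the directory contents.
    archive_path = staging / f"{source.name}.tar"
    with tarfile.open(archive_path, "w") as tar:
        tar.add(source, arcname=source.name)

    # Transfer: a local move stands in for the Geode-Project to SDA transfer.
    final_archive = Path(shutil.move(str(archive_path), str(destination)))
    shutil.move(str(manifest_path), str(destination))

    # Post-processing: remove the original directory only after the move succeeded.
    shutil.rmtree(source)
    return final_archive
```

If any step raises an exception, the function stops before the clean-up step, so the original directory is left in place.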

Restore an archived Geode-Project directory

To restore a Geode-Project directory from an archive on the SDA:

  1. In the Archives pane, browse to find the archive to restore, and then click to select it.
  2. Click Restore.

The restore process is asynchronous. CAT transfers the archive to your Geode-Project space and extracts the contents. You'll be notified by email of the success or failure of each stage:

  • Transfer from SDA to Geode-Project
  • Post-processing: extract the contents of the archive, apply GPFS permissions
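
The post-processing stage can be sketched locally as follows. This is an illustration under assumptions, not CAT's code: the function name restore_archive is hypothetical, the manifest is assumed to be a JSON file mapping each archive member path to its permission bits, and plain POSIX mode bits stand in for GPFS permissions.

```python
import json
import os
import tarfile
from pathlib import Path


def restore_archive(archive: Path, manifest_file: Path, target: Path) -> Path:
    """Extract `archive` under `target`, then apply permissions from the manifest.

    Hypothetical illustration only; the manifest format is an assumption.
    """
    target.mkdir(parents=True, exist_ok=True)

    # Extract the contents of the archive.
    with tarfile.open(archive) as tar:
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions support safer extraction filters.
            tar.extractall(target, filter="data")
        else:
            tar.extractall(target)

    # Apply the recorded permission bits to each restored entry.
    manifest = json.loads(manifest_file.read_text())
    for member_path, mode in manifest.items():
        os.chmod(target / member_path, mode)

    return target
```

Applying permissions as a separate step after extraction means the restored files end up with the modes recorded at archive time, regardless of how the extraction step set them.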

Search archives

To find archives that contain a particular file or directory name, enter a search string in the Search text field, and then click Search. CAT searches the metadata (including file lists) of the archives stored in your project's SDA account.

  • CAT does not search file contents.
  • Search terms are not case-sensitive (capitalization is ignored).

CAT supports two types of search:

  • Phrase search: Search terms enclosed in quotes are treated as phrases. Use phrase search to find a file or directory name that contains spaces.
  • Token search: Search terms not enclosed in quotes are treated as separate tokens, and CAT matches each token against the file names in the archives. For example, if you enter inventory.txt and sales.txt as search terms, CAT will find all files with those names.
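
The difference between the two search types can be illustrated with a short sketch. This is not CAT's search code: the function names parse_query and search_archives are hypothetical, the archive file lists are represented as a plain dictionary, and the sketch assumes that a term matches when it appears anywhere in a path and that an archive is returned when any term matches.

```python
import shlex


def parse_query(query: str) -> list[str]:
    """Split a query into lowercase terms; quoted text stays together as one phrase."""
    return [term.lower() for term in shlex.split(query)]


def search_archives(query: str, archives: dict[str, list[str]]) -> list[str]:
    """Return the names of archives whose file list matches any search term.

    Hypothetical illustration only; matching is a case-insensitive substring test.
    """
    terms = parse_query(query)
    matches = []
    for archive_name, file_paths in archives.items():
        lowered = [path.lower() for path in file_paths]
        if any(term in path for term in terms for path in lowered):
            matches.append(archive_name)
    return matches


# Example file lists, invented for illustration.
example_archives = {
    "2020-q1.tar": ["reports/Annual Report.txt", "data/inventory.txt"],
    "2020-q2.tar": ["data/sales.txt"],
    "2020-q3.tar": ["notes/readme.md"],
}

# Token search: two separate terms.
print(search_archives("inventory.txt sales.txt", example_archives))
# Phrase search: the quoted name, including its space, is one term.
print(search_archives('"annual report.txt"', example_archives))
```

In this sketch, the quotes only change how the query is split into terms; the matching step is the same for both search types.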

Get help

Research computing support at IU is provided by the Research Technologies division of UITS. To ask a question or get help regarding Research Technologies services, including IU's research supercomputers and research storage systems, and the scientific, statistical, and mathematical applications available on those systems, contact UITS Research Technologies. For service-specific support contact information, see Research computing support at IU.

This is document bgpl in the Knowledge Base.
Last modified on 2021-08-16 16:41:33.