Use HTAR with your SDA account
On this page:
- Before you begin
- About HTAR
- HTAR command syntax
- HTAR at IU
- HTAR limitations
- Use HTAR to create an archive on the SDA
- Use HTAR to create an index for an HPSS archive
- Use HTAR to extract files from an SDA archive
- Alternative authentication methods
Before you begin
About HTAR
The HPSS TAR (HTAR) command-line utility lets you create and work with .tar archives in HPSS. With HTAR, you can aggregate files stored on your local filesystem into .tar archives that are written directly to HPSS. You do not need to create intermediate archives on your local system, or use HSI (or some other HPSS data transfer tool) to place the archives in HPSS.
As HTAR creates each archive, it automatically builds a corresponding external index (.idx) file and stores it in the same HPSS directory as the archive. HTAR can also build (or rebuild) an index file for an HPSS .tar archive that does not have one, either because the archive was created using some other utility or because the index was accidentally deleted.
Additionally, you can use HTAR to extract the entire contents of an HPSS archive to your local filesystem, or retrieve only certain specified files and/or directories.
HTAR command syntax
The general syntax for HTAR commands is as follows:
htar [action_options] -f [archive_name] [control_options] [file_list]
At least one action option, plus the -f option for specifying the archive's filename, is always required. For [file_list], indicate which files should be archived, extracted, or processed; use a space-delimited list of file or directory names (wildcard characters are accepted when creating an archive). By default, HTAR copies files from your current local directory into an archive file it creates in your HPSS home directory. To target an alternate source or destination directory, specify the path relative to your local or SDA home directory.
HTAR at IU
At Indiana University, HTAR is available on the IU research supercomputers, allowing you to create and work with .tar archives on the Scholarly Data Archive (SDA). To use HTAR, you first must add it to your user environment by loading the hpss module; on the command line, enter:
module load hpss
You can save your customized user environment so that it loads every time you start a new session; for instructions, see Use modules to manage your software environment on IU research supercomputers.
Once the hpss module is loaded, you can execute HTAR commands from the system's command line.
Alternatively, for use on your personal workstation, you may contact the UITS Research Storage team to request HTAR bundled with HSI. Bundles are available for Red Hat Enterprise Linux 5 and 6, Ubuntu Linux, macOS, and Windows (running Cygwin).
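In a job script, it can help to confirm that the htar command is available before relying on it. The following is a minimal sketch, assuming a system that provides the module command and the hpss module described above; the function name ensure_htar is hypothetical and is not part of HTAR.

```shell
#!/bin/sh
# ensure_htar: hypothetical helper (not part of HTAR) that loads the hpss
# module only when the htar command is not already available.
ensure_htar() {
    if ! command -v htar >/dev/null 2>&1; then
        # "module load hpss" is the command given above for IU systems.
        module load hpss
    fi
    # Succeed only if htar can now be found.
    command -v htar >/dev/null 2>&1
}
```

A script could then begin with a line such as: ensure_htar || exit 1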
HTAR limitations
While the archive created by HTAR can be of unlimited size (within the SDA's capacity), be aware of the following limitations:
- File size: An individual file within the archive may not be larger than 68 GB.
- Directory paths: The directory path of any file may not exceed 154 characters in length.
- File names: File names may not exceed 99 characters in length.
- Number of files: A single HTAR archive may contain a maximum of 1 million files.
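Before archiving a large directory, you may want to check it against these limits. The following is a rough sketch rather than an official tool: the function name check_htar_limits is hypothetical, "68 GB" is treated as 68,000,000,000 bytes (an assumption), and path lengths are measured on the path as given to find, which matches what HTAR stores only if you pass HTAR the same relative path.

```shell
#!/bin/sh
# check_htar_limits DIR
# Hypothetical pre-flight check (not part of HTAR): reports files under DIR
# that appear to exceed the limits listed above. Returns 0 if none are found.
check_htar_limits() {
    dir=${1:-.}
    problems=0

    # File names longer than 99 characters, or directory paths longer than 154.
    # (File names containing newlines are not handled by this sketch.)
    long=$(find "$dir" -type f | awk -F/ '{
        name = $NF
        path = substr($0, 1, length($0) - length(name) - 1)
        if (length(name) > 99)  print "name too long: " $0
        if (length(path) > 154) print "path too long: " $0
    }')
    if [ -n "$long" ]; then
        echo "$long"
        problems=1
    fi

    # Individual files larger than the assumed 68,000,000,000-byte threshold.
    big=$(find "$dir" -type f -size +68000000000c)
    if [ -n "$big" ]; then
        echo "$big" | sed 's/^/file too large: /'
        problems=1
    fi

    # More than 1 million files in total.
    count=$(( $(find "$dir" -type f | wc -l) ))
    if [ "$count" -gt 1000000 ]; then
        echo "too many files: $count"
        problems=1
    fi

    return "$problems"
}
```

For example, running check_htar_limits my_files from the directory that contains my_files prints one line per problem found and returns a non-zero status if there are any.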
Use HTAR to create an archive on the SDA
The following examples demonstrate how to use HTAR on IU research supercomputers to create .tar archives on the SDA.
- To copy all files in the current local directory into an archive (for example, my_archive.tar) that's created in your SDA home directory, on the command line, enter:
    htar -c -f my_archive.tar *
  In this command, the -c action option opens a connection to the SDA and copies all files in your current local directory (denoted by the * wildcard character) into an archive that's created in your SDA home directory. The -f option assigns the archive a name (my_archive.tar).
  Note: HTAR will overwrite a pre-existing archive of the same name without prompting you.
- To copy every file stored in a local subdirectory (for example, ~/my_files) into an archive (for example, my_files.tar) that's created in a pre-existing SDA subdirectory (for example, my_archives), specify each path relative to the respective home directory; for example:
    htar -c -f my_archives/my_files.tar "my_files"
  In this command, the -c action option opens a connection to HPSS and copies all files from the specified local directory (~/my_files) into an archive that's created in the specified SDA subdirectory; the -f option specifies the path to the archive and its name.
  Note: Do not include a tilde to represent your home directory (~/) in the path to your local subdirectory. If you include the tilde (~) representing your local home directory in your HTAR file list, each entry in the resulting archive's index file will be prepended with the absolute path from your local system's root directory. This becomes an issue when you use HTAR to extract files from that archive, as HTAR uses the absolute paths prepended to the archive's index entries to create a new set of nested subdirectories locally, and then stores the extracted files in the bottom-level directory.
  For example, if user darvader on Big Red 200 is archiving files from the ~/death_star directory but includes the tilde in the local path (enters "~/death_star" instead of "death_star") of the HTAR file list, all index entries for the resulting archive will be prepended with N/u/darvader/BigRed200. Afterward, when the user wants to extract files from that archive, HTAR will read the archive's index entries, and consequently save the extracted files locally to ~/N/u/darvader/BigRed200/death_star (and not to ~/death_star).
- To create an HPSS archive (for example, my_archive.tar) in an SDA directory that does not already exist, add the -P control option to automatically create any non-existing subdirectories included in the archive file's pathname:
    htar -c -f new_directory/new_subdirectory/my_archive.tar -P "local_dir"
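One way to avoid the absolute-path problem described in the note above is to change into the parent of the local directory and pass HTAR only the directory's name. The following is a minimal sketch of that idea; the function name htar_create_relative and its arguments are hypothetical, and it uses only the -c and -f options shown above.

```shell
#!/bin/sh
# htar_create_relative ARCHIVE LOCAL_DIR
# Hypothetical wrapper (not part of HTAR): archives LOCAL_DIR under its own
# name rather than its full path, so the index entries stay relative.
htar_create_relative() {
    archive=$1
    parent=$(dirname "$2")
    name=$(basename "$2")
    # Run in a subshell so the caller's working directory is unchanged.
    ( cd "$parent" && htar -c -f "$archive" "$name" )
}
```

For example, htar_create_relative my_archives/death_star.tar ~/death_star would run htar from your home directory with death_star as the file list, even though the shell expands the tilde to an absolute path. As with any htar -c command, a pre-existing archive of the same name is overwritten without a prompt.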
Use HTAR to create an index for an HPSS archive
For each archive created in the above examples, HTAR simultaneously creates a corresponding index file (for example, my_archive.tar.idx) and stores it in the same HPSS directory as the archive.
You can use HTAR to recreate an index that has been accidentally deleted, or to create an index for an existing .tar archive that was created with another application.
If the index file for an archive (for example, archive_name.tar) is missing, you will see the following error when you try to list or extract the files it contains:
"No such file: archive_name.idx"
To (re)build an index file for an HPSS .tar archive (for example, old_archive.tar) that's missing its index, on the command line of your local system, enter:
htar -Xf old_archive.tar
In this command, the -X action option opens a connection to HPSS, reads the old_archive.tar file indicated by the -f option, builds an index file for the archive (for example, old_archive.tar.idx), and stores it in the same directory as the archive.
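If you are unsure whether an archive still has a usable index, you could try listing it first and rebuild the index only when the listing fails. The following is a sketch under a stated assumption: it assumes htar exits with a non-zero status when it cannot list an archive, which this document does not confirm. A listing can also fail for other reasons (for example, an authentication problem or a missing archive), in which case the rebuild will fail as well. The function name htar_ensure_index is hypothetical.

```shell
#!/bin/sh
# htar_ensure_index ARCHIVE
# Hypothetical helper (not part of HTAR). Assumes htar exits with a non-zero
# status when it cannot list an archive; that behavior is an assumption here.
htar_ensure_index() {
    archive=$1
    if htar -t -f "$archive" >/dev/null 2>&1; then
        echo "listing succeeded, index left as is: $archive"
    else
        echo "listing failed, rebuilding index: $archive"
        htar -X -f "$archive"
    fi
}
```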
Use HTAR to extract files from an SDA archive
The following examples demonstrate how to use HTAR to extract files from an archive stored on your SDA account.
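Note: HTAR extracts files into your current working directory. To extract them into a different directory, change (cd) into the new directory before running HTAR.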
- To extract all files from an archive (for example, my_archive.tar) stored in your SDA home directory, on your local system's command line, enter:
    htar -x -f my_archive.tar
  In this command, the -x action option opens a connection to HPSS and extracts the entire contents of the archive specified by the -f option (my_archive.tar).
- To extract one or more specific files or directories from an archive without retrieving the entire archive, on your local system's command line, enter:
    htar -xvf test.tar file1 file4 file7
  In this command, the -x action option opens a connection to HPSS and, from the archive specified by the -f option (test.tar), extracts only the files listed (file1, file4, and file7).
  Note: Because HTAR leaves processing of wildcard characters to the shell, you cannot use * to select multiple filenames when retrieving files from an archive stored in HPSS. To display the names of the files contained in an archive (for example, archive_10.tar) stored in your HPSS home directory, on your local system's command line, enter:
    htar -tf archive_10.tar
  In this command, the -t action option lists the files contained in the archive specified by the -f option (archive_10.tar). Files are listed in the order in which they appear in the archive.
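To extract into a particular local directory without changing the working directory of your session, you can run HTAR from a subshell. The following is a minimal sketch; the function name htar_extract_into and its arguments are hypothetical, and it uses only the -x and -f options shown above.

```shell
#!/bin/sh
# htar_extract_into ARCHIVE DEST_DIR [FILE...]
# Hypothetical wrapper (not part of HTAR): creates DEST_DIR if needed, changes
# into it in a subshell, and extracts there using the -x and -f options.
htar_extract_into() {
    archive=$1
    dest=$2
    shift 2
    mkdir -p "$dest" || return 1
    # With no FILE arguments, the whole archive is extracted.
    ( cd "$dest" && htar -x -f "$archive" "$@" )
}
```

For example, htar_extract_into test.tar restored file1 file4 would extract file1 and file4 into a local directory named restored.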
Alternative authentication methods
By default, HTAR will prompt for login information (known as the "combo" authentication method). You also can set the authentication method explicitly by defining the HPSS_AUTH_METHOD environment variable; for example:
- In the csh or tcsh shell, enter:
    setenv HPSS_AUTH_METHOD combo
- In the ksh or bash shell, enter:
    export HPSS_AUTH_METHOD=combo
Alternatively, if your binaries are built with support for the appropriate method, you can use the HPSS_AUTH_METHOD environment variable to enable authentication based on either existing Kerberos credentials (known as the "kerberos" method) or Kerberos keytabs (known as the "keytab" method):
- Kerberos: To define the HPSS_AUTH_METHOD environment variable to enable the "kerberos" authentication method:
  - In the csh or tcsh shell, enter:
      setenv HPSS_AUTH_METHOD kerberos
  - In the ksh or bash shell, enter:
      export HPSS_AUTH_METHOD=kerberos
- Keytab: To use the "keytab" method, you also must define the HPSS_KEYTAB_PATH environment variable (using the path to your keytab file) and the HPSS_PRINCIPAL environment variable (using the appropriate login name). For example, to define the required environment variables to enable the "keytab" authentication method:
  - In the csh or tcsh shell, enter the following (replace username with the appropriate login name and /path/to/my_keytab with the path to your keytab file):
      setenv HPSS_PRINCIPAL username
      setenv HPSS_AUTH_METHOD keytab
      setenv HPSS_KEYTAB_PATH /path/to/my_keytab
  - In the ksh or bash shell, enter the following (replace username with the appropriate login name and /path/to/my_keytab with the path to your keytab file):
      export HPSS_PRINCIPAL=username
      export HPSS_AUTH_METHOD=keytab
      export HPSS_KEYTAB_PATH=/path/to/my_keytab
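For ksh or bash users who switch to the "keytab" method often, the three variables can be set together by a small function that first checks that the keytab file is readable. The following is a minimal sketch; the function name use_htar_keytab is hypothetical, and the only names taken from HPSS are the three environment variables described above.

```shell
#!/bin/sh
# use_htar_keytab PRINCIPAL KEYTAB_PATH
# Hypothetical helper (not part of HTAR): sets the three variables described
# above, but only if the keytab file is readable.
use_htar_keytab() {
    if [ ! -r "$2" ]; then
        echo "keytab not readable: $2" >&2
        return 1
    fi
    export HPSS_PRINCIPAL="$1"
    export HPSS_AUTH_METHOD=keytab
    export HPSS_KEYTAB_PATH="$2"
}
```

For example, use_htar_keytab username /path/to/my_keytab sets the variables for the current shell session, so the function needs to be defined in (or sourced into) the shell from which you run HTAR.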
This is document awgg in the Knowledge Base.
Last modified on 2023-05-10 13:07:27.