On the SDA at IU, how is data integrity assured?

Indiana University's HPSS system, the Scholarly Data Archive (SDA), is designed to be a data archive, with the expectation that data stored there will be good, long-term copies of the original data. To help keep data stable, both the transfer mechanism (TCP/IP) and the storage media (disk and tape) have checksums, and the storage media also have error correction mechanisms to deal with small media defects and bit loss.

However, there is no guarantee that the data will remain intact forever, or that data corruption will never occur. No error rate is exactly zero, however low, so when massive amounts of data are transferred, some errors may go undetected. Error correction schemes can also fail when data are stored on magnetic media for long periods of time.
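As an illustration of how a checksum can miss an error, the following Python sketch implements the 16-bit Internet checksum used by TCP (as described in RFC 1071) and shows that swapping two 16-bit words in a message leaves the checksum unchanged. The function name and the sample bytes are hypothetical and chosen only for this example; this is not how the SDA or HPSS verifies data.

```python
def internet_checksum(data: bytes) -> int:
    """Return the 16-bit ones' complement checksum of data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad odd-length input with a zero byte
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


# Hypothetical sample data: the same two 16-bit words in a different order.
original = b"\x12\x34\xab\xcd"
corrupted = b"\xab\xcd\x12\x34"

# The payloads differ, but the checksum cannot tell them apart,
# because the sum does not depend on the order of the words.
print(original != corrupted)                                        # True
print(internet_checksum(original) == internet_checksum(corrupted))  # True
```

Stronger algorithms such as MD5 make this kind of undetected change far less likely, which is why an additional file-level checksum can be worthwhile for archived data.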

IU periodically rewrites data from one storage medium to another. Disk-to-tape transfers are nearly immediate, and tape-to-tape transfers are done when other data on the tape are deleted, or when new tape technology is implemented. This protects against some errors stemming from long-term storage.

Once data are in HPSS, IU relies on the checksumming and error correction capabilities of the TCP/IP protocol and the storage media to ensure integrity. If errors occur, a copy is available to help recover the data.

To check data integrity further, you can run a checksumming algorithm (e.g., MD5) on a file. With HSI clients version 4 and higher, checksums are computed by default when you transfer files to the SDA. With HSI, you can also create and view checksums for files already stored on the SDA; see "How do I use HSI to create and manage checksums?" Alternatively, you can download the file and compute the checksum locally.
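For the local option, a minimal Python sketch of computing an MD5 checksum and comparing two copies of a file might look like the following. It uses only the standard hashlib module; the function names and the chunk size are assumptions made for this example and are not part of HSI or the SDA.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the MD5 hex digest of the file at path.

    The file is read in chunks (1 MiB by default, an arbitrary choice)
    so that large archive files do not have to fit in memory.
    """
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksums_match(path_a, path_b):
    """Return True if both files have the same MD5 digest."""
    return md5_of_file(path_a) == md5_of_file(path_b)
```

For example, calling checksums_match() with the path of your original file and the path of the copy downloaded from the SDA returns True when the two digests agree. You could also record the value returned by md5_of_file() before uploading and compare it with the digest of a later download.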

This is document awax in the Knowledge Base.
Last modified on 2015-02-13 00:00:00.
