Lustre file systems at IU


Lustre overview and key components

Lustre is a high performance storage architecture and scalable parallel file system for use with computing clusters, supercomputers, visualization systems, and desktop workstations. Lustre can scale to provide petabytes of storage capacity, with hundreds of gigabytes per second of I/O bandwidth, to thousands of clients. Lustre also features integrated network diagnostics, and mechanisms for performance monitoring and tuning.

Lustre started as a research project at Carnegie Mellon University, and is now developed and distributed as open source software under the GNU General Public License version 2 (GPLv2). Development of Lustre is supported by the non-profit Open Scalable File Systems (OpenSFS) organization. For more, see the Lustre website.

Key components of a Lustre file system include:

  • Lustre clients: Lustre clients run on computational, visualization, or desktop nodes that communicate with the file system's servers via the Lustre Network (LNET) layer, which supports a variety of network technologies, including InfiniBand, Ethernet, Seastar, and Myrinet. When Lustre is mounted on a client, its users can transfer and manage file system data as if the data were stored locally (however, clients never have direct access to the underlying file storage).
  • Management Target (MGT): The MGT stores file system configuration information for use by clients and other Lustre components. Although MGT storage requirements are relatively small even in the largest file systems, the information stored there is vital to system access.
  • Management Server (MGS): The MGS manages the configuration data stored on the MGT. Lustre clients contact the MGS to retrieve information from the MGT.
  • Metadata Target (MDT): The MDT stores filenames, directories, permissions, and other namespace metadata.
  • Metadata Server (MDS): The MDS manages the namespace metadata stored on the MDT. Lustre clients contact the MDS to retrieve this information from the MDT. The MDS is not involved in file read/write operations.
  • Object Storage Targets (OSTs): The OSTs store user file data in one or more logical objects that can be striped across multiple OSTs.
  • Object Storage Server (OSS): The OSS manages read/write operations for (typically) multiple OSTs.

Implementation at IU

At Indiana University, the UITS High Performance File Systems (HPFS) team operates the following Lustre-based file systems for use on IU's research supercomputers.

Slate

Slate is a centralized, high performance Lustre file system designed for the persistent storage of scholarly data to meet the needs of data-intensive workflows and analytics running on IU's research supercomputers. Slate is not subject to a purge policy.

Space on Slate is available to all IU research supercomputer users. To create a Slate account, follow the instructions in Get additional IU computing accounts.

When your account on Slate is created, your space will be mounted on the IU research supercomputers at (replace username with your IU username):

/N/slate/username

The default quota allotment is 800 GiB per user. Upon request, your quota may be increased to a maximum of 1.6 TiB. To request more space on Slate, contact the UITS High Performance File Systems (HPFS) group using the Research Technologies contact form (from the "Choose an area to direct your question to" drop-down, select High performance storage). Additionally, an inode quota (sometimes called "file quota") limits the number of objects a single user can create to 6.4 million.
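
Because the inode quota counts every file and directory, it can help to see which subdirectories hold the most entries. The following is a minimal sketch, not an official UITS tool: the function names count_entries and entries_by_subdir are hypothetical, and it assumes a find that supports -mindepth (GNU find does). Walking a very large tree generates many metadata requests, so consider running it on one subdirectory at a time.

```shell
# count_entries DIR
# Print the number of files and directories below DIR (DIR itself is not counted).
count_entries() {
    find "$1" -mindepth 1 | wc -l | tr -d ' '
}

# entries_by_subdir DIR
# Print "<count><TAB><path>" for each immediate subdirectory of DIR,
# largest count first.
entries_by_subdir() {
    for d in "$1"/*/; do
        [ -d "$d" ] || continue          # skip the unexpanded glob if DIR has no subdirectories
        printf '%s\t%s\n' "$(count_entries "$d")" "${d%/}"
    done | sort -rn
}

# Example (replace username with your IU username):
#   entries_by_subdir /N/slate/username
```

The totals reported this way are only an approximation of what counts against your quota; the quota report from the file system itself remains the authoritative figure.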

Slate-Project

Space on the Slate-Project file system is available to IU researchers who need shared/project space or an allocation larger than the 1.6 TiB available per user on Slate.

Initial project requests are limited to 30 TiB. Additional, non-billed Slate-Project capacity may be available upon request and is dependent on the requestor's total Slate-Project allocation. Project capacity is billed at $5.12 per TiB per month (an IU departmental account is required) when the total Slate-Project allocation exceeds 120 TiB.

To get space on the Slate-Project file system, fill out and submit the Slate-Project request form. Once your request is approved, you will be able to create a Slate-Project account using the instructions in Get additional IU computing accounts.

When created, your Slate-Project space will be mounted on the IU supercomputers at (replace project_name with your project name):

/N/project/project_name

Slate-Scratch

Slate-Scratch is a large-capacity, high-throughput, high-bandwidth Lustre-based file system designed for the temporary storage of computational data to meet the needs of data-intensive workflows and analytics running on Indiana University's research supercomputers.

Slate-Scratch directories are created automatically for all users with accounts on IU's research computing systems. If you have an account on an IU research computing system, your Slate-Scratch directory is mounted at (replace username with your IU username):

/N/scratch/username

Users are allotted 100 TiB of storage capacity. An inode quota (sometimes called "file quota") limits the number of files and directories a single user can create to 10 million.

Space on Slate-Scratch is not intended for permanent storage, and data are not backed up. Files in scratch space may be purged if they have not been accessed for more than 30 days.
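
To see which of your scratch files might be candidates for purging, you can list files by last access time. This is a minimal sketch under stated assumptions: the function name stale_files is hypothetical, and it assumes the access time reported by find reflects what the purge process uses, which may not be exactly the case.

```shell
# stale_files DIR [DAYS]
# List regular files under DIR whose last access time is more than DAYS
# days in the past (default: 30). find counts age in whole 24-hour periods,
# so "+30" matches files last accessed at least 31 full days ago.
stale_files() {
    find "$1" -type f -atime "+${2:-30}"
}

# Example (replace username with your IU username):
#   stale_files /N/scratch/username
#   stale_files /N/scratch/username/results 25   # look a few days ahead
```

Passing a smaller number of days, as in the second example, gives some lead time to copy data elsewhere before it reaches the 30-day mark.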

Some helpful commands

Following are some helpful commands for working with files on Lustre file systems:

  • Get the total amount of data stored, for example, in your Slate account (replace username with your IU username):
    du -hc /N/slate/username
  • Check your inode usage (the number of items stored in your allotted space), for example, on Slate (replace username with your IU username):
    lfs quota -h -u username /N/slate
  • List your files in reverse order by date modified:
    find . -type f -exec ls -hltr {} +
    Note:
    On Lustre file systems, using the ls command with the -l option to list the contents of a directory in long format can cause performance issues for you and other users, especially if the directory contains a large number of files. Because Lustre performs file read/write and metadata operations separately, executing ls -l involves contacting both the Lustre MDS (to get path, ownership, and permissions metadata) and one or more OSSs (which in turn must contact one or more OSTs to get information about the data objects that make up your files). Use ls -l only on individual files (for example, to get a file's actual, uncompressed size) or directories that contain a small number of files.
  • Set Lustre striping:
    lfs setstripe -c X <file|directory>

    In the example above, replace X with the number of stripes to set for a file or directory (the default is one stripe).

    Note:
    Too many stripes may negatively impact performance (16 should be the maximum). Also, setstripe does not affect existing data.
  • Show the number of stripes for a file and the OSTs on which the stripes are located:
    lfs getstripe <file|directory>
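
The striping commands above can be wrapped in a small helper that rejects stripe counts outside the range suggested in the note. This is a sketch only: stripe_dir, MAX_STRIPES, and DRY_RUN are hypothetical names introduced for illustration, and the limit of 16 is taken from the note above rather than enforced by Lustre itself.

```shell
# Upper bound on the stripe count, taken from the guidance above (an assumption,
# not a limit imposed by the lfs command).
MAX_STRIPES=16

# stripe_dir COUNT PATH
# Validate COUNT, then set the stripe count on PATH and show the result.
# With DRY_RUN=1, only print the command that would be run.
stripe_dir() {
    count=$1
    target=$2
    case $count in
        ''|*[!0-9]*)
            echo "stripe count must be a positive integer" >&2
            return 2 ;;
    esac
    if [ "$count" -lt 1 ] || [ "$count" -gt "$MAX_STRIPES" ]; then
        echo "stripe count must be between 1 and $MAX_STRIPES" >&2
        return 2
    fi
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "lfs setstripe -c $count $target"
        return 0
    fi
    lfs setstripe -c "$count" "$target" && lfs getstripe "$target"
}

# Example (replace project_name with your project name):
#   DRY_RUN=1 stripe_dir 4 /N/project/project_name/output
```

Because setstripe does not affect existing data, a helper like this is most useful on a new, empty directory so that files created there afterward pick up the new layout.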

Get help with Lustre file systems at IU

For technical support or general information about the Slate, Slate-Project, or Slate-Scratch file system, contact the UITS High Performance File Systems (HPFS) group using the Research Technologies contact form; from the "Choose an area to direct your question to" drop-down, select High performance storage.

To receive maintenance and downtime information, subscribe to the hpfs-maintenance-l@indiana.edu mailing list; see Subscribe to an IU List mailing list.

This is document ayfh in the Knowledge Base.
Last modified on 2024-08-08 12:17:27.