ARCHIVED: On XSEDE, what is Wrangler?

This content has been archived, and is no longer maintained by Indiana University. Information here may no longer be accurate, and links may no longer be available or reliable.

The Wrangler Data Analysis and Storage system is a transformational data management system available for use on the Extreme Science and Engineering Discovery Environment (XSEDE). Wrangler was developed and is operated through a partnership between Indiana University, the Texas Advanced Computing Center (TACC), and the University of Chicago, with funding support from the National Science Foundation (NSF).

Wrangler is designed to help advance complex, data-intensive research by meeting critical needs for storing, managing, moving, and analyzing massive, diverse data sets (i.e., Big Data).

The Wrangler system features 96 Dell R730 analytics nodes located at TACC, with an additional 20 nodes hosted at IU. Each node is equipped with two 12-core Intel Xeon E5-2680 v3 CPUs, 128 GB of 1,600 MHz DDR4 memory, and 146 GB of local disk storage. The TACC and IU systems are connected to the Internet via an aggregate 100 Gb/s link, providing a maximum potential network throughput of 200 Gb/s. Wrangler login and compute nodes are interconnected with 40-gigabit links. Wrangler is based on the CentOS 7 operating system, with job scheduling and distribution handled by the Simple Linux Utility for Resource Management (SLURM).
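As a rough worked example of the figures above, the following Python sketch totals the cores and memory at each site. It assumes every node at both sites matches the per-node configuration described here; the constant and function names are illustrative and are not part of any Wrangler tooling.

```python
# Node counts and per-node figures are taken from the description above.
# Assumption: all nodes at both sites share the same configuration.
NODES = {"TACC": 96, "IU": 20}
CPUS_PER_NODE = 2
CORES_PER_CPU = 12
MEMORY_GB_PER_NODE = 128


def site_totals(node_count):
    """Return (cores, memory_gb) for a site with node_count nodes."""
    cores = node_count * CPUS_PER_NODE * CORES_PER_CPU
    memory_gb = node_count * MEMORY_GB_PER_NODE
    return cores, memory_gb


totals = {site: site_totals(count) for site, count in NODES.items()}
total_cores = sum(cores for cores, _ in totals.values())
total_memory_gb = sum(memory for _, memory in totals.values())

for site, (cores, memory_gb) in totals.items():
    print(f"{site}: {cores} cores, {memory_gb} GB memory")
print(f"Combined: {total_cores} cores, {total_memory_gb} GB memory")
```

Under these assumptions, the two sites together provide 2,784 cores and 14,848 GB of memory.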

Wrangler at TACC features a high-speed flash storage tier capable of sustaining 1 TB/s of throughput and 250 million I/O operations per second (IOPS), enabling real-time analytics at scale. Wrangler at TACC also provides access to TACC shared resources and features.

Wrangler's bulk storage tier employs two identical 10 PB Lustre-based systems for secure, high-performance disk storage. Each system includes access to data replication via iRODS. Wrangler also includes mechanisms for custom databases, Big Data workflows, and software for data curation.
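To give a sense of scale for moving or replicating data between the two sites, the following Python sketch estimates an idealized transfer time from the storage size and network rates quoted in this document. It is a back-of-the-envelope calculation only: it assumes decimal units, ignores protocol and file system overhead, and does not model iRODS or Globus behavior. The function names are illustrative.

```python
# Idealized transfer-time estimate.
# Assumptions: 1 PB = 10**15 bytes, 1 Gb/s = 10**9 bits per second,
# and no protocol, disk, or file system overhead.
BYTES_PER_PB = 10**15
BITS_PER_GBIT = 10**9
SECONDS_PER_DAY = 86400


def transfer_seconds(size_pb, rate_gbps):
    """Return the ideal time in seconds to move size_pb petabytes at rate_gbps."""
    if rate_gbps <= 0:
        raise ValueError("rate_gbps must be positive")
    total_bits = size_pb * BYTES_PER_PB * 8
    return total_bits / (rate_gbps * BITS_PER_GBIT)


# 10 PB is the size of one bulk storage system; 100 and 200 Gb/s are the
# network figures quoted in the system description.
for rate in (100, 200):
    days = transfer_seconds(10, rate) / SECONDS_PER_DAY
    print(f"10 PB at {rate} Gb/s: {days:.1f} days (ideal)")
```

Even under these ideal conditions, moving a full 10 PB takes a little over nine days at 100 Gb/s and under five days at 200 Gb/s; real transfers would take longer.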

Wrangler is integrated with Globus to ensure rapid, reliable data transfers, and provides flexible support for a variety of analytics methods and technologies. For a list of software available on Wrangler at both IU and TACC, refer to the Software page in the XSEDE User Portal. Both sites use the LMOD implementation of the Environment Modules package, allowing users to dynamically customize their software environments.

XSEDE researchers can request Startup allocations or Research allocations through the XSEDE Resource Allocation System (XRAS). For instructions, see Request Steps in the XSEDE User Portal.

Once a Wrangler allocation is awarded, the principal investigator or a delegate must log in to the Wrangler Data Portal to manage users on the allocation and select required services.

XSEDE researchers can access Wrangler via GSI-OpenSSH using the XSEDE Single Sign-on Login Hub and their XSEDE-wide login credentials. For access via SSH, you must activate a TACC user account through the TACC User Portal. (TACC will email you a welcome message that includes your TACC User Portal username, a link to the TACC User Portal, and instructions for activating your account.)

For user documentation, see the TACC Wrangler User Guide on the XSEDE User Portal.

The following table summarizes the differences between the TACC and IU portions of Wrangler:

Component            TACC Wrangler            IU Wrangler
-------------------  -----------------------  ---------------------
Lustre disk storage  10 PB (raw)              10 PB (raw)
Compute nodes        96                       20
Flash memory         500 TB of DSSD           n/a
Login node           wrangler.tacc.xsede.org  wrangler.iu.xsede.org
Globus endpoint      xsede#wrangler           xsede#wrangler-iu
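
For scripting, the site-specific values in the table above could be kept in a small lookup structure. The Python sketch below is one way to do that and to build an SSH command line for a chosen site. The username `jdoe` and the helper name `ssh_command` are hypothetical; the hostnames and endpoint names come from the table and, because this content is archived, may no longer resolve. The code only builds the command and does not open a connection.

```python
import shlex

# Values copied from the comparison table above.
SITES = {
    "tacc": {
        "login_node": "wrangler.tacc.xsede.org",
        "globus_endpoint": "xsede#wrangler",
        "compute_nodes": 96,
    },
    "iu": {
        "login_node": "wrangler.iu.xsede.org",
        "globus_endpoint": "xsede#wrangler-iu",
        "compute_nodes": 20,
    },
}


def ssh_command(site, username):
    """Return the ssh argument list for the given site and username."""
    key = site.lower()
    if key not in SITES:
        raise KeyError(f"unknown site: {site!r}")
    return ["ssh", f"{username}@{SITES[key]['login_node']}"]


# "jdoe" is a placeholder username used only for illustration.
command = ssh_command("tacc", "jdoe")
print(shlex.join(command))
```

Returning the command as a list rather than a single string keeps it safe to pass to `subprocess.run` without shell quoting concerns.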

If you have questions about Wrangler, contact the XSEDE Help Desk.

For more about XSEDE compute, advanced visualization, storage, and special purpose systems, see the Resources Overview, Systems Monitor, and User Guides. For scheduled maintenance windows, outages, and other announcements related to XSEDE digital services, see User News.

The IU portion of Wrangler is managed by the High Performance File Systems (HPFS) group, part of the Systems directorate of UITS Research Technologies.

This document was developed with support from National Science Foundation (NSF) grants 1053575 and 1548562. Any opinions, findings, conclusions, or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the NSF.

This is document bdts in the Knowledge Base.
Last modified on 2018-02-21 14:06:14.