File Systems
Vista has two major file systems of its own, which each user can access from any Vista node through the variables $HOME and $SCRATCH. These are NFS-based flash file systems from VAST Data. The $HOME storage area is relatively small, but it is backed up regularly. The much larger $SCRATCH storage area is designed for short-term use only, as any files in $SCRATCH are subject to periodic purge.
Vista also has a shared storage area, which is accessible to each user as $WORK. It is globally connected to all of TACC's HPC resources. This file system is based on Lustre, a parallel distributed file system designed for high performance as well as high capacity. The Parallel I/O topic of the Cornell Virtual Workshop describes Lustre in more detail.
Thus, Vista has one Lustre file system mounted, which each user sees as $WORK, and two VAST Data file systems, which each user sees as $HOME and $SCRATCH. $HOME is of course the standard location where each user lands upon logging in. All of these locations are accessible from any login or compute node, though their intended purposes and associated restrictions differ. The table below summarizes the defining features and limitations for each.
Remember that the Good Conduct guidelines state that you should not stress the shared filesystems.
TACC recommends storing all job-related data, both input and output, on $SCRATCH rather than $WORK. TACC has found that, because of NVIDIA's unified memory architecture, the Grace Hopper nodes use GPU memory as a file system cache, and this memory occasionally cannot be reclaimed after an application exits, resulting in a hung node that requires a reboot.
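As a minimal sketch of this recommendation, the shell function below copies a job's input files into a run directory under a scratch root, so that the job can read and write there. The function name stage_to_scratch, its arguments, and the directory layout are hypothetical and are not taken from TACC's documentation; adapt them to your own workflow.

```shell
# Hypothetical helper: copy a job's input directory into a run directory
# under a scratch root and print the path of that run directory.
#   $1 = input directory
#   $2 = scratch root (for example "$SCRATCH")
#   $3 = run name (any label you choose)
stage_to_scratch() {
    local src="$1"
    local scratch_root="$2"
    local run_name="$3"
    local run_dir="$scratch_root/$run_name"

    mkdir -p "$run_dir" || return 1
    # "src/." copies the contents of the directory, including dotfiles.
    cp -r "$src/." "$run_dir/" || return 1
    printf '%s\n' "$run_dir"
}

# Example use inside a job script (my_inputs and my_run are placeholders):
#   run_dir=$(stage_to_scratch "$WORK/my_inputs" "$SCRATCH" my_run)
#   cd "$run_dir"
```

Running the job from the printed directory keeps output on $SCRATCH as well. Results worth keeping still need to be copied elsewhere afterward, since $SCRATCH is purged.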
File System Details
| Filesystem | $HOME | $WORK | $SCRATCH |
|---|---|---|---|
| Intended Use | Storing source code and important files | Storing large files, packages, and software | Storing large files, I/O for jobs |
| Command-line Shortcuts to Access | cdh, cd ~, or cd | cdw | cds |
| Quota | 23 GB | 1 TB¹ | No restrictions |
| Maximum Files Allowed | 500,000 | 3,000,000 across all TACC systems | No restrictions |
| Lustre Defaults | N/A (VAST) | 1 stripe, 1 MB stripe size | N/A (VAST) |
| Backed Up | Yes | No | No |
| Purged | No | No | Yes, all files not accessed² in 10 days |
| Shared Between TACC Systems | No | Yes | No |
¹ This quota applies to the total size of all files in the $WORK mount across TACC systems, since the file system is shared. For more information, see the $STOCKYARD section below.
² A file counts as accessed when it is modified or read, or when it is executed on a compute node. For more information, see the Vista User Guide: File Systems.
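Because $SCRATCH files are purged after 10 days without access, it can help to list files that are approaching that limit. The sketch below uses the standard find command with its -atime test. The function name stale_files is hypothetical, and the sketch assumes that the access time reported by the file system matches what the purge policy measures, which is not confirmed here.

```shell
# Hypothetical helper: list regular files under a directory whose last
# access was at least N whole days ago.
#   $1 = directory to search (for example "$SCRATCH")
#   $2 = N, the age threshold in days
# find's "-atime +K" matches files last accessed K+1 or more whole days
# ago, so N-1 is passed to get "at least N days".
stale_files() {
    local dir="$1"
    local days="$2"
    find "$dir" -type f -atime +"$((days - 1))" -print
}

# Example: files in $SCRATCH not accessed for a week or more
#   stale_files "$SCRATCH" 7
```

Choosing a threshold below 10 days leaves time to copy anything important to $WORK or $HOME before it is purged.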
$STOCKYARD
TACC has set up a shared file system called $STOCKYARD for its various HPC systems, which contains a $WORK directory for each system on which the user has an allocation. This enables you to access certain files from any TACC system without the hassle of moving them around. You can reach your $STOCKYARD directory from Vista with the shortcut command cdy (or cdg).
For example, if one user's $STOCKYARD directory contains a directory for Frontera and a directory for Vista, we can see that the user has received allocations on Frontera and Vista. On each TACC system, $WORK evaluates to the specific $WORK directory of that system.
If you were to echo the $STOCKYARD and $WORK variables on Vista, you would see that the $WORK path is the $STOCKYARD path with a vista directory appended. So $STOCKYARD/vista is equivalent to the $WORK directory while on the Vista system. In examples of these paths, a number such as 10000 stands in for a numeric component that will be different for you, and <username> stands for your TACC username. For more information on the shared file systems, refer to the File Systems portion of the Vista User Guide.
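The relationship between the two variables can be sketched as a small shell function that builds a per-system work directory from a $STOCKYARD-style root. The function name work_dir_for is hypothetical. Only the vista directory name is taken from the text above, and names for other systems are assumptions.

```shell
# Hypothetical helper: given a $STOCKYARD-style root and a system name,
# print the matching per-system work directory.
#   $1 = stockyard root (for example "$STOCKYARD")
#   $2 = system directory name (for example "vista")
work_dir_for() {
    local stockyard="$1"
    local system="$2"
    # ${stockyard%/} drops one trailing slash, if present.
    printf '%s/%s\n' "${stockyard%/}" "$system"
}

# Example check on Vista; it should print "match" if the variables follow
# the layout described above:
#   [ "$(work_dir_for "$STOCKYARD" vista)" = "$WORK" ] && echo match
```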
See also: Navigating File Systems, from TACC Training.
CVW material development is supported by NSF OAC awards 1854828, 2321040, 2323116 (UT Austin) and 2005506 (Indiana University)