There are several distinct file spaces available on TRACE-HPC, each serving a different function.
- $HOME, your home directory on TRACE-HPC
- $PROJECT, persistent file storage on Ocean. $PROJECT is a larger space than $HOME.
- $LOCAL, Scratch storage on local disk on the node running a job
- $RAMDISK, Scratch storage in the local memory associated with a running job
File expiration
...
Three months after your grant expires, all of your TRACE-HPC files associated with that grant will be deleted, no matter which file space they are in. You will be able to log in during this three-month period to transfer files, but you will not be able to run jobs or create new files.
File permissions
Access to files in any TRACE-HPC space is governed by Unix file permissions. If your data has additional security or compliance requirements, please contact trace-it-help@andrew.cmu.edu.
Unix file permissions
For detailed information on Unix file protections, see the man page for the chmod (change mode) command.
To share files with your group, give the group read and execute access for each directory from your top-level directory down to the directory that contains the files you want to share.
chmod g+rx directory-name
Then give the group read and execute access to each file you want to share.
chmod g+rx filename
To give the group the ability to edit or change a file, add write access to the group:
chmod g+rwx filename
Access Control Lists
If you want more fine-grained control than Unix file permissions allow (for example, if you want to give only certain members of a group access to a file, but not all members), then you need to use Access Control Lists (ACLs). Suppose, for example, that you want to give janeuser read access to a file in a directory, but no one else in the group.
Use the setfacl (set file acl) command to give janeuser read and execute access on the directory:
setfacl -m user:janeuser:rx directory-name
for each directory from your top-level directory down to the directory that contains the file you want to share with janeuser. Then give janeuser access to a specific file with
setfacl -m user:janeuser:r filename
User janeuser will now be able to read this file, but no one else in the group will have access to it.
To see what ACLs are set on a file, use the getfacl (get file acl) command.
There are man pages for chmod, setfacl and getfacl.
$HOME
This is your TRACE-HPC home directory. It is the usual location for your batch scripts, source code and parameter files. Its path is /jet/home/username, where username is your TRACE username. You can refer to your home directory with the environment variable $HOME. Your home directory is visible to all of TRACE-HPC's nodes.
Your home directory is backed up daily, although it is still a good idea to store copies of your important files in another location, such as the Ocean file system or a local file system at your site. If you need to recover a home directory file from backup, send email to trace-it-help@andrew.cmu.edu. Recovery can take 3 to 4 days.
$HOME quota
Your home directory has a 25GB quota. You can check your usage with the my_quotas command. For the best access speed, keep your usage well below the quota.
Grant expiration
Three months after a grant expires, the files in your home directory associated with that grant will be deleted.
$PROJECT
$PROJECT is persistent file storage. It is larger than your space in $HOME.
The path of your $PROJECT directory on Ocean is /ocean/projects/groupname/username, where groupname is the Unix group id associated with your grant. Use the id command to find your group name.
The command id -Gn will list all the Unix groups you belong to.
The command id -gn will list the Unix group associated with your current session.
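As a sketch, you can combine the output of id with the path layout above to print the likely location of your $PROJECT directory for the current session (this assumes the /ocean/projects/groupname/username layout described above):

```shell
group=$(id -gn)   # Unix group associated with the current session
user=$(id -un)    # your username

# Print the $PROJECT path for this group, following the documented layout.
echo "/ocean/projects/$group/$user"
```

If you belong to several grants, substitute each group name from `id -Gn` to get the corresponding $PROJECT path.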
If you have more than one grant, you will have a $PROJECT directory for each grant. Be sure to use the appropriate directory when working with multiple grants.
$PROJECT quota
Storage quota
Your usage quota for each of your grants is the Ocean storage allocation you received when your proposal was approved. If your total use in Ocean exceeds this quota you won't be able to run jobs on TRACE-HPC until you are under quota again.
Use the my_quotas or projects command to check your Ocean usage. You can also check your usage on the TRACE User Portal.
If you have multiple grants, it is very important that you store your files in the correct $PROJECT directory.
Inode quota
In order to best serve all TRACE-HPC users, an inode allocation has been established for $PROJECT. It will be enforced in addition to the storage quota for your grant. The inode allocation is proportional to the size of your storage quota, and is set at 6070 inodes per GB of storage allocated. There is currently no inode quota on home directories in the Jet file system.
Inodes are data structures that contain metadata about a file, such as the file size, user and group ids associated with the file, permission settings, time stamps, and more. Each file has at least one inode associated with it.
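To see inode consumption concretely, you can count entries with find: each file or directory listed corresponds to an inode. A sketch using throwaway files in a temporary directory:

```shell
# Create a temporary directory holding two small files.
dir=$(mktemp -d)
touch "$dir/a.txt" "$dir/b.txt"

# The directory itself plus two files: three entries, three inodes.
find "$dir" | wc -l   # prints 3
```

Running `find directory-name | wc -l` against a real $PROJECT directory gives a quick estimate of how many inodes it consumes.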
To view your usage on TRACE-HPC, use the my_quotas command which shows your limits as well as your current usage.
[user@trace-hpc-login013 ~]$ my_quotas
The quota for project directory /ocean/projects/abcd1234
Storage quota: 9.766T
Storage used: 1.384T
Inode quota: 60,700,000
Inodes used: 453,596
Tips to reduce your inode usage:
- Delete files which are no longer needed
- Combine small files into one larger file via tools such as zip or tar
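For example, tar can replace a directory holding many small files with a single archive. A runnable sketch in a temporary working directory; `smallfiles` and the file names are hypothetical:

```shell
work=$(mktemp -d) && cd "$work"

# Make a throwaway directory holding several small files.
mkdir -p smallfiles
for i in 1 2 3; do echo "data $i" > "smallfiles/file$i.txt"; done

# One compressed archive consumes far fewer inodes than the originals.
tar -czf smallfiles.tar.gz smallfiles
rm -rf smallfiles              # reclaim the inodes used by the small files

tar -tzf smallfiles.tar.gz     # verify the contents are preserved
```

Extract with `tar -xzf smallfiles.tar.gz` when you need the individual files again.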
Should you need to increase your storage allocation or inode limit, please submit a supplement via the TRACE allocations system. If you have questions, please email trace-it-help@andrew.cmu.edu.
Grant expiration
Three months after a grant expires, the files in any Ocean directories associated with that grant will be deleted.
$LOCAL
Each of TRACE-HPC's nodes has a local file system attached to it. This local file system is only visible to the node to which it is attached, and provides fast access to local storage.
In a running job, this file space is available as $LOCAL.
If your application performs a lot of small reads and writes, then you could benefit from using this space.
Node-local storage is only available when your job is running, and can only be used as working space for a running job. Once your job finishes, any files written to $LOCAL are inaccessible and deleted. To use local space, copy files to it at the beginning of your job and back out to a persistent file space before your job ends.
If a node crashes, all node-local files are lost. During long runs, you should checkpoint these files by copying them to Ocean.
$LOCAL size
The maximum amount of local space varies by node type.
To check on your local file space usage type:
du -sh
No Service Units accrue for the use of $LOCAL.
Using $LOCAL
To use $LOCAL, you must first copy your files there at the beginning of your script, before your executable runs. The following script is an example of how to do this:
RC=1
n=0
while [[ $RC -ne 0 && $n -lt 20 ]]; do
    rsync -aP "$sourcedir" "$LOCAL/"
    RC=$?
    n=$((n + 1))
    sleep 10
done
Set $sourcedir to point to the directory that contains the files to be copied before you call your executable. This code will try at most 20 times to copy your files. If it succeeds, the loop will exit. If an invocation of rsync was unsuccessful, the loop will try again and pick up where it left off.
At the end of your job you must copy your results back from $LOCAL or they will be lost. The following script will do this:
mkdir -p "$PROJECT/results"
RC=1
n=0
while [[ $RC -ne 0 && $n -lt 20 ]]; do
    rsync -aP "$LOCAL/" "$PROJECT/results"
    RC=$?
    n=$((n + 1))
    sleep 10
done
This code fragment copies your files to a directory in your Ocean file space named results, created by the mkdir command at the start of the script. It will loop at most 20 times and stop as soon as it succeeds.
$RAMDISK
You can use the memory allocated for your job for IO rather than using disk space. In a running job, the environment variable $RAMDISK will refer to the memory associated with the nodes in use.
The amount of memory space available to you depends on the size of the memory on the nodes and the number of nodes you are using. You can only perform IO to the memory of nodes assigned to your job.
If you do not use all of the cores on a node, you are allocated memory in proportion to the number of cores you are using. Note that you cannot use 100% of a node's memory for IO; some is needed for program and data usage.
This space is only available to you while your job is running, and can only be used as working space for a running job. Once your job ends this space is inaccessible and files there are deleted. To use $RAMDISK, copy files to it at the beginning of your job and back out to a permanent space before your job ends. If your job terminates abnormally, files in $RAMDISK are lost.
Within your job you can cd to $RAMDISK, copy files to and from it, and use it to open files. Use the command du -sh to see how much space you are using.
If you are running a multi-node job, the $RAMDISK variable points to the memory space on the node running your rank 0 process.
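The stage-in, compute, stage-out pattern described above can be sketched as a job-script fragment. So that the sketch runs outside a real job, $RAMDISK and $PROJECT are simulated here with temporary directories, and input.dat/output.dat are placeholder names for your real files:

```shell
# In a real job the environment provides $RAMDISK and $PROJECT;
# fall back to temporary directories so this sketch is self-contained.
RAMDISK=${RAMDISK:-$(mktemp -d)}
PROJECT=${PROJECT:-$(mktemp -d)}

work=$(mktemp -d) && cd "$work"
echo "sample input" > input.dat               # placeholder input file

cp input.dat "$RAMDISK/"                      # stage input into memory-backed space
( cd "$RAMDISK" && cp input.dat output.dat )  # stand-in for the real computation
du -sh "$RAMDISK"                             # check memory-backed space in use
cp "$RAMDISK/output.dat" "$PROJECT/"          # copy results out before the job ends
```

The final copy is the essential step: anything left in $RAMDISK when the job ends is deleted.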