Welcome to the Middle Tennessee State University (MTSU) Introduction to High-Performance Computing (HPC) resources. This repository is designed to assist Computational and Data Science (CDS) faculty and students in effectively utilizing the HPC clusters available at MTSU.
## Contents

- Prerequisites
- Cluster Overview
- Getting Started
- File Transfers
- SLURM Job Scripts
- Example Code
- Getting Help
- HPC Frequently Asked Questions (FAQ)
## Prerequisites

Before using the HPC clusters, ensure you have the following:

- **MTSU Account:**
  - You must have an active MTSU account and be added to the appropriate user groups to access the clusters.
- **SSH Client:**
  - Install an SSH client on your local system. Examples:
    - Linux/macOS: Use the built-in `ssh` command.
    - Windows: Use PuTTY or Windows Terminal.
- **At least one of the following if off campus:**
  - Jumphost Access:
    - This must be requested via the ITD Help Desk. More information is available at https://help.mtsu.edu/sp
  - VPN Access:
    - This will likely only be available to faculty/staff.
  - Note: If you are on campus, you can SSH directly to cluster resources; campus WiFi still requires the VPN or the jumphost. (See the connection examples at the end of this section.)
- **Basic Linux Knowledge:**
  - Familiarity with the Linux command line is essential for effective cluster usage.
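For example, connecting with an OpenSSH client (Linux/macOS or Windows Terminal) typically looks like the sketch below. The hostnames and username are placeholders, not the actual cluster or jumphost addresses; substitute the names provided to you.

```bash
# On campus (wired network): SSH directly to a cluster head node.
# "cluster.example.mtsu.edu" and "jumphost.example.mtsu.edu" are placeholder
# hostnames -- use the actual addresses provided by ITD/CSS.
ssh your_username@cluster.example.mtsu.edu

# Off campus, or on campus WiFi without VPN: hop through the jumphost
# using OpenSSH's ProxyJump (-J) option.
ssh -J your_username@jumphost.example.mtsu.edu your_username@cluster.example.mtsu.edu
```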
## Cluster Overview

For a detailed overview of the HPC clusters, including hardware specifications, available software modules, and storage resources, please refer to the Cluster Overview document.
## Getting Started

To begin using the HPC resources:

- **Accessing the Clusters:**
  - Use SSH to connect to the cluster head nodes.
  - Ensure you have the necessary credentials and VPN or jumphost access if required.
- **Loading Modules:**
  - The clusters use the Environment Modules system.
  - Load the required modules for your workflow using the `module load` command.
  - For Python environments, it is recommended to use the `miniconda` module to manage dependencies.
- **Submitting Jobs:**
  - Write SLURM job scripts to define your computational tasks.
  - Submit jobs using the `sbatch` command. (See the example session after this list.)
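Put together, a first session on a head node might look like the following sketch. The environment name, Python version, and job script filename are illustrative placeholders; run `module avail` to see what is actually installed.

```bash
# List available software modules, then load the recommended Python module.
module avail
module load miniconda

# One-time setup: create and activate a personal conda environment.
conda create -n myproject python=3.11 numpy
conda activate myproject

# Submit a SLURM job script and check its status.
sbatch my_job.slurm        # prints the assigned job ID
squeue -u $USER            # show your pending/running jobs
```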
## File Transfers

A detailed guide to remote file transfer, including example commands, is located here.
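If you prefer the command line, standard OpenSSH tools such as `scp` and `rsync` work from Linux, macOS, and Windows Terminal. The hostname, username, and paths below are placeholders.

```bash
# Push a local script to your home directory on the cluster.
scp ./my_script.py your_username@cluster.example.mtsu.edu:~/my_jobs/

# Synchronize a whole directory, skipping files that have not changed.
rsync -avz ./data/ your_username@cluster.example.mtsu.edu:~/data/

# Pull results back to your local machine after a job finishes.
scp your_username@cluster.example.mtsu.edu:~/my_jobs/results.csv .
```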
## SLURM Job Scripts

Pre-configured SLURM job scripts are available to assist you in submitting jobs efficiently. These scripts are located in the `/projects/examples/scripts` directory on the cluster. The available scripts include:

- `basic_cpu.slurm`: Template for basic CPU jobs.
- `array_job.slurm`: Template for job arrays.
- `gpu_job.slurm`: Template for GPU-accelerated jobs.
- `high_mem.slurm`: Template for high-memory jobs. (*Babbage only*)

To use these scripts:

- Copy the desired script to your working directory:
  `cp /projects/examples/scripts/basic_cpu.slurm ~/my_jobs/`
- Modify the script parameters as needed. (A sketch of typical directives follows these steps.)
- Submit the job:
  `sbatch ~/my_jobs/basic_cpu.slurm`
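If you have not edited a SLURM script before, the sketch below shows the general shape of one. It is not the contents of the provided `basic_cpu.slurm`; the resource values, module, and environment names are assumptions that you should adjust to match your job and the cluster.

```bash
#!/bin/bash
#SBATCH --job-name=my_analysis        # name shown in the queue
#SBATCH --output=my_analysis_%j.out   # log file (%j expands to the job ID)
#SBATCH --ntasks=1                    # number of tasks (processes)
#SBATCH --cpus-per-task=4             # CPU cores for the task
#SBATCH --mem=8G                      # memory for the job
#SBATCH --time=01:00:00               # wall-clock limit (HH:MM:SS)

# Load the software the job needs, then run it.
module load miniconda
conda activate myproject

python ~/my_jobs/my_script.py
```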
Detailed examples for submitting SLURM jobs can be found in SLURM Job Submission Examples.
## Example Code

Example code is provided to demonstrate various computational tasks:

- **Python:**
  - CPU-based matrix multiplication.
  - GPU-accelerated TensorFlow model training.
- **R:**
  - Parallel processing using the `foreach` and `doParallel` packages.

These examples are located in the `/projects/examples` directory on the cluster:

- Python scripts: `/projects/examples/python/`
- R scripts: `/projects/examples/r/`

Feel free to copy and modify these scripts to suit your research needs.
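For example, one way to try the GPU-accelerated TensorFlow example might be the workflow below. The exact filenames under `/projects/examples/python/` and the edits needed inside `gpu_job.slurm` are assumptions; inspect both before submitting.

```bash
# Copy the GPU job template and the Python examples into a working directory.
mkdir -p ~/my_jobs
cp /projects/examples/scripts/gpu_job.slurm ~/my_jobs/
cp -r /projects/examples/python ~/my_jobs/python

# See what the examples are called, then point the job script at the one you want.
ls ~/my_jobs/python
nano ~/my_jobs/gpu_job.slurm    # edit the line that launches Python

# Submit and monitor the job.
sbatch ~/my_jobs/gpu_job.slurm
squeue -u $USER
```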
## Getting Help

For assistance with HPC resources:
- Email: [email protected]
- Documentation: Additional documentation is available at help.mtsu.edu/kb under the Computational Science Systems Knowledge Base.
## HPC Frequently Asked Questions (FAQ)

For more details on HPC cluster usage, SLURM job management, and troubleshooting, see the HPC FAQ.