# Example Job Scripts
This page provides structured job script examples adapted for REPACSS. For definitions and scheduler behavior, refer to the Job Basics page. These examples are designed for CPU and GPU partitions such as `zen4`, `h100`, and `standard`.
**Tip:** Interactive jobs in GPU partitions are granted scheduling priority on REPACSS.
**Warning:** Each Slurm CPU is a hyperthread. To bind OpenMP threads to physical cores, use `--cpu-bind=cores`.
**Note:** To run multiple job steps in parallel within one job, launch them in the background with `&` and use `wait` to synchronize them before the script exits.
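As a concrete illustration of the `&`/`wait` pattern, the sketch below runs two job steps side by side inside a single allocation. The program names (`./task_a`, `./task_b`), the partition, and the per-step resource split are placeholders; adjust them to your own workload.

```bash
#!/bin/bash
#SBATCH --job-name=parallel_steps
#SBATCH --partition=zen4
#SBATCH --nodes=1
#SBATCH --ntasks=2
#SBATCH --cpus-per-task=4
#SBATCH --time=00:30:00

# Launch two job steps in the background; each takes one task from the allocation.
srun --ntasks=1 --cpus-per-task=4 ./task_a &   # placeholder program
srun --ntasks=1 --cpus-per-task=4 ./task_b &   # placeholder program

# Block until both background steps have finished before the job exits.
wait
```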
## Job Types
Jobs on REPACSS can be submitted in two main forms:
- **Interactive Jobs**: Real-time sessions for testing and debugging.

    ```bash
    interactive -c 8 -p h100
    ```

- **Batch Jobs**: Scheduled jobs submitted via script.

    ```bash
    sbatch job.sh
    sbatch -p zen4 job.sh
    sbatch -p h100 job.sh
    ```
## Script Templates

### Basic MPI Job Script
This example demonstrates how to run a pure MPI application. For details on available MPI implementations and their usage, see MPI Implementations and Usage Guidance.
**Note:** Steps 1-2 below show how to create and compile a test program. If you already have your own MPI program compiled, you can skip to step 3 and use your program instead.
1. Create a file named `mpi_program.c` with the following content:

    ```c
    #include <stdio.h>
    #include <mpi.h>

    int main(int argc, char **argv) {
        int rank, size;
        char processor_name[MPI_MAX_PROCESSOR_NAME];
        int name_len;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Get_processor_name(processor_name, &name_len);

        printf("Hello from processor %s, rank %d out of %d processors\n",
               processor_name, rank, size);

        MPI_Finalize();
        return 0;
    }
    ```

2. Load the required modules and compile the program:

    ```bash
    module load gcc/14.2.0
    module load mpich/4.1.2
    mpicc mpi_program.c -o mpi_program
    ```

3. Create a file named `mpi_job.sh` with the following contents. To determine your resource needs, refer to the Determine Resource Needs documentation.

    ```bash
    #!/bin/bash
    #SBATCH --job-name=mpi_job
    #SBATCH --output=mpi_job.out
    #SBATCH --error=mpi_job.err
    #SBATCH --partition=zen4
    #SBATCH --time=01:00:00
    #SBATCH --nodes=2
    #SBATCH --ntasks=8
    #SBATCH --cpus-per-task=1
    #SBATCH --mem-per-cpu=2G

    # Load modules
    module load gcc/14.2.0
    module load mpich/4.1.2

    # Run MPI program using srun (for OpenMPI/MPICH)
    srun -n 8 ./mpi_program
    ```

4. Make the script executable and submit it using `sbatch`:

    ```bash
    sbatch mpi_job.sh
    ```
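Once the job completes, the program's output lands in the file named by `--output`. A quick check might look like the sketch below; the hostnames shown are placeholders, and the rank ordering is not deterministic.

```bash
cat mpi_job.out
# Expected output (hostnames are placeholders; line order may vary):
# Hello from processor node001, rank 0 out of 8 processors
# Hello from processor node001, rank 1 out of 8 processors
# ...
# Hello from processor node002, rank 7 out of 8 processors
```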
**Note:** For Intel MPI, use `mpirun -np 8 ./mpi_program` instead of `srun`. See MPI Implementations and Usage Guidance for details.
### Hybrid MPI+OpenMP Job Script
This example demonstrates how to run a hybrid MPI+OpenMP application that uses both MPI for inter-node communication and OpenMP for intra-node parallelism.
**Note:** Steps 1-2 below show how to create and compile a test program. If you already have your own hybrid MPI+OpenMP program compiled, you can skip to step 3 and use your program instead.
1. Create a file named `hybrid_program.c` with the following content:

    ```c
    #include <stdio.h>
    #include <mpi.h>
    #include <omp.h>

    int main(int argc, char **argv) {
        int rank, size;
        int thread_id, num_threads;
        char processor_name[MPI_MAX_PROCESSOR_NAME];
        int name_len;

        MPI_Init(&argc, &argv);
        MPI_Comm_rank(MPI_COMM_WORLD, &rank);
        MPI_Comm_size(MPI_COMM_WORLD, &size);
        MPI_Get_processor_name(processor_name, &name_len);

        #pragma omp parallel private(thread_id, num_threads)
        {
            thread_id = omp_get_thread_num();
            num_threads = omp_get_num_threads();
            printf("MPI rank %d/%d on %s: OpenMP thread %d/%d\n",
                   rank, size, processor_name, thread_id, num_threads);
        }

        MPI_Finalize();
        return 0;
    }
    ```

2. Load the required modules and compile the program:

    ```bash
    module load gcc/14.2.0
    module load mpich/4.1.2
    mpicc -fopenmp hybrid_program.c -o hybrid_program
    ```

3. Create a file named `hybrid_job.sh` with the following contents. To determine your resource needs, refer to the Determine Resource Needs documentation.

    ```bash
    #!/bin/bash
    #SBATCH --job-name=hybrid_job
    #SBATCH --output=hybrid_job.out
    #SBATCH --error=hybrid_job.err
    #SBATCH --partition=zen4
    #SBATCH --time=01:00:00
    #SBATCH --nodes=2
    #SBATCH --ntasks=4
    #SBATCH --cpus-per-task=4
    #SBATCH --mem-per-cpu=2G

    # Load modules
    module load gcc/14.2.0
    module load mpich/4.1.2

    # Set OpenMP threads per MPI task
    export OMP_NUM_THREADS=4

    # Run hybrid program with CPU binding to physical cores
    srun --cpu-bind=cores -n 4 ./hybrid_program
    ```

4. Make the script executable and submit it using `sbatch`:

    ```bash
    sbatch hybrid_job.sh
    ```
**Tip:** This example uses 2 nodes with 4 MPI tasks and 4 OpenMP threads per task, for a total of 16 parallel threads. The `--cpu-bind=cores` flag ensures OpenMP threads are bound to physical cores rather than hyperthreads. Adjust `--ntasks`, `--cpus-per-task`, and `OMP_NUM_THREADS` based on your application's needs.
**Note:** For more information on MPI implementations and launchers, see MPI Implementations and Usage Guidance.
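Rather than hard-coding the thread and task counts, the last two lines of `hybrid_job.sh` can be derived from the environment variables Slurm exports when `--ntasks` and `--cpus-per-task` appear in the header. A minimal sketch:

```bash
# Match OpenMP threads to the --cpus-per-task request; fall back to 1
# if the variable is unset (e.g., when testing outside a Slurm job).
export OMP_NUM_THREADS=${SLURM_CPUS_PER_TASK:-1}

# Launch one MPI task per requested task, with threads bound to physical cores.
srun --cpu-bind=cores -n "${SLURM_NTASKS}" ./hybrid_program
```

This way a change to the `#SBATCH` header propagates automatically and the thread count never drifts out of sync with the allocation.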
### Python Job Script
1. Create a Python file named `script.py` with the following example content:

    ```python
    import time
    import platform

    print("SLURM Python job started.")
    print("Running on:", platform.node())

    time.sleep(10)  # Simulate workload

    print("Job complete. Goodbye!")
    ```

2. Create a file named `submit_python_job.sh` with the following content. To determine your resource needs, refer to the Determine Resource Needs documentation.

    ```bash
    #!/bin/bash
    #SBATCH --job-name=python_job
    #SBATCH --output=python_job.out
    #SBATCH --error=python_job.err
    #SBATCH --time=01:00:00
    #SBATCH --nodes=1
    #SBATCH --ntasks=1
    #SBATCH --cpus-per-task=8
    #SBATCH --mem=32G

    # Load required modules
    module load gcc

    # Activate conda environment
    source ~/miniforge3/etc/profile.d/conda.sh
    conda activate myenv

    # Run Python script
    python script.py
    ```

3. Make the script executable and submit it to SLURM:

    ```bash
    sbatch submit_python_job.sh
    ```
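A CPU-bound Python workload only benefits from the `--cpus-per-task=8` request if the script actually spreads work across those cores. The sketch below shows one way to do that with the standard library; the `work` function and the `range(100)` inputs are placeholders for your own workload.

```python
import os
from multiprocessing import Pool


def work(item):
    """Placeholder CPU-bound task."""
    return item * item


if __name__ == "__main__":
    # Size the worker pool from Slurm's --cpus-per-task request;
    # fall back to 1 when running outside a Slurm job.
    n_workers = int(os.environ.get("SLURM_CPUS_PER_TASK", "1"))
    with Pool(processes=n_workers) as pool:
        results = pool.map(work, range(100))
    print(f"Computed {len(results)} results using {n_workers} workers.")
```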
### GPU Job Script
**Warning:** We are currently working to make the CUDA module available system-wide for all users. In the meantime, please use CUDA via a Conda environment as described below.
1. Create and activate a new Conda environment:

    ```bash
    conda create --name cuda-env python=3.10 -y
    conda activate cuda-env
    ```

2. Install the CUDA Toolkit with `nvcc` support:

    ```bash
    conda install -c nvidia cuda-toolkit=12.9
    ```

3. Install a compatible GCC toolchain (GCC 11):

    ```bash
    conda install -c conda-forge gxx_linux-64=11
    ```

4. Create a sample CUDA program named `gpu_program.cu`:

    ```cuda
    #include <stdio.h>
    #include <cuda_runtime.h>

    __global__ void hello_from_gpu() {
        printf("Hello from GPU thread %d!\n", threadIdx.x);
    }

    int main() {
        printf("Starting GPU job...\n");

        hello_from_gpu<<<1, 64>>>();

        cudaError_t err = cudaGetLastError();
        if (err != cudaSuccess) {
            fprintf(stderr, "Kernel launch failed: %s\n", cudaGetErrorString(err));
            return 1;
        }

        err = cudaDeviceSynchronize();
        if (err != cudaSuccess) {
            fprintf(stderr, "CUDA error after kernel: %s\n", cudaGetErrorString(err));
            return 1;
        }

        printf("GPU job finished.\n");
        return 0;
    }
    ```

5. Compile the CUDA program for NVIDIA H100 GPUs (`sm_90`):

    ```bash
    nvcc -arch=sm_90 \
         -ccbin "$CONDA_PREFIX/bin/x86_64-conda-linux-gnu-g++" \
         -I"$CONDA_PREFIX/targets/x86_64-linux/include" \
         -L"$CONDA_PREFIX/targets/x86_64-linux/lib" \
         -o gpu_program gpu_program.cu
    ```

6. Create the SLURM job script `gpu_job.slurm`. To determine your resource needs, refer to the Determine Resource Needs documentation.

    ```bash
    #!/bin/bash
    #SBATCH --job-name=gpu_hello
    #SBATCH --output=gpu_hello.out
    #SBATCH --error=gpu_hello.err
    #SBATCH --partition=h100
    #SBATCH --gres=gpu:nvidia_h100_nvl:1
    #SBATCH --cpus-per-task=2
    #SBATCH --mem=4G
    #SBATCH --time=00:05:00

    source ~/miniforge3/etc/profile.d/conda.sh
    conda activate cuda-env

    ./gpu_program
    ```

7. Submit the job to SLURM:

    ```bash
    sbatch gpu_job.slurm
    ```
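When debugging GPU jobs, it can help to confirm which GPU the job actually received before the program runs. A small addition to `gpu_job.slurm`, just above the `./gpu_program` line, might look like the sketch below; it assumes `nvidia-smi` is available on the compute node, which is typical for NVIDIA GPU nodes.

```bash
# Log which GPU(s) Slurm assigned to this job before running the program.
echo "CUDA_VISIBLE_DEVICES=${CUDA_VISIBLE_DEVICES:-unset}"
nvidia-smi --query-gpu=name,memory.total --format=csv

./gpu_program
```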
## Job Management

### Submission
```bash
sbatch job.sh                             # Submit job
sbatch --array=1-10 job.sh                # Submit job array
sbatch --dependency=afterok:12345 job.sh  # Submit with dependency
```
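The `--array` form runs the same script once per index; inside the script, each array task can select its own input from `$SLURM_ARRAY_TASK_ID`. A minimal sketch is shown below; the input naming scheme and `./my_program` are placeholders, and `%A`/`%a` in the output pattern expand to the array job ID and task index.

```bash
#!/bin/bash
#SBATCH --job-name=array_job
#SBATCH --output=array_job_%A_%a.out   # %A = array job ID, %a = array task index
#SBATCH --partition=zen4
#SBATCH --time=00:30:00

# Submit with: sbatch --array=1-10 array_job.sh
# Each array task processes a different input, selected by its index.
INPUT="input_${SLURM_ARRAY_TASK_ID}.dat"   # placeholder naming scheme
echo "Array task ${SLURM_ARRAY_TASK_ID} processing ${INPUT}"
./my_program "${INPUT}"                    # placeholder program
```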
### Monitoring
```bash
squeue -u $USER   # View user's jobs
squeue -p zen4    # View jobs on zen4 partition
squeue -p h100    # View jobs on h100 partition
```
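For a single job, two more standard Slurm commands give finer detail; the job ID below is a placeholder, and `sacct` output assumes Slurm accounting is enabled on the cluster.

```bash
scontrol show job 12345                              # Detailed state of a pending or running job
sacct -j 12345 --format=JobID,State,Elapsed,MaxRSS   # Accounting summary after the job ends
```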
### Control
```bash
scancel 12345      # Cancel a specific job
scancel -u $USER   # Cancel all of your jobs
scancel -p zen4    # Cancel your jobs in the zen4 partition
```
## Resource Requests
- **CPU Jobs**: Specify `--nodes`, `--ntasks`, `--cpus-per-task`, and `--mem`
- **GPU Jobs**: Include `--gres=gpu:1` or more as needed
- **Python Jobs**:
    - See Python environment setup
    - Use `--cpus-per-task` for multi-threading
    - Set `--mem` appropriately for data requirements
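As a reference point, a header that combines these requests for a single-node GPU job might look like the sketch below; the specific values are illustrative, not recommendations for any particular workload.

```bash
#!/bin/bash
#SBATCH --partition=h100     # GPU partition
#SBATCH --nodes=1            # Single node
#SBATCH --ntasks=1           # One task (process)
#SBATCH --cpus-per-task=8    # CPU cores for that task
#SBATCH --mem=32G            # Memory for the node allocation
#SBATCH --gres=gpu:1         # One GPU
#SBATCH --time=02:00:00      # Wall-time limit
```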
For additional guidance, consult Slurm Documentation and REPACSS-specific resources.