Running jobs

There are two ways to run your jobs: submitting a PBS script to the batch scheduler, or running a job interactively.

Submitting the PBS Script to the Batch Scheduler

To run our simple PBS script, we submit it to the batch scheduler using the command “qsub” followed by the name of the script we would like to run.

In the following example, we submit our “hello.pbs” script to the batch scheduler using “qsub”. Note that qsub returns the job identifier when the job is successfully submitted. You can use this job identifier to query the status of your job from your shell. For example:

shell> qsub hello.pbs

64152.wheeler-sn.alliance.unm.edu
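Because the job identifier is printed on standard output, it can be kept in a shell variable and reused. The following is a minimal sketch, not a required workflow: the variable names job_id and job_num are illustrative, and the identifier is hard-coded from the example above so that the snippet also runs off the cluster. On the cluster you would capture it with job_id=$(qsub hello.pbs) instead. The status query uses qstat, the standard PBS status command, and is skipped if qstat is not installed.

```shell
#!/bin/bash
# Assumption: on the cluster, replace the next line with
#   job_id=$(qsub hello.pbs)
# Here the identifier from the example above is used as sample input.
job_id="64152.wheeler-sn.alliance.unm.edu"

# Strip everything from the first dot onward to get the numeric part.
job_num="${job_id%%.*}"
echo "Job number: $job_num"

# Query the job status only where the PBS client tools are available.
if command -v qstat >/dev/null 2>&1; then
    qstat "$job_id" || echo "qstat could not find job $job_id"
fi
```

The full identifier and the numeric part are kept in separate variables because either form may be convenient, for example when naming log files after the job number.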

 

Interactive PBS Jobs

Normally a job is submitted for execution on a cluster or supercomputer using the command “qsub script.pbs”. However, at times, such as when debugging, it can be useful to run a job interactively. To run a job in this way, type “qsub -I” (a hyphen followed by a capital letter I), and the batch manager will log you into a node where you can run your code directly. For example, here is the output from an interactive session running a “hello world” MPI program on four cores of a single physical node:

rubeldas@wheeler-sn:~$ qsub -I -lnodes=1:ppn=4 -lwalltime=00:05:00

qsub: waiting for job 64143.wheeler-sn.alliance.unm.edu to start

qsub: job 64143.wheeler-sn.alliance.unm.edu ready

Wheeler Portable Batch System Prologue

Job Id: 64143.wheeler-sn.alliance.unm.edu

Username: rubeldas

Job 64143.wheeler-sn.alliance.unm.edu running on nodes:

wheeler287

 

prologue running on host: wheeler287

 

rubeldas@wheeler-sn:~$ module load openmpi/gnu

rubeldas@wheeler-sn:~$ mpiexec -np 4 ./helloworld 2>/dev/null

hello_parallel.f: Number of tasks= 4 My rank= 0 My name=wheeler-sn

hello_parallel.f: Number of tasks= 4 My rank= 1 My name=wheeler-sn

hello_parallel.f: Number of tasks= 4 My rank= 2 My name=wheeler-sn

hello_parallel.f: Number of tasks= 4 My rank= 3 My name=wheeler-sn

 

Three commands were executed here. The first, qsub -I -lnodes=1:ppn=4 -lwalltime=00:05:00, asked the batch manager to provide one node with four of that node’s cores for use. The walltime was specified as 5 minutes, since this was a simple code that would execute quickly. The second command, module load openmpi/gnu, loaded the module used when compiling the “hello world” program; this ensures that the necessary MPI libraries are available during execution. The third command, mpiexec -np 4 ./helloworld 2>/dev/null, ran the “hello world” program on four processes. (Standard error was redirected to /dev/null to suppress a spurious message that sometimes appears on the machine.)
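Once the commands work interactively, the same three steps can be placed in a batch script and submitted with qsub. The following is a sketch under stated assumptions, not the site's official template: the resource requests simply repeat the ones used in the interactive session, the variable NP is illustrative, and the guards around module and mpiexec are there only so that the script exits cleanly on a machine without them. PBS_O_WORKDIR is the environment variable PBS sets to the directory from which the job was submitted.

```shell
#!/bin/bash
#PBS -l nodes=1:ppn=4
#PBS -l walltime=00:05:00

# Number of MPI processes; chosen to match ppn=4 above.
NP=4

# Start in the submission directory (falls back to the current
# directory when the script is run outside of PBS).
cd "${PBS_O_WORKDIR:-$PWD}"

# Load the MPI module used to compile the program, if the module
# command is available in this shell.
if command -v module >/dev/null 2>&1; then
    module load openmpi/gnu
fi

# Run the program; standard error is discarded as in the
# interactive session.
if command -v mpiexec >/dev/null 2>&1 && [ -x ./helloworld ]; then
    mpiexec -np "$NP" ./helloworld 2>/dev/null
else
    echo "mpiexec or ./helloworld not found; submit this script on the cluster"
fi
```

Saved under a name of your choice, the script is submitted the same way as hello.pbs, and the program output is written to the job's output file rather than to the terminal.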