
Checking on running jobs

Checking on the status of your Job:

If you would like to check the status of your job, you can use the “qstat” command. The “-a” option causes PBS to display more information about the jobs currently in the scheduler (shown further below). If you would like to see the status of a single job only, run the following from your shell:

shell> qstat 55811

Here, 55811 is the job identifier.
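If you want more detail about a single job, most PBS/Torque installations also accept the “-f” (full) option, which prints the complete record for the job, and the “-u” option, which limits the listing to jobs owned by a particular user. For example (substitute your own job identifier and username):

shell> qstat -f 55811
shell> qstat -u yourusername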

Note that your job can be in one of three states while it is in the scheduler: Running, Queued, or Exiting, denoted by R, Q, and E respectively in the job state column (the column labeled “S”). You can see all these states when you use the following command.

shell> qstat -a
wheeler-sn.alliance.unm.edu:
                                                                           Req'd     Req'd        Elap
Job ID                  Username  Queue    Jobname       SessID  NDS  TSK  Memory    Time    S    Time
----------------------  --------  -------  ------------  ------  ---  ---  ------  --------  -  --------
55811.wheeler-sn.allia  saurii    default  B19F_re5e4         0    4   32      --  48:00:00  R  47:30:42
63875.wheeler-sn.allia  yqwang    default  exp3_round4       --    1   11      --  48:00:00  Q        --
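Because “qstat -a” lists every job in the scheduler, you may find it convenient to filter the listing down to your own jobs. One simple way to do this, assuming a standard Linux shell in which $USER holds your login name, is to pipe the output through grep:

shell> qstat -a | grep $USER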

Determining which nodes your Job is using:

If you would like to check which nodes your job is using, you can pass the “-n” option to qstat. Note that if you currently have a job running on a node of the machine, you may freely log into that node to check on the status of your job. When your job finishes, all of your processes on that node will be killed by the system, and the node will be released back into the available resource pool.

shell> qstat -an
wheeler-sn.alliance.unm.edu:
                                                                                Req'd     Req'd        Elap
Job ID                       Username  Queue    Jobname       SessID  NDS  TSK  Memory    Time    S    Time
---------------------------  --------  -------  ------------  ------  ---  ---  ------  --------  -  --------
55811.wheeler-sn.alliance.u  saurii    default  B19F_re5e4         0    4   32      --  48:00:00  R  47:30:42
      wheeler296/0-7+wheeler295/0-7+wheeler282/0-7+wheeler280/0-7

Here, the job is running on nodes wheeler296, wheeler295, wheeler282, and wheeler280, using 8 processor cores (0-7) on each node.
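For example, using the node names above (taken from this sample output; substitute a node that your own job has actually been assigned), you could log into one of the nodes and check on your processes there:

shell> ssh wheeler296
shell> top -u $USER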

 

Viewing Output and Error Files:

Once your job has completed, you should see two files in the directory from which you submitted the job: channel.pbs.o55811 and channel.pbs.e55811 (where “channel” refers to the name of the PBS script and 55811 refers to the numerical portion of the job identifier returned by qsub). Any output the job sends to “standard output” will be written to the channel.pbs.o55811 file, and any output sent to “standard error” will be written to the channel.pbs.e55811 file. These files are referred to as the “output file” and the “error file,” respectively, throughout this document. For the example job, the error file is empty, and the output file contains the following:

Wheeler Portable Batch System Prologue
Job Id: 36680.wheeler-sn.alliance.unm.edu
Username: rubeldas
Job 36680.wheeler-sn.alliance.unm.edu running on nodes:
wheeler288 wheeler281
prologue running on host: wheeler288

However, the output file may contain more information, depending on what your PBS script writes to standard output.
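To inspect these files once the job has finished, you can view them with standard tools from the submission directory, for example:

shell> cat channel.pbs.o55811
shell> cat channel.pbs.e55811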