Using and Administering
You can create a monitor program that tracks jobs submitted using the
llsubmit subroutine. The schedd daemon invokes this monitor
program if the monitor_program argument to llsubmit is
not null. The monitor program is invoked each time a job step changes
state, so it is informed when the job step is started, completed,
vacated, removed, or rejected, or when it cannot be run. If you
suspect that the monitor program encountered problems or did not run,
check the schedd log. A monitor program failure does not prevent the
job from running.
The monitor program is invoked using the following syntax:
monitor_program job_id user_arg state exit_status
- monitor_program
  Is the name of the program supplied in the monitor_program
  argument passed to the llsubmit subroutine.
- job_id
  Is the full ID for the job step.
- user_arg
  Is the string supplied in the monitor_arg argument passed
  to the llsubmit subroutine.
- state
  Is the current state of the job step. Possible values for the state
  are:
  - JOB_STARTED
    The job step has started.
  - JOB_COMPLETED
    The job step has completed.
  - JOB_VACATED
    The job step has been vacated. The job step will be rescheduled if
    it is restartable or checkpointable.
  - JOB_REJECTED
    A startd daemon has rejected the job. The job will be
    rescheduled to another machine if possible.
  - JOB_REMOVED
    The job step was cancelled or could not be started.
  - JOB_NOTRUN
    The job step cannot be run because a dependency cannot be met.
- exit_status
  Is the exit status from the job step. This argument is meaningful
  only if the state is JOB_COMPLETED.
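
The argument list above can be illustrated with a minimal sketch of a monitor program, written here in Python. This is not taken from the LoadLeveler distribution: the names describe_event and KNOWN_STATES are hypothetical, and the sketch assumes that the schedd daemon always passes all four arguments after the program name and that printing to standard output is an acceptable way to report the event. A real monitor program would more likely write to a file or notify another service.

```python
#!/usr/bin/env python3
"""Hypothetical monitor program sketch; not part of LoadLeveler."""
import sys

# State values listed in the parameter description, with a short label each.
KNOWN_STATES = {
    "JOB_STARTED": "started",
    "JOB_COMPLETED": "completed",
    "JOB_VACATED": "vacated",
    "JOB_REJECTED": "rejected",
    "JOB_REMOVED": "removed",
    "JOB_NOTRUN": "not run",
}


def describe_event(job_id, user_arg, state, exit_status):
    """Return a one-line summary of a job step state change."""
    if state not in KNOWN_STATES:
        return f"{job_id}: unrecognized state {state!r} (user_arg={user_arg!r})"
    summary = f"{job_id}: {KNOWN_STATES[state]} (user_arg={user_arg!r})"
    # exit_status is meaningful only when the state is JOB_COMPLETED.
    if state == "JOB_COMPLETED":
        summary += f", exit status {exit_status}"
    return summary


def main(argv):
    # Assumption: argv holds the program name followed by
    # job_id, user_arg, state, and exit_status, in that order.
    if len(argv) != 5:
        print("usage: monitor_program job_id user_arg state exit_status",
              file=sys.stderr)
        return 2
    _, job_id, user_arg, state, exit_status = argv
    print(describe_event(job_id, user_arg, state, exit_status))
    return 0


if __name__ == "__main__":
    sys.exit(main(sys.argv))
```

To try the sketch by hand, run it with four arguments in the order shown in the syntax line, for example a made-up job step ID, a user string, JOB_COMPLETED, and 0. Because the exit status is ignored for every state other than JOB_COMPLETED, the program does not misreport a value that has no meaning for that state.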