Using and Administering

The Monitor Program

Purpose

You can create a monitor program that tracks jobs submitted through the llsubmit subroutine. The schedd daemon invokes this monitor program if the monitor_program argument to llsubmit is not null. The monitor program is invoked each time a job step changes state, so it is informed when the job step is started, completed, vacated, removed, or rejected. If you suspect that the monitor program encountered problems or did not run, check the schedd log. If the monitor program fails, the job still runs.

Syntax

monitor_program job_id user_arg state exit_status
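
As an illustration of the syntax above, the following C sketch shows one way a monitor program might collect its four arguments. The monitor_event structure and the parse_monitor_args function are hypothetical names introduced for this example and are not part of LoadLeveler. The sketch also assumes that exit_status arrives as a decimal string, and it tolerates a non-numeric value rather than failing.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical container for the arguments named in the Syntax line. */
typedef struct {
    const char *job_id;      /* full ID of the job step */
    const char *user_arg;    /* string given as monitor_arg to llsubmit */
    const char *state;       /* for example "JOB_STARTED" */
    long exit_status;        /* parsed value, 0 if not numeric */
    int has_exit_status;     /* 1 if argv[4] was a decimal number */
} monitor_event;

/*
 * Fills *ev from the argument vector of a monitor program.
 * Returns 0 on success, or -1 if the argument count does not match
 * "monitor_program job_id user_arg state exit_status".
 */
int parse_monitor_args(int argc, char *argv[], monitor_event *ev)
{
    char *end = NULL;

    assert(ev != NULL);
    if (argc != 5 || argv == NULL)
        return -1;

    ev->job_id = argv[1];
    ev->user_arg = argv[2];
    ev->state = argv[3];

    /* Assumption: exit_status is passed as a decimal string. */
    ev->exit_status = strtol(argv[4], &end, 10);
    ev->has_exit_status = (end != argv[4] && *end == '\0');
    if (!ev->has_exit_status)
        ev->exit_status = 0;

    return 0;
}
```

A monitor program built on this sketch would call parse_monitor_args(argc, argv, &ev) from main and return a nonzero status if it reports -1.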

Parameters

monitor_program
Is the name of the program supplied in the monitor_program argument passed to the llsubmit function.

job_id
Is the full ID for the job step.

user_arg
Is the string supplied in the monitor_arg argument passed to the llsubmit function.

state
Is the current state of the job step. Possible values for the state are:

JOB_STARTED
The job step has started.

JOB_COMPLETED
The job step has completed.

JOB_VACATED
The job step has been vacated. The job step will be rescheduled if the job step is restartable or if it is checkpointable.

JOB_REJECTED
A startd daemon has rejected the job. The job will be rescheduled to another machine if possible.

JOB_REMOVED
The job step was cancelled or could not be started.

JOB_NOTRUN
The job step cannot be run because a dependency cannot be met.

exit_status
Is the exit status from the job step. The argument is meaningful only if the state is JOB_COMPLETED.
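
The state values and the exit_status rule described above can be sketched in C as follows. The step_state enumeration and the three helper functions are hypothetical names chosen for this example. The only facts they encode are those stated in the parameter descriptions: exit_status is meaningful only for JOB_COMPLETED, and a vacated or rejected job step may be rescheduled.

```c
#include <string.h>

/* Hypothetical enumeration of the state strings listed under Parameters. */
typedef enum {
    STEP_STARTED,
    STEP_COMPLETED,
    STEP_VACATED,
    STEP_REJECTED,
    STEP_REMOVED,
    STEP_NOTRUN,
    STEP_UNKNOWN    /* any string not listed in this section */
} step_state;

/* Maps the state argument to the enumeration above. */
step_state parse_state(const char *s)
{
    if (s == NULL)                        return STEP_UNKNOWN;
    if (strcmp(s, "JOB_STARTED") == 0)    return STEP_STARTED;
    if (strcmp(s, "JOB_COMPLETED") == 0)  return STEP_COMPLETED;
    if (strcmp(s, "JOB_VACATED") == 0)    return STEP_VACATED;
    if (strcmp(s, "JOB_REJECTED") == 0)   return STEP_REJECTED;
    if (strcmp(s, "JOB_REMOVED") == 0)    return STEP_REMOVED;
    if (strcmp(s, "JOB_NOTRUN") == 0)     return STEP_NOTRUN;
    return STEP_UNKNOWN;
}

/* exit_status is meaningful only when the state is JOB_COMPLETED. */
int exit_status_is_meaningful(step_state st)
{
    return st == STEP_COMPLETED;
}

/*
 * A vacated step is rescheduled if it is restartable or checkpointable,
 * and a rejected job is rescheduled to another machine if possible, so
 * further state changes may follow either of these states.
 */
int may_be_rescheduled(step_state st)
{
    return st == STEP_VACATED || st == STEP_REJECTED;
}
```

Keeping the string comparison in one function means the rest of a monitor program can switch on the enumeration and ignore exit_status unless exit_status_is_meaningful returns 1.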

