Using and Administering


LoadLeveler Job States

As LoadLeveler processes a job, the job moves into various states. Some states are unique to specific daemons; for example, only the negotiator places a job in the NotQueued state. For more information on daemons, see Daemons and Processes. Possible job states are:

Cancelled
The job was cancelled either by a user or by an administrator.

Completed
The job has completed.

Deferred
The job will not be assigned to a machine until a specified date. This date may have been specified by the user in the job command file, or it may have been generated by the negotiator because a parallel job could not accumulate enough machines to run. (Only the negotiator places a job in the Deferred state.)

Idle
The job is being considered to run on a machine, though no machine has been selected.

NotQueued
The job is not being considered to run on a machine. A job can enter this state because the associated schedd is down, because the user or group associated with the job is at its maxqueued or maxidle limit, or because the job has a dependency that cannot be determined. For more information on the maxqueued and maxidle keywords, see Controlling the Mix of Idle and Running Jobs. (Only the negotiator places a job in the NotQueued state.)
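The three conditions above can be read as a simple predicate. The following is a minimal sketch, not LoadLeveler code: the function names, the argument names, and the use of None to mean "no limit configured" are all assumptions made for illustration, and the sketch treats "at its maximum" as the current count having reached the limit. A group limit would be checked the same way as a user limit.

```python
def at_limit(count, limit):
    """True when a limit is set (not None) and the count has reached it."""
    return limit is not None and count >= limit


def is_not_queued(schedd_up, queued, idle, maxqueued=None, maxidle=None,
                  dependency_determinable=True):
    """Hypothetical predicate mirroring the three reasons listed above."""
    # Reason 1: the schedd associated with the job is down.
    if not schedd_up:
        return True
    # Reason 2: the user (or group) is at its maxqueued or maxidle limit.
    if at_limit(queued, maxqueued) or at_limit(idle, maxidle):
        return True
    # Reason 3: the job has a dependency that cannot be determined.
    return not dependency_determinable


print(is_not_queued(True, queued=5, idle=1, maxqueued=5))   # True
print(is_not_queued(True, queued=2, idle=1, maxqueued=5))   # False
```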

Not Run
The job will never be run because a dependency associated with the job was found to be false.

Pending
The job is in the process of starting on one or more machines. (The negotiator indicates this state until the schedd acknowledges that it has received the request to start the job. Then the negotiator changes the state of the job to Starting. The schedd indicates the Pending state until all startd machines have acknowledged receipt of the start request. The schedd then changes the state of the job to Starting.)
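The schedd's side of this handshake can be sketched as follows. This is an illustration only: the function name schedd_view and the set-based bookkeeping are hypothetical, and the only rule taken from the description above is that the schedd indicates Pending until every startd machine has acknowledged the start request.

```python
def schedd_view(assigned_machines, acknowledged_machines):
    """Return the state the schedd would indicate for a job being started.

    Hypothetical helper: models only the rule that the job stays Pending
    until every assigned startd machine has acknowledged the start request.
    """
    assigned = set(assigned_machines)
    acknowledged = set(acknowledged_machines)
    # Pending while at least one assigned machine has not acknowledged.
    if not assigned <= acknowledged:
        return "Pending"
    return "Starting"


print(schedd_view(["node1", "node2"], ["node1"]))            # Pending
print(schedd_view(["node1", "node2"], ["node1", "node2"]))   # Starting
```

Because the negotiator and the schedd each apply their own rule, the two daemons may briefly report different states for the same job.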

Reject Pending
The job did not start. A job can be rejected because its requirements were not met on the target machine, or because the user ID of the person running the job is not valid on the target machine. After a job leaves the Reject Pending state, it moves into one of the following states: Idle, User Hold, or Removed.
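A script that records job states over time could use the successor list above as a sanity check. The sketch below is hypothetical: the function name and the idea of a recorded state history are assumptions, and only the three successor states come from the description above.

```python
# Successor states of Reject Pending, as listed in the description above.
REJECT_PENDING_SUCCESSORS = frozenset({"Idle", "User Hold", "Removed"})


def unexpected_after_reject(history):
    """Return states that directly follow Reject Pending in a recorded
    history but are not among the documented successors."""
    return [nxt for cur, nxt in zip(history, history[1:])
            if cur == "Reject Pending" and nxt not in REJECT_PENDING_SUCCESSORS]


print(unexpected_after_reject(["Idle", "Pending", "Reject Pending", "Idle"]))  # []
print(unexpected_after_reject(["Pending", "Reject Pending", "Running"]))       # ['Running']
```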

Removed
The job was stopped by LoadLeveler.

Remove Pending
The job is in the process of being removed, but not all associated machines have acknowledged the removal of the job.

Running
The job is running: the job was dispatched and has started on the designated machine.

Starting
The job is starting: the job was dispatched, was received by the target machine, and LoadLeveler is setting up the environment in which to run the job. For a parallel job, LoadLeveler sets up the environment on all required nodes. See the description of the "Pending" state for more information on when the negotiator or the schedd daemon moves a job into the Starting state.

System Hold
The job has been put in system hold.

System User Hold
The job has been put in system hold and user hold.

Terminated
If the negotiator and schedd daemons experience communication problems, they may be temporarily unable to exchange information concerning the status of jobs in the system. During this period of time, some of the jobs may actually complete and therefore be removed from the schedd's list of active jobs. When communication resumes between the two daemons, the negotiator will move such jobs to the Terminated state, where they will remain for a set period of time (specified by the NEGOTIATOR_REMOVE_COMPLETED keyword in the configuration file). When this time has passed, the negotiator will remove the jobs from its active list.
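The retention behavior described above can be sketched as a periodic purge. This is a simplified model, not the negotiator's implementation: the function name, the dictionary of entry timestamps, and the assumption that NEGOTIATOR_REMOVE_COMPLETED is expressed in seconds are all introduced here for illustration.

```python
import time


def purge_terminated(jobs, remove_completed_seconds, now=None):
    """Return the Terminated jobs that are still within the retention period.

    jobs maps a job ID to the timestamp (seconds) at which the job entered
    the Terminated state. remove_completed_seconds stands in for the
    NEGOTIATOR_REMOVE_COMPLETED value (assumed here to be in seconds).
    """
    now = time.time() if now is None else now
    # Keep a job only while it has spent less than the retention period
    # in the Terminated state.
    return {job_id: entered for job_id, entered in jobs.items()
            if now - entered < remove_completed_seconds}


jobs = {"job.1": 100.0, "job.2": 150.0}
print(purge_terminated(jobs, 60, now=200.0))   # {'job.2': 150.0}
```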

User Hold
The job has been put in user hold.

Vacated
The job started but did not complete. The negotiator will reschedule the job (provided the job is allowed to be rescheduled). Possible reasons why a job moves to the Vacated state are: the machine where the job was running was flushed, the VACATE expression in the configuration file evaluated to True, or LoadLeveler detected a condition indicating the job needed to be vacated. For more information on the VACATE expression, see Step 8: Manage a Job's Status Using Control Expressions.

You may also see other states that include "Pending," such as Complete Pending and Vacate Pending. These are intermediate, temporary states usually associated with parallel jobs.

