Using and Administering

Chapter 1. What is LoadLeveler?

LoadLeveler is a job management system that allows users to run more jobs in less time by matching their processing needs to available resources. LoadLeveler serves as a job scheduler and provides a facility for building, submitting, and processing jobs quickly and efficiently in a dynamic environment.

Figure 1 shows the different environments to which LoadLeveler can schedule jobs. Together, these environments comprise the LoadLeveler cluster. An environment can include heterogeneous clusters, dedicated nodes, and the RISC System/6000 Scalable POWERparallel System (SP).

Figure 1. Example of a LoadLeveler Configuration


In addition, LoadLeveler can schedule jobs written for NQS to machines outside of the LoadLeveler cluster for execution. As Figure 1 also illustrates, a LoadLeveler cluster can include submit-only machines, which allow users to have access to a limited number of LoadLeveler features. This type of machine is further discussed in "Roles of Machines".

How LoadLeveler Works

This section describes how LoadLeveler works by introducing some basic job scheduling concepts.

What Does a Network Job Management and Job Scheduling System Do?

A network job management and job scheduling system, such as LoadLeveler, is a software program that schedules and manages jobs that you submit to one or more machines under its control. LoadLeveler accepts jobs that users submit and reviews the job requirements. LoadLeveler then examines the machines under its control to determine which machines are best suited to run each job.

Jobs

LoadLeveler schedules your jobs on one or more machines for processing. In this context, a job is a set of job steps. For each job step, you can specify a different executable. (The executable is the part of the job that gets processed.) You can use LoadLeveler to submit jobs that are made up of one or more job steps, where each job step depends upon the completion status of a previous job step. For example, Figure 2 illustrates a stream of job steps:

Figure 2. LoadLeveler Job Steps


Each of these job steps is defined in a single job command file. A job command file specifies the name of the job, as well as the job steps that you want to submit, and can contain other LoadLeveler statements.

LoadLeveler tries to execute each of your job steps on a machine that has enough resources to support executing and checkpointing each step. If your job command file has multiple job steps, the job steps will not necessarily run on the same machine, unless you explicitly request that they do.
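The dependency between job steps described above can be pictured with a small model. The following Python sketch is illustrative only and is not LoadLeveler code or job command file syntax: the names JobStep, run_job, and depends_on are hypothetical, and each step's executable is stood in for by a function that returns an exit status. It assumes that a step whose dependency test fails is simply not run, and that any step depending on a step that did not run is not run either.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class JobStep:
    """One step of a job; `run` stands in for the step's executable."""
    name: str
    run: Callable[[], int]                               # returns an exit status
    depends_on: Optional[Callable[[int], bool]] = None   # test on the previous status


def run_job(steps: list[JobStep]) -> dict[str, str]:
    """Run steps in order, skipping any step whose dependency test fails."""
    outcome: dict[str, str] = {}
    previous_status: Optional[int] = None   # None: the previous step did not run
    for index, step in enumerate(steps):
        if index > 0 and step.depends_on is not None:
            if previous_status is None or not step.depends_on(previous_status):
                outcome[step.name] = "not run"
                previous_status = None
                continue
        previous_status = step.run()
        outcome[step.name] = f"exit {previous_status}"
    return outcome


# A three-step stream: step2 runs only if step1 exits with 0, and so on.
steps = [
    JobStep("step1", run=lambda: 0),
    JobStep("step2", run=lambda: 1, depends_on=lambda status: status == 0),
    JobStep("step3", run=lambda: 0, depends_on=lambda status: status == 0),
]
print(run_job(steps))
```

In this example step3 is not run, because step2 ended with a nonzero status.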

You can submit batch jobs to LoadLeveler for scheduling. Batch jobs run in the background and generally do not require any input from the user. Batch jobs can be either serial or parallel. A serial job runs on a single machine. A parallel job (a job that was written using a parallel language Application Program Interface, or API) is separated into multiple parts that can be processed simultaneously by several machines.

Machines and Workstations

In order for LoadLeveler to schedule a job on a machine, the machine must be a valid member of the LoadLeveler cluster. A cluster is the combination of all of the different types of machines that use LoadLeveler. The following types of machines can be in a LoadLeveler cluster:

RISC System/6000 (and compatible hardware running AIX)
SP System

To make a machine a member of the LoadLeveler cluster, the administrator has to install the LoadLeveler software onto the machine and identify the central manager (described in "Roles of Machines"). Once the machine becomes a valid member of the cluster, LoadLeveler can schedule jobs to the machine.

Roles of Machines

Each machine in the LoadLeveler cluster performs one or more roles that make job scheduling possible. These roles are described below:

Scheduling Machine: When a job is submitted, it gets placed in a queue managed by a scheduling machine. This machine contacts another machine that serves as the central manager for the entire LoadLeveler cluster. (This role is described below.) The scheduling machine asks the central manager to find a machine that can run the job, and it keeps persistent information about the job. Some scheduling machines are known as public scheduling machines, meaning any LoadLeveler user can access them. These machines schedule jobs submitted from submit-only machines, which are described below.
Central Manager Machine: The role of the central manager is to examine the job's requirements and find one or more machines in the LoadLeveler cluster that will run the job. Once it finds the machine(s), it notifies the scheduling machine.
Executing Machine: The machine that runs the job is known as the executing machine.
Submitting Machine: This type of machine is known as a submit-only machine. It participates in the LoadLeveler cluster on a limited basis. Although the name implies that users of these machines can only submit jobs, they can also query and cancel jobs. Users of these machines also have their own Graphical User Interface (GUI) that provides them with the submit-only subset of functions. The submit-only machine feature allows workstations that are not part of the LoadLeveler cluster to submit jobs to the cluster.

Keep in mind that one machine can assume multiple roles.

Machine Availability

There may be times when some of the machines in the LoadLeveler cluster are not available to process jobs, for example, when the owners of the machines have decided to make them unavailable. Because LoadLeveler allows owners to restrict the use of their machines, owners keep flexibility and control over their resources.

Machine owners can make their personal workstations available to other LoadLeveler users in several ways. For example, you can specify that:

The machine will always be available
The machine will be available only between certain hours
The machine will be available when the keyboard and mouse are not being used interactively

Owners can also specify that their personal workstations will never be available to other LoadLeveler users.
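The availability choices listed above can be modeled as a small policy check. The following Python sketch is only an illustration and does not show how LoadLeveler is configured: the names Policy, Workstation, and is_available are hypothetical, and the 900-second idle threshold is an arbitrary value chosen for the example.

```python
from dataclasses import dataclass
from datetime import time
from enum import Enum
from typing import Optional, Tuple


class Policy(Enum):
    ALWAYS = "always available"
    HOURS = "available only between certain hours"
    IDLE = "available when keyboard and mouse are idle"
    NEVER = "never available"


@dataclass
class Workstation:
    name: str
    policy: Policy
    window: Optional[Tuple[time, time]] = None   # (start, end), used by Policy.HOURS
    idle_threshold_s: int = 900                  # hypothetical value, used by Policy.IDLE


def is_available(ws: Workstation, now: time, seconds_since_input: int) -> bool:
    """Decide whether other users' jobs may run on this workstation."""
    if ws.policy is Policy.ALWAYS:
        return True
    if ws.policy is Policy.NEVER:
        return False
    if ws.policy is Policy.IDLE:
        return seconds_since_input >= ws.idle_threshold_s
    if ws.window is None:
        raise ValueError("Policy.HOURS requires a (start, end) window")
    start, end = ws.window
    if start <= end:
        return start <= now < end
    return now >= start or now < end   # the window wraps past midnight


overnight = Workstation("ws1", Policy.HOURS, window=(time(18, 0), time(6, 0)))
print(is_available(overnight, time(23, 0), seconds_since_input=0))
print(is_available(overnight, time(12, 0), seconds_since_input=0))
```

The overnight window in the example wraps past midnight, which is why the check compares against the start and end separately in that case.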

LoadLeveler Daemons

This section lists the daemons that LoadLeveler uses to process jobs. For more detailed information, see "Daemons and Processes".

LoadL_master: Referred to as the master daemon, this daemon manages all LoadLeveler daemons on its machine. The master daemon runs on all machines in the cluster.
LoadL_schedd: Referred to as the schedd daemon, this daemon manages a list of jobs submitted to the machine. The schedd daemon runs on all scheduling machines in the cluster.
LoadL_startd: Referred to as the startd daemon, this daemon accepts jobs to be run on the machine where startd runs. The startd daemon runs on all executing machines in the cluster.
LoadL_starter: Spawned by the startd daemon, the starter process manages a running job on the executing machine. The starter process runs on all executing machines in the cluster.
LoadL_kbdd: Referred to as the keyboard daemon, this daemon monitors keyboard and mouse activity. The keyboard daemon runs on all executing machines in the cluster.
LoadL_negotiator: Referred to as the negotiator daemon, this daemon collects job status and machine status from all machines in the LoadLeveler cluster, and makes decisions on where the jobs should be run. The negotiator daemon runs on the LoadLeveler central manager machine.
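Because one machine can assume multiple roles, the set of daemons on a machine follows from the roles it holds. The following Python sketch restates the list above as a lookup table. The daemon names come from this section, but the role labels and the daemons_for function are hypothetical and exist only for this illustration.

```python
# Daemons started for each role, as described in the list above.
# LoadL_starter is omitted because the startd daemon spawns it for a
# running job rather than it being started with the machine's daemons.
ROLE_DAEMONS = {
    "scheduling": ["LoadL_schedd"],
    "executing": ["LoadL_startd", "LoadL_kbdd"],
    "central_manager": ["LoadL_negotiator"],
}


def daemons_for(roles: set[str]) -> list[str]:
    """Return the daemons expected on a machine that holds the given roles."""
    daemons = ["LoadL_master"]          # the master daemon runs on all machines
    for role in sorted(roles):          # sorted only to make the order stable
        daemons.extend(ROLE_DAEMONS[role])
    return daemons


print(daemons_for({"scheduling", "executing"}))
```

A machine that both schedules and executes jobs would therefore be expected to run the master, startd, keyboard, and schedd daemons.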

How Does LoadLeveler Schedule Jobs to Run on Machines?

Once a user submits a job to LoadLeveler, LoadLeveler examines the job to determine what resources it needs to run. Then, LoadLeveler determines which machines in the LoadLeveler cluster are best suited to run the job. Once one or more appropriate machines are found, LoadLeveler dispatches the job to them. To provide this function, LoadLeveler uses the concept of queues.

A job queue is a list of jobs that are waiting to be processed. When a user submits a job to LoadLeveler, the job enters into an internal database that resides on one of the machines in the LoadLeveler cluster until it is ready to be dispatched to another machine to be run, as shown in Figure 3.

Figure 3. Job Queues


Once LoadLeveler examines the job to determine its required resources, the job is dispatched to a machine to be processed. Arrows 2 and 3 indicate that the job can be dispatched to either one machine or, in the case of parallel jobs, to multiple machines. Once the job reaches the executing machine, the job runs.

Jobs are not necessarily dispatched to machines in the cluster on a first-come, first-served basis. Instead, LoadLeveler examines the requirements and characteristics of the job and the availability of machines, and determines the best time for the job to be dispatched.

LoadLeveler also uses the concept of job classes to schedule jobs to run on machines. A job class is a classification to which a job can belong. For example, short running jobs may belong to a job class called short_jobs. Similarly, jobs that are only allowed to run on the weekends may belong to a class called weekend. The system administrator can define these job classes and select the users that are authorized to submit jobs of these classes. For more information, see "Step 3: Specify Class Stanzas".

You can specify which types of jobs will run on a machine by specifying the type(s) of job classes the machine will support. For more information, see "Step 1: Specify Machine Stanzas".

LoadLeveler also examines a job's priority to determine when to schedule the job on a machine. A job's priority determines its position in the list of all jobs waiting to be dispatched. For more information on job priority, see "Setting and Changing the Priority of a Job".
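The combined effect of job classes and priority can be sketched as follows. This Python example is a simplified model, not LoadLeveler's actual scheduling algorithm: the Job and Machine types and the dispatch function are hypothetical, each machine is assumed to run one job at a time, and a larger priority number is assumed to mean "dispatch sooner".

```python
from dataclasses import dataclass


@dataclass
class Job:
    name: str
    job_class: str
    priority: int    # assumption: a larger number is dispatched sooner
    submitted: int   # submission order, used only to break priority ties


@dataclass
class Machine:
    name: str
    classes: set[str]   # job classes this machine supports
    busy: bool = False


def dispatch(waiting: list[Job], machines: list[Machine]) -> list[tuple[str, str]]:
    """Pair waiting jobs with free machines that support the job's class."""
    assignments: list[tuple[str, str]] = []
    ordered = sorted(waiting, key=lambda job: (-job.priority, job.submitted))
    for job in ordered:
        for machine in machines:
            if not machine.busy and job.job_class in machine.classes:
                machine.busy = True
                assignments.append((job.name, machine.name))
                break
    return assignments


jobs = [
    Job("a", "short_jobs", priority=10, submitted=1),
    Job("b", "weekend", priority=90, submitted=2),
    Job("c", "short_jobs", priority=50, submitted=3),
]
machines = [Machine("m1", {"short_jobs"}), Machine("m2", {"short_jobs", "weekend"})]
print(dispatch(jobs, machines))
```

In this example job a was submitted first but is dispatched last (here, not at all, because both machines are taken), which illustrates why dispatch order is not first-come, first-served.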

The LoadLeveler Job Cycle

Figure 4 illustrates the information flow through the LoadLeveler cluster:

Figure 4. High-Level Job Flow


A LoadLeveler cluster has a managing machine, known as the central manager, as well as machines that act as scheduling machines and machines that serve as executing machines. The arrows in Figure 4 illustrate the following:

Arrow 1 indicates that a job has been submitted to LoadLeveler.
Arrow 2 indicates that the scheduling machine contacts the central manager to inform it that a job has been submitted and to find out if a machine exists that matches the job requirements.
Arrow 3 indicates that the central manager checks to determine if a machine exists that is capable of running the job. Once a machine is found, the central manager informs the scheduling machine which machine is available.
Arrow 4 indicates that the scheduling machine contacts the executing machine and provides it with information regarding the job.
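The four arrows can be read as a short sequence of calls between three parties. The following Python sketch is a toy model under stated assumptions, not a description of LoadLeveler internals: the class and method names are hypothetical, and a single memory figure stands in for a job's full set of requirements.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ExecutingMachine:
    name: str
    memory_mb: int
    received: list[str] = field(default_factory=list)


class CentralManager:
    def __init__(self, machines: list[ExecutingMachine]):
        self.machines = machines

    def find_machine(self, required_memory_mb: int) -> Optional[ExecutingMachine]:
        """Arrow 3: look for a machine capable of running the job."""
        for machine in self.machines:
            if machine.memory_mb >= required_memory_mb:
                return machine
        return None


class SchedulingMachine:
    def __init__(self, central_manager: CentralManager):
        self.central_manager = central_manager
        self.queue: list[str] = []

    def submit(self, job_name: str, required_memory_mb: int) -> Optional[str]:
        self.queue.append(job_name)                                      # arrow 1
        target = self.central_manager.find_machine(required_memory_mb)  # arrows 2 and 3
        if target is None:
            return None                     # no match yet; the job stays queued
        self.queue.remove(job_name)
        target.received.append(job_name)    # arrow 4
        return target.name


small = ExecutingMachine("small", memory_mb=512)
big = ExecutingMachine("big", memory_mb=4096)
scheduler = SchedulingMachine(CentralManager([small, big]))
print(scheduler.submit("sim", required_memory_mb=2048))
```

Note that the central manager only chooses the machine. In this model, as in the arrows above, it is the scheduling machine that hands the job information to the executing machine.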

Figure 4 is broken down into the following more detailed diagrams illustrating how LoadLeveler processes a job.

Submit a LoadLeveler job:
Figure 5. Job is Submitted to LoadLeveler


Figure 5 illustrates that the schedd daemon runs on the scheduling machine. This machine can also have the startd daemon running on it. The negotiator daemon resides on the central manager machine. The arrows in Figure 5 illustrate the following:
- Arrow 1 indicates that a job has been submitted to the scheduling machine.
- Arrow 2 indicates that the schedd daemon, on the scheduling machine, stores all of the relevant job information on local disk.
- Arrow 3 indicates that the schedd daemon sends job description information to the negotiator daemon.
Permit to run:
Figure 6. LoadLeveler Authorizes the Job


In Figure 6, arrow 4 indicates that the negotiator daemon authorizes the schedd daemon to begin taking steps to run the job. This authorization is called a permit to run. Once this is done, the job is considered Pending or Starting. (See "LoadLeveler Job States" for more information.)
Prepare to run:
Figure 7. LoadLeveler Prepares to Run the Job


In Figure 7, arrow 5 illustrates that the schedd daemon contacts the startd daemon on the executing machine and requests that it start the job. The executing machine can either be a local machine (the machine from which the job was submitted) or a remote machine (another machine in the cluster).
Initiate job:
Figure 8. LoadLeveler Starts the Job


The arrows in Figure 8 illustrate the following:
- The two arrows numbered 6 indicate that the startd daemon on the executing machine spawns a starter process and awaits more work.
- The two arrows numbered 7 indicate that the schedd daemon sends the starter process the job information and the executable.
- Arrow 8 indicates that the schedd daemon notifies the negotiator daemon that the job has been started and the negotiator daemon marks the job as Running. (See "LoadLeveler Job States" for more information.)
The starter forks and executes the user's job, and the starter parent waits for the child to complete.
Complete job:
Figure 9. LoadLeveler Completes the Job


The arrows in Figure 9 illustrate the following:
- The arrows numbered 9 indicate that when the job completes, the starter process notifies the startd daemon, and the startd daemon notifies the schedd daemon.
- Arrow 10 indicates that the schedd daemon examines the information it has received and forwards it to the negotiator daemon.
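The last two stages, in which the starter runs the user's job as a child process, waits for it, and the completion is reported back up the chain, can be sketched in Python. This is an illustration only and not the starter's actual implementation: the function names are hypothetical, and subprocess.run is used here as a convenient way to start a child process and wait for it to finish.

```python
import subprocess
import sys


def run_and_wait(argv: list[str]) -> int:
    """Start the user's program as a child process and block until it exits."""
    completed = subprocess.run(argv)
    return completed.returncode


def report_completion(status: int) -> list[tuple[str, str, int]]:
    """List the notifications sent when a job completes (arrows 9 and 10)."""
    hops = [
        ("starter", "startd"),      # arrows numbered 9
        ("startd", "schedd"),       # arrows numbered 9
        ("schedd", "negotiator"),   # arrow 10
    ]
    return [(sender, receiver, status) for sender, receiver in hops]


# Stand-in for a user's job: a child Python process that exits with status 3.
status = run_and_wait([sys.executable, "-c", "raise SystemExit(3)"])
for sender, receiver, code in report_completion(status):
    print(f"{sender} -> {receiver}: job ended with status {code}")
```

The parent does nothing else while the child runs, which mirrors the description of the starter parent waiting for the child to complete.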
