Using and Administering

Task Scenario Using the Graphical User Interface

The tasks described in this chapter are those that you, as a user, might be interested in accomplishing. They are presented as a typical step-by-step scenario. You do not have to follow the steps in the order shown here, but some tasks must be performed before others can succeed. For example, you cannot submit a job if you do not have a job command file that you built using either the GUI or an editor.

Step 1: Build a Parallel Job

From the Jobs window:

SELECT
File > Build a Job > Parallel

The dialog box shown in Figure 33 appears:

Figure 33. LoadLeveler Build a Job Window


Complete those fields for which you want to override what is currently specified in your skel.cmd defaults file. A sample skel.cmd file is found in /usr/LoadL/full/samples. You can update this file to define defaults for your site, and then update the *skelfile resource in Xloadl to point to your new skel.cmd file. If you want a personal defaults file, copy skel.cmd to one of your directories, edit the file, and update the *skelfile resource in .Xdefaults.
Field: Input
Executable: Name of the program to run. It must be an executable file.

Optional. If omitted, the command file is executed as if it were a shell script.

Arguments: Parameters to pass to the program.

Required only if the executable requires them.

Stdin: Filename to use as standard input (stdin) by the program.

Optional. The default is /dev/null.

Stdout: Filename to use as standard output (stdout) by the program.

Optional. The default is /dev/null.

Stderr: Filename to use as standard error (stderr) by the program.

Optional. The default is /dev/null.

Initialdir: Initial directory. LoadLeveler changes to this directory before running the job.

Optional. The default is your current working directory.

Notify User: User ID of the person to notify regarding the status of the submitted job.

Optional. The default is your user ID.

StartDate: Month, day, and year in the format mm/dd/yy. The job will not start before this date.

Optional. The default is to run the job as soon as possible.

StartTime: Hour, minute, and second in the format hh:mm:ss. The job will not start before this time.

Optional. The default is to run the job as soon as possible.

If you specify StartTime but not StartDate, the default StartDate is the current day. If you specify StartDate but not StartTime, the default StartTime is 00:00:00. This means that the job will start as soon as possible on the specified date.

Priority: Number between 0 and 100, inclusive.

Optional. The default is 50.

This is the user priority. For more information on this priority, refer to Setting and Changing the Priority of a Job.

Image size: Number in kilobytes that reflects the maximum size you expect your program to grow to as it runs.

Optional.

Class: Class type. The job will only run on machines that support the specified class type. Your system administrator defines the class types.

Optional. You can press the Choices button to get a list of available classes. Press the Details button under the class list to verify your permissions.

Hold: Hold status of the submitted job. Permitted values are:
user
user hold
system
system hold (only valid for LoadLeveler administrators)
usersys
user and system hold (only valid for LoadLeveler administrators)

Optional. The default is a no-hold state. Press the Set button to set this field.

Account Number: Number associated with the job. For use with the llacctmrg and llsummary commands for acquiring job accounting data.

Optional. Required only if the ACCT keyword is set to A_VALIDATE in the configuration file.

Environment: Specifies your initial environment variables when your job starts. Separate environment specifications with semicolons.

Optional.

Shell: The name of the shell to use for the job.

Optional. If not specified, the shell used in the owner's password file entry is used. If none is specified, /bin/sh is used.

Group: The LoadLeveler group name to which the job belongs.

Optional.

Step Name: The name of this job step.

Optional.

Node Usage: How the node is used. Permitted values are:

shared
The node can be shared with other tasks of other job steps. This is the default.

not shared
The node cannot be shared.

Optional. Press the Set button to set this field.

Dependency: A Boolean expression defining the relationship between the job steps.

Optional.

Comments: Comments associated with the job. These comments help to distinguish one job from another job.

Optional.

Note: The fields that appear in this table are what you see when viewing the Build a Job window. The text in these fields does not necessarily correspond with the keywords listed in Job Command File Keywords.

See Job Command File Keywords for information on the defaults associated with these keywords.
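
The defaulting rule for the StartDate and StartTime fields can be sketched in a few lines of Python. This is only an illustration of the rule as stated in the table, not LoadLeveler code: the function name resolve_earliest_start is hypothetical, and the handling of two-digit years follows Python's own strptime conventions, which may differ from LoadLeveler's.

```python
from datetime import date, datetime, time
from typing import Optional


def resolve_earliest_start(
    start_date: Optional[str],
    start_time: Optional[str],
    today: date,
) -> Optional[datetime]:
    """Return the earliest start moment, or None for 'as soon as possible'.

    Hypothetical helper: it mirrors the defaults described for the
    StartDate (mm/dd/yy) and StartTime (hh:mm:ss) fields.
    """
    if not start_date and not start_time:
        return None  # neither field set: run as soon as possible
    # A missing StartDate falls back to the current day.
    day = datetime.strptime(start_date, "%m/%d/%y").date() if start_date else today
    # A missing StartTime falls back to 00:00:00.
    clock = datetime.strptime(start_time, "%H:%M:%S").time() if start_time else time(0, 0, 0)
    return datetime.combine(day, clock)
```

For example, passing only a StartTime of 14:30:00 yields a start moment on the current day, and passing only a StartDate yields midnight at the beginning of that date.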

SELECT
a Job Type if you want to change the job type you selected on the Build A Job cascading window.

Your choices are:

Serial
Specifies a serial job.
Parallel
Specifies a non-PVM parallel job.
PVM
Specifies a PVM parallel job.

Note that the job type you select affects the choices that are active on the Build A Job window.

SELECT
a Notification option

Your choices are:

Always
Notify you when the job starts, completes, and if it incurs errors.
Complete
Notify you when the job completes. This is the default option as initially defined in the skel.cmd file.
Error
Notify you if the job cannot run because of an error.
Never
Do not notify you.
Start
Notify you when the job starts.

SELECT
a Checkpoint option.

Your choices are:

No
Do not checkpoint the job. This is the default.
User
Yes, checkpoint the job at intervals you determine. See checkpoint for more information.
System
Yes, checkpoint the job at intervals determined by LoadLeveler. See checkpoint for more information.

SELECT
a Restart option

Your choices are:

No
Do not restart the job.
Yes
Yes, restart the job from an existing checkpoint file when you submit the job.

SELECT
Nodes (available when the job type is parallel)

The Nodes dialog box appears.

Complete the necessary fields to specify node information for a parallel job. Depending upon which model you choose, different fields will be available; any unavailable fields will be greyed out. LoadLeveler will assign defaults for any fields that you leave blank.
Field (Available in): Input
Min # of Nodes (Tasks Per Node Model and Tasks with Uniform Blocking Model): Minimum number of nodes required for running the parallel job. For more information, see node.

Optional. The default is one.

Max # of Nodes (Tasks Per Node Model): Maximum number of nodes required for running the parallel job. For more information, see node.

Optional. The default is the minimum number of nodes.

Tasks per Node (Tasks Per Node Model): The number of tasks of the parallel job you want to run per node. For more information, see tasks_per_node.

Optional.

Total Tasks (Tasks with Uniform Blocking Model, and Custom Blocking Model): The total number of tasks of the parallel job you want to run on all available nodes. For more information, see total_tasks.

Optional for Uniform, required for Custom Blocking. The default is one.

Blocking (Custom Blocking Model): The number of tasks assigned (as a block) to each consecutive node until all of a job's tasks have been assigned. For more information, see blocking.

Task Geometry (Custom Geometry Model): The task IDs of each task that you want to run on each node. You can use the "Set Geometry" button for step-by-step directions. For more information, see task_geometry.

SELECT
Close to return to the Build a Job dialog box.
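
The Blocking field in the Nodes dialog box describes tasks being handed out in fixed-size blocks to consecutive nodes. The following Python sketch shows one simplified reading of that description. The function name assign_blocks is hypothetical, and the sketch assumes that each node receives exactly one block and that the last node takes whatever remains; the scheduler's real placement may differ.

```python
from typing import List


def assign_blocks(total_tasks: int, blocking: int) -> List[int]:
    """Return the number of tasks placed on each consecutive node.

    Simplified illustration only: one block per node, in order,
    until every task has been assigned.
    """
    if total_tasks < 1 or blocking < 1:
        raise ValueError("total_tasks and blocking must be positive integers")
    per_node: List[int] = []
    remaining = total_tasks
    while remaining > 0:
        block = min(blocking, remaining)  # the last node may get a partial block
        per_node.append(block)
        remaining -= block
    return per_node
```

Under these assumptions, 10 total tasks with a blocking value of 4 would occupy three nodes holding 4, 4, and 2 tasks.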

SELECT
Network (available when the job type is parallel)

The Network dialog box appears.

Complete those fields for which you want to specify network information. For more information, see network.
Field: Input
MPI/LAPI: Choose one, both, or none of these boxes to specify the MPI (Message Passing Interface) protocol, the LAPI (Low-level Application Programming Interface) protocol, both protocols, or neither protocol.

Optional.

Adapter/Network: Select an adapter name or a network type from the list.

Required for each protocol you select.

Adapter Usage: Specifies that the adapter is either shared or not shared.

Optional. The default is shared.

Communication Mode: Specifies the mode in which an SP switch adapter is used, and can be either IP (Internet Protocol) or US (User Space).

Optional. The default is IP.

Communication Level: Implies the amount of memory to be allocated to each window for the corresponding protocol, and can be Low, Average, or High.

SELECT
Close to return to the Build a Job dialog box.

SELECT
Requirements

The Requirements dialog box appears.

Complete those fields for which you want to specify requirements. Defaults are used for those fields that you leave blank. LoadLeveler dispatches your job only to a machine whose resources match the requirements you specify.
Field: Input
Architecture*: Machine type. The job will not run on any other machine type.

Optional. The default is the architecture of your current machine.

Operating System*: Operating system. The job will not run on any other operating system.

Optional. The default is the operating system of your current machine.

Disk: Amount of disk space in the execute directory. The job will only run on a machine with at least this much disk space.

Optional. The default is defined in your local configuration file.

Memory: Amount of memory. The job will only run on a machine with at least this much memory.

Optional. The default is defined in your local configuration file.

Machine(s): Machine name(s). The job will only run on the specified machines.

Optional.

Feature(s): Features. The job will only run on machines with specified features.

Optional.

LoadLeveler Version: Specifies the version of LoadLeveler, in dotted decimal format, on the machine where you want the job to run. For example: 2.1.0.0 specifies that your job will run on a machine running LoadLeveler Version 2.1.0.0 or higher.

Optional.

Pool: Specifies the number associated with the pool you want to use. All available pools listed in the administration file appear as choices. The default is to select nodes from any pool.

Requirement: Requirements. The job will only run if these requirements are met.

Note:

If you enter a resource that is not available, you will NOT receive a message. LoadLeveler holds your job in the Idle state until the resource becomes available. Therefore, ensure the spelling of your entry is correct. You can issue llq -s jobID to find out if you have a job for which requirements were not met.

*If you do not specify an architecture or operating system, LoadLeveler assumes that your job can run only on your machine's architecture and operating system. If your job is not a shell script that can be run successfully on any platform, you should specify a required architecture and operating system.

SELECT
Close to return to the Build a Job dialog box.

SELECT
Resources

The Resources dialog box appears.

This dialog box allows you to set the amount of defined consumable resources required for a job step. Resources with an "*" appended to their names are not in the SCHEDULE_BY_RESOURCES list. For more information, see resources.

SELECT
Close to return to the Build a Job dialog box.

SELECT
Preferences

The Preferences dialog box appears.

This dialog box is similar to the Requirements dialog box, with the exception of the Adapter choice, which is not supported as a Preference. Complete the fields for those parameters that you want to specify. These parameters are not binding. For any preferences that you specify, LoadLeveler attempts to find a machine that matches these preferences along with your requirements. If it cannot find the machine, LoadLeveler chooses the first machine that matches the requirements.
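
The way preferences interact with requirements can be illustrated with a short Python sketch. It models only the behavior described above (prefer a machine that satisfies both, otherwise take the first machine that satisfies the requirements). The function choose_machine, the dictionary layout, and the machine names are hypothetical and are not part of LoadLeveler.

```python
from typing import Any, Callable, Dict, List, Optional

Machine = Dict[str, Any]
Predicate = Callable[[Machine], bool]


def choose_machine(
    machines: List[Machine],
    requirements: Predicate,
    preferences: Optional[Predicate] = None,
) -> Optional[Machine]:
    """Pick a machine: requirements are binding, preferences are not."""
    # Requirements are binding: only matching machines are eligible.
    eligible = [m for m in machines if requirements(m)]
    if not eligible:
        return None  # no machine can run the job yet
    if preferences is not None:
        # Prefer the first eligible machine that also meets the preferences.
        for machine in eligible:
            if preferences(machine):
                return machine
    # No preferred machine found: fall back to the first eligible machine.
    return eligible[0]


# Hypothetical machine list used only for illustration.
machines = [
    {"name": "m1", "memory": 64},
    {"name": "m2", "memory": 256},
    {"name": "m3", "memory": 512},
]
```

In this sketch, a requirement of at least 128 units of memory with a preference for at least 512 selects m3, while an unsatisfiable preference falls back to m2.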

SELECT
Close to return to the Build a Job dialog box.

SELECT
Limits

The Limits dialog box appears.

Complete the fields for those limits that you want to impose upon your job. If you type copy in any field, the limits in effect on the submit machine are used. If you leave any field blank, the default limits in effect for your userid on the machine that runs the job are used.
Field: Input
CPU Limit: Maximum amount of CPU time that the submitted job can use. Express the amount as:
[[hours:]minutes:]seconds[.fraction]

For example, 12:56:21 is 12 hours, 56 minutes, and 21 seconds.

Optional.

Data Limit: Maximum amount of the data segment that the submitted job can use. Express the amount as:
integer[.fraction][units]

where integer and fraction represent strings of up to eight digits.

Optional.

Core Limit: Maximum size of a core file.

Optional.

RSS Limit: Maximum resident set size. It is the largest amount of physical memory a user's process can allocate.

Optional.

File Limit: Maximum size of a file that is created.

Optional.

Stack Limit: Maximum size of the stack.

Optional.

Job CPU Limit: Maximum amount of CPU time a single job step can use per processor.

Optional.

Wall Clock Limit: Maximum amount of elapsed time for which a job can run.

Optional.

SELECT
Close to return to the Build a Job dialog box.
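
To make the CPU Limit time format in the Limits dialog box concrete, the following Python sketch converts a value such as 12:56:21 into seconds. It assumes the format consists of optional hours and minutes followed by seconds with an optional fraction. The function name cpu_limit_to_seconds is hypothetical, and the sketch does not handle the special value copy or any other keyword.

```python
def cpu_limit_to_seconds(text: str) -> float:
    """Convert 'hours:minutes:seconds.fraction' style text to seconds.

    Illustration only: accepts one to three colon-separated fields,
    where the last field is seconds and may carry a fraction.
    """
    parts = [p.strip() for p in text.strip().split(":")]
    if not 1 <= len(parts) <= 3 or any(p == "" for p in parts):
        raise ValueError(f"unrecognized time limit: {text!r}")
    *leading, seconds = parts
    total = float(seconds)
    multiplier = 60  # minutes first, then hours
    for field in reversed(leading):
        total += int(field) * multiplier
        multiplier *= 60
    return total
```

With this reading, the example value 12:56:21 from the table corresponds to 46581 seconds.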

SELECT
PVM to select a PVM job.

The PVM dialog box appears.

Complete those fields for which you want to specify requirements. Defaults are used for those fields that you leave blank.
Field: Input
Min # of Processors: Minimum number of processors required for running the PVM job.

Optional. The default is one.

Max # of Processors: Maximum number of processors required for running the PVM job.

Optional. The default is one.

Parallel Path: The directory that defines where the PVM3 executables are located.

PVM: Specifies that an adapter is used for this PVM job.

Adapter/Network: Select an adapter name or a network type from the list.

Required.

Adapter Usage: Specifies that the adapter is either shared or not shared.

Optional. The default is shared.

SELECT
Close to return to the Build a Job dialog box.

Step 2: Edit the Job Command File

There are several ways that you can edit the job command file that you just built:

  1. Using the Jobs window:

    SELECT
    File > Submit a Job

    The Submit a Job dialog box appears.

    SELECT
    the job file you want to edit from the file column.

    SELECT
    Edit

    Your job command file appears in a window. You can use any editor to edit the job command file. The default editor is specified in your .Xdefaults file.

    If you have an icon manager, an icon may appear. An icon manager is a program that creates a graphic symbol, displayed on a screen, that you can point to with a device such as a mouse in order to select a particular function or application. Select this icon to view your job command file.

  2. Using the Edit and Tools pulldown menus on the Build a Job window:

    Using the Edit pulldown menu, you can modify the job command file. Your choices appear in the following table:
    To: Select
    Add a step to the job command file: Add a Step
    Delete a step from the job command file: Delete a Step
    Clear the fields in the Build a Job window: Clear Fields
    Select defaults to use in the fields: Set Field Defaults
    Note: Other options include Go to Next Step, Go to Previous Step, and Go to Last Step, which allow you to edit various steps in the job command file.

    Using the Tools pulldown menu, you can modify the job command file. Your choices appear in the following table:
    To: Select
    Name the job: Set Job Name
    Open a window where you can enter a script file: Append Script
    Fill in the fields using another file: Restore from File
    View the job command file in a window: View Entire Job
    Determine which step you are viewing: What is step #
    Start a new job command file: Start a new job


To save the information you entered into a file that you can submit later:

SELECT
Save

A window appears prompting you to enter a job filename.

ENTER
a job filename in the text entry field.

SELECT
OK

The window closes and the information you entered is saved in the file you specified.

To submit the program immediately and discard the information you entered:

SELECT
Submit

GO TO
Step 4

If you already submitted your job, go to Step 4: Display, Refresh and Obtain Job Status. Otherwise, go to Step 3: Submit a Job Command File.

Step 3: Submit a Job Command File

After building a job command file, you can submit it to one or more machines for processing. In addition to scripts with LoadLeveler keywords, you can also submit scripts that contain NQS options. You cannot, however, in this release of LoadLeveler, combine NQS and LoadLeveler options.

To submit a job, from the Jobs window:

SELECT
File > Submit a Job

The Submit a Job dialog box appears.

SELECT
the job file that you want to submit from the file column.

You can also use the filter field and the directories column to select the file or you can type in the file name in the text entry field.

SELECT
Submit

The job is submitted for processing.

You can now submit another job or you can press Close to exit the window.

Go to the next step.

Step 4: Display, Refresh and Obtain Job Status

When you submit a job, the status of the job is automatically displayed in the Jobs window. You can update or refresh this status using the Jobs window and selecting one of the following:

To change how often the Jobs window is automatically refreshed, use the Jobs window.

SELECT
Refresh > Set Auto Refresh

A window appears.

TYPE IN
a value for the number of seconds to pass before the Jobs window is updated.

Automatic refresh can be expensive in terms of network usage and CPU cycles. You should specify a refresh interval of 120 seconds or more for normal use.

SELECT
OK

The window closes and the value you specified takes effect.

To receive detailed information on a job:

SELECT
Actions > Extended Status to receive additional information on the job. Selecting this option is the same as typing the llq -x command.

You can also get information in the following way:

SELECT
Actions > Extended Details

Selecting this option is the same as typing the llq -x -l command. You can also double-click on the job in the Jobs window to get details on the job.

Note: Obtaining extended status or details on multiple jobs can be expensive in terms of network usage and CPU cycles.

SELECT
Actions > Job Status

You can also use the llq -s command to determine why a submitted job remains in the Idle or Deferred state.

For more information on these states, see llq - Query Job Status.
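
The correspondence between the Actions menu choices and the llq command can be captured in a small lookup, sketched here in Python. The sketch only builds the argument list and does not run anything. The function name llq_command is hypothetical, the pairing of Job Status with llq -s is inferred from the text above rather than stated outright, and placing a job ID after the -x flags is an assumption (the text shows a job ID only with llq -s).

```python
from typing import Dict, List

# Menu choice -> llq flags, as described in this step.
LLQ_FLAGS: Dict[str, List[str]] = {
    "Extended Status": ["-x"],
    "Extended Details": ["-x", "-l"],
    "Job Status": ["-s"],  # inferred pairing, see the lead-in
}


def llq_command(action: str, job_id: str = "") -> List[str]:
    """Build the llq argument list that matches a menu choice."""
    if action not in LLQ_FLAGS:
        raise KeyError(f"unknown action: {action!r}")
    argv = ["llq"] + LLQ_FLAGS[action]
    if job_id:
        argv.append(job_id)  # assumption: the job ID follows the flags
    return argv
```

On a machine where LoadLeveler is installed, the resulting list could be handed to a process launcher such as subprocess.run.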

Go to the next step.

Step 5: Sort the Jobs Window

You can specify up to two sorting options for the Jobs window. The options you specify determine the order in which the jobs appear in the Jobs window.

From the Jobs window:
Action: Select Sort > Type of Sort
Sort jobs by the machine from which they were submitted: Sort by Submitting Machine > [Primary|Secondary]
Sort by owner: Sort by Owner > [Primary|Secondary]
Sort by the time the jobs were submitted: Sort by Submission Time > [Primary|Secondary]
Sort by the state of the job: Sort by State > [Primary|Secondary]
Sort jobs by their user priority (last job listed runs first): Sort by Priority > [Primary|Secondary]
Sort by the class of the job: Sort by Class > [Primary|Secondary]
Sort by the group associated with the job: Sort by Group > [Primary|Secondary]
Sort by the machine running the job: Sort by Running Machine > [Primary|Secondary]
Sort by dispatch order: Sort by Dispatch Order > [Primary|Secondary]
Not specify a sort: No Sort [Primary|Secondary]

Each sorting option contains a cascading window which allows you to select this option as either a Primary or Secondary sorting option. For example, suppose you select Sort by Owner as the primary sorting option and Sort by Class as the secondary sorting option. The Jobs window is sorted by owner and, within each owner, by class.
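
The effect of combining a primary and a secondary sorting option can be sketched in Python as a sort on a two-part key. The job records, their field names, and the function sort_jobs are hypothetical and serve only to illustrate the ordering described above.

```python
from operator import itemgetter
from typing import Dict, List, Optional

# Hypothetical job records used only for illustration.
jobs: List[Dict[str, str]] = [
    {"id": "j1", "owner": "bob", "class": "small"},
    {"id": "j2", "owner": "alice", "class": "large"},
    {"id": "j3", "owner": "bob", "class": "large"},
    {"id": "j4", "owner": "alice", "class": "small"},
]


def sort_jobs(
    records: List[Dict[str, str]],
    primary: str,
    secondary: Optional[str] = None,
) -> List[Dict[str, str]]:
    """Order records by the primary key, then by the secondary key."""
    keys = [primary] if secondary is None else [primary, secondary]
    # With two keys, itemgetter returns a tuple, so ties on the
    # primary key are broken by the secondary key.
    return sorted(records, key=itemgetter(*keys))
```

Calling sort_jobs(jobs, "owner", "class") groups the records by owner and orders each owner's jobs by class, which corresponds to choosing Sort by Owner as primary and Sort by Class as secondary.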

Go to the next step.

Step 6: Change Priorities of Jobs in a Queue

If your job has not yet begun to run and is still in the queue, you can change the priority of the job in relation to your other jobs in the queue that belong to the same class. This only affects the user priority of the job. For more information on this priority, refer to Setting and Changing the Priority of a Job. Only the owner of a job or the LoadLeveler administrator can change the priority of a job.

From the Jobs window:

SELECT
a job by clicking on it with the mouse

SELECT
Actions > Priority

A window appears.

TYPE IN
a number between 0 and 100, inclusive, to indicate a new priority.

SELECT
OK

The window closes and the priority of your job changes.

Go to the next step.

Step 7: Hold a Job

Only the owner of a job or the LoadLeveler administrator can place a hold on a job.

From the Jobs window:

SELECT
the job you want to hold by clicking on it with the mouse

SELECT
Actions > Hold

The job is put on hold and its status changes in the Jobs window.

Go to the next step.

Step 8: Release a Hold on a Job

Only the owner of a job or the LoadLeveler administrator can release a hold on a job.

From the Jobs window:

SELECT
the job you want to release by clicking on it with the mouse

SELECT
Actions > Release from Hold

The job is released from hold and its status is updated in the Jobs window.

Go to the next step.

Step 9: Cancel a Job

Only the owner of a job or the LoadLeveler administrator can cancel a job.

From the Jobs window:

SELECT
the job you want to cancel by clicking on it with the mouse

SELECT
Actions > Cancel

A warning dialog box appears prompting you to confirm your cancellation request. Once you confirm your request, LoadLeveler cancels the job and the job information disappears from the Jobs window.

Go to the next step.

Step 10: Display and Refresh Machine Status

The status of the machines is automatically displayed in the Machines window. You can update or refresh this status using the Machines window and selecting one of the following:

To specify an amount of time to pass before the Machines window is automatically refreshed, from the Machines window:

SELECT
Refresh > Set Auto Refresh

A window appears.

TYPE IN
a value for the number of seconds to pass before the Machines window is updated.

Automatic refresh can be expensive in terms of network usage and CPU cycles. You should specify a refresh interval of 120 seconds or more for normal use.

SELECT
OK

The window closes and the value you specified takes effect.

Go to the next step.

Step 11: Sort the Machines Window

You can specify up to two sorting options for the Machines window. The options you specify determine the order in which machines appear in the window.

From the Machines window:
Action: Select Sort > Sort Type
Sort by machine name: Sort by Name > [Primary|Secondary]
Sort by schedd state: Sort by Schedd > [Primary|Secondary]
Sort by total number of jobs scheduled: Sort by InQ > [Primary|Secondary]
Sort by number of running jobs scheduled by this machine: Sort by Act > [Primary|Secondary]
Sort by startd state: Sort by Startd > [Primary|Secondary]
Sort by the number of jobs running on this machine: Sort by Run > [Primary|Secondary]
Sort by load average: Sort by LdAvg > [Primary|Secondary]
Sort by keyboard idle time: Sort by Idle > [Primary|Secondary]
Sort by hardware architecture: Sort by Arch > [Primary|Secondary]
Sort by operating system type: Sort by OpSys > [Primary|Secondary]
Not specify a sort: No Sort [Primary|Secondary]

Each sorting option contains a cascading window which allows you to select this option as either a Primary or Secondary sorting option. For example, suppose you select Sort by Arch as the primary sorting option and Sort by Name as the secondary sorting option. The Machines window is sorted by hardware architecture and, within each architecture type, by machine name.

Go to the next step.

Step 12: Find the Location of the Central Manager

The LoadLeveler administrator designates one of the nodes in the LoadLeveler cluster as the central manager. When jobs are submitted at any node, the central manager is notified and decides where to schedule the jobs. In addition, it keeps track of the status of machines in the cluster and the jobs in the system by communicating with each node. LoadLeveler uses this information to make the scheduling decisions and to respond to queries.

To find the location of the central manager, from the Machines window:

SELECT
Actions > Find Central Manager

A message appears in the message window declaring on which machine the central manager is located.

Go to the next step.

Step 13: Find the Location of the Public Scheduling Machines

Public scheduling machines are those machines that participate in the scheduling of LoadLeveler jobs on behalf of the submit-only machines.

To get a list of these machines in your cluster, use the Machines window:

SELECT
Actions > Find Public Scheduler

A message appears displaying the names of these machines.

Go to the next step.

Step 14: Specify Which Jobs Appear in the Jobs Window

Normally, only your jobs appear in the Jobs window. You can, however, specify which jobs you want to appear by using the Select pull-down menu on the Jobs window.
To Display: Select Select >
All jobs in the queue: All

All jobs belonging to a specific user (or users): By User

A window appears prompting you to enter the user IDs whose jobs you want to view.

All jobs submitted to a specific machine (or machines): By Machine

A window appears prompting you to enter the machine names on which the jobs you want to view are running.

All jobs belonging to a specific group (or groups): By Group

A window appears prompting you to enter the LoadLeveler group names to which the jobs you want to view belong.

All jobs having a particular ID: By Job Id

A dialog box prompts you to enter the ID of the job you want to appear. This ID appears in the left column of the Jobs window. Type in the ID and press OK.

Note:

When you choose By User, By Machine, or By Group, you can use a UNIX regular expression enclosed in parentheses. For example, you can enter (^k10) to display all machines beginning with the characters "k10".
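
The kind of filtering that a parenthesized regular expression performs can be approximated in Python. This is a sketch under stated assumptions: Python's re module is not identical to UNIX regular expressions, the function name filter_names and the machine names are hypothetical, and treating an entry without parentheses as an exact name is a guess about the window's behavior.

```python
import re
from typing import Iterable, List


def filter_names(names: Iterable[str], expression: str) -> List[str]:
    """Keep the names selected by a literal name or a (regex) entry."""
    if expression.startswith("(") and expression.endswith(")"):
        # The parentheses act as an ordinary group, so the whole
        # entry can be compiled as written.
        pattern = re.compile(expression)
        return [name for name in names if pattern.search(name)]
    # Assumption: anything else is matched as an exact name.
    return [name for name in names if name == expression]
```

For instance, applying (^k10) to the names k10n01, k10n02, k11n01, and ak10 keeps only k10n01 and k10n02, because the caret anchors the match to the start of the name.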

SELECT
Select > Show Selection to show the selection parameters.

Go to the next step.

Step 15: Specify Which Machines Appear in Machines Window

You can specify which machines will appear in the Machines window. The default is to view all of the machines in the LoadLeveler pool.

From the Machines window:
To: Select Select >
View all of the machines: All

View machines by operating system: by OpSys

A window appears prompting you to enter the operating system of those machines you want to view.

View machines by hardware architecture: by Arch

A window appears prompting you to enter the hardware architecture of those machines you want to view.

View machines by state: by State

A cascading pulldown menu appears prompting you to select the state of the machines that you want to view.

SELECT
Select > Show Selection to show the selection parameters.

Go to the next step.

Step 16: Save LoadLeveler Messages in a File

Normally, all the messages that LoadLeveler generates appear in the Messages window. If you would also like to have these messages written to a file, use the Messages window.

SELECT
Actions > Start logging to a file

A window appears prompting you to enter a filename in which to log the messages.

TYPE IN
the filename in the text entry field.

SELECT
OK

The window closes.

