
Chapter 11. LoadLeveler APIs

LoadLeveler provides several Application Programming Interfaces (APIs) that you can use. LoadLeveler's APIs are interfaces that allow application programs written by customers to interact with the LoadLeveler environment by using specific data or functions that are a part of LoadLeveler. These interfaces can be subroutines within a library or installation exits. This chapter also describes the configuration file keywords required to enable these APIs.

This chapter discusses the following:

The header file llapi.h defines all of the API data structures and subroutines. This file is located in the include subdirectory of the LoadLeveler release directory. You must include this file when you call any API subroutine.

The library libllapi.a is a shared library containing all of the LoadLeveler API subroutines. This library is located in the lib subdirectory of the LoadLeveler release directory.

Attention: These APIs are not thread-safe; they should not be linked to by a threaded application.


Accounting API

LoadLeveler provides two subroutines for accounting: one for account validation and one for extracting accounting data.

Account Validation Subroutine

LoadLeveler provides the llacctval executable to perform account validation.

Purpose

llacctval compares the account number a user specifies in a job command file with the account numbers defined for that user in the LoadLeveler administration file. If the account numbers match, llacctval returns a value of zero. Otherwise, it returns a non-zero value.

Syntax

program user_name user_group user_acct# acct1 acct2 ...

Parameters

program

Is the name of the program that performs the account validation. The default is llacctval. The name you specify here must match the value specified on the ACCT_VALIDATION keyword in the configuration file.

user_name

Is the name of the user whose account number you want to validate.

user_group

Is the login group name of the user.

user_acct#

Is the account number specified by the user in the job command file.

acct1 acct2 ...

Are the account numbers obtained from the user stanza in the LoadLeveler administration file.

Description

llacctval is invoked from within the llsubmit command. If the return code is non-zero, llsubmit does not submit the job.

You can replace llacctval with your own accounting user exit (see below).

To enable account validation, you must specify the following keyword in the configuration file:

  ACCT = A_VALIDATE

To use your own accounting exit, specify the following keyword in the configuration file:

  ACCT_VALIDATION = pathname

where pathname is the name of your accounting exit.

Return Values

If the validation succeeds, the exit status must be zero. If it does not succeed, the exit status must be a non-zero number.
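
The following is a minimal sketch of a replacement validation exit written in C. The file name myacctval.c and the messages are illustrative; the only fixed parts of the interface are the argument order shown under "Syntax" and the zero/non-zero exit status. Install the compiled program at the path named by the ACCT_VALIDATION keyword.

/* myacctval.c - illustrative replacement for llacctval.
 * Invoked as: program user_name user_group user_acct# acct1 acct2 ...
 * Exits 0 when user_acct# matches one of acct1 acct2 ..., non-zero otherwise. */
#include <stdio.h>
#include <string.h>

int main(int argc, char *argv[])
{
        int i;

        if (argc < 4) {
                fprintf(stderr, "usage: %s user_name user_group user_acct# [acct ...]\n",
                        argv[0]);
                return 1;
        }
        for (i = 4; i < argc; i++) {            /* accounts from the user stanza */
                if (strcmp(argv[3], argv[i]) == 0)
                        return 0;               /* account number is valid */
        }
        return 1;                               /* no match: llsubmit rejects the job */
}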

Report Generation Subroutine

LoadLeveler provides the GetHistory subroutine to generate accounting reports.

Purpose

GetHistory processes local or global LoadLeveler history files.

Library

LoadLeveler API library libllapi.a

Syntax

#include "llapi.h"
 
int GetHistory(char *filename, int (*func)(LL_job *), int version);

Parameters

filename

Specifies the name of the history file.

(*func)(LL_job *)

Specifies the user-supplied function you want to call to process each history record. The function must return an integer and must accept as input a pointer to the LL_job structure. The LL_job structure is defined in the llapi.h file.

version

Specifies the version of the history record you want to create. LL_JOB_VERSION in the llapi.h file creates an LL_job history record.

Description

GetHistory opens the history file you specify, reads one LL_job accounting record, and calls a user-supplied routine, passing to the routine the address of an LL_job structure. GetHistory processes all history records one at a time and then closes the file. Any user can call this subroutine.

The user-supplied function must include the following files:

#include <sys/resource.h>
#include <sys/types.h>
#include <sys/time.h>

The ll_event_usage structure is part of the LL_job structure and contains the following LoadLeveler defined data:

int event

Specifies the event identifier. This is an integer whose value is one of the following:
1
Represents a LoadLeveler-generated event.
2
Represents an installation-generated event.

char *name

Specifies a character string identifying the event. This can be one of the following:

Return Values

GetHistory returns a zero when successful.

Error Values

GetHistory returns -1 to indicate that the version is not supported or that an error occurred opening the history file.

Examples

Makefiles and examples which use this API are located in the samples/llphist subdirectory of the release directory. The examples include the executable llpjob, which invokes GetHistory to print every record in the history file. To compile llpjob, update the RELEASE_DIR field in the sample Makefile to reflect the current LoadLeveler release directory. The syntax for llpjob is:

  llpjob history_file

where history_file is a local or global history file.
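
A smaller, self-contained sketch of the calling sequence follows. The function name process_record is illustrative; the user-supplied function here only counts the records, although the individual LL_job fields (defined in llapi.h) could be examined inside it.

#include <sys/resource.h>
#include <sys/types.h>
#include <sys/time.h>
#include <stdio.h>
#include "llapi.h"

static int record_count = 0;

/* Called by GetHistory once for each LL_job accounting record. */
static int process_record(LL_job *job)
{
        record_count++;
        return 0;
}

int main(int argc, char *argv[])
{
        if (argc != 2) {
                fprintf(stderr, "usage: %s history_file\n", argv[0]);
                return 1;
        }
        if (GetHistory(argv[1], process_record, LL_JOB_VERSION) < 0) {
                fprintf(stderr, "GetHistory failed for %s\n", argv[1]);
                return 1;
        }
        printf("%d history record(s) processed\n", record_count);
        return 0;
}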


Serial Checkpointing API

This section describes ckpt, the subroutine used for user-initiated checkpointing of serial jobs. "Step 13: Enable Checkpointing" describes how to checkpoint your jobs in various ways, including system-initiated and user-initiated checkpointing. For information on checkpointing parallel jobs, see IBM Parallel Environment for AIX: Operation and Use, Volume 1.

ckpt Subroutine

Purpose

Specify the ckpt subroutine in a FORTRAN, C, or C++ program to activate user-initiated checkpointing. Whenever this subroutine is invoked, a checkpoint of the program is taken.

C++ Syntax

  extern "C"{void ckpt();}

C Syntax

  void ckpt();

FORTRAN Syntax

  call ckpt()
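
For example, a C program might take a checkpoint after each unit of work, as in the sketch below; the file name ckpt_demo.c and the loop are illustrative. The program must be compiled and linked to the checkpointing libraries as described under "Related Information".

/* ckpt_demo.c - illustrative user-initiated checkpointing. */
#include <stdio.h>

void ckpt();                    /* supplied by the checkpointing libraries */

int main(void)
{
        int i;

        for (i = 0; i < 10; i++) {
                /* ... one unit of the real computation ... */
                printf("completed work unit %d\n", i);
                ckpt();         /* a checkpoint of the program is taken here */
        }
        return 0;
}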

Related Information

FORTRAN, C, and C++ programs can be compiled with the crxlf, crxlc, and crxlC programs, respectively. These programs are found in the bin subdirectory of the LoadLeveler release directory. See "Ensure all User's Jobs are Linked to Checkpointing Libraries" for information on using these compile programs.


The Submit API

This API allows you to submit jobs to LoadLeveler. The submit API consists of the llsubmit subroutine, the llfree_job_info subroutine, and the monitor program.

llsubmit Subroutine

llsubmit is both the name of a LoadLeveler command used to submit jobs and the name of the subroutine described here.

Purpose

The llsubmit subroutine submits jobs to LoadLeveler for scheduling.

Syntax

  int llsubmit (char *job_cmd_file, char *monitor_program,
  char *monitor_arg, LL_job *job_info, int job_version);

Parameters

job_cmd_file

Is a pointer to a string containing the name of the job command file.

monitor_program

Is a pointer to a string containing the name of the monitor program to be invoked when the state of the job is changed. It is set to NULL if a monitoring program is not provided.

monitor_arg

Is a pointer to a string which is stored in the job object and is passed to the monitor program. The maximum length of the string is 1023 bytes. If the length exceeds this value, it is truncated to 1023 bytes. The string is set to NULL if an argument is not provided.

job_info

Is a pointer to a structure defined in the llapi.h header file. No fields are required to be filled in. Upon return, the structure will contain the number of job steps in the job command file and a pointer to an array of pointers to information about each job step. Space for the array and the job step information is allocated by llsubmit. The caller should free this space using the llfree_job_info subroutine.

job_version

Is an integer indicating the version of llsubmit being used. This argument should be set to LL_JOB_VERSION which is defined in the llapi.h include file.

Description

LoadLeveler must be installed and configured correctly on the machine on which the submit application is run.

The uid and gid in effect when llsubmit is invoked are the uid and gid used when the job is run.

Return Values

0
The job was submitted.

Error Values

-1
The job was not submitted. Error messages are written to stderr.

llfree_job_info Subroutine

Purpose

llfree_job_info frees space for the array and the job step information used by llsubmit.

Syntax

  void llfree_job_info(LL_job *job_info, int job_version);

Parameters

job_info

Is a pointer to a LL_job structure. Upon return, the space pointed to by the step_list variable and the space associated with the LL_job step structures pointed to by the step_list array are freed. All fields in the LL_job structure are set to zero.

job_version

Is an integer indicating the version of llfree_job_info being used. This argument should be set to LL_JOB_VERSION which is defined in the llapi.h header file.
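
The following is a minimal sketch of the submit-and-free sequence. The job command file name, the monitor program path, and the argument string are placeholders, not values required by the API.

#include <stdio.h>
#include "llapi.h"

int main(void)
{
        LL_job job_info;
        int rc;

        rc = llsubmit("myjob.cmd",              /* job command file (placeholder)  */
                      "/u/loadl/bin/monitor",   /* monitor program, or NULL        */
                      "sample argument",        /* passed to the monitor, or NULL  */
                      &job_info,
                      LL_JOB_VERSION);
        if (rc != 0) {
                fprintf(stderr, "llsubmit failed\n");
                return 1;
        }
        /* job_info now holds the step count and the step_list array described
           above; release them when they are no longer needed. */
        llfree_job_info(&job_info, LL_JOB_VERSION);
        return 0;
}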

The Monitor Program

Purpose

You can create a monitor program that monitors jobs submitted using the llsubmit subroutine. The schedd daemon invokes this monitor program if the monitor_program argument to llsubmit is not null. The monitor program is invoked each time a job step changes state. This means that the monitor program will be informed when the job step is started, completed, vacated, removed, or rejected.

Syntax

monitor_program job_id user_arg state exit_status

Parameters

monitor_program

Is the name of the program supplied in the monitor_program argument passed to the llsubmit function.

job_id

Is the full ID for the job step.

user_arg

Is the string supplied in the monitor_arg argument passed to the llsubmit function.

state

Is the current state of the job step. Possible values for the state are:

JOB_STARTED

The job step has started.

JOB_COMPLETED

The job step has completed.

JOB_VACATED

The job step has been vacated. The job step will be rescheduled if the job step is restartable or if it is checkpointable.

JOB_REJECTED

A startd daemon has rejected the job. The job will be rescheduled to another machine if possible.

JOB_REMOVED

The job step was cancelled or could not be started.

JOB_NOTRUN

The job step cannot be run because a dependency cannot be met.

exit_status

Is the exit status from the job step. The argument is meaningful only if the state is JOB_COMPLETED.
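
The following sketch is one possible monitor program; it appends a line to a log file each time the schedd reports a state change. The log file path is arbitrary, and the exit_status argument is meaningful only when the state is JOB_COMPLETED.

/* monitor.c - illustrative monitor program.
 * Invoked as: monitor_program job_id user_arg state exit_status */
#include <stdio.h>

int main(int argc, char *argv[])
{
        FILE *log;

        if (argc < 5)
                return 1;

        log = fopen("/tmp/monitor.log", "a");   /* log path is an example only */
        if (log == NULL)
                return 1;
        fprintf(log, "step %s (arg %s): state %s, exit status %s\n",
                argv[1], argv[2], argv[3], argv[4]);
        fclose(log);
        return 0;
}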

Data Access API

This API gives you access to LoadLeveler objects and allows you to retrieve specific data from those objects. You can use this API to query the negotiator daemon for information about its current set of jobs and machines. The Data Access API consists of the following subroutines: ll_query, ll_set_request, ll_reset_request, ll_get_objs, ll_get_data, ll_next_obj, ll_free_objs, and ll_deallocate.

Using the Data Access API

To use this API, you need to call the data access subroutines in the following order:

  1. Call ll_query to initialize the query object.
  2. Call ll_set_request to filter the objects you want to query.
  3. Call ll_get_objs to retrieve a list of objects from a LoadLeveler daemon.
  4. Call ll_get_data to retrieve specific data from an object.
  5. Call ll_next_obj to retrieve the next object in the list.
  6. Call ll_free_objs to free the list of objects you received.
  7. Call ll_deallocate to end the query.

To see code that uses these subroutines, refer to "Examples of Using the Data Access API". For more information on LoadLeveler objects, see "Understanding the LoadLeveler Job Object Model".

ll_query Subroutine

Purpose

The ll_query subroutine initializes the query object and defines the type of query you want to perform. The LL_element created and the corresponding data returned by this function is determined by the query_type you select.

Library

LoadLeveler API library libllapi.a

Syntax

#include "llapi.h"
 
LL_element * ll_query(enum QueryType query_type);

Parameters

query_type

Can be JOBS (to query job information) or MACHINES (to query machine information).

Description

query_type is the input field for this subroutine.

This subroutine is used in conjunction with other data access subroutines to query information about job and machine objects. You must call ll_query prior to using the other data access subroutines.

Return Values

This subroutine returns a pointer to an LL_element object. The pointer is used by subsequent data access subroutine calls.

Error Values

NULL
The subroutine was unable to create the appropriate pointer.

Related Information

Subroutines: ll_get_data, ll_set_request, ll_reset_request, ll_get_objs, ll_free_objs, ll_next_obj, ll_deallocate.

ll_set_request Subroutine

Purpose

The ll_set_request subroutine determines the data requested during a subsequent ll_get_objs call to query specific objects. You can filter your queries based on the query_type, object_filter, and data_filter you select.

Library

LoadLeveler API library libllapi.a

Syntax

#include "llapi.h"
 
int ll_set_request(LL_element *query_element,QueryFlags query_flags,
char **object_filter,DataFilter data_filter);

Parameters

query_element

Is a pointer to the LL_element returned by the ll_query subroutine.

query_flags

When query_type (in ll_query) is JOBS, query_flags can be the following:

QUERY_ALL

Query all jobs.

QUERY_JOBID

Query by job ID.

QUERY_STEPID

Query by step ID.

QUERY_USER

Query by user ID.

QUERY_GROUP

Query by LoadLeveler group.

QUERY_CLASS

Query by LoadLeveler class.

QUERY_HOSTS

Query by machine name.

When query_type (in ll_query) is MACHINES, query_flags can be the following:

QUERY_ALL

Query all machines.

QUERY_HOST

Query by machine names.

object_filter

Specifies search criteria. The value you specify for object_filter is related to the value you specify for query_flags:

data_filter

Filters the data returned from the object you query. The value you specify for data_filter is related to the value you specify for query_type:

Description

query_element, query_flags, object_filter, and data_filter are the input fields for this subroutine.

You can request a combination of object filters by calling ll_set_request more than once. When you do this, the query flags you specify are or-ed together. The following are valid combinations of object filters:

That is, to query jobs owned by certain users and running on specific machines, issue ll_set_request first with QUERY_USER and the appropriate user IDs, and then issue it again with QUERY_HOST and the appropriate host names.

For example, suppose you issue ll_set_request with a user ID list of anton and meg, and then issue it again with a host list of k10n10 and k10n11. The objects returned are all of the jobs on k10n10 and k10n11 which belong to anton or meg.

Note that if you use two consecutive calls with the same flag, the second call will replace the previous call.

Also, you should not use the QUERY_ALL flag in combination with any other flag, since QUERY_ALL will replace any existing requests.
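
In code, the combination described above looks like the following fragment, where queryObject is the element returned by ll_query (as in the examples later in this chapter). The filter arrays are terminated with NULL here as a precaution, and ALL_DATA requests the full data for each object.

char *user_list[] = { "anton", "meg", NULL };
char *host_list[] = { "k10n10", "k10n11", NULL };

rc = ll_set_request(queryObject, QUERY_USER, user_list, ALL_DATA);
rc = ll_set_request(queryObject, QUERY_HOST, host_list, ALL_DATA);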

Return Values

This subroutine returns a zero to indicate success.

Error Values

-1
You specified an invalid query_element.
-2
You specified an invalid query_flag.
-3
You specified an invalid object_filter.
-4
You specified an invalid data_filter.
-5
A system error occurred.

Related Information

Subroutines: ll_get_data, ll_query, ll_reset_request, ll_get_objs, ll_free_objs, ll_next_obj, ll_deallocate.

ll_reset_request Subroutine

Purpose

The ll_reset_request subroutine resets the request data to NULL for the query_element you specify.

Library

LoadLeveler API library libllapi.a

Syntax

#include "llapi.h"
 
int ll_reset_request(LL_element *query_element);

Parameters

query_element

Is a pointer to the LL_element returned by the ll_query function.

Description

query_element is the input field for this subroutine.

This subroutine is used in conjunction with ll_set_request to change the data requested with the ll_get_objs subroutine.

Return Values

This subroutine returns a zero to indicate success.

Error Values

-1
The subroutine was unable to reset the appropriate data.

Related Information

Subroutines: ll_get_data, ll_set_request, ll_query, ll_get_objs, ll_free_objs, ll_next_obj, ll_deallocate.

ll_get_objs Subroutine

Purpose

The ll_get_objs subroutine sends a query request to the daemon you specify along with the request data you specified in the ll_set_request subroutine. ll_get_objs receives a list of objects matching the request.

Library

LoadLeveler API library libllapi.a

Syntax

#include "llapi.h"
 
LL_element * ll_get_objs(LL_element *query_element,LL_Daemon query_daemon,
char *hostname,int *number_of_objs,int *error_code);

Parameters

query_element

Is a pointer to the LL_element returned by the ll_query function.

query_daemon

Specifies the LoadLeveler daemon you want to query. The following indicates which daemons respond to which query flags. When query_type (in ll_query) is JOBS, the query_flags (in ll_set_request) listed in the lefthand column are responded to by the daemons listed in the righthand column:

QUERY_ALL

negotiator and schedd

QUERY_JOBID

negotiator and schedd

QUERY_STEPID

negotiator

QUERY_USER

negotiator

QUERY_GROUP

negotiator

QUERY_CLASS

negotiator

QUERY_HOST

negotiator

When query_type (in ll_query) is MACHINES, the query_flags (in ll_set_request) listed in the lefthand column are responded to by the daemons listed in the righthand column:

QUERY_ALL

negotiator and schedd

QUERY_HOST

negotiator

hostname

Specifies the host name where the daemon is queried. If you specify NULL, the daemon on the local machine is queried. To contact the negotiator daemon, you do not need to specify a hostname.

number_of_objs

Is a pointer to an integer representing the number of objects received from the daemon.

error_code

Is a pointer to an integer representing the error code issued when the function returns a NULL value. See "Error Values".

Description

query_element, query_daemon, and hostname are the input fields for this subroutine. number_of_objs and error_code are output fields.

Each LoadLeveler daemon returns only the objects that it knows about.

Return Values

This subroutine returns a pointer to the first object in the list. You must use the ll_next_obj subroutine to access the next object in the list.

Error Values

This subroutine returns NULL to indicate failure. The error_code parameter is set to one of the following:

-1
You specified an invalid query_element.
-2
You specified an invalid query_daemon.
-3
The API could not resolve the hostname.
-4
You set an invalid request type for the specified daemon.
-5
A system error occurred.
-6
No objects exist matching your request.
-7
An internal error occurred.

Related Information

Subroutines: ll_get_data, ll_set_request, ll_query, ll_get_objs, ll_free_objs, ll_next_obj, ll_deallocate.

Understanding the LoadLeveler Job Object Model

The ll_get_data subroutine of the data access API allows you to access the LoadLeveler job model. The LoadLeveler job model consists of objects that have attributes and connections to other objects. An attribute is a characteristic of the object and generally has a primitive data type (such as integer, float, or character). The job name, submission time and job priority are examples of attributes.

Objects are connected to one or more other objects via relationships. An object can be connected to other objects through more than one relationship, or through the same relationship. For example, a Job object is connected to a Credential object and to Step objects through two different relationships. A Job object can be connected to more than one Step object through the same relationship of "having a Step." When an object is connected through different relationships, different specifications are used to retrieve the appropriate object.

When an object is connected to more than one object through the same relationship, there are Count, GetFirst and GetNext specifications associated with the relationship. The Count operation returns the number of connections. You must use the GetFirst operation to initialize access to the first such connected object. You must use the GetNext operation to get the remaining objects in succession. You cannot use GetNext after the last object has been retrieved.

You can use the ll_get_data subroutine to access both attributes and connected objects. See "ll_get_data Subroutine" for more information.

The root of the job model is the Job object, as shown in Figure 35. The job is queried for information about the number of steps it contains and the time it was submitted. The job is connected to a single Credential object and one or more Step objects. Elements for these objects can be obtained from the job.

You can query the Credential object to obtain the ID and group of the submitter of the job.

The Step object represents one executable unit of the job (all the tasks that are executed together). It contains information about the execution state of the step, messages generated during execution of the step, the number of nodes in the step, the number of unique machines the step is running on, the time the step was dispatched, the execution priority of the step, the unique identifier given to the step by LoadLeveler, the class of the step and the number of processes running for the step (task instances). The Step is connected to one or more Switch Table objects, one or more Machine objects and one or more Node objects. The list of Machines represents all of the hosts where one or more nodes of the step are running. If two or more nodes are running on the same host, the Machine object for the host occurs only once in the step's Machine list. The Step object is connected to one Switch Table object for each of the protocols (MPI and/or LAPI) used by the Step. Finally, the Step is connected to one or more Node objects.

Each Node object manages a set of executables that share common requirements and preferences. The Node can be queried for the number of tasks it manages, and is connected to one or more Task objects.

Figure 35. LoadLeveler Job Object Model


The Task object represents one or more copies of the same executable. The Task object can be queried for the executable, the executable arguments, and the number of instances of the executable.

Table 11 describes the specifications and elements available when you use the ll_get_data subroutine. Each specification name describes the object you need to specify and the attribute returned. For example, the specification LL_JobGetFirstStep includes the object you need to specify (LL_Job) and the value returned (GetFirstStep).

This table is sorted alphabetically by object; within each object the specifications are also sorted alphabetically.

Table 11. Specifications for ll_get_data Subroutine
Specification Object Resulting Data Type Description
LL_CredentialGid Credential int* A pointer to an integer containing the UNIX gid of the user submitting the job.
LL_CredentialGroupName Credential char* A pointer to a string containing the UNIX group name of the user submitting the job.
LL_CredentialUid Credential int* A pointer to an integer containing the UNIX uid of the person submitting the job.
LL_CredentialUserName Credential char* A pointer to a string containing the user ID of the user submitting the job.
LL_JobCredential Job LL_element* A pointer to the element associated with the credential of the job.
LL_JobGetFirstStep Job LL_element* A pointer to the element associated with the first step of the job, to be used in subsequent ll_get_data calls.
LL_JobGetNextStep Job LL_element* A pointer to the element associated with the next step.
LL_JobName Job char* A pointer to a character string containing the job name.
LL_JobStepCount Job int* A pointer to an integer indicating the number of steps connected to the job.
LL_JobStepType Job int* A pointer to an integer indicating the type of job, which can be INTERACTIVE_JOB or BATCH_JOB.
LL_JobSubmitHost Job char* A pointer to a character string containing the name of the host machine from which the job was submitted.
LL_JobSubmitTime Job time_t* A pointer to the time_t structure indicating when the job was submitted.
LL_MachineAdapterList Machine char** A pointer to an array containing the list of adapters associated with the machine. The array ends with a NULL string.
LL_MachineArchitecture Machine char* A pointer to a string containing the machine architecture.
LL_MachineAvailableClassList Machine char** A pointer to an array containing the currently available job classes defined on the machine. The array ends with a NULL string.
LL_MachineConfiguredClassList Machine char** A pointer to an array containing the initiators on the machine. The array ends with a NULL string.
LL_MachineCPUs Machine int* A pointer to an integer containing the number of CPUs on the machine.
LL_MachineDisk Machine int* A pointer to an integer indicating the disk space in KBs on the machine.
LL_MachineFeatureList Machine char** A pointer to an array containing the features defined on the machine. The array ends with a NULL string.
LL_MachineKbddIdle Machine int* A pointer to an integer indicating the number of seconds since the kbdd daemon detected keyboard or mouse activity.
LL_MachineLoadAverage Machine double* A pointer to a double containing the load average on the machine.
LL_MachineMaxTasks Machine int* A pointer to an integer indicating the maximum number of tasks this machine can run at one time.
LL_MachineMachineMode Machine char* A pointer to a string containing the configured machine mode.
LL_MachineName Machine char* A pointer to a string containing the machine name.
LL_MachineOperatingSystem Machine char* A pointer to a string containing the operating system on the machine.
LL_MachinePoolList Machine int** A pointer to an array indicating the pool numbers to which this machine belongs. The array ends with a NULL string.
LL_MachineRealMemory Machine int* A pointer to an integer indicating the physical memory on the machine.
LL_MachineSpeed Machine double* A pointer to a double containing the configured speed of the machine.
LL_MachineStartdRunningJobs Machine int* A pointer to an integer containing the number of running jobs known by the startd daemon.
LL_MachineStartdState Machine char* A pointer to a string containing the state of the startd daemon.
LL_MachineStepList Machine char** A pointer to an array containing the steps running on the machine. The array ends with a NULL string.
LL_MachineTimeStamp Machine time_t* A pointer to a time_t structure indicating the time the machine last reported to the negotiator.
LL_MachineVirtualMemory Machine int* A pointer to an integer indicating the virtual memory in KBs on the machine.
LL_NodeGetFirstTask Node LL_element* A pointer to the element associated with the first task for this node.
LL_NodeGetNextTask Node LL_element* A pointer to the element associated with the next task for this node.
LL_NodeMinInstances Node int* A pointer to an integer indicating the minimum number of machines requested.
LL_NodeMaxInstances Node int* A pointer to an integer indicating the maximum number of machines requested.
LL_NodeRequirements Node char* A pointer to a string containing the node requirements.
LL_NodeTaskCount Node int* A pointer to an integer indicating the number of tasks running on the node.
LL_StepAccountNumber Step char* A pointer to a string indicating the account number specified by the user submitting the job.
LL_StepAdapterUsage Step int* A pointer to an integer indicating the adapter usage specified by the user, which can be SHARED or NOT_SHARED.
LL_StepComment Step char* A pointer to a string indicating the comment specified by the user submitting the job.
LL_StepCompletionCode Step int* A pointer to an integer indicating the completion code of the step.
LL_StepCompletionDate Step time_t* A pointer to a time_t structure indicating the completion date of the step.
LL_StepCoreLimitHard Step int* A pointer to an integer indicating the core hard limit set by the user in the core_limit keyword.
LL_StepCoreLimitSoft Step int* A pointer to an integer indicating the core soft limit set by the user in the core_limit keyword.
LL_StepCpuLimitHard Step int* A pointer to an integer indicating the CPU hard limit set by the user in the cpu_limit keyword.
LL_StepCpuLimitSoft Step int* A pointer to an integer indicating the CPU soft limit set by the user in the cpu_limit keyword.
LL_StepCpuStepLimitHard Step int* A pointer to an integer indicating the CPU step hard limit set by the user in the job_cpu_limit keyword.
LL_StepCpuStepLimitSoft Step int* A pointer to an integer indicating the CPU step soft limit set by the user in the job_cpu_limit keyword.
LL_StepDataLimitHard Step int* A pointer to an integer indicating the data hard limit set by the user in the data_limit keyword.
LL_StepDataLimitSoft Step int* A pointer to an integer indicating the data soft limit set by the user in the data_limit keyword.
LL_StepDispatchTime Step time_t* A pointer to a time_t structure indicating the time the negotiator dispatched the job.
LL_StepEnvironment Step char* A pointer to a string containing the environment variables set by the user in the executable.
LL_StepErrorFile Step char* A pointer to a string containing the standard error file name used by the executable.
LL_StepExecSize Step int* A pointer to an integer indicating the executable size.
LL_StepFileLimitHard Step int* A pointer to an integer indicating the file hard limit set by the user in the file_limit keyword.
LL_StepFileLimitSoft Step int* A pointer to an integer indicating the file soft limit set by the user in the file_limit keyword.
LL_StepGetFirstMachine Step LL_element* A pointer to the element associated with the first machine in the step.
LL_StepGetFirstNode Step LL_element* A pointer to the element associated with the first node of the step.
LL_StepGetMasterTask Step LL_element* A pointer to the element associated with the master task of the step.
LL_StepGetNextMachine Step LL_element* A pointer to the element associated with the next machine of the step.
LL_StepGetNextNode Step LL_element* A pointer to the element associated with the next node of the step.
LL_StepID Step char* A pointer to a string containing the ID of the step.
LL_StepImageSize Step int* A pointer to an integer indicating the image size of the executable.
LL_StepInputFile Step char* A pointer to a string containing the standard input file name used by the executable.
LL_StepIwd Step char* A pointer to a string containing the initial working directory name used by the executable.
LL_StepJobClass Step char* A pointer to a string containing the class of the step.
LL_StepMachineCount Step int* A pointer to an integer indicating the number of machines assigned to the step.
LL_StepName Step char* A pointer to a string containing the name of the step.
LL_StepNodeCount Step int* A pointer to an integer indicating the number of node objects associated with the step.
LL_StepNodeUsage Step int* A pointer to an integer indicating the node usage specified by the user, which can be SHARED or NOT_SHARED.
LL_StepOutputFile Step char* A pointer to a character string containing the standard output file name used by the executable.
LL_StepPriority Step int* A pointer to an integer indicating the priority of the step.
LL_StepRssLimitHard Step int* A pointer to an integer indicating the RSS hard limit set by the user in the rss_limit keyword.
LL_StepRssLimitSoft Step int* A pointer to an integer indicating the RSS soft limit set by the user in the rss_limit keyword.
LL_StepShell Step char* A pointer to a character string containing the shell name used by the executable.
LL_StepStackLimitHard Step int* A pointer to an integer indicating the stack hard limit set by the user in the stack_limit keyword.
LL_StepStackLimitSoft Step int* A pointer to an integer indicating the stack soft limit set by the user in the stack_limit keyword.
LL_StepStartCount Step int* A pointer to an integer indicating the number of times the step has been started.
LL_StepStartDate Step time_t* A pointer to a time_t structure indicating the value the user specified in the startdate keyword.
LL_StepState Step int* A pointer to an integer indicating the state of the Step (Idle, Pending, Starting, etc.). The value returned is in the StepState enum.
LL_StepTaskInstanceCount Step int* A pointer to an integer indicating the number of task instances in the step.
LL_StepWallClockLimitHard Step int* A pointer to an integer indicating the wall clock hard limit set by the user in the wall_clock_limit keyword.
LL_StepWallClockLimitSoft Step int* A pointer to an integer indicating the wall clock soft limit set by the user in the wall_clock_limit keyword.
LL_TaskExecutable Task char* A pointer to a string containing the name of the executable.
LL_TaskExecutableArguments Task char* A pointer to a string containing the arguments passed by the user in the executable.
LL_TaskIsMaster Task int* A pointer to an integer indicating whether this is the master task.

ll_get_data Subroutine

Before you use this subroutine, make sure you are familiar with "Understanding the LoadLeveler Job Object Model".

Purpose

The ll_get_data subroutine returns data from a valid LL_element.

Library

LoadLeveler API library libllapi.a

Syntax

  #include "llapi.h"
 
  int ll_get_data(LL_element *element, enum LLAPI_Specification specification,
  void* resulting_data_type);

Parameters

object

Is a pointer to the LL_element returned by the ll_get_objs subroutine or by the ll_get_data subroutine. For example: Job, Machine, Step, etc.

specification

Specifies the data field within the data object you want to read.

resulting_data_type

Is a pointer to where you want the data stored.

Description

object and specification are input fields, while resulting_data_type is an output field.

The ll_get_data subroutine of the data access API allows you to access LoadLeveler objects. The parameters of ll_get_data are a LoadLeveler object (LL_element), a specification that indicates what information about the object is being requested, and a pointer to the area where the information being requested should be stored.

If the specification indicates an attribute of the element that is passed in, the result pointer must be the address of a variable of the appropriate type. The type returned by each specification is found in Table 11. If the specification queries the connection to another object, the returned value is of type LL_element. You can use a subsequent ll_get_data call to query information about the new object.

The data type char* and any arrays of type int or char must be freed by the caller.

LL_element pointers cannot be freed by the caller.
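
For example, assuming job is an element returned by ll_get_objs and rc is an int, the fragment below first follows a connection and then reads an attribute; only the attribute is freed.

LL_element *credential = NULL;
char *submitter = NULL;

rc = ll_get_data(job, LL_JobCredential, &credential);            /* connection: do not free */
rc = ll_get_data(credential, LL_CredentialUserName, &submitter); /* attribute: caller frees */
printf("submitted by %s\n", submitter);
free(submitter);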

Return Values

This subroutine returns a zero to indicate success.

Error Values

-1
You specified an invalid object.
-2
You specified an invalid LLAPI_Specification.

Related Information

Subroutines: ll_query, ll_set_request, ll_reset_request, ll_get_objs, ll_next_obj, ll_free_objs, ll_deallocate.

ll_next_obj Subroutine

Purpose

The ll_next_obj subroutine returns the next object in the query_element list you specify.

Library

LoadLeveler API library libllapi.a

Syntax

#include "llapi.h"
 
LL_element * ll_next_obj(LL_element *query_element);

Parameters

query_element

Is a pointer to the LL_element returned by the ll_query function.

Description

query_element is the input field for this subroutine.

Use this subroutine in conjunction with the ll_get_objs subroutine to "loop" through the list of objects queried.

Return Values

This subroutine returns a pointer to the next object in the list.

Error Values

NULL
Indicates an error or the end of the list of objects.

Related Information

Subroutines: ll_get_data, ll_set_request, ll_query, ll_get_objs, ll_free_objs, ll_deallocate.

ll_free_objs Subroutine

Purpose

The ll_free_objs subroutine frees all of the LL_element objects in the query_element list that were obtained by the ll_get_objs subroutine. You must free the query_element by using the ll_deallocate subroutine.

Library

LoadLeveler API library libllapi.a

Syntax

#include "llapi.h"
 
int ll_free_objs(LL_element *query_element);

Parameters

query_element

Is a pointer to the LL_element returned by the ll_query function.

Description

query_element is the input field for this subroutine.

Return Values

This subroutine returns a zero to indicate success.

Error Values

-1
You specified an invalid query_element.

Related Information

Subroutines: ll_get_data, ll_set_request, ll_query, ll_get_objs, ll_reset_request, ll_free_objs.

ll_deallocate Subroutine

Purpose

The ll_deallocate subroutine deallocates the query_element allocated by the ll_query subroutine.

Library

LoadLeveler API library libllapi.a

Syntax

#include "llapi.h"
 
int ll_deallocate(LL_element *query_element);

Parameters

query_element

Is a pointer to the LL_element returned by the ll_query function.

Description

query_element is the input field for this subroutine.

Return Values

This subroutine returns a zero to indicate success.

Error Values

-1
You specified an invalid query_element.

Related Information

Subroutines: ll_get_data, ll_set_request, ll_query, ll_get_objs, ll_reset_request, ll_next_obj, ll_free_objs.

Examples of Using the Data Access API

Example 1: The following example obtains a list of current job objects from the negotiator and then prints the step ID and the name of the first allocated host.

#include "llapi.h"
 
main(int argc,char *argv[])
{
  LL_element *queryObject=NULL, *job=NULL;
  int rc, num, err, state;
  LL_element *step=NULL, *machine = NULL;
  char *id=NULL, *name=NULL;
 
        /* Initialize the query for jobs */
        queryObject = ll_query(JOBS);
 
        /* I want to query all jobs */
        rc = ll_set_request(queryObject,QUERY_ALL,NULL,NULL);
 
        /* Request the objects from the Negotiator daemon */
        job = ll_get_objs(queryObject,LL_CM,NULL,&num,&err);
 
        /* Did we get a list of objects ? */
        if (job == NULL) {
                printf("  ll_get_objs returned a NULL object.\n");
                printf("  err = %d\n",err);
        }
        else {
                /* Loop through the list and process */
                printf(" RESULT: number of jobs in list = %d\n",num);
                while(job) {
                        rc = ll_get_data(job,LL_JobGetFirstStep, &step);
                        while (step) {
                            rc = ll_get_data(step,LL_StepID, &id);
                            rc = ll_get_data(step,LL_StepState,&state);
                            printf(" RESULT: step id: %s\n",id);
                            if (state == STATE_RUNNING) {
                               rc = ll_get_data(step,LL_StepGetFirstMachine, &machine);
                               rc = ll_get_data(machine,LL_MachineName, &name);
 
                               printf(" Running on 1st assigned host: %s.\n",name);
                               free(name);
                            }
                            else
                               printf("  Not Running.\n");
                            free(id);
                            rc=ll_get_data(job,LL_JobGetNextStep,&step);
                        }
                        job = ll_next_obj(queryObject);
                }
        }
 
        /* free objects obtained from Negotiator */
        rc = ll_free_objs(queryObject);
 
        /* free query element */
        rc = ll_deallocate(queryObject);
}

Example 2: The following example queries all jobs running under the class "small" from the host k10n04:

#include "llapi.h"
 
main(int argc,char *argv[])
{
  LL_element *queryObject=NULL, *jobObject=NULL;
  int rc, num, err;
  LL_element *step=NULL, *cred=NULL, *machine=NULL;
  char *class_list[1];
  char *host_list[1];
  char *id=NULL, *name=NULL;
 
        /* Initialize the query for jobs */
        queryObject = ll_query(JOBS);
 
        /* Query all jobs on host k10n04 submitted to class "small" */
        class_list[0] = (char *)malloc(10*sizeof(char *));
        strcpy(class_list[0],"small");
        rc = ll_set_request(queryObject,QUERY_CLASS,class_list,ALL_DATA);
        host_list[0] = (char *)malloc(10*sizeof(char *));
        strcpy(host_list[0],"k10n04");
        rc = ll_set_request(queryObject,QUERY_HOST,host_list,ALL_DATA);
 
        /* Request the objects from the Negotiator daemon */
        jobObject = ll_get_objs(queryObject,LL_CM,NULL,&num,&err);
 
        /* Did we get a list of objects ? */
        if (jobObject == NULL) {
                printf("  ll_get_objs returned a NULL object.\n");
                printf("  err = %d\n",err);
        }
        else {
                /* Loop through the list and process */
                while(jobObject) {
                        printf(" RESULT: number of jobs in list = %d\n",num);
                        if(ll_get_data(jobObject,LL_JobCredential, &cred)){
                                printf("Couldn't get credential object.\n");
                        }
                        else {
                                if(ll_get_data(cred,LL_CredentialUserName, &name)==0) {
                                        printf("The owner of this job is %s\n",name);
                                        free(name);
                                }
                                else {
                                        printf("Couldn't get user name.\n");
                                }
                        }
                        if (ll_get_data(jobObject,LL_JobGetFirstStep, &step)==0) {
                                while (step) {
                                        if(!ll_get_data(step,LL_StepID, &id)) {
                                                printf(" RESULT: step id: %s\n",id);
                                        }
                                ll_get_data(jobObject,LL_JobGetNextStep,&step);
                                }
                        }
                        else {
                                printf("No step associated with Job. Error !!\n");
                                exit(1);
                        }
                        jobObject = ll_next_obj(queryObject);
                        }
        }
        /* free objects obtained from Negotiator */
        rc = ll_free_objs(queryObject);
        /* free query element */
        rc = ll_deallocate(queryObject);
 }

Example 3: The following example queries information about the hosts k10n11 and k10n06:

#include "llapi.h"
 
main(int argc,char *argv[])
{
  LL_element *queryObject=NULL, *machine=NULL;
  int rc, num, err;
  char **host_list;
  char *state, *name;
 
        /* Initialize the query for machines */
        queryObject = ll_query(MACHINES);
 
        /* I want to query two specific hostnames */
        host_list = (char **)malloc(2*sizeof(char *));
        host_list[0]=strdup("k10n11");
        host_list[1]=strdup("k10n06");
        rc = ll_set_request(queryObject,QUERY_HOST,host_list,NULL);
 
        /* Request the objects from the Negotiator daemon */
        machine = ll_get_objs(queryObject,LL_CM,NULL,&num,&err);
 
        /* Did we get a list of objects ? */
        if (machine == NULL) {
                printf("  ll_get_objs returned a NULL object.\n");
                printf("  err = %d\n",err);
        }
        else {
                /* Loop through the list and process */
                printf(" RESULT: number of machines in list = %d\n",num);
                while(machine) {
                        rc = ll_get_data(machine,LL_MachineName,&name);
                        if (!rc) {
                                printf("machine name: %s\n",name);
                                free(name);
                        }
                        rc = ll_get_data(machine,LL_MachineStartdState,&state);
                        if (!rc) {
                                printf("startd state: %s\n",state);
                                free(state);
                        }
                        machine = ll_next_obj(queryObject);
                }
        }
 
        /* free objects obtained from Negotiator */
        rc = ll_free_objs(queryObject);
 
        /* free query element */
        rc = ll_deallocate(queryObject);
 }

Parallel Job API

If you are using any of the parallel operating environments already supported by LoadLeveler, you do not have to use the parallel API. However, if you have another application environment that you want to use, you need to use the subroutines described here to interface with LoadLeveler.

The parallel job API consists of two subroutines. ll_get_hostlist acquires the list of LoadLeveler selected parallel nodes, and ll_start_host starts the parallel task under the LoadLeveler starter.

The following section describes how parallel job submission works. Understanding this will help you to better understand the parallel API.

Interaction Between LoadLeveler and the Parallel API

This API does not give you access to any new LoadLeveler Version 2 Release 1.0 functions.

Program applications which use the parallel APIs to interface with LoadLeveler are supported under a job type called parallel. When a user submits a job specifying the keyword job_type equal to parallel, the LoadLeveler API job control flow is as follows:

The negotiator selects nodes based on the resources you request. Once the nodes have been obtained, the negotiator contacts the schedd to start the job. The schedd marks the job pending and contacts the affected startds to start their starter processes.

One machine becomes the Master Starter. The Master Starter is one of the selected parallel nodes. After all starters are started and have completed initialization, the Master Starter starts the executable specified in the job command file. This executable, referred to as the Parallel Master, uses this API to start tasks on remote nodes. The LOADLBATCH environment variable is set to YES so that the Parallel Master can distinguish between callers.

The Parallel Master must:

When the Parallel Master starts, the job is marked Running. Once the Parallel Master and all tasks exit, the job is marked Complete.

Termination Paths

The Parallel Master is expected to clean up and exit when:

A SIGKILL is issued to any process which does not exit within two minutes of receiving a termination signal.

ll_get_hostlist Subroutine

Purpose

This subroutine obtains a list of machines from the Master Starter machine so that the Parallel Master can start the Parallel Slaves. The Parallel Master is the LoadLeveler executable specified in the job command file and the Parallel Slaves are the processes started by the Parallel Master through the ll_start_host API.

Library

LoadLeveler API library libllapi.a

Syntax

  int ll_get_hostlist(struct JM_JOB_INFO* jobinfo);

Parameters

jobinfo is a pointer to the JM_JOB_INFO structure defined in llapi.h. No fields are required to be filled in. ll_get_hostlist allocates storage for an array of JM_NODE_INFO structures and returns the pointer in the jm_min_node_info pointer. It is the caller's responsibility to free this storage.

struct JM_JOB_INFO {
        int                  jm_request_type;
        char                 jm_job_description[50];
        enum JM_ADAPTER_TYPE jm_adapter_type;
        int                  jm_css_authentication;
        int                  jm_min_num_nodes;
        struct JM_NODE_INFO *jm_min_node_info;
};

struct JM_NODE_INFO {
        char jm_node_name[MAXHOSTNAMELEN];
        char jm_node_address[50];
        int  jm_switch_node_number;
        int  jm_pool_id;
        int  jm_cpu_usage;
        int  jm_adapter_usage;
        int  jm_num_virtual_tasks;
        int *jm_virtual_task_ids;
        enum JM_RETURN_CODE jm_return_code;
};

The following data is filled in for the JM_JOB_INFO structure:

jm_min_num_nodes

Is the number of elements in the array of JM_NODE_INFO structures. It is the number of hosts allocated to a job.

jm_min_node_info

Is the pointer to the array of JM_NODE_INFO structures. The first entry in this array describes the node which is mapped to task 0. The second entry is mapped to task 1, and so on.

The following data is filled in for each JM_NODE_INFO structure:

jm_node_name

Is the name of the node.

jm_node_address

Is the address corresponding to the adapter requested.

jm_switch_node_number

Is the relative node number, set only for jobs running on the SP switch adapter. For all other jobs it is set to -1.

Description

The Parallel Master must:

Return Values

This subroutine returns a zero to indicate success.

Error Values

-2
Cannot get LoadLeveler step ID from environment.

-5
Cannot make socket. This means that the UNIX stream socket could not be created. This socket is needed to establish communications with the starter for both of the API's functions.

-6
Cannot connect to host.

-8
Cannot get hostlist.

ll_start_host Subroutine

Purpose

This subroutine starts a task on a selected machine.

Library

LoadLeveler API library libllapi.a

Syntax

  int ll_start_host(char *host, char *start_cmd);

Parameters

host

Is the name of the node on which you want to start the task.

start_cmd

Is the actual command to execute on the node, including flags and arguments.

Description

This function must be invoked for all the machines returned from the ll_get_hostlist subroutine once and only once by the Parallel Master. Acquiring the start_cmd is the responsibility of the Parallel Master. The user may pass this information through the arguments or environment keywords in the job command file.

The Parallel Master must:

Return Values

This subroutine returns an integer greater than one to indicate the socket connected to the Parallel Slave's standard I/O (stdio).

Error Values

-2
Cannot get LoadLeveler step ID from environment

-4
Nameserver cannot resolve host

-6
Cannot connect to host

-7
Cannot send PASS_OPEN_SOCKET command to remote startd

-9
The command you specified failed.
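
The fragment below sketches the host-list and start calls made by a Parallel Master. The function name start_slaves is illustrative, error handling is minimal, and start_cmd is whatever command string the master chooses to pass (for example, taken from the arguments or environment keywords in the job command file). The complete responsibilities of the master are covered above and in the para_api.c sample.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include "llapi.h"

int start_slaves(char *start_cmd)
{
        struct JM_JOB_INFO jobinfo;
        int i, sock;

        memset(&jobinfo, 0, sizeof(jobinfo));           /* not required, but keeps the structure clean */
        if (ll_get_hostlist(&jobinfo) != 0)             /* fills in the node array */
                return -1;

        for (i = 0; i < jobinfo.jm_min_num_nodes; i++) {
                sock = ll_start_host(jobinfo.jm_min_node_info[i].jm_node_name,
                                     start_cmd);
                if (sock < 0) {
                        fprintf(stderr, "cannot start a task on %s\n",
                                jobinfo.jm_min_node_info[i].jm_node_name);
                        return -1;
                }
                /* sock is now connected to that slave's standard I/O */
        }
        free(jobinfo.jm_min_node_info);                 /* storage is owned by the caller */
        return 0;
}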

Examples

A sample program called para_api.c is provided in the samples/llpara subdirectory of the release directory, usually /usr/lpp/LoadL/full.

In order to run this example, you need to do the following:

  1. Copy the sample Makefile and the sample program called para_api.c to your home directory.

  2. Update the startCmd variable in para_api.c to reflect your home directory instead of /usr/lpp/LoadL/full/samples/llpara. For example:
    char *startCmd = "/home/user/para_api -s";
    

  3. Issue make to create the executable para_api.

  4. Update your job command file as follows:



    #!/bin/ksh
    # @ initialdir     = /home/user
    # @ executable     = para_api
    # @ output         = para_api.$(cluster).$(process).out
    # @ error          = para_api.$(cluster).$(process).err
    # @ job_type       = parallel
    # @ min_processors = 2
    # @ max_processors = 2
    # @ queue
    

  5. Submit the job command file to LoadLeveler.

    The syntax to invoke the Parallel Master is:

      para_api
    

    The syntax to invoke the Parallel Slave is:

      para_api -s
    

    The Parallel Master does the following:

    num_nodes=2
     
    name=host1.kgn.ibm.com address=9.115.8.162 switch_number=-1
     
    name=host2.kgn.ibm.com address=9.115.8.164 switch_number=-1
     
    Connected to host1.kgn.ibm.com at sock 3
    Received acko "8000" and acke "10000" from host 0
     
    Connected to host2.kgn.ibm.com at sock 4
    Received acko "8001" and acke "10001" from host 1
     
    <Master Exiting>
    

    The Parallel Slave does the following:


Job Control API

This API allows you to disable the default LoadLeveler scheduling algorithm and "plug in" an external scheduler. The job control API consists of two subroutines, ll_start_job and ll_terminate_job, and uses the SCHEDULER_API LoadLeveler configuration file keyword. This API is available to LoadLeveler administrators and to users.

To use the job control API, you must specify the following keyword in the global LoadLeveler configuration file:

SCHEDULER_API = YES

Specifying YES disables the default LoadLeveler scheduling algorithm. When you disable the default LoadLeveler scheduler, jobs do not start unless requested to do so by the job control API.

You can toggle between the default LoadLeveler scheduler and an external scheduler in the following ways. If you are running the default LoadLeveler scheduler, you can switch to an external scheduler by doing the following:

  1. In the configuration file, set SCHEDULER_API = YES
  2. On the central manager machine, issue the llctl command with the reconfig option

If you are running an external scheduler, you can re-enable the LoadLeveler scheduling algorithm by doing the following:

  1. In the configuration file, set SCHEDULER_API = NO
  2. On the central manager machine, issue the llctl command with the reconfig option

Note that the scheduling API automatically connects to an alternate central manager if the API cannot contact the primary central manager.

An example of an external scheduler you can use is the Extensible Argonne Scheduling sYstem (EASY), developed by Argonne National Laboratory and available as public domain code.

You should use this API in conjunction with the query API, which collects information regarding which machines are available and which jobs need to be scheduled. See "Query API" for more information.

ll_start_job Subroutine

Purpose

This subroutine tells the LoadLeveler negotiator to start a job on the specified nodes.

Library

LoadLeveler API library libllapi.a

Syntax

  #include "llapi.h"
 
  int ll_start_job(LL_start_job_info *ptr);

Parameters

ptr

Specifies the pointer to the LL_start_job_info structure that was allocated by the caller. The LL_start_job_info members are:

int version_num

Represents the version number of the LL_start_job_info structure. It should be set to LL_PROC_VERSION.

LL_STEP_ID StepId

Represents the step ID of the job step to be started.

char **nodeList

Is a pointer to an array of node names where the job will be started. The first member of the array is the parallel master node. The array must be ended with a NULL.

Description

You must set SCHEDULER_API = YES in the global configuration file to use this subroutine.

Only job steps currently in the Idle state are started.

Only processes having the LoadLeveler administrator user ID can invoke this subroutine.

An external scheduler uses this subroutine in conjunction with the ll_get_nodes and ll_get_jobs subroutines of the query API. The query API returns information about which machines are available for scheduling and which jobs are currently in the job queue waiting to be scheduled.

Return Values

This subroutine returns a value of zero to indicate the start job request was accepted by the negotiator. However, a return code of zero does not necessarily imply the job started. You can use the llq command to verify the job started. Otherwise, this subroutine returns an integer value defined in the llapi.h file.

Error Values

-1
There is an error in the input parameter.

-2
The subroutine cannot connect to the central manager.

-4
An error occurred reading parameters from the administration or the configuration file.

-5
The negotiator cannot find the specified StepId in the negotiator job queue.

-6
A data transmission failure occurred.

-7
The subroutine cannot authorize the action because you are not a LoadLeveler administrator.

-8
The job object version number is incorrect.

-9
The StepId is not in the Idle state.

-10
One of the nodes specified is not available to run the job.

-11
One of the nodes specified does not have an available initiator for the class of the job.

-12
For one of the nodes specified, the requirements statement does not satisfy the job requirements.

-13
The number of nodes specified was less than the minimum or more than the maximum requested by the job.

-14
The LoadLeveler default scheduler is enabled; that is, SCHEDULER_API = NO.

-15
The same node was specified twice in ll_start_job nodeList.

Examples

Makefiles and examples which use this subroutine are located in the samples/llsch subdirectory of the release directory. The examples include the executable sch_api, which invokes the query API and the job control API to start the second job in the list received from ll_get_jobs on two nodes. You should submit at least two jobs prior to running the sample. To compile sch_api, copy the sample to a writeable directory and update the RELEASE_DIR field to represent the current LoadLeveler release directory.
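
The fragment below sketches how an external scheduler might build the structure and start an idle step on two specific nodes. The function name start_step and the host names are placeholders, and step_id is assumed to have been obtained from the ll_get_jobs subroutine of the query API.

#include <stdio.h>
#include <string.h>
#include "llapi.h"

int start_step(LL_STEP_ID step_id)
{
        LL_start_job_info start_info;
        char *nodes[] = { "node01.kgn.ibm.com",         /* parallel master node  */
                          "node02.kgn.ibm.com",         /* placeholder host name */
                          NULL };                       /* list must end in NULL */
        int rc;

        memset(&start_info, 0, sizeof(start_info));
        start_info.version_num = LL_PROC_VERSION;
        start_info.StepId      = step_id;
        start_info.nodeList    = nodes;

        rc = ll_start_job(&start_info);
        if (rc != 0)
                fprintf(stderr, "ll_start_job returned %d\n", rc);
        return rc;
}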

Related Information

Subroutines: ll_get_jobs, ll_terminate_job, ll_get_nodes

ll_terminate_job Subroutine

Purpose

This subroutine tells the negotiator to cancel the specified job step.

Library

LoadLeveler API library libllapi.a

Syntax

  #include "llapi.h"
 
  int ll_terminate_job(LL_terminate_job_info *ptr);

Parameters

ptr

Specifies the pointer to the LL_terminate_job_info structure that was allocated by the caller. The LL_terminate_job_info members are:

int version_num

Represents the version number of the LL_terminate_job_info structure. It should be set to LL_PROC_VERSION.

LL_STEP_ID StepId

Represents the step ID of the job step to be terminated.

Description

You do not need to disable the default LoadLeveler scheduler in order to use this subroutine.

Only processes having the LoadLeveler administrator user ID can invoke this subroutine.

An external scheduler uses this subroutine in conjunction with the ll_get_jobs subroutine (of the query API) and the ll_start_job subroutine (of the job control API).

Return Values

This subroutine returns a value of zero when successful, to indicate the terminate job request was accepted by the negotiator. However, a return code of zero does not necessarily imply the negotiator cancelled the job. Use the llq command to verify the job was cancelled. Otherwise, this subroutine returns an integer value defined in the llapi.h file.

Error Values

-1
There is an error in the input parameter.

-4
An error occurred reading parameters from the administration or the configuration file.

-6
A data transmission failure occurred.

-7
The subroutine cannot authorize the action because you are not a LoadLeveler administrator or you are not the user who submitted the job.

-8
The job object version number is incorrect.

Examples

Makefiles and examples which use this subroutine are located in the samples/llsch subdirectory of the release directory. The examples include the executable sch_api, which invokes the query API and the job control API to terminate the first job reported by the ll_get_jobs subroutine. You should submit at least two jobs prior to running the sample. To compile sch_api, copy the sample to a writeable directory and update the RELEASE_DIR field to represent the current LoadLeveler release directory.
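
The following sketch shows the calling pattern, assuming the step ID has already been obtained (for example, from ll_get_jobs); consult llapi.h for the authoritative structure definition.

  #include "llapi.h"
 
  /* Ask the negotiator to cancel one job step. */
  static int cancel_step(LL_STEP_ID step)
  {
      LL_terminate_job_info info;
 
      info.version_num = LL_PROC_VERSION;
      info.StepId = step;
 
      /* Zero means the request was accepted; use the llq command to
         verify that the job step was actually cancelled. */
      return ll_terminate_job(&info);
  }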

Related Information

Subroutines: ll_get_jobs, ll_start_job, ll_get_nodes

Usage Notes

It is important to know how LoadLeveler keywords and commands behave when you disable the default LoadLeveler scheduling algorithm. LoadLeveler scheduling keywords and commands fall into the following categories:

The following sections discuss some specific keywords and commands and how they behave when you disable the default LoadLeveler scheduling algorithm.

Job Command File Keywords

class - This value is provided by the query APIs. Machines chosen by ll_start_job must have the class of the job available or the request will be rejected.
dependency - Supported as before. Job objects for which the dependency cannot be evaluated (because a previous step has not run) are maintained in the NotQueued state, and attempts to start them via ll_start_job result in an error. If the dependency is met, ll_start_job can start the job step.
hold - ll_start_job cannot start a job that is in Hold status.
min_processors - ll_start_job must specify at least this number of processors.
max_processors - ll_start_job must specify no more than this number of processors.
preferences - Passed to the query API.
requirements - ll_start_job returns an error if the machine(s) specified do not match the requirements of the job. This includes Disk and Virtual Memory requirements.
startdate - The job remains in the Deferred state until the startdate specified in the job is reached. ll_start_job cannot start a job in the Deferred state.
user_priority - Used in calculating the system priority (as described in "How Does a Job's Priority Affect Dispatching Order?"). The system priority assigned to the job is available through the query API. No other control of the order in which jobs are run is enforced.

Administration File Keywords

master_node_exclusive is ignored.
master_node_requirement is ignored.
maxidle is supported.
maxjobs is ignored.
maxqueued is supported.
max_jobs_scheduled is ignored.
priority is used to calculate the system priority (where appropriate).
speed is available through the query API.

Configuration File Keywords

MACHPRIO is calculated but is not used.
SYSPRIO is calculated and available to the query API.
MAX_STARTERS is calculated, and if starting the job causes this value to be exceeded, ll_start_job returns an error.
NEGOTIATOR_PARALLEL_DEFER is ignored.
NEGOTIATOR_PARALLEL_HOLD is ignored.
NEGOTIATOR_RESCAN_QUEUE is ignored.
NEGOTIATOR_RECALCULATE_SYSPRIO_INTERVAL works as before. Set this value to 0 if you do not want the SYSPRIOs of job objects recalculated.
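
For example, a configuration that turns scheduling over to an external program and stops the negotiator from recalculating SYSPRIO values might contain the following excerpt (shown only as an illustration; all other keywords keep their usual meanings):

  SCHEDULER_API = YES
  NEGOTIATOR_RECALCULATE_SYSPRIO_INTERVAL = 0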

Query API

This API provides information about the jobs and machines in the LoadLeveler cluster. You can use it in conjunction with the job control API, since the job control API requires you to know which machines are available and which jobs need to be scheduled. See "Job Control API" for more information.

The query API consists of the following subroutines: ll_get_jobs, ll_free_jobs, ll_get_nodes, and ll_free_nodes.

ll_get_jobs Subroutine

Purpose

This subroutine, available to any user, returns information about all jobs in the LoadLeveler job queue.

Library

LoadLeveler API library libllapi.a

Syntax

  #include "llapi.h"
 
  int ll_get_jobs(LL_get_jobs_info *ptr);

Parameters

ptr

Specifies the pointer to the LL_get_jobs_info structure that was allocated by the caller. The LL_get_jobs_info members are:

int version_num

Represents the version number of the LL_get_jobs_info structure. This should be set to LL_PROC_VERSION.

int numJobs

Represents the number of entries in the array.

LL_job **JobList

Represents the pointer to an array of LL_job structures. The LL_job structure is defined in llapi.h.

Description

The LL_get_jobs_info structure contains an array of LL_job structures indicating each job in the LoadLeveler system.

Some job information, such as the start time of the job, is not available to this API. (It is recommended that you use the dispatch time, which is available, in place of the start time.) Also, some accounting information is not available to this API.

Return Values

This subroutine returns a value of zero when successful. Otherwise, it returns an integer value defined in the llapi.h file.

Error Values

-1
There is an error in the input parameter.

-2
The API cannot connect to the central manager.

-3
The API cannot allocate memory.

-4
A configuration error occurred.

Examples

Makefiles and examples which use this subroutine are located in the samples/llsch subdirectory of the release directory.
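
The sketch below queries the job queue, reports the number of jobs, and releases the storage with ll_free_jobs; it uses only the structure members described above.

  #include <stdio.h>
  #include "llapi.h"
 
  int main(void)
  {
      LL_get_jobs_info jobs;
      int rc;
 
      jobs.version_num = LL_PROC_VERSION;
      rc = ll_get_jobs(&jobs);
      if (rc != 0) {
          fprintf(stderr, "ll_get_jobs failed, rc = %d\n", rc);
          return 1;
      }
 
      printf("%d job(s) in the LoadLeveler job queue\n", jobs.numJobs);
      /* jobs.JobList[0] .. jobs.JobList[jobs.numJobs - 1] point to LL_job
         structures (defined in llapi.h) describing the individual jobs. */
 
      ll_free_jobs(&jobs);   /* free the storage allocated by ll_get_jobs */
      return 0;
  }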

Related Information

Subroutines: ll_free_jobs, ll_free_nodes, ll_get_nodes

ll_free_jobs Subroutine

Purpose

This subroutine, available to any user, frees storage that was allocated by ll_get_jobs.

Library

LoadLeveler API library libllapi.a

Syntax

  #include "llapi.h"
 
  int ll_free_jobs(LL_get_jobs_info *ptr);

Parameters

ptr

Specifies the address of the LL_get_jobs_info structure to be freed.

Description

This subroutine frees the storage pointed to by the LL_get_jobs_info pointer.

Return Values

This subroutine returns a value of zero when successful. Otherwise, it returns an integer value defined in the llapi.h file.

Error Values

-8
The version_num member of the LL_get_jobs_info structure did not match the current version.

Examples

Makefiles and examples which use this subroutine are located in the samples/llsch subdirectory of the release directory.

Related Information

Subroutines: ll_get_jobs, ll_free_nodes, ll_get_nodes

ll_get_nodes Subroutine

Purpose

This subroutine, available to any user, returns information about all of the nodes known to the negotiator daemon.

Library

LoadLeveler API library libllapi.a

Syntax

  #include "llapi.h"
 
  int ll_get_nodes(LL_get_nodes_info *ptr);

Parameters

ptr

Specifies the pointer to the LL_get_nodes_info structure that was allocated by the caller. The LL_get_nodes_info members are:

int version_num

Represents the version number of the LL_get_nodes_info structure.

int numNodes

Represents the number of entries in the NodeList array.

LL_node **NodeList

Represents the pointer to an array of LL_node structures. The LL_node structure is defined in llapi.h.

Description

The LL_get_nodes_info structure contains an array of LL_node structures, one for each node in the LoadLeveler system.

Return Values

This subroutine returns a value of zero when successful. Otherwise, it returns an integer value defined in the llapi.h file.

Error Values

-1
There is an error in the input parameter.

-2
The API cannot connect to the central manager.

-3
The API cannot allocate memory.

-4
A configuration error occurred.

Examples

Makefiles and examples which use this subroutine are located in the samples/llsch subdirectory of the release directory.
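
A corresponding sketch for the node query follows the same pattern; it assumes version_num is set to LL_PROC_VERSION, as for the other query structures, and uses only the members described above.

  #include <stdio.h>
  #include "llapi.h"
 
  int main(void)
  {
      LL_get_nodes_info nodes;
      int rc;
 
      nodes.version_num = LL_PROC_VERSION;
      rc = ll_get_nodes(&nodes);
      if (rc != 0) {
          fprintf(stderr, "ll_get_nodes failed, rc = %d\n", rc);
          return 1;
      }
 
      printf("%d node(s) known to the negotiator\n", nodes.numNodes);
      /* nodes.NodeList[0] .. nodes.NodeList[nodes.numNodes - 1] point to
         LL_node structures (defined in llapi.h) describing each machine. */
 
      ll_free_nodes(&nodes);   /* free the storage allocated by ll_get_nodes */
      return 0;
  }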

Related Information

Subroutines: ll_free_jobs, ll_free_nodes, ll_get_jobs

ll_free_nodes Subroutine

Purpose

This subroutine, available to any user, frees storage that was allocated by ll_get_nodes.

Library

LoadLeveler API library libllapi.a

Syntax

  #include "llapi.h"
 
  int ll_free_nodes(LL_get_nodes_info *ptr);

Parameters

ptr

Specifies the address of the LL_get_nodes_info structure to be freed.

Description

This subroutine frees the storage pointed to by the LL_get_nodes_info pointer.

Return Values

This subroutine returns a value of zero when successful. Otherwise, it returns an integer value defined in the llapi.h file.

Error Values

-8
The version_num member of the LL_get_nodes_info structure did not match the current version.

Examples

Makefiles and examples which use this subroutine are located in the samples/llsch subdirectory of the release directory.

Related Information

Subroutines: ll_get_jobs, ll_free_jobs, ll_get_nodes


User Exits

This section discusses separate user exits for the following tasks:

Handling DCE security credentials
Handling an AFS token
Filtering a job script
Using your own mail program
Writing prolog and epilog programs

Handling DCE Security Credentials

You can write a pair of programs to override the default LoadLeveler DCE authentication method. To enable the programs, use the following keyword in your configuration file:

DCE_AUTHENTICATION_PAIR = program1, program2

where program1 and program2 are LoadLeveler-supplied or installation-supplied programs used to authenticate DCE security credentials. program1 obtains a handle (an opaque credentials object), at the time the job is submitted, which is used to authenticate to DCE. program2 uses the handle obtained by program1 to authenticate to DCE before the job is started on the executing machine(s).

An example of a credentials object is a character string containing the DCE principal name and a password. program1 writes the following to standard output:

If program1 encounters errors, it writes error messages to standard error.

program2 receives the following as standard input:

program2 writes the following to standard output:

If program2 encounters errors, it writes error messages to standard error. The parent process, the LoadLeveler starter process, writes those messages to the starter log.

Usage Notes

If you are using DCE on AIX 4.3, you need the proper DCE credentials for the existing authentication method in order to run a command or function that uses rshell (rsh). Otherwise, the rshell command may fail. You can use the lsauthent command to determine the authentication method. If lsauthent indicates that DCE authentication is in use, you must log in to DCE with the dce_login command to obtain the proper credentials.

LoadLeveler commands that run rshell include llctl version and llctl start.

For examples of programs that enable DCE security credentials, see the /samples/lldce subdirectory in the release directory.

Handling an AFS Token

You can write a program, run by the scheduler, to refresh an AFS token when a job is started. To invoke the program, use the following keyword in your configuration file:

AFS_GETNEWTOKEN = myprog

where myprog is a filter that receives the AFS authentication information on standard input and writes the new information to standard output. The filter is run when the job is scheduled to run and can be used to refresh a token which expired when the job was queued.

Before running the program, LoadLeveler sets up standard input and standard output as pipes between the program and LoadLeveler. LoadLeveler also sets up the following environment variables:

LOADL_STEP_OWNER
The owner (UNIX user name) of the job

LOADL_STEP_COMMAND
The name of the command the user's job step invokes.

LOADL_STEP_CLASS
The class this job step will run.

LOADL_STEP_ID
The step identifier, generated by LoadLeveler.

LOADL_JOB_CPU_LIMIT
The number of CPU seconds the job is limited to.

LOADL_WALL_LIMIT
The number of wall clock seconds the job is limited to.

LoadLeveler writes the following current AFS credentials, in order, over the standard input pipe:

The ktc_principal structure indicating the service.
The ktc_principal structure indicating the client.
The ktc_token structure containing the credentials.

The ktc_principal structure is defined in the AFS header file afs_rxkad.h. The ktc_token structure is defined in the AFS header file afs_auth.h.

LoadLeveler expects to read these same structures in the same order from the standard output pipe, except these should be refreshed credentials produced by the user exit.

The user exit can modify the passed credentials (to extend their lifetime) and pass them back, or it can obtain new credentials. LoadLeveler takes whatever is returned and uses it to authenticate the user prior to starting the user's job.
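
A minimal pass-through sketch of such a filter is shown below. It assumes the three structures are exchanged as raw binary images over the pipes, and it simply returns the credentials unchanged; a real exit would refresh or replace the token before writing it back. The include directives use the header names given above and may need adjusting for the local AFS installation.

  #include <stdio.h>
  #include <afs_rxkad.h>   /* struct ktc_principal */
  #include <afs_auth.h>    /* struct ktc_token */
 
  int main(void)
  {
      struct ktc_principal service, client;
      struct ktc_token token;
 
      /* Read the credentials LoadLeveler writes on standard input, in the
         documented order: service principal, client principal, token. */
      if (fread(&service, sizeof(service), 1, stdin) != 1 ||
          fread(&client,  sizeof(client),  1, stdin) != 1 ||
          fread(&token,   sizeof(token),   1, stdin) != 1)
          return 1;
 
      /* Refresh or replace the token here; this sketch passes it back as is. */
 
      if (fwrite(&service, sizeof(service), 1, stdout) != 1 ||
          fwrite(&client,  sizeof(client),  1, stdout) != 1 ||
          fwrite(&token,   sizeof(token),   1, stdout) != 1)
          return 1;
      return 0;
  }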

Filtering a Job Script

You can write a program to filter a job script when the job is submitted. This program can, for example, modify defaults or perform site specific verification of parameters. To invoke the program, specify the following keyword in your configuration file:

SUBMIT_FILTER = myprog

where myprog is called with the job file as the standard input. The standard output is submitted to LoadLeveler. If the program returns with a non-zero exit code, the job submission is cancelled.

The following environment variables are set when the program is invoked:

LOADL_ACTIVE
LoadLeveler version
LOADL_STEP_COMMAND
Job command file name
LOADL_STEP_ID
The job identifier, generated by LoadLeveler
LOADL_STEP_OWNER
The owner (UNIX user name) of the job
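
The sketch below illustrates the filter contract: the job command file arrives on standard input, whatever is written to standard output is what LoadLeveler actually submits, and a nonzero exit code cancels the submission. The check on LOADL_STEP_OWNER is only an example of a site-specific policy.

  #include <stdio.h>
  #include <stdlib.h>
 
  int main(void)
  {
      char line[4096];
      const char *owner = getenv("LOADL_STEP_OWNER");
 
      /* Example of a site-specific check: cancel the submission
         (nonzero exit code) if the owner is not set. */
      if (owner == NULL)
          return 1;
 
      /* Copy the job command file through unchanged; defaults could be
         modified here before the job is passed on to LoadLeveler. */
      while (fgets(line, sizeof(line), stdin) != NULL)
          fputs(line, stdout);
 
      return 0;
  }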

Using Your Own Mail Program

You can write a program to override the LoadLeveler default mail notification method. You can use this program to, for example, display your own messages to users when a job completes, or to automate tasks such as sending error messages to a network manager.

The syntax for the program is the same as that of standard UNIX mail programs: the command is called with a list of users as arguments, and the mail message is read from standard input. To use your own mail program, specify the following keyword in the configuration file:

MAIL = program

where program specifies the path name of a local program you want to use.
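
The sketch below shows only the calling convention: the recipients arrive as command-line arguments and the message text arrives on standard input. Appending the notification to a log file is an arbitrary choice for illustration; a real replacement might reformat the message or forward it elsewhere.

  #include <stdio.h>
  #include <time.h>
 
  int main(int argc, char *argv[])
  {
      FILE *log = fopen("/tmp/loadl_mail.log", "a");   /* illustrative path */
      char line[1024];
      time_t now = time(NULL);
      int i;
 
      if (log == NULL)
          return 1;
 
      fprintf(log, "--- LoadLeveler notification at %s", ctime(&now));
      for (i = 1; i < argc; i++)
          fprintf(log, "To: %s\n", argv[i]);           /* recipients */
      while (fgets(line, sizeof(line), stdin) != NULL)
          fputs(line, log);                            /* message body */
 
      fclose(log);
      return 0;
  }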

Writing Prolog and Epilog Programs

An administrator can write prolog and epilog user exits that can run before and after a LoadLeveler job runs, respectively.

Prolog and epilog programs fall into two categories: those that run as the LoadLeveler user ID, and those that run in a user's environment.

To specify prolog and epilog programs, specify the following keywords in the configuration file:

JOB_PROLOG = pathname

where pathname is the full path name of the prolog program. This program runs under the LoadLeveler user ID.

JOB_EPILOG = pathname

where pathname is the full path name of the epilog program. This program runs under the LoadLeveler user ID.

JOB_USER_PROLOG = pathname

where pathname is the full path name of the user prolog program. This program runs under the user's environment.

JOB_USER_EPILOG = pathname

where pathname is the full path name of the user epilog program. This program runs under the user's environment.

A user environment prolog or epilog runs with AFS and/or DCE authentication (if either is installed and enabled). For security reasons, you must install these programs on the machines where the job runs and on the machine that schedules the job. If you do not define a value for these keywords, the user environment prolog and epilog settings on the executing machine are ignored.

The user environment prolog and epilog can set environment variables for the job by sending information to standard output in the following format:

env id = value

Where:

id
Is the name of the environment variable

value
Is the value (setting) of the environment variable

For example, the user environment prolog below sets the environment variable STAGE_HOST for the job:

#!/bin/sh
echo env STAGE_HOST=shd22

Prolog Programs

The prolog program is invoked by the starter process. Once the starter process invokes the prolog program, the program obtains information about the job from environment variables.

Syntax
prolog_program

Where prolog_program is the name of the prolog program as defined in the JOB_PROLOG keyword.

No arguments are passed to the program but several environment variables are set. These environment variables are described in "Submitting a Job Command File".

The real and effective user ID of the prolog process is the LoadLeveler user ID. If the prolog program requires root authority, the administrator must write a secure C or Perl program to perform the desired actions. You should not use shell scripts with setuid permissions, since these scripts may make your system susceptible to security problems.

Return Code Values

0
The job will begin.

If the prolog program exits with a nonzero value or is killed, the job does not begin and a message is written to the starter log.

Sample Prolog Programs

Sample of a Prolog Program for Korn Shell

#!/bin/ksh
#
# Set up environment
set -a
. /etc/environment
. ~/.profile
export PATH="$PATH:/loctools/lladmin/bin"
export LOG="/tmp/$LOADL_STEP_OWNER.$LOADL_JOB_ID.prolog"
#
# Do set up based upon job step class
#
case $LOADL_STEP_CLASS in
    # An OSL job is about to run; make sure the osl filesystem is
    # mounted. If the mount script fails (nonzero status), the
    # filesystem cannot be mounted and the job step should not run.
    "OSL")
      mount_osl_files >> $LOG
      if [ $? -ne 0 ]
      then
        EXIT_CODE=1
      else
        EXIT_CODE=0
      fi
      ;;
    # A simulation job is about to run; the simulation data has to
    # be made available to the job. The status from the copy script
    # must be zero or the job step cannot run.
    "sim")
      copy_sim_data >> $LOG
      if [ $? -eq 0 ]
      then
        EXIT_CODE=0
      else
        EXIT_CODE=1
      fi
      ;;
    # All other jobs require free space in /tmp; make sure
    # enough space is available.
    *)
      check_tmp >> $LOG
      EXIT_CODE=$?
      ;;
esac
# The job step will run only if EXIT_CODE == 0
exit $EXIT_CODE

Sample of a Prolog Program for C Shell

#!/bin/csh
#
# Set up environment
source /u/loadl/.login
#
setenv PATH  "${PATH}:/loctools/lladmin/bin"
setenv LOG "/tmp/${LOADL_STEP_OWNER}.${LOADL_JOB_ID}.prolog"
#
# Do set up based upon job step class
#
switch ($LOADL_STEP_CLASS)
    # An OSL job is about to run; make sure the osl filesystem is
    # mounted. If the mount script fails (nonzero status), the
    # filesystem cannot be mounted and the job step should not run.
    case "OSL":
      mount_osl_files >> $LOG
      if ($status != 0) then
        set EXIT_CODE = 1
      else
        set EXIT_CODE = 0
      endif
      breaksw
    # A simulation job is about to run; the simulation data has to
    # be made available to the job. The status from the copy script
    # must be zero or the job step cannot run.
    case "sim":
      copy_sim_data >> $LOG
      if ($status == 0) then
        set EXIT_CODE = 0
      else
        set EXIT_CODE = 1
      endif
      breaksw
    # All other jobs require free space in /tmp; make sure
    # enough space is available.
    default:
      check_tmp >> $LOG
      set EXIT_CODE = $status
      breaksw
endsw
 
# The job step will run only if EXIT_CODE == 0
exit $EXIT_CODE

Epilog Programs

The installation defined epilog program is invoked after a job step has completed. The purpose of the epilog program is to perform any required clean up such as unmounting file systems, removing files, and copying results. The exit status of both the prolog program and the job step is set in environment variables.

Syntax
epilog_program

Where epilog_program is the name of the epilog program as defined in the JOB_EPILOG keyword.

No arguments are passed to the program but several environment variables are set. These environment variables are described in "Submitting a Job Command File".

Note

To interpret the exit status of the prolog program and the job step, convert the string to an integer and use the structures found in the sys/wait.h file.
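
For example, a small C program (or the equivalent calls in an epilog helper) could decode the status stored in LOADL_JOB_STEP_EXIT_CODE as follows; the same approach applies to LOADL_PROLOG_EXIT_CODE.

  #include <stdio.h>
  #include <stdlib.h>
  #include <sys/wait.h>
 
  int main(void)
  {
      const char *s = getenv("LOADL_JOB_STEP_EXIT_CODE");
      int status;
 
      if (s == NULL) {
          printf("Job step did not run\n");
          return 0;
      }
 
      status = atoi(s);   /* convert the string to the integer wait status */
      if (WIFEXITED(status))
          printf("Job step exited with code %d\n", WEXITSTATUS(status));
      else if (WIFSIGNALED(status))
          printf("Job step ended by signal %d\n", WTERMSIG(status));
      return 0;
  }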

Sample Epilog Programs

Sample of an Epilog Program for Korn Shell

#!/bin/ksh
#
# Set up environment
set -a
. /etc/environment
. ~/.profile
export PATH="$PATH:/loctools/lladmin/bin"
export LOG="/tmp/$LOADL_STEP_OWNER.$LOADL_JOB_ID.epilog"
#
if [[ -z $LOADL_PROLOG_EXIT_CODE ]]
  then
   echo "Prolog did not run" >> $LOG
  else
   echo "Prolog exit code = $LOADL_PROLOG_EXIT_CODE" >> $LOG
fi
#
if [[ -z $LOADL_USER_PROLOG_EXIT_CODE ]]
  then
   echo "User environment prolog did not run" >> $LOG
  else
   echo "User environment exit code = $LOADL_USER_PROLOG_EXIT_CODE" >> $LOG
fi
#
if [[ -z $LOADL_JOB_STEP_EXIT_CODE ]]
  then
   echo "Job step did not run" >> $LOG
  else
   echo "Job step exit code = $LOADL_JOB_STEP_EXIT_CODE" >> $LOG
fi
#
#
# Do clean up based upon job step class
#
case $LOADL_STEP_CLASS in
  # An OSL job just ran; unmount the filesystem.
  "OSL")
    umount_osl_files >> $LOG
    ;;
  # A simulation job just ran; remove the input files.
  # Copy the results if the simulation was successful (the job step
  # exit status is available in LOADL_JOB_STEP_EXIT_CODE).
  "sim")
    rm_sim_data >> $LOG
    if [ "$LOADL_JOB_STEP_EXIT_CODE" = 0 ]
      then copy_sim_results >> $LOG
    fi
    ;;
  # Clean up /tmp
  *)
    clean_tmp >> $LOG
    ;;
esac

Sample of an Epilog Program for C Shell

#!/bin/csh
#
# Set up environment
source /u/loadl/.login
#
setenv PATH  "${PATH}:/loctools/lladmin/bin"
setenv LOG "/tmp/${LOADL_STEP_OWNER}.${LOADL_JOB_ID}.prolog"
#
if ( ${?LOADL_PROLOG_EXIT_CODE} ) then
echo "Prolog exit code = $LOADL_PROLOG_EXIT_CODE" >> $LOG
else
echo "Prolog did not run" >> $LOG
endif
#
if ( ${?LOADL_USER_PROLOG_EXIT_CODE} ) then
    echo "User environment exit code = $LOADL_USER_PROLOG_EXIT_CODE" >> $LOG
  else
    echo "User environment prolog did not run" >> $LOG
endif
#
if ( ${?LOADL_JOB_STEP_EXIT_CODE} ) then
    echo "Job step exit code = $LOADL_JOB_STEP_EXIT_CODE" >> $LOG
  else
    echo "Job step did not run" >> $LOG
endif
#
# Do clean up based upon job step class
#
switch ($LOADL_STEP_CLASS)
  # An OSL job just ran; unmount the filesystem.
  case "OSL":
    umount_osl_files >> $LOG
    breaksw
  # A simulation job just ran; remove the input files.
  # Copy the results if the simulation was successful (the job step
  # exit status is available in LOADL_JOB_STEP_EXIT_CODE).
  case "sim":
    rm_sim_data >> $LOG
    if ( ${?LOADL_JOB_STEP_EXIT_CODE} ) then
      if ("$LOADL_JOB_STEP_EXIT_CODE" == 0) then
        copy_sim_results >> $LOG
      endif
    endif
    breaksw
  # Clean up /tmp
  default:
    clean_tmp >> $LOG
    breaksw
endsw


[ Top of Page | Previous Page | Next Page | Table of Contents | Index ]