
Chapter 4. Submitting and Managing Parallel Jobs

This chapter tells you how to submit and manage parallel jobs. For information on setting up and planning for parallel jobs, see Chapter 6, "Administration Tasks for Parallel Jobs".


Supported Parallel Environments

LoadLeveler allows you to schedule parallel batch jobs that have been written using the following:

- IBM Parallel Environment (POE) 2.4.0
- PVM 3.3 (non-SP)
- PVM 3.3.11+ (SP2MPI architecture)

Note that for parallel batch jobs, LoadLeveler no longer interacts with the PSSP Resource Manager, since all Resource Manager function has been incorporated into LoadLeveler. For more information, see "Resource Manager Functions Now in LoadLeveler".


Keyword Considerations for Parallel Jobs

Several LoadLeveler job command language keywords are associated with parallel jobs. Whether a keyword applies depends on the type of job and the type of LoadLeveler scheduler you are running.

Table 4 shows you the parallel keywords supported by the LoadLeveler Backfill scheduler, based on the type of job you are running.

Table 4. Parallel Keywords Supported by the Backfill Scheduler

  job_type=parallel                    job_type=pvm3
  ---------------------------------    -------------------
  network                              Adapter requirement
  node                                 max_processors
  node_usage                           min_processors
  tasks_per_node                       network
  total_tasks                          parallel_path
  All keywords supported for
  job_type=pvm3 (supported for
  compatibility reasons)

Table 5 shows you the parallel keywords supported by the default LoadLeveler scheduler, based on the type of job you are running.

Table 5. Parallel Keywords Supported by the Default Scheduler

  job_type=parallel        job_type=pvm3
  -------------------      -------------------
  max_processors           max_processors
  min_processors           min_processors
  Adapter requirement      parallel_path
                           Adapter requirement

These keywords are used in the examples in this chapter, and are described in more detail in "Job Command File Keywords".
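For example, a Backfill-scheduler job command file might combine several of these keywords as follows. This is a minimal sketch rather than an example from this book: the class name, program path, and task counts are placeholders, and total_tasks is normally paired with a single node value rather than a range.

# @ job_type = parallel
# @ node = 4
# @ total_tasks = 16
# @ network.MPI = switch,shared,US
# @ output = sketch.$(cluster).$(process).out
# @ error  = sketch.$(cluster).$(process).err
# @ class  = POE
# @ executable = /usr/bin/poe
# @ arguments = /u/userid/my_parallel_program
# @ queue

In this sketch LoadLeveler allocates exactly four nodes and divides the 16 tasks among them, which is equivalent to coding tasks_per_node = 4.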

If you disable the default LoadLeveler scheduler to run an external scheduler, see "Usage Notes" for an explanation of which keywords are supported.


Job Command File Examples

This section contains sample job command files for the following parallel environments:

- POE 2.4.0
- PVM 3.3 (non-SP)
- PVM 3.3.11+ (SP2MPI architecture)

POE 2.4.0

Figure 17 is a sample job command file for POE 2.4.0.

Figure 17. POE 2.4.0 Job Command File - Multiple Tasks Per Node

#
# @ job_type = parallel
# @ environment = COPY_ALL
# @ output = poe.out
# @ error = poe.error
# @ node = 8,10
# @ tasks_per_node = 2
# @ network.LAPI = switch,shared,US
# @ network.MPI = switch,shared,US
# @ wall_clock_limit = 60
# @ executable = /usr/bin/poe
# @ arguments = /u/richc/My_POE_program -euilib "us"
# @ class = POE
# @ queue

Figure 17 shows the following:

- The job runs on a minimum of 8 and a maximum of 10 nodes (node = 8,10), with two tasks on each node (tasks_per_node = 2), for a total of 16 to 20 tasks.
- Both the LAPI and MPI protocols run over the switch adapter in US (user space) mode, and the adapter is shared.
- /usr/bin/poe is the executable; the user's program and its POE options are passed using the arguments keyword.

Figure 18 is a second sample job command file for POE 2.4.0.

Figure 18. POE Sample Job Command File - Invoking POE Twice

#
# @ job_type = parallel
# @ input = poe.in.1
# @ output = poe.out.1
# @ error = poe.err
# @ node = 2,8
# @ network.MPI = switch,shared,IP
# @ wall_clock_limit = 60
# @ class = POE
# @ queue
/usr/bin/poe /u/richc/my_POE_setup_program -infolevel 2
/usr/bin/poe /u/richc/my_POE_main_program -infolevel 2

Figure 18 shows the following:

- The job runs on a minimum of 2 and a maximum of 8 nodes (node = 2,8); because neither tasks_per_node nor total_tasks is coded, one task runs on each node.
- The MPI protocol runs over the switch adapter in IP mode, and the adapter is shared.
- Because no executable keyword is coded, the job command file itself is the executable: it invokes POE twice, once for my_POE_setup_program and once for my_POE_main_program.

PVM 3.3 (Non-SP)

Figure 19 shows a sample job command file for PVM 3.3 (RS6K architecture). Before using PVM, users should contact their administrator to determine which PVM architecture has been installed.

Figure 19. Sample PVM 3.3 Job Command File

# @ executable    = my_PVM_program
# @ job_type      = pvm3
# @ parallel_path = /home/LL_userid/cmds/pvm3/$PVM_ARCH:$PVM_ROOT/lib/$PVM_ARCH
# @ class         = PVM3
# @ requirements  = (Pool == 4)
# @ output = my_PVM_program.$(cluster).$(process).out
# @ error  = my_PVM_program.$(cluster).$(process).err
# @ min_processors = 8
# @ max_processors = 10
# @ queue

Note the following requirements for PVM 3.3 (RS6K architecture) jobs:

- The job_type must be pvm3, and the parallel_path keyword must identify both the directory containing your PVM executables and the PVM library directory.
- LoadLeveler, not the user, starts the PVM daemon for the job and stops it when the job completes; do not start pvmd3 yourself.
- PVM allows only one daemon per user per machine, so each user can have only one PVM job running on a given machine at a time.

PVM 3.3.11+ (SP2MPI architecture)

Figure 20 shows a sample job command file for PVM 3.3.11+ (SP2MPI architecture). Before using PVM, users should contact their administrator to determine which PVM architecture has been installed. The SP2MPI architecture version should be used when users require that their jobs run in user space.

Figure 20. Sample PVM 3.3.11+ (SP2MPI Architecture) Job Command File

# @ job_type      = parallel
# @ class         = PVM3
# @ requirements  = (Adapter == "hps_us")
# @ output = my_PVM_program.$(cluster).$(process).out
# @ error  = my_PVM_program.$(cluster).$(process).err
# @ node = 3,3
# @ queue
 
# Set the PVM daemon and starter paths, as dictated by the LoadLeveler administrator
starter_path=/home/userid/loadl/pvm3/bin/SP2MPI
daemon_path=/home/userid/loadl/pvm3/lib/SP2MPI
 
# Export "MP_EUILIB" before starting PVM3 (default is "ip")
export MP_EUILIB=us
echo MP_EUILIB=$MP_EUILIB
 
# Clean up old PVM log and daemon files belonging to user
filelog=/tmp/pvml.`id | awk -F'=' '{print $2}' | awk -F'(' '{print $1}'`
filedaemon=/tmp/pvmd.`id | awk -F'=' '{print $2}' | awk -F'(' '{print $1}'`
rm -f $filelog > /dev/null
rm -f $filedaemon > /dev/null
 
# Start PVM daemon in background
$daemon_path/pvmd3 &
echo "pvm background pid=$!"
echo "Sleep 2 seconds"
sleep 2
echo "PVM daemon started"
 
# Start parallel executable
llnode_cnt=`echo "$LOADL_PROCESSOR_LIST" | awk '{print NF}'`
actual_cnt=`expr "$llnode_cnt" - 1`
$starter_path/starter -n $actual_cnt /home/userid/my_PVM_program
echo "Parallel executable starting"
 
# Check processes running and halt PVM daemon
echo "ps -a" | /home/userid/loadl/pvm3/lib/SP2MPI/pvm
echo "Halt PVM daemon"
echo "halt" | /home/userid/loadl/pvm3/lib/SP2MPI/pvm
wait
echo "PVM daemon completed"

Note the following requirements for PVM 3.3.11+ (SP2MPI architecture) jobs:

- The job_type is parallel, not pvm3: with the SP2MPI architecture, the job command file itself starts the PVM daemon, runs the starter, and halts the daemon, as Figure 20 shows.
- The daemon and starter paths are set by your LoadLeveler administrator; code starter_path and daemon_path to match your installation.
- Export MP_EUILIB=us before starting the PVM daemon if the job must run in user space; the default is ip.
- The sample script starts one fewer task than the number of entries in LOADL_PROCESSOR_LIST, leaving one allocated node for the PVM daemon.
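A job command file like the one in Figure 20 is submitted and monitored with the ordinary LoadLeveler commands; nothing PVM-specific is involved. The file name below is a placeholder:

llsubmit my_pvm_job.cmd
llq -u userid

llsubmit queues the job and reports the job identifier assigned to it; llq -u limits the status display to jobs belonging to the specified user.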

Sequence of Events in a PVM 3.3.11+ Job

This example demonstrates the sequence of events that occur when you submit the sample job command file shown in Figure 20.

Figure 21 illustrates this sequence of events.

Figure 21. Sequence of Events in a PVM 3.3.11+ Job


Obtaining Status of Parallel Jobs

Both end users and LoadLeveler administrators can obtain the status of parallel jobs in the same way they obtain the status of serial jobs: either by using the llq command or by viewing the Jobs window on the graphical user interface (GUI). By issuing llq -l, or by using the Job Details selection in the GUI, users can obtain the list of machines allocated to the parallel job. See "llq - Query Job Status" for sample output from an llq -l command issued to query a parallel job.

Also, administrators can create a class for parallel jobs. Users can check the status of their parallel jobs by specifying this class in the Class field on the Jobs window of the GUI.
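The same kind of class-based filtering is available from the command line through llq. In this sketch, PVM3 is the class name defined in the examples above, and the job ID shown is hypothetical:

# List all jobs in class PVM3
llq -c PVM3

# Show the long listing, including allocated host names, for one job
llq -l c163n01.42.0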

Obtaining Allocated Host Names

llq -l output includes information on allocated host names. Another way to obtain the allocated host names is with the LOADL_PROCESSOR_LIST environment variable, which you can use from a shell script in your job command file as shown in Figure 22.

This example uses LOADL_PROCESSOR_LIST to perform a remote copy of a local file to all of the nodes, and then invokes POE. Note that the processor list contains an entry for each task running on a node. If two tasks are running on a node, LOADL_PROCESSOR_LIST will contain two instances of the host name where the tasks are running. The example in Figure 22 removes any duplicate entries.

Note that LOADL_PROCESSOR_LIST is set by LoadLeveler, not by the user.

Figure 22. Using LOADL_PROCESSOR_LIST in a Shell Script

#!/bin/ksh
# @ output     =  my_POE_program.$(cluster).$(process).out
# @ error      =  my_POE_program.$(cluster).$(process).err
# @ class      =  POE
# @ job_type   =  parallel
# @ node = 8,12
# @ network.MPI = css0,shared,US
# @ queue
 
tmp_file="/tmp/node_list"
rm -f $tmp_file
 
# Copy each entry in the list to a new line in a file so
# that duplicate entries can be removed.
for node in $LOADL_PROCESSOR_LIST
        do
                echo $node >> $tmp_file
        done
 
# Sort the file removing duplicate entries and save list in variable
nodelist=`sort -u $tmp_file`
 
for node in $nodelist
        do
                rcp localfile $node:/home/userid
        done
 
rm -f $tmp_file
 
 
/usr/bin/poe /home/userid/my_POE_program
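
On systems with the standard tr and sort utilities, the loop-and-temporary-file approach in Figure 22 can be collapsed into a single pipeline. The following fragment is an alternative sketch, not the form used in Figure 22:

# Break the space-separated host list into lines, then drop duplicates
nodelist=$(echo $LOADL_PROCESSOR_LIST | tr ' ' '\n' | sort -u)

for node in $nodelist
do
        rcp localfile $node:/home/userid
done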

