This section contains tips on running LoadLeveler, including some productivity aids.
If you are running LoadLeveler on a large number of nodes (128 or more), network traffic between LoadLeveler daemons can become excessive to the point of overwhelming a receiving daemon. To reduce network traffic, consider the following daemon, keyword, and command recommendations for large installations.
By reading the notification mail you receive after submitting a job, you can determine the time the job was submitted, started, and stopped. Suppose you submit a job and receive the following mail when the job finishes:
Submitted at: Sun Apr 30 11:40:41 1996
Started at:   Sun Apr 30 11:45:00 1996
Exited at:    Sun Apr 30 12:49:10 1996
Real Time:            0 01:08:29
Job Step User Time:   0 00:30:15
Job Step System Time: 0 00:12:55
Total Job Step Time:  0 00:43:10
Starter User Time:    0 00:00:00
Starter System Time:  0 00:00:00
Total Starter Time:   0 00:00:00
This mail tells you when the job was submitted, when it started, and when it exited, along with the resources it used: Real Time is the elapsed wall clock time (days, then hours:minutes:seconds), the Job Step times are the user and system CPU time consumed by the job step itself, and the Starter times are the user and system CPU time consumed by the LoadLeveler starter process on behalf of the job.
You can also get the starting time by issuing llsummary -l -x and then issuing awk '/Date|Event/' against the resulting output. For this to work, you must have ACCT = A_ON A_DETAIL set in the LoadL_config file.
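For example, a minimal sketch of this sequence (the file name joblist is arbitrary):

   llsummary -l -x > joblist       # write the detailed accounting report to a file
   awk '/Date|Event/' joblist      # print only the date and event lines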
Using a machine's local configuration file, you can set up the machine to run jobs at a certain time of day (sometimes called an execution window). The following coding in the local configuration file runs jobs between 5:00 PM and 8:00 AM daily, and suspends jobs the rest of the day:
START:    (tm_day >= 1700) || (tm_day <= 0800)
SUSPEND:  (tm_day > 0800) && (tm_day < 1700)
CONTINUE: (tm_day >= 1700) || (tm_day <= 0800)
Three keywords determine the mix of idle and running jobs for a user. By a running job, we mean a job that is in one of the following states: Running, Pending, or Starting. These keywords, which are described in detail in Step 2: Specify User Stanzas, are:

maxqueued
maxidle
maxjobs
For a user's job to be allowed into the job queue, the total of that user's other jobs in the Idle, Pending, Starting, and Running states must be less than the maxqueued value for that user, and the total of that user's idle jobs (those in the Idle, Pending, and Starting states) must be less than the maxidle value. If either limit has already been reached, the new job is placed in the Not Queued state until one of the other jobs changes state: if the user is at the maxqueued limit, a job must complete, be cancelled, or be held before the new job can enter the queue; if the user is at the maxidle limit, a job must start running, be cancelled, or be held before the new job can enter the queue. For example, if maxqueued is 5 and a user already has three Idle jobs and two Running jobs, that user's next job is placed in the Not Queued state until one of those five jobs leaves the queue.
Once a job is in the queue, it is not taken out of the queue unless the user places a hold on the job, the job completes, or the job is cancelled. (An exception to this, when you are running the default LoadLeveler scheduler, is a parallel job that does not accumulate sufficient machines in a given time period. Such a job is moved to the Deferred state, meaning it must vie for the queue again when its Deferred period expires.)
Once a job is in the queue, the job will run unless the user is already running the maximum number of jobs allowed by the maxjobs limit.
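As a sketch, these limits are set in user stanzas in the LoadL_admin file (the user name userA and the values shown are hypothetical):

   default: type = user
            maxqueued = 10    # at most 10 jobs in the queue (Idle, Pending, Starting, or Running)
            maxidle = 5       # at most 5 idle jobs (Idle, Pending, or Starting)
            maxjobs = 2       # at most 2 running jobs

   userA:   type = user
            maxjobs = 4       # override the default running-job limit for this user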
Note the following restrictions for using these keywords:
You can use dependencies in your job command file to send the output from many job steps to the same output file. For example:
# @ step_name = step1
# @ executable = ssba.job
# @ output = ssba.tmp
# @ ...
# @ queue
#
# @ step_name = append1
# @ dependency = (step1 != CC_REMOVED)
# @ executable = append.ksh
# @ output = /dev/null
# @ queue
# @
# @ step_name = step2
# @ dependency = (append1 == 0)
# @ executable = ssba.job
# @ output = ssba.tmp
# @ ...
# @ queue
# @
# @ step_name = append2
# @ dependency = (step2 != CC_REMOVED)
# @ executable = append.ksh
# @ output = /dev/null
# @ queue
#
# ...
Then, the file append.ksh could contain the line cat ssba.tmp >> ssba.log. All your output will reside in ssba.log. (Your dependencies can look for different return values, depending on what you need to accomplish.)
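For example, a minimal sketch of append.ksh (assuming ssba.tmp and ssba.log reside in the job's working directory):

   #!/bin/ksh
   # Append this step's output to the cumulative log
   cat ssba.tmp >> ssba.log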
You can achieve the same result from within ssba.job by appending your output to an output file rather than writing it to stdout. Then your output statement for each step would be /dev/null and you wouldn't need the append steps.
You can define a machine to have multiple job classes which are active at different times. For example, suppose you want a machine to run jobs of Class A any time, and you want the same machine to run Class B jobs between 6 p.m. and 8 a.m.
You can combine the Class keyword with a user-defined macro (called Off_Shift in this example).
For example:
Off_Shift = ((tm_hour >= 18) || (tm_hour < 8))
Then define your START statement:
START : (Class == "A") || ((Class == "B") && $(Off_Shift))
Make sure you have the parentheses around the Off_Shift macro, since the logical OR has lower precedence than the logical AND in the START statement.
Also, to take weekends into account, code the following statements. Remember that Saturday is day 6 and Sunday is day 0.
Off_Shift = ((tm_wday == 6) || (tm_wday == 0) || (tm_hour >= 18) \
            || (tm_hour < 8))
Prime_Shift = ((tm_wday != 6) && (tm_wday != 0) && (tm_hour >= 8) \
              && (tm_hour < 18))
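As a sketch, if you instead wanted Class A jobs to run only during the prime shift and Class B jobs only during the off shift (a variation on the example above), the START statement could use both macros:

   START : ((Class == "A") && $(Prime_Shift)) || ((Class == "B") && $(Off_Shift))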
You can use the /usr/bin/rup command to report the load average on a machine. The rup machine_name command gives you a report that looks similar to the following:
localhost up 23 days, 10:25, load average: 1.72, 1.05, 1.17
You can use this command to report the load average of your local machine or of remote machines. Another command, /usr/bin/uptime, returns the load average information for only your local host.
The schedd daemon writes to the spool/history file only when a job is completed or removed. Therefore, you can delete the history file and restart schedd even when some jobs are scheduled to run on other hosts.
However, you should clean up the spool/job_queue.dir and spool/job_queue.pag files only when no jobs are being scheduled on the machine.
You should not delete these files if there are any jobs in the job queue that are being scheduled from this machine (for example, jobs with names such as thismachine.clusterno.jobno).
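A minimal sketch of the history cleanup, assuming no jobs are being scheduled from this machine and that the spool directory is /var/loadl/spool (your path, and where you choose to archive the file, may differ):

   llctl stop                                      # stop the LoadLeveler daemons on this machine
   mv /var/loadl/spool/history /tmp/history.save   # archive (or delete) the history file
   llctl start                                     # restart the daemons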