Using and Administering
Information on completed serial and parallel jobs is
gathered using the UNIX wait3 system call. Information on
serial and parallel jobs that have not yet completed is gathered in a
platform-dependent manner by examining data from the UNIX process.
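To illustrate the kind of data that wait3 returns for a completed process, the following is a minimal Python sketch. It is not LoadLeveler code: run_and_measure is a hypothetical helper written for this example, and os.wait3 is the Python standard library wrapper for the system call (available on UNIX platforms only).

```python
import os
import sys


def run_and_measure(argv):
    """Run argv as a child process and return (status, rusage) from wait3."""
    pid = os.fork()
    if pid == 0:
        # Child: replace this process image with the requested command.
        try:
            os.execvp(argv[0], argv)
        except OSError:
            pass
        os._exit(127)  # reached only if exec failed

    # Parent: wait3 reaps any child, so loop until our child is returned.
    while True:
        done_pid, status, usage = os.wait3(0)
        if done_pid == pid:
            return status, usage


if __name__ == "__main__":
    status, usage = run_and_measure([sys.executable, "-c", "sum(range(200000))"])
    print("exited normally:", os.WIFEXITED(status))
    print("user CPU seconds:", usage.ru_utime)
    print("system CPU seconds:", usage.ru_stime)
```

The usage structure carries the CPU time and other resource counters for the finished child, which is the sort of raw data that accounting for a completed job can be built from.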
Accounting information on a completed serial job is determined by
accumulating resources consumed by that job on the machine(s) that ran the
job. Similarly, accounting information on completed parallel jobs is
gathered by accumulating resources used on all of the nodes that ran the
job.
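The accumulation described above can be pictured with a short Python sketch. The NodeUsage record and the accumulate function are hypothetical names chosen for illustration; they do not reflect the LoadLeveler accounting record layout.

```python
from dataclasses import dataclass


@dataclass
class NodeUsage:
    """Resources one node reports for a job (illustrative fields only)."""
    node: str
    user_cpu_seconds: float
    system_cpu_seconds: float


def accumulate(usages):
    """Sum per-node usage into job-wide (user, system) CPU totals."""
    total_user = sum(u.user_cpu_seconds for u in usages)
    total_system = sum(u.system_cpu_seconds for u in usages)
    return total_user, total_system


# A parallel job that ran on two nodes (made-up values).
parallel_job = [
    NodeUsage("node01", 12.5, 1.25),
    NodeUsage("node02", 7.5, 0.75),
]
print(accumulate(parallel_job))
```

A serial job is the same case with a single record in the list.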
You can also view resource consumption information on serial and parallel
jobs that are still running by specifying the -x option of the
llq command. To enable llq -x, specify the following keywords in the
configuration file:
- ACCT = A_ON A_DETAIL
    Turns accounting data recording on. For more information on this
    keyword, see Step 9: Define Job Accounting.
- JOB_ACCT_Q_POLICY = number
    where number is the amount of time in seconds that determines how
    often the startd daemon updates the schedd daemon with accounting data of
    running jobs. This controls the accuracy of the llq -x
    command. The default is 300 seconds.
- JOB_LIMIT_POLICY = number
    where number is an amount of time in seconds. The smaller
    of JOB_LIMIT_POLICY and JOB_ACCT_Q_POLICY is used to
    control how often the startd daemon collects resource consumption
    data on running jobs, and how often the job_cpu_limit is
    checked. The default for JOB_LIMIT_POLICY is
    POLLING_FREQUENCY multiplied by POLLS_PER_UPDATE.
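The interaction between JOB_ACCT_Q_POLICY and JOB_LIMIT_POLICY can be worked through with a small Python sketch. The function name and its parameters are assumptions made for this example; only the rules it encodes (the 300-second default, the product default, and taking the smaller value) come from the keyword descriptions above. The POLLING_FREQUENCY and POLLS_PER_UPDATE values passed in are illustrative, not documented defaults.

```python
def effective_collection_interval(polling_frequency, polls_per_update,
                                  job_acct_q_policy=300,
                                  job_limit_policy=None):
    """Return the interval, in seconds, at which startd collects usage data.

    job_limit_policy defaults to POLLING_FREQUENCY * POLLS_PER_UPDATE;
    the smaller of the two policy values is the one that applies.
    """
    if job_limit_policy is None:
        job_limit_policy = polling_frequency * polls_per_update
    return min(job_limit_policy, job_acct_q_policy)


# Neither policy keyword set: the computed JOB_LIMIT_POLICY default (120)
# is smaller than the JOB_ACCT_Q_POLICY default (300).
print(effective_collection_interval(5, 24))

# JOB_LIMIT_POLICY = 600 set explicitly: JOB_ACCT_Q_POLICY (300) now applies.
print(effective_collection_interval(5, 24, job_limit_policy=600))
```

Lowering either keyword shortens the collection interval, which makes llq -x output more current at the cost of more frequent polling by the startd daemon.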