Using and Administering

ll_get_hostlist Subroutine

Purpose

This subroutine obtains a list of machines from the Master Starter machine so that the Parallel Master can start the Parallel Slaves. The Parallel Master is the LoadLeveler executable specified in the job command file. The Parallel Slaves are the processes that the Parallel Master starts through the ll_start_host API.

Library

LoadLeveler API library libllapi.a

Syntax

  int ll_get_hostlist(struct JM_JOB_INFO* jobinfo);

Parameters

jobinfo is a pointer to the JM_JOB_INFO structure defined in llapi.h. The caller does not need to fill in any fields. ll_get_hostlist allocates storage for an array of JM_NODE_INFO structures and returns a pointer to that array in the jm_min_node_info field. It is the caller's responsibility to free this storage.

struct JM_JOB_INFO {
   int jm_request_type;
   char jm_job_description[50];
   enum JM_ADAPTER_TYPE jm_adapter_type;
   int jm_css_authentication;
   int jm_min_num_nodes;
   struct JM_NODE_INFO *jm_min_node_info;
};

struct JM_NODE_INFO {
   char jm_node_name[MAXHOSTNAMELEN];
   char jm_node_address[50];
   int jm_switch_node_number;
   int jm_pool_id;
   int jm_cpu_usage;
   int jm_adapter_usage;
   int jm_num_virtual_tasks;
   int *jm_virtual_task_ids;
   enum JM_RETURN_CODE jm_return_code;
};

The following data is filled in for the JM_JOB_INFO structure:

jm_min_num_nodes
Is the number of elements in the array of JM_NODE_INFO structures. It is the number of hosts allocated to a job.

jm_min_node_info
Is the pointer to the array of JM_NODE_INFO structures. The first entry in this array describes the node which is mapped to task 0. The second entry is mapped to task 1, and so on.

The following data is filled in for each JM_NODE_INFO structure:

jm_node_name
Is the name of the node.

jm_node_address
Is the address corresponding to the adapter requested.

jm_switch_node_number
Is the relative node number. It is set only for jobs running on the SP switch adapter. For all other jobs it is set to -1.
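
The fields described above can be walked as in the following minimal sketch. It is not taken from the LoadLeveler documentation. print_hostlist is a hypothetical helper, and HAVE_LLAPI is an assumed build macro that selects the real llapi.h. Without it, the sketch compiles against stand-ins: the structure layouts are copied from this page, but the enum members, the MAXHOSTNAMELEN value, and the stub ll_get_hostlist (which pretends two hosts were allocated) are placeholders. Releasing the array with free() assumes the storage comes from the C allocator, which this page does not state explicitly.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#ifdef HAVE_LLAPI
#include <llapi.h>   /* real declarations on a LoadLeveler system */
#else
/* Stand-ins so the sketch compiles without LoadLeveler. The enum members
 * and the MAXHOSTNAMELEN value are placeholders, not the real definitions. */
#ifndef MAXHOSTNAMELEN
#define MAXHOSTNAMELEN 64
#endif
enum JM_ADAPTER_TYPE { JM_ADAPTER_PLACEHOLDER };
enum JM_RETURN_CODE { JM_RC_PLACEHOLDER };

struct JM_NODE_INFO {
    char jm_node_name[MAXHOSTNAMELEN];
    char jm_node_address[50];
    int jm_switch_node_number;
    int jm_pool_id;
    int jm_cpu_usage;
    int jm_adapter_usage;
    int jm_num_virtual_tasks;
    int *jm_virtual_task_ids;
    enum JM_RETURN_CODE jm_return_code;
};

struct JM_JOB_INFO {
    int jm_request_type;
    char jm_job_description[50];
    enum JM_ADAPTER_TYPE jm_adapter_type;
    int jm_css_authentication;
    int jm_min_num_nodes;
    struct JM_NODE_INFO *jm_min_node_info;
};

/* Hypothetical stub: pretends that two hosts were allocated to the job. */
int ll_get_hostlist(struct JM_JOB_INFO *jobinfo)
{
    static const char *names[] = { "node01", "node02" };
    struct JM_NODE_INFO *nodes = calloc(2, sizeof *nodes);
    if (nodes == NULL)
        return -8;
    for (int i = 0; i < 2; i++) {
        snprintf(nodes[i].jm_node_name, sizeof nodes[i].jm_node_name,
                 "%s", names[i]);
        snprintf(nodes[i].jm_node_address, sizeof nodes[i].jm_node_address,
                 "192.0.2.%d", i + 1);
        nodes[i].jm_switch_node_number = -1;   /* not an SP switch job */
    }
    jobinfo->jm_min_num_nodes = 2;
    jobinfo->jm_min_node_info = nodes;
    return 0;
}
#endif

/* Fetch the host list, print one line per task, then free the array.
 * Returns the number of hosts, or the negative ll_get_hostlist code. */
int print_hostlist(FILE *out)
{
    assert(out != NULL);

    struct JM_JOB_INFO jobinfo;
    memset(&jobinfo, 0, sizeof jobinfo);   /* no input fields are required */

    int rc = ll_get_hostlist(&jobinfo);
    if (rc != 0)
        return rc;

    int n = jobinfo.jm_min_num_nodes;
    for (int task = 0; task < n; task++) {
        /* Entry i of the array describes the node mapped to task i. */
        const struct JM_NODE_INFO *node = &jobinfo.jm_min_node_info[task];
        fprintf(out, "task %d: %s (%s), switch node %d\n", task,
                node->jm_node_name, node->jm_node_address,
                node->jm_switch_node_number);
    }

    free(jobinfo.jm_min_node_info);   /* the caller owns this storage */
    return n;
}
```

A Parallel Master could call print_hostlist(stdout) once at startup to check the task-to-node mapping before starting any Parallel Slaves.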

Description

The Parallel Master must:

Return Values

This subroutine returns a zero to indicate success.

Error Values

-2
Cannot get LoadLeveler step ID from environment.

-5
Cannot make socket. The UNIX stream socket could not be created. Both functions of this API need this socket to communicate with the starter.

-6
Cannot connect to host.

-8
Cannot get hostlist.
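
As an illustration of handling these codes, the following sketch maps each return value listed above to its description. hostlist_strerror and report_hostlist_error are hypothetical helper names, not part of the LoadLeveler API, and any code not listed on this page is reported as unrecognized.

```c
#include <assert.h>
#include <stdio.h>

/* Map an ll_get_hostlist return code to the description on this page. */
const char *hostlist_strerror(int rc)
{
    switch (rc) {
    case 0:  return "success";
    case -2: return "cannot get LoadLeveler step ID from environment";
    case -5: return "cannot make socket";
    case -6: return "cannot connect to host";
    case -8: return "cannot get hostlist";
    default: return "unrecognized return code";
    }
}

/* Print a diagnostic for a nonzero code. Returns 0 on success, 1 on error. */
int report_hostlist_error(FILE *out, int rc)
{
    assert(out != NULL);
    if (rc == 0)
        return 0;
    fprintf(out, "ll_get_hostlist failed (%d): %s\n",
            rc, hostlist_strerror(rc));
    return 1;
}
```

A caller might pass the value returned by ll_get_hostlist directly to report_hostlist_error(stderr, rc) and exit if it returns 1.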

