This chapter explains how to install the five PE filesets.
Installation on an IBM RS/6000 SP is different from installation on an IBM RS/6000 network cluster. The main difference is that the SP installation allows you to use the SP system management functions to manage and maintain the system software, whereas these functions are not available on an IBM RS/6000 network cluster.
In both environments, you first install the desired PE filesets on a single node. When that installation is complete, you can then replicate the installation image throughout the remaining nodes, using one of the suggested methods described in this chapter.
You can install the desired PE filesets on an SP in one of three ways:
To use this method, refer to the chapter on "Performing Software Maintenance" in IBM Parallel System Support Programs for AIX: Installation and Migration Guide, GA22-7347.
In any event, you must first install the PE filesets on at least one node of your system. Preferably, install them on the control workstation to allow use of the system management software functions.
If you install PE on the SP Control Workstation (using one of the methods explained later in this chapter), you will need to do the following in order to use the User Space libraries:
ln -s /usr/lpp/ssp/css/libtb2/libmpci.a /etc/ssp/css/libus/libmpci.a
ln -s /usr/lpp/ssp/css/libtb2/libmpci_r.a /etc/ssp/css/libus/libmpci_r.a
ln -s /usr/lpp/ssp/css/libtb3/libmpci.a /etc/ssp/css/libus/libmpci.a
ln -s /usr/lpp/ssp/css/libtb3/libmpci_r.a /etc/ssp/css/libus/libmpci_r.a
Note: | Establish this link before installing POE. The installation steps depend on the correct adapter libraries being properly linked. The link is normally created by /usr/lpp/ssp/css/rc.switch when it is called from /etc/inittab on an SP node. The link is not created on the SP control workstation, because the type of switch adapter to use cannot be determined there. Failure to set up this link may prevent you from using POE. |
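Both pairs of ln commands above create links with the same names under /etc/ssp/css/libus, so it appears that only one pair is meant to be used on a given control workstation. The following is a minimal sketch, assuming that you choose the libtb2 or libtb3 directory to match your switch adapter type (this text does not say which adapter corresponds to which directory). The function name print_us_links is hypothetical, and the function only prints the commands for review instead of running them:

```shell
# Print the two ln -s commands for one adapter library directory.
# $1 is assumed to be libtb2 or libtb3, matching the paths shown above.
print_us_links() {
    case "$1" in
        libtb2|libtb3) ;;
        *) echo "usage: print_us_links libtb2|libtb3" >&2; return 1 ;;
    esac
    # One link for the standard library and one for the threaded (_r) library.
    for lib in libmpci.a libmpci_r.a; do
        echo "ln -s /usr/lpp/ssp/css/$1/$lib /etc/ssp/css/libus/$lib"
    done
}
```

Printing the commands first lets you confirm the adapter directory before any link is created as root.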
During the course of installing PE filesets on an SP, you may encounter sysck warning messages that a particular file is also owned by another fileset. If the file is also owned by one of the PSSP filesets, such as ssp.css, then these messages can be ignored. However, if the warning messages are for older PE related filesets, such as poe or poe_pcf, then this may indicate an older version is installed.
These warning messages can be ignored, as the system will function properly. However, if you later choose to remove the old fileset after installing PE Version 2.4, you will need to repeat the installation of the new fileset.
Installation on an IBM RS/6000 network cluster is similar to that on the SP, with the exception that there are no system management functions, leaving you with the following two options:
In either case, first install the PE filesets on at least one system in your cluster. When this is complete, you can replicate the installation image to your other nodes.
During the course of installing PE filesets on a cluster, you may encounter sysck warning messages that a particular file is also owned by another fileset. If the file is also owned by one of the older PE filesets, such as poe or poe_pcf, then this may indicate an older version is installed.
These warning messages can be ignored, as the system will function properly. However, if you later choose to remove the old fileset after installing PE Version 2.4, you will need to repeat the installation of the new fileset.
If you migrate from PE Version 1 or Version 2.1 to PE Version 2 Release 4, installing the new filesets will completely replace some of the earlier release filesets, rendering them obsolete. installp marks the replaced filesets as "OBSOLETE" in the ODM, and lslpp reports them with that state.
However, some directories and install files will remain. Since these earlier filesets do not coexist or execute with PE Version 2.4, you should uninstall your old filesets before installing the new PE filesets, rather than installing the new filesets on top of the old. This will conserve disk space and reduce the chance for confusion over old fileset path names, executables, etc.
Caution:
If you plan to uninstall the old filesets, do so before installing
the new filesets. If you attempt to uninstall the old filesets
after installing PE Version 2.4, you may accidentally delete needed
files, which may affect your system.
The following table lists the old filesets that need to be removed before
installing PE Version 2.4:
PE Version | Filesets to be Removed |
---|---|
1 | poe poe_pcf xpdbx vt pedocs |
2.1 | ppe.xpdbx ppe.pedocs |
You can use the lslpp command to check if any of the above filesets are installed. For example, lslpp -l poe will tell you if the Version 1 poe fileset is installed.
To remove filesets you can use any of the following methods:
Use the Maintain Installed Software dialog found under the Software Installation and Maintenance dialog.
installp -u poe
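The check with lslpp and the removal with installp -u can be combined into a dry run. This is a sketch only: it assumes that lslpp -l exits with a non-zero status for a fileset that is not installed, the function names old_filesets and plan_removal are hypothetical, and the removal commands are printed rather than executed so that you can review them first.

```shell
# Old filesets per earlier PE version, copied from the table above.
old_filesets() {
    case "$1" in
        1)   echo "poe poe_pcf xpdbx vt pedocs" ;;
        2.1) echo "ppe.xpdbx ppe.pedocs" ;;
        *)   echo "unknown PE version: $1" >&2; return 1 ;;
    esac
}

# Print (do not run) an installp -u command for each old fileset
# that lslpp reports as installed.
plan_removal() {
    for fs in $(old_filesets "$1"); do
        if lslpp -l "$fs" >/dev/null 2>&1; then
            echo "installp -u $fs"
        fi
    done
}
```

For example, plan_removal 1 lists a removal command for each Version 1 fileset still present on the node.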
Install this fileset to submit a POE job which uses the SP Resource Manager from a non-SP node. To install, do the following:
installp -aFXd /<mounted_image_directory>/pssp.installp ssp.clients
Install this fileset to submit a POE job which uses LoadLeveler from a node outside of the LoadLeveler cluster. To install, do the following:
installp -aFXd device loadl.so
Before installing any fileset, you may want to read its README file, which may contain special or additional information about installing the fileset. Each PE fileset is shipped with a copy of its README as part of the first file on the tape, so you can view the README with the -i option of the installp command.
If you decide after reading the README that you would like to refer to the file later, once the fileset is installed you can find the README file in the /usr/lpp/<fileset>/README directory, with a name of <fileset>.README.
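The two ways of reaching a README can be sketched as follows. The helper names readme_preview_cmd and readme_path are hypothetical, and the installp -i -d form is an assumption about how the -i option is combined with a device name; check it against your installp documentation before use.

```shell
# Print the command that displays a fileset's README from the installation device.
# $1 = device or image directory, $2 = fileset name (for example ppe.poe).
readme_preview_cmd() {
    echo "installp -i -d $1 $2"
}

# Print where the README is found once the fileset is installed,
# following the /usr/lpp/<fileset>/README/<fileset>.README pattern above.
readme_path() {
    echo "/usr/lpp/$1/README/$1.README"
}
```

For instance, readme_path ppe.poe prints /usr/lpp/ppe.poe/README/ppe.poe.README.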
Summarized below are the basic steps you must follow to install the PE software on the SP or an IBM RS/6000 network cluster.
You can install all of the PE filesets at once, or you can install selected
filesets one at a time. To determine which filesets, if any, you
want to install separately, see "PE Fileset Requirements".
An X marks each fileset that you are installing.
ppe.poe | ppe.vt | ppe.pedb | ppe.xprofiler | ppe.pedocs | Perform these steps: | Type of step |
---|---|---|---|---|---|---|
X | X | X | X | X | "Step 1: Copying the Software to a Hard Disk for Installation Over a Network" | Standard |
X | X | X | X | X | "Step 2: Performing the Initial Installation" | Standard |
X | X | X | X | X | "Step 3: Installing PE on Other Nodes" | Standard |
X | | | | | "Step 4: Verifying the POE Installation" | Optional |
| X | | | | "Step 5: Verifying the VT Installation" | Optional |
X | | | | | Appendix C. "Using Additional POE Sample Applications" | Optional |
This section provides the step-by-step procedure for installing the PE software on the SP or on the IBM RS/6000 network cluster. Each step includes one or more tables that guide you through choices about such variables as:
Pay close attention to these tables as you proceed through the procedure, because they may direct you to skip certain steps.
Notes:
This step consists of copying the installation images off
the distribution medium and exporting the installation directory, thereby
making the installation images available for mounting.
If you are installing on the SP: | If you are installing on an IBM RS/6000 network cluster: |
---|---|
You must always complete this step. | You must complete this step if any of the machines in your cluster do not have the proper installation device to read the distribution medium. |
Note: | If you already have an earlier version of PE installed, remove the earlier version before proceeding. (See "Removing an Installation Image".) |
To copy the PE software off the distribution medium, follow the instructions below:
* This command invokes SMIT, and takes you to the window for copying software to a hard disk for future installation over the network.
* A window opens listing the available INPUT devices and directories for software.
* The window listing the available INPUT devices closes and the original SMIT window indicates your selection.
* The SMIT window displays the default parameters for copying software to a hard disk.
If you are installing on the SP: | If you are installing on an IBM RS/6000 network cluster: |
---|---|
/spdata/sys1/install/pssplpp | /usr/sys/inst.images |
* The system copies the PE software installation images to the directory.
* The SMIT window closes.
To export the directory so the machines in your cluster can install the
PE installation images it contains, enter the appropriate command, as shown in
the following table:
If you are installing on the SP: | If you are installing on an IBM RS/6000 network cluster: |
---|---|
/usr/sbin/mknfsexp -d /spdata/sys1/install/pssplpp | /usr/sbin/mknfsexp -d /usr/sys/inst.images |
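The choice of directory and export command can be expressed as a small helper. This is a sketch with hypothetical function names (image_dir, export_cmd); the paths and the mknfsexp invocation are taken from the tables above, and the command is printed rather than run.

```shell
# Print the installation image directory for the given environment.
# $1 = sp (IBM RS/6000 SP) or cluster (IBM RS/6000 network cluster).
image_dir() {
    case "$1" in
        sp)      echo "/spdata/sys1/install/pssplpp" ;;
        cluster) echo "/usr/sys/inst.images" ;;
        *)       echo "usage: image_dir sp|cluster" >&2; return 1 ;;
    esac
}

# Print the command that exports that directory over NFS.
export_cmd() {
    dir=$(image_dir "$1") || return 1
    echo "/usr/sbin/mknfsexp -d $dir"
}
```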
This step consists of initially installing the PE installation image, using either of the following methods:
Either method allows you to specify whether you want to install all of the PE software filesets or just certain individual filesets.
Note: | Keep in mind that some of the PE filesets depend on others to run. "PE Fileset Requirements" details these dependencies. Refer to this section before you do a partial installation. |
If you are installing on the SP: | If you are installing on an IBM RS/6000 network cluster: |
---|---|
Perform this step on the initial SP node or on the control workstation. You must log in as root. | Perform this step on any machine in the cluster. You must log in as root. |
To initially install the installation image, enter the
appropriate command as shown in the following table:
To install: | Enter: |
---|---|
all software filesets | installp -a -d devicename ppe* |
just the poe fileset | installp -a -I -X -d devicename ppe.poe |
just the pedb fileset | installp -a -I -X -d devicename ppe.pedb |
just the VT fileset | installp -a -I -X -d devicename ppe.vt |
just the Xprofiler fileset | installp -a -I -X -d devicename ppe.xprofiler |
just the pedocs fileset | installp -a -I -X -d devicename ppe.pedocs |
* The system reads and receives the installation image off the distribution medium.
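The installp table can be read as a simple lookup from the fileset you want to the command to enter. The sketch below uses a hypothetical function name, initial_install_cmd, and prints the command instead of running it; the flags are copied from the table, and the device name is whatever you would substitute for devicename.

```shell
# Print the installp command for an initial installation.
# $1 = device name, $2 = all or one of the five PE fileset names.
initial_install_cmd() {
    device=$1
    target=$2
    case "$target" in
        all)
            # printf with a quoted string keeps ppe* from being expanded here.
            printf '%s\n' "installp -a -d $device ppe*" ;;
        ppe.poe|ppe.pedb|ppe.vt|ppe.xprofiler|ppe.pedocs)
            printf '%s\n' "installp -a -I -X -d $device $target" ;;
        *)
            echo "unknown fileset: $target" >&2; return 1 ;;
    esac
}
```

If you run the printed ppe* command from a directory that contains files whose names begin with ppe, the shell may expand the pattern before installp sees it; quoting the pattern avoids that.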
To initially install the installation image using SMIT, follow the instructions below:
* This command invokes SMIT, and takes you directly to its window for installing software.
* A window opens listing the available INPUT devices and directories for software.
* The window listing the available INPUT devices and directories closes and the original SMIT window indicates your selection.
* The SMIT window displays the default install parameters.
If you want to install: | Type this in the "SOFTWARE to install" field: |
---|---|
all the PE software | ppe* |
just the poe fileset | ppe.poe |
just the pedb fileset | ppe.pedb |
just the VT fileset | ppe.vt |
just the Xprofiler fileset | ppe.xprofiler |
just the pedocs fileset | ppe.pedocs |
Note: | After choosing the appropriate software, you may also want to change other options on the panel, as needed. For example, the panel also asks whether or not you want to expand the file systems. |
* The system installs the installation image.
Note: | The POE installation process checks to see if the digd daemon is running. If it is not running, it will start the daemon. If it is running, it will not start it, and will tell you the daemon's process number, so you can kill it manually. |
Note: | For more information on SMIT, refer to IBM AIX Version 4 General Programming Concepts: Writing and Debugging Programs, SC23-2533. |
If installation fails, a software product cleanup procedure is automatically called. The cleanup procedure removes any files that may have been restored from the distribution medium, and backs out of any post-installation procedure that may have been started.
To help ascertain the cause of a failed installation, refer to the installation status file. This file indicates how far installation had progressed when the errors occurred. The status file is described in more detail in IBM AIX Version 4 General Programming Concepts: Writing and Debugging Programs, SC23-2533. If you cannot ascertain the cause of a failed installation, contact your local IBM representative.
You have completed the initial installation of PE. For a description of the directories, files, and daemon processes created and the links established when the installation image was received, see Chapter 6. "How Installation of PE Alters Your System".
To determine which remaining steps you need to perform, refer to the
following table:
If there are other nodes in your system on which you need to install PE filesets: | If there are not any other nodes in your system on which you need to install PE filesets: |
---|---|
Proceed to | Skip:
and/or |
This step consists of installing PE on other nodes, using either of the following methods:
If you are installing on the SP: | If you are installing on an IBM RS/6000 network cluster: |
---|---|
Perform this step as root from the initial SP node or from the control workstation. Ensure that you have executed the k4init command to obtain Kerberos authentication for accessing your nodes. | Perform this step as root from a node that has PE installed. |
This method consists of:
To create a host list file, follow the instructions below:
Note: | By default, the installation scripts look for a file named host.list in your current directory; however, you can name the host list file anything you want. If you do choose to give your file a different name, you will have to specify that file name when you run the installation script. |
hostname1 hostname2 hostname3 hostname4 hostname5
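A host list file can also be generated from a list of names. The sketch below assumes that the file holds one host name per line, which this text does not state explicitly; the function name write_host_list is hypothetical.

```shell
# Write a host list file with one host name per line (assumed format).
# $1 = output file; remaining arguments = host names.
write_host_list() {
    out=$1
    shift
    [ "$#" -gt 0 ] || { echo "no host names given" >&2; return 1; }
    printf '%s\n' "$@" > "$out"
}
```

For example, write_host_list host.list hostname1 hostname2 creates the default file name that the installation scripts look for in the current directory.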
To run the installation script, enter the command that is appropriate for
your system, as listed in the table below. (For a detailed explanation
of the syntax of each of these commands, see Appendix A. "Syntax of Commands for Running Installation and Deinstallation Scripts".)
If you are installing on the SP: | If you are installing on an IBM RS/6000 network cluster: |
---|---|
PEinstallSP image_name [host_list_file] [-f fanout_value] [-copy | -mount] | PEinstall image_name [host_list_file] [-copy | -mount] |
Notes:
|
The default is:
/spdata/sys1/install/pssplpp
/usr/sys/inst.images
The default is:
/spdata/sys1/install/pssplpp
/usr/sys/inst.images
The default is:
/spdata/sys1/install/pssplpp
/usr/sys/inst.images
The default is /mnt.
Answer no to this prompt.
Note: | Be sure that you have issued the chmod 777 command on this directory. |
Answer yes to this prompt.
PEinstall or PEinstallSP issues a mkdir command for the directory name specified, followed by a chmod 777.
When you are prompted for the name of the fileset you want to install,
enter the appropriate file name as shown in the following table:
If you want to install: | Type this when prompted: |
---|---|
all the PE software | all |
just the poe fileset | ppe.poe |
just the pedb fileset | ppe.pedb |
just the VT fileset | ppe.vt |
just the Xprofiler fileset | ppe.xprofiler |
just the pedocs fileset | ppe.pedocs |
* For each node in the host list, PEinstallSP or PEinstall executes the following installp command:
installp -aFX -d/image_directory/image_name fileset
This command installs both the usr and root portion of the fileset in the image specified.
The following severe installation errors will cause the installation process to terminate completely:
For other errors, a message may be displayed describing the error, and then processing will continue. The same message will be logged in a file named PEnode.log in the current working directory. If you see error messages, look in this file, as the node on which the error occurred is always displayed and logged. This helps you identify any nodes on which the fileset(s) did not get successfully installed. When you correct the error(s), you can then go back and rerun the PEinstallSP or PEinstall script just for those nodes.
As a system administrator, you may want to have more control over the installation of PE, and install it manually to other nodes, using SMIT or installp.
During "Step 1: Copying the Software to a Hard Disk for Installation Over a Network", you created an installation image that you can use to replicate the installation of PE filesets on the other nodes of your system. By making this image available to the other nodes, either by copying or mounting the image file, you can use SMIT or installp to install the image.
The installation image of PE filesets does not require any special consideration. You may use SMIT or installp as described in "Method 1: Using the installp Command". You can also set up a host list file, and run installp via dsh (for SP systems only), or rsh, and install the PE filesets on multiple nodes.
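For the manual approach, the per-node commands can be generated from a host list file before anything is executed. This is a sketch under stated assumptions: the function name plan_remote_install is hypothetical, the host list is assumed to hold one host name per line, rsh is used as the remote shell (on an SP you might substitute dsh), and the installp form is the one shown earlier for PEinstallSP and PEinstall.

```shell
# Print one remote installp command per host in a host list file.
# $1 = host list file, $2 = full path of the installation image, $3 = fileset.
plan_remote_install() {
    while read -r host; do
        # Skip blank lines in the host list.
        [ -n "$host" ] || continue
        echo "rsh $host installp -aFX -d$2 $3"
    done < "$1"
}
```

Reviewing the printed commands first makes it easier to rerun the installation later for only those nodes that failed.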
You have completed installing PE on the other nodes in your system.
To determine which remaining steps you need to perform, refer to the
following table:
If you installed POE: | If you did not install POE: |
---|---|
Proceed to: | Skip:
Proceed to: |
Note: | This step applies only if you have POE installed. |
This step consists of testing the installation of POE, using the POE Installation Verification Program (IVP). This program is provided in /usr/lpp/ppe.poe/samples/ivp. For details about how the POE IVP works, see Appendix B. "POE Installation Verification Program".
To run the POE IVP, follow the instructions below.
At the control workstation (or other home node):
* This runs an installation verification test that checks for successful execution of a message-passing program using two tasks on this node. The output should resemble the following:
Verifying the location of the Libraries
Verifying the existence of the Binaries
Partition Manager daemon /etc/pmdv2 is executable
POE files seem to be in order
Compiling the ivp sample program
Output files will be stored in directory /tmp/ivp15480
Creating host.list file for this node
Setting the required environment variables
Executing the parallel program with 2 tasks
POE IVP: running as task 0 on node pe03
POE IVP: running as task 1 on node pe03
POE IVP: there are 2 tasks running
POE IVP: task 1 received <POE IVP Message Passing Text>
POE IVP: all messages sent
Parallel program ivp.out return code was 0
Executing the parallel program with 2 tasks, threaded library
POE IVP_r: running as task 1 on node pe03
POE IVP_r: running as task 0 on node pe03
POE IVP_r: there are 2 tasks running
POE IVP_r: task 1 received <POE IVP Message Passing Text - Threaded Library>
POE IVP_r: all messages sent
Parallel program ivp_r.out return code was 0
If both tests return a return code of 0, POE IVP is successful.
To test POWERparallel system message passing, run the tests in ../samples/poetest.bw and poetest.cast
To test threaded message passing, run the tests in ../samples/threads
End of IVP test
If errors are encountered, your output contains messages that describe these errors. You can correct the errors and run the ivp.script again, if desired.
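The success criterion stated in the IVP output (both tests return a return code of 0) can be checked mechanically. The sketch below uses a hypothetical function name, ivp_passed, reads saved IVP output on standard input, and matches the two return code lines exactly as they appear in the sample output above.

```shell
# Succeed only if the IVP output on stdin reports return code 0 for both
# the ivp.out and the ivp_r.out (threaded library) programs.
ivp_passed() {
    log=$(cat)
    printf '%s\n' "$log" | grep -q 'Parallel program ivp\.out return code was 0' &&
    printf '%s\n' "$log" | grep -q 'Parallel program ivp_r\.out return code was 0'
}
```

One way to use it is to save the output of ivp.script to a file and feed that file to ivp_passed, then inspect the file for error messages if the check fails.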
POE also has sample applications for doing the following:
See Appendix C. "Using Additional POE Sample Applications" for more information.
You have completed verifying the POE installation.
To determine whether you need to proceed to the next step, refer to the
following table:
If you installed VT: | If you did not install VT: |
---|---|
Proceed to: | Do not proceed to: |
Note: | This step applies only if you have VT installed. |
This step consists of:
To verify that VT was installed correctly, follow the instructions below.
At the control workstation (or other home node):
* This command starts VT.
If the trace file plays to the end while updating the display, VT was installed successfully.
To verify that the VT trace generation mechanism is operating correctly, follow the instructions below after installing POE.
At the control workstation (or other home node):
* This results in a file called vtsample.trc being generated in the directory from which you ran it.
If you are successful in examining this trace file with VT as described previously, VT is operating correctly.
Once you have completed verifying the installation of the VT fileset, you can use VT as described in IBM Parallel Environment for AIX: Operation and Use, Volume 2, SC28-1980.
Once you have installed the PE filesets (ppe.poe, ppe.vt, ppe.pedb, ppe.xprofiler, and ppe.pedocs), refer to the README file provided with each fileset for any additional installation or usage information. The README file is located in /usr/lpp/<fileset>/README as <fileset>.README.
For information about other procedures related to PE installation, see Chapter 5. "Installation-Related Procedures".