IBM Parallel Engineering and Scientific Subroutine Library (Parallel ESSL) for AIX has the following characteristics:
For a list of subroutines, refer to "Looking for a Subroutine?".
Tuned for performance on the SP consisting of POWER2 nodes with the High Performance Switch or SP Switch.
This part of the book consists of five chapters that provide guidance on using Parallel ESSL. It is organized as follows:
This chapter introduces you to IBM* Parallel Engineering and Scientific Subroutine Library (Parallel ESSL) for Advanced Interactive Executive (AIX*) products.
Parallel ESSL is a scalable mathematical subroutine library that supports parallel processing applications on IBM RS/6000* SP* Systems and clusters of IBM RS/6000 workstations. Parallel ESSL supports the Single Program Multiple Data (SPMD) programming model using either the Message Passing Interface (MPI) signal handling library or the MPI threaded library. Parallel ESSL provides subroutines in six major areas of mathematical computations. It is tuned for optimal performance on the SP consisting of POWER2* nodes with the High Performance Switch or SP Switch.
Parallel ESSL provides subroutines in the following computational areas:
The subroutines run under the AIX operating system and can be called from application programs written in Fortran, C, C++, and High Performance Fortran (HPF). On the SP, Parallel System Support Programs (PSSP) is also required.
For communication, Parallel ESSL includes the Basic Linear Algebra Communications Subprograms (BLACS), which use the Parallel Environment (PE) Message Passing Interface (MPI). Communications using the User Space (US) require either the High Performance Switch or SP Switch. Communications using the Internet Protocol (IP) may use Ethernet, Token Ring, Fiber Distributed Data Interface (FDDI), High Performance Switch or SP Switch. For computations, Parallel ESSL uses the ESSL for AIX subroutines.
To order the IBM Parallel ESSL for AIX, specify program number 5765-C41.
Parallel ESSL uses PE for communication during parallel processing. It supports the SPMD programming model and runs on the SP or on workstation clusters. In other words, your application program must use PE if you want to call Parallel ESSL subroutines.
The RS/6000 processors are called processor nodes. A parallel program, such as yours with calls to the Parallel ESSL subroutines, executes as a number of individual, but related, parallel tasks on a number of your system's processor nodes. The group of parallel tasks is called a partition. The parallel tasks of your partition can communicate to exchange data or synchronize execution.
Your SP may have an optional high-performance switch for communication. The switch increases the speed of communication between nodes. It supports a high volume of message passing with increased bandwidth and low latency. This helps your application program, as well as the Parallel ESSL subroutines, achieve maximum performance.
Parallel ESSL assumes that the application program is using the SPMD programming model, where the programs running the parallel tasks of your partition are identical. The tasks, however, work on different sets of data.
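The SPMD idea can be sketched in C as follows. This is a minimal illustration, not the data layout that Parallel ESSL requires: every task runs the same function and uses only its own task number to decide which part of the data it works on. The names `task_range` and `spmd_range` are hypothetical and are not part of Parallel ESSL or PE. In a real program the task number and task count would typically come from MPI (for example, `MPI_Comm_rank` and `MPI_Comm_size`).

```c
#include <stddef.h>

/* Hypothetical helper type: the contiguous share of data owned by one task. */
typedef struct {
    size_t start;  /* index of the first item owned by the task */
    size_t count;  /* number of items owned by the task */
} task_range;

/*
 * Split n items into contiguous shares for ntasks identical tasks.
 * Every task calls this same function; only `rank` differs.
 * Requires ntasks > 0 and rank < ntasks.
 * The first (n % ntasks) tasks each receive one extra item.
 */
task_range spmd_range(size_t n, size_t ntasks, size_t rank)
{
    size_t base = n / ntasks;   /* items every task receives */
    size_t extra = n % ntasks;  /* leftover items, one each to the low ranks */
    task_range r;

    r.count = base + (rank < extra ? 1 : 0);
    r.start = rank * base + (rank < extra ? rank : extra);
    return r;
}
```

For example, with 10 items and 4 tasks, task 2 calls `spmd_range(10, 4, 2)` and works on 2 items starting at index 6, while the other tasks run the same code on their own ranges.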
The following sections describe how to use the Message Passing subroutines supplied in Parallel ESSL.
The application developer begins by creating a parallel program's source code, including calls to the Parallel ESSL subroutines. The application developer might create this program from scratch and then place calls to BLACS, MPI, or MPL routines so that it can run as a number of parallel tasks. These calls enable the parallel processes of your partition to communicate data and coordinate their execution. As part of each parallel process, the Parallel ESSL subroutines also perform these types of functions.
Details on what other specific coding additions are required when using Parallel ESSL are given in "Coding and Running Your Program".
Your global data structures (vectors, matrices, or sequences) must be distributed across your processes prior to calling the Parallel ESSL subroutines.
Because data is distributed for both input and output, no implicit bottleneck is created by an initial scatter or ending gather operation. Parallel ESSL works in true SPMD mode, where each process operates only on a portion of the data. Also, the input and output data may be too large to collectively reside on a single node; therefore, problems associated with the storage limitations of a single processor node are eased by performing the computation in actual SPMD fashion.
See "Distributing Your Data" for details on distributing your data.
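To give a feel for what distributing a global data structure involves, the following C sketch maps a global vector index to its owning process and local position. It assumes a one-dimensional block-cyclic layout of the kind used by ScaLAPACK-style libraries, with 0-based indices and the first block placed on process 0. The function names are hypothetical and the layout shown is an assumption for illustration; the distributions that Parallel ESSL actually expects are defined in "Distributing Your Data".

```c
#include <stddef.h>

/*
 * Process that owns global element g when blocks of nb elements are
 * dealt round-robin to p processes, starting with process 0.
 */
size_t bc_owner(size_t g, size_t nb, size_t p)
{
    return (g / nb) % p;
}

/* Position of global element g inside its owner's local array. */
size_t bc_local_index(size_t g, size_t nb, size_t p)
{
    /* complete rounds before g, times block size, plus offset in block */
    return (g / (nb * p)) * nb + g % nb;
}

/* Number of elements of an n-element global vector stored on `rank`. */
size_t bc_local_count(size_t n, size_t nb, size_t p, size_t rank)
{
    size_t nblocks = n / nb;            /* number of full blocks */
    size_t count = (nblocks / p) * nb;  /* full rounds dealt to everyone */
    size_t extra = nblocks % p;         /* full blocks left after those rounds */

    if (rank < extra)
        count += nb;                    /* one more full block */
    else if (rank == extra)
        count += n % nb;                /* trailing partial block, if any */
    return count;
}
```

With a 10-element vector, a block size of 2, and 3 processes, global element 6 belongs to process 0 at local position 2, and the processes hold 4, 4, and 2 elements respectively. No process ever needs the whole vector, which is the point made above about avoiding a scatter or gather bottleneck.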
After writing the parallel application program containing calls to the Parallel ESSL subroutines, the developer then begins a cycle of modification and testing. The application program is run using the Parallel Operating Environment (POE). The POE includes:
You can use all of these capabilities of POE with Parallel ESSL.
Once the parallel program is debugged, the next step is to tune it for optimal performance. This is an important step, because performance is the key reason for using the Parallel ESSL subroutines. To tune and analyze programs with calls to the Parallel ESSL subroutines, you may wish to use the tools provided by PE. For details, see the PE manuals listed in "Parallel Environment Version 2".
XL HPF provides an easy way to develop parallel software with the SPMD programming model on your SP or cluster configuration. The XL HPF compiler, guided by XL HPF directives in your source code, handles the distribution of data and communication on multiple processes. There are three steps involved in getting a Fortran program ready to run in a parallel environment:
These three steps are accomplished using XL HPF directives. To parallelize your Fortran program, you must do the following:
Parallel Environment is required to compile and run Parallel ESSL HPF programs.
For more details about the XL HPF language, see the IBM XL High Performance Fortran for AIX Language Reference manual, SC09-2226.
For more information on distributing your data, see "Distributing Your Data". For more information on coding your program, see "Coding and Running Your Program".
For further details on PE and its various capabilities, see the PE manuals listed in "Parallel Environment Version 2". For more information about MPI, see references [38] and [46].
Parallel ESSL provides accuracy comparable to libraries using equivalent algorithms with identical precision formats. The data types operated on are RS/6000 architecture precisions: ANSI/IEEE 64-bit binary floating-point format, and 32-bit integer. See the ANSI/IEEE Standard for Binary Floating-Point Arithmetic, ANSI/IEEE Standard 754-1985, for more detail.
The Parallel ESSL subroutines follow standard Fortran calling conventions. When Parallel ESSL subroutines are called from a program in a language other than Fortran, such as C or C++, the Fortran conventions must be used. This applies to all aspects of the interface, such as the linkage conventions and the data conventions. For example, array ordering must be consistent with Fortran array ordering techniques. Data and linkage conventions for each language are given in "Coding and Running Your Program" and in the ESSL Version 3 Guide and Reference.
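The array-ordering requirement can be illustrated with a short C sketch. Fortran stores two-dimensional arrays by column (column-major order), whereas C stores them by row, so a C caller must lay out matrix data by column before passing it to a Fortran-convention subroutine. The helper names below are hypothetical and are not part of Parallel ESSL; the sketch covers only array ordering, not the linkage conventions, which are described in "Coding and Running Your Program".

```c
#include <stddef.h>

/*
 * Offset of element (i, j), both 0-based, in a column-major array
 * whose consecutive columns are lda elements apart (lda >= rows).
 */
size_t colmajor_offset(size_t i, size_t j, size_t lda)
{
    return i + j * lda;
}

/*
 * Copy an m-by-n matrix stored by row (the usual C layout) into a
 * buffer stored by column (the Fortran layout) with leading
 * dimension lda. dst must hold at least lda * n elements.
 */
void rowmajor_to_colmajor(const double *src, double *dst,
                          size_t m, size_t n, size_t lda)
{
    size_t i, j;

    for (j = 0; j < n; j++)
        for (i = 0; i < m; i++)
            dst[colmajor_offset(i, j, lda)] = src[i * n + j];
}
```

For the 2-by-3 matrix with rows (1, 2, 3) and (4, 5, 6), the row-major buffer is 1, 2, 3, 4, 5, 6 and the column-major buffer produced with `lda` equal to 2 is 1, 4, 2, 5, 3, 6.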