This section contains Parallel ESSL-specific application program coding requirements and considerations for message passing programs--that is, programs coded in Fortran, C, and C++. To make a Parallel ESSL call in a parallel application program:
To look at an application program outline, see the following:
For an example of the use of Parallel ESSL in a sample message passing Fortran 90 application program solving a thermal diffusion problem, see Appendix B. "Sample Programs".
The ESSL Version 3 Guide and Reference manual contains additional information about coding ESSL for AIX subroutine calls in Fortran, C, and C++ programs. That information also applies to Parallel ESSL and is not repeated in this book. The specific topics that apply to Parallel ESSL and that you may want to reference are:
A parallel machine with k processes is often thought of as a one-dimensional linear array of processes labeled 0, 1, ..., k-1. For performance reasons, it is sometimes useful to map this one-dimensional array into a logical two-dimensional rectangular grid of processes, also referred to as a process grid. The process grid can have p process rows and q process columns, where p × q = k. A process can then be indexed by row and column, (i,j), where 0 <= i < p and 0 <= j < q. (This logical rectangular grid may not necessarily be reflected by the underlying hardware--that is, the processor nodes. In most cases, k is less than or equal to the number of SP processor nodes that your job is running on. In special cases, however, the number of processes can be greater than the number of SP processor nodes. This is subject to restrictions imposed by PE. For more details, refer to the appropriate Parallel Environment: Operation and Use manual.)
Before calling the Parallel ESSL subroutines, you need to call BLACS_GET, followed by a call to either BLACS_GRIDINIT or BLACS_GRIDMAP to define the size and dimensions of your process grid. This identifies what processes are involved in the communication. You can reinitialize the BLACS, as needed, at various points in your application program to redefine the process grid.
When you initialize the BLACS, you must specify the (total) size k of the grid to be less than or equal to the value set in the MP_PROCS PE environment variable or its associated command-line flag -procs. If argument values are not valid, an error message is issued and the program is terminated.
An example of initializing the BLACS in a Fortran 90 program is shown in Appendix B. "Sample Programs". See the subroutine initialize_scale in "Module Scale (Message Passing)".
You call the BLACS_PINFO routine when you want to determine how many processes are available. You can use this information as input into other BLACS routines that set up your process grid.
Language | Call Statement |
---|---|
Fortran | CALL BLACS_PINFO (mypnum, nprocs) |
C | blacs_pinfo (&mypnum, &nprocs); |
C++ | extern "FORTRAN" void blacs_pinfo(const int &, const int &); blacs_pinfo (mypnum, nprocs); |
mypnum is the process number of this process. Returned as: a fullword integer value, where: 0 <= mypnum <= nprocs - 1.
nprocs is the number of processes available. Returned as: a fullword integer value.
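As an illustration, the following sketch (the variable names are arbitrary, not part of the Parallel ESSL API) queries the number of available processes and uses it to pick process grid dimensions before the BLACS are initialized:

```
      INTEGER MYPNUM, NPROCS, NPROW, NPCOL
*
* Find out my process number and how many processes are available
*
      CALL BLACS_PINFO(MYPNUM, NPROCS)
*
* Use the process count to choose a grid shape; here, one process row
*
      NPROW = 1
      NPCOL = NPROCS
```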
You call the BLACS_GET routine when you want the values the BLACS are using for internal defaults. The most common use is in retrieving a default system context for input into BLACS_GRIDINIT or BLACS_GRIDMAP.
Language | Call Statement |
---|---|
Fortran | CALL BLACS_GET (icontxt, what, val) |
C | blacs_get (&icontxt, &what, &val); |
C++ (what = 0, 2, or 10) | extern "FORTRAN" void blacs_get(const int &, const int &, const int &); blacs_get (icontxt, what, val); |
icontxt has the following meaning:
If what = 0 or 2, icontxt is ignored.
If what = 10, icontxt is the integer handle indicating the BLACS context.
Specified as: a fullword integer value.
Table 31. Input and Output for BLACS_GET
Value of what | BLACS Internals That Are Returned in val |
---|---|
0 | Handle indicating the default system context |
2 | BLACS debug level |
10 | Handle indicating the system context used to define the BLACS context whose handle is icontxt. You can redefine the shape of your process grid by calling BLACS_GET with what = 10. For examples of how to do this, see the "Notes" section in "BLACS_GRIDINIT" or "BLACS_GRIDMAP". |
what indicates which BLACS internal value is to be returned, as listed in Table 31. Specified as: a fullword integer value; what = 0, 2, or 10.
val is the BLACS internal value requested by what. Returned as: a fullword integer value.
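For example, the following sketch (variable names are arbitrary) retrieves the default system context for later use with BLACS_GRIDINIT or BLACS_GRIDMAP, and then queries the BLACS debug level:

```
      INTEGER ICONTXT, IDEBUG
*
* what = 0: get the default system context; the first argument is
* ignored on entry and the handle is returned in the last argument
*
      CALL BLACS_GET(0, 0, ICONTXT)
*
* what = 2: query the BLACS debug level
*
      CALL BLACS_GET(0, 2, IDEBUG)
```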
You call the BLACS_GRIDINIT routine when you want to map the processes sequentially in row-major order or column-major order into the process grid. You must specify the same input argument values in the calls to BLACS_GRIDINIT on every process.
Language | Call Statement |
---|---|
Fortran | CALL BLACS_GRIDINIT (icontxt, order, nprow, npcol) |
C | blacs_gridinit (&icontxt, &order, &nprow, &npcol); |
C++ | extern "FORTRAN" void blacs_gridinit(const int &, char *, const int &, const int &); blacs_gridinit (icontxt, order, nprow, npcol); |
icontxt is the system context handle, returned by BLACS_GET, to be used in creating the BLACS context. Specified as: a fullword integer value.
order specifies how the processes are mapped into the process grid:
If order = 'R', row-major natural ordering is used. This is the default.
If order = 'C', column-major natural ordering is used.
Specified as: a single character; order = 'R' or 'C'.
nprow is the number of rows in this process grid. Specified as: a fullword integer value, where: 1 <= nprow <= p.
npcol is the number of columns in this process grid. Specified as: a fullword integer value, where: 1 <= npcol <= q.
On return, icontxt is the integer handle to the newly created BLACS context. Returned as: a fullword integer value.
To get the default system context for input to BLACS_GRIDINIT, code:

```
      CALL BLACS_GET(0, 0, icontxt)
```
For example, you can define a 2 × 2 process grid and later redefine its shape to a 1 × 4 process grid, as follows:

```
*
* Define the 2 × 2 process grid
*
      CALL BLACS_GET(0, 0, icontxt)
      CALL BLACS_GRIDINIT(icontxt, 'R', 2, 2)
      .
      .
      .
*
* Redefine the shape to a 1 × 4 process grid
*
      CALL BLACS_GET(icontxt, 10, newcontxt)
      CALL BLACS_GRIDINIT(newcontxt, 'R', 1, 4)
```
Suppose you call BLACS_GRIDINIT in your Fortran program as follows:

```
      CALL BLACS_GRIDINIT(icontxt, 'R', 3, 4)
```

The processes would be mapped sequentially in row-major order into a 3 by 4 process grid, as follows:
Table 32. A 3 by 4 process grid
P(p,q) | 0 | 1 | 2 | 3 |
---|---|---|---|---|
0 | t0 | t1 | t2 | t3 |
1 | t4 | t5 | t6 | t7 |
2 | t8 | t9 | t10 | t11 |
Note: In this example, the process grid is 3 by 4. You must execute a call to Parallel ESSL on all processes whose process row and column indexes satisfy 0 <= i < 3 and 0 <= j < 4, respectively.
You call the BLACS_GRIDINFO routine to obtain the process grid dimensions and your own process row and column indexes.
Language | Call Statement |
---|---|
Fortran | CALL BLACS_GRIDINFO (icontxt, nprow, npcol, myrow, mycol) |
C | blacs_gridinfo (&icontxt, &nprow, &npcol, &myrow, &mycol); |
C++ | extern "FORTRAN" void blacs_gridinfo(const int &, const int &, const int &, const int &, const int &); blacs_gridinfo (icontxt, nprow, npcol, myrow, mycol); |
icontxt is the integer handle to the BLACS context. Specified as: a fullword integer value, returned by BLACS_GRIDINIT or BLACS_GRIDMAP.
nprow is the number of rows in the process grid. Returned as: a fullword integer value, where: 1 <= nprow <= p.
npcol is the number of columns in the process grid. Returned as: a fullword integer value, where: 1 <= npcol <= q.
myrow is my process row index. Returned as: a fullword integer value, where: 0 <= myrow < p.
mycol is my process column index. Returned as: a fullword integer value, where: 0 <= mycol < q.
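A common pattern, also used in the application program outline later in this section, is to retrieve your own grid coordinates and then call Parallel ESSL only on processes that belong to the grid. A minimal sketch (variable names are arbitrary) follows:

```
      INTEGER ICONTXT, NPROW, NPCOL, MYROW, MYCOL
*
* Obtain the grid dimensions and my row and column indexes
*
      CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
*
* Only processes inside the process grid take part in the computation
*
      IF (MYROW .LT. NPROW .AND. MYCOL .LT. NPCOL) THEN
*        ... set up arrays and call Parallel ESSL subroutines ...
      ENDIF
```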
You call the BLACS_GRIDMAP routine when you want to map the processes in a specific manner into a process grid. You pass in a two-dimensional array containing the process numbers, which is mapped into your new process grid. You must specify the same input argument values in the calls to BLACS_GRIDMAP on every process.
Language | Call Statement |
---|---|
Fortran | CALL BLACS_GRIDMAP (icontxt, usermap, ldumap, nprow, npcol) |
C | blacs_gridmap (&icontxt, usermap, &ldumap, &nprow, &npcol); |
C++ | extern "FORTRAN" void blacs_gridmap(const int &, int *, const int &, const int &, const int &); blacs_gridmap (icontxt, usermap, ldumap, nprow, npcol); |
icontxt is the system context handle, returned by BLACS_GET, to be used in creating the BLACS context. Specified as: a fullword integer value.
usermap contains the process numbers to be mapped into the process grid. Specified as: a two-dimensional integer array of size ldumap by npcol.
ldumap is the leading dimension of the array usermap. Specified as: a fullword integer value, where: ldumap >= nprow.
nprow is the number of rows in this process grid. Specified as: a fullword integer value, where: 1 <= nprow <= p.
npcol is the number of columns in this process grid. Specified as: a fullword integer value, where: 1 <= npcol <= q.
On return, icontxt is the integer handle to the newly created BLACS context. Returned as: a fullword integer value.
To get the default system context for input to BLACS_GRIDMAP, code:

```
      CALL BLACS_GET(0, 0, icontxt)
```
For example, you can define a 2 × 2 process grid and later redefine its shape to a 1 × 4 process grid, as follows:

```
*
* Define the 2 × 2 process grid
*
      CALL BLACS_GET(0, 0, icontxt)
      CALL BLACS_GRIDMAP(icontxt, usermap, 2, 2, 2)
      .
      .
      .
*
* Redefine the shape of your 2 × 2 process grid
* to a 1 × 4 process grid
*
      CALL BLACS_GET(icontxt, 10, newcontxt)
      CALL BLACS_GRIDMAP(newcontxt, usermap, 2, 1, 4)
```
Suppose you call BLACS_GRIDMAP in your Fortran program as follows:

```
      CALL BLACS_GRIDMAP(icontxt1, USERMAP1, 5, 3, 4)
```

where array USERMAP1 contains the following integer values:

```
              |  0   1   2   3 |
              |  8   9  10  11 |
 USERMAP1  =  |  4   5   6   7 |
              |  .   .   .   . |
              |  .   .   .   . |
```

Then the processes would be mapped into a 3 by 4 process grid as follows:
P(p,q) | 0 | 1 | 2 | 3 |
---|---|---|---|---|
0 | t0 | t1 | t2 | t3 |
1 | t8 | t9 | t10 | t11 |
2 | t4 | t5 | t6 | t7 |
BLACS_GRIDMAP sets icontxt1. Use the value of icontxt1 in any subsequent calls to Parallel ESSL to use this process grid.
While the above process grid is active, another overlapping process grid can be defined. Suppose you then called BLACS_GRIDMAP in your Fortran program as follows:
```
      CALL BLACS_GRIDMAP(icontxt2, USERMAP2, 2, 2, 2)
```

where USERMAP2 contains the following integer values:

```
 USERMAP2  =  |  1   2 |
              | 10  11 |
```

Then the processes would be mapped into a 2 by 2 process grid as follows:
P(p,q) | 0 | 1 |
---|---|---|
0 | t1 | t2 |
1 | t10 | t11 |
BLACS_GRIDMAP sets icontxt2. Use the value of icontxt2 in any subsequent calls to Parallel ESSL to use this process grid.
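As a sketch of how the second map above might be built (the variable names and the choice of processes are for illustration only), every process initializes the same USERMAP2 array and then calls BLACS_GRIDMAP with identical arguments:

```
      INTEGER USERMAP2(2,2), icontxt2
*
* Map processes 1, 2, 10, and 11 into a 2 x 2 process grid
*
      USERMAP2(1,1) = 1
      USERMAP2(1,2) = 2
      USERMAP2(2,1) = 10
      USERMAP2(2,2) = 11
*
* Get the default system context, then create the grid
*
      CALL BLACS_GET(0, 0, icontxt2)
      CALL BLACS_GRIDMAP(icontxt2, USERMAP2, 2, 2, 2)
```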
Notes:
You call the BLACS_GRIDEXIT routine to release a BLACS context.
Language | Call Statement |
---|---|
Fortran | CALL BLACS_GRIDEXIT (icontxt) |
C | blacs_gridexit (&icontxt); |
C++ | extern "FORTRAN" void blacs_gridexit(const int &); blacs_gridexit (icontxt); |
icontxt is the integer handle to the BLACS context being released. Specified as: a fullword integer value, returned by BLACS_GRIDINIT or BLACS_GRIDMAP.
You call the BLACS_EXIT routine to release all the BLACS contexts and the memory allocated by the BLACS.
Language | Call Statement |
---|---|
Fortran | CALL BLACS_EXIT (continue) |
C | blacs_exit (&continue); |
C++ | extern "FORTRAN" void blacs_exit(const int &); blacs_exit (continue); |
If continue = 0, all the BLACS contexts and memory allocated by the BLACS are released. In addition, Parallel ESSL calls MPI_Finalize to exit from MPI. There can be only one call to MPI_Finalize in your program; therefore, at the end of your program, you should either call BLACS_EXIT with continue = 0 or call MPI_Finalize directly.
If continue <> 0, the BLACS contexts and memory allocated by the BLACS are released; however, you can continue using MPI. When you are finished using MPI, remember to call MPI_Finalize directly.
continue indicates whether you intend to continue using MPI after the BLACS are released. Specified as: a fullword integer value.
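The following sketch contrasts the two cases; it assumes, purely for illustration, that the second program still has MPI work to do after leaving the BLACS:

```
* Case 1: finished with both the BLACS and MPI; Parallel ESSL
* calls MPI_Finalize on your behalf
      CALL BLACS_EXIT(0)

* Case 2: release the BLACS but keep using MPI; you must call
* MPI_Finalize yourself before the program ends
      INTEGER IERR
      CALL BLACS_EXIT(1)
*     ... additional MPI work ...
      CALL MPI_FINALIZE(IERR)
```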
In Fortran 90 programs, the Parallel ESSL sparse linear algebraic equation subroutines are invoked with the CALL statement, using the features of Fortran 90--generic interfaces, optional and keyword arguments, assumed-shape arrays, and modules.
The Fortran 90 sparse linear algebraic equation subroutines require that an explicit interface be provided for each extrinsic procedure entry in the scope where it is called, using an interface block. The interface blocks for the Parallel ESSL subroutines are provided for you in the module F90SPARSE, so you do not have to code the interface blocks yourself. In the beginning of your program, before any other specification statements, you must code the statement:
use f90sparse
This gives the XL Fortran compiler access to the interface blocks. For examples of where to code this statement in your program, see "Application Program Outline for the Fortran 90 Sparse Linear Algebraic Equations and Their Utilities".
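As a small sketch (the program name is arbitrary), the statement belongs immediately after the PROGRAM statement, ahead of IMPLICIT NONE and all other specification statements:

```
      PROGRAM SPARSE_EXAMPLE
      USE F90SPARSE
      IMPLICIT NONE
!     ... declarations, BLACS initialization, and calls to the
!     sparse linear algebraic equation subroutines ...
      END PROGRAM SPARSE_EXAMPLE
```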
For further details on coding the CALL statement and other related aspects of Fortran 90 programs, see the following Fortran manuals:
Before you can call the Parallel ESSL subroutines from your C or C++ program, you must have the Parallel ESSL header file installed on your system. The Parallel ESSL header file allows you to code your function calls as described in Part 2 of this book. The Parallel ESSL header file is named pessl.h. You should check with your system support group to verify that the appropriate Parallel ESSL header file is installed.
In the beginning of your C program, before you call any of the Parallel ESSL subroutines, you must code the following statement for the Parallel ESSL header file:
#include <pessl.h>
In the beginning of your C++ program, before you call any of the Parallel ESSL subroutines, you must code the following statement for the Parallel ESSL header file:
#include <pessl.h>
For the Level 2 and 3 PBLAS, dense and banded linear algebraic equations, and eigensystem analysis and singular value analysis subroutines, this application program outline shows how you can use the BLACS to define a process grid, set up a Type-1 array descriptor, call a Parallel ESSL subroutine, and exit the BLACS. For a complete example, see Appendix B. "Sample Programs".
```
      .
      .
      .
*
* Determine my process number and the total number of available
* processes
*
      CALL BLACS_PINFO(IAM, NNODES)
*
* Define a process grid that is as close to square as possible
*
      NPROW = INT(SQRT(REAL(NNODES)))
      NPCOL = NNODES/NPROW
*
* Get the default system context
* Define the process grid
* Determine my process row and column index
*
      CALL BLACS_GET(0, 0, ICONTXT)
      CALL BLACS_GRIDINIT(ICONTXT, 'R', NPROW, NPCOL)
      CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
*
* Only call the Parallel ESSL subroutine if I am in the process grid
*
      IF (MYROW .LT. NPROW .AND. MYCOL .LT. NPCOL) THEN
*
* Setup input arrays, scalars, array descriptors, etc.
*
        .
        .
*
* NUMROC can be used to return the size of local arrays
* For example, here is one way to setup the descriptor vector for A
*
        DESC_A(1) = DTYPE_A
        DESC_A(2) = ICONTXT
        DESC_A(3) = M_A
        DESC_A(4) = N_A
        DESC_A(5) = MB_A
        DESC_A(6) = NB_A
        DESC_A(7) = RSRC_A
        DESC_A(8) = CSRC_A
        DESC_A(9) = MAX(1, NUMROC(DESC_A(3), DESC_A(5), MYROW, DESC_A(7), NPROW))
        .
        .
        .
*
* CALL Parallel ESSL subroutine
*
        CALL PDTRAN(M, N, ALPHA, A, IA, JA, DESC_A, BETA, C, IC, JC, DESC_C)
*
* Process output arrays, scalars etc.
*
        .
        .
        .
*
* When finished with this process grid, release the process grid.
*
        CALL BLACS_GRIDEXIT(ICONTXT)
        .
      ENDIF
      .
      .
      .
*
* At the end of the program, exit from the BLACS and MPI
*
      CALL BLACS_EXIT(0)
      .
      .
      .
      END
```
The following is an outline for an application program that calls the Fortran 90 sparse linear algebraic equation subroutines and their utilities. For a more complete example, see Example--Using the Fortran 90 Sparse Subroutines or "Fortran 90 Sample Sparse Program".
```
      USE F90SPARSE
      .
      .
      .
!User-defined subroutine
      INTERFACE PARTS
        SUBROUTINE PARTS(...)
          INTEGER GLOBAL_INDEX, N, NP
          INTEGER NV
          INTEGER PV(*)
        END SUBROUTINE PARTS
      END INTERFACE
      .
      .
      .
!Define the process grid
      CALL BLACS_GET (...)
      CALL BLACS_GRIDINIT(...)
      CALL BLACS_GRIDINFO(...)
      .
      .
      .
!Allocate space for and initialize array descriptor desc_a.
      CALL PADALL(...)
!Allocate space and initialize some values
!for sparse matrix A.
      CALL PSPALL(...)
!Allocate and build vectors b and x.
      CALL PGEALL(...)
!Build the sparse matrix A with multiple calls to PSPINS.
!Each process has to call PSPINS as many times as
!necessary to insert the local rows it owns.
!Update array descriptor desc_a.
      do
        CALL PSPINS(...)
      enddo
!Build vectors b and x with multiple calls to PGEINS.
!Each process has to call PGEINS as many times as
!necessary to insert the local elements it owns.
      do
        CALL PGEINS(...)
      enddo
!Finalize the sparse matrix A and array descriptor desc_a
      CALL PSPASB(...)
!Finalize the vectors b and x.
!Matrix A and array descriptor desc_a
!must be finalized before calling PGEASB.
      CALL PGEASB(...)
!Prepare preconditioner
      CALL PSPGPR(...)
!Call solver
      CALL PSPGIS(...)
!Cleanup and exit.
!Deallocate vectors b and x
!Deallocate matrix A and the preconditioner data structure PRC
      CALL PGEFREE(...)
      CALL PSPFREE(...)
!Deallocate the array descriptor desc_a only after
!vectors b and x, and matrix A are deallocated.
      CALL PADFREE(...)
      .
      .
      .
      CALL BLACS_GRIDEXIT(...)
      CALL BLACS_EXIT(...)
```
The following is an outline for an application program that calls the Fortran 77 sparse linear algebraic equation subroutines and their utilities. For a complete example, see Example--Using the Fortran 77 Sparse Subroutines or "Fortran 77 Sample Sparse Program".
```
      .
      .
      .
      EXTERNAL PARTS
      .
      .
      .
!Define the process grid
      CALL BLACS_GET (...)
      CALL BLACS_GRIDINIT(...)
      CALL BLACS_GRIDINFO(...)
      .
      .
      .
!Initialize array descriptor desc_a.
      CALL PADINIT(...)
!Initialize some values
!for sparse matrix A.
      CALL PDSPINIT(...)
!Build the sparse matrix A with multiple calls to PDSPINS.
!Each process has to call PDSPINS as many times as
!necessary to insert the local rows it owns.
!Update array descriptor desc_a.
      do
        CALL PDSPINS(...)
      enddo
!Build vectors b and x with multiple calls to PDGEINS.
!Each process has to call PDGEINS as many times as
!necessary to insert the local elements it owns.
      do
        CALL PDGEINS(...)
      enddo
!Finalize the sparse matrix A and array descriptor desc_a
      CALL PDSPASB(...)
!Finalize the vectors b and x.
      CALL PDGEASB(...)
!Prepare preconditioner
      CALL PDSPGPR(...)
!Call solver
      CALL PDSPGIS(...)
      .
      .
      .
      CALL BLACS_GRIDEXIT(...)
      CALL BLACS_EXIT(...)
```