Guide and Reference


Overview of the Banded Linear Algebraic Equation Subroutines

The banded linear algebraic equation subroutines provide solutions to linear systems of equations for real positive definite symmetric band matrices, real general tridiagonal matrices, diagonally-dominant real general tridiagonal matrices, and real positive definite symmetric tridiagonal matrices.

Table 120. List of Banded Linear Algebraic Equation Subroutines
Descriptive Name                                                        Long-Precision Subroutine  Section
Positive Definite Symmetric Band Matrix Factorization and Solve         PBSV                       PBSV--Positive Definite Symmetric Band Matrix Factorization and Solve
Positive Definite Symmetric Band Matrix Factorization                   PBTRF                      PBTRF--Positive Definite Symmetric Band Matrix Factorization
Positive Definite Symmetric Band Matrix Solve                           PBTRS                      PBTRS--Positive Definite Symmetric Band Matrix Solve
General Tridiagonal Matrix Factorization and Solve                      GTSV                       GTSV and DTSV--General Tridiagonal Matrix Factorization and Solve
General Tridiagonal Matrix Factorization                                GTTRF                      GTTRF and DTTRF--General Tridiagonal Matrix Factorization
General Tridiagonal Matrix Solve                                        GTTRS                      GTTRS and DTTRS--General Tridiagonal Matrix Solve
Diagonally-Dominant General Tridiagonal Matrix Factorization and Solve  DTSV                       GTSV and DTSV--General Tridiagonal Matrix Factorization and Solve
Diagonally-Dominant General Tridiagonal Matrix Factorization            DTTRF                      GTTRF and DTTRF--General Tridiagonal Matrix Factorization
Diagonally-Dominant General Tridiagonal Matrix Solve                    DTTRS                      GTTRS and DTTRS--General Tridiagonal Matrix Solve
Positive Definite Symmetric Tridiagonal Matrix Factorization and Solve  PTSV                       PTSV--Positive Definite Symmetric Tridiagonal Matrix Factorization and Solve
Positive Definite Symmetric Tridiagonal Matrix Factorization            PTTRF                      PTTRF--Positive Definite Symmetric Tridiagonal Matrix Factorization
Positive Definite Symmetric Tridiagonal Matrix Solve                    PTTRS                      PTTRS--Positive Definite Symmetric Tridiagonal Matrix Solve

Dense Linear Algebraic Equation Subroutines

This section contains the dense linear algebraic equation subroutine descriptions.

GETRF--General Matrix Factorization

This subroutine factors general matrix A using Gaussian elimination with partial pivoting to compute the LU factorization of A. The pivoting information is returned in ipiv. In this description:

A is the rectangular general matrix to be factored.
ipiv is the vector containing the pivoting information.
L is a lower triangular matrix.
U is an upper triangular matrix.

On output, the transformed matrix A contains U in the upper triangle (if size(a,1) >= size(a,2)) or upper trapezoid (if size(a,1) < size(a,2)). In its strict lower triangle (if size(a,1) <= size(a,2)) or lower trapezoid (if size(a,1) > size(a,2)), it contains the multipliers necessary to construct, with the help of ipiv, a matrix L, such that A = LU.
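
The packed storage described above can be illustrated with a short sketch. The following is plain Python, not Parallel ESSL or HPF code. The function name lu_factor and the list-of-lists matrix layout are assumptions made for illustration, and the sketch handles only square matrices on a single process. It returns the packed factors, a 1-based ipiv, and an info value that follows the convention given for the info argument under 'On Return'.

```python
def lu_factor(a):
    """LU-factor a square matrix (list of row lists) with partial pivoting.

    Returns (lu, ipiv, info). lu holds U on and above the diagonal and the
    multipliers of the unit lower triangular L below it. ipiv[k] is the
    1-based row swapped with row k+1 at step k. info is 0, or the 1-based
    index of the first zero diagonal element of U.
    """
    lu = [list(row) for row in a]      # work on a copy
    n = len(lu)
    ipiv, info = [], 0
    for k in range(n):
        # Partial pivoting: pick the largest magnitude in column k.
        p = max(range(k, n), key=lambda i: abs(lu[i][k]))
        ipiv.append(p + 1)
        if lu[p][k] == 0:
            if info == 0:
                info = k + 1           # first zero pivot; keep factoring
            continue
        lu[k], lu[p] = lu[p], lu[k]    # swap whole rows, multipliers included
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]       # multiplier, stored in place of L
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, ipiv, info


# The 2 x 2 matrix [[4, 3], [6, 3]]: row 2 is chosen as the first pivot row.
lu, ipiv, info = lu_factor([[4.0, 3.0], [6.0, 3.0]])
print(ipiv, info)   # [2, 2] 0
```

Applying the recorded row swaps to A in order and multiplying the unpacked L and U reproduces the permuted matrix, which is one way to check such a factorization.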

To solve the system of equations with any number of right-hand sides, follow the call to this subroutine with one or more calls to GETRS.

If any of the assumed-shape arrays have a size of zero, no computation is performed, and the subroutine returns after doing some parameter checking. See references [16], [18], [22], [36], and [37].

Table 121. Data Types
A                       ipiv     Subroutine
Long-precision real     Integer  GETRF
Long-precision complex  Integer  GETRF

Syntax

HPF    CALL GETRF (a, ipiv)
       CALL GETRF (a, ipiv, info)

On Entry

a

is the general matrix A, used in the system of equations.

Type: required

Specified as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 121.

ipiv

See 'On Return'.

info

See 'On Return'.

On Return

a

is the updated general matrix A, containing the results of the factorization.

Type: required

Returned as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 121.

ipiv

is the vector ipiv, containing the pivot information necessary to construct matrix L from the information contained in the (output) transformed matrix A.

The elements of ipiv must be replicated across each element of the corresponding row of A; that is, a copy of ipiv is aligned with every column of A:

   !HPF$ ALIGN IPIV(:) WITH A(:,*)

Type: required

Returned as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 121.

info

has the following meaning, when info is present:

If info = 0, matrix A is not singular, and the factorization completed normally.

If info > 0, matrix A is singular; that is, one or more columns of L and the corresponding diagonal of U contain all zeros. All columns of L are checked. info is set equal to i, the index of the first column of L whose corresponding diagonal element of U is zero; that element is at position (i,i) in A. The factorization is completed; however, if you call GETRS with these factors, results are unpredictable.

When info is not present and matrix A is singular, the information for the above computational error is issued in an error message, and your program is terminated.

Type: optional

Returned as: a fullword integer; info >= 0.

Notes and Coding Rules

  1. The assumed-shape arrays must have the exact size required for the computation, that is: size(a,1) = size(ipiv).

    If you plan to call GETRS, then additionally size(a,1) = size(a,2).

  2. The assumed-shape arrays must have no common elements; otherwise, results are unpredictable.

  3. The A and ipiv input to GETRS must be the same as for the corresponding output arguments for GETRF.

  4. The way this subroutine handles singularity differs from ScaLAPACK. This subroutine uses the info argument to provide information about the singularity of A, like ScaLAPACK, but also provides an error message.

  5. On both input and output, matrix A conforms to ScaLAPACK format.

  6. For details on how to set up and code your HPF program using Parallel ESSL, see "Coding Your HPF Program".

  7. Block-cyclic data distribution is required for your array data. Because data directives are included in the interface module PESSL_HPF, you can specify any data distribution for your vector and matrix, and the XL HPF compiler will, if necessary, redistribute the data prior to calling this subroutine. For how to code your HPF directives, see "Distributing Data in an HPF Program". For a sample program including directives, see Figure 9.

  8. The restrictions given in "Notes and Coding Rules" also apply to this subroutine.

  9. For information about optimizing performance in this subroutine, see "Performance Considerations".
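
To make the cyclic distribution mentioned in these notes concrete, the following plain Python sketch (not HPF) maps a global matrix element to the process that owns it. The function name owner, the zero-based process coordinates, and the block-size parameters mb and nb are assumptions for illustration. A CYCLIC distribution corresponds to a block size of 1, which is the default here.

```python
def owner(i, j, nprow=2, npcol=2, mb=1, nb=1):
    """Return zero-based (process row, process column) owning element (i, j).

    i and j are 1-based global indices. Blocks of mb rows and nb columns are
    dealt out round-robin over an nprow x npcol process grid.
    """
    return ((i - 1) // mb) % nprow, ((j - 1) // nb) % npcol


# Ownership map of a 9 x 9 matrix on a 2 x 2 grid, one text line per matrix row.
for i in range(1, 10):
    print(" ".join("%d%d" % owner(i, j) for j in range(1, 10)))
```

Under this mapping, an alignment such as ALIGN IPIV(:) WITH A(:,*) would place element i of ipiv on the process row that holds row i of A, in every process column.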

Error Conditions

HPF-specific errors are listed below. All errors listed in "Error Conditions" also apply to this subroutine; however, for computational errors, if you do not specify the optional info argument, your program terminates as a result of the computational error.

Input-Argument Errors

Stage 1
  1. The rank of the ultimate align target is greater than 2 for a or ipiv.
  2. The process rank is not the same for a and ipiv.
  3. The process rank is not 1 or 2 for a or ipiv.

Stage 2

The process grid is not the same for a and ipiv.

Stage 3

The data distribution is unsupported for a.

Stage 4
  1. The row block size for a and the block size for ipiv are incompatible.
  2. The data distribution is unsupported for ipiv.

Stage 5

The shape of the assumed-shape arrays for a and ipiv is incompatible: size(a,1) <> size(ipiv)

Stage 6

The abstract process row indices for a and ipiv are incompatible.

Stage 7

The data distribution for a is unsupported.

Example 1

This example factors a 9 × 9 real general matrix. As in "Example 1", array data is block-cyclically distributed using a 2 × 2 process grid, with ipiv being replicated across each element of the corresponding row of A; that is, a copy of ipiv is aligned with every column of A.

!HPF$ PROCESSORS PROC(2,2)
!HPF$ ALIGN IPIV(:) WITH A(:,*)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: A
 
CALL GETRF( A , IPIV )
-or-
CALL GETRF( A , IPIV , INFO )

Input

General 9 × 9 matrix A:

 *                                             *
 | 1.0  1.2  1.4  1.6  1.8  2.0  2.2  2.4  2.6 |
 | 1.2  1.0  1.2  1.4  1.6  1.8  2.0  2.2  2.4 |
 | 1.4  1.2  1.0  1.2  1.4  1.6  1.8  2.0  2.2 |
 | 1.6  1.4  1.2  1.0  1.2  1.4  1.6  1.8  2.0 |
 | 1.8  1.6  1.4  1.2  1.0  1.2  1.4  1.6  1.8 |
 | 2.0  1.8  1.6  1.4  1.2  1.0  1.2  1.4  1.6 |
 | 2.2  2.0  1.8  1.6  1.4  1.2  1.0  1.2  1.4 |
 | 2.4  2.2  2.0  1.8  1.6  1.4  1.2  1.0  1.2 |
 | 2.6  2.4  2.2  2.0  1.8  1.6  1.4  1.2  1.0 |
 *                                             *

Output

General 9 × 9 transformed matrix A:

 *                                                     *
 | 2.6   2.4   2.2   2.0   1.8   1.6   1.4   1.2   1.0 |
 | 0.4   0.3   0.6   0.8   1.1   1.4   1.7   1.9   2.2 |
 | 0.5  -0.4   0.4   0.8   1.2   1.6   2.0   2.4   2.8 |
 | 0.5  -0.3   0.0   0.4   0.8   1.2   1.6   2.0   2.4 |
 | 0.6  -0.3   0.0   0.0   0.4   0.8   1.2   1.6   2.0 |
 | 0.7  -0.2   0.0   0.0   0.0   0.4   0.8   1.2   1.6 |
 | 0.8  -0.2   0.0   0.0   0.0   0.0   0.4   0.8   1.2 |
 | 0.8  -0.1   0.0   0.0   0.0   0.0   0.0   0.4   0.8 |
 | 0.9  -0.1   0.0   0.0   0.0   0.0   0.0   0.0   0.4 |
 *                                                     *

Vector ipiv of size 9:

 *   *
 | 9 |
 | 9 |
 | 9 |
 | 9 |
 | 9 |
 | 9 |
 | 9 |
 | 9 |
 | 9 |
 *   *

info = 0 (if info is present)

Example 2

This example factors a 9 × 9 complex matrix.

As in "Example 2", array data is block-cyclically distributed using a 2 × 2 process grid, with ipiv being replicated across each element of the corresponding row of A; that is, a copy of ipiv is aligned with every column of A.

!HPF$ PROCESSORS PROC(2,2)
!HPF$ ALIGN IPIV(:) WITH A(:,*)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: A
 
CALL GETRF( A , IPIV )
-or-
CALL GETRF( A , IPIV , INFO )

Input

General 9 × 9 matrix A:


 *                                                                                                            *
 | (2.0, 1.0)  (2.4,-1.0)  (2.8,-1.0)  (3.2,-1.0)  (3.6,-1.0)  (4.0,-1.0)  (4.4,-1.0)  (4.8,-1.0)  (5.2,-1.0) |
 | (2.4, 1.0)  (2.0, 1.0)  (2.4,-1.0)  (2.8,-1.0)  (3.2,-1.0)  (3.6,-1.0)  (4.0,-1.0)  (4.4,-1.0)  (4.8,-1.0) |
 | (2.8, 1.0)  (2.4, 1.0)  (2.0, 1.0)  (2.4,-1.0)  (2.8,-1.0)  (3.2,-1.0)  (3.6,-1.0)  (4.0,-1.0)  (4.4,-1.0) |
 | (3.2, 1.0)  (2.8, 1.0)  (2.4, 1.0)  (2.0, 1.0)  (2.4,-1.0)  (2.8,-1.0)  (3.2,-1.0)  (3.6,-1.0)  (4.0,-1.0) |
 | (3.6, 1.0)  (3.2, 1.0)  (2.8, 1.0)  (2.4, 1.0)  (2.0, 1.0)  (2.4,-1.0)  (2.8,-1.0)  (3.2,-1.0)  (3.6,-1.0) |
 | (4.0, 1.0)  (3.6, 1.0)  (3.2, 1.0)  (2.8, 1.0)  (2.4, 1.0)  (2.0, 1.0)  (2.4,-1.0)  (2.8,-1.0)  (3.2,-1.0) |
 | (4.4, 1.0)  (4.0, 1.0)  (3.6, 1.0)  (3.2, 1.0)  (2.8, 1.0)  (2.4, 1.0)  (2.0, 1.0)  (2.4,-1.0)  (2.8,-1.0) |
 | (4.8, 1.0)  (4.4, 1.0)  (4.0, 1.0)  (3.6, 1.0)  (3.2, 1.0)  (2.8, 1.0)  (2.4, 1.0)  (2.0, 1.0)  (2.4,-1.0) |
 | (5.2, 1.0)  (4.8, 1.0)  (4.4, 1.0)  (4.0, 1.0)  (3.6, 1.0)  (3.2, 1.0)  (2.8, 1.0)  (2.4, 1.0)  (2.0, 1.0) |
 *                                                                                                            *

Output

General 9 × 9 transformed matrix A:


 *                                                                                                               *
 | (5.2, 1.0)  (4.8, 1.0)   (4.4, 1.0)   (4.0, 1.0)   (3.6, 1.0)  (3.2, 1.0)  (2.8, 1.0)  (2.4, 1.0)  (2.0, 1.0) |
 | (0.4, 0.1)  (0.6,-2.0)   (1.1,-1.9)   (1.7,-1.9)   (2.3,-1.8)  (2.8,-1.8)  (3.4,-1.7)  (3.9,-1.7)  (4.5,-1.6) |
 | (0.5, 0.1)  (0.0,-0.1)   (0.6,-1.9)   (1.2,-1.8)   (1.8,-1.7)  (2.5,-1.6)  (3.1,-1.5)  (3.7,-1.4)  (4.3,-1.3) |
 | (0.6, 0.1)  (0.0,-0.1)  (-0.1,-0.1)   (0.7,-1.9)   (1.3,-1.7)  (2.0,-1.6)  (2.7,-1.5)  (3.4,-1.4)  (4.0,-1.2) |
 | (0.6, 0.1)  (0.0,-0.1)  (-0.1,-0.1)  (-0.1, 0.0)   (0.7,-1.9)  (1.5,-1.7)  (2.2,-1.6)  (2.9,-1.5)  (3.7,-1.3) |
 | (0.7, 0.1)  (0.0,-0.1)   (0.0, 0.0)  (-0.1, 0.0)  (-0.1, 0.0)  (0.8,-1.9)  (1.6,-1.8)  (2.4,-1.6)  (3.2,-1.5) |
 | (0.8, 0.0)  (0.0, 0.0)   (0.0, 0.0)   (0.0, 0.0)   (0.0, 0.0)  (0.0, 0.0)  (0.8,-1.9)  (1.7,-1.8)  (2.5,-1.8) |
 | (0.9, 0.0)  (0.0, 0.0)   (0.0, 0.0)   (0.0, 0.0)   (0.0, 0.0)  (0.0, 0.0)  (0.0, 0.0)  (0.8,-2.0)  (1.7,-1.9) |
 | (0.9, 0.0)  (0.0, 0.0)   (0.0, 0.0)   (0.0, 0.0)   (0.0, 0.0)  (0.0, 0.0)  (0.0, 0.0)  (0.0, 0.0)  (0.8,-2.0) |
 *                                                                                                               *

Vector ipiv of size 9:

 *   *
 | 9 |
 | 9 |
 | 9 |
 | 9 |
 | 9 |
 | 9 |
 | 9 |
 | 9 |
 | 9 |
 *   *

info = 0 (if info is present)

GETRS--General Matrix Solve

GETRS solves one of the following systems of equations for multiple right-hand sides:

1. AX = B
2. A^T X = B
3. A^H X = B

In the formulas above:

A is the square general matrix containing the LU factorization.
B is the general matrix containing the right-hand sides in its columns.
X is the general matrix containing the solution vectors in its columns; it is returned in B.

This subroutine uses the results of the factorization of matrix A, produced by a preceding call to GETRF. On input, the transformed matrix A consists of the upper triangular matrix U and the multipliers necessary to construct L using ipiv. For details on the factorization, see GETRF--General Matrix Factorization.
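
As a rough illustration of how packed factors and ipiv can be used for the three systems above, here is a single-process sketch in plain Python. It is not the Parallel ESSL implementation; the function name lu_solve and the list-based data layout are assumptions. It relies on the relation PA = LU, where P applies the row swaps recorded in ipiv in order, and handles one right-hand side at a time.

```python
def lu_solve(lu, ipiv, b, transa="N"):
    """Solve A x = b, A^T x = b, or A^H x = b from packed LU factors.

    lu holds U on and above the diagonal and the multipliers of the unit
    lower triangular L below it; ipiv holds 1-based pivot rows.
    """
    t = transa.upper()                 # lowercase letters are accepted
    if t not in ("N", "T", "C"):
        raise ValueError("transa must be 'N', 'T', or 'C'")
    n = len(lu)
    x = list(b)
    if t == "N":
        for k in range(n):             # apply the row swaps: x = P b
            p = ipiv[k] - 1
            x[k], x[p] = x[p], x[k]
        for i in range(n):             # forward substitution with unit L
            for j in range(i):
                x[i] -= lu[i][j] * x[j]
        for i in reversed(range(n)):   # back substitution with U
            for j in range(i + 1, n):
                x[i] -= lu[i][j] * x[j]
            x[i] /= lu[i][i]
        return x
    # For real data, conjugation changes nothing, so 'C' behaves like 'T'.
    conj = (lambda z: z.conjugate()) if t == "C" else (lambda z: z)
    for i in range(n):                 # forward substitution with U^T or U^H
        for j in range(i):
            x[i] -= conj(lu[j][i]) * x[j]
        x[i] /= conj(lu[i][i])
    for i in reversed(range(n)):       # back substitution with unit L^T or L^H
        for j in range(i + 1, n):
            x[i] -= conj(lu[j][i]) * x[j]
    for k in reversed(range(n)):       # undo the row swaps in reverse order
        p = ipiv[k] - 1
        x[k], x[p] = x[p], x[k]
    return x


# Packed factors of [[4, 3], [6, 3]] with pivot vector [2, 2].
lu = [[6.0, 3.0], [2.0 / 3.0, 1.0]]
ipiv = [2, 2]
print(lu_solve(lu, ipiv, [10.0, 12.0], "N"))   # close to [1.0, 2.0]
print(lu_solve(lu, ipiv, [16.0, 9.0], "t"))    # close to [1.0, 2.0]
```

Solving for several right-hand sides amounts to repeating the call for each column of B with the same factors.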

If any of the assumed-shape arrays have a size of zero, no computation is performed, and the subroutine returns after doing some parameter checking. See references [16], [18], [22], [36], and [37].

Table 122. Data Types
A, B                    ipiv     Subroutine
Long-precision real     Integer  GETRS
Long-precision complex  Integer  GETRS

Syntax

HPF    CALL GETRS (a, ipiv, b)
       CALL GETRS (a, ipiv, b, transa, info)

On Entry

a

is the general matrix A, containing the factorization of matrix A produced by a preceding call to GETRF.

Type: required

Specified as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 122, where size(a,1) = size(a,2).

ipiv

is the vector ipiv, containing the pivoting indices produced on a preceding call to GETRF.

The elements of ipiv must be replicated across each element of the corresponding row of A; that is, a copy of ipiv is aligned with every column of A:

   !HPF$ ALIGN IPIV(:) WITH A(:,*)

Type: required

Specified as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 122.

b

is the general matrix B, containing the right-hand sides of the system.

Type: required

Specified as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 122.

transa

indicates the form of matrix A to use in the computation, where:

If transa = 'N', A is used in the computation, resulting in solution 1.

If transa = 'T', A^T is used in the computation, resulting in solution 2.

If transa = 'C', A^H is used in the computation, resulting in solution 3.

Type: optional

Default: transa = 'N'

Specified as: a single character; transa = 'N', 'T', or 'C'.

info

See 'On Return'.

On Return

b

is the updated matrix B, containing the solution vectors.

Type: required

Returned as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 122.

info

indicates that a successful computation occurred.

Type: optional

Returned as: a fullword integer; info = 0.

Notes and Coding Rules

  1. The assumed-shape arrays must have the exact size required for the computation, that is: size(a,1) = size(a,2) = size(b,1) = size(ipiv).

  2. This subroutine accepts lowercase letters for the transa argument.

  3. When using real data, if you specify 'C' for the transa argument, it is interpreted as though you specified 'T'.

  4. The assumed-shape arrays must have no common elements; otherwise, results are unpredictable.

  5. The A and ipiv input to GETRS must be the same as for the corresponding output arguments for GETRF.

  6. On both input and output, matrices A and B conform to ScaLAPACK format.

  7. For details on how to set up and code your HPF program using Parallel ESSL, see "Coding Your HPF Program".

  8. Block-cyclic data distribution is required for your array data. Because data directives are included in the interface module PESSL_HPF, you can specify any data distribution for your vector and matrices, and the XL HPF compiler will, if necessary, redistribute the data prior to calling this subroutine. For how to code your HPF directives, see "Distributing Data in an HPF Program". For a sample program including directives, see Figure 9.

  9. The restrictions given in "Notes and Coding Rules" also apply to this subroutine.

Error Conditions

HPF-specific errors are listed below. Resource and input-argument errors listed in "Error Conditions" also apply to this subroutine.

Computational Errors

None
Note: If the factorization performed by GETRF failed because of a singular matrix A, the results returned by GETRS are unpredictable. For details, see the info output argument for GETRF.

Input-Argument Errors

Stage 1
  1. The rank of the ultimate align target is greater than 2 for a, b, or ipiv.
  2. The process rank is not the same for a, b, and ipiv.
  3. The process rank is not 1 or 2 for a, b, or ipiv.

Stage 2

The process grid is not the same for a, b, and ipiv.

Stage 3

The data distribution is unsupported for a and b.

Stage 4
  1. The row block sizes for a and b and the block size for ipiv are incompatible.
  2. The data distribution is unsupported for ipiv.

Stage 5

  1. The shape of the assumed-shape arrays for a, b, and ipiv is incompatible:
    size(a,1) <> size(a,2) or
    size(a,1) <> size(b,1) or
    size(a,1) <> size(ipiv)

  2. The shape of the assumed-shape array for a is invalid: size(a,1) <> size(a,2)

Stage 6

The abstract process row indices for a, b, and ipiv are incompatible.

Stage 7

The data distribution for a or b is unsupported.

Example 1

This example solves the real system AX = B with 5 right-hand sides. The input ipiv vector and transformed matrix A are the output from "Example 1".

An array section is specified for argument b, resulting in the computation using a submatrix B starting at row 1 and column 2 in the array.

As in "Example 1", array data is block-cyclically distributed using a 2 × 2 process grid, with ipiv being replicated across each element of the corresponding row of A; that is, a copy of ipiv is aligned with every column of A.

!HPF$ PROCESSORS PROC(2,2)
!HPF$ ALIGN IPIV(:) WITH A(:,*)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: A, B
 
CALL GETRS( A , IPIV , B(1:9,2:6) )
-or-
CALL GETRS( A , IPIV , B(1:9,2:6) , 'N' , INFO )

Input

Only a portion of the data structure is used--that is, submatrix B. Following is the 9 × 5 submatrix B, starting at row 1 and column 2 in the 9 × 6 array:

 *                                       *
 |  .   93.0  186.0  279.0  372.0  465.0 |
 |  .   84.4  168.8  253.2  337.6  422.0 |
 |  .   76.6  153.2  229.8  306.4  383.0 |
 |  .   70.0  140.0  210.0  280.0  350.0 |
 |  .   65.0  130.0  195.0  260.0  325.0 |
 |  .   62.0  124.0  186.0  248.0  310.0 |
 |  .   61.4  122.8  184.2  245.6  307.0 |
 |  .   63.6  127.2  190.8  254.4  318.0 |
 |  .   69.0  138.0  207.0  276.0  345.0 |
 *                                       *

Output

Only a portion of the data structure is used--that is, submatrix B. Following is the 9 × 5 submatrix B, starting at row 1 and column 2 in the 9 × 6 array:

 *                                   *
 |  .    1.0   2.0   3.0   4.0   5.0 |
 |  .    2.0   4.0   6.0   8.0  10.0 |
 |  .    3.0   6.0   9.0  12.0  15.0 |
 |  .    4.0   8.0  12.0  16.0  20.0 |
 |  .    5.0  10.0  15.0  20.0  25.0 |
 |  .    6.0  12.0  18.0  24.0  30.0 |
 |  .    7.0  14.0  21.0  28.0  35.0 |
 |  .    8.0  16.0  24.0  32.0  40.0 |
 |  .    9.0  18.0  27.0  36.0  45.0 |
 *                                   *

info = 0 (if info is present)
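
The input and output of this example can be cross-checked by multiplying the original matrix A by a solution column. The following plain Python sketch (an illustration only, not part of Parallel ESSL) rebuilds A from the pattern visible in the 9 × 9 real input matrix of the GETRF example, a(i,j) = 1.0 + 0.2|i-j|, and compares A times the first solution column with the first right-hand side printed above.

```python
n = 9
# Matrix A as printed in the GETRF example: 1.0 on the diagonal, rising by
# 0.2 for each step away from it.
a = [[1.0 + 0.2 * abs(i - j) for j in range(n)] for i in range(n)]
x = [float(i + 1) for i in range(n)]      # first solution column: 1.0 .. 9.0
b_expected = [93.0, 84.4, 76.6, 70.0, 65.0, 62.0, 61.4, 63.6, 69.0]

# Matrix-vector product A * x, one row at a time.
b = [sum(a[i][j] * x[j] for j in range(n)) for i in range(n)]
max_err = max(abs(bi - ei) for bi, ei in zip(b, b_expected))
print("largest difference from the printed right-hand side:", max_err)
```

The remaining columns follow by linearity: column k of both B and X is k times the first column.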

Example 2

This example solves the complex system AX = B with 5 right-hand sides. The input ipiv vector and transformed matrix A are the output from "Example 2".

An array section is specified for argument b, resulting in the computation using a submatrix B starting at row 1 and column 2 in the array.

As in "Example 2", array data is block-cyclically distributed using a 2 × 2 process grid, with ipiv being replicated across each element of the corresponding row of A; that is, a copy of ipiv is aligned with every column of A.

!HPF$ PROCESSORS PROC(2,2)
!HPF$ ALIGN IPIV(:) WITH A(:,*)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: A, B
 
CALL GETRS( A , IPIV , B(1:9,2:6) )
-or-
CALL GETRS( A , IPIV , B(1:9,2:6) , 'N' , INFO )

Input

Only a portion of the data structure is used--that is, submatrix B. Following is the 9 × 5 submatrix B, starting at row 1 and column 2 in the 9 × 6 array:


 *                                                                                 *
 |   .   (193.0,-10.6)  (200.0, 21.8)  (207.0, 54.2)  (214.0, 86.6)  (221.0,119.0) |
 |   .   (173.8, -9.4)  (178.8, 20.2)  (183.8, 49.8)  (188.8, 79.4)  (193.8,109.0) |
 |   .   (156.2, -5.4)  (159.2, 22.2)  (162.2, 49.8)  (165.2, 77.4)  (168.2,105.0) |
 |   .   (141.0,  1.4)  (142.0, 27.8)  (143.0, 54.2)  (144.0, 80.6)  (145.0,107.0) |
 |   .   (129.0, 11.0)  (128.0, 37.0)  (127.0, 63.0)  (126.0, 89.0)  (125.0,115.0) |
 |   .   (121.0, 23.4)  (118.0, 49.8)  (115.0, 76.2)  (112.0,102.6)  (109.0,129.0) |
 |   .   (117.8, 38.6)  (112.8, 66.2)  (107.8, 93.8)  (102.8,121.4)   (97.8,149.0) |
 |   .   (120.2, 56.6)  (113.2, 86.2)  (106.2,115.8)   (99.2,145.4)   (92.2,175.0) |
 |   .   (129.0, 77.4)  (120.0,109.8)  (111.0,142.2)  (102.0,174.6)   (93.0,207.0) |
 *                                                                                 *

Output

Only a portion of the data structure is used--that is, submatrix B. Following is the 9 × 5 submatrix B, starting at row 1 and column 2 in the 9 × 6 array:

 *                                                                  *
 |   .   (1.0, 1.0)  (1.0, 2.0)  (1.0, 3.0)  (1.0, 4.0)  (1.0, 5.0) |
 |   .   (2.0, 1.0)  (2.0, 2.0)  (2.0, 3.0)  (2.0, 4.0)  (2.0, 5.0) |
 |   .   (3.0, 1.0)  (3.0, 2.0)  (3.0, 3.0)  (3.0, 4.0)  (3.0, 5.0) |
 |   .   (4.0, 1.0)  (4.0, 2.0)  (4.0, 3.0)  (4.0, 4.0)  (4.0, 5.0) |
 |   .   (5.0, 1.0)  (5.0, 2.0)  (5.0, 3.0)  (5.0, 4.0)  (5.0, 5.0) |
 |   .   (6.0, 1.0)  (6.0, 2.0)  (6.0, 3.0)  (6.0, 4.0)  (6.0, 5.0) |
 |   .   (7.0, 1.0)  (7.0, 2.0)  (7.0, 3.0)  (7.0, 4.0)  (7.0, 5.0) |
 |   .   (8.0, 1.0)  (8.0, 2.0)  (8.0, 3.0)  (8.0, 4.0)  (8.0, 5.0) |
 |   .   (9.0, 1.0)  (9.0, 2.0)  (9.0, 3.0)  (9.0, 4.0)  (9.0, 5.0) |
 *                                                                  *

info = 0 (if info is present)

POTRF--Positive Definite Real Symmetric or Complex Hermitian Matrix Factorization

This subroutine uses Cholesky factorization. It factors a positive definite real symmetric matrix A into one of the following forms:

A = LL^T if the lower triangle of A is used (uplo = 'L').
A = U^T U if the upper triangle of A is used (uplo = 'U').

It factors a positive definite complex Hermitian matrix A into one of the following forms:

A = LL^H if the lower triangle of A is used (uplo = 'L').
A = U^H U if the upper triangle of A is used (uplo = 'U').

In the formulas above:

A is the positive definite real symmetric or complex Hermitian matrix to be factored.
L is a lower triangular matrix.
U is an upper triangular matrix.

To solve the system of equations with any number of right-hand sides, follow the call to this subroutine with one or more calls to POTRS.

If the assumed-shape array has a size of zero, no computation is performed, and the subroutine returns after doing some parameter checking. See references [16], [18], [22], [36], and [37].
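
For readers who want to see the factorization spelled out, here is a minimal single-process sketch in plain Python for the real symmetric case with the lower triangle stored. It is an illustration under stated assumptions, not the Parallel ESSL code: the name cholesky_lower and the list-of-lists layout are hypothetical, and the complex Hermitian case and uplo = 'U' are not covered. The returned info value follows the convention described for the info argument under 'On Return'.

```python
import math


def cholesky_lower(a):
    """Factor a real symmetric matrix as L * L^T, reading only its lower triangle.

    Returns (l, info). info is 0 on success; otherwise it is the 1-based
    order k of the first leading minor that is not positive definite, and
    l holds the partial factor computed so far.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for k in range(n):
        d = a[k][k] - sum(l[k][j] ** 2 for j in range(k))
        if d <= 0.0:
            return l, k + 1            # not positive definite at order k
        l[k][k] = math.sqrt(d)
        for i in range(k + 1, n):
            s = sum(l[i][j] * l[k][j] for j in range(k))
            l[i][k] = (a[i][k] - s) / l[k][k]
    return l, 0


# Lower triangle of the 9 x 9 matrix whose row i holds 1.0, 2.0, ..., i.
a = [[float(j + 1) if j <= i else 0.0 for j in range(9)] for i in range(9)]
l, info = cholesky_lower(a)
print(info, l[8])   # 0 and a last row of nine 1.0 values
```

For this input every entry of the lower triangle of L is 1.0, because the matrix with elements min(i,j) equals L times its transpose when L is the lower triangle of ones.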

Table 123. Data Types
A                       Subroutine
Long-precision real     POTRF
Long-precision complex  POTRF

Syntax

HPF    CALL POTRF (a, uplo)
       CALL POTRF (a, uplo, info)

On Entry

a

is the real symmetric or complex Hermitian matrix A, used in the system of equations, where:

If uplo = 'U', the array contains the upper triangle of the symmetric matrix A in its upper triangle, and its strictly lower triangular part is not referenced.

If uplo = 'L', the array contains the lower triangle of the symmetric matrix A in its lower triangle, and its strictly upper triangular part is not referenced.

Type: required

Specified as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 123, where size(a,1) = size(a,2).

uplo

indicates whether the upper or lower triangular part of the real symmetric or complex Hermitian submatrix A is referenced, where:

If uplo = 'U', the upper triangular part is referenced.

If uplo = 'L', the lower triangular part is referenced.

Type: required

Specified as: a single character; uplo = 'U' or 'L'.

info

See 'On Return'.

On Return

a

is the updated matrix A, containing the results of the factorization.

Type: required

Returned as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 123.

info

has the following meaning, when info is present:

If info = 0, real symmetric or complex Hermitian matrix A is positive definite, and the factorization completed normally.

If info > 0, the leading minor of order k of the real symmetric or complex Hermitian matrix A is not positive definite. info is set equal to k, where the leading minor was encountered at position (k,k) in A. The factorization is not completed. A is overwritten with the partial factors.

When info is not present and matrix A is not positive definite, the information for the above computational error is issued in an error message, and your program is terminated.

Type: optional

Returned as: a fullword integer; info >= 0.

Notes and Coding Rules

  1. The assumed-shape array must have the exact size required for the computation, that is: size(a,1) = size(a,2).

  2. This subroutine accepts lowercase letters for the uplo argument.

  3. The imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values. On output, they are set to zero.

  4. The A input to POTRS must be the same as for the corresponding output argument for POTRF.

  5. The way this subroutine handles nonpositive definiteness differs from ScaLAPACK. This subroutine uses the info argument to provide information about the nonpositive definiteness of A, like ScaLAPACK, but also provides an error message.

  6. On both input and output, matrix A conforms to ScaLAPACK format.

  7. For details on how to set up and code your HPF program using Parallel ESSL, see "Coding Your HPF Program".

  8. Block-cyclic data distribution is required for your array data. Because data directives are included in the interface module PESSL_HPF, you can specify any data distribution for your matrix, and the XL HPF compiler will, if necessary, redistribute the data prior to calling this subroutine. For how to code your HPF directives, see "Distributing Data in an HPF Program". For a sample program including directives, see Figure 9.

  9. The restrictions given in "Notes and Coding Rules" also apply to this subroutine.

  10. For information about optimizing performance in this subroutine, see "Performance Considerations".

Error Conditions

HPF-specific errors are listed below. All errors listed in "Error Conditions" also apply to this subroutine; however, for computational errors, if you do not specify the optional info argument, your program terminates as a result of the computational error.

Input-Argument Errors

Stage 1
  1. The rank of the ultimate align target is greater than 2 for a.
  2. The process rank is not 1 or 2 for a.

Stage 2

The data distribution is inconsistent for a.

Stage 3

The shape of the assumed-shape array for a is invalid: size(a,1) <> size(a,2)

Stage 4

The data distribution for a is unsupported.

Example 1

This example factors a 9 × 9 positive definite real symmetric matrix. As in "Example 1", array data is block-cyclically distributed using a 2 × 2 process grid.

!HPF$ PROCESSORS PROC(2,2)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: A
 
CALL POTRF(  A , 'L' )
-or-
CALL POTRF(  A , 'L' , INFO )

Input

Real symmetric matrix A of order 9:

 *                                             *
 | 1.0   .    .    .    .    .    .    .    .  |
 | 1.0  2.0   .    .    .    .    .    .    .  |
 | 1.0  2.0  3.0   .    .    .    .    .    .  |
 | 1.0  2.0  3.0  4.0   .    .    .    .    .  |
 | 1.0  2.0  3.0  4.0  5.0   .    .    .    .  |
 | 1.0  2.0  3.0  4.0  5.0  6.0   .    .    .  |
 | 1.0  2.0  3.0  4.0  5.0  6.0  7.0   .    .  |
 | 1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0   .  |
 | 1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0  9.0 |
 *                                             *

Output

Real symmetric matrix A of order 9:

 *                                             *
 | 1.0   .    .    .    .    .    .    .    .  |
 | 1.0  1.0   .    .    .    .    .    .    .  |
 | 1.0  1.0  1.0   .    .    .    .    .    .  |
 | 1.0  1.0  1.0  1.0   .    .    .    .    .  |
 | 1.0  1.0  1.0  1.0  1.0   .    .    .    .  |
 | 1.0  1.0  1.0  1.0  1.0  1.0   .    .    .  |
 | 1.0  1.0  1.0  1.0  1.0  1.0  1.0   .    .  |
 | 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0   .  |
 | 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0 |
 *                                             *

info = 0 (if info is present)

Example 2

This example factors a 9 × 9 positive definite complex Hermitian matrix.

As in "Example 2", array data is block-cyclically distributed using a 2 × 2 process grid.

!HPF$ PROCESSORS PROC(2,2)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: A
 
CALL POTRF(  A , 'L' )
-or-
CALL POTRF(  A , 'L' , INFO )

Input

Complex Hermitian matrix A of order 9:


 *                                                                                                                    *
 | (18.0,  . )       .            .            .            .            .            .            .            .     |
 |  (1.0, 1.0)  (18.0,  . )       .            .            .            .            .            .            .     |
 |  (1.0, 1.0)   (3.0, 1.0)  (18.0,  . )       .            .            .            .            .            .     |
 |  (1.0, 1.0)   (3.0, 1.0)   (5.0, 1.0)  (18.0,  . )       .            .            .            .            .     |
 |  (1.0, 1.0)   (3.0, 1.0)   (5.0, 1.0)   (7.0, 1.0)  (18.0,  . )       .            .            .            .     |
 |  (1.0, 1.0)   (3.0, 1.0)   (5.0, 1.0)   (7.0, 1.0)   (9.0, 1.0)  (18.0,  . )       .            .            .     |
 |  (1.0, 1.0)   (3.0, 1.0)   (5.0, 1.0)   (7.0, 1.0)   (9.0, 1.0)  (11.0, 1.0)  (18.0,  . )       .            .     |
 |  (1.0, 1.0)   (3.0, 1.0)   (5.0, 1.0)   (7.0, 1.0)   (9.0, 1.0)  (11.0, 1.0)  (13.0, 1.0)  (18.0,  . )       .     |
 |  (1.0, 1.0)   (3.0, 1.0)   (5.0, 1.0)   (7.0, 1.0)   (9.0, 1.0)  (11.0, 1.0)  (13.0, 1.0)  (15.0, 1.0)  (18.0, . ) |
 *                                                                                                                    *

Note: On input, the imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values.

Output

Complex Hermitian matrix A of order 9:


 *                                                                                                                       *
 |    (4.2, 0.0)        .            .           .            .            .            .            .            .      |
 |  (0.24, 0.24)    (4.2, 0.0)       .           .            .            .            .            .            .      |
 |  (0.24, 0.24)  (0.68, 0.24)   (4.2, 0.0)      .            .            .            .            .            .      |
 |  (0.24, 0.24)  (0.68, 0.24)  (1.1, 0.24)  (4.0, 0.0)       .            .            .            .            .      |
 |  (0.24, 0.24)  (0.68, 0.24)  (1.1, 0.24) (1.3, 0.25)   (3.8, 0.0)       .            .            .            .      |
 |  (0.24, 0.24)  (0.68, 0.24)  (1.1, 0.24) (1.3, 0.25)  (1.4, 0.26)   (3.5, 0.0)       .            .            .      |
 |  (0.24, 0.24)  (0.68, 0.24)  (1.1, 0.24) (1.3, 0.25)  (1.4, 0.26)  (1.5, 0.28)   (3.2, 0.0)       .            .      |
 |  (0.24, 0.24)  (0.68, 0.24)  (1.1, 0.24) (1.3, 0.25)  (1.4, 0.26)  (1.5, 0.28)  (1.6, 0.32)   (2.7, 0.0)       .      |
 |  (0.24, 0.24)  (0.68, 0.24)  (1.1, 0.24) (1.3, 0.25)  (1.4, 0.26)  (1.5, 0.28)  (1.6, 0.32)  (1.6, 0.37)   (2.2, 0.0) |
 *                                                                                                                       *

Note: On output, the imaginary parts of the diagonal elements of the matrix are set to zero.

info = 0 (if info is present)

POTRS--Positive Definite Real Symmetric or Complex Hermitian Matrix Solve

This subroutine solves the following systems of equations for multiple right-hand sides:

AX = B

where, in the formula above:

A is the positive definite real symmetric or complex Hermitian matrix factored by Cholesky factorization.
B is the general matrix B, containing the right-hand sides in its columns.
X represents the general matrix B, containing the solution vectors in its columns.

This subroutine uses the results of the factorization of matrix A, produced by a preceding call to POTRF. For details on the factorization, see POTRF--Positive Definite Real Symmetric or Complex Hermitian Matrix Factorization.

If any of the assumed-shape arrays have a size of zero, no computation is performed and the subroutine returns after doing some parameter checking. See references [16], [18], [22], [36], and [37].
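
The two triangular solves that follow a Cholesky factorization can be sketched in plain Python. This is only an illustration of the mathematics for the real symmetric case, not the Parallel ESSL implementation; the function names cholesky_lower and cholesky_solve are hypothetical, and the complex Hermitian case (which needs conjugation) is not covered.

```python
import math

def cholesky_lower(a):
    """Return lower triangular L with L * L^T = a.

    a is a real symmetric positive definite matrix given as a list of rows.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        pivot = a[j][j] - sum(l[j][m] ** 2 for m in range(j))
        if pivot <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = math.sqrt(pivot)
        for i in range(j + 1, n):
            l[i][j] = (a[i][j] - sum(l[i][m] * l[j][m] for m in range(j))) / l[j][j]
    return l

def cholesky_solve(l, b):
    """Solve (L * L^T) x = b for a single right-hand side b."""
    n = len(l)
    y = [0.0] * n
    for i in range(n):                      # forward substitution: L y = b
        y[i] = (b[i] - sum(l[i][m] * y[m] for m in range(i))) / l[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):            # back substitution: L^T x = y
        x[i] = (y[i] - sum(l[m][i] * x[m] for m in range(i + 1, n))) / l[i][i]
    return x

a = [[4.0, 2.0], [2.0, 3.0]]                # small illustrative matrix, not from this guide
l = cholesky_lower(a)                       # plays the role of the factorization step
x = cholesky_solve(l, [10.0, 8.0])          # plays the role of the solve step
```

With several right-hand sides, the same factor is reused and cholesky_solve is applied to each column of B in turn, which is why the factorization and the solve are separate subroutines.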

Table 124. Data Types
A, B Subroutine
Long-precision real POTRS
Long-precision complex POTRS

Syntax

HPF CALL POTRS (a, b, uplo)

CALL POTRS (a, b, uplo, info)

On Entry

a

is the real symmetric or complex Hermitian matrix A, containing the factorization of matrix A produced by a preceding call to POTRF, where:

If uplo = 'U', the array contains the upper triangle of the symmetric matrix A in its upper triangle, and its strictly lower triangular part is not referenced.

If uplo = 'L', the array contains the lower triangle of the symmetric matrix A in its lower triangle, and its strictly upper triangular part is not referenced.

Type: required

Specified as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 124.

b

is the general matrix B, containing the right-hand sides of the system.

Type: required

Specified as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 124.

uplo

indicates whether the upper or lower triangular part of the real symmetric or complex Hermitian matrix A is referenced, where:

If uplo = 'U', the upper triangular part is referenced.

If uplo = 'L', the lower triangular part is referenced.

Type: required

Specified as: a single character; uplo = 'U' or 'L'.

info

See 'On Return'.

On Return

b

is the updated matrix B, containing the solution vectors.

Type: required

Returned as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 124.

info

indicates that a successful computation occurred.

Type: optional

Returned as: a fullword integer; info = 0.

Notes and Coding Rules

  1. The assumed-shape arrays must have the exact size required for the computation, that is: size(a,1) = size(a,2) = size(b,1).

  2. This subroutine accepts lowercase letters for the uplo argument.

  3. The assumed-shape arrays must have no common elements; otherwise, results are unpredictable.

  4. The matrix A input to POTRS must be the same as the corresponding output argument for POTRF.

  5. On both input and output, matrices A and B conform to ScaLAPACK format.

  6. For details on how to set up and code your HPF program using Parallel ESSL, see "Coding Your HPF Program".

  7. Block-cyclic data distribution is required for your array data. Because data directives are included in the interface module PESSL_HPF, you can specify any data distribution for your matrices, and the XL HPF compiler will, if necessary, redistribute the data prior to calling this subroutine. For how to code your HPF directives, see "Distributing Data in an HPF Program". For a sample program including directives, see Figure 9.

  8. The restrictions given in "Notes and Coding Rules" also apply to this subroutine.

Error Conditions

HPF-specific errors are listed below. Resource and input-argument errors listed in "Error Conditions" also apply to this subroutine.

Computational Errors

None
Note: If the factorization performed by POTRF failed because of a nonpositive definite matrix A, the results returned by POTRS are unpredictable. For details, see the info output argument for POTRF.

Input-Argument Errors

Stage 1
  1. The rank of the ultimate align target is greater than 2 for a or b.
  2. The process rank is not the same for a and b.
  3. The process rank is not 1 or 2 for a or b.

Stage 2

The process grid is not the same for a and b.

Stage 3

The data distribution is inconsistent for a and b.

Stage 4
  1. The shape of the assumed-shape arrays for a and b is incompatible:
    1. size(a,1) <> size(a,2) or
    2. size(a,1) <> size(b,1)
  2. The shape of the assumed-shape array for a is invalid: size(a,1) <> size(a,2)

Stage 5

The data distribution for a or b is unsupported.

Example 1

This example solves the positive definite real symmetric system AX = B with 5 right-hand sides. The transformed matrix A is the output from "Example 1".

An array section is specified for argument b, resulting in the computation using a submatrix B starting at row 1 and column 2 in the array.

As in "Example 1", array data is block-cyclically distributed using a 2 × 2 process grid.

!HPF$ PROCESSORS PROC(2,2)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: A, B
 
CALL POTRS( A , B(1:9,2:6) , 'L' )
-or-
CALL POTRS( A , B(1:9,2:6) , 'L' , INFO )

Input

Only a portion of the data structure is used--that is, submatrix B. Following is the 9 × 5 submatrix B, starting at row 1 and column 2 in the 9 × 6 array:

 *                                       *
 |  .   18.0   27.0   36.0   45.0    9.0 |
 |  .   34.0   51.0   68.0   85.0   17.0 |
 |  .   48.0   72.0   96.0  120.0   24.0 |
 |  .   60.0   90.0  120.0  150.0   30.0 |
 |  .   70.0  105.0  140.0  175.0   35.0 |
 |  .   78.0  117.0  156.0  195.0   39.0 |
 |  .   84.0  126.0  168.0  210.0   42.0 |
 |  .   88.0  132.0  176.0  220.0   44.0 |
 |  .   90.0  135.0  180.0  225.0   45.0 |
 *                                       *

Output

Only a portion of the data structure is used--that is, submatrix B. Following is the 9 × 5 submatrix B, starting at row 1 and column 2 in the 9 × 6 array:

 *                              *
 |  .   2.0  3.0  4.0  5.0  1.0 |
 |  .   2.0  3.0  4.0  5.0  1.0 |
 |  .   2.0  3.0  4.0  5.0  1.0 |
 |  .   2.0  3.0  4.0  5.0  1.0 |
 |  .   2.0  3.0  4.0  5.0  1.0 |
 |  .   2.0  3.0  4.0  5.0  1.0 |
 |  .   2.0  3.0  4.0  5.0  1.0 |
 |  .   2.0  3.0  4.0  5.0  1.0 |
 |  .   2.0  3.0  4.0  5.0  1.0 |
 *                              *

info = 0 (if info is present)

Example 2

This example solves the positive definite complex Hermitian system AX = B with 5 right-hand sides. The transformed matrix A is the output from "Example 2".

An array section is specified for argument b, resulting in the computation using a submatrix B starting at row 1 and column 2 in the array.

As in "Example 2", array data is block-cyclically distributed using a 2 × 2 process grid.

!HPF$ PROCESSORS PROC(2,2)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: A, B
 
CALL POTRS( A , B(1:9,2:6) , 'L' )
-or-
CALL POTRS( A , B(1:9,2:6) , 'L' , INFO )

Input

Only a portion of the data structure is used--that is, submatrix B. Following is the 9 × 5 submatrix B, starting at row 1 and column 2 in the 9 × 6 array:


 *                                                                                  *
 |   .    (60.0, 10.0)   (86.0,  2.0)  (112.0,  -6.0)  (138.0,-14.0)   (34.0, 18.0) |
 |   .    (86.0, 28.0)  (126.0, 22.0)  (166.0,  16.0)  (206.0, 10.0)   (46.0, 34.0) |
 |   .   (108.0, 44.0)  (160.0, 40.0)  (212.0,  36.0)  (264.0, 32.0)   (56.0, 48.0) |
 |   .   (126.0, 58.0)  (188.0, 56.0)  (250.0,  54.0)  (312.0, 52.0)   (64.0, 60.0) |
 |   .   (140.0, 70.0)  (210.0, 70.0)  (280.0,  70.0)  (350.0, 70.0)   (70.0, 70.0) |
 |   .   (150.0, 80.0)  (226.0, 82.0)  (302.0,  84.0)  (378.0, 86.0)   (74.0, 78.0) |
 |   .   (156.0, 88.0)  (236.0, 92.0)  (316.0,  96.0)  (396.0, 100.0)  (76.0, 84.0) |
 |   .   (158.0, 94.0)  (240.0,100.0)  (322.0, 106.0)  (404.0, 112.0)  (76.0, 88.0) |
 |   .   (156.0, 98.0)  (238.0,106.0)  (320.0, 114.0)  (402.0, 122.0)  (74.0, 90.0) |
 *                                                                                  *

Output

Only a portion of the data structure is used--that is, submatrix B. Following is the 9 × 5 submatrix B, starting at row 1 and column 2 in the 9 × 6 array:

 *                                                                   *
 |   .   (2.0, 1.0)  (3.0, 1.0)  (4.0, 1.0)  (5.0, 1.0)  (1.0, 1.0)  |
 |   .   (2.0, 1.0)  (3.0, 1.0)  (4.0, 1.0)  (5.0, 1.0)  (1.0, 1.0)  |
 |   .   (2.0, 1.0)  (3.0, 1.0)  (4.0, 1.0)  (5.0, 1.0)  (1.0, 1.0)  |
 |   .   (2.0, 1.0)  (3.0, 1.0)  (4.0, 1.0)  (5.0, 1.0)  (1.0, 1.0)  |
 |   .   (2.0, 1.0)  (3.0, 1.0)  (4.0, 1.0)  (5.0, 1.0)  (1.0, 1.0)  |
 |   .   (2.0, 1.0)  (3.0, 1.0)  (4.0, 1.0)  (5.0, 1.0)  (1.0, 1.0)  |
 |   .   (2.0, 1.0)  (3.0, 1.0)  (4.0, 1.0)  (5.0, 1.0)  (1.0, 1.0)  |
 |   .   (2.0, 1.0)  (3.0, 1.0)  (4.0, 1.0)  (5.0, 1.0)  (1.0, 1.0)  |
 |   .   (2.0, 1.0)  (3.0, 1.0)  (4.0, 1.0)  (5.0, 1.0)  (1.0, 1.0)  |
 *                                                                   *

info = 0 (if info is present)

Banded Linear Algebraic Equation Subroutines

This section contains the banded linear algebraic equation subroutine descriptions.

PBSV--Positive Definite Symmetric Band Matrix Factorization and Solve

This subroutine solves the following system of equations for multiple right-hand sides:

AX = B

where, in the formula above:

A is the positive definite symmetric band matrix, factored by Cholesky factorization.
B is the general matrix containing the right-hand sides in its columns.
X represents the general matrix B, containing the output solution vectors in its columns.

If any of the assumed-shape arrays have a size of zero, no computation is performed and the subroutine returns after doing some parameter checking.

See references [23], [2], [16], [18], [22], [36], and [37].

Table 125. Data Types
A, B Subroutine
Long-precision real PBSV

Syntax

HPF CALL PBSV (a, b, uplo)

CALL PBSV (a, b, uplo, info)

On Entry

a

is the positive definite symmetric band matrix A with half bandwidth k, where k=size(a,1)-1, to be factored. Matrix A is stored in upper- or lower-band-packed storage mode, where:

If uplo = 'U', the array contains the upper triangle of the symmetric band matrix A in its upper triangle, and its strictly lower triangular part is not referenced.

If uplo = 'L', the array contains the lower triangle of the symmetric band matrix A in its lower triangle, and its strictly upper triangular part is not referenced.

Type: required

Specified as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 125.

On output, array A is overwritten; that is, the original input is not preserved.

b

is the general matrix B, containing the multiple right-hand sides of the system.

Type: required

Specified as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 125.

uplo

indicates whether the upper or lower triangular part of the matrix A is referenced, where:

If uplo = 'U', the upper triangular part is referenced.

If uplo = 'L', the lower triangular part is referenced.

Type: required

Specified as: a single character; uplo = 'U' or 'L'.

info

See 'On Return'.

On Return

a

is overwritten; that is, the original input is not preserved. This subroutine overwrites data in positions that do not contain the positive definite symmetric band matrix A stored in upper- or lower-band-packed storage mode.

b

is the updated matrix B, containing the solution vectors.

Type: required

Returned as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 125.

info

has the following meaning, when info is present:

If info = 0, matrix A is positive definite, and the factorization completed normally.

If info > 0, the leading minor of order i of the matrix A is not positive definite. info is set equal to i, where the first nonpositive definite leading minor was encountered at position (i,i) in A. The results contained in matrix A are not defined.

When info is not present and matrix A is not positive definite, the information for the above computational error is issued in an error message, and your program is terminated.

Type: optional

Returned as: a fullword integer; info >= 0.
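
The meaning of info can be illustrated with a small dense Cholesky factorization in Python. This is a sketch of the convention only, assuming an unblocked algorithm on an undistributed matrix; cholesky_info is a hypothetical name, and the test matrices are made up for illustration.

```python
import math

def cholesky_info(a):
    """Return (l, info) for a real symmetric matrix a given as a list of rows.

    info = 0     : factorization completed; l is the lower triangular factor.
    info = i > 0 : the leading minor of order i is not positive definite
                   (1-based, like the info argument); l is not meaningful.
    """
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        pivot = a[j][j] - sum(l[j][m] ** 2 for m in range(j))
        if pivot <= 0.0:
            return l, j + 1                 # first failing position (i,i)
        l[j][j] = math.sqrt(pivot)
        for i in range(j + 1, n):
            l[i][j] = (a[i][j] - sum(l[i][m] * l[j][m] for m in range(j))) / l[j][j]
    return l, 0

_, info_ok = cholesky_info([[4.0, 2.0], [2.0, 3.0]])    # positive definite
_, info_bad = cholesky_info([[1.0, 2.0], [2.0, 1.0]])   # fails at position (2,2)
```

A caller that passes info can test it and recover; a caller that omits it gets the error message and termination described above.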

Notes and Coding Rules

  1. The assumed-shape arrays must have the exact size required for the computation, that is: size(a,2) = size(b,1). Also, in this subroutine, the half bandwidth k=size(a,1)-1.

  2. For performance reasons, it is suggested that you specify uplo = 'L'. For information on how bandwidth affects performance, see [2].

  3. The assumed-shape arrays must have no common elements; otherwise results are unpredictable.

  4. This subroutine accepts lowercase letters for the uplo argument.

  5. The band matrix A must be positive definite. If A is not positive definite, this subroutine uses the info argument to provide information about A and issues an error message. This differs from ScaLAPACK, which only uses the info argument to provide information about A.

  6. For details on how to set up and code your HPF program using Parallel ESSL, see "Coding Your HPF Program".

  7. The global positive definite symmetric band matrix A must be stored in upper- or lower-band-packed storage mode. For details, see the section on symmetric matrices in "Matrices".

    Matrix A must be distributed over a one-dimensional process grid, using block-column data distribution. For more information on using block-column data distribution, see "Specifying Block-Cyclically-Distributed Matrices for the Banded Linear Algebraic Equations".

    Matrix B must be distributed over a one-dimensional process grid, using block-row data distribution. For more information on using block-row data distribution, see the section on block distributing a general matrix containing the right-hand sides in "Matrices".

    Because data directives are included in the interface module PESSL_HPF, you can specify any data distribution for your matrices, and the XL HPF compiler will, if necessary, redistribute the data prior to calling this subroutine. For how to code your HPF directives, see "Distributing Data in an HPF Program". For a sample program including directives, see Figure 10.

  8. The restrictions given in "Notes and Coding Rules" also apply to this subroutine.
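
The block-column and block-row distributions named in note 7 can be pictured with a short Python sketch. It assumes the default HPF BLOCK rule, in which each process receives a contiguous block of ceiling(n / p) indices; block_ranges is a hypothetical helper and not part of Parallel ESSL.

```python
def block_ranges(n, nprocs):
    """Return the 1-based (first, last) index range owned by each process
    when n indices are BLOCK distributed over nprocs processes.

    Assumes a block size of ceiling(n / nprocs); a range with
    first > last means that process owns no indices.
    """
    size = -(-n // nprocs)                  # integer ceiling of n / nprocs
    return [(p * size + 1, min(n, (p + 1) * size)) for p in range(nprocs)]

# Columns of a 9-column matrix A distributed (*,BLOCK) over 3 processes;
# the rows of a 9-row matrix B distributed (BLOCK,*) split the same way.
ranges = block_ranges(9, 3)
```

Because the columns of A and the rows of B are split by the same rule, the column block size for a and the row block size for b agree, which is the condition checked in Stage 3 of the input-argument errors.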

Error Conditions

HPF-specific errors are listed below. All errors listed in "Error Conditions" also apply to this subroutine; however, for computational errors, if you do not specify the optional info argument, your program terminates as a result of the computational error.

Input-Argument Errors

Stage 1
  1. The rank of the ultimate align target is not 1 for a or b.
  2. The process rank is not 1 for a or b.

Stage 2
  1. The process grid is not the same for a and b.
  2. a is not distributed (*,BLOCK).
  3. b is not distributed (BLOCK,*).

Stage 3
  1. The shape of the assumed-shape arrays for a and b is incompatible: size(a,2) <> size(b,1)
  2. The column block size for a and the row block size for b are not equal.
  3. The abstract process indices for a and b are not equal.
  4. The data distribution for a or b is unsupported.

Example

This example solves the system AX = B with 3 right-hand sides, where A is the positive definite symmetric band matrix of order 9 with a half bandwidth of 7:

           *                                             *
           | 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  0.0 |
           | 1.0  2.0  2.0  2.0  2.0  2.0  2.0  2.0  1.0 |
           | 1.0  2.0  3.0  3.0  3.0  3.0  3.0  3.0  2.0 |
           | 1.0  2.0  3.0  4.0  4.0  4.0  4.0  4.0  3.0 |
           | 1.0  2.0  3.0  4.0  5.0  5.0  5.0  5.0  4.0 |
           | 1.0  2.0  3.0  4.0  5.0  6.0  6.0  6.0  5.0 |
           | 1.0  2.0  3.0  4.0  5.0  6.0  7.0  7.0  6.0 |
           | 1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0  7.0 |
           | 0.0  1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0 |
           *                                             *

Matrix A is stored in lower-band-packed storage mode:

           *                                             *
           | 1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0  8.0 |
           | 1.0  2.0  3.0  4.0  5.0  6.0  7.0  7.0   .  |
           | 1.0  2.0  3.0  4.0  5.0  6.0  6.0   .    .  |
           | 1.0  2.0  3.0  4.0  5.0  5.0   .    .    .  |
           | 1.0  2.0  3.0  4.0  4.0   .    .    .    .  |
           | 1.0  2.0  3.0  3.0   .    .    .    .    .  |
           | 1.0  2.0  2.0   .    .    .    .    .    .  |
           | 1.0  1.0   .    .    .    .    .    .    .  |
           *                                             *

where "." means you do not have to store a value in that position in the local array. However, these storage positions are required and are overwritten during the computation.
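
The mapping from the full matrix to lower-band-packed storage can be sketched in Python as follows. The sketch assumes the usual convention that element A(i,j) of the lower band is stored in row i-j+1 of column j of the packed array; pack_lower_band is a hypothetical helper, and None stands for the "." positions.

```python
# Full symmetric band matrix A of order 9 from this example (half bandwidth 7).
A = [
    [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0],
    [1.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 1.0],
    [1.0, 2.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 2.0],
    [1.0, 2.0, 3.0, 4.0, 4.0, 4.0, 4.0, 4.0, 3.0],
    [1.0, 2.0, 3.0, 4.0, 5.0, 5.0, 5.0, 5.0, 4.0],
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 6.0, 6.0, 5.0],
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 7.0, 6.0],
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 7.0],
    [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0],
]

def pack_lower_band(a, k):
    """Pack the lower band of symmetric matrix a (half bandwidth k)
    into a (k+1) x n array; unused positions are left as None."""
    n = len(a)
    ab = [[None] * n for _ in range(k + 1)]
    for j in range(n):
        for i in range(j, min(n, j + k + 1)):
            ab[i - j][j] = a[i][j]          # diagonal in row 0, subdiagonals below
    return ab

packed = pack_lower_band(A, 7)
for row in packed:
    print("  ".join(" . " if v is None else f"{v:3.1f}" for v in row))
```

The number of rows in the packed array is k+1, which is why the half bandwidth is recovered as k = size(a,1)-1.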

As in "Example", array data is block distributed over 3 processes.
Note: On output, the matrix A is overwritten by this subroutine.

!HPF$ PROCESSORS PROC(3)
!HPF$ DISTRIBUTE (*,BLOCK) ONTO PROC :: A
!HPF$ DISTRIBUTE (BLOCK,*) ONTO PROC :: B
 
CALL PBSV( A , B , 'L' )
-or-
CALL PBSV( A , B , 'L' , INFO=INFO )

Input

Matrix A, stored in an 8 × 9 array:

   *                                             *
   | 1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0  8.0 |
   | 1.0  2.0  3.0  4.0  5.0  6.0  7.0  7.0   .  |
   | 1.0  2.0  3.0  4.0  5.0  6.0  6.0   .    .  |
   | 1.0  2.0  3.0  4.0  5.0  5.0   .    .    .  |
   | 1.0  2.0  3.0  4.0  4.0   .    .    .    .  |
   | 1.0  2.0  3.0  3.0   .    .    .    .    .  |
   | 1.0  2.0  2.0   .    .    .    .    .    .  |
   | 1.0  1.0   .    .    .    .    .    .    .  |
   *                                             *

Rectangular 9 × 3 matrix B:

 *                      *
 |  8.0    36.0    44.0 |
 | 16.0    80.0    80.0 |
 | 23.0   122.0   108.0 |
 | 29.0   161.0   129.0 |
 | 34.0   196.0   144.0 |
 | 38.0   226.0   154.0 |
 | 41.0   250.0   160.0 |
 | 43.0   267.0   163.0 |
 | 36.0   240.0   120.0 |
 *                      *

Output

Rectangular 9 × 3 matrix B:

 *                 *
 | 1.0   1.0   9.0 |
 | 1.0   2.0   8.0 |
 | 1.0   3.0   7.0 |
 | 1.0   4.0   6.0 |
 | 1.0   5.0   5.0 |
 | 1.0   6.0   4.0 |
 | 1.0   7.0   3.0 |
 | 1.0   8.0   2.0 |
 | 1.0   9.0   1.0 |
 *                 *

info = 0 (if info is present)
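
The input and output of this example can be cross-checked without any library by multiplying the full matrix A by the output matrix X and comparing with the input matrix B. The following Python sketch does only that check; it does not call or reproduce PBSV.

```python
# Full symmetric band matrix A of order 9 (half bandwidth 7) from this example.
A = [
    [1, 1, 1, 1, 1, 1, 1, 1, 0],
    [1, 2, 2, 2, 2, 2, 2, 2, 1],
    [1, 2, 3, 3, 3, 3, 3, 3, 2],
    [1, 2, 3, 4, 4, 4, 4, 4, 3],
    [1, 2, 3, 4, 5, 5, 5, 5, 4],
    [1, 2, 3, 4, 5, 6, 6, 6, 5],
    [1, 2, 3, 4, 5, 6, 7, 7, 6],
    [1, 2, 3, 4, 5, 6, 7, 8, 7],
    [0, 1, 2, 3, 4, 5, 6, 7, 8],
]

# Output matrix B (the solution X): columns are all ones, 1..9, and 9..1.
X = [[1, i + 1, 9 - i] for i in range(9)]

# Input matrix B (the right-hand sides) as listed under Input.
B_input = [
    [8, 36, 44], [16, 80, 80], [23, 122, 108],
    [29, 161, 129], [34, 196, 144], [38, 226, 154],
    [41, 250, 160], [43, 267, 163], [36, 240, 120],
]

# Form A * X and compare it with the right-hand sides.
AX = [[sum(A[i][m] * X[m][c] for m in range(9)) for c in range(3)]
      for i in range(9)]
matches = AX == B_input
```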

PBTRF--Positive Definite Symmetric Band Matrix Factorization

This subroutine uses Cholesky factorization to factor a positive definite symmetric band matrix A, stored in upper- or lower-band-packed storage mode, into one of the following forms:

A = U^T U if the upper triangle of A is stored (uplo = 'U').
A = L L^T if the lower triangle of A is stored (uplo = 'L').

where, in the formulas above:

A is the positive definite symmetric band matrix to be factored.
U is an upper triangular matrix.
L is a lower triangular matrix.

To solve the system of equations with multiple right-hand sides, follow the call to this subroutine with one or more calls to PBTRS. The output from this factorization subroutine should be used only as input to PBTRS.

If any of the assumed-shape arrays have a size of zero, no computation is performed and the subroutine returns after doing some parameter checking.

See references [23], [2], [16], [18], [22], [36], and [37].

Table 126. Data Types
A, af Subroutine
Long-precision real PBTRF

Syntax

HPF CALL PBTRF (a, af, uplo)

CALL PBTRF (a, af, uplo, info)

On Entry

a

is the positive definite symmetric band matrix A with half bandwidth k, where k=size(a,1)-1, to be factored. Matrix A is stored in upper- or lower-band-packed storage mode, where:

If uplo = 'U', the array contains the upper triangle of the symmetric band matrix A in its upper triangle, and its strictly lower triangular part is not referenced.

If uplo = 'L', the array contains the lower triangle of the symmetric band matrix A in its lower triangle, and its strictly upper triangular part is not referenced.

Type: required

Specified as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 126.

On output, array A is overwritten; that is, the original input is not preserved.

af

is a reserved output area.

Type: required

Specified as: for migration purposes, you should specify a one-dimensional long-precision assumed-shape array with shape (:), where:

size(af) >= number_of_processors() [(ceiling{size(a,2) / number_of_processors()} + (2)(k)) (k)]
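
As a worked instance of the size formula above, the following Python sketch evaluates the lower bound for given values of size(a,2), k, and the number of processors. The helper name min_af_size is hypothetical; the arithmetic is a direct transcription of the formula.

```python
def min_af_size(ncols, k, nprocs):
    """Smallest size(af) allowed by the formula:
    nprocs * ((ceiling(ncols / nprocs) + 2*k) * k),
    where ncols = size(a,2) and k is the half bandwidth."""
    cols_per_proc = -(-ncols // nprocs)     # integer ceiling of ncols / nprocs
    return nprocs * ((cols_per_proc + 2 * k) * k)

# Order 9, half bandwidth 7, 3 processors:
# ceiling(9/3) = 3, (3 + 14) * 7 = 119, 3 * 119 = 357.
af_size = min_af_size(9, 7, 3)
```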

uplo

indicates whether the upper or lower triangular part of the matrix A is referenced, where:

If uplo = 'U', the upper triangular part is referenced.

If uplo = 'L', the lower triangular part is referenced.

Type: required

Specified as: a single character; uplo = 'U' or 'L'.

info

See 'On Return'.

On Return

a

is the updated matrix A, containing the results of the factorization, where:

If uplo = 'U', the array contains the results of the factorization of the symmetric band matrix A in its upper triangle. The remaining elements stored in the array are overwritten by this subroutine.

If uplo = 'L', the array contains the results of the factorization of the symmetric band matrix A in its lower triangle. The remaining elements stored in the array are overwritten by this subroutine.

Type: required

Returned as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 126.

af

is a reserved output area.

info

has the following meaning, when info is present:

If info = 0, matrix A is positive definite, and the factorization completed normally.

If info > 0, the leading minor of order i of the matrix A is not positive definite. info is set equal to i, where the first nonpositive definite leading minor was encountered at position (i,i) in A. The results contained in matrix A are not defined.

When info is not present and matrix A is not positive definite, the information for the above computational error is issued in an error message, and your program is terminated.

Type: optional

Returned as: a fullword integer; info >= 0.

Notes and Coding Rules

  1. In this subroutine, the half bandwidth k=size(a,1)-1.

  2. For performance reasons, it is suggested that you specify uplo = 'L'. For information on how bandwidth affects performance, see [2].

  3. The assumed-shape arrays must have no common elements; otherwise results are unpredictable.

  4. This subroutine accepts lowercase letters for the uplo argument.

  5. The output from this factorization subroutine should be used only as input to the solve subroutine PBTRS.

    The data specified for input argument uplo must be the same for both PBTRF and PBTRS.

    The matrix A and af input to PBTRS must be the same as the corresponding output arguments for PBTRF.

  6. The matrix A must remain unchanged between calls to PBTRF and PBTRS. This subroutine overwrites data in positions that do not contain the positive definite symmetric band matrix A stored in upper- or lower-band-packed storage mode.

  7. The band matrix A must be positive definite. If A is not positive definite, this subroutine uses the info argument to provide information about A and issues an error message. This differs from ScaLAPACK, which only uses the info argument to provide information about A.

  8. For details on how to set up and code your HPF program using Parallel ESSL, see "Coding Your HPF Program".

  9. The global positive definite symmetric band matrix A must be stored in upper- or lower-band-packed storage mode. For details, see the section on symmetric matrices in "Matrices".

    Matrix A must be distributed over a one-dimensional process grid, using block-column data distribution. For more information on using block-column data distribution, see "Specifying Block-Cyclically-Distributed Matrices for the Banded Linear Algebraic Equations".

    Because data directives are included in the interface module PESSL_HPF, you can specify any data distribution for your vector and matrix, and the XL HPF compiler will, if necessary, redistribute the data prior to calling this subroutine. For how to code your HPF directives, see "Distributing Data in an HPF Program". For a sample program including directives, see Figure 10.

  10. The restrictions given in "Notes and Coding Rules" also apply to this subroutine.

Error Conditions

HPF-specific errors are listed below. All errors listed in "Error Conditions" also apply to this subroutine; however, for computational errors, if you do not specify the optional info argument, your program terminates as a result of the computational error.

Input-Argument Errors

Stage 1
  1. The rank of the ultimate align target is not 1 for a or af.
  2. The process rank is not 1 for a or af.

Stage 2
  1. The process grid is not the same for a and af.
  2. a is not distributed (*,BLOCK).
  3. af is not distributed (BLOCK).

Stage 3

The data distribution for a is unsupported.

Example

This example shows a factorization of the positive definite symmetric band matrix A of order 9 with a half bandwidth of 7:

           *                                             *
           | 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  0.0 |
           | 1.0  2.0  2.0  2.0  2.0  2.0  2.0  2.0  1.0 |
           | 1.0  2.0  3.0  3.0  3.0  3.0  3.0  3.0  2.0 |
           | 1.0  2.0  3.0  4.0  4.0  4.0  4.0  4.0  3.0 |
           | 1.0  2.0  3.0  4.0  5.0  5.0  5.0  5.0  4.0 |
           | 1.0  2.0  3.0  4.0  5.0  6.0  6.0  6.0  5.0 |
           | 1.0  2.0  3.0  4.0  5.0  6.0  7.0  7.0  6.0 |
           | 1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0  7.0 |
           | 0.0  1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0 |
           *                                             *

Matrix A is stored in lower-band-packed storage mode:

           *                                             *
           | 1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0  8.0 |
           | 1.0  2.0  3.0  4.0  5.0  6.0  7.0  7.0   .  |
           | 1.0  2.0  3.0  4.0  5.0  6.0  6.0   .    .  |
           | 1.0  2.0  3.0  4.0  5.0  5.0   .    .    .  |
           | 1.0  2.0  3.0  4.0  4.0   .    .    .    .  |
           | 1.0  2.0  3.0  3.0   .    .    .    .    .  |
           | 1.0  2.0  2.0   .    .    .    .    .    .  |
           | 1.0  1.0   .    .    .    .    .    .    .  |
           *                                             *

where "." means you do not have to store a value in that position in the local array. However, these storage positions are required and are overwritten during the computation.

As in "Example", array data is block distributed over 3 processes.

Notes:

  1. Matrix A, output from PBTRF, must be passed, unchanged, to the solve subroutine PBTRS.

  2. The af argument is reserved and not shown in this example.

!HPF$ PROCESSORS PROC(3)
!HPF$ DISTRIBUTE (*,BLOCK) ONTO PROC :: A
!HPF$ DISTRIBUTE (BLOCK) ONTO PROC :: AF
 
CALL PBTRF( A , AF , 'L' )
-or-
CALL PBTRF( A , AF , 'L' , INFO=INFO )

Input

Matrix A, stored in an 8 × 9 array:

   *                                             *
   | 1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0  8.0 |
   | 1.0  2.0  3.0  4.0  5.0  6.0  7.0  7.0   .  |
   | 1.0  2.0  3.0  4.0  5.0  6.0  6.0   .    .  |
   | 1.0  2.0  3.0  4.0  5.0  5.0   .    .    .  |
   | 1.0  2.0  3.0  4.0  4.0   .    .    .    .  |
   | 1.0  2.0  3.0  3.0   .    .    .    .    .  |
   | 1.0  2.0  2.0   .    .    .    .    .    .  |
   | 1.0  1.0   .    .    .    .    .    .    .  |
   *                                             *

Output

Matrix A, stored in an 8 × 9 array:

   *                                             *
   | 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0 |
   | 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0   .  |
   | 1.0  1.0  1.0  1.0  1.0  1.0  1.0   .    .  |
   | 1.0  1.0  1.0  1.0  1.0  1.0   .    .    .  |
   | 1.0  1.0  1.0  1.0  1.0   .    .    .    .  |
   | 1.0  1.0  1.0  1.0   .    .    .    .    .  |
   | 1.0  1.0  1.0   .    .    .    .    .    .  |
   | 1.0  1.0   .    .    .    .    .    .    .  |
   *                                             *

info = 0 (if info is present)
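
The output above can be reproduced with a serial band Cholesky factorization that works directly on the lower-band-packed array. The Python sketch below is an unblocked, single-process illustration and is not the distributed algorithm used by PBTRF; band_cholesky_lower is a hypothetical name, and the "." positions are held as 0.0 and never referenced.

```python
import math

def band_cholesky_lower(ab):
    """Factor A = L * L^T in place, where ab holds A in lower-band-packed
    storage: ab[d][j] is A(j+d, j) (0-based). Returns info (0 on success,
    i > 0 if the leading minor of order i is not positive definite)."""
    k = len(ab) - 1                          # half bandwidth
    n = len(ab[0])
    for j in range(n):
        if ab[0][j] <= 0.0:
            return j + 1
        ab[0][j] = math.sqrt(ab[0][j])
        kn = min(k, n - 1 - j)               # subdiagonal entries in column j
        for d in range(1, kn + 1):
            ab[d][j] /= ab[0][j]             # scale column j of L
        for c in range(1, kn + 1):           # rank-1 update of the trailing band
            for r in range(c, kn + 1):
                ab[r - c][j + c] -= ab[r][j] * ab[c][j]
    return 0

# Input matrix A of this example in an 8 x 9 lower-band-packed array.
ab = [
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 8.0],
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 7.0, 0.0],
    [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 6.0, 0.0, 0.0],
    [1.0, 2.0, 3.0, 4.0, 5.0, 5.0, 0.0, 0.0, 0.0],
    [1.0, 2.0, 3.0, 4.0, 4.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 2.0, 3.0, 3.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 2.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
]
info = band_cholesky_lower(ab)
```

After the call, every stored band position of ab holds 1.0, which agrees with the output array shown in this example. Unlike this sketch, PBTRF also overwrites the unused positions.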

PBTRS--Positive Definite Symmetric Band Matrix Solve

This subroutine solves the following system of equations for multiple right-hand sides:

AX = B

where, in the formula above:

A is the positive definite symmetric band matrix, factored by Cholesky factorization.
B is the general matrix containing the right-hand sides in its columns.
X represents the general matrix B, containing the output solution vectors in its columns.

This subroutine uses the results of the factorization of matrix A, produced by a preceding call to PBTRF. The output from PBTRF should be used only as input to this solve subroutine.

If any of the assumed-shape arrays have a size of zero, no computation is performed and the subroutine returns after doing some parameter checking.

See references [23], [2], [16], [18], [22], [36], and [37].
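
The solve step can be sketched as a forward and a back substitution on the lower-band-packed factor. The Python code below is a serial illustration under the assumption uplo = 'L', not the Parallel ESSL implementation; band_solve_lower is a hypothetical name. It uses the factor shown in the PBTRF example output and the first right-hand side of the PBSV example.

```python
def band_solve_lower(ab, b):
    """Solve (L * L^T) x = b, where ab holds the Cholesky factor L in
    lower-band-packed storage: ab[d][j] is L(j+d, j) (0-based)."""
    k = len(ab) - 1                          # half bandwidth
    n = len(ab[0])
    x = list(b)
    for j in range(n):                       # forward substitution: L y = b
        x[j] /= ab[0][j]
        for d in range(1, min(k, n - 1 - j) + 1):
            x[j + d] -= ab[d][j] * x[j]
    for j in reversed(range(n)):             # back substitution: L^T x = y
        for d in range(1, min(k, n - 1 - j) + 1):
            x[j] -= ab[d][j] * x[j + d]
        x[j] /= ab[0][j]
    return x

# Factor from the PBTRF example: every stored band position is 1.0.
factor = [[1.0 if j + d < 9 else 0.0 for j in range(9)] for d in range(8)]

# First right-hand side column from the PBSV example.
rhs = [8.0, 16.0, 23.0, 29.0, 34.0, 38.0, 41.0, 43.0, 36.0]
solution = band_solve_lower(factor, rhs)
```

The result is a vector of ones, matching the first column of the PBSV output. The factor is not modified, so the same array can be reused for further right-hand sides.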

Table 127. Data Types
A, B, af Subroutine
Long-precision real PBTRS

Syntax

HPF CALL PBTRS (a, b, af, uplo)

CALL PBTRS (a, b, af, uplo, info)

On Entry

a

is the positive definite symmetric band matrix A with half bandwidth k, where k=size(a,1)-1, containing the factorization of matrix A produced by a preceding call to PBTRF. Matrix A is stored in upper- or lower-band-packed storage mode, where:

If uplo = 'U', the array contains the results of the factorization of the symmetric band matrix A in its upper triangle.

If uplo = 'L', the array contains the results of the factorization of the symmetric band matrix A in its lower triangle.

Type: required

Specified as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 127.

b

is the general matrix B, containing the multiple right-hand sides of the system.

Type: required

Specified as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 127.

af

is a reserved area.

Type: required

Specified as: for migration purposes, you should specify a one-dimensional long-precision assumed-shape array with shape (:), where:

size(af) >= number_of_processors() [(ceiling{size(a,2) / number_of_processors()} + (2)(k)) (k)]

uplo

indicates whether the upper or lower triangular part of the matrix A is referenced, where:

If uplo = 'U', the upper triangular part is referenced.

If uplo = 'L', the lower triangular part is referenced.

Type: required

Specified as: a single character; uplo = 'U' or 'L'.

info

See 'On Return'.

On Return

b

is the updated matrix B, containing the solution vectors.

Type: required

Returned as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 127.

info

indicates that a successful computation occurred.

Type: optional

Returned as: a fullword integer; info = 0.

Notes and Coding Rules

  1. The assumed-shape arrays must have the exact size required for the computation, that is: size(a,2) = size(b,1). Also, in this subroutine, the half bandwidth k=size(a,1)-1.

  2. For performance reasons, it is suggested that you specify uplo = 'L'. For information on how bandwidth affects performance, see [2].

  3. The assumed-shape arrays must have no common elements; otherwise results are unpredictable.

  4. This subroutine accepts lowercase letters for the uplo argument.

  5. The output from the factorization subroutine PBTRF should be used only as input to this solve subroutine.

    The data specified for input argument uplo must be the same for both PBTRF and PBTRS.

    The matrix A and af input to PBTRS must be the same as the corresponding output arguments for PBTRF.

  6. The matrix A must remain unchanged between calls to PBTRF and PBTRS. This subroutine overwrites data in positions that do not contain the positive definite symmetric band matrix A stored in upper- or lower-band-packed storage mode.

  7. For details on how to set up and code your HPF program using Parallel ESSL, see "Coding Your HPF Program".

  8. The global positive definite symmetric band matrix A must be stored in upper- or lower-band-packed storage mode. For details, see the section on symmetric matrices in "Matrices".

    Matrix A must be distributed over a one-dimensional process grid, using block-column data distribution. For more information on using block-column data distribution, see "Specifying Block-Cyclically-Distributed Matrices for the Banded Linear Algebraic Equations".

    Matrix B must be distributed over a one-dimensional process grid, using block-row data distribution. For more information on using block-row data distribution, see the section on block distributing a general matrix containing the right-hand sides in "Matrices".

    Because data directives are included in the interface module PESSL_HPF, you can specify any data distribution for your vector and matrices, and the XL HPF compiler will, if necessary, redistribute the data prior to calling this subroutine. For how to code your HPF directives, see "Distributing Data in an HPF Program". For a sample program including directives, see Figure 10.

  9. The restrictions given in "Notes and Coding Rules" also apply to this subroutine.

Error Conditions

HPF-specific errors are listed below. Resource and input-argument errors listed in "Error Conditions" also apply to this subroutine.

Computational Errors

None
Note: If the factorization performed by PBTRF failed because of a nonpositive definite matrix A, the results returned by this subroutine are unpredictable. For details, see the info output argument for PBTRF.

Input-Argument Errors

Stage 1
  1. The rank of the ultimate align target is not 1 for a, b, or af.
  2. The process rank is not 1 for a, b, or af.

Stage 2
  1. The process grid is not the same for a, b, and af.
  2. a is not distributed (*,BLOCK).
  3. b is not distributed (BLOCK,*).
  4. af is not distributed (BLOCK).

Stage 3
  1. The shape of the assumed-shape arrays for a and b is incompatible: size(a,2) <> size(b,1)
  2. The column block size for a and the row block size for b are not equal.
  3. The abstract process indices for a and b are not equal.
  4. The data distribution for a or b is unsupported.

Example

This example solves the system AX=B, where matrix A is the same matrix factored in "Example" for PBTRF.

As in "Example", array data is block distributed over 3 processes.

Notes:

  1. Matrix A, output from PBTRF, must be passed, unchanged, to the solve subroutine PBTRS.

  2. The af argument, output from PBTRF, must be passed, unchanged, to the solve subroutine PBTRS.

!HPF$ PROCESSORS PROC(3)
!HPF$ DISTRIBUTE (*,BLOCK) ONTO PROC :: A
!HPF$ DISTRIBUTE (BLOCK,*) ONTO PROC :: B
!HPF$ DISTRIBUTE (BLOCK) ONTO PROC :: AF
 
CALL PBTRS( A , B , AF , 'L' )
-or-
CALL PBTRS( A , B , AF , 'L' , INFO=INFO )

Input

Matrix A, stored in an 8 × 9 array:

   *                                             *
   | 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0 |
   | 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0   .  |
   | 1.0  1.0  1.0  1.0  1.0  1.0  1.0   .    .  |
   | 1.0  1.0  1.0  1.0  1.0  1.0   .    .    .  |
   | 1.0  1.0  1.0  1.0  1.0   .    .    .    .  |
   | 1.0  1.0  1.0  1.0   .    .    .    .    .  |
   | 1.0  1.0  1.0   .    .    .    .    .    .  |
   | 1.0  1.0   .    .    .    .    .    .    .  |
   *                                             *

Rectangular 9 × 3 matrix B:

 *                      *
 |  8.0    36.0    44.0 |
 | 16.0    80.0    80.0 |
 | 23.0   122.0   108.0 |
 | 29.0   161.0   129.0 |
 | 34.0   196.0   144.0 |
 | 38.0   226.0   154.0 |
 | 41.0   250.0   160.0 |
 | 43.0   267.0   163.0 |
 | 36.0   240.0   120.0 |
 *                      *

Output

Rectangular 9 × 3 matrix B:

 *                 *
 | 1.0   1.0   9.0 |
 | 1.0   2.0   8.0 |
 | 1.0   3.0   7.0 |
 | 1.0   4.0   6.0 |
 | 1.0   5.0   5.0 |
 | 1.0   6.0   4.0 |
 | 1.0   7.0   3.0 |
 | 1.0   8.0   2.0 |
 | 1.0   9.0   1.0 |
 *                 *

info = 0 (if info is present)
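
The input and output shown above can be cross-checked with a small serial Python sketch. It is not the distributed algorithm used by PBTRS. It assumes that the 8 × 9 array holds the Cholesky factor L in lower-band-packed storage mode, with row r, column j of the array (1-based) holding L(j+r-1, j), and that A = L L^T. All function and variable names are illustrative.

```python
# Serial check of the PBTRS example; not the distributed algorithm used by the library.
# Assumption: row r, column j of the packed array (1-based) holds L(j + r - 1, j).

N, K = 9, 7                      # order of A and half bandwidth k = size(a,1) - 1

# Packed input from the example: every stored entry is 1.0.
packed = [[1.0] * (N - r) for r in range(K + 1)]   # packed row r holds N - r entries

def unpack_lower(packed, n):
    """Expand lower-band-packed rows into a dense lower triangular matrix."""
    dense = [[0.0] * n for _ in range(n)]
    for r, row in enumerate(packed):          # r = i - j, the 0-based diagonal offset
        for j, value in enumerate(row):
            dense[j + r][j] = value
    return dense

def solve_with_factor(L, b):
    """Solve (L L^T) x = b by forward then backward substitution."""
    n = len(L)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(L[i][j] * y[j] for j in range(i))) / L[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(L[j][i] * x[j] for j in range(i + 1, n))) / L[i][i]
    return x

# Right-hand sides from the example, one row of B per entry.
B = [[8, 36, 44], [16, 80, 80], [23, 122, 108], [29, 161, 129], [34, 196, 144],
     [38, 226, 154], [41, 250, 160], [43, 267, 163], [36, 240, 120]]

L = unpack_lower(packed, N)
X = [solve_with_factor(L, [row[c] for row in B]) for c in range(3)]  # one list per column
for column in X:
    print([round(v, 6) for v in column])
```

Under these assumptions, the three printed lists reproduce the three columns of the output matrix B shown above.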

GTSV and DTSV--General Tridiagonal Matrix Factorization and Solve

GTSV solves the tridiagonal systems of linear equations, AX = B, using Gaussian elimination with partial pivoting for the general tridiagonal matrix A stored in tridiagonal storage mode.

DTSV solves the tridiagonal systems of linear equations, AX = B, using Gaussian elimination for the diagonally dominant general tridiagonal matrix A stored in tridiagonal storage mode.

In these subroutines:

A is the square general tridiagonal matrix.
B is the general matrix containing the right-hand sides in its columns.
X represents the general matrix B, containing the output solution vectors in its columns.

If any of the assumed-shape arrays have a size of zero, no computation is performed and the subroutine returns after doing some parameter checking.

See references [51], [16], [18], [22], [36], and [37].

Table 128. Data Types
dl, d, du, B           Subroutine
Long-precision real    GTSV and DTSV

Syntax

HPF CALL GTSV | DTSV (dl, d, du, b)

CALL GTSV | DTSV (dl, d, du, b, info)

On Entry

dl

is the vector dl, containing the subdiagonal of the general tridiagonal matrix A in elements 2 through size(dl).

Type: required

Specified as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 128.

On output, DL is overwritten; that is, the original input is not preserved.

d

is the vector d, containing the main diagonal of the general tridiagonal matrix A in elements 1 through size(d).

Type: required

Specified as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 128.

On output, D is overwritten; that is, the original input is not preserved.

du

is the vector du, containing the superdiagonal of the general tridiagonal matrix A in elements 1 through size(du)-1.

Type: required

Specified as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 128.

On output, DU is overwritten; that is, the original input is not preserved.

b

is the general matrix B, containing the multiple right-hand sides of the system.

Type: required

Specified as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 128.

info

See 'On Return'.

On Return

dl

is overwritten; that is, the original input is not preserved.

d

is overwritten; that is, the original input is not preserved.

du

is overwritten; that is, the original input is not preserved.

b

is the updated matrix B, containing the solution vectors.

Type: required

Returned as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 128.

info

is the vector info, of length number_of_processors(). When info is present and you are running on the j-th process, infoj has the following meaning:

If infoj = 0 for all j, the factorization completed normally.
Note: For DTSV, if the input matrix A is not diagonally dominant, the subroutine may still complete the factorization; however, results are unpredictable.

If 1 <= infoj <= number_of_processors() for any j, the portion of global submatrix A stored on process infoj-1 and factored locally is singular or reducible (for GTSV) or not diagonally dominant (for DTSV). The magnitude of a pivot element was zero or too small.

If infoj > number_of_processors() for any j, the portion of global submatrix A stored on process infoj-number_of_processors()-1 and representing interactions with other processes is singular or reducible (for GTSV) or not diagonally dominant (for DTSV). The magnitude of a pivot element was zero or too small.

If infoj > 0 for any j, the factorization is completed; however, the results are unpredictable.

All elements of info will have the same value.

When info is not present or size(info)=0, and matrix A is singular or reducible (for GTSV) or not diagonally dominant (for DTSV), then the information for the above computational error is issued in an error message, and your program is terminated.

Type: optional

Returned as: an assumed-shape array with shape (:), containing fullword integers, where infoj >= 0 for j = 1...number_of_processors().
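
The three ranges of info values described above can be sketched as a small Python helper. The function name decode_info and the status labels it returns are hypothetical and are not part of Parallel ESSL. The process number returned is the one named in the description (info - 1, or info - number_of_processors() - 1).

```python
def decode_info(info_value, nprocs):
    """Interpret one element of info as described above.

    Returns (status, process), where process is the process number named in
    the description: info - 1 for a local failure, info - nprocs - 1 for a
    failure in the part representing interactions with other processes.
    """
    if info_value < 0:
        raise ValueError("info elements are documented as >= 0")
    if info_value == 0:
        return ("ok", None)
    if info_value <= nprocs:
        return ("local", info_value - 1)
    return ("interaction", info_value - nprocs - 1)

for value in (0, 2, 5):
    print(value, decode_info(value, nprocs=3))
```

Because all elements of info have the same value, it is enough to decode any one element.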

Notes and Coding Rules

  1. The assumed-shape arrays must have the exact size required for the computation, that is: size(dl) = size(d) = size(du) = size(b,1).

  2. The assumed-shape arrays must have no common elements; otherwise results are unpredictable.

  3. For GTSV, the general tridiagonal matrix A must be non-singular and irreducible. For DTSV, the general tridiagonal matrix A must be diagonally dominant to ensure numerical accuracy, because no pivoting is performed. These subroutines use the info argument to provide information about A, as ScaLAPACK does. Unlike ScaLAPACK, however, these subroutines also issue an error message.

  4. For details on how to set up and code your HPF program using Parallel ESSL, see "Coding Your HPF Program".

  5. The general tridiagonal matrix A must be stored in tridiagonal storage mode. For details, see the section on tridiagonal matrices in "Matrices".

    Block data distribution is required for all array data, except the array for info, which requires cyclic data distribution. Because data directives are included in the interface module PESSL_HPF, you can specify any data distribution for your vectors and matrix, and the XL HPF compiler will, if necessary, redistribute the data prior to calling this subroutine. For how to code your HPF directives, see "Distributing Data in an HPF Program". For a sample program including directives, see Figure 10.

  6. The restrictions given in "Notes and Coding Rules" also apply to this subroutine.
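
To picture the block and cyclic distributions required by the coding rules above, the following Python sketch lists which process owns each element. It assumes the standard HPF mapping with the default block size, ceiling(n / number_of_processors()), and 1-based process numbering. The function names are illustrative.

```python
import math

def block_owner(i, n, nprocs):
    """1-based process owning element i of n under DISTRIBUTE (BLOCK)."""
    block = math.ceil(n / nprocs)        # default HPF block size (assumed)
    return (i - 1) // block + 1

def cyclic_owner(i, nprocs):
    """1-based process owning element i under DISTRIBUTE (CYCLIC)."""
    return (i - 1) % nprocs + 1

n, nprocs = 12, 3
print([block_owner(i, n, nprocs) for i in range(1, n + 1)])      # dl, d, du, rows of b
print([cyclic_owner(j, nprocs) for j in range(1, nprocs + 1)])   # info
```

With 12 elements on 3 processes, each process holds 4 consecutive elements of dl, d, and du, and exactly one element of info.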

Error Conditions

HPF-specific errors are listed below. All errors listed in "Error Conditions" also apply to this subroutine; however, for computational errors, if you do not specify the optional info argument or if size(info)=0, your program terminates as a result of the computational error.

Input-Argument Errors

Stage 1
  1. The rank of the ultimate align target is not 1 for dl, d, du, b, or info (if info is present).
  2. The process rank is not 1 for dl, d, du, b, or info (if info is present).

Stage 2
  1. The data distribution is inconsistent for dl, d, and du.
  2. info is present and:
    1. The data distribution is unsupported for info.
    2. info is not distributed (CYCLIC).
    3. The vector for info is replicated.
  3. The process grid is not the same for dl, d, and du.
  4. b is not distributed (BLOCK,*).
  5. dl, d, or du is not distributed (BLOCK).
  6. The vector for dl, d, or du is replicated.

Stage 3
  1. The shape of the assumed-shape arrays for dl, d, du, and b is incompatible:
    1. size(dl) <> size(b,1) or
    2. size(d) <> size(b,1) or
    3. size(du) <> size(b,1)
  2. The block sizes for dl, d, du, and b are incompatible.
  3. The abstract process indices for dl, d, du, and b are incompatible.
  4. The data distribution for dl, d, or du is unsupported.

Example 1

This example shows how to solve the system AX = B for the general tridiagonal matrix A, of order 12, where matrix A is stored in tridiagonal storage mode:

      *                                                            *
      | 2.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 1.0  3.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  1.0  3.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  1.0  3.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  1.0  3.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  1.0  3.0  2.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  1.0  3.0  2.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  1.0  3.0  2.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0  3.0  2.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0  3.0  2.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0  3.0  2.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0  3.0 |
      *                                                            *

As in "Example", array data is block distributed over 3 processes.
Note: On output, vectors dl, d, and du are overwritten by this subroutine.


!HPF$ PROCESSORS PROC(3)
!HPF$ DISTRIBUTE (BLOCK) ONTO PROC :: DL, D, DU
!HPF$ DISTRIBUTE (BLOCK,*) ONTO PROC :: B
 
CALL GTSV( DL , D , DU , B )   -or-   CALL DTSV( DL , D , DU , B )
 
      -or-
 
!HPF$ PROCESSORS PROC(3)
!HPF$ DISTRIBUTE (BLOCK) ONTO PROC :: DL, D, DU
!HPF$ DISTRIBUTE (BLOCK,*) ONTO PROC :: B
!HPF$ DISTRIBUTE (CYCLIC) ONTO PROC :: INFO
 
CALL GTSV( DL , D , DU , B , INFO )   -or-   CALL DTSV( DL , D , DU , B , INFO )

Input

Vector dl of size 12:

 *                                                            *
 |  .   1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0 |
 *                                                            *

Vector d of size 12:

 *                                                            *
 | 2.0  3.0  3.0  3.0  3.0  3.0  3.0  3.0  3.0  3.0  3.0  3.0 |
 *                                                            *

Vector du of size 12:

 *                                                            *
 | 2.0  2.0  2.0  2.0  2.0  2.0  2.0  2.0  2.0  2.0  2.0   .  |
 *                                                            *

Rectangular 12 × 3 matrix B:

 *                   *
 | 46.0    6.0   4.0 |
 | 65.0   13.0   6.0 |
 | 59.0   19.0   6.0 |
 | 53.0   25.0   6.0 |
 | 47.0   31.0   6.0 |
 | 41.0   37.0   6.0 |
 | 35.0   43.0   6.0 |
 | 29.0   49.0   6.0 |
 | 23.0   55.0   6.0 |
 | 17.0   61.0   6.0 |
 | 11.0   67.0   6.0 |
 |  5.0   47.0   4.0 |
 *                   *

Output

Rectangular 12 × 3 matrix B:

 *                   *
 | 12.0    1.0   1.0 |
 | 11.0    2.0   1.0 |
 | 10.0    3.0   1.0 |
 |  9.0    4.0   1.0 |
 |  8.0    5.0   1.0 |
 |  7.0    6.0   1.0 |
 |  6.0    7.0   1.0 |
 |  5.0    8.0   1.0 |
 |  4.0    9.0   1.0 |
 |  3.0   10.0   1.0 |
 |  2.0   11.0   1.0 |
 |  1.0   12.0   1.0 |
 *                   *

Vector info of size 3: (if info is present)

 *   *
 | 0 |
 | 0 |
 | 0 |
 *   *
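
The input and output above can be cross-checked with a serial Python sketch of tridiagonal elimination without pivoting. It is not the parallel algorithm used by GTSV or DTSV, and the function name is illustrative. The vectors follow tridiagonal storage mode as described for dl and du, so the first element of dl and the last element of du are unused placeholders (shown as "." above and set to 0.0 here).

```python
def solve_tridiagonal(dl, d, du, b):
    """Serial elimination without pivoting; the inputs are left unmodified.

    dl[0] and du[-1] are unused, mirroring tridiagonal storage mode
    (subdiagonal in elements 2..n, superdiagonal in elements 1..n-1).
    """
    n = len(d)
    diag, rhs = list(d), list(b)
    for i in range(1, n):                     # forward elimination
        m = dl[i] / diag[i - 1]
        diag[i] -= m * du[i - 1]
        rhs[i] -= m * rhs[i - 1]
    x = [0.0] * n
    x[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):            # back substitution
        x[i] = (rhs[i] - du[i] * x[i + 1]) / diag[i]
    return x

n = 12
dl = [0.0] + [1.0] * (n - 1)     # first element is a placeholder
d = [2.0] + [3.0] * (n - 1)
du = [2.0] * (n - 1) + [0.0]     # last element is a placeholder
B = [[46, 6, 4], [65, 13, 6], [59, 19, 6], [53, 25, 6], [47, 31, 6], [41, 37, 6],
     [35, 43, 6], [29, 49, 6], [23, 55, 6], [17, 61, 6], [11, 67, 6], [5, 47, 4]]

X = [solve_tridiagonal(dl, d, du, [row[c] for row in B]) for c in range(3)]
for column in X:
    print([round(v, 6) for v in column])
```

The three printed lists correspond to the three columns of the output matrix B shown above.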

GTTRF and DTTRF--General Tridiagonal Matrix Factorization

GTTRF factors the general tridiagonal matrix A, stored in tridiagonal storage mode, using Gaussian elimination with partial pivoting.

DTTRF factors the diagonally dominant general tridiagonal matrix A, stored in tridiagonal storage mode, using Gaussian elimination.

In these subroutines, A is a square general tridiagonal matrix.

To solve a tridiagonal system of linear equations with multiple right-hand sides, follow the call to GTTRF or DTTRF with one or more calls to GTTRS or DTTRS, respectively. The output from these factorization subroutines should be used only as input to the solve subroutines GTTRS and DTTRS, respectively.
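
The factor-once, solve-many pattern described above can be sketched serially in Python. This is an illustration of the calling pattern only: it uses elimination without pivoting, its factor layout is not the internal format produced by GTTRF or DTTRF, and the names factor and solve are hypothetical.

```python
def factor(dl, d, du):
    """Return (multipliers, pivots, upper) for a tridiagonal matrix, no pivoting."""
    n = len(d)
    mult, piv = [0.0] * n, list(d)
    for i in range(1, n):
        mult[i] = dl[i] / piv[i - 1]
        piv[i] -= mult[i] * du[i - 1]
    return mult, piv, list(du)

def solve(factors, b):
    """Solve A x = b using the output of factor(); b is left unmodified."""
    mult, piv, upper = factors
    n = len(piv)
    y = list(b)
    for i in range(1, n):                     # apply the stored multipliers
        y[i] -= mult[i] * y[i - 1]
    x = [0.0] * n
    x[-1] = y[-1] / piv[-1]
    for i in range(n - 2, -1, -1):            # back substitution
        x[i] = (y[i] - upper[i] * x[i + 1]) / piv[i]
    return x

# Small 4 x 4 illustration in tridiagonal storage mode (placeholders set to 0.0).
dl = [0.0, 1.0, 1.0, 1.0]
d = [2.0, 3.0, 3.0, 3.0]
du = [2.0, 2.0, 2.0, 0.0]

factors = factor(dl, d, du)                      # factor once
x1 = solve(factors, [4.0, 6.0, 6.0, 4.0])        # right-hand side A * (1, 1, 1, 1)
x2 = solve(factors, [6.0, 13.0, 19.0, 15.0])     # right-hand side A * (1, 2, 3, 4)
print(x1)
print(x2)
```

The factors are computed once and reused for each right-hand side, which is the reason for separating the factorization and solve subroutines.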

If the assumed-shape arrays have a size of zero, no computation is performed and the subroutine returns after doing some parameter checking.

See references [51], [16], [18], [22], [36], and [37].

Table 129. Data Types
dl, d, du, du2, af     ipiv       Subroutine
Long-precision real    Integer    GTTRF and DTTRF

Syntax

HPF CALL GTTRF (dl, d, du, du2, ipiv, af)

CALL GTTRF (dl, d, du, du2, ipiv, af, info)

HPF CALL DTTRF (dl, d, du, af)

CALL DTTRF (dl, d, du, af, info)

On Entry

dl

is the vector dl, containing the subdiagonal of the general tridiagonal matrix A in elements 2 through size(dl).

Type: required

Specified as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 129.

On output, DL is overwritten; that is, the original input is not preserved.

d

is the vector d, containing the main diagonal of the general tridiagonal matrix A in elements 1 through size(d).

Type: required

Specified as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 129.

On output, D is overwritten; that is, the original input is not preserved.

du

is the vector du, containing the superdiagonal of the general tridiagonal matrix A in elements 1 through size(du)-1.

Type: required

Specified as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 129.

On output, DU is overwritten; that is, the original input is not preserved.

du2

See 'On Return'.

ipiv

See 'On Return'.

af

See 'On Return'.

info

See 'On Return'.

On Return

dl

is the updated vector dl, containing part of the factorization.

Type: required

Returned as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 129.

On output, DL is overwritten; that is, the original input is not preserved.

d

is the updated vector d, containing part of the factorization.

Type: required

Returned as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 129.

On output, D is overwritten; that is, the original input is not preserved.

du

is the updated vector du, containing part of the factorization.

Type: required

Returned as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 129.

On output, DU is overwritten; that is, the original input is not preserved.

du2

is the vector du2, containing part of the factorization.

Type: required (GTTRF); not present (DTTRF)

Returned as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 129.

ipiv

is the vector ipiv, containing the pivot information needed by GTTRS.

Type: required (GTTRF); not present (DTTRF)

Returned as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 129.

af

is a work area used by these subroutines and contains part of the factorization.

Type: required

Returned as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 129, where:

For GTTRF:

size(af) >= number_of_processors() × { 12 × number_of_processors() + 3 × ceiling(size(dl) / number_of_processors()) }

For DTTRF:

size(af) >= number_of_processors() × { 12 × number_of_processors() + 2 × ceiling(size(dl) / number_of_processors()) }
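
The two lower bounds on size(af) above differ only in the factor applied to the ceiling term. The following Python sketch evaluates both for the sizes used in the examples for these subroutines (vectors of size 12 on 3 processes). The function name and the pivoting flag are illustrative and are not part of Parallel ESSL.

```python
import math

def tridiagonal_af_min_size(n, nprocs, pivoting):
    """Lower bound on size(af): nprocs * (12*nprocs + c*ceil(n / nprocs)).

    c is 3 for the pivoting subroutine (GTTRF) and 2 for the
    non-pivoting subroutine (DTTRF).
    """
    c = 3 if pivoting else 2
    return nprocs * (12 * nprocs + c * math.ceil(n / nprocs))

print(tridiagonal_af_min_size(12, 3, pivoting=True))    # GTTRF: 3 * (36 + 12) = 144
print(tridiagonal_af_min_size(12, 3, pivoting=False))   # DTTRF: 3 * (36 + 8) = 132
```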

info

is the vector info, of length number_of_processors(). When info is present and you are running on the j-th process, infoj has the following meaning:

If infoj = 0 for all j, the factorization completed normally.
Note: For DTTRF, if the input matrix A is not diagonally dominant, the subroutine may still complete the factorization; however, results are unpredictable.

If 1 <= infoj <= number_of_processors() for any j, the portion of global submatrix A stored on process infoj-1 and factored locally is singular or reducible (for GTTRF) or not diagonally dominant (for DTTRF). The magnitude of a pivot element was zero or too small.

If infoj > number_of_processors() for any j, the portion of global submatrix A stored on process infoj-number_of_processors()-1 and representing interactions with other processes is singular or reducible (for GTTRF) or not diagonally dominant (for DTTRF). The magnitude of a pivot element was zero or too small.

If infoj > 0 for any j, the factorization is completed; however, if you call GTTRS/DTTRS with these factors, the results are unpredictable.

All elements of info will have the same value.

When info is not present or size(info)=0, and matrix A is singular or reducible (for GTTRF) or not diagonally dominant (for DTTRF), then the information for the above computational error is issued in an error message, and your program is terminated.

Type: optional

Returned as: an assumed-shape array with shape (:), containing fullword integers, where infoj >= 0 for j = 1...number_of_processors().

Notes and Coding Rules

  1. The assumed-shape arrays must have the exact size required for the computation, that is: size(dl) = size(d) = size(du) and, for GTTRF, size(du2) = size(ipiv) = size(dl).

  2. The assumed-shape arrays must have no common elements; otherwise, results are unpredictable.

  3. The output from these factorization subroutines should be used only as input to the solve subroutines GTTRS and DTTRS, respectively.

    The factored matrix A is stored in an internal format that depends on the number of processes.

    The vectors for dl, d, du, du2, ipiv, and af input to GTTRS must be the same as the corresponding output arguments for GTTRF.

    The vectors for dl, d, du, and af input to DTTRS must be the same as the corresponding output arguments for DTTRF.

  4. For GTTRF, the general tridiagonal matrix A must be non-singular and irreducible. For DTTRF, the general tridiagonal matrix A must be diagonally dominant to ensure numerical accuracy, because no pivoting is performed. These subroutines use the info argument to provide information about A, as ScaLAPACK does. Unlike ScaLAPACK, however, these subroutines also issue an error message.

  5. For details on how to set up and code your HPF program using Parallel ESSL, see "Coding Your HPF Program".

  6. The general tridiagonal matrix A must be stored in tridiagonal storage mode. For details, see the section on general tridiagonal matrices in "Matrices".

    Block data distribution is required for all array data, except the array for info, which requires cyclic data distribution. Because data directives are included in the interface module PESSL_HPF, you can specify any data distribution for your vectors, and the XL HPF compiler will, if necessary, redistribute the data prior to calling this subroutine. For how to code your HPF directives, see "Distributing Data in an HPF Program". For a sample program including directives, see Figure 10.

  7. The restrictions given in "Notes and Coding Rules" also apply to this subroutine.

Error Conditions

HPF-specific errors are listed below. All errors listed in "Error Conditions" also apply to this subroutine; however, for computational errors, if you do not specify the optional info argument or if size(info)=0, your program terminates as a result of the computational error.

Input-Argument Errors for GTTRF

Stage 1
  1. The rank of the ultimate align target is not 1 for dl, d, du, du2, ipiv, or info (if info is present).
  2. The process rank is not 1 for dl, d, du, du2, ipiv, or info (if info is present).

Stage 2
  1. The data distribution is inconsistent for dl, d, du, du2, and ipiv.
  2. info is present and:
    1. The data distribution is unsupported for info.
    2. info is not distributed (CYCLIC).
    3. The vector for info is replicated.
  3. The process grid is not the same for dl, d, du, du2, and ipiv.
  4. dl, d, du, du2, or ipiv is not distributed (BLOCK).
  5. The vector for dl, d, du, or du2 is replicated.

Stage 3
  1. The shape of the assumed-shape arrays for dl, d, du, du2, and ipiv is incompatible:
    1. size(dl) <> size(ipiv) or
    2. size(d) <> size(ipiv) or
    3. size(du) <> size(ipiv) or
    4. size(du2) <> size(ipiv)
  2. The block sizes for dl, d, du, du2, and ipiv are incompatible.
  3. The abstract process indices for dl, d, du, du2, and ipiv are incompatible.
  4. The data distribution for dl, d, du, du2, or ipiv is unsupported.

Input-Argument Errors for DTTRF

Stage 1
  1. The rank of the ultimate align target is not 1 for dl, d, du, or info (if info is present).
  2. The process rank is not 1 for dl, d, du, or info (if info is present).

Stage 2
  1. The data distribution is inconsistent for dl, d, and du.
  2. info is present and:
    1. The data distribution is unsupported for info.
    2. info is not distributed (CYCLIC).
    3. The vector for info is replicated.
  3. The process grid is not the same for dl, d, and du.
  4. dl, d, or du is not distributed (BLOCK).
  5. The vector for dl, d, or du is replicated.

Stage 3
  1. The shape of the assumed-shape arrays for dl, d, and du is incompatible:
    1. size(dl) <> size(du) or
    2. size(d) <> size(du)
  2. The block sizes for dl, d, and du are incompatible.
  3. The abstract process indices for dl, d, and du are incompatible.
  4. The data distribution for dl, d, or du is unsupported.

Example 1

This example shows a factorization of the general tridiagonal matrix A, of order 12, where matrix A is stored in tridiagonal storage mode:

      *                                                            *
      | 2.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 1.0  3.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  1.0  3.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  1.0  3.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  1.0  3.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  1.0  3.0  2.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  1.0  3.0  2.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  1.0  3.0  2.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0  3.0  2.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0  3.0  2.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0  3.0  2.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0  3.0 |
      *                                                            *

As in "Example 1", array data is block distributed over 3 processes.

Notes:

  1. The vectors dl, d, and du, output from GTTRF, are stored in an internal format that depends on the number of processes. These vectors are passed, unchanged, to the solve subroutine GTTRS.

  2. The contents of vectors du2 and af, output from GTTRF, are not shown. These vectors are passed, unchanged, to the solve subroutine GTTRS.

!HPF$ PROCESSORS PROC(3)
!HPF$ DISTRIBUTE (BLOCK) ONTO PROC :: DL, D, DU, DU2, IPIV, AF
 
CALL GTTRF( DL , D , DU , DU2 , IPIV , AF )
 
      -or-
 
!HPF$ PROCESSORS PROC(3)
!HPF$ DISTRIBUTE (BLOCK) ONTO PROC :: DL, D, DU, DU2, IPIV, AF
!HPF$ DISTRIBUTE (CYCLIC) ONTO PROC :: INFO
 
CALL GTTRF( DL , D , DU , DU2 , IPIV , AF , INFO )

Input

Vector dl of size 12:

 *     *
 |  .  |
 | 1.0 |
 | 1.0 |
 | 1.0 |
 | 1.0 |
 | 1.0 |
 | 1.0 |
 | 1.0 |
 | 1.0 |
 | 1.0 |
 | 1.0 |
 | 1.0 |
 *     *

Vector d of size 12:

 *     *
 | 2.0 |
 | 3.0 |
 | 3.0 |
 | 3.0 |
 | 3.0 |
 | 3.0 |
 | 3.0 |
 | 3.0 |
 | 3.0 |
 | 3.0 |
 | 3.0 |
 | 3.0 |
 *     *

Vector du of size 12:

 *     *
 | 2.0 |
 | 2.0 |
 | 2.0 |
 | 2.0 |
 | 2.0 |
 | 2.0 |
 | 2.0 |
 | 2.0 |
 | 2.0 |
 | 2.0 |
 | 2.0 |
 |  .  |
 *     *

Output

Vector dl of size 12:

 *      *
 |  .   |
 | 0.5  |
 | 0.5  |
 | 0.5  |
 | 1.0  |
 | 0.33 |
 | 0.43 |
 | 0.47 |
 | 1.0  |
 | 1.0  |
 | 1.0  |
 | 1.0  |
 *      *

Vector d of size 12:

 *      *
 | 0.5  |
 | 0.5  |
 | 0.5  |
 | 2.0  |
 | 0.33 |
 | 0.43 |
 | 0.47 |
 | 2.07 |
 | 2.07 |
 | 0.47 |
 | 0.43 |
 | 0.33 |
 *      *

Vector du of size 12:

 *      *
 | 2.0  |
 | 2.0  |
 | 2.0  |
 | 2.0  |
 | 2.0  |
 | 2.0  |
 | 2.0  |
 | 2.0  |
 | 0.93 |
 | 0.86 |
 | 0.67 |
 |  .   |
 *      *

Vector ipiv of size 12:

 *   *
 | 0 |
 | 0 |
 | 0 |
 | 0 |
 | 0 |
 | 0 |
 | 0 |
 | 0 |
 | 0 |
 | 0 |
 | 0 |
 | 0 |
 *   *

Vector info of size 3: (if info is present)

 *   *
 | 0 |
 | 0 |
 | 0 |
 *   *

Example 2

This example shows a factorization of the diagonally dominant general tridiagonal matrix A, of order 12, where matrix A is stored in tridiagonal storage mode.

Matrix A and the input and/or output values for dl, d, du, and info in this example are the same as shown for "Example 1".

As in "Example 2", array data is block distributed over 3 processes.

Notes:

  1. The vectors dl, d, and du, output from DTTRF, are stored in an internal format that depends on the number of processes. These vectors are passed, unchanged, to the solve subroutine DTTRS.

  2. The contents of vector af, output from DTTRF, are not shown. This vector is passed, unchanged, to the solve subroutine DTTRS.

!HPF$ PROCESSORS PROC(3)
!HPF$ DISTRIBUTE (BLOCK) ONTO PROC :: DL, D, DU, AF
 
CALL DTTRF( DL , D , DU , AF )
 
      -or-
 
!HPF$ PROCESSORS PROC(3)
!HPF$ DISTRIBUTE (BLOCK) ONTO PROC :: DL, D, DU, AF
!HPF$ DISTRIBUTE (CYCLIC) ONTO PROC :: INFO
 
CALL DTTRF( DL , D , DU , AF , INFO )

GTTRS and DTTRS--General Tridiagonal Matrix Solve

These subroutines solve the following systems of equations for multiple right-hand sides:

    1. AX = B

GTTRS solves the tridiagonal systems of linear equations, using Gaussian elimination with partial pivoting for the general tridiagonal matrix A stored in tridiagonal storage mode.

DTTRS solves the tridiagonal systems of linear equations, using Gaussian elimination for the diagonally dominant general tridiagonal matrix A stored in tridiagonal storage mode.

In these subroutines:

A is the factored square general tridiagonal matrix.
B is the general matrix containing the right-hand sides in its columns.
X represents the general matrix B, containing the output solution vectors in its columns.

These subroutines use the results of the factorization of matrix A, produced by a preceding call to GTTRF or DTTRF, respectively. The output from these factorization subroutines should be used only as input to the solve subroutines GTTRS and DTTRS, respectively.

If any of the assumed-shape arrays have a size of zero, no computation is performed and the subroutine returns after doing some parameter checking.

See references [51], [16], [18], [22], [36], and [37].

Table 130. Data Types
dl, d, du, du2, B, af    ipiv       Subroutine
Long-precision real      Integer    GTTRS and DTTRS

Syntax

HPF CALL GTTRS (dl, d, du, du2, ipiv, b, af)

CALL GTTRS (dl, d, du, du2, ipiv, b, af, transa, info)

HPF CALL DTTRS (dl, d, du, b, af)

CALL DTTRS (dl, d, du, b, af, transa, info)

On Entry

dl

is the updated vector dl, containing part of the factorization, produced by a preceding call to GTTRF or DTTRF.

Type: required

Specified as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 130.

d

is the updated vector d, containing part of the factorization, produced by a preceding call to GTTRF or DTTRF.

Type: required

Specified as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 130.

du

is the updated vector du, containing part of the factorization, produced by a preceding call to GTTRF or DTTRF.

Type: required

Specified as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 130.

du2

is the vector du2, containing part of the factorization, produced by a preceding call to GTTRF.

Type: required (GTTRS); not present (DTTRS)

Specified as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 130.

ipiv

is the vector ipiv, containing the pivot information produced by a preceding call to GTTRF.

Type: required (GTTRS); not present (DTTRS)

Specified as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 130.

b

is the general matrix B, containing the multiple right-hand sides of the system.

Type: required

Specified as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 130.

af

is a work area used by these subroutines and contains part of the factorization, produced by a preceding call to GTTRF or DTTRF.

Type: required

Specified as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 130, where:

For GTTRS:

size(af) >= number_of_processors() × { 12 × number_of_processors() + 3 × ceiling(size(dl) / number_of_processors()) }

For DTTRS:

size(af) >= number_of_processors() × { 12 × number_of_processors() + 2 × ceiling(size(dl) / number_of_processors()) }

transa

indicates matrix A is used in the computation, resulting in solution 1.

Type: optional

Default: transa = 'N'

Specified as: a single character; transa = 'N'.

info

See 'On Return'.

On Return

b

is the updated matrix B, containing the solution vectors.

Type: required

Returned as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 130.

info

is the vector info of length equal to the number_of_processors(). info is set to zero, indicating that a successful computation occurred on each process.

Type: optional

Returned as: an assumed-shape array with shape (:), containing fullword integers.

Notes and Coding Rules

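  1. The assumed-shape arrays must have the exact size required for the computation, that is:

    For GTTRS: size(dl) = size(d) = size(du) = size(du2) = size(ipiv) = size(b,1)

    For DTTRS: size(dl) = size(d) = size(du) = size(b,1)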

  2. The assumed-shape arrays must have no common elements; otherwise results are unpredictable.

  3. The subroutine accepts lowercase letters for the transa argument.

  4. The output from the factorization subroutines GTTRF and DTTRF should be used only as input to the solve subroutines GTTRS and DTTRS, respectively.

    The factored matrix A is stored in an internal format that depends on the number of processes.

    The vectors for dl, d, du, du2, ipiv, and af input to GTTRS must be the same as the corresponding output arguments for GTTRF.

    The vectors for dl, d, du, and af input to DTTRS must be the same as the corresponding output arguments for DTTRF.

  5. For details on how to set up and code your HPF program using Parallel ESSL, see "Coding Your HPF Program".

  6. The general tridiagonal matrix A must be stored in tridiagonal storage mode. For details, see the section on general tridiagonal matrices in "Matrices".

    Block data distribution is required for all array data, except the array for info, which requires cyclic data distribution. Because data directives are included in the interface module PESSL_HPF, you can specify any data distribution for your vectors and matrix, and the XL HPF compiler will, if necessary, redistribute the data prior to calling this subroutine. For how to code your HPF directives, see "Distributing Data in an HPF Program". For a sample program including directives, see Figure 10.

  7. The restrictions given in "Notes and Coding Rules" also apply to this subroutine.

Error Conditions

HPF-specific errors are listed below. Resource and input-argument errors listed in "Error Conditions" also apply to this subroutine.

Computational Errors

None
Note: If the factorization performed by GTTRF or DTTRF failed because of a singular or reducible matrix A (for GTTRF) or not diagonally dominant matrix A (for DTTRF), the results returned by this subroutine are unpredictable. For details, see the info output argument for GTTRF or DTTRF.

Input-Argument Errors for GTTRS

Stage 1
  1. The rank of the ultimate align target is not 1 for dl, d, du, du2, ipiv, b, or info (if info is present).
  2. The process rank is not 1 for dl, d, du, du2, ipiv, b, or info (if info is present).

Stage 2
  1. The data distribution is inconsistent for dl, d, du, du2, and ipiv.
  2. info is present and:
    1. The data distribution is unsupported for info.
    2. info is not distributed (CYCLIC).
    3. The vector for info is replicated.
  3. The process grid is not the same for dl, d, du, du2, and ipiv.
  4. b is not distributed (BLOCK,*).
  5. dl, d, du, du2, or ipiv is not distributed (BLOCK).
  6. The vector for dl, d, du, or du2 is replicated.

Stage 3
  1. The shape of the assumed-shape arrays for dl, d, du, du2, ipiv, and b is incompatible:
    1. size(dl) <> size(b,1) or
    2. size(d) <> size(b,1) or
    3. size(du) <> size(b,1) or
    4. size(du2) <> size(b,1) or
    5. size(ipiv) <> size(b,1)
  2. The block sizes for dl, d, du, du2, ipiv, and b are incompatible.
  3. The abstract process indices for dl, d, du, du2, ipiv, and b are incompatible.
  4. The data distribution for dl, d, du, du2, or ipiv is unsupported.

Input-Argument Errors for DTTRS

Stage 1
  1. The rank of the ultimate align target is not 1 for dl, d, du, b, or info (if info is present).
  2. The process rank is not 1 for dl, d, du, b, or info (if info is present).

Stage 2
  1. The data distribution is inconsistent for dl, d, and du.
  2. info is present and:
    1. The data distribution is unsupported for info.
    2. info is not distributed (CYCLIC).
    3. The vector for info is replicated.
  3. The process grid is not the same for dl, d, and du.
  4. b is not distributed (BLOCK,*).
  5. dl, d, or du is not distributed (BLOCK).
  6. The vector for dl, d, or du is replicated.

Stage 3
  1. The shape of the assumed-shape arrays for dl, d, du, and b is incompatible:
    1. size(dl) <> size(b,1) or
    2. size(d) <> size(b,1) or
    3. size(du) <> size(b,1)
  2. The block sizes for dl, d, du, and b are incompatible.
  3. The abstract process indices for dl, d, du, and b are incompatible.
  4. The data distribution for dl, d, or du is unsupported.
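
The Stage 3 shape conditions for GTTRS and DTTRS can be checked before the call. The following serial sketch tests only the sizes listed above; it does not examine block sizes or data distribution. The program name and the array short_du are illustrative, and the arrays are declared only so that their sizes can be queried.

```fortran
! Serial sketch of the Stage 3 shape test; short_du is a deliberately
! mis-sized, hypothetical array used to show a failing case.
program shape_check_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  real(dp) :: dl(12), d(12), du(12), du2(12), b(12, 3), short_du(11)
  integer  :: ipiv(12)

  ! GTTRS: every vector must have size(b,1) elements.
  print '(a,l2)', 'GTTRS shapes compatible:  ', &
       all((/ size(dl), size(d), size(du), size(du2), size(ipiv) /) == size(b, 1))

  ! DTTRS: du2 and ipiv are not present.
  print '(a,l2)', 'DTTRS shapes compatible:  ', &
       all((/ size(dl), size(d), size(du) /) == size(b, 1))

  ! An 11-element du fails the test.
  print '(a,l2)', 'DTTRS with 11-element du: ', &
       all((/ size(dl), size(d), size(short_du) /) == size(b, 1))
end program shape_check_sketch
```

The first two tests print T and the third prints F.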

Example 1

This example shows how to solve the system AX = B, where matrix A is the same matrix factored in "Example 1" for GTTRF.

As in "Example 1", array data is block distributed over 3 processes.

Notes:

  1. The vectors dl, d, and du, output from GTTRF, are stored in an internal format that depends on the number of processes. These vectors are passed, unchanged, to the solve subroutine GTTRS.

  2. The contents of vectors du2 and af, output from GTTRF, are not shown. These vectors are passed, unchanged, to the solve subroutine GTTRS.


!HPF$ PROCESSORS PROC(3)
!HPF$ DISTRIBUTE (BLOCK) ONTO PROC :: DL, D, DU, DU2, IPIV, AF
!HPF$ DISTRIBUTE (BLOCK,*) ONTO PROC :: B
 
CALL GTTRS( DL , D , DU , DU2 , IPIV , B , AF )
 
      -or-
 
!HPF$ PROCESSORS PROC(3)
!HPF$ DISTRIBUTE (BLOCK) ONTO PROC :: DL, D, DU, DU2, IPIV, AF
!HPF$ DISTRIBUTE (BLOCK,*) ONTO PROC :: B
!HPF$ DISTRIBUTE (CYCLIC) ONTO PROC :: INFO
 
CALL GTTRS( DL , D , DU , DU2 , IPIV , B , AF , TRANSA='N' , INFO=INFO )

Input

Vector dl of size 12:

 *      *
 |  .   |
 | 0.5  |
 | 0.5  |
 | 0.5  |
 | 1.0  |
 | 0.33 |
 | 0.43 |
 | 0.47 |
 | 1.0  |
 | 1.0  |
 | 1.0  |
 | 1.0  |
 *      *

Vector d of size 12:

 *      *
 | 0.5  |
 | 0.5  |
 | 0.5  |
 | 2.0  |
 | 0.33 |
 | 0.43 |
 | 0.47 |
 | 2.07 |
 | 2.07 |
 | 0.47 |
 | 0.43 |
 | 0.33 |
 *      *

Vector du of size 12:

 *      *
 | 2.0  |
 | 2.0  |
 | 2.0  |
 | 2.0  |
 | 2.0  |
 | 2.0  |
 | 2.0  |
 | 2.0  |
 | 0.93 |
 | 0.86 |
 | 0.67 |
 |  .   |
 *      *

Vector ipiv of size 12:

 *   *
 | 0 |
 | 0 |
 | 0 |
 | 0 |
 | 0 |
 | 0 |
 | 0 |
 | 0 |
 | 0 |
 | 0 |
 | 0 |
 | 0 |
 *   *

Rectangular 12 × 3 matrix B:

 *                   *
 | 46.0    6.0   4.0 |
 | 65.0   13.0   6.0 |
 | 59.0   19.0   6.0 |
 | 53.0   25.0   6.0 |
 | 47.0   31.0   6.0 |
 | 41.0   37.0   6.0 |
 | 35.0   43.0   6.0 |
 | 29.0   49.0   6.0 |
 | 23.0   55.0   6.0 |
 | 17.0   61.0   6.0 |
 | 11.0   67.0   6.0 |
 |  5.0   47.0   4.0 |
 *                   *

Output

Rectangular 12 × 3 matrix B:

 *                   *
 | 12.0    1.0   1.0 |
 | 11.0    2.0   1.0 |
 | 10.0    3.0   1.0 |
 |  9.0    4.0   1.0 |
 |  8.0    5.0   1.0 |
 |  7.0    6.0   1.0 |
 |  6.0    7.0   1.0 |
 |  5.0    8.0   1.0 |
 |  4.0    9.0   1.0 |
 |  3.0   10.0   1.0 |
 |  2.0   11.0   1.0 |
 |  1.0   12.0   1.0 |
 *                   *

Vector info of size 3: (if info is present)

 *   *
 | 0 |
 | 0 |
 | 0 |
 *   *

Example 2

This example shows how to solve the system AX = B, where matrix A is the same matrix factored in "Example 2" for DTTRF.

The input and/or output values for dl, d, du, b, transa, and info in this example are the same as shown for "Example 1".

As in "Example 2", array data is block distributed over 3 processes.

Notes:

  1. The vectors dl, d, and du, output from DTTRF, are stored in an internal format that depends on the number of processes. These vectors are passed, unchanged, to the solve subroutine DTTRS.

  2. The contents of vector af, output from DTTRF, are not shown. This vector is passed, unchanged, to the solve subroutine DTTRS.

!HPF$ PROCESSORS PROC(3)
!HPF$ DISTRIBUTE (BLOCK) ONTO PROC :: DL, D, DU, AF
!HPF$ DISTRIBUTE (BLOCK,*) ONTO PROC :: B
 
CALL DTTRS( DL , D , DU , B , AF )
 
      -or-
 
!HPF$ PROCESSORS PROC(3)
!HPF$ DISTRIBUTE (BLOCK) ONTO PROC :: DL, D, DU, AF
!HPF$ DISTRIBUTE (BLOCK,*) ONTO PROC :: B
!HPF$ DISTRIBUTE (CYCLIC) ONTO PROC :: INFO
 
CALL DTTRS( DL , D , DU , B , AF , TRANSA='N' , INFO=INFO )

PTSV--Positive Definite Symmetric Tridiagonal Matrix Factorization and Solve

This subroutine solves the tridiagonal systems of linear equations, AX = B, where the positive definite symmetric tridiagonal matrix A is stored in parallel-symmetric-tridiagonal storage mode. In this description:

A is the positive definite symmetric tridiagonal matrix.
B is the general matrix containing the right-hand sides in its columns.
X represents the general matrix B, containing the output solution vectors in its columns.

If any of the assumed-shape arrays have a size of zero, no computation is performed and the subroutine returns after doing some parameter checking.

See references [51], [16], [18], [22], [36], and [37].

Table 131. Data Types
d, e, B               Subroutine
Long-precision real   PTSV

Syntax

HPF CALL PTSV (d, e, b)

CALL PTSV (d, e, b, info)

On Entry

d

is the vector d, containing the main diagonal of the positive definite symmetric tridiagonal matrix A in elements 1 through size(d).

Type: required

Specified as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 131.

On output, d is overwritten; that is, the original input is not preserved.

e

is the vector e, containing the off-diagonal of the positive definite symmetric tridiagonal matrix A in elements 1 through size(e)-1.

Type: required

Specified as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 131.

On output, e is overwritten; that is, the original input is not preserved.

b

is the general matrix B, containing the multiple right-hand sides of the system.

Type: required

Specified as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 131.

info

See 'On Return'.

On Return

d

is overwritten; that is, the original input is not preserved.

e

is overwritten; that is, the original input is not preserved.

b

is the updated matrix B, containing the solution vectors.

Type: required

Returned as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 131.

info

is the vector info of length equal to the number_of_processors(), where, if you are running on the j-th process, then info_j has the following meaning, when info is present:

If info_j = 0 for all j, matrix A is positive definite, and the factorization completed normally.

If 1 <= info_j <= number_of_processors() for any j, the portion of global submatrix A stored on process info_j - 1 and factored locally is not positive definite. A pivot element whose value is less than or equal to a small positive number was detected.

If info_j > number_of_processors() for any j, the portion of global submatrix A stored on process info_j - number_of_processors() - 1, representing interactions with other processes, is not positive definite. A pivot element whose value is less than or equal to a small positive number was detected.

If info_j > 0 for any j, the results of the computation are unpredictable.

All elements of info will have the same value.

When info is not present or size(info)=0, and matrix A is not positive definite, then the information for the above computational error is issued in an error message, and your program is terminated.

Type: optional

Returned as: an assumed-shape array with shape (:), containing fullword integers, where info_j >= 0 for j = 1...number_of_processors().
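
The three value ranges described above for an element of info can be decoded as follows. This is a serial sketch of the arithmetic only. The subroutine name describe_info and the sample values are hypothetical, and nprocs stands in for number_of_processors().

```fortran
! Serial sketch; describe_info and the sample values are illustrative only.
program info_decode_sketch
  implicit none
  integer, parameter :: nprocs = 3   ! stands in for number_of_processors()

  call describe_info(0, nprocs)      ! normal completion
  call describe_info(2, nprocs)      ! locally factored portion, process 1
  call describe_info(5, nprocs)      ! interprocess portion, process 1

contains

  subroutine describe_info(info_j, p)
    integer, intent(in) :: info_j, p
    if (info_j == 0) then
      print '(a)', 'factorization completed normally'
    else if (info_j >= 1 .and. info_j <= p) then
      print '(a,i0)', 'not positive definite: locally factored portion on process ', &
           info_j - 1
    else if (info_j > p) then
      print '(a,i0)', 'not positive definite: interprocess portion on process ', &
           info_j - p - 1
    else
      ! Negative values are outside the documented range.
      print '(a)', 'value outside the documented range'
    end if
  end subroutine describe_info

end program info_decode_sketch
```

With 3 processes, a value of 2 points to the locally factored portion on process 1, and a value of 5 points to the interprocess portion on process 1.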

Notes and Coding Rules

  1. The assumed-shape arrays must have the exact size required for the computation, that is: size(d) = size(e) = size(b,1).

  2. The assumed-shape arrays must have no common elements; otherwise results are unpredictable.

  3. The symmetric tridiagonal matrix A must be positive definite. This subroutine uses the info argument to provide information about A, like ScaLAPACK. However, this subroutine also issues an error message, which differs from ScaLAPACK.

  4. For details on how to set up and code your HPF program using Parallel ESSL, see "Coding Your HPF Program".

  5. The positive definite symmetric tridiagonal matrix A must be stored in parallel-symmetric-tridiagonal storage mode. For details, see the section on tridiagonal matrices in "Matrices".

    Block data distribution is required for all array data, except the array for info, which requires cyclic data distribution. Because data directives are included in the interface module PESSL_HPF, you can specify any data distribution for your vectors and matrix, and the XL HPF compiler will, if necessary, redistribute the data prior to calling this subroutine. For how to code your HPF directives, see "Distributing Data in an HPF Program". For a sample program including directives, see Figure 10.

  6. The restrictions given in "Notes and Coding Rules" also apply to this subroutine.

Error Conditions

HPF-specific errors are listed below. All errors listed in "Error Conditions" also apply to this subroutine; however, for computational errors, if you do not specify the optional info argument or if size(info)=0, your program terminates as a result of the computational error.

Input-Argument Errors

Stage 1
  1. The rank of the ultimate align target is not 1 for d, e, b, or info (if info is present).
  2. The process rank is not 1 for d, e, b, or info (if info is present).

Stage 2
  1. The data distribution is inconsistent for d and e.
  2. info is present and:
    1. The data distribution is unsupported for info.
    2. info is not distributed (CYCLIC).
    3. The vector for info is replicated.
  3. The process grid is not the same for d and e.
  4. b is not distributed (BLOCK,*).
  5. d or e is not distributed (BLOCK).
  6. The vector for d or e is replicated.

Stage 3
  1. The shape of the assumed-shape arrays for d, e, and b is incompatible:
    1. size(d) <> size(b,1) or
    2. size(e) <> size(b,1)
  2. The block sizes for d, e, and b are incompatible.
  3. The abstract process indices for d, e, and b are incompatible.
  4. The data distribution for d or e is unsupported.

Example 1

This example shows the factorization and solution of the system AX = B, where the positive definite symmetric tridiagonal matrix A, of order 12, is stored in parallel-symmetric-tridiagonal storage mode:

      *                                                            *
      | 4.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 2.0  5.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  2.0  5.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  2.0  5.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  2.0  5.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  2.0  5.0  2.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  2.0  5.0  2.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  2.0  5.0  2.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  2.0  5.0  2.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  2.0  5.0  2.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  2.0  5.0  2.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  2.0  5.0 |
      *                                                            *

As in "Example", array data is block distributed over 3 processes.
Note: On output, vectors d and e are overwritten by this subroutine.

!HPF$ PROCESSORS PROC(3)
!HPF$ DISTRIBUTE (BLOCK) ONTO PROC :: D, E
!HPF$ DISTRIBUTE (BLOCK,*) ONTO PROC :: B
 
CALL PTSV( D , E , B )
 
      -or-
 
!HPF$ PROCESSORS PROC(3)
!HPF$ DISTRIBUTE (BLOCK) ONTO PROC :: D, E
!HPF$ DISTRIBUTE (BLOCK,*) ONTO PROC :: B
!HPF$ DISTRIBUTE (CYCLIC) ONTO PROC :: INFO
 
CALL PTSV( D , E , B , INFO )

Input

Vector d of size 12:

 *                                                            *
 | 4.0  5.0  5.0  5.0  5.0  5.0  5.0  5.0  5.0  5.0  5.0  5.0 |
 *                                                            *

Vector e of size 12:

 *                                                            *
 | 2.0  2.0  2.0  2.0  2.0  2.0  2.0  2.0  2.0  2.0  2.0   .  |
 *                                                            *

Rectangular 12 × 3 matrix B:

 *                   *
 | 70.0    8.0   6.0 |
 | 99.0   18.0   9.0 |
 | 90.0   27.0   9.0 |
 | 81.0   36.0   9.0 |
 | 72.0   45.0   9.0 |
 | 63.0   54.0   9.0 |
 | 54.0   63.0   9.0 |
 | 45.0   72.0   9.0 |
 | 36.0   81.0   9.0 |
 | 27.0   90.0   9.0 |
 | 18.0   99.0   9.0 |
 |  9.0   82.0   7.0 |
 *                   *

Output

Rectangular 12 × 3 matrix B:

 *                   *
 | 12.0    1.0   1.0 |
 | 11.0    2.0   1.0 |
 | 10.0    3.0   1.0 |
 |  9.0    4.0   1.0 |
 |  8.0    5.0   1.0 |
 |  7.0    6.0   1.0 |
 |  6.0    7.0   1.0 |
 |  5.0    8.0   1.0 |
 |  4.0    9.0   1.0 |
 |  3.0   10.0   1.0 |
 |  2.0   11.0   1.0 |
 |  1.0   12.0   1.0 |
 *                   *

Vector info of size 3: (if info is present)

 *   *
 | 0 |
 | 0 |
 | 0 |
 *   *
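
The example data can be cross-checked without Parallel ESSL. The following serial sketch factors A as L*D*L^T and then performs forward and back substitution on the three right-hand sides. It is not the parallel algorithm used by PTSV, and it tests pivots against zero rather than against a small positive number. The program name is illustrative.

```fortran
! Serial sketch only; not the parallel algorithm used by PTSV.
program ptsv_serial_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 12, nrhs = 3
  real(dp) :: d(n), e(n), b(n, nrhs)
  integer  :: i

  ! Input data copied from the example.
  d = 5.0_dp
  d(1) = 4.0_dp
  e = 2.0_dp                         ! e(n) is never referenced below
  b(:, 1) = real((/ 70, 99, 90, 81, 72, 63, 54, 45, 36, 27, 18, 9 /), dp)
  b(:, 2) = real((/ 8, 18, 27, 36, 45, 54, 63, 72, 81, 90, 99, 82 /), dp)
  b(:, 3) = real((/ 6, 9, 9, 9, 9, 9, 9, 9, 9, 9, 9, 7 /), dp)

  ! Factor A = L*D*L^T: d becomes the pivots, e(1:n-1) the multipliers of L.
  do i = 1, n
    if (d(i) <= 0.0_dp) stop 'matrix is not positive definite'
    if (i < n) then
      d(i + 1) = d(i + 1) - e(i) * e(i) / d(i)
      e(i) = e(i) / d(i)
    end if
  end do

  ! Forward substitution: solve L*y = b, overwriting b with y.
  do i = 2, n
    b(i, :) = b(i, :) - e(i - 1) * b(i - 1, :)
  end do

  ! Diagonal scaling and back substitution: solve D*L^T*x = y.
  b(n, :) = b(n, :) / d(n)
  do i = n - 1, 1, -1
    b(i, :) = b(i, :) / d(i) - e(i) * b(i + 1, :)
  end do

  ! Each printed row is one row of the solution matrix X.
  do i = 1, n
    print '(3f8.1)', b(i, :)
  end do
end program ptsv_serial_sketch
```

For this matrix every pivot is 4.0 and every multiplier is 0.5, so the arithmetic is exact and the printed rows can be compared directly with the output matrix B of the example.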

PTTRF--Positive Definite Symmetric Tridiagonal Matrix Factorization

This subroutine factors the positive definite symmetric tridiagonal matrix A, stored in parallel-symmetric-tridiagonal storage mode.

To solve a tridiagonal system of linear equations with multiple right-hand sides, follow the call to PTTRF with one or more calls to PTTRS. The output from this factorization subroutine should be used only as input to the solve subroutine PTTRS.

If any of the assumed-shape arrays have a size of zero, no computation is performed and the subroutine returns after doing some parameter checking.

See references [51], [16], [18], [22], [36], and [37].

Table 132. Data Types
d, e, af              Subroutine
Long-precision real   PTTRF

Syntax

HPF CALL PTTRF (d, e, af)

CALL PTTRF (d, e, af, info)

On Entry

d

is the vector d, containing the main diagonal of the tridiagonal matrix A in elements 1 through size(d).

Type: required

Specified as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 132.

On output, d is overwritten; that is, the original input is not preserved.

e

is the vector e, containing the off-diagonal of the positive definite symmetric tridiagonal matrix A in elements 1 through size(e)-1.

Type: required

Specified as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 132.

On output, e is overwritten; that is, the original input is not preserved.

af

See 'On Return'.

info

See 'On Return'.

On Return

d

is the updated vector d, containing part of the factorization.

Type: required

Returned as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 132.

On output, d is overwritten; that is, the original input is not preserved.

e

is the updated vector e, containing part of the factorization.

Type: required

Returned as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 132.

On output, e is overwritten; that is, the original input is not preserved.

af

is a work area used by these subroutines and contains part of the factorization.

Type: required

Returned as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 132, where:

size(af) >=
number_of_processors() { (12)(number_of_processors())
+ 3 (ceiling(size(e) / number_of_processors())) }

info

is the vector info of length equal to the number_of_processors(), where, if you are running on the j-th process, then info_j has the following meaning, when info is present:

If info_j = 0 for all j, matrix A is positive definite, and the factorization completed normally.

If 1 <= info_j <= number_of_processors() for any j, the portion of global submatrix A stored on process info_j - 1 and factored locally is not positive definite. A pivot element whose value is less than or equal to a small positive number was detected.

If info_j > number_of_processors() for any j, the portion of global submatrix A stored on process info_j - number_of_processors() - 1, representing interactions with other processes, is not positive definite. A pivot element whose value is less than or equal to a small positive number was detected.

If info_j > 0 for any j, the factorization is completed; however, if you call PTTRS with these factors, the results are unpredictable.

All elements of info will have the same value.

When info is not present or size(info)=0, and matrix A is not positive definite, then the information for the above computational error is issued in an error message, and your program is terminated.

Type: optional

Returned as: an assumed-shape array with shape (:), containing fullword integers, where info_j >= 0 for j = 1...number_of_processors().

Notes and Coding Rules

  1. The assumed-shape arrays must have the exact size required for the computation, that is:
    size(d) = size(e)

  2. The assumed-shape arrays must have no common elements; otherwise, results are unpredictable.

  3. The output from this factorization subroutine should be used only as input to the solve subroutine PTTRS.

    The factored matrix A is stored in an internal format that depends on the number of processes.

    The vectors for d, e, and af input to PTTRS must be the same as the corresponding output arguments for PTTRF.

  4. The symmetric tridiagonal matrix A must be positive definite. This subroutine uses the info argument to provide information about A, like ScaLAPACK. However, this subroutine also issues an error message, which differs from ScaLAPACK.

  5. For details on how to set up and code your HPF program using Parallel ESSL, see "Coding Your HPF Program".

  6. The positive definite symmetric tridiagonal matrix A must be stored in parallel-symmetric-tridiagonal storage mode. For details, see the section on tridiagonal matrices in "Matrices".

    Block data distribution is required for all array data, except the array for info, which requires cyclic data distribution. Because data directives are included in the interface module PESSL_HPF, you can specify any data distribution for your vectors, and the XL HPF compiler will, if necessary, redistribute the data prior to calling this subroutine. For how to code your HPF directives, see "Distributing Data in an HPF Program". For a sample program including directives, see Figure 10.

  7. The restrictions given in "Notes and Coding Rules" also apply to this subroutine.

Error Conditions

HPF-specific errors are listed below. All errors listed in "Error Conditions" also apply to this subroutine; however, for computational errors, if you do not specify the optional info argument or if size(info)=0, your program terminates as a result of the computational error.

Input-Argument Errors

Stage 1
  1. The rank of the ultimate align target is not 1 for d, e, or info (if info is present).
  2. The process rank is not 1 for d, e, or info (if info is present).

Stage 2
  1. The data distribution is inconsistent for d and e.
  2. info is present and:
    1. The data distribution is unsupported for info.
    2. info is not distributed (CYCLIC).
    3. The vector for info is replicated.
  3. The process grid is not the same for d and e.
  4. d or e is not distributed (BLOCK).
  5. The vector for d or e is replicated.

Stage 3
  1. The shape of the assumed-shape arrays for d and e is incompatible:
    size(d) <> size(e)
  2. The block sizes for d and e are incompatible.
  3. The abstract process indices for d and e are incompatible.
  4. The data distribution for d or e is unsupported.

Example 1

This example shows a factorization of the positive definite symmetric tridiagonal matrix A, of order 12, where matrix A is stored in parallel-symmetric-tridiagonal storage mode:

      *                                                            *
      | 4.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 2.0  5.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  2.0  5.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  2.0  5.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  2.0  5.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  2.0  5.0  2.0  0.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  2.0  5.0  2.0  0.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  2.0  5.0  2.0  0.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  2.0  5.0  2.0  0.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  2.0  5.0  2.0  0.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  2.0  5.0  2.0 |
      | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  2.0  5.0 |
      *                                                            *

As in "Example", array data is block distributed over 3 processes.

Notes:

  1. The vectors, d and e, output from PTTRF, are stored in an internal format that depends on the number of processes. These vectors are passed, unchanged, to the solve subroutine PTTRS.

  2. The contents of vector af, output from PTTRF, are not shown. This vector is passed, unchanged, to the solve subroutine PTTRS.

!HPF$ PROCESSORS PROC(3)
!HPF$ DISTRIBUTE (BLOCK) ONTO PROC :: D, E, AF
 
CALL PTTRF( D , E , AF )
 
      -or-
 
!HPF$ PROCESSORS PROC(3)
!HPF$ DISTRIBUTE (BLOCK) ONTO PROC :: D, E, AF
!HPF$ DISTRIBUTE (CYCLIC) ONTO PROC :: INFO
 
CALL PTTRF( D , E , AF , INFO )

Input

Vector d of size 12:

 *     *
 | 4.0 |
 | 5.0 |
 | 5.0 |
 | 5.0 |
 | 5.0 |
 | 5.0 |
 | 5.0 |
 | 5.0 |
 | 5.0 |
 | 5.0 |
 | 5.0 |
 | 5.0 |
 *     *

Vector e of size 12:

 *     *
 | 2.0 |
 | 2.0 |
 | 2.0 |
 | 2.0 |
 | 2.0 |
 | 2.0 |
 | 2.0 |
 | 2.0 |
 | 2.0 |
 | 2.0 |
 | 2.0 |
 |  .  |
 *     *

Output

Vector d of size 12:

 *       *
 |  .25  |
 |  .25  |
 |  .25  |
 | 4.0   |
 |  .20  |
 |  .24  |
 |  .25  |
 | 4.01  |
 | 4.01  |
 |  .25  |
 |  .24  |
 |  .20  |
 *       *

Vector e of size 12:

 *       *
 | 2.0   |
 | 2.0   |
 | 2.0   |
 | 2.0   |
 | 2.0   |
 | 2.0   |
 | 2.0   |
 | 2.0   |
 |  .49  |
 |  .48  |
 |  .40  |
 |  .    |
 *       *

Vector info of size 3: (if info is present)

 *   *
 | 0 |
 | 0 |
 | 0 |
 *   *
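
For comparison, a serial L*D*L^T factorization of the same matrix can be sketched as follows. This is not the algorithm used by PTTRF, and its output is not in the process-dependent internal format shown in the example. The subroutine name ldlt_tridiag and its status variable are hypothetical, and the pivot test uses zero as the threshold, which is an assumption; the text above specifies only "a small positive number".

```fortran
! Serial sketch only; ldlt_tridiag is an illustrative name, not a library routine.
program ldlt_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 12
  real(dp) :: d(n), e(n)
  integer  :: status

  ! Input data copied from the example; e(n) is not referenced.
  d = 5.0_dp
  d(1) = 4.0_dp
  e = 2.0_dp

  call ldlt_tridiag(d, e, status)
  print '(a,i0)', 'status = ', status
  print '(a,12f5.1)', 'pivots:     ', d
  print '(a,11f5.1)', 'multipliers:', e(1:n-1)

contains

  ! On return, d holds the diagonal of D and e(1:n-1) the subdiagonal of the
  ! unit lower bidiagonal L in A = L*D*L^T. status is 0 on success; otherwise
  ! it is the index of the first pivot that is not positive.
  subroutine ldlt_tridiag(d, e, status)
    real(dp), intent(inout) :: d(:), e(:)
    integer, intent(out)    :: status
    real(dp) :: ei
    integer  :: i
    status = 0
    if (size(d) == 0) return
    do i = 1, size(d) - 1
      if (d(i) <= 0.0_dp) then       ! assumed threshold of zero
        status = i
        return
      end if
      ei = e(i)
      e(i) = ei / d(i)
      d(i + 1) = d(i + 1) - e(i) * ei
    end do
    if (d(size(d)) <= 0.0_dp) status = size(d)
  end subroutine ldlt_tridiag

end program ldlt_sketch
```

For the matrix of this example the sketch reports a status of 0, with every pivot equal to 4.0 and every multiplier equal to 0.5.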

PTTRS--Positive Definite Symmetric Tridiagonal Matrix Solve

This subroutine solves the following systems of equations for multiple right-hand sides:

    1. AX = B

PTTRS solves the tridiagonal systems of linear equations, where the positive definite symmetric tridiagonal matrix A is stored in parallel-symmetric-tridiagonal storage mode, where:

A is the factored positive definite symmetric tridiagonal matrix.
B is the general matrix containing the right-hand sides in its columns.
X represents the general matrix B, containing the output solution vectors in its columns.

This subroutine uses the results of the factorization of matrix A, produced by a preceding call to PTTRF. The output from this factorization subroutine should be used only as input to the solve subroutine PTTRS.

If any of the assumed-shape arrays have a size of zero, no computation is performed and the subroutine returns after doing some parameter checking.

See references [51], [16], [18], [22], [36], and [37].

Table 133. Data Types
d, e, B, af           Subroutine
Long-precision real   PTTRS

Syntax

HPF CALL PTTRS (d, e, b, af)

CALL PTTRS (d, e, b, af, info)

On Entry

d

is the updated vector d, containing part of the factorization, produced by a preceding call to PTTRF.

Type: required

Specified as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 133.

e

is the updated vector e, containing part of the factorization, produced by a preceding call to PTTRF.

Type: required

Specified as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 133.

b

is the general matrix B, containing the multiple right-hand sides of the system.

Type: required

Specified as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 133.

af

is a work area used by these subroutines and contains part of the factorization, produced by a preceding call to PTTRF.

Type: required

Specified as: an assumed-shape array with shape (:), containing numbers of the data type indicated in Table 133, where:

size(af) >=
number_of_processors() { (12)(number_of_processors())
+ 3 (ceiling(size(e) / number_of_processors())) }

info

See 'On Return'.

On Return

b

is the updated matrix B, containing the solution vectors.

Type: required

Returned as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 133.

info

is the vector info of length equal to the number_of_processors(). info is set to zero, indicating that a successful computation occurred on each process.

Type: optional

Returned as: an assumed-shape array with shape (:), containing fullword integers.

Notes and Coding Rules

  1. The assumed-shape arrays must have the exact size required for the computation, that is:
    size(d) = size(e) = size(b,1)

  2. The assumed-shape arrays must have no common elements; otherwise results are unpredictable.

  3. The output from the factorization subroutine PTTRF should be used only as input to the solve subroutine PTTRS.

    The factored matrix A is stored in an internal format that depends on the number of processes.

    The vectors for d, e, and af input to PTTRS must be the same as the corresponding output arguments for PTTRF.

  4. For details on how to set up and code your HPF program using Parallel ESSL, see "Coding Your HPF Program".

  5. The positive definite symmetric tridiagonal matrix A must be stored in parallel-symmetric-tridiagonal storage mode. For details, see the section on tridiagonal matrices in "Matrices".

    Block data distribution is required for all array data, except the array for info, which requires cyclic data distribution. Because data directives are included in the interface module PESSL_HPF, you can specify any data distribution for your vectors and matrix, and the XL HPF compiler will, if necessary, redistribute the data prior to calling this subroutine. For how to code your HPF directives, see "Distributing Data in an HPF Program". For a sample program including directives, see Figure 10.

  6. The restrictions given in "Notes and Coding Rules" also apply to this subroutine.

Error Conditions

HPF-specific errors are listed below. Resource and input-argument errors listed in "Error Conditions" also apply to this subroutine.

Computational Errors

None
Note: If the factorization performed by PTTRF failed because of a nonpositive definite matrix A, the results returned by this subroutine are unpredictable. For details, see the info output argument for PTTRF.

Input-Argument Errors

Stage 1
  1. The rank of the ultimate align target is not 1 for d, e, b, or info (if info is present).
  2. The process rank is not 1 for d, e, b, or info (if info is present).

Stage 2
  1. The data distribution is inconsistent for d and e.
  2. info is present and:
    1. The data distribution is unsupported for info.
    2. info is not distributed (CYCLIC).
    3. The vector for info is replicated.
  3. The process grid is not the same for d and e.
  4. b is not distributed (BLOCK,*).
  5. d or e is not distributed (BLOCK).
  6. The vector for d or e is replicated.

Stage 3
  1. The shape of the assumed-shape arrays for d, e, and b is incompatible:
    1. size(d) <> size(b,1) or
    2. size(e) <> size(b,1)
  2. The block sizes for d, e, and b are incompatible.
  3. The abstract process indices for d, e, and b are incompatible.
  4. The data distribution for d or e is unsupported.

Example 1

This example shows how to solve the system AX = B, where matrix A is the same matrix factored in "Example 1" for PTTRF.

As in "Example", array data is block distributed over 3 processes.

Notes:

  1. The vectors d and e, output from PTTRF, are stored in an internal format that depends on the number of processes. These vectors are passed, unchanged, to the solve subroutine PTTRS.

  2. The contents of vector af, output from PTTRF, are not shown. This vector is passed, unchanged, to the solve subroutine PTTRS.

!HPF$ PROCESSORS PROC(3)
!HPF$ DISTRIBUTE (BLOCK) ONTO PROC :: D, E, AF
!HPF$ DISTRIBUTE (BLOCK,*) ONTO PROC :: B
 
CALL PTTRS( D , E , B , AF )
 
      -or-
 
!HPF$ PROCESSORS PROC(3)
!HPF$ DISTRIBUTE (BLOCK) ONTO PROC :: D, E, AF
!HPF$ DISTRIBUTE (BLOCK,*) ONTO PROC :: B
!HPF$ DISTRIBUTE (CYCLIC) ONTO PROC :: INFO
 
CALL PTTRS( D , E , B , AF , INFO=INFO )

Input

Vector d of size 12:

 *       *
 |  .25  |
 |  .25  |
 |  .25  |
 | 4.0   |
 |  .20  |
 |  .24  |
 |  .25  |
 | 4.01  |
 | 4.01  |
 |  .25  |
 |  .24  |
 |  .20  |
 *       *

Vector e of size 12:

 *       *
 | 2.0   |
 | 2.0   |
 | 2.0   |
 | 2.0   |
 | 2.0   |
 | 2.0   |
 | 2.0   |
 | 2.0   |
 |  .49  |
 |  .48  |
 |  .40  |
 |   .   |
 *       *

Rectangular 12 × 3 matrix B:

 *                   *
 | 70.0    8.0   6.0 |
 | 99.0   18.0   9.0 |
 | 90.0   27.0   9.0 |
 | 81.0   36.0   9.0 |
 | 72.0   45.0   9.0 |
 | 63.0   54.0   9.0 |
 | 54.0   63.0   9.0 |
 | 45.0   72.0   9.0 |
 | 36.0   81.0   9.0 |
 | 27.0   90.0   9.0 |
 | 18.0   99.0   9.0 |
 |  9.0   82.0   7.0 |
 *                   *

Output

Rectangular 12 × 3 matrix B:

 *                   *
 | 12.0    1.0   1.0 |
 | 11.0    2.0   1.0 |
 | 10.0    3.0   1.0 |
 |  9.0    4.0   1.0 |
 |  8.0    5.0   1.0 |
 |  7.0    6.0   1.0 |
 |  6.0    7.0   1.0 |
 |  5.0    8.0   1.0 |
 |  4.0    9.0   1.0 |
 |  3.0   10.0   1.0 |
 |  2.0   11.0   1.0 |
 |  1.0   12.0   1.0 |
 *                   *

Vector info of size 3: (if info is present)

 *   *
 | 0 |
 | 0 |
 | 0 |
 *   *
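
A solution can be checked by multiplying it by the unfactored matrix. The following serial sketch forms the product of A, using the diagonal and off-diagonal values given as input in the PTTRF example, with the solution matrix X shown in the output above. The program name and the array ax are illustrative.

```fortran
! Serial sketch only; computes A*X for the tridiagonal matrix of the example.
program pttrs_check_sketch
  implicit none
  integer, parameter :: dp = kind(1.0d0)
  integer, parameter :: n = 12, nrhs = 3
  real(dp) :: d(n), e(n), x(n, nrhs), ax(n, nrhs)
  integer  :: i

  ! Unfactored A from the PTTRF example: main diagonal d, off-diagonal e.
  d = 5.0_dp
  d(1) = 4.0_dp
  e = 2.0_dp                         ! e(n) is never referenced below

  ! Solution X as shown in the output: columns 12..1, 1..12, and all ones.
  do i = 1, n
    x(i, :) = (/ real(n + 1 - i, dp), real(i, dp), 1.0_dp /)
  end do

  ! Row i of A*X uses at most three nonzero entries of A.
  do i = 1, n
    ax(i, :) = d(i) * x(i, :)
    if (i > 1) ax(i, :) = ax(i, :) + e(i - 1) * x(i - 1, :)
    if (i < n) ax(i, :) = ax(i, :) + e(i) * x(i + 1, :)
  end do

  do i = 1, n
    print '(3f8.1)', ax(i, :)
  end do
end program pttrs_check_sketch
```

The printed product can be compared row by row with the input matrix B. The first row is 70.0, 8.0, 6.0 and the last row is 9.0, 82.0, 7.0.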

