This part of the book provides reference information for coding the Parallel ESSL calling sequences in a High Performance Fortran (HPF) program. It is organized into five areas.
This chapter describes the Level 2 and 3 PBLAS subroutines that can be called from an HPF program.
The Level 2 and 3 PBLAS include a subset of the standard set of distributed memory parallel versions of the Level 2 and 3 BLAS.
Note: | These subroutines are designed to be consistent with the proposals for the Fortran 90 BLAS and the Fortran 90 LAPACK. (See references [30] and [31].) If these subroutines do not comply with any eventual proposal for HPF interfaces to the PBLAS and ScaLAPACK, IBM will consider updating them to do so. If IBM updates these subroutines, the update could require modifications of the calling application program. |
Table 110. List of Level 2 PBLAS (HPF)
Descriptive Name | Long-Precision Subprogram | Described In |
---|---|---|
Matrix-Vector Product for a General Matrix or Its Transpose | GEMM | GEMM--Matrix-Matrix Product for a General Matrix, Its Transpose, or Its Conjugate Transpose |
Matrix-Vector Product for a Real Symmetric Matrix | SYMM | SYMM--Matrix-Matrix Product Where One Matrix is Real Symmetric |
Rank-One Update of a General Matrix | GEMM | GEMM--Matrix-Matrix Product for a General Matrix, Its Transpose, or Its Conjugate Transpose |
Rank-One Update of a Real Symmetric Matrix | SYRK | SYRK--Rank-K Update of a Real Symmetric Matrix |
Rank-Two Update of a Real Symmetric Matrix | SYR2K | SYR2K--Rank-2K Update of a Real Symmetric Matrix |
Matrix-Vector Product for a Triangular Matrix or Its Transpose | TRMM | TRMM--Triangular Matrix-Matrix Product |
Solution of Triangular System of Equations with a Single Right-Hand Side | TRSM | TRSM--Solution of Triangular System of Equations with Multiple Right-Hand Sides |
Table 111. List of Level 3 PBLAS (HPF)
Descriptive Name | Long-Precision Subprogram | Described In |
---|---|---|
Matrix-Matrix Product for a General Matrix, Its Transpose, or Its Conjugate Transpose | GEMM | GEMM--Matrix-Matrix Product for a General Matrix, Its Transpose, or Its Conjugate Transpose |
Matrix-Matrix Product Where One Matrix is Real Symmetric | SYMM | SYMM--Matrix-Matrix Product Where One Matrix is Real Symmetric |
Triangular Matrix-Matrix Product | TRMM | TRMM--Triangular Matrix-Matrix Product |
Solution of Triangular System of Equations with Multiple Right-Hand Sides | TRSM | TRSM--Solution of Triangular System of Equations with Multiple Right-Hand Sides |
Rank-K Update of a Real Symmetric Matrix | SYRK | SYRK--Rank-K Update of a Real Symmetric Matrix |
Rank-2K Update of a Real Symmetric Matrix | SYR2K | SYR2K--Rank-2K Update of a Real Symmetric Matrix |
Matrix Transpose for a General Matrix | TRAN | TRAN--Matrix Transpose for a General Matrix |
This section contains the PBLAS subroutine descriptions.
This subroutine performs any one of the following combined matrix computations:
where, in the formulas above:
Note: | No data should be moved to form the matrix transposes or matrix conjugate transposes; that is, the matrices should always be stored in their untransposed forms. |
In the following cases, no computation is performed and the subroutine returns after doing some parameter checking:
Assuming the above conditions do not exist, if beta is not one and the assumed-shape arrays for A and B have a size of zero, then $\beta C$ is returned.
See references [17], [30], [31], and [44].
alpha, beta, A, B, C, a, b, c | Subroutine |
Long-precision real | GEMM |
Long-precision complex | GEMM |
HPF | Equations 1-9 | CALL GEMM (alpha, a, b, beta, c) -or- CALL GEMM (alpha, a, b, beta, c, transa, transb) |
HPF | Equations 10 and 11 | CALL GEMM (alpha, a, b, beta, c) -or- CALL GEMM (alpha, a, b, beta, c, transa) |
HPF | Equation 12 | CALL GEMM (alpha, a, b, c) |
Type: required
Specified as: a number of the data type indicated in Table 112.
If transa = 'N', A is used in the computation.
If transa = 'T', $A^T$ is used in the computation.
If transa = 'C', $A^H$ is used in the computation.
Note: | No data should be moved to form $A^T$ or $A^H$; that is, the matrix A should always be stored in its untransposed form. |
Type: required
Specified as: an assumed-shape array with shape (:,:) or (:), containing numbers of the data type indicated in Table 112.
If transb = 'N', B is used in the computation.
If transb = 'T', $B^T$ is used in the computation.
If transb = 'C', $B^H$ is used in the computation.
Note: | No data should be moved to form $B^T$ or $B^H$; that is, the matrix B should always be stored in its untransposed form. |
Type: required
Specified as: an assumed-shape array with shape (:,:) or (:), containing numbers of the data type indicated in Table 112.
Type: required (equations 1-11); not present (equation 12)
Specified as: a number of the data type indicated in Table 112.
Type: required
Specified as: an assumed-shape array with shape (:,:) or (:), containing numbers of the data type indicated in Table 112.
If transa = 'N', A is used in the computation, resulting in equation 1, 2, 7, or 10.
If transa = 'T', $A^T$ is used in the computation, resulting in equation 3, 4, 8, or 11.
If transa = 'C', $A^H$ is used in the computation, resulting in equation 5, 6, or 9.
Type: optional (equations 1-11); not present (equation 12)
Default: transa = 'N'
Specified as: a single character; transa = 'N', 'T', or 'C'.
If transb = 'N', B is used in the computation, resulting in equation 1, 3, or 5.
If transb = 'T', $B^T$ is used in the computation, resulting in equation 2, 4, or 6.
If transb = 'C', $B^H$ is used in the computation, resulting in equation 7, 8, or 9.
Type: optional (equations 1-9); not present (equations 10-12)
Default: transb = 'N'
Specified as: a single character; transb = 'N', 'T', or 'C'.
Type: required
Returned as: an assumed-shape array with shape (:,:) or (:), containing numbers of the data type indicated in Table 112.
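The way transa and transb select the operands can be sketched with a small serial reference model. This is only an illustration in Python, not Parallel ESSL code: the names `op` and `gemm_ref` are hypothetical, and the form $C \leftarrow \alpha\,op(A)\,op(B) + \beta C$ is assumed from the argument descriptions above. The inputs are never modified, which mirrors the requirement that matrices stay stored in untransposed form.

```python
def op(m, trans):
    """Return a copy of m, its transpose, or its conjugate transpose."""
    if trans == 'N':
        return [row[:] for row in m]
    if trans == 'T':
        return [list(col) for col in zip(*m)]
    if trans == 'C':
        return [[x.conjugate() for x in col] for col in zip(*m)]
    raise ValueError("trans must be 'N', 'T', or 'C'")


def gemm_ref(alpha, a, b, beta, c, transa='N', transb='N'):
    """Serial model of C <- alpha*op(A)*op(B) + beta*C; returns a new matrix."""
    oa, ob = op(a, transa), op(b, transb)
    if len(oa[0]) != len(ob):
        raise ValueError("inner dimensions of op(A) and op(B) do not match")
    return [[alpha * sum(oa[i][k] * ob[k][j] for k in range(len(ob)))
             + beta * c[i][j]
             for j in range(len(ob[0]))]
            for i in range(len(oa))]
```

Omitting transa and transb in this sketch corresponds to the documented defaults of 'N'.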
HPF-specific errors are listed below. Resource and input-argument errors listed in "Error Conditions", "Error Conditions", and "Error Conditions" also apply to this subroutine.
The process grid is not the same for a, b, and c.
The data distribution is inconsistent for a, b, and c.
The shape of the assumed-shape arrays a, b, and c is incompatible:
The data distribution for a, b, or c is unsupported.
transa is present, and transa <>'N', 'T', or 'C'.
The process grid is not the same for a, b, and c.
The data distribution for a, b, or c is unsupported.
The process grid is not the same for a, b, and c.
The data distribution for c is unsupported.
The vector for a or b is replicated.
The data distribution is inconsistent for a and b.
The data distribution for a, b, or c is unsupported.
This example computes $C = \alpha AB + \beta C$. As in "Example 1", array data is block-cyclically distributed using a 2 × 2 process grid.
!HPF$ PROCESSORS PROC(2,2)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: A, B, C
CALL GEMM( 1.0D0 , A , B , 2.0D0 , C )
-or-
CALL GEMM( 1.0D0 , A , B , 2.0D0 , C , TRANSA='N' , TRANSB='N' )
General 6 × 5 matrix A:
*                                *
|   1.0   2.0  -1.0  -1.0   4.0  |
|   2.0   0.0   1.0   1.0  -1.0  |
|   1.0  -1.0  -1.0   1.0   2.0  |
|  -3.0   2.0   2.0   2.0   0.0  |
|   4.0   0.0  -2.0   1.0  -1.0  |
|  -1.0  -1.0   1.0  -3.0   2.0  |
*                                *
General 5 × 4 matrix B:
*                          *
|   1.0  -1.0   0.0   2.0  |
|   2.0   2.0  -1.0  -2.0  |
|   1.0   0.0  -1.0   1.0  |
|  -3.0  -1.0   1.0  -1.0  |
|   4.0   2.0  -1.0   1.0  |
*                          *
General 6 × 4 matrix C (input):
*                          *
|   0.5   0.5   0.5   0.5  |
|   0.5   0.5   0.5   0.5  |
|   0.5   0.5   0.5   0.5  |
|   0.5   0.5   0.5   0.5  |
|   0.5   0.5   0.5   0.5  |
|   0.5   0.5   0.5   0.5  |
*                          *
General 6 × 4 matrix C (output):
*                          *
|  24.0  13.0  -5.0   3.0  |
|  -3.0  -4.0   2.0   4.0  |
|   4.0   1.0   2.0   5.0  |
|  -2.0   6.0  -1.0  -9.0  |
|  -4.0  -6.0   5.0   5.0  |
|  16.0   7.0  -4.0   7.0  |
*                          *
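The output of this example can be checked on a single process with a few lines of Python. This is a serial sketch only; the helper `matmul` is a hypothetical name, and no distribution or HPF directive is modeled.

```python
def matmul(a, b):
    """Plain triple-loop product of two row-major lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


A = [[ 1.0,  2.0, -1.0, -1.0,  4.0],
     [ 2.0,  0.0,  1.0,  1.0, -1.0],
     [ 1.0, -1.0, -1.0,  1.0,  2.0],
     [-3.0,  2.0,  2.0,  2.0,  0.0],
     [ 4.0,  0.0, -2.0,  1.0, -1.0],
     [-1.0, -1.0,  1.0, -3.0,  2.0]]
B = [[ 1.0, -1.0,  0.0,  2.0],
     [ 2.0,  2.0, -1.0, -2.0],
     [ 1.0,  0.0, -1.0,  1.0],
     [-3.0, -1.0,  1.0, -1.0],
     [ 4.0,  2.0, -1.0,  1.0]]
C_in = [[0.5] * 4 for _ in range(6)]
alpha, beta = 1.0, 2.0

# C <- alpha*A*B + beta*C, computed element by element.
AB = matmul(A, B)
C_out = [[alpha * AB[i][j] + beta * C_in[i][j] for j in range(4)]
         for i in range(6)]
```

Because beta times every input element of C is 1.0, each output element is the corresponding element of AB plus one.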
This example computes $C = \alpha AB + \beta C$ for complex data. As in "Example 2", array data is block-cyclically distributed using a 2 × 2 process grid.
!HPF$ PROCESSORS PROC(2,2)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: A, B, C
CALL GEMM( (1.0D0,0.0D0) , A , B , (2.0D0,0.0D0) , C )
-or-
CALL GEMM( (1.0D0,0.0D0) , A , B , (2.0D0,0.0D0) , C , TRANSA='N' , TRANSB='N' )
General 6 × 3 matrix A:
*                                     *
|  (1.0,5.0)  (9.0,2.0)  (1.0,9.0)  |
|  (2.0,4.0)  (8.0,3.0)  (1.0,8.0)  |
|  (3.0,3.0)  (7.0,5.0)  (1.0,7.0)  |
|  (4.0,2.0)  (4.0,7.0)  (1.0,5.0)  |
|  (5.0,1.0)  (5.0,1.0)  (1.0,6.0)  |
|  (6.0,6.0)  (3.0,6.0)  (1.0,4.0)  |
*                                     *
General 3 × 2 matrix B:
*                          *
|  (1.0,8.0)  (2.0,7.0)  |
|  (4.0,4.0)  (6.0,8.0)  |
|  (6.0,2.0)  (4.0,5.0)  |
*                          *
General 6 × 2 matrix C (input):
*                          *
|  (0.5,0.0)  (0.5,0.0)  |
|  (0.5,0.0)  (0.5,0.0)  |
|  (0.5,0.0)  (0.5,0.0)  |
|  (0.5,0.0)  (0.5,0.0)  |
|  (0.5,0.0)  (0.5,0.0)  |
|  (0.5,0.0)  (0.5,0.0)  |
*                          *
General 6 × 2 matrix C (output):
*                                  *
|  (-22.0,113.0)  (-35.0,142.0)  |
|  (-19.0,114.0)  (-35.0,141.0)  |
|  (-20.0,119.0)  (-43.0,146.0)  |
|  (-27.0,110.0)  (-58.0,131.0)  |
|    (8.0,103.0)    (0.0,112.0)  |
|  (-55.0,116.0)  (-75.0,135.0)  |
*                                  *
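The complex result can be reproduced serially with Python's built-in complex type. This is an illustrative check only; the variable names are not part of Parallel ESSL.

```python
A = [[1 + 5j, 9 + 2j, 1 + 9j],
     [2 + 4j, 8 + 3j, 1 + 8j],
     [3 + 3j, 7 + 5j, 1 + 7j],
     [4 + 2j, 4 + 7j, 1 + 5j],
     [5 + 1j, 5 + 1j, 1 + 6j],
     [6 + 6j, 3 + 6j, 1 + 4j]]
B = [[1 + 8j, 2 + 7j],
     [4 + 4j, 6 + 8j],
     [6 + 2j, 4 + 5j]]
C_in = [[0.5 + 0j] * 2 for _ in range(6)]
alpha, beta = 1 + 0j, 2 + 0j

# C <- alpha*A*B + beta*C with complex arithmetic throughout.
C_out = [[alpha * sum(A[i][k] * B[k][j] for k in range(3)) + beta * C_in[i][j]
          for j in range(2)]
         for i in range(6)]
```

Every entry here is a small integer-valued complex number, so the comparison with the listed output is exact.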
This example computes $c = \alpha Ab + \beta c$. The input matrices A, B, and C used here are the same as the matrices used in "Example 1". The updated portion of C is also the same, because this computation is equivalent to a portion of the computation in "Example 1".
Array sections are specified for arguments a, b, and c, resulting in the computation using a submatrix A starting at row 3 and column 1 in an array, a column vector b starting at row 1 and column 2 in an array, and a column vector c, starting at row 3 and column 2 in an array.
As in "Example 1", array data is block-cyclically distributed using a 2 × 2 process grid.
!HPF$ PROCESSORS PROC(2,2)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: A, B, C
CALL GEMM( 1.0D0 , A(3:6,1:5) , B(1:5,2:2) , 2.0D0 , C(3:6,2:2) )
-or-
CALL GEMM( 1.0D0 , A(3:6,1:5) , B(1:5,2:2) , 2.0D0 , C(3:6,2:2) , TRANSA='N' )
Only a portion of the data structure is used--that is, submatrix A. Following is the 4 × 5 submatrix A, starting at row 3 and column 1 in the 6 × 5 array:
*                                *
|     .     .     .     .     .  |
|     .     .     .     .     .  |
|   1.0  -1.0  -1.0   1.0   2.0  |
|  -3.0   2.0   2.0   2.0   0.0  |
|   4.0   0.0  -2.0   1.0  -1.0  |
|  -1.0  -1.0   1.0  -3.0   2.0  |
*                                *
Only a portion of the data structure is used--that is, vector b, which is a column vector. Following is the vector b of size 5, starting at row 1 and column 2 in the 5 × 4 array:
*                          *
|     .  -1.0     .     .  |
|     .   2.0     .     .  |
|     .   0.0     .     .  |
|     .  -1.0     .     .  |
|     .   2.0     .     .  |
*                          *
Only a portion of the data structure is used--that is, vector c, which is a column vector. Following is the input vector c of size 4, starting at row 3 and column 2 in the 6 × 4 array:
*                          *
|     .     .     .     .  |
|     .     .     .     .  |
|     .   0.5     .     .  |
|     .   0.5     .     .  |
|     .   0.5     .     .  |
|     .   0.5     .     .  |
*                          *
Only a portion of the data structure is used--that is, vector c, which is a column vector. Following is the output vector c of size 4, starting at row 3 and column 2 in the 6 × 4 array:
*                          *
|     .     .     .     .  |
|     .     .     .     .  |
|     .   1.0     .     .  |
|     .   6.0     .     .  |
|     .  -6.0     .     .  |
|     .   7.0     .     .  |
*                          *
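The array sections in this call map directly onto slices. The following serial Python sketch (hypothetical variable names, no distribution modeled) shows the same selection with 0-based, end-exclusive indices and confirms the updated elements.

```python
A = [[ 1.0,  2.0, -1.0, -1.0,  4.0],
     [ 2.0,  0.0,  1.0,  1.0, -1.0],
     [ 1.0, -1.0, -1.0,  1.0,  2.0],
     [-3.0,  2.0,  2.0,  2.0,  0.0],
     [ 4.0,  0.0, -2.0,  1.0, -1.0],
     [-1.0, -1.0,  1.0, -3.0,  2.0]]
B = [[ 1.0, -1.0,  0.0,  2.0],
     [ 2.0,  2.0, -1.0, -2.0],
     [ 1.0,  0.0, -1.0,  1.0],
     [-3.0, -1.0,  1.0, -1.0],
     [ 4.0,  2.0, -1.0,  1.0]]
C = [[0.5] * 4 for _ in range(6)]
alpha, beta = 1.0, 2.0

# Fortran A(3:6,1:5) -> rows 2..5 and columns 0..4 in 0-based Python.
a_sub = [row[0:5] for row in A[2:6]]
# B(1:5,2:2) is column 2 of B; C(3:6,2:2) is rows 3..6 of column 2 of C.
b_vec = [row[1] for row in B[0:5]]
c_vec = [row[1] for row in C[2:6]]

# c <- alpha*A*b + beta*c on the selected sections only.
c_out = [alpha * sum(x * y for x, y in zip(a_row, b_vec)) + beta * c_i
         for a_row, c_i in zip(a_sub, c_vec)]
```

The four values in `c_out` equal rows 3 through 6 of column 2 in the full matrix-matrix result, which is why the text describes this call as a portion of that computation.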
This example computes $c = \alpha Ab + \beta c$. The input matrices A, B, and C used here are the same as A, B, and C used in "Example 1".
Array sections are specified for arguments a, b, and c, resulting in the computation using a submatrix A starting at row 2 and column 2 in an array, a row vector b starting at row 4 and column 2 in an array, and a column vector c starting at row 2 and column 3 in an array.
As in "Example 2", array data is block-cyclically distributed using a 2 × 2 process grid.
!HPF$ PROCESSORS PROC(2,2)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: A, B, C
CALL GEMM( 1.0D0 , A(2:5,2:4) , B(4:4,2:4) , 2.0D0 , C(2:5,3:3) )
-or-
CALL GEMM( 1.0D0 , A(2:5,2:4) , B(4:4,2:4) , 2.0D0 , C(2:5,3:3) , TRANSA='N' )
Only a portion of the data structure is used--that is, submatrix A. Following is the 4 × 3 submatrix A, starting at row 2 and column 2 in the 6 × 5 array:
*                                *
|     .     .     .     .     .  |
|     .   0.0   1.0   1.0     .  |
|     .  -1.0  -1.0   1.0     .  |
|     .   2.0   2.0   2.0     .  |
|     .   0.0  -2.0   1.0     .  |
|     .     .     .     .     .  |
*                                *
Only a portion of the data structure is used--that is, vector b, which is a row vector. Following is the vector b of size 3, starting at row 4 and column 2 in the 5 × 4 array:
*                          *
|     .     .     .     .  |
|     .     .     .     .  |
|     .     .     .     .  |
|     .  -1.0   1.0  -1.0  |
|     .     .     .     .  |
*                          *
Only a portion of the data structure is used--that is, vector c, which is a column vector. Following is the input vector c of size 4, starting at row 2 and column 3 in the 6 × 4 array:
*                          *
|     .     .     .     .  |
|     .     .   0.5     .  |
|     .     .   0.5     .  |
|     .     .   0.5     .  |
|     .     .   0.5     .  |
|     .     .     .     .  |
*                          *
Only a portion of the data structure is used--that is, vector c, which is a column vector. Following is the output vector c of size 4, starting at row 2 and column 3 in the 6 × 4 array:
*                          *
|     .     .     .     .  |
|     .     .   1.0     .  |
|     .     .   0.0     .  |
|     .     .  -1.0     .  |
|     .     .  -2.0     .  |
|     .     .     .     .  |
*                          *
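Here the vector b is taken from a row of B rather than a column. A serial Python sketch of the same section selection (hypothetical variable names, 0-based indices) is:

```python
A = [[ 1.0,  2.0, -1.0, -1.0,  4.0],
     [ 2.0,  0.0,  1.0,  1.0, -1.0],
     [ 1.0, -1.0, -1.0,  1.0,  2.0],
     [-3.0,  2.0,  2.0,  2.0,  0.0],
     [ 4.0,  0.0, -2.0,  1.0, -1.0],
     [-1.0, -1.0,  1.0, -3.0,  2.0]]
B = [[ 1.0, -1.0,  0.0,  2.0],
     [ 2.0,  2.0, -1.0, -2.0],
     [ 1.0,  0.0, -1.0,  1.0],
     [-3.0, -1.0,  1.0, -1.0],
     [ 4.0,  2.0, -1.0,  1.0]]
C = [[0.5] * 4 for _ in range(6)]
alpha, beta = 1.0, 2.0

# Fortran A(2:5,2:4) -> rows 1..4 and columns 1..3 in 0-based Python.
a_sub = [row[1:4] for row in A[1:5]]
# B(4:4,2:4) is a row section: row 4, columns 2..4.
b_vec = B[3][1:4]
# C(2:5,3:3) is a column section: rows 2..5 of column 3.
c_vec = [row[2] for row in C[1:5]]

# c <- alpha*A*b + beta*c; b is used as a vector regardless of its layout.
c_out = [alpha * sum(x * y for x, y in zip(a_row, b_vec)) + beta * c_i
         for a_row, c_i in zip(a_sub, c_vec)]
```

In this serial model the row or column origin of b changes only how the section is extracted, not the arithmetic.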
This example computes $C = \alpha ab^T + C$.
Array sections are specified for arguments a, b, and c, resulting in the computation using a submatrix C starting at row 2 and column 2 in an array, a column vector a, starting at element 2 in an array, and a row vector b starting at element 2 in an array.
As in "Example", array data is block-cyclically distributed using a 2 × 2 process grid.
!HPF$ PROCESSORS PROC(2,2)
!HPF$ ALIGN A(:) WITH C(:,1)
!HPF$ ALIGN B(:) WITH C(1,:)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: C
CALL GEMM( 1.0D0 , A(2:10) , B(2:10) , C(2:10,2:10) )
Only a portion of the data structure is used--that is, submatrix C. Following is the input 9 × 9 submatrix C, starting at row 2 and column 2 in the 10 × 10 array:
*                                                              *
|     .     .     .     .     .     .     .     .     .     .  |
|     .  12.0  22.0  32.0  42.0  52.0  62.0  72.0  82.0  92.0  |
|     .  13.0  23.0  33.0  43.0  53.0  63.0  73.0  83.0  93.0  |
|     .  14.0  24.0  34.0  44.0  54.0  64.0  74.0  84.0  94.0  |
|     .  15.0  25.0  35.0  45.0  55.0  65.0  75.0  85.0  95.0  |
|     .  16.0  26.0  36.0  46.0  56.0  66.0  76.0  86.0  96.0  |
|     .  17.0  27.0  37.0  47.0  57.0  67.0  77.0  87.0  97.0  |
|     .  18.0  28.0  38.0  48.0  58.0  68.0  78.0  88.0  98.0  |
|     .  19.0  29.0  39.0  49.0  59.0  69.0  79.0  89.0  99.0  |
|     .  20.0  30.0  40.0  50.0  60.0  70.0  80.0  90.0 100.0  |
*                                                              *
Only a portion of the data structure is used--that is, vector a, which is a column vector. Following is the vector a of size 9, starting at element 2 in the array of size 11:
*        *
|     .  |
|   1.0  |
|   1.0  |
|   1.0  |
|   1.0  |
|   1.0  |
|   1.0  |
|   1.0  |
|   1.0  |
|   1.0  |
|     .  |
*        *
Only a portion of the data structure is used--that is, vector b, which is a row vector. Following is the vector b of size 9, starting at element 2 in the array of size 11:
*                                                         *
|    .  2.0  3.0  4.0  5.0  6.0  7.0  8.0  9.0 10.0    .  |
*                                                         *
Only a portion of the data structure is used--that is, submatrix C. Following is the output 9 × 9 submatrix C, starting at row 2 and column 2 in the 10 × 10 array:
*                                                              *
|     .     .     .     .     .     .     .     .     .     .  |
|     .  14.0  25.0  36.0  47.0  58.0  69.0  80.0  91.0 102.0  |
|     .  15.0  26.0  37.0  48.0  59.0  70.0  81.0  92.0 103.0  |
|     .  16.0  27.0  38.0  49.0  60.0  71.0  82.0  93.0 104.0  |
|     .  17.0  28.0  39.0  50.0  61.0  72.0  83.0  94.0 105.0  |
|     .  18.0  29.0  40.0  51.0  62.0  73.0  84.0  95.0 106.0  |
|     .  19.0  30.0  41.0  52.0  63.0  74.0  85.0  96.0 107.0  |
|     .  20.0  31.0  42.0  53.0  64.0  75.0  86.0  97.0 108.0  |
|     .  21.0  32.0  43.0  54.0  65.0  76.0  87.0  98.0 109.0  |
|     .  22.0  33.0  44.0  55.0  66.0  77.0  88.0  99.0 110.0  |
*                                                              *
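The rank-one update can be checked with a short serial Python sketch. The listed input values of C follow the pattern C(i,j) = 10(j-1) + i, which the sketch uses to build the array; the first row and column, shown as dots above, also receive values from that formula here, but they are never referenced or changed. All names are illustrative.

```python
n = 10
# 0-based (i, j) corresponds to Fortran C(i+1, j+1) = 10*j + i + 1.
C = [[float(10 * j + i + 1) for j in range(n)] for i in range(n)]
a = [0.0] + [1.0] * 9 + [0.0]                      # array of size 11
b = [0.0] + [float(v) for v in range(2, 11)] + [0.0]
alpha = 1.0

a_sec = a[1:10]   # Fortran A(2:10)
b_sec = b[1:10]   # Fortran B(2:10)

# C(2:10,2:10) <- alpha * a * b^T + C(2:10,2:10)
for i in range(9):
    for j in range(9):
        C[i + 1][j + 1] += alpha * a_sec[i] * b_sec[j]
```

Because every element of a is 1.0, each column j of the submatrix simply grows by b(j).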
This subroutine computes one of the following matrix-matrix products:
where, in the formulas above:
In the following two cases, no computation is performed and the subroutine returns after doing some parameter checking:
See references [17], [30], [31], and [44].
alpha, beta, A, B, C, b, c | Subprogram |
Long-precision real | SYMM |
HPF | Equations 1 and 2 | CALL SYMM (alpha, a, b, beta, c, uplo, side) |
HPF | Equation 3 | CALL SYMM (alpha, a, b, beta, c, uplo) |
Type: required
Specified as: a number of the data type indicated in Table 113.
If uplo = 'U', the array contains the upper triangle of the symmetric matrix A in its upper triangle, and its strictly lower triangular part is not referenced.
If uplo = 'L', the array contains the lower triangle of the symmetric matrix A in its lower triangle, and its strictly upper triangular part is not referenced.
Type: required
Specified as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 113, where size(a,1) = size(a,2).
Type: required
Specified as: an assumed-shape array with shape (:,:) or (:), containing numbers of the data type indicated in Table 113.
Type: required
Specified as: a number of the data type indicated in Table 113.
Type: required
Specified as: an assumed-shape array with shape (:,:) or (:), containing numbers of the data type indicated in Table 113.
If uplo = 'U', the upper triangular part is referenced.
If uplo = 'L', the lower triangular part is referenced.
Type: required
Specified as: a single character; uplo = 'U' or 'L'.
If side = 'L', A is to the left of B, resulting in equation 1.
If side = 'R', A is to the right of B, resulting in equation 2.
Type: required (equations 1 and 2); not present (equation 3)
Specified as: a single character; side = 'L' or 'R'.
Type: required
Returned as: an assumed-shape array with shape (:,:) or (:), containing numbers of the data type indicated in Table 113.
HPF-specific errors are listed below. Resource and input-argument errors listed in "Error Conditions" and "Error Conditions" also apply to this subroutine.
The process grid is not the same for a, b, and c.
The data distribution is inconsistent for a, b, and c.
The data distribution for a, b, or c is unsupported.
The process grid is not the same for a, b, and c.
The data distribution for a, b, or c is unsupported.
This example computes $C = \alpha BA + \beta C$. Because beta = 0, C need not be set on input. As in "Example", array data is block-cyclically distributed using a 2 × 2 process grid.
!HPF$ PROCESSORS PROC(2,2)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: A, B, C
CALL SYMM( 1.0D0, A, B, 0.0D0, C, 'U', 'R' )
Symmetric matrix A of order 8:
*                                                  *
|   0.0  -1.0  -1.0   0.0   0.0   0.0   0.0   0.0  |
|     .   1.0   0.0   1.0   0.0   1.0   0.0   1.0  |
|     .     .  -1.0  -1.0   0.0   0.0   1.0   0.0  |
|     .     .     .  -1.0   1.0   1.0   0.0   1.0  |
|     .     .     .     .  -1.0   0.0   0.0   0.0  |
|     .     .     .     .     .   1.0   0.0   0.0  |
|     .     .     .     .     .     .   0.0   0.0  |
|     .     .     .     .     .     .     .   0.0  |
*                                                  *
General 16 × 8 matrix B:
*                                                  *
|  -1.0   0.0   1.0  -1.0   1.0   1.0  -1.0  -1.0  |
|  -1.0  -1.0   1.0   0.0   1.0  -1.0  -1.0   1.0  |
|   1.0   1.0  -1.0   0.0  -1.0   0.0   1.0   0.0  |
|   0.0  -1.0   0.0   0.0   0.0   0.0   0.0  -1.0  |
|   0.0   1.0   0.0   1.0   0.0   1.0   1.0   0.0  |
|   0.0   0.0   1.0   0.0  -1.0  -1.0   0.0   0.0  |
|   1.0   1.0   0.0   0.0   1.0   1.0   0.0  -1.0  |
|   0.0   0.0  -1.0   0.0   0.0   1.0   0.0   1.0  |
|   0.0   0.0   0.0  -1.0   1.0   1.0   0.0   1.0  |
|  -1.0  -1.0   1.0   0.0   0.0  -1.0   0.0   1.0  |
|   0.0   0.0   0.0   1.0   1.0   0.0   0.0   0.0  |
|   0.0   0.0   1.0   1.0   0.0  -1.0   0.0   0.0  |
|   1.0   1.0  -1.0   0.0  -1.0  -1.0   1.0   1.0  |
|   0.0   0.0   0.0   0.0   1.0   0.0   0.0  -1.0  |
|   0.0   1.0   0.0   0.0   0.0   0.0   0.0   0.0  |
|  -1.0   0.0  -1.0   0.0   0.0   1.0   1.0   0.0  |
*                                                  *
General 16 × 8 matrix C (output):
*                                                  *
|  -1.0   0.0   0.0   1.0  -2.0   0.0   1.0  -1.0  |
|   0.0   0.0  -1.0  -1.0  -1.0  -2.0   1.0  -1.0  |
|   0.0   0.0   1.0   1.0   1.0   1.0  -1.0   1.0  |
|   1.0  -2.0   0.0  -2.0   0.0  -1.0   0.0  -1.0  |
|  -1.0   3.0   0.0   1.0   1.0   3.0   0.0   2.0  |
|  -1.0  -1.0  -1.0  -3.0   1.0  -1.0   1.0   0.0  |
|  -1.0   0.0  -1.0   2.0  -1.0   2.0   0.0   1.0  |
|   1.0   2.0   1.0   3.0   0.0   1.0  -1.0   0.0  |
|   0.0   1.0   1.0   4.0  -2.0   0.0   0.0  -1.0  |
|   0.0   0.0   0.0  -2.0   0.0  -2.0   1.0  -1.0  |
|   0.0   1.0  -1.0   0.0   0.0   1.0   0.0   1.0  |
|  -1.0   0.0  -2.0  -3.0   1.0   0.0   1.0   1.0  |
|   0.0   0.0   1.0   1.0   1.0   0.0  -1.0   1.0  |
|   0.0  -1.0   0.0   0.0  -1.0   0.0   0.0   0.0  |
|  -1.0   1.0   0.0   1.0   0.0   1.0   0.0   1.0  |
|   1.0   2.0   3.0   2.0   0.0   1.0  -1.0   0.0  |
*                                                  *
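To see how uplo = 'U' and side = 'R' interact, the following serial Python sketch rebuilds the full symmetric matrix from its upper triangle and multiplies it on the right of B. Only the first two rows of B are used to keep the sketch short. The helper `sym_from_upper` is a hypothetical name, and the `None` entries stand for the strictly lower triangle that is never read.

```python
N = None  # placeholder for entries that must not be referenced

U = [[0.0, -1.0, -1.0,  0.0,  0.0, 0.0, 0.0, 0.0],
     [N,    1.0,  0.0,  1.0,  0.0, 1.0, 0.0, 1.0],
     [N,    N,   -1.0, -1.0,  0.0, 0.0, 1.0, 0.0],
     [N,    N,    N,   -1.0,  1.0, 1.0, 0.0, 1.0],
     [N,    N,    N,    N,   -1.0, 0.0, 0.0, 0.0],
     [N,    N,    N,    N,    N,   1.0, 0.0, 0.0],
     [N,    N,    N,    N,    N,   N,   0.0, 0.0],
     [N,    N,    N,    N,    N,   N,   N,   0.0]]


def sym_from_upper(u):
    """Mirror the upper triangle; only u[i][j] with i <= j is read."""
    n = len(u)
    return [[u[min(i, j)][max(i, j)] for j in range(n)] for i in range(n)]


A = sym_from_upper(U)
B_rows = [[-1.0,  0.0, 1.0, -1.0, 1.0,  1.0, -1.0, -1.0],
          [-1.0, -1.0, 1.0,  0.0, 1.0, -1.0, -1.0,  1.0]]
alpha = 1.0

# side = 'R': C <- alpha * B * A (beta = 0, so the input C is not needed).
C_rows = [[alpha * sum(b_row[k] * A[k][j] for k in range(8)) for j in range(8)]
          for b_row in B_rows]
```

The two computed rows match the first two rows of the output matrix C listed above.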
This example computes $c = \alpha Ab + \beta c$. As in "Example", array data is block-cyclically distributed using a 2 × 2 process grid.
!HPF$ PROCESSORS PROC(2,2)
!HPF$ ALIGN B(:) WITH A(:,1)
!HPF$ ALIGN C(:) WITH A(:,1)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: A
CALL SYMM( 1.0D0, A, B, 0.0D0, C, 'U' )
Symmetric matrix A of order 8:
*                                                  *
|   0.0  -1.0  -1.0   0.0   0.0   0.0   0.0   0.0  |
|     .   1.0   0.0   1.0   0.0   1.0   0.0   1.0  |
|     .     .  -1.0  -1.0   0.0   0.0   1.0   0.0  |
|     .     .     .  -1.0   1.0   1.0   0.0   1.0  |
|     .     .     .     .  -1.0   0.0   0.0   0.0  |
|     .     .     .     .     .   1.0   0.0   0.0  |
|     .     .     .     .     .     .   0.0   0.0  |
|     .     .     .     .     .     .     .   0.0  |
*                                                  *
Vector b of size 8:
*        *
|   1.0  |
|   1.0  |
|   1.0  |
|   1.0  |
|   1.0  |
|   1.0  |
|   1.0  |
|   1.0  |
*        *
Vector c of size 8 (output):
*        *
|  -2.0  |
|   3.0  |
|  -2.0  |
|   2.0  |
|   0.0  |
|   3.0  |
|   1.0  |
|   2.0  |
*        *
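Because b is all ones, each element of c is a row sum of the full symmetric matrix. A serial Python sketch (illustrative names only) that reads just the upper triangle confirms this:

```python
# Upper triangle of A, one list per row, starting at the diagonal element.
upper = [[0.0, -1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
         [1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0],
         [-1.0, -1.0, 0.0, 0.0, 1.0, 0.0],
         [-1.0, 1.0, 1.0, 0.0, 1.0],
         [-1.0, 0.0, 0.0, 0.0],
         [1.0, 0.0, 0.0],
         [0.0, 0.0],
         [0.0]]
n = len(upper)


def a_elem(i, j):
    """A(i,j) of the symmetric matrix, fetched from the stored upper triangle."""
    lo, hi = min(i, j), max(i, j)
    return upper[lo][hi - lo]


b = [1.0] * n
alpha = 1.0

# c <- alpha * A * b (beta = 0, so the input c is not needed).
c = [alpha * sum(a_elem(i, k) * b[k] for k in range(n)) for i in range(n)]
```

Storing each row from its diagonal onward makes it explicit that nothing below the diagonal is ever supplied.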
This subroutine computes one of the following matrix-matrix products:
where, in the formulas above:
Note: | No data should be moved to form the matrix transpose; that is, the matrix should always be stored in its untransposed form. |
If any of the assumed-shape arrays have a size of zero, no computation is performed, and the subroutine returns after doing some parameter checking.
See references [17], [30], [31], and [44].
alpha, A, B, b | Subprogram |
Long-precision real | TRMM |
HPF | Equations 1-4 | CALL TRMM (alpha, a, b, uplo, side) -or- CALL TRMM (alpha, a, b, uplo, side, transa, diag) |
HPF | Equations 5 and 6 | CALL TRMM (a, b, uplo) -or- CALL TRMM (a, b, uplo, transa, diag) |
Type: required (equations 1-4); not present (equations 5 and 6)
Specified as: a number of the data type indicated in Table 114.
If uplo = 'U', the array contains the upper triangle of the triangular matrix A in its upper triangle, and its strictly lower triangular part is not referenced.
If uplo = 'L', the array contains the lower triangle of the triangular matrix A in its lower triangle, and its strictly upper triangular part is not referenced.
Type: required
Specified as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 114, where size(a,1) = size(a,2).
Type: required
Specified as: an assumed-shape array with shape (:,:) or (:), containing numbers of the data type indicated in Table 114.
If uplo = 'U', the upper triangular part is referenced.
If uplo = 'L', the lower triangular part is referenced.
Type: required
Specified as: a single character; uplo = 'U' or 'L'.
If side = 'L', A is to the left of B, resulting in equation 1 or 2.
If side = 'R', A is to the right of B, resulting in equation 3 or 4.
Type: required (equations 1-4); not present (equations 5 and 6)
Specified as: a single character; side = 'L' or 'R'.
If transa = 'N', A is used in the computation, resulting in equation 1, 3, or 5.
If transa = 'T', $A^T$ is used in the computation, resulting in equation 2, 4, or 6.
Type: optional
Default: transa = 'N'
Specified as: a single character; transa = 'N' or 'T'.
If diag = 'U', A is a unit triangular matrix.
If diag = 'N', A is not a unit triangular matrix.
Type: optional
Default: diag = 'N'
Specified as: a single character; diag = 'U' or 'N'.
Type: required
Returned as: an assumed-shape array with shape (:,:) or (:), containing numbers of the data type indicated in Table 114.
HPF-specific errors are listed below. Resource and input-argument errors listed in "Error Conditions" and "Error Conditions" also apply to this subroutine.
The process grid is not the same for a and b.
The data distribution is inconsistent for a and b.
The data distribution for a or b is unsupported.
The process grid is not the same for a and b.
The data distribution is inconsistent for a and b.
The shape of the assumed-shape array for a is invalid: size(a,1) <> size(a,2)
The data distribution for a or b is unsupported.
The shape of the assumed-shape arrays a and b is incompatible: size(a,1) <> size(b)
This example computes $B = \alpha AB$. As in "Example", array data is block-cyclically distributed using a 2 × 2 process grid.
!HPF$ PROCESSORS PROC(2,2)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: A, B
CALL TRMM( 1.0D0 , A , B , 'U' , 'L' )
-or-
CALL TRMM( 1.0D0 , A , B , 'U' , 'L' , TRANSA='N' , DIAG='N' )
Triangular matrix A of order 5 is upper triangular:
*                                *
|   3.0  -1.0   2.0   2.0   1.0  |
|     .  -2.0   4.0  -1.0   3.0  |
|     .     .  -3.0   0.0   2.0  |
|     .     .     .   4.0  -2.0  |
|     .     .     .     .   1.0  |
*                                *
Rectangular 5 × 3 matrix B (input):
*                    *
|   2.0   3.0   1.0  |
|   5.0   5.0   4.0  |
|   0.0   1.0   2.0  |
|   3.0   1.0  -3.0  |
|  -1.0   2.0   1.0  |
*                    *
Rectangular 5 × 3 matrix B (output):
*                    *
|   6.0  10.0  -2.0  |
| -16.0  -1.0   6.0  |
|  -2.0   1.0  -4.0  |
|  14.0   0.0 -14.0  |
|  -1.0   2.0   1.0  |
*                    *
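A serial Python sketch of the upper-triangular, left-side product reproduces this output. The function name `trmm_upper_left` is hypothetical; the `None` entries mark the strictly lower triangle, which the loop never touches.

```python
N = None  # strictly lower triangle: not referenced when uplo = 'U'

A = [[3.0, -1.0,  2.0,  2.0,  1.0],
     [N,   -2.0,  4.0, -1.0,  3.0],
     [N,    N,   -3.0,  0.0,  2.0],
     [N,    N,    N,    4.0, -2.0],
     [N,    N,    N,    N,    1.0]]
B = [[ 2.0, 3.0,  1.0],
     [ 5.0, 5.0,  4.0],
     [ 0.0, 1.0,  2.0],
     [ 3.0, 1.0, -3.0],
     [-1.0, 2.0,  1.0]]


def trmm_upper_left(alpha, a, b):
    """Return alpha*A*B for upper-triangular A, reading only a[i][k] with k >= i."""
    n, m = len(a), len(b[0])
    return [[alpha * sum(a[i][k] * b[k][j] for k in range(i, n))
             for j in range(m)]
            for i in range(n)]


B_out = trmm_upper_left(1.0, A, B)
```

Unlike the subroutine, which overwrites b, this sketch returns a new matrix so that the input remains available for comparison.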
This example computes $b = Ab$, where A is not a unit triangular matrix, and b is a column vector.
Array sections are specified for arguments a and b, resulting in the computation using a submatrix A starting at row 2 and column 2 in an array and a column vector b starting at element 2 in an array.
As in "Example", array data is block-cyclically distributed using a 2 × 2 process grid.
!HPF$ PROCESSORS PROC(2,2)
!HPF$ ALIGN B(:) WITH A(:,1)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: A
CALL TRMM( A(2:13,2:13) , B(2:13) , 'U' )
-or-
CALL TRMM( A(2:13,2:13) , B(2:13) , 'U' , TRANSA='N' , DIAG='N' )
Only a portion of the data structure is used--that is, submatrix A. Following is the triangular submatrix A of order 12, starting at row 2 and column 2 in the array of order 13:
*                                                                   *
|    .    .    .    .    .    .    .    .    .    .    .    .    .  |
|    .  1.0  2.0  1.0  2.0  1.0  1.0  3.0  1.0  1.0  2.0  3.0  2.0  |
|    .    .  3.0  2.0  3.0  1.0  2.0  3.0  1.0  1.0  2.0  3.0  3.0  |
|    .    .    .  3.0  1.0  3.0  2.0  1.0  2.0  1.0  2.0  3.0  1.0  |
|    .    .    .    .  1.0  2.0  2.0  1.0  1.0  1.0  2.0  3.0  2.0  |
|    .    .    .    .    .  2.0  1.0  2.0  2.0  1.0  2.0  3.0  3.0  |
|    .    .    .    .    .    .  1.0  2.0  1.0  1.0  2.0  3.0  1.0  |
|    .    .    .    .    .    .    .  2.0  1.0  1.0  2.0  3.0  2.0  |
|    .    .    .    .    .    .    .    .  2.0  1.0  2.0  3.0  3.0  |
|    .    .    .    .    .    .    .    .    .  3.0  1.0  3.0  1.0  |
|    .    .    .    .    .    .    .    .    .    .  2.0  2.0  2.0  |
|    .    .    .    .    .    .    .    .    .    .    .  1.0  3.0  |
|    .    .    .    .    .    .    .    .    .    .    .    .  1.0  |
*                                                                   *
Only a portion of the data structure is used--that is, vector b, which is a column vector. Following is the input vector b of size 12, starting at element 2 in the array of size 13:
*        *
|     .  |
|   2.0  |
|   3.0  |
|   1.0  |
|   2.0  |
|   3.0  |
|   1.0  |
|   2.0  |
|   3.0  |
|   1.0  |
|   2.0  |
|   3.0  |
|   1.0  |
*        *
Only a portion of the data structure is used--that is, vector b, which is a column vector. Following is the output vector b of size 12, starting at element 2 in the array of size 13:
*        *
|     .  |
|  42.0  |
|  48.0  |
|  39.0  |
|  31.0  |
|  34.0  |
|  23.0  |
|  23.0  |
|  23.0  |
|  15.0  |
|  12.0  |
|   6.0  |
|   1.0  |
*        *
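The vector form can be verified serially in the same way. In this Python sketch (illustrative names), each row of the order-12 submatrix is stored from its diagonal element onward, so row i pairs with elements i through 12 of b:

```python
# Rows of the upper-triangular submatrix A(2:13,2:13), diagonal element first.
rows = [[1, 2, 1, 2, 1, 1, 3, 1, 1, 2, 3, 2],
        [3, 2, 3, 1, 2, 3, 1, 1, 2, 3, 3],
        [3, 1, 3, 2, 1, 2, 1, 2, 3, 1],
        [1, 2, 2, 1, 1, 1, 2, 3, 2],
        [2, 1, 2, 2, 1, 2, 3, 3],
        [1, 2, 1, 1, 2, 3, 1],
        [2, 1, 1, 2, 3, 2],
        [2, 1, 2, 3, 3],
        [3, 1, 3, 1],
        [2, 2, 2],
        [1, 3],
        [1]]
b = [2.0, 3.0, 1.0] * 4   # the section B(2:13)

# b <- A*b: row i of A starts at column i, so it multiplies b[i:].
b_out = [float(sum(x * y for x, y in zip(row, b[i:])))
         for i, row in enumerate(rows)]
```

The last element is unchanged because the final row of A contains only its diagonal value of 1.0.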
This subroutine performs one of the following solves for a triangular system of equations with multiple right-hand sides:
Solution | Equation |
---|---|
1. $B \leftarrow \alpha A^{-1} B$ | $AX = \alpha B$ |
2. $B \leftarrow \alpha A^{-T} B$ | $A^T X = \alpha B$ |
3. $B \leftarrow \alpha B A^{-1}$ | $XA = \alpha B$ |
4. $B \leftarrow \alpha B A^{-T}$ | $XA^T = \alpha B$ |
5. $b \leftarrow A^{-1} b$ | $Ax = b$ |
6. $b \leftarrow A^{-T} b$ | $A^T x = b$ |
where, in the formulas above:
Notes:
If any of the assumed-shape arrays have a size of zero, no computation is performed, and the subroutine returns after doing some parameter checking.
See references [17], [30], [31], and [44].
alpha, A, B, b | Subprogram |
Long-precision real | TRSM |
HPF | Solutions 1-4 | CALL TRSM (alpha, a, b, uplo, side) -or- CALL TRSM (alpha, a, b, uplo, side, transa, diag) |
HPF | Solutions 5 and 6 | CALL TRSM (a, b, uplo) -or- CALL TRSM (a, b, uplo, transa, diag) |
Type: required (solutions 1-4); not present (solutions 5 and 6)
Specified as: a number of the data type indicated in Table 115.
If uplo = 'U', the array contains the upper triangle of the triangular matrix A in its upper triangle, and its strictly lower triangular part is not referenced.
If uplo = 'L', the array contains the lower triangle of the triangular matrix A in its lower triangle, and its strictly upper triangular part is not referenced.
Type: required
Specified as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 115, where size(a,1) = size(a,2).
Type: required
Specified as: an assumed-shape array with shape (:,:) or (:), containing numbers of the data type indicated in Table 115.
If uplo = 'U', the upper triangular part is referenced.
If uplo = 'L', the lower triangular part is referenced.
Type: required
Specified as: a single character; uplo = 'U' or 'L'.
If side = 'L', A is to the left of B, resulting in solution 1 or 2.
If side = 'R', A is to the right of B, resulting in solution 3 or 4.
Type: required (solutions 1-4); not present (solutions 5 and 6)
Specified as: a single character; side = 'L' or 'R'.
If transa = 'N', A is used in the system of equations, resulting in solution 1, 3, or 5.
If transa = 'T', $A^T$ is used in the system of equations, resulting in solution 2, 4, or 6.
Type: optional
Default: transa = 'N'
Specified as: a single character; transa = 'N' or 'T'.
If diag = 'U', A is a unit triangular matrix.
If diag = 'N', A is not a unit triangular matrix.
Type: optional
Default: diag = 'N'
Specified as: a single character; diag = 'U' or 'N'.
Type: required
Returned as: an assumed-shape array with shape (:,:) or (:), containing numbers of the data type indicated in Table 115.
HPF-specific errors are listed below. Resource and input-argument errors listed in "Error Conditions" and "Error Conditions" also apply to this subroutine.
The process grid is not the same for a and b.
The data distribution is inconsistent for a and b.
The data distribution for a or b is unsupported.
The process grid is not the same for a and b.
The data distribution is inconsistent for a and b.
The shape of the assumed-shape array for a is invalid: size(a,1) <> size(a,2)
The data distribution for a or b is unsupported.
The shape of the assumed-shape arrays a and b is incompatible: size(a,1) <> size(b)
This example shows the solution $B \leftarrow \alpha A^{-1} B$. As in "Example", array data is block-cyclically distributed using a 2 × 2 process grid.
!HPF$ PROCESSORS PROC(2,2)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: A, B
CALL TRSM( 1.0D0 , A , B , 'U' , 'L' )
-or-
CALL TRSM( 1.0D0 , A , B , 'U' , 'L' , TRANSA='N' , DIAG='N' )
Triangular matrix A of order 5 is upper triangular:
*                                *
|   3.0  -1.0   2.0   2.0   1.0  |
|     .  -2.0   4.0  -1.0   3.0  |
|     .     .  -3.0   0.0   2.0  |
|     .     .     .   4.0  -2.0  |
|     .     .     .     .   1.0  |
*                                *
General 5 × 3 matrix B (input):
*                    *
|   6.0  10.0  -2.0  |
| -16.0  -1.0   6.0  |
|  -2.0   1.0  -4.0  |
|  14.0   0.0 -14.0  |
|  -1.0   2.0   1.0  |
*                    *
General 5 × 3 matrix B (output):
*                    *
|   2.0   3.0   1.0  |
|   5.0   5.0   4.0  |
|   0.0   1.0   2.0  |
|   3.0   1.0  -3.0  |
|  -1.0   2.0   1.0  |
*                    *
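This solve is the inverse of an upper-triangular product and can be modeled serially by back substitution, one right-hand side at a time. The following Python sketch is an illustration under that assumption; `trsm_upper_left` is a hypothetical name and says nothing about how Parallel ESSL performs the solve internally.

```python
N = None  # strictly lower triangle: not referenced when uplo = 'U'

A = [[3.0, -1.0,  2.0,  2.0,  1.0],
     [N,   -2.0,  4.0, -1.0,  3.0],
     [N,    N,   -3.0,  0.0,  2.0],
     [N,    N,    N,    4.0, -2.0],
     [N,    N,    N,    N,    1.0]]
B = [[  6.0, 10.0,  -2.0],
     [-16.0, -1.0,   6.0],
     [ -2.0,  1.0,  -4.0],
     [ 14.0,  0.0, -14.0],
     [ -1.0,  2.0,   1.0]]


def trsm_upper_left(alpha, a, b):
    """Solve A*X = alpha*B for upper-triangular, non-unit A by back substitution."""
    n, m = len(a), len(b[0])
    x = [[0.0] * m for _ in range(n)]
    for j in range(m):                      # each right-hand side
        for i in range(n - 1, -1, -1):      # last row first
            s = alpha * b[i][j] - sum(a[i][k] * x[k][j] for k in range(i + 1, n))
            x[i][j] = s / a[i][i]
    return x


X = trsm_upper_left(1.0, A, B)
```

The input B here is the product of A with the output matrix, so the solve recovers that matrix exactly.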
This example solves $b \leftarrow A^{-1} b$, where A is a unit triangular matrix, and b is a row vector.
Array sections are specified for arguments a and b, resulting in the computation using a submatrix A starting at row 2 and column 2 in an array and a row vector b starting at element 2 in an array.
As in "Example", array data is block-cyclically distributed using a 2 × 2 process grid.
!HPF$ PROCESSORS PROC(2,2)
!HPF$ ALIGN B(:) WITH A(1,:)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: A
CALL TRSM( A(2:13,2:13) , B(2:13) , 'L' , DIAG='U' )
-or-
CALL TRSM( A(2:13,2:13) , B(2:13) , 'L' , TRANSA='N' , DIAG='U' )
Only a portion of the data structure is used--that is, submatrix A. Following is the triangular submatrix A of order 12, starting at row 2 and column 2 in the array of order 13:
*                                                                   *
|    .    .    .    .    .    .    .    .    .    .    .    .    .  |
|    .  1.0    .    .    .    .    .    .    .    .    .    .    .  |
|    .  2.0  1.0    .    .    .    .    .    .    .    .    .    .  |
|    .  3.0  2.0  1.0    .    .    .    .    .    .    .    .    .  |
|    .  1.0  3.0  2.0  1.0    .    .    .    .    .    .    .    .  |
|    .  2.0  1.0  3.0  2.0  1.0    .    .    .    .    .    .    .  |
|    .  3.0  2.0  1.0  3.0  2.0  1.0    .    .    .    .    .    .  |
|    .  1.0  3.0  2.0  1.0  3.0  2.0  1.0    .    .    .    .    .  |
|    .  2.0  1.0  3.0  2.0  1.0  3.0  2.0  1.0    .    .    .    .  |
|    .  3.0  2.0  1.0  3.0  2.0  1.0  3.0  2.0  1.0    .    .    .  |
|    .  1.0  3.0  2.0  1.0  3.0  2.0  1.0  3.0  2.0  1.0    .    .  |
|    .  2.0  1.0  3.0  2.0  1.0  3.0  2.0  1.0  3.0  2.0  1.0    .  |
|    .  3.0  2.0  1.0  3.0  2.0  1.0  3.0  2.0  1.0  3.0  2.0  1.0  |
*                                                                   *
Note: | Because matrix A is unit triangular, the diagonal elements are not referenced. This subroutine assumes a value of 1.0 for the diagonal elements. |
Only a portion of the data structure is used--that is, vector b, which is a row vector. Following is the input vector b of size 12, starting at element 2 in the array of size 13:
*                                                                   *
|    .  2.0  7.0 13.0 15.0 17.0 26.0 28.0 27.0 39.0 41.0 37.0 52.0  |
*                                                                   *
Only a portion of the data structure is used--that is, vector b, which is a row vector. Following is the output vector b of size 12, starting at element 2 in the array of size 13:
*                                                                   *
|    .  2.0  3.0  1.0  2.0  3.0  1.0  2.0  3.0  1.0  2.0  3.0  1.0  |
*                                                                   *
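With a lower unit triangular matrix, the solve reduces to forward substitution with no division, because the diagonal is taken to be 1.0 and never read. A serial Python sketch (hypothetical name `solve_unit_lower`) follows. The listed entries of the submatrix fit the pattern A(i,j) = ((i - j) mod 3) + 1 for j ≤ i, which the sketch uses to build the matrix.

```python
n = 12
# Lower triangle of the order-12 submatrix; entries above the diagonal are unused.
A = [[float((i - j) % 3 + 1) if j <= i else None for j in range(n)]
     for i in range(n)]
b = [2.0, 7.0, 13.0, 15.0, 17.0, 26.0, 28.0, 27.0, 39.0, 41.0, 37.0, 52.0]


def solve_unit_lower(a, rhs):
    """Solve A*x = rhs for lower unit triangular A by forward substitution.

    Only a[i][k] with k < i is read; the diagonal is assumed to be 1.0.
    """
    x = []
    for i in range(len(rhs)):
        x.append(rhs[i] - sum(a[i][k] * x[k] for k in range(i)))
    return x


x = solve_unit_lower(A, b)
```

Replacing the diagonal of `A` with any other values would leave `x` unchanged, which is the behavior the note above describes for diag = 'U'.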
This subroutine computes one of the following rank-k updates:
where, in the formulas above:
Note: | No data should be moved to form the matrix transpose; that is, the matrix should always be stored in its untransposed form. |
In the following cases, no computation is performed and the subroutine returns after doing some parameter checking:
See references [17], [30], [31], and [44].
alpha, beta, A, C, a | Subprogram |
Long-precision real | SYRK |
HPF | Equations 1 and 2 | CALL SYRK (alpha, a, beta, c, uplo) -or- CALL SYRK (alpha, a, beta, c, uplo, trans) |
HPF | Equation 3 | CALL SYRK (alpha, a, c, uplo) |
alpha
Type: required
Specified as: a number of the data type indicated in Table 116.
a
Type: required
Specified as: an assumed-shape array with shape (:,:) or (:), containing numbers of the data type indicated in Table 116.
beta
Type: required (equations 1 and 2); not present (equation 3)
Specified as: a number of the data type indicated in Table 116.
c
If uplo = 'U', the array contains the upper triangle of the symmetric matrix C in its upper triangle, and its strictly lower triangular part is not referenced.
If uplo = 'L', the array contains the lower triangle of the symmetric matrix C in its lower triangle, and its strictly upper triangular part is not referenced.
For equations 1 and 2, when beta is zero, C need not be set on input.
Type: required
Specified as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 116, where size(c,1) = size(c,2).
uplo
If uplo = 'U', the upper triangular part is referenced.
If uplo = 'L', the lower triangular part is referenced.
Type: required
Specified as: a single character; uplo = 'U' or 'L'.
trans
If trans = 'N', the computation in equation 1 is performed.
If trans = 'T', the computation in equation 2 is performed.
Type: optional (equations 1 and 2); not present (equation 3)
Default: trans = 'N'
Specified as: a single character; trans = 'N' or 'T'.
c (on return)
Type: required
Returned as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 116.
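The uplo = 'U' and uplo = 'L' rules above mean that only one triangle of c is ever read or written. The following serial Python sketch (not HPF, and not the library's implementation) illustrates this for the vector form; the function name `rank1_update` and the sentinel value are hypothetical and exist only to make the untouched entry visible.

```python
def rank1_update(alpha, a, c, uplo="L"):
    """Apply C <- alpha*a*a^T + C to one triangle of c, in place."""
    n = len(a)
    for i in range(n):
        # 'U' touches columns i..n-1 of row i; 'L' touches columns 0..i.
        cols = range(i, n) if uplo == "U" else range(i + 1)
        for j in cols:
            c[i][j] += alpha * a[i] * a[j]
    return c


SENTINEL = -99.0  # stands in for storage that must not be referenced
C = [[1.0, SENTINEL],
     [2.0, 3.0]]
rank1_update(1.0, [1.0, 2.0], C, uplo="L")
print(C)
```

With uplo = 'L' the strictly upper entry keeps its sentinel value, while the three lower-triangle entries are updated.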
HPF-specific errors are listed below. Resource and input-argument errors listed in "Error Conditions" and "Error Conditions" also apply to this subroutine.
The process grid is not the same for a and c.
The data distribution is inconsistent for a and c.
The data distribution for a or c is unsupported.
The process grid is not the same for a and c.
The data distribution is unsupported for c.
The vector for a is replicated.
The data distribution for a is unsupported.
The shape of the assumed-shape array for c is invalid: size(c,1) <> size(c,2)
The data distribution for c or a is unsupported.
The shape of the assumed-shape arrays c and a is incompatible: size(c,1) <> size(a)
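The two shape conditions listed above can be expressed as a small check. This is an illustrative Python sketch only; `check_vector_form_shapes` is a hypothetical helper, and the library's actual error handling and message text are not reproduced here.

```python
def check_vector_form_shapes(c_shape, a_size):
    """Return the violated shape conditions for c (2-D) and vector a."""
    errors = []
    rows, cols = c_shape
    if rows != cols:
        errors.append("size(c,1) <> size(c,2)")
    if rows != a_size:
        errors.append("size(c,1) <> size(a)")
    return errors


print(check_vector_form_shapes((9, 9), 9))
print(check_vector_form_shapes((9, 8), 9))
print(check_vector_form_shapes((8, 8), 9))
```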
This example computes $C = \alpha A A^{T} + \beta C$. As in "Example", array data is block-cyclically distributed using a 2 × 3 process grid.
!HPF$ PROCESSORS PROC(2,3)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: A, C
CALL SYRK( 1.0D0 , A , 1.0D0 , C , UPLO='L' )
-or-
CALL SYRK( 1.0D0 , A , 1.0D0 , C , UPLO='L' , TRANS='N' )
General 8 × 5 matrix A:
*                              *
|  0.0   8.0  16.0  24.0  32.0 |
|  1.0   9.0  17.0  25.0  33.0 |
|  2.0  10.0  18.0  26.0  34.0 |
|  3.0  11.0  19.0  27.0  35.0 |
|  4.0  12.0  20.0  28.0  36.0 |
|  5.0  13.0  21.0  29.0  37.0 |
|  6.0  14.0  22.0  30.0  38.0 |
|  7.0  15.0  23.0  31.0  39.0 |
*                              *
Symmetric matrix C of order 8:
*                                                *
|  0.0     .     .     .     .     .     .     . |
|  1.0   8.0     .     .     .     .     .     . |
|  2.0   9.0  15.0     .     .     .     .     . |
|  3.0  10.0  16.0  21.0     .     .     .     . |
|  4.0  11.0  17.0  22.0  26.0     .     .     . |
|  5.0  12.0  18.0  23.0  27.0  30.0     .     . |
|  6.0  13.0  19.0  24.0  28.0  31.0  33.0     . |
|  7.0  14.0  20.0  25.0  29.0  32.0  34.0  35.0 |
*                                                *
Symmetric matrix C of order 8 (output):
*                                                                *
| 1920.0       .       .       .       .       .       .       . |
| 2001.0  2093.0       .       .       .       .       .       . |
| 2082.0  2179.0  2275.0       .       .       .       .       . |
| 2163.0  2265.0  2366.0  2466.0       .       .       .       . |
| 2244.0  2351.0  2457.0  2562.0  2666.0       .       .       . |
| 2325.0  2437.0  2548.0  2658.0  2767.0  2875.0       .       . |
| 2406.0  2523.0  2639.0  2754.0  2868.0  2981.0  3093.0       . |
| 2487.0  2609.0  2730.0  2850.0  2969.0  3087.0  3204.0  3320.0 |
*                                                                *
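The result above can be reproduced serially with a few lines of Python. This is a sketch for checking the arithmetic only, not HPF and not the distributed algorithm; `syrk_lower` is a hypothetical name, and the triangle is stored as ragged rows so that the unreferenced upper part is simply absent.

```python
def syrk_lower(alpha, a, beta, c):
    """Return the lower triangle of alpha*A*A^T + beta*C as ragged rows."""
    k = len(a[0])
    return [[alpha * sum(a[i][p] * a[j][p] for p in range(k)) + beta * c[i][j]
             for j in range(i + 1)]
            for i in range(len(a))]


# A(i,j) = i + 8j (0-based) reproduces the 8 x 5 matrix A of the example.
A = [[float(i + 8 * j) for j in range(5)] for i in range(8)]
C_IN = [
    [0.0],
    [1.0, 8.0],
    [2.0, 9.0, 15.0],
    [3.0, 10.0, 16.0, 21.0],
    [4.0, 11.0, 17.0, 22.0, 26.0],
    [5.0, 12.0, 18.0, 23.0, 27.0, 30.0],
    [6.0, 13.0, 19.0, 24.0, 28.0, 31.0, 33.0],
    [7.0, 14.0, 20.0, 25.0, 29.0, 32.0, 34.0, 35.0],
]

C_OUT = syrk_lower(1.0, A, 1.0, C_IN)
for row in C_OUT:
    print(row)
```

Each printed row corresponds to one row of the lower triangle of the output matrix C.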
This example computes $C = \alpha a a^{T} + C$. As in "Example", array data is block-cyclically distributed using a 2 × 2 process grid.
!HPF$ PROCESSORS PROC(2,2)
!HPF$ ALIGN A(:) WITH C(:,1)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: C
CALL SYRK( 1.0D0 , A , C , UPLO='L' )
Symmetric matrix C of order 9:
*                                                      *
|  1.0     .     .     .     .     .     .     .     . |
|  2.0  12.0     .     .     .     .     .     .     . |
|  3.0  13.0  23.0     .     .     .     .     .     . |
|  4.0  14.0  24.0  34.0     .     .     .     .     . |
|  5.0  15.0  25.0  35.0  45.0     .     .     .     . |
|  6.0  16.0  26.0  36.0  46.0  56.0     .     .     . |
|  7.0  17.0  27.0  37.0  47.0  57.0  67.0     .     . |
|  8.0  18.0  28.0  38.0  48.0  58.0  68.0  78.0     . |
|  9.0  19.0  29.0  39.0  49.0  59.0  69.0  79.0  89.0 |
*                                                      *
Vector a of size 9:
*     *
| 1.0 |
| 1.0 |
| 1.0 |
| 1.0 |
| 1.0 |
| 1.0 |
| 1.0 |
| 1.0 |
| 1.0 |
*     *
Symmetric matrix C of order 9 (output):
*                                                      *
|  2.0     .     .     .     .     .     .     .     . |
|  3.0  13.0     .     .     .     .     .     .     . |
|  4.0  14.0  24.0     .     .     .     .     .     . |
|  5.0  15.0  25.0  35.0     .     .     .     .     . |
|  6.0  16.0  26.0  36.0  46.0     .     .     .     . |
|  7.0  17.0  27.0  37.0  47.0  57.0     .     .     . |
|  8.0  18.0  28.0  38.0  48.0  58.0  68.0     .     . |
|  9.0  19.0  29.0  39.0  49.0  59.0  69.0  79.0     . |
| 10.0  20.0  30.0  40.0  50.0  60.0  70.0  80.0  90.0 |
*                                                      *
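Because every element of a is 1.0, the rank-one update adds exactly 1.0 to each referenced entry of C. A serial Python sketch of the same arithmetic follows; it is not HPF code, `syr_lower` is a hypothetical name, and the closed form used to fill the input triangle is an assumption read off the matrix printed above.

```python
def syr_lower(alpha, a, c):
    """Return the lower triangle of alpha*a*a^T + C as ragged rows."""
    return [[c[i][j] + alpha * a[i] * a[j] for j in range(i + 1)]
            for i in range(len(a))]


# Lower triangle of the input C: entry (i, j), 0-based, is 10*j + i + 1.
C_IN = [[float(10 * j + i + 1) for j in range(i + 1)] for i in range(9)]
A_VEC = [1.0] * 9

C_OUT = syr_lower(1.0, A_VEC, C_IN)
for row in C_OUT:
    print(row)
```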
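This subroutine computes one of the following rank-2k updates:
1. $C \leftarrow \alpha A B^{T} + \alpha B A^{T} + \beta C$
2. $C \leftarrow \alpha A^{T} B + \alpha B^{T} A + \beta C$
3. $C \leftarrow \alpha a b^{T} + \alpha b a^{T} + C$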
where, in the formulas above:
Note: No data should be moved to form the matrix transposes; that is, the matrices should always be stored in their untransposed forms.
In the following cases, no computation is performed and the subroutine returns after doing some parameter checking:
See references [17], [30], [31], and [44].
Table 117. Data Types
alpha, beta, A, B, C, a, b | Subprogram |
---|---|
Long-precision real | SYR2K |
HPF | Equations 1 and 2 | CALL SYR2K (alpha, a, b, beta, c, uplo) or CALL SYR2K (alpha, a, b, beta, c, uplo, trans) |
HPF | Equation 3 | CALL SYR2K (alpha, a, b, c, uplo) |
alpha
Type: required
Specified as: a number of the data type indicated in Table 117.
a
Type: required
Specified as: an assumed-shape array with shape (:,:) or (:), containing numbers of the data type indicated in Table 117.
b
Type: required
Specified as: an assumed-shape array with shape (:,:) or (:), containing numbers of the data type indicated in Table 117.
beta
Type: required (equations 1 and 2); not present (equation 3)
Specified as: a number of the data type indicated in Table 117.
c
If uplo = 'U', the array contains the upper triangle of the symmetric matrix C in its upper triangle, and its strictly lower triangular part is not referenced.
If uplo = 'L', the array contains the lower triangle of the symmetric matrix C in its lower triangle, and its strictly upper triangular part is not referenced.
For equations 1 and 2, when beta is zero, C need not be set on input.
Type: required
Specified as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 117, where size(c,1) = size(c,2).
uplo
If uplo = 'U', the upper triangular part is referenced.
If uplo = 'L', the lower triangular part is referenced.
Type: required
Specified as: a single character; uplo = 'U' or 'L'.
trans
If trans = 'N', the computation in equation 1 is performed.
If trans = 'T', the computation in equation 2 is performed.
Type: optional (equations 1 and 2); not present (equation 3)
Default: trans = 'N'
Specified as: a single character; trans = 'N' or 'T'.
c (on return)
Type: required
Returned as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 117.
HPF-specific errors are listed below. Resource and input-argument errors listed in "Notes and Coding Rules" and "Notes and Coding Rules" also apply to this subroutine.
The process grid is not the same for a, b, and c.
The data distribution is inconsistent for a, b, and c.
The data distribution for a, b, or c is unsupported.
The process grid is not the same for a, b, and c.
The data distribution is unsupported for c.
The vector for a or b is replicated.
The data distribution is unsupported for a or b.
The data distribution for a, b, or c is unsupported.
This example computes $C = \alpha A^{T} B + \alpha B^{T} A + \beta C$. As in "Example", array data is block-cyclically distributed using a 2 × 2 process grid.
!HPF$ PROCESSORS PROC(2,2)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: A, B, C
CALL SYR2K( 1.0D0 , A , B , 0.0D0 , C , 'U' , 'T' )
General 8 × 9 matrix A:
*                                                      *
|  0.0  -1.0  -1.0   0.0   0.0   0.0   0.0   0.0   1.0 |
|  0.0   1.0   0.0   1.0   0.0   1.0   0.0   1.0   1.0 |
|  0.0   0.0  -1.0  -1.0   0.0   0.0   1.0   0.0   1.0 |
|  0.0   1.0   0.0  -1.0   1.0   1.0   0.0   1.0   1.0 |
|  1.0   0.0   0.0   0.0  -1.0   0.0   0.0   0.0   1.0 |
|  1.0   0.0   0.0   0.0   1.0   1.0   0.0   0.0   1.0 |
|  0.0   0.0  -1.0   0.0  -1.0   0.0   0.0   0.0   1.0 |
| -1.0   0.0   0.0   0.0   0.0   0.0  -1.0   0.0   1.0 |
*                                                      *
General 8 × 9 matrix B:
*                                                      *
|  0.0   1.0   1.0   0.0   0.0   0.0   0.0   0.0  -1.0 |
|  0.0  -1.0   0.0  -1.0   0.0  -1.0   0.0  -1.0  -1.0 |
|  0.0   0.0   1.0   1.0   0.0   0.0  -1.0   0.0  -1.0 |
|  0.0  -1.0   0.0   1.0  -1.0  -1.0   0.0  -1.0  -1.0 |
| -1.0   0.0   0.0   0.0   1.0   0.0   0.0   0.0  -1.0 |
| -1.0   0.0   0.0   0.0  -1.0  -1.0   0.0   0.0  -1.0 |
|  0.0   0.0   1.0   0.0   1.0   0.0   0.0   0.0  -1.0 |
|  1.0   0.0   0.0   0.0   0.0   0.0   1.0   0.0  -1.0 |
*                                                      *
Symmetric matrix C of order 9:
*                                                               *
|  -6.0    0.0    0.0    0.0    0.0   -2.0   -2.0    0.0   -2.0 |
|     .   -6.0   -2.0    0.0   -2.0   -4.0    0.0   -4.0   -2.0 |
|     .      .   -6.0   -2.0   -2.0    0.0    2.0    0.0    6.0 |
|     .      .      .   -6.0    2.0    0.0    2.0    0.0    2.0 |
|     .      .      .      .   -8.0   -4.0    0.0   -2.0    0.0 |
|     .      .      .      .      .   -6.0    0.0   -4.0   -6.0 |
|     .      .      .      .      .      .   -4.0    0.0    0.0 |
|     .      .      .      .      .      .      .   -4.0   -4.0 |
|     .      .      .      .      .      .      .      .  -16.0 |
*                                                               *
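A serial Python sketch of the trans = 'T', uplo = 'U' computation is shown below as a way to check the matrix above. It is not HPF and does not model the data distribution; `syr2k_upper_t` is a hypothetical name. In this example the printed matrix B is the element-wise negation of A, so the sketch builds B that way instead of repeating the literal.

```python
def syr2k_upper_t(alpha, a, b, beta=0.0, c=None):
    """Upper triangle of alpha*A^T*B + alpha*B^T*A + beta*C, as ragged rows.

    Row i holds columns i..n-1. When beta is zero, c is never read.
    """
    m, n = len(a), len(a[0])
    out = []
    for i in range(n):
        row = []
        for j in range(i, n):
            s = sum(a[p][i] * b[p][j] + b[p][i] * a[p][j] for p in range(m))
            old = beta * c[i][j - i] if beta != 0.0 else 0.0
            row.append(alpha * s + old)
        out.append(row)
    return out


A = [
    [0.0, -1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, -1.0, -1.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, -1.0, 1.0, 1.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 0.0, -1.0, 0.0, 0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, -1.0, 0.0, -1.0, 0.0, 0.0, 0.0, 1.0],
    [-1.0, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0, 0.0, 1.0],
]
B = [[0.0 - v for v in row] for row in A]  # B = -A in this example

C_OUT = syr2k_upper_t(1.0, A, B)  # beta = 0, so no input C is needed
for row in C_OUT:
    print(row)
```

Since beta is zero here, the sketch never reads an input C, which mirrors the rule that C need not be set on input in that case.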
This example computes $C = \alpha a b^{T} + \alpha b a^{T} + C$. As in "Example", array data is block-cyclically distributed using a 2 × 2 process grid.
!HPF$ PROCESSORS PROC(2,2)
!HPF$ ALIGN A(:) WITH C(:,1)
!HPF$ ALIGN B(:) WITH C(:,1)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: C
CALL SYR2K( 1.0D0 , A , B , C , 'L' )
Symmetric matrix C of order 9:
*                                                      *
|  1.0     .     .     .     .     .     .     .     . |
|  2.0  12.0     .     .     .     .     .     .     . |
|  3.0  13.0  23.0     .     .     .     .     .     . |
|  4.0  14.0  24.0  34.0     .     .     .     .     . |
|  5.0  15.0  25.0  35.0  45.0     .     .     .     . |
|  6.0  16.0  26.0  36.0  46.0  56.0     .     .     . |
|  7.0  17.0  27.0  37.0  47.0  57.0  67.0     .     . |
|  8.0  18.0  28.0  38.0  48.0  58.0  68.0  78.0     . |
|  9.0  19.0  29.0  39.0  49.0  59.0  69.0  79.0  89.0 |
*                                                      *
Vector a of size 9:
*     *
| 1.0 |
| 1.0 |
| 1.0 |
| 1.0 |
| 1.0 |
| 1.0 |
| 1.0 |
| 1.0 |
| 1.0 |
*     *
Vector b of size 9:
*     *
| 2.0 |
| 2.0 |
| 2.0 |
| 2.0 |
| 2.0 |
| 2.0 |
| 2.0 |
| 2.0 |
| 2.0 |
*     *
Symmetric matrix C of order 9 (output):
*                                                      *
|  5.0     .     .     .     .     .     .     .     . |
|  6.0  16.0     .     .     .     .     .     .     . |
|  7.0  17.0  27.0     .     .     .     .     .     . |
|  8.0  18.0  28.0  38.0     .     .     .     .     . |
|  9.0  19.0  29.0  39.0  49.0     .     .     .     . |
| 10.0  20.0  30.0  40.0  50.0  60.0     .     .     . |
| 11.0  21.0  31.0  41.0  51.0  61.0  71.0     .     . |
| 12.0  22.0  32.0  42.0  52.0  62.0  72.0  82.0     . |
| 13.0  23.0  33.0  43.0  53.0  63.0  73.0  83.0  93.0 |
*                                                      *
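With a filled with 1.0 and b filled with 2.0, each referenced entry of C grows by 1.0·2.0 + 2.0·1.0 = 4.0. The serial Python sketch below checks this; it is illustrative only (not HPF), `syr2_lower` is a hypothetical name, and the closed form for the input triangle is an assumption read off the printed matrix.

```python
def syr2_lower(alpha, a, b, c):
    """Return the lower triangle of alpha*a*b^T + alpha*b*a^T + C."""
    return [[c[i][j] + alpha * (a[i] * b[j] + b[i] * a[j])
             for j in range(i + 1)]
            for i in range(len(a))]


# Lower triangle of the input C: entry (i, j), 0-based, is 10*j + i + 1.
C_IN = [[float(10 * j + i + 1) for j in range(i + 1)] for i in range(9)]
A_VEC = [1.0] * 9
B_VEC = [2.0] * 9

C_OUT = syr2_lower(1.0, A_VEC, B_VEC, C_IN)
for row in C_OUT:
    print(row)
```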
This subroutine performs the following matrix computation:
$C \leftarrow \beta C + \alpha A^{T}$
where, in the formula above:
Note: No data should be moved to form the matrix transpose; that is, the matrix should always be stored in its untransposed form.
In the following two cases, no computation is performed and the subroutine returns after doing some parameter checking:
See references [17], [30], [31], and [44].
Table 118. Data Types
alpha, beta, A, C | Subprogram |
---|---|
Long-precision real | TRAN |
HPF | CALL TRAN (alpha, a, beta, c) |
alpha
Type: required
Specified as: a number of the data type indicated in Table 118.
a
Type: required
Specified as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 118.
beta
Type: required
Specified as: a number of the data type indicated in Table 118.
c
Type: required
Specified as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 118.
c (on return)
Type: required
Returned as: an assumed-shape array with shape (:,:), containing numbers of the data type indicated in Table 118.
HPF-specific errors are listed below. Resource and input-argument errors listed in "Error Conditions" also apply to this subroutine.
The process grid is not the same for a and c.
The data distribution is inconsistent for a and c.
The shape of the assumed-shape arrays a and c is incompatible:
The data distribution for a or c is unsupported.
This example computes $C = \beta C + \alpha A^{T}$. As in "Example", array data is block-cyclically distributed using a 2 × 2 process grid.
!HPF$ PROCESSORS PROC(2,2)
!HPF$ DISTRIBUTE (CYCLIC, CYCLIC) ONTO PROC :: A, C
CALL TRAN( 1.0D0 , A , 1.0D0 , C )
General 8 × 9 matrix A:
*                                                      *
|  0.0  -1.0  -1.0   0.0   0.0   0.0   0.0   0.0   1.0 |
|  0.0   1.0   0.0   1.0   0.0   1.0   0.0   1.0   1.0 |
|  0.0   0.0  -1.0  -1.0   0.0   0.0   1.0   0.0   1.0 |
|  0.0   1.0   0.0  -1.0   1.0   1.0   0.0   1.0   1.0 |
|  1.0   0.0   0.0   0.0  -1.0   0.0   0.0   0.0   1.0 |
|  1.0   0.0   0.0   0.0   1.0   1.0   0.0   0.0   1.0 |
|  0.0   0.0  -1.0   0.0  -1.0   0.0   0.0   0.0   1.0 |
| -1.0   0.0   0.0   0.0   0.0   0.0  -1.0   0.0   1.0 |
*                                                      *
General 9 × 8 matrix C:
*                                                *
|  0.0   1.0   1.0   5.0   6.0   7.0   8.0   9.0 |
|  0.0  -1.0   0.0  -1.0   0.0  -1.0   0.0   1.0 |
|  0.0   0.0   1.0   1.0   0.0   0.0  -1.0   0.0 |
|  0.0  -1.0   0.0   1.0  -1.0  -1.0   0.0   1.0 |
| -1.0   2.0   0.0   0.0   1.0   0.0   0.0   0.0 |
| -1.0   3.0   0.0   0.0  -1.0  -1.0   0.0   0.0 |
|  0.0   4.0   1.0   0.0   1.0   0.0   0.0   0.0 |
|  1.0   5.0   0.0   0.0   0.0   0.0   1.0   0.0 |
|  1.0   2.0   3.0   4.0   1.0   1.0   1.0   1.0 |
*                                                *
General 9 × 8 matrix C (output):
*                                                *
|  0.0   1.0   1.0   5.0   7.0   8.0   8.0   8.0 |
| -1.0   0.0   0.0   0.0   0.0  -1.0   0.0   1.0 |
| -1.0   0.0   0.0   1.0   0.0   0.0  -2.0   0.0 |
|  0.0   0.0  -1.0   0.0  -1.0  -1.0   0.0   1.0 |
| -1.0   2.0   0.0   1.0   0.0   1.0  -1.0   0.0 |
| -1.0   4.0   0.0   1.0  -1.0   0.0   0.0   0.0 |
|  0.0   4.0   2.0   0.0   1.0   0.0   0.0  -1.0 |
|  1.0   6.0   0.0   1.0   0.0   0.0   1.0   0.0 |
|  2.0   3.0   4.0   5.0   2.0   2.0   2.0   2.0 |
*                                                *
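The transposition example can be checked serially with the Python sketch below. It only reproduces the arithmetic of the formula for this example; it is not HPF, it builds a new matrix instead of updating distributed storage, and `tran` here is a hypothetical stand-in for the subroutine of the same name.

```python
def tran(alpha, a, beta, c):
    """Return beta*C + alpha*A^T, where A is m x n and C is n x m."""
    m, n = len(a), len(a[0])
    return [[beta * c[i][j] + alpha * a[j][i] for j in range(m)]
            for i in range(n)]


A = [
    [0.0, -1.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 0.0, 1.0, 1.0],
    [0.0, 0.0, -1.0, -1.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 1.0, 0.0, -1.0, 1.0, 1.0, 0.0, 1.0, 1.0],
    [1.0, 0.0, 0.0, 0.0, -1.0, 0.0, 0.0, 0.0, 1.0],
    [1.0, 0.0, 0.0, 0.0, 1.0, 1.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, -1.0, 0.0, -1.0, 0.0, 0.0, 0.0, 1.0],
    [-1.0, 0.0, 0.0, 0.0, 0.0, 0.0, -1.0, 0.0, 1.0],
]
C_IN = [
    [0.0, 1.0, 1.0, 5.0, 6.0, 7.0, 8.0, 9.0],
    [0.0, -1.0, 0.0, -1.0, 0.0, -1.0, 0.0, 1.0],
    [0.0, 0.0, 1.0, 1.0, 0.0, 0.0, -1.0, 0.0],
    [0.0, -1.0, 0.0, 1.0, -1.0, -1.0, 0.0, 1.0],
    [-1.0, 2.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [-1.0, 3.0, 0.0, 0.0, -1.0, -1.0, 0.0, 0.0],
    [0.0, 4.0, 1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 5.0, 0.0, 0.0, 0.0, 0.0, 1.0, 0.0],
    [1.0, 2.0, 3.0, 4.0, 1.0, 1.0, 1.0, 1.0],
]

C_OUT = tran(1.0, A, 1.0, C_IN)
for row in C_OUT:
    print(row)
```

Row i of the result is row i of the input C plus column i of A, which is why the last row (column 9 of A is all 1.0) increases by 1.0 throughout.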