Guide and Reference

Linear Least Squares Considerations

This section provides some key points about using the linear least squares subroutines.

Use Considerations

If you want to use a singular value decomposition method to compute the minimal norm linear least squares solution of AX ≅ B, calls to SGESVF or DGESVF should be followed by calls to SGESVS or DGESVS, respectively.

Performance and Accuracy Considerations

Least squares solutions obtained by using a singular value decomposition require more storage and run time than those obtained using a QR decomposition with column pivoting. The singular value decomposition method, however, is a more reliable way to handle rank deficiency.
The short-precision subroutines provide increased accuracy by accumulating intermediate results in long precision. Occasionally, for performance reasons, these intermediate results are stored.
The accuracy of the resulting singular values and singular vectors varies between the short- and long-precision versions of each subroutine. The degree of difference depends on the size and conditioning of the matrix computation.
There are ESSL-specific rules that apply to the results of computations on the workstation processors using the ANSI/IEEE standards. For details, see "What Data Type Standards Are Used by ESSL, and What Exceptions Should You Know About?".
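
The benefit of accumulating short-precision intermediate results in long precision can be illustrated with a dot product. The following is a minimal sketch in C, not ESSL code: the function names dot_short_acc and dot_long_acc are hypothetical, and the sketch assumes that float and double correspond to short and long precision.

```c
#include <stddef.h>

/* Dot product of two short-precision vectors with the running sum
   held in short precision (hypothetical helper, for comparison only). */
float dot_short_acc(const float *x, const float *y, size_t n)
{
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i)
        sum += x[i] * y[i];
    return sum;
}

/* The same dot product, but each product and the running sum are held
   in long precision; only the final result is rounded back to short
   precision (hypothetical helper). */
float dot_long_acc(const float *x, const float *y, size_t n)
{
    double sum = 0.0;
    for (size_t i = 0; i < n; ++i)
        sum += (double)x[i] * (double)y[i];
    return (float)sum;
}
```

For x = (1.0E8, 1.0, -1.0E8) and y = (1.0, 1.0, 1.0), the long-precision accumulation returns 1.0. Accumulating in short precision typically loses the middle term, because 1.0E8 + 1.0 is not representable in short precision.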

Dense Linear Algebraic Equation Subroutines

This section contains the dense linear algebraic equation subroutine descriptions.

SGEF, DGEF, CGEF, and ZGEF--General Matrix Factorization

These subroutines factor a square general matrix A using Gaussian elimination with partial pivoting. To solve the system of equations with one or more right-hand sides, follow the call to these subroutines with one or more calls to SGES/SGESM, DGES/DGESM, CGES/CGESM, or ZGES/ZGESM, respectively. To compute the inverse of matrix A, follow the call to SGEF or DGEF with a call to SGEICD or DGEICD, respectively.

Table 87. Data Types

A	Subroutine
Short-precision real	SGEF
Long-precision real	DGEF
Short-precision complex	CGEF
Long-precision complex	ZGEF

Note: The output from these factorization subroutines should be used only as input to the following subroutines for performing a solve or inverse: SGES/SGESM/SGEICD, DGES/DGESM/DGEICD, CGES/CGESM, and ZGES/ZGESM, respectively.

Syntax

Fortran	CALL SGEF | DGEF | CGEF | ZGEF (`a`, `lda`, `n`, `ipvt`)
C and C++	sgef | dgef | cgef | zgef (`a`, `lda`, `n`, `ipvt`);
PL/I	CALL SGEF | DGEF | CGEF | ZGEF (`a`, `lda`, `n`, `ipvt`);

On Entry

a: is the n by n general matrix A to be factored. Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 87.
lda: is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= n.
n: is the order of matrix A. Specified as: a fullword integer; 0 <= n <= lda.
ipvt: See 'On Return'.
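
The role of lda can be made concrete by computing where element A(i,j) lives inside the array. The following is a minimal sketch in C that assumes Fortran-style column-major storage. The helper names colmajor_offset and pack_colmajor are hypothetical and are not part of ESSL.

```c
#include <stddef.h>

/* Offset of element A(i,j), with one-based i and j as in the text,
   inside a column-major array whose leading dimension is lda. */
size_t colmajor_offset(int i, int j, int lda)
{
    return (size_t)(j - 1) * (size_t)lda + (size_t)(i - 1);
}

/* Copy an n by n matrix given in row-major order into a column-major
   lda by n array. Rows n+1 through lda of each column are left
   untouched, which is why lda >= n is required. */
void pack_colmajor(const double *rows, int n, double *a, int lda)
{
    for (int i = 1; i <= n; ++i)
        for (int j = 1; j <= n; ++j)
            a[colmajor_offset(i, j, lda)] =
                rows[(size_t)(i - 1) * (size_t)n + (size_t)(j - 1)];
}
```

With lda = 9, element A(3,1) is at offset 2 and element A(1,2) is at offset 9, so consecutive elements of a column are adjacent in storage.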

On Return

a: is the n by n transformed matrix A, containing the results of the factorization. See "Function". Returned as: an lda by (at least) n array, containing numbers of the data type indicated in Table 87.
ipvt: is the integer vector ipvt of length n, containing the pivot information necessary to construct matrix L from the information contained in the output array a. Returned as: a one-dimensional array of (at least) length n, containing fullword integers.

Notes

ipvt is not a permutation vector in the strict sense. It is used to record row interchanges in L due to partial pivoting.
Calling SGEFCD or DGEFCD with iopt = 0 is equivalent to calling SGEF or DGEF.
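
The difference between an interchange record and a permutation vector can be sketched as follows. This C fragment assumes that ipvt(i) names the row that was interchanged with row i at step i, which is consistent with the IPVT values in the examples of this section but is an assumption of the sketch. The function name pivots_to_permutation is hypothetical.

```c
/* Convert a record of row interchanges (one-based, length n) into a
   permutation vector: on return, perm[i] is the one-based index of the
   original row that ends up in position i+1 after all interchanges. */
void pivots_to_permutation(const int *ipvt, int n, int *perm)
{
    /* Start from the identity permutation. */
    for (int i = 0; i < n; ++i)
        perm[i] = i + 1;

    /* Replay the interchanges in the order they were recorded. */
    for (int i = 0; i < n; ++i) {
        int p = ipvt[i] - 1;
        int tmp = perm[i];
        perm[i] = perm[p];
        perm[p] = tmp;
    }
}
```

Under this assumption, the interchange record (4, 4, 3, 4) corresponds to the row permutation (4, 1, 3, 2).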

Function

The matrix A is factored using Gaussian elimination with partial pivoting to compute the LU factorization of A, where:

ipvt is the vector containing the pivoting information.

L is a unit lower triangular matrix.

U is an upper triangular matrix.

The transformed matrix A contains U in the upper triangle. In its strict lower triangle, it contains the multipliers necessary to construct, with the help of ipvt, a matrix L, such that A = LU.

If n is 0, no computation is performed. See references [36] and [38].
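
The factorization described above can be sketched in C as follows. This is a textbook LU factorization with partial pivoting, not the ESSL implementation. It assumes column-major storage, one-based pivot indices, full row interchanges, and multipliers stored with positive sign. The names lu_factor_sketch, ELEM, and abs_val are hypothetical.

```c
#include <stddef.h>

/* Element (i,j), zero-based, of a column-major array with leading
   dimension lda. */
#define ELEM(a, lda, i, j) ((a)[(size_t)(j) * (size_t)(lda) + (size_t)(i)])

static double abs_val(double v) { return v < 0.0 ? -v : v; }

/* Factor the n by n matrix stored in a. On return, the upper triangle
   holds U, the strict lower triangle holds the multipliers, and ipvt
   holds the one-based row interchanged with each row.
   Returns 0 on success, or the one-based index of the first zero pivot. */
int lu_factor_sketch(double *a, int lda, int n, int *ipvt)
{
    int first_zero = 0;
    for (int k = 0; k < n; ++k) {
        /* Find the entry of largest magnitude in column k, rows k..n-1. */
        int p = k;
        double best = abs_val(ELEM(a, lda, k, k));
        for (int i = k + 1; i < n; ++i) {
            double v = abs_val(ELEM(a, lda, i, k));
            if (v > best) { best = v; p = i; }
        }
        ipvt[k] = p + 1;
        if (best == 0.0) {
            /* Singular column: remember it and skip the elimination. */
            if (first_zero == 0) first_zero = k + 1;
            continue;
        }
        /* Interchange complete rows k and p. */
        if (p != k) {
            for (int j = 0; j < n; ++j) {
                double t = ELEM(a, lda, k, j);
                ELEM(a, lda, k, j) = ELEM(a, lda, p, j);
                ELEM(a, lda, p, j) = t;
            }
        }
        /* Store the multipliers and update the trailing submatrix. */
        for (int i = k + 1; i < n; ++i) {
            double m = ELEM(a, lda, i, k) / ELEM(a, lda, k, k);
            ELEM(a, lda, i, k) = m;
            for (int j = k + 1; j < n; ++j)
                ELEM(a, lda, i, j) -= m * ELEM(a, lda, k, j);
        }
    }
    return first_zero;
}
```

Applied to the input matrix of Example 1 below, this sketch should select the same pivot rows as the IPVT vector shown there, although agreement with ESSL in every digit is not guaranteed.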

Error Conditions

Resource Errors

Unable to allocate internal work area.

Computational Errors

Matrix A is singular.

One or more columns of L and the corresponding diagonal of U contain all zeros (all columns of L are checked). The first column, i, of L with a corresponding U_ii = 0 diagonal element is identified in the computational error message.
The return code is set to 1.
i can be determined at run time by use of the ESSL error-handling facilities. To obtain this information, you must use ERRSET to change the number of allowable errors for error code 2103 in the ESSL error option table; otherwise, the default value causes your program to terminate when this error occurs. For details, see "What Can You Do about ESSL Computational Errors?".

Input-Argument Errors

lda <= 0
n < 0
n > lda

Example 1

This example shows a factorization of a real general matrix A of order 9.

Call Statement and Input

           A  LDA  N   IPVT
           |   |   |    |
CALL SGEF( A , 9 , 9 , IPVT )

        *                                                *
        | 1.0  1.0  1.0  1.0  0.0  0.0   0.0   0.0   0.0 |
        | 1.0  1.0  1.0  1.0  1.0  0.0   0.0   0.0   0.0 |
        | 4.0  1.0  1.0  1.0  1.0  1.0   0.0   0.0   0.0 |
        | 0.0  5.0  1.0  1.0  1.0  1.0   1.0   0.0   0.0 |
A    =  | 0.0  0.0  6.0  1.0  1.0  1.0   1.0   1.0   0.0 |
        | 0.0  0.0  0.0  7.0  1.0  1.0   1.0   1.0   1.0 |
        | 0.0  0.0  0.0  0.0  8.0  1.0   1.0   1.0   1.0 |
        | 0.0  0.0  0.0  0.0  0.0  9.0   1.0   1.0   1.0 |
        | 0.0  0.0  0.0  0.0  0.0  0.0  10.0  11.0  12.0 |
        *                                                *

Output

        *                                                                             *
        | 4.0000  1.0000  1.0000  1.0000   1.0000   1.0000   0.0000   0.0000   0.0000 |
        | 0.0000  5.0000  1.0000  1.0000   1.0000   1.0000   1.0000   0.0000   0.0000 |
        | 0.0000  0.0000  6.0000  1.0000   1.0000   1.0000   1.0000   1.0000   0.0000 |
        | 0.0000  0.0000  0.0000  7.0000   1.0000   1.0000   1.0000   1.0000   1.0000 |
A    =  | 0.0000  0.0000  0.0000  0.0000   8.0000   1.0000   1.0000   1.0000   1.0000 |
        | 0.0000  0.0000  0.0000  0.0000   0.0000   9.0000   1.0000   1.0000   1.0000 |
        | 0.0000  0.0000  0.0000  0.0000   0.0000   0.0000  10.0000  11.0000  12.0000 |
        | 0.2500  0.1500  0.1000  0.0714   0.0536  -0.0694  -0.0306   0.1806   0.3111 |
        | 0.2500  0.1500  0.1000  0.0714  -0.0714  -0.0556  -0.0194   0.9385  -0.0031 |
        *                                                                             *

IPVT     =  (3, 4, 5, 6, 7, 8, 9, 8, 9)

Example 2

This example shows a factorization of a complex general matrix A of order 4.

Call Statement and Input

           A  LDA  N   IPVT
           |   |   |    |
CALL CGEF( A , 4 , 4 , IPVT )
 
        *                                             *
        | (1.0, 2.0) (1.0, 7.0) (2.0, 4.0) (3.0, 1.0) |
A    =  | (2.0, 0.0) (1.0, 3.0) (4.0, 4.0) (2.0, 3.0) |
        | (2.0, 1.0) (5.0, 0.0) (3.0, 6.0) (0.0, 0.0) |
        | (8.0, 5.0) (1.0, 9.0) (6.0, 6.0) (8.0, 1.0) |
        *                                             *

Output

        *                                                                          *
        |  (8.0000, 5.0000)   (1.0000, 9.0000)  (6.0000, 6.0000)  (8.0000, 1.0000) |
A    =  |  (0.2022, 0.1236)   (1.9101, 5.0562)  (1.5281, 2.0449) (1.5056, -0.1910) |
        | (0.2360, -0.0225) (-0.0654, -0.9269) (-0.3462, 6.2692) (-1.6346, 1.3269) |
        | (0.1798, -0.1124)   (0.2462, 0.1308) (0.4412, -0.3655)  (0.2900, 2.3864) |
        *                                                                          *

IPVT     =  (4, 4, 3, 4)

SGES, DGES, CGES, and ZGES--General Matrix, Its Transpose, or Its Conjugate Transpose Solve

These subroutines solve the system Ax = b for x, where A is a general matrix and x and b are vectors. Using the iopt argument, they can also solve the real system A^Tx = b or the complex system A^Hx = b for x. These subroutines use the results of the factorization of matrix A, produced by a preceding call to SGEF/SGEFCD, DGEF/DGEFP/DGEFCD, CGEF, or ZGEF, respectively.

Table 88. Data Types

A, b, x	Subroutine
Short-precision real	SGES
Long-precision real	DGES
Short-precision complex	CGES
Long-precision complex	ZGES

Note: The input to these solve subroutines must be the output from the factorization subroutines SGEF/SGEFCD, DGEF/DGEFP/DGEFCD, CGEF, and ZGEF, respectively.

Syntax

Fortran	CALL SGES | DGES | CGES | ZGES (`a`, `lda`, `n`, `ipvt`, `bx`, `iopt`)
C and C++	sges | dges | cges | zges (`a`, `lda`, `n`, `ipvt`, `bx`, `iopt`);
PL/I	CALL SGES | DGES | CGES | ZGES (`a`, `lda`, `n`, `ipvt`, `bx`, `iopt`);

On Entry

a

is the factorization of matrix A, produced by a preceding call to SGEF/SGEFCD, DGEF/DGEFP/DGEFCD, CGEF, or ZGEF, respectively. Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 88.

lda

is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= n.

n

is the order of matrix A. Specified as: a fullword integer; 0 <= n <= lda.

ipvt

is the integer vector ipvt of length n, produced by a preceding call to SGEF/SGEFCD, DGEF/DGEFP/DGEFCD, CGEF, or ZGEF, respectively. It contains the pivot information necessary to construct matrix L from the information contained in the array specified for a.

Specified as: a one-dimensional array of (at least) length n, containing fullword integers.

bx

is the vector b of length n, containing the right-hand side of the system. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 88.

iopt

determines the type of computation to be performed, where:

If iopt = 0, A is used in the computation.

If iopt = 1, A^T is used in SGES and DGES. A^H is used in CGES and ZGES.
Note: No data should be moved to form A^T or A^H; that is, the matrix A should always be stored in its untransposed form.

Specified as: a fullword integer; iopt = 0 or 1.

On Return

bx: is the solution vector x of length n, containing the results of the computation. Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 88.

Notes

The scalar data specified for input arguments lda and n for these subroutines must be the same as the corresponding input arguments specified for SGEF/SGEFCD, DGEF/DGEFP/DGEFCD, CGEF, and ZGEF, respectively.
The array data specified for input arguments a and ipvt for these subroutines must be the same as the corresponding output arguments for SGEF/SGEFCD, DGEF/DGEFP/DGEFCD, CGEF, and ZGEF, respectively.
The vectors and matrices used in this computation must have no common elements; otherwise, results are unpredictable. See "Concepts".

Function

The system Ax = b is solved for x, where A is a general matrix and x and b are vectors. Using the iopt argument, this subroutine can also solve the real system A^Tx = b or the complex system A^Hx = b for x. These subroutines use the results of the factorization of matrix A, produced by a preceding call to SGEF/SGEFCD, DGEF/DGEFP/DGEFCD, CGEF, or ZGEF, respectively. The transformed matrix A consists of the upper triangular matrix U and the multipliers necessary to construct L using ipvt, as defined in "Function". For a description of how A is factored, see SGEF, DGEF, CGEF, and ZGEF--General Matrix Factorization.

If n is 0, no computation is performed. See references [36] and [38].
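
The way a solve can reuse the factored matrix for both A and A^T, without forming the transpose, can be sketched in C as follows. This is a textbook forward and back substitution for the real case, not the ESSL implementation. It assumes column-major storage, one-based pivot indices that record full row interchanges, and nonzero diagonal elements of U. The names lu_solve_sketch and ELEM are hypothetical.

```c
#include <stddef.h>

/* Element (i,j), zero-based, of a column-major array with leading
   dimension lda. */
#define ELEM(a, lda, i, j) ((a)[(size_t)(j) * (size_t)(lda) + (size_t)(i)])

/* Overwrite bx with the solution of A x = b (iopt = 0) or of
   A^T x = b (iopt = 1), given the factored matrix in a and the
   interchange record in ipvt. */
void lu_solve_sketch(const double *a, int lda, int n, const int *ipvt,
                     double *bx, int iopt)
{
    if (iopt == 0) {
        /* Apply the recorded row interchanges to b. */
        for (int k = 0; k < n; ++k) {
            int p = ipvt[k] - 1;
            if (p != k) { double t = bx[k]; bx[k] = bx[p]; bx[p] = t; }
        }
        /* Forward substitution with the unit lower triangular L. */
        for (int i = 1; i < n; ++i)
            for (int j = 0; j < i; ++j)
                bx[i] -= ELEM(a, lda, i, j) * bx[j];
        /* Back substitution with the upper triangular U. */
        for (int i = n - 1; i >= 0; --i) {
            for (int j = i + 1; j < n; ++j)
                bx[i] -= ELEM(a, lda, i, j) * bx[j];
            bx[i] /= ELEM(a, lda, i, i);
        }
    } else {
        /* Forward substitution with U^T, read from a by swapping indices. */
        for (int i = 0; i < n; ++i) {
            for (int j = 0; j < i; ++j)
                bx[i] -= ELEM(a, lda, j, i) * bx[j];
            bx[i] /= ELEM(a, lda, i, i);
        }
        /* Back substitution with L^T, again by swapping indices. */
        for (int i = n - 2; i >= 0; --i)
            for (int j = i + 1; j < n; ++j)
                bx[i] -= ELEM(a, lda, j, i) * bx[j];
        /* Undo the row interchanges in reverse order. */
        for (int k = n - 1; k >= 0; --k) {
            int p = ipvt[k] - 1;
            if (p != k) { double t = bx[k]; bx[k] = bx[p]; bx[p] = t; }
        }
    }
}
```

The transposed case reads the same stored factors with the row and column indices exchanged, which is why no data needs to be moved to form A^T. A complex version would additionally conjugate the stored elements for A^H.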

Error Conditions

Computational Errors

None
Note: If the factorization performed by SGEF, DGEF, CGEF, ZGEF, SGEFCD, DGEFCD, or DGEFP failed because a pivot element is zero, the results returned by this subroutine are unpredictable, and there may be a divide-by-zero program exception message.

Input-Argument Errors

lda <= 0
n < 0
n > lda
iopt <> 0 or 1

Example 1

Part 1

This part of the example shows how to solve the system Ax = b, where matrix A is the same matrix factored in "Example 1" for SGEF and DGEF.

Call Statement and Input

           A  LDA  N   IPVT   BX  IOPT
           |   |   |    |     |    |
CALL SGES( A , 9 , 9 , IPVT , BX , 0  )

IPVT     =  (3, 4, 5, 6, 7, 8, 9, 8, 9)
BX       =  (4.0, 5.0, 9.0, 10.0, 11.0, 12.0, 12.0, 12.0, 33.0)
A        =  (same as output A in "Example 1")

Output

BX       =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)

Part 2

This part of the example shows how to solve the system A^Tx = b, where matrix A is the input matrix factored in "Example 1" for SGEF and DGEF. Most of the input is the same in Part 2 as in Part 1.

Call Statement and Input

           A  LDA  N   IPVT   BX  IOPT
           |   |   |    |     |    |
CALL SGES( A , 9 , 9 , IPVT , BX , 1  )

IPVT     =  (3, 4, 5, 6, 7, 8, 9, 8, 9)
BX       =  (6.0, 8.0, 10.0, 12.0, 13.0, 14.0, 15.0, 15.0, 15.0)
A        =  (same as output A in "Example 1")

Output

BX       =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)

Example 2

Part 1

This part of the example shows how to solve the system Ax = b, where matrix A is the same matrix factored in "Example 2" for CGEF and ZGEF.

Call Statement and Input

           A  LDA  N   IPVT   BX  IOPT
           |   |   |    |     |    |
CALL CGES( A , 4 , 4 , IPVT , BX , 0  )

IPVT     =  (4, 4, 3, 4)
BX       =  ((-10.0, 85.0), (-6.0, 61.0), (10.0, 38.0),
             (58.0, 168.0))
A        =  (same as output A in "Example 2")

Output

BX       =  ((9.0, 0.0), (5.0, 1.0), (1.0, 6.0), (3.0, 4.0))

Part 2

This part of the example shows how to solve the system A^Hx = b, where matrix A is the input matrix factored in "Example 2" for CGEF and ZGEF. Most of the input is the same in Part 2 as in Part 1.

Call Statement and Input

           A  LDA  N   IPVT   BX  IOPT
           |   |   |    |     |    |
CALL CGES( A , 4 , 4 , IPVT , BX , 1  )

IPVT     =  (4, 4, 3, 4)
BX       =  ((71.0, 12.0), (61.0, -70.0), (123.0, -34.0),
             (68.0, 7.0))
A        =  (same as output A in "Example 2")

Output

BX       =  ((9.0, 0.0), (5.0, 1.0), (1.0, 6.0), (3.0, 4.0))

SGESM, DGESM, CGESM, and ZGESM--General Matrix, Its Transpose, or Its Conjugate Transpose Multiple Right-Hand Side Solve

These subroutines solve the following systems of equations for multiple right-hand sides, where A, X, and B are general matrices. SGESM and DGESM solve one of the following:

1. AX = B

2. A^TX = B

CGESM and ZGESM solve one of the following:

1. AX = B

2. A^TX = B

3. A^HX = B

These subroutines use the results of the factorization of matrix A, produced by a preceding call to SGEF/SGEFCD, DGEF/DGEFP/DGEFCD, CGEF, or ZGEF, respectively.

Table 89. Data Types

A, B, X	Subroutine
Short-precision real	SGESM
Long-precision real	DGESM
Short-precision complex	CGESM
Long-precision complex	ZGESM

Note: The input to these solve subroutines must be the output from the factorization subroutines SGEF/SGEFCD, DGEF/DGEFP/DGEFCD, CGEF, and ZGEF, respectively.

Syntax

Fortran	CALL SGESM | DGESM | CGESM | ZGESM (`trans`, `a`, `lda`, `n`, `ipvt`, `b`, `ldb`, `nrhs`)
C and C++	sgesm | dgesm | cgesm | zgesm (`trans`, `a`, `lda`, `n`, `ipvt`, `b`, `ldb`, `nrhs`);
PL/I	CALL SGESM | DGESM | CGESM | ZGESM (`trans`, `a`, `lda`, `n`, `ipvt`, `b`, `ldb`, `nrhs`);

On Entry

trans

indicates the form of matrix A to use in the computation, where:

If trans = 'N', A is used in the computation, resulting in equation 1.

If trans = 'T', A^T is used in the computation, resulting in equation 2.

If trans = 'C', A^H is used in the computation, resulting in equation 3.

Specified as: a single character. It must be 'N', 'T', or 'C'.

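a

is the factorization of matrix A, produced by a preceding call to SGEF/SGEFCD, DGEF/DGEFP/DGEFCD, CGEF, or ZGEF, respectively. Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 89.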

lda

is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= n.

n

is the order of matrix A. Specified as: a fullword integer; 0 <= n <= lda.

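ipvt

is the integer vector ipvt of length n, produced by a preceding call to SGEF/SGEFCD, DGEF/DGEFP/DGEFCD, CGEF, or ZGEF, respectively. It contains the pivot information necessary to construct matrix L from the information contained in the array specified for a.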

Specified as: a one-dimensional array of (at least) length n, containing fullword integers.

b

is the matrix B, containing the nrhs right-hand sides of the system. The right-hand sides, each of length n, reside in the columns of matrix B. Specified as: an ldb by (at least) nrhs array, containing numbers of the data type indicated in Table 89.

ldb

is the leading dimension of the array specified for b. Specified as: a fullword integer; ldb > 0 and ldb >= n.

nrhs

is the number of right-hand sides in the system to be solved. Specified as: a fullword integer; nrhs >= 0.

On Return

b: is the matrix B, containing the nrhs solutions to the system in the columns of B. Returned as: an ldb by (at least) nrhs array, containing numbers of the data type indicated in Table 89.

Notes

For SGESM and DGESM, if you specify 'C' for the trans argument, it is interpreted as though you specified 'T'.
The scalar data specified for input arguments lda and n for these subroutines must be the same as the corresponding input arguments specified for SGEF/SGEFCD, DGEF/DGEFP/DGEFCD, CGEF, and ZGEF, respectively.
The array data specified for input arguments a and ipvt for these subroutines must be the same as the corresponding output arguments for SGEF/SGEFCD, DGEF/DGEFP/DGEFCD, CGEF, and ZGEF, respectively.
The vectors and matrices used in this computation must have no common elements; otherwise, results are unpredictable. See "Concepts".

Function

One of the following systems of equations is solved for multiple right-hand sides:

1. AX = B

2. A^TX = B

3. A^HX = B (only for CGESM and ZGESM)

where A, B, and X are general matrices. These subroutines use the results of the factorization of matrix A, produced by a preceding call to SGEF/SGEFCD, DGEF/DGEFP/DGEFCD, CGEF, or ZGEF, respectively. The transformed matrix A consists of the upper triangular matrix U and the multipliers necessary to construct L using ipvt, as defined in "Function". For a description of how A is factored, see SGEF, DGEF, CGEF, and ZGEF--General Matrix Factorization.

If n or nrhs is 0, no computation is performed. See references [36] and [38].

Error Conditions

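Computational Errors

None
Note: If the factorization performed by SGEF, DGEF, CGEF, ZGEF, SGEFCD, DGEFCD, or DGEFP failed because a pivot element is zero, the results returned by this subroutine are unpredictable, and there may be a divide-by-zero program exception message.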

Input-Argument Errors

trans <> 'N', 'T', or 'C'
lda, ldb <= 0
n < 0
n > lda, ldb
nrhs < 0

Example 1

Part 1

This part of the example shows how to solve the system AX = B for two right-hand sides, where matrix A is the same matrix factored in "Example 1" for SGEF and DGEF.

Call Statement and Input

           TRANS  A  LDA  N   IPVT   B  LDB  NRHS
             |    |   |   |    |     |   |    |
CALL SGESM( 'N' , A , 9 , 9 , IPVT , B , 9 ,  2  )

IPVT     =  (3, 4, 5, 6, 7, 8, 9, 8, 9)
A        =  (same as output A in "Example 1")

        *             *
        |  4.0   10.0 |
        |  5.0   15.0 |
        |  9.0   24.0 |
        | 10.0   35.0 |
B    =  | 11.0   48.0 |
        | 12.0   63.0 |
        | 12.0   70.0 |
        | 12.0   78.0 |
        | 33.0  266.0 |
        *             *

Output

        *          *
        | 1.0  1.0 |
        | 1.0  2.0 |
        | 1.0  3.0 |
        | 1.0  4.0 |
B    =  | 1.0  5.0 |
        | 1.0  6.0 |
        | 1.0  7.0 |
        | 1.0  8.0 |
        | 1.0  9.0 |
        *          *

Part 2

This part of the example shows how to solve the system A^TX = B for two right-hand sides, where matrix A is the input matrix factored in "Example 1" for SGEF and DGEF.

Call Statement and Input

           TRANS  A  LDA  N   IPVT   B  LDB  NRHS
             |    |   |   |    |     |   |    |
CALL SGESM( 'T' , A , 9 , 9 , IPVT , B , 9 ,  2  )

IPVT     =  (3, 4, 5, 6, 7, 8, 9, 8, 9)
A        =  (same as output A in "Example 1")

        *             *
        |  6.0   15.0 |
        |  8.0   26.0 |
        | 10.0   40.0 |
        | 12.0   57.0 |
B    =  | 13.0   76.0 |
        | 14.0   97.0 |
        | 15.0  120.0 |
        | 15.0  125.0 |
        | 15.0  129.0 |
        *             *

Output

        *          *
        | 1.0  1.0 |
        | 1.0  2.0 |
        | 1.0  3.0 |
        | 1.0  4.0 |
B    =  | 1.0  5.0 |
        | 1.0  6.0 |
        | 1.0  7.0 |
        | 1.0  8.0 |
        | 1.0  9.0 |
        *          *

Example 2

Part 1

This part of the example shows how to solve the system AX = B for two right-hand sides, where matrix A is the same matrix factored in "Example 2" for CGEF and ZGEF.

Call Statement and Input

           TRANS  A  LDA  N   IPVT   B  LDB  NRHS
             |    |   |   |    |     |   |    |
CALL CGESM( 'N' , A , 4 , 4 , IPVT , B , 4 ,  2  )

IPVT     =  (4, 4, 3, 4)
A        =  (same as output A in "Example 2")

        *                              *
        | (-10.0, 85.0)  (-11.0, 53.0) |
B    =  |  (-6.0, 61.0)   (-6.0, 54.0) |
        |  (10.0, 38.0)    (2.0, 40.0) |
        | (58.0, 168.0)  (15.0, 105.0) |
        *                              *

Output

        *                        *
        | (9.0, 0.0)  (1.0, 1.0) |
B    =  | (5.0, 1.0)  (2.0, 2.0) |
        | (1.0, 6.0)  (3.0, 3.0) |
        | (3.0, 4.0)  (4.0, 4.0) |
        *                        *

Part 2

This part of the example shows how to solve the system A^TX = B for two right-hand sides, where matrix A is the input matrix factored in "Example 2" for CGEF and ZGEF.

Call Statement and Input

           TRANS  A  LDA  N   IPVT   B  LDB  NRHS
             |    |   |   |    |     |   |    |
CALL CGESM( 'T' , A , 4 , 4 , IPVT , B , 4 ,  2  )

IPVT     =  (4, 4, 3, 4)
A        =  (same as output A in "Example 2")

        *                               *
        |   (71.0, 12.0)   (18.0, 68.0) |
B    =  |  (61.0, -70.0)  (-27.0, 71.0) |
        | (123.0, -34.0)  (-11.0, 97.0) |
        |    (68.0, 7.0)   (28.0, 50.0) |
        *                               *

Output

        *                        *
        | (9.0, 0.0)  (1.0, 1.0) |
B    =  | (5.0, 1.0)  (2.0, 2.0) |
        | (1.0, 6.0)  (3.0, 3.0) |
        | (3.0, 4.0)  (4.0, 4.0) |
        *                        *

Part 3

This part of the example shows how to solve the system A^HX = B for two right-hand sides, where matrix A is the input matrix factored in "Example 2" for CGEF and ZGEF.

Call Statement and Input

           TRANS  A  LDA  N   IPVT   B  LDB  NRHS
             |    |   |   |    |     |   |    |
CALL CGESM( 'C' , A , 4 , 4 , IPVT , B , 4 ,  2  )

IPVT     =  (4, 4, 3, 4)
A        =  (same as output A in "Example 2")

        *                              *
        |  (58.0, -3.0)   (45.0, 20.0) |
B    =  | (68.0, -31.0)  (83.0, -20.0) |
        | (89.0, -22.0)    (98.0, 1.0) |
        |  (53.0, 15.0)   (45.0, 25.0) |
        *                              *

Output

        *                        *
        | (1.0, 4.0)  (4.0, 5.0) |
B    =  | (2.0, 3.0)  (3.0, 4.0) |
        | (3.0, 2.0)  (2.0, 3.0) |
        | (4.0, 1.0)  (1.0, 2.0) |
        *                        *

SGETRF, DGETRF, CGETRF and ZGETRF--General Matrix Factorization

These subroutines factor general matrix A using Gaussian elimination with partial pivoting. To solve the system of equations with one or more right-hand sides, follow the call to these subroutines with one or more calls to SGETRS, DGETRS, CGETRS, or ZGETRS, respectively. To compute the inverse of matrix A, follow the call to SGETRF or DGETRF with a call to SGEICD or DGEICD, respectively.

Table 90. Data Types

A	Subroutine
Short-precision real	SGETRF
Long-precision real	DGETRF
Short-precision complex	CGETRF
Long-precision complex	ZGETRF

Note: The output from these factorization subroutines should be used only as input to the following subroutines for performing a solve or inverse: SGETRS, DGETRS, CGETRS, ZGETRS, SGEICD, or DGEICD, respectively.

Syntax

Fortran	CALL SGETRF | DGETRF | CGETRF | ZGETRF (`m`, `n`, `a`, `lda`, `ipvt`, `info`)
C and C++	sgetrf | dgetrf | cgetrf | zgetrf (`m`, `n`, `a`, `lda`, `ipvt`, `info`);
PL/I	CALL SGETRF | DGETRF | CGETRF | ZGETRF (`m`, `n`, `a`, `lda`, `ipvt`, `info`);

On Entry

m: is the number of rows in general matrix A used in the computation. Specified as: a fullword integer; 0 <= m <= lda.
n: is the number of columns in general matrix A used in the computation. Specified as: a fullword integer; n >= 0.
a: is the m by n general matrix A to be factored. Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 90.
lda: is the leading dimension of matrix A. Specified as: a fullword integer; lda > 0 and lda >= m.
ipvt: See 'On Return'.
info: See 'On Return'.

On Return

a

is the m by n transformed matrix A, containing the results of the factorization. See "Function". Returned as: an lda by (at least) n array, containing numbers of the data type indicated in Table 90.

ipvt

is the integer vector ipvt of length min(m,n), containing the pivot information necessary to construct matrix L from the information contained in the output array a. Returned as: a one-dimensional array of (at least) length min(m,n), containing fullword integers, where 1 <= ipvt(i) <= m.

info

has the following meaning:

If info = 0, the factorization of general matrix A completed successfully.

If info > 0, info is set equal to the first i, where U_ii is singular and its inverse could not be computed.

Returned as: a fullword integer; info >= 0.

Notes

In your C program, argument info must be passed by reference.
The matrix A and vector ipvt must have no common elements; otherwise results are unpredictable.
The way these subroutines handle singularity differs from LAPACK. Like LAPACK, these subroutines use the info argument to provide information about the singularity of A, but they also issue an error message.
On both input and output, matrix A conforms to LAPACK format.

Function

The matrix A is factored using Gaussian elimination with partial pivoting to compute the LU factorization of A, where:

ipvt is the vector containing the pivoting information.

L is a unit lower triangular matrix.

U is an upper triangular matrix.

On output, the transformed matrix A contains U in the upper triangle (if m >= n) or upper trapezoid (if m < n). In its strict lower triangle (if m <= n) or lower trapezoid (if m > n), it contains the multipliers necessary to construct, with the help of ipvt, a matrix L, such that A = LU.

If m or n is 0, no computation is performed and the subroutine returns after doing some parameter checking. See references [36] and [59].
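
The triangle and trapezoid layout described above can be made concrete by unpacking the transformed matrix into separate L and U arrays. The following is a minimal sketch in C that assumes column-major storage. It ignores the row interchanges recorded in ipvt, so the product of the unpacked L and U equals A with its rows interchanged. The names unpack_lu_sketch and ELEM are hypothetical.

```c
#include <stddef.h>

/* Element (i,j), zero-based, of a column-major array with leading
   dimension ld. */
#define ELEM(a, ld, i, j) ((a)[(size_t)(j) * (size_t)(ld) + (size_t)(i)])

/* Split the m by n transformed matrix in a into:
     l: m by r unit lower triangular or trapezoidal, leading dimension m
     u: r by n upper triangular or trapezoidal, leading dimension r
   where r = min(m, n). */
void unpack_lu_sketch(const double *a, int lda, int m, int n,
                      double *l, double *u)
{
    int r = (m < n) ? m : n;

    /* L: multipliers below the diagonal, ones on it, zeros above it. */
    for (int j = 0; j < r; ++j)
        for (int i = 0; i < m; ++i)
            ELEM(l, m, i, j) = (i > j) ? ELEM(a, lda, i, j)
                                       : (i == j ? 1.0 : 0.0);

    /* U: entries on and above the diagonal, zeros below it. */
    for (int j = 0; j < n; ++j)
        for (int i = 0; i < r; ++i)
            ELEM(u, r, i, j) = (i <= j) ? ELEM(a, lda, i, j) : 0.0;
}
```

For m > n the unpacked L is a lower trapezoid and U is a square upper triangle. For m < n the unpacked L is a square lower triangle and U is an upper trapezoid.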

Error Conditions

Resource Errors

Unable to allocate internal work area.

Computational Errors

Matrix A is singular.

The first column, i, of L with a corresponding U_ii = 0 diagonal element is identified in the computational error message.
The computational error message may occur multiple times with processing continuing after each error, because the default for the number of allowable errors for error code 2146 is set to be unlimited in the ESSL error option table.

Input-Argument Errors

m < 0
n < 0
m > lda
lda <= 0

Example 1

This example shows a factorization of a real general matrix A of order 9.

Call Statement and Input

             M   N   A  LDA  IPVT  INFO
             |   |   |   |     |     |
CALL DGETRF( 9 , 9 , A,  9 , IPVT, INFO )

        *                                             *
        | 1.0  1.2  1.4  1.6  1.8  2.0  2.2  2.4  2.6 |
        | 1.2  1.0  1.2  1.4  1.6  1.8  2.0  2.2  2.4 |
        | 1.4  1.2  1.0  1.2  1.4  1.6  1.8  2.0  2.2 |
        | 1.6  1.4  1.2  1.0  1.2  1.4  1.6  1.8  2.0 |
A    =  | 1.8  1.6  1.4  1.2  1.0  1.2  1.4  1.6  1.8 |
        | 2.0  1.8  1.6  1.4  1.2  1.0  1.2  1.4  1.6 |
        | 2.2  2.0  1.8  1.6  1.4  1.2  1.0  1.2  1.4 |
        | 2.4  2.2  2.0  1.8  1.6  1.4  1.2  1.0  1.2 |
        | 2.6  2.4  2.2  2.0  1.8  1.6  1.4  1.2  1.0 |
        *                                             *

Output

        *                                              *
        | 2.6   2.4  2.2  2.0  1.8  1.6  1.4  1.2  1.0 |
        | 0.4   0.3  0.6  0.8  1.1  1.4  1.7  1.9  2.2 |
        | 0.5  -0.4  0.4  0.8  1.2  1.6  2.0  2.4  2.8 |
        | 0.5  -0.3  0.0  0.4  0.8  1.2  1.6  2.0  2.4 |
A    =  | 0.6  -0.3  0.0  0.0  0.4  0.8  1.2  1.6  2.0 |
        | 0.7  -0.2  0.0  0.0  0.0  0.4  0.8  1.2  1.6 |
        | 0.8  -0.2  0.0  0.0  0.0  0.0  0.4  0.8  1.2 |
        | 0.8  -0.1  0.0  0.0  0.0  0.0  0.0  0.4  0.8 |
        | 0.9  -0.1  0.0  0.0  0.0  0.0  0.0  0.0  0.4 |
        *                                              *

IPVT     =  (9, 9, 9, 9, 9, 9, 9, 9, 9)
INFO     =  0

Example 2

This example shows a factorization of a complex general matrix A of order 9.

Call Statement and Input

             M   N   A  LDA  IPVT  INFO
             |   |   |   |     |     |
CALL ZGETRF( 9 , 9 , A,  9 , IPVT, INFO )

 
        *                                                                                                     *
        | (2.0, 1.0) (2.4,-1.0) (2.8,-1.0) (3.2,-1.0)  (3.6,-1.0) (4.0,-1.0) (4.4,-1.0) (4.8,-1.0) (5.2,-1.0) |
        | (2.4, 1.0) (2.0, 1.0) (2.4,-1.0) (2.8,-1.0)  (3.2,-1.0) (3.6,-1.0) (4.0,-1.0) (4.4,-1.0) (4.8,-1.0) |
        | (2.8, 1.0) (2.4, 1.0) (2.0, 1.0) (2.4,-1.0)  (2.8,-1.0) (3.2,-1.0) (3.6,-1.0) (4.0,-1.0) (4.4,-1.0) |
        | (3.2, 1.0) (2.8, 1.0) (2.4, 1.0) (2.0, 1.0)  (2.4,-1.0) (2.8,-1.0) (3.2,-1.0) (3.6,-1.0) (4.0,-1.0) |
A    =  | (3.6, 1.0) (3.2, 1.0) (2.8, 1.0) (2.4, 1.0)  (2.0, 1.0) (2.4,-1.0) (2.8,-1.0) (3.2,-1.0) (3.6,-1.0) |
        | (4.0, 1.0) (3.6, 1.0) (3.2, 1.0) (2.8, 1.0)  (2.4, 1.0) (2.0, 1.0) (2.4,-1.0) (2.8,-1.0) (3.2,-1.0) |
        | (4.4, 1.0) (4.0, 1.0) (3.6, 1.0) (3.2, 1.0)  (2.8, 1.0) (2.4, 1.0) (2.0, 1.0) (2.4,-1.0) (2.8,-1.0) |
        | (4.8, 1.0) (4.4, 1.0) (4.0, 1.0) (3.6, 1.0)  (3.2, 1.0) (2.8, 1.0) (2.4, 1.0) (2.0, 1.0) (2.4,-1.0) |
        | (5.2, 1.0) (4.8, 1.0) (4.4, 1.0) (4.0, 1.0)  (3.6, 1.0) (3.2, 1.0) (2.8, 1.0) (2.4, 1.0) (2.0, 1.0) |
        *                                                                                                     *

Output

        *                                                                                                         *
        | (5.2, 1.0) (4.8, 1.0) (4.4, 1.0)   (4.0, 1.0)   (3.6, 1.0)  (3.2, 1.0) (2.8, 1.0) (2.4, 1.0) (2.0, 1.0) |
        | (0.4, 0.1) (0.6,-2.0) (1.1,-1.9)   (1.7,-1.9)   (2.3,-1.8)  (2.8,-1.8) (3.4,-1.7) (3.9,-1.7) (4.5,-1.6) |
        | (0.5, 0.1) (0.0,-0.1) (0.6,-1.9)   (1.2,-1.8)   (1.8,-1.7)  (2.5,-1.6) (3.1,-1.5) (3.7,-1.4) (4.3,-1.3) |
        | (0.6, 0.1) (0.0,-0.1) (-0.1,-0.1)  (0.7,-1.9)   (1.3,-1.7)  (2.0,-1.6) (2.7,-1.5) (3.4,-1.4) (4.0,-1.2) |
A    =  | (0.6, 0.1) (0.0,-0.1) (-0.1,-0.1) (-0.1, 0.0)   (0.7,-1.9)  (1.5,-1.7) (2.2,-1.6) (2.9,-1.5) (3.7,-1.3) |
        | (0.7, 0.1) (0.0,-0.1)  (0.0, 0.0) (-0.1, 0.0)  (-0.1, 0.0)  (0.8,-1.9) (1.6,-1.8) (2.4,-1.6) (3.2,-1.5) |
        | (0.8, 0.0) (0.0, 0.0)  (0.0, 0.0)  (0.0, 0.0)   (0.0, 0.0)  (0.0, 0.0) (0.8,-1.9) (1.7,-1.8) (2.5,-1.8) |
        | (0.9, 0.0) (0.0, 0.0)  (0.0, 0.0)  (0.0, 0.0)   (0.0, 0.0)  (0.0, 0.0) (0.0, 0.0) (0.8,-2.0) (1.7,-1.9) |
        | (0.9, 0.0) (0.0, 0.0)  (0.0, 0.0)  (0.0, 0.0)   (0.0, 0.0)  (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) (0.8,-2.0) |
        *                                                                                                         *

IPVT     =  (9, 9, 9, 9, 9, 9, 9, 9, 9)
INFO     =  0

SGETRS, DGETRS, CGETRS, and ZGETRS--General Matrix Multiple Right-Hand Side Solve

SGETRS and DGETRS solve one of the following systems of equations for multiple right-hand sides:

1. AX = B

2. A^TX = B

CGETRS and ZGETRS solve one of the following systems of equations for multiple right-hand sides:

1. AX = B

2. A^TX = B

3. A^HX = B

In the formulas above:

A represents the general matrix A containing the LU factorization.

B represents the general matrix B containing the right-hand sides in its columns.

X represents the general matrix B containing the solution vectors in its columns.

These subroutines use the results of the factorization of matrix A, produced by a preceding call to SGETRF, DGETRF, CGETRF, or ZGETRF, respectively.

Table 91. Data Types

A, B Subroutine
Short-precision real SGETRS
Long-precision real DGETRS
Short-precision complex CGETRS
Long-precision complex ZGETRS

Note: The input to these solve subroutines must be the output from the factorization subroutines SGETRF, DGETRF, CGETRF and ZGETRF, respectively.

Syntax

Fortran	CALL SGETRS \| DGETRS \| CGETRS \| ZGETRS (`transa`, `n`, `nrhs`, `a`, `lda`, `ipvt`, `bx`, `ldb`, `info`)
C and C++	sgetrs \| dgetrs \| cgetrs \| zgetrs (`transa`, `n`, `nrhs`, `a`, `lda`, `ipvt`, `bx`, `ldb`, `info`);
PL/I	CALL SGETRS \| DGETRS \| CGETRS \| ZGETRS (`transa`, `n`, `nrhs`, `a`, `lda`, `ipvt`, `bx`, `ldb`, `info`);

On Entry

transa

indicates the form of matrix A to use in the computation, where:

If transa = 'N', A is used in the computation, resulting in solution 1.

If transa = 'T', A^T is used in the computation, resulting in solution 2.

If transa = 'C', A^H is used in the computation, resulting in solution 3.

Specified as: a single character; transa = 'N', 'T', or 'C'.

n

is the order of factored matrix A and the number of rows in matrix B. Specified as: a fullword integer; n >= 0.

nrhs

the number of right-hand sides--that is, the number of columns in matrix B used in the computation. Specified as: a fullword integer; nrhs >= 0.

a

is the factorization of matrix A, produced by a preceding call to SGETRF, DGETRF, CGETRF, or ZGETRF, respectively. Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 91.

lda

is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= n.

ipvt

is the integer vector ipvt of length n, produced by a preceding call to SGETRF, DGETRF, CGETRF or ZGETRF, respectively. It contains the pivot information necessary to construct matrix L from the information contained in the array specified for a.

Specified as: a one-dimensional array of (at least) length n, containing fullword integers, where 1 <= ipvt(i) <= n.

bx

is the general matrix B containing the right-hand side of the system. Specified as: an ldb by (at least) nrhs array, containing numbers of the data type indicated in Table 91.

ldb

is the leading dimension of the array specified for bx. Specified as: a fullword integer; ldb > 0 and ldb >= n.

info

See 'On Return'.

On Return

bx: is the solution X containing the results of the computation. Returned as: an ldb by (at least) nrhs array, containing numbers of the data type indicated in Table 91.
info: info has the following meaning:
If info = 0, the solve of general matrix A completed successfully.

Notes

In your C program, argument info must be passed by reference.
These subroutines accept lower case letters for the transa argument.
For SGETRS and DGETRS, if you specify 'C' for the transa argument, it is interpreted as though you specified 'T'.
The scalar data specified for input argument n must be the same for both _GETRF and _GETRS. In addition, the scalar data specified for input argument m in _GETRF must be the same as input argument n in both _GETRF and _GETRS.
If, however, you do not plan to call _GETRS after calling _GETRF, then input arguments m and n in _GETRF do not need to be equal.
The array data specified for input arguments a and ipvt for these subroutines must be the same as the corresponding output arguments for SGETRF, DGETRF, CGETRF, and ZGETRF, respectively.
The matrices and vector used in this computation must have no common elements; otherwise, results are unpredictable. See "Concepts".
On both input and output, matrices A and B conform to LAPACK format.

Function

One of the following systems of equations is solved for multiple right-hand sides:

1. AX = B

2. A^TX = B

3. A^HX = B (only for CGETRS and ZGETRS)

where A, B, and X are general matrices. These subroutines use the results of the factorization of matrix A, produced by a preceding call to SGETRF, DGETRF, CGETRF or ZGETRF, respectively. On input, the transformed matrix A consists of the upper triangular matrix U and the multipliers necessary to construct L using ipvt, as defined in "Function". For details on the factorization, see SGETRF, DGETRF, CGETRF and ZGETRF--General Matrix Factorization.

If n = 0 or nrhs = 0, no computation is performed and the subroutine returns after doing some parameter checking. See references [36] and [59].
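
The way a solve step reuses an LU factorization and its pivot vector can be sketched as follows. This is a minimal illustration in plain Python, not the ESSL implementation: the helper names lu_factor and lu_solve and the 3 by 3 test matrix are hypothetical, only real data is handled, and 'C' is treated like 'T', as the Notes describe for SGETRS and DGETRS.

```python
def lu_factor(a):
    """Return (lu, ipvt) for a square matrix given as a list of rows.

    lu holds U in its upper triangle and the multipliers of L below the
    diagonal. ipvt is 1-based: at step k, row k was swapped with row ipvt[k-1].
    (Hypothetical helper, written only to feed lu_solve below.)
    """
    n = len(a)
    lu = [row[:] for row in a]
    ipvt = []
    for k in range(n):
        # Partial pivoting: pick the largest entry in column k.
        p = max(range(k, n), key=lambda r: abs(lu[r][k]))
        ipvt.append(p + 1)
        lu[k], lu[p] = lu[p], lu[k]
        for i in range(k + 1, n):
            lu[i][k] /= lu[k][k]
            for j in range(k + 1, n):
                lu[i][j] -= lu[i][k] * lu[k][j]
    return lu, ipvt


def lu_solve(lu, ipvt, b, transa="N"):
    """Solve A x = b ('N') or A^T x = b ('T' or 'C') for one right-hand side."""
    t = transa.upper()
    if t not in ("N", "T", "C"):
        raise ValueError("transa must be 'N', 'T', or 'C'")
    n = len(lu)
    x = list(b)
    if t == "N":
        for k in range(n):                    # apply the row interchanges to b
            p = ipvt[k] - 1
            x[k], x[p] = x[p], x[k]
        for i in range(n):                    # forward substitution with unit L
            for j in range(i):
                x[i] -= lu[i][j] * x[j]
        for i in reversed(range(n)):          # back substitution with U
            for j in range(i + 1, n):
                x[i] -= lu[i][j] * x[j]
            x[i] /= lu[i][i]
    else:
        for i in range(n):                    # forward substitution with U^T
            for j in range(i):
                x[i] -= lu[j][i] * x[j]
            x[i] /= lu[i][i]
        for i in reversed(range(n)):          # back substitution with L^T
            for j in range(i + 1, n):
                x[i] -= lu[j][i] * x[j]
        for k in reversed(range(n)):          # undo the interchanges in reverse
            p = ipvt[k] - 1
            x[k], x[p] = x[p], x[k]
    return x


# Hypothetical data: factor once, then reuse the factors for several solves.
A = [[2.0, 1.0, 1.0], [4.0, 3.0, 3.0], [8.0, 7.0, 9.0]]
lu, ipvt = lu_factor(A)
x_n = lu_solve(lu, ipvt, [7.0, 19.0, 49.0], "N")
x_t = lu_solve(lu, ipvt, [34.0, 28.0, 34.0], "T")
```

For multiple right-hand sides, the same factors would be applied to each column of B in turn, which is why the factorization needs to be computed only once.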

Error Conditions

Computational Errors

None
Note: If the factorization performed by SGETRF, DGETRF, CGETRF or ZGETRF failed because a pivot element is zero, the results returned by this subroutine are unpredictable, and there may be a divide-by-zero program exception message.

Input-Argument Errors

transa <> 'N', 'T', or 'C'
n < 0
nrhs < 0
n > lda
n > ldb
lda <= 0
ldb <= 0

Example 1

This example shows how to solve the system AX = B, where matrix A is the same matrix factored in the "Example 1" for DGETRF.

Call Statement and Input

           TRANSA  N  NRHS  A  LDA  IPVT  BX LDB  INFO
             |     |    |   |   |     |   |   |     |
CALL DGETRS('N' ,  9 ,  5 , A , 9 , IPVT, B , 9 , INFO)

IPVT     =  (9, 9, 9, 9, 9, 9, 9, 9, 9)
A        =  (same as output A in "Example 1")

        *                                  *
        | 93.0  186.0  279.0  372.0  465.0 |
        | 84.4  168.8  253.2  337.6  422.0 |
        | 76.6  153.2  229.8  306.4  383.0 |
        | 70.0  140.0  210.0  280.0  350.0 |
B    =  | 65.0  130.0  195.0  260.0  325.0 |
        | 62.0  124.0  186.0  248.0  310.0 |
        | 61.4  122.8  184.2  245.6  307.0 |
        | 63.6  127.2  190.8  254.4  318.0 |
        | 69.0  138.0  207.0  276.0  345.0 |
        *                                  *

Output

        *                             *
        | 1.0   2.0   3.0   4.0   5.0 |
        | 2.0   4.0   6.0   8.0  10.0 |
        | 3.0   6.0   9.0  12.0  15.0 |
        | 4.0   8.0  12.0  16.0  20.0 |
B    =  | 5.0  10.0  15.0  20.0  25.0 |
        | 6.0  12.0  18.0  24.0  30.0 |
        | 7.0  14.0  21.0  28.0  35.0 |
        | 8.0  16.0  24.0  32.0  40.0 |
        | 9.0  18.0  27.0  36.0  45.0 |
        *                             *

INFO     =  0

Example 2

This example shows how to solve the system AX = B, where matrix A is the same matrix factored in the "Example 2" for ZGETRF.

Call Statement and Input

           TRANSA  N  NRHS  A  LDA  IPVT  BX LDB  INFO
             |     |    |   |   |     |   |   |     |
CALL ZGETRS('N' ,  9 ,  5 , A , 9 , IPVT, B , 9 , INFO)

IPVT     =  (9, 9, 9, 9, 9, 9, 9, 9, 9)
A        =  (same as output A in "Example 2")

        *                                                                           *
        | (193.0,-10.6)  (200.0, 21.8)  (207.0, 54.2)  (214.0, 86.6)  (221.0,119.0) |
        | (173.8, -9.4)  (178.8, 20.2)  (183.8, 49.8)  (188.8, 79.4)  (193.8,109.0) |
        | (156.2, -5.4)  (159.2, 22.2)  (162.2, 49.8)  (165.2, 77.4)  (168.2,105.0) |
        | (141.0,  1.4)  (142.0, 27.8)  (143.0, 54.2)  (144.0, 80.6)  (145.0,107.0) |
B    =  | (129.0, 11.0)  (128.0, 37.0)  (127.0, 63.0)  (126.0, 89.0)  (125.0,115.0) |
        | (121.0, 23.4)  (118.0, 49.8)  (115.0, 76.2)  (112.0,102.6)  (109.0,129.0) |
        | (117.8, 38.6)  (112.8, 66.2)  (107.8, 93.8)  (102.8,121.4)   (97.8,149.0) |
        | (120.2, 56.6)  (113.2, 86.2)  (106.2,115.8)   (99.2,145.4)   (92.2,175.0) |
        | (129.0, 77.4)  (120.0,109.8)  (111.0,142.2)  (102.0,174.6)   (93.0,207.0) |
        *                                                                           *

Output

        *                                                     *
        | (1.0,1.0)  (1.0,2.0)  (1.0,3.0) (1.0,4.0) (1.0,5.0) |
        | (2.0,1.0)  (2.0,2.0)  (2.0,3.0) (2.0,4.0) (2.0,5.0) |
        | (3.0,1.0)  (3.0,2.0)  (3.0,3.0) (3.0,4.0) (3.0,5.0) |
        | (4.0,1.0)  (4.0,2.0)  (4.0,3.0) (4.0,4.0) (4.0,5.0) |
B     = | (5.0,1.0)  (5.0,2.0)  (5.0,3.0) (5.0,4.0) (5.0,5.0) |
        | (6.0,1.0)  (6.0,2.0)  (6.0,3.0) (6.0,4.0) (6.0,5.0) |
        | (7.0,1.0)  (7.0,2.0)  (7.0,3.0) (7.0,4.0) (7.0,5.0) |
        | (8.0,1.0)  (8.0,2.0)  (8.0,3.0) (8.0,4.0) (8.0,5.0) |
        | (9.0,1.0)  (9.0,2.0)  (9.0,3.0) (9.0,4.0) (9.0,5.0) |
        *                                                     *

INFO     =  0

SGEFCD and DGEFCD--General Matrix Factorization, Condition Number Reciprocal, and Determinant

These subroutines factor general matrix A using Gaussian elimination. An estimate of the reciprocal of the condition number and the determinant of matrix A can also be computed. To solve a system of equations with one or more right-hand sides, follow the call to these subroutines with one or more calls to SGES/SGESM or DGES/DGESM, respectively. To compute the inverse of matrix A, follow the call to these subroutines with a call to SGEICD or DGEICD, respectively.

Table 92. Data Types

A, aux, rcond, det Subroutine
Short-precision real SGEFCD
Long-precision real DGEFCD

Note: The output from these factorization subroutines should be used only as input to the following subroutines for performing a solve or inverse: SGES/SGESM/SGEICD and DGES/DGESM/DGEICD, respectively.

Syntax

Fortran	CALL SGEFCD \| DGEFCD (`a`, `lda`, `n`, `ipvt`, `iopt`, `rcond`, `det`, `aux`, `naux`)
C and C++	sgefcd \| dgefcd (`a`, `lda`, `n`, `ipvt`, `iopt`, `rcond`, `det`, `aux`, `naux`);
PL/I	CALL SGEFCD \| DGEFCD (`a`, `lda`, `n`, `ipvt`, `iopt`, `rcond`, `det`, `aux`, `naux`);

On Entry

a

is a general matrix A of order n, whose factorization, reciprocal of condition number, and determinant are computed. Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 92.

lda

is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= n.

n

is the order of matrix A. Specified as: a fullword integer; 0 <= n <= lda.

ipvt

See 'On Return'.

iopt

indicates the type of computation to be performed, where:

If iopt = 0, the matrix is factored.

If iopt = 1, the matrix is factored, and the reciprocal of the condition number is computed.

If iopt = 2, the matrix is factored, and the determinant is computed.

If iopt = 3, the matrix is factored, and the reciprocal of the condition number and the determinant are computed.

Specified as: a fullword integer; iopt = 0, 1, 2, or 3.

rcond

See 'On Return'.

det

See 'On Return'.

aux

has the following meaning:

If naux = 0 and error 2015 is unrecoverable, aux is ignored.

Otherwise, it is a storage work area used by this subroutine. Its size is specified by naux.

Specified as: an area of storage, containing numbers of the data type indicated in Table 92.

naux

is the size of the work area specified by aux--that is, the number of elements in aux. Specified as: a fullword integer, where:

If naux = 0 and error 2015 is unrecoverable, SGEFCD and DGEFCD dynamically allocate the work area used by the subroutine. The work area is deallocated before control is returned to the calling program.

Otherwise, naux >= n.

On Return

a

is the transformed matrix A of order n, containing the results of the factorization. See "Function". Returned as: an lda by (at least) n array, containing numbers of the data type indicated in Table 92.

ipvt

is the integer vector ipvt of length n, containing the pivot information necessary to construct matrix L from the information contained in output matrix A. Returned as: a one-dimensional array of (at least) length n, containing fullword integers.

rcond

is an estimate of the reciprocal of the condition number, rcond, of matrix A. Returned as: a number of the data type indicated in Table 92; rcond >= 0.

det

is the vector det, containing the two components, det₁ and det₂, of the determinant of matrix A. The determinant is:

det(A) = det₁(10^det₂)

where 1 <= det₁ < 10. Returned as: an array of length 2, containing numbers of the data type indicated in Table 92.
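
As a small illustration of the two-component form, the sketch below recombines det₁ and det₂ into a single number, reading the pair as det₁ times 10 raised to the power det₂. This reading is consistent with the example in this section, where DET = (3.36, 2.00) corresponds to a determinant of 336. The function name det_value is hypothetical and is not part of ESSL.

```python
import math


def det_value(det):
    """Recombine the pair (det1, det2) as det1 * 10**det2.

    Keeping the exponent separate lets very large or very small
    determinants be reported without overflow or underflow; this helper
    is only safe when the combined value fits in a float.
    """
    det1, det2 = det
    return det1 * 10.0 ** det2


# Values taken from the example output of this section.
value = det_value((3.36, 2.00))
```

Here value is approximately 336, up to floating-point rounding.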

Notes

In your C program, argument rcond must be passed by reference.
When iopt = 0, these subroutines provide the same function as a call to SGEF or DGEF, respectively.
You have the option of having the minimum required value for naux dynamically returned to your program. For details, see "Using Auxiliary Storage in ESSL".

Function

Matrix A is factored using Gaussian elimination with partial pivoting to compute the LU factorization of A, where:

ipvt is the vector containing the pivoting information.

L is a unit lower triangular matrix.

U is an upper triangular matrix.

The transformed matrix A contains U in the upper triangle. In its strict lower triangle, it contains the multipliers necessary to construct, with the help of ipvt, a matrix L, such that A = LU.

An estimate of the reciprocal of the condition number, rcond, and the determinant, det, can also be computed by this subroutine. The estimate of the condition number uses an enhanced version of the algorithm described in references [63] and [64].

If n is 0, no computation is performed. See reference [36].

These subroutines call SGEF and DGEF, respectively, to perform the factorization. ipvt is an output vector of SGEF and DGEF. It is returned for use by SGES/SGESM and DGES/DGESM, the solve subroutines.

Error Conditions

Resource Errors

Error 2015 is unrecoverable, naux = 0, and unable to allocate work area.

Computational Errors

Matrix A is singular.

If your program is not terminated by SGEF and DGEF, then SGEFCD and DGEFCD, respectively, return 0 for rcond and det.
One or more columns of L and the corresponding diagonal of U contain all zeros (all columns of L are checked). The first column, i, of L with a corresponding U = 0 diagonal element is identified in the computational error message, issued by SGEF or DGEF, respectively.
i can be determined at run time by using the ESSL error-handling facilities. To obtain this information, you must use ERRSET to change the number of allowable errors for error code 2103 in the ESSL error option table; otherwise, the default value causes your program to be terminated by SGEF or DGEF, respectively, when this error occurs. If your program is not terminated by SGEF or DGEF, respectively, the return code is set to 2. For details, see "What Can You Do about ESSL Computational Errors?".

Input-Argument Errors

lda <= 0
n < 0
n > lda
iopt <> 0, 1, 2, or 3
Error 2015 is recoverable or naux <> 0, and naux is too small--that is, less than the minimum required value. Return code 1 is returned if error 2015 is recoverable.

Example

This example shows a factorization of matrix A of order 9. The input is the same as used in SGEF and DGEF. See "Example 1". The reciprocal of the condition number and the determinant of matrix A are also computed. The reciprocal of the condition number in this example is estimated from the following values:

||A||₁ = max(6.0, 8.0, 10.0, 12.0, 13.0, 14.0, 15.0, 15.0, 15.0) = 15.0

Estimate of ||A^-1||₁ = 1091.87

This estimate is equal to the actual rcond of 5.436(10^-5), which is computed by SGEICD and DGEICD. (See "Example 1".) On output, the value in det, |A|, is equal to 336.

Call Statement and Input

             A  LDA  N   IPVT  IOPT  RCOND   DET   AUX  NAUX
             |   |   |    |     |     |       |     |    |
CALL DGEFCD( A , 9 , 9 , IPVT , 3  , RCOND , DET , AUX , 9  )

A        =  (same as input A in "Example 1")

Output

A        =  (same as output A in "Example 1")
IPVT     =  (3, 4, 5, 6, 7, 8, 9, 8, 9)
RCOND    =  0.00005436
DET      =  (3.36, 2.00)

SPPF, DPPF, SPOF, DPOF, CPOF, and ZPOF--Positive Definite Real Symmetric or Complex Hermitian Matrix Factorization

The SPPF and DPPF subroutines factor positive definite symmetric matrix A, stored in lower-packed storage mode, using Gaussian elimination (LDL^T) or the Cholesky factorization method. To solve a system of equations with one or more right-hand sides, follow the call to these subroutines with one or more calls to SPPS or DPPS, respectively. To find the inverse of matrix A, follow the call to these subroutines, performing Cholesky factorization, with a call to SPPICD or DPPICD, respectively.

The SPOF, DPOF, CPOF, and ZPOF subroutines factor matrix A stored in upper or lower storage mode, where:

For SPOF and DPOF, A is a positive definite symmetric matrix.
For CPOF and ZPOF, A is a positive definite complex Hermitian matrix.

Matrix A is factored using Cholesky factorization (LL^T or U^TU for SPOF and DPOF, and LL^H or U^HU for CPOF and ZPOF). To solve the system of equations with one or more right-hand sides, follow the call to these subroutines with a call to SPOSM, DPOSM, CPOSM, or ZPOSM. To find the inverse of matrix A, follow the call to SPOF or DPOF with a call to SPOICD or DPOICD.

Table 93. Data Types

A Subroutine
Short-precision real SPPF and SPOF
Long-precision real DPPF and DPOF
Short-precision complex CPOF
Long-precision complex ZPOF

Note: The output from SPPF and DPPF should be used only as input to the following subroutines for performing a solve or inverse: SPPS/SPPICD and DPPS/DPPICD, respectively. The output from SPOF, DPOF, CPOF, and ZPOF should be used only as input to the following subroutines for performing a solve or inverse: SPOSM/SPOICD, DPOSM/DPOICD, CPOSM, and ZPOSM, respectively.

Syntax

Fortran	CALL SPPF \| DPPF (`ap`, `n`, `iopt`) CALL SPOF \| DPOF \| CPOF \| ZPOF (`uplo`, `a`, `lda`, `n`)
C and C++	sppf \| dppf (`ap`, `n`, `iopt`); spof \| dpof \| cpof \| zpof (`uplo`, `a`, `lda`, `n`);
PL/I	CALL SPPF \| DPPF (`ap`, `n`, `iopt`); CALL SPOF \| DPOF \| CPOF \| ZPOF (`uplo`, `a`, `lda`, `n`);

On Entry

uplo

indicates whether matrix A is stored in upper or lower storage mode, where:

If uplo = 'U', A is stored in upper storage mode.

If uplo = 'L', A is stored in lower storage mode.

Specified as: a single character. It must be 'U' or 'L'.

ap

is the array, referred to as AP, in which matrix A, to be factored, is stored in lower-packed storage mode.

Specified as: a one-dimensional array, containing numbers of the data type indicated in Table 93. See "Notes".

If iopt = 0, the array must have at least n(n+1)/2+n elements.

If iopt = 1, the array must have at least n(n+1)/2 elements.

a

is the positive definite matrix A, to be factored.

Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 93.

lda

is the leading dimension of the array specified for a.

Specified as: a fullword integer; lda > 0 and lda >= n.

n

is the order n of matrix A.

Specified as: a fullword integer; n >= 0.

iopt

determines the type of computation to be performed, where:

If iopt = 0, the matrix is factored using the LDL^T method.

If iopt = 1, the matrix is factored using Cholesky factorization.

Specified as: a fullword integer; iopt = 0 or 1.

On Return

ap

is the transformed matrix A of order n, containing the results of the factorization. See "Notes" and "Function".

Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 93.

If iopt = 0, the array contains n(n+1)/2+n elements.

If iopt = 1, the array contains n(n+1)/2 elements.

a

is the transformed matrix A of order n, containing the results of the factorization. See "Function".

Returned as: a two-dimensional array, containing numbers of the data type indicated in Table 93.

Notes

All subroutines accept lowercase letters for the uplo argument.
In the input and output arrays specified for ap, the first n(n+1)/2 elements are matrix elements. The additional n locations, required in the array when iopt = 0, are used for working storage by this subroutine and should not be altered between calls to the factorization and solve subroutines.
On input, the imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values. On output, they are set to zero.
For a description of the storage modes used for the matrices, see:
- For positive definite symmetric matrices, see "Positive Definite or Negative Definite Symmetric Matrix".
- For positive definite complex Hermitian matrices, see "Positive Definite or Negative Definite Complex Hermitian Matrix".
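
The layout of the first n(n+1)/2 elements of ap can be sketched as follows. The sketch assumes that lower-packed storage mode stores the lower triangle column by column, which matches the AP arrays shown in the examples for SPPF. The helper names pack_lower and packed_index are hypothetical, and indices are 0-based as is usual in Python.

```python
def pack_lower(a):
    """Pack the lower triangle of a square matrix, column by column."""
    n = len(a)
    return [a[i][j] for j in range(n) for i in range(j, n)]


def packed_index(i, j, n):
    """0-based position in the packed array of element (i, j), for i >= j.

    Column j starts after the j earlier columns, which hold
    n, n-1, ..., n-j+1 elements respectively.
    """
    return j * n - j * (j - 1) // 2 + (i - j)


# The order-9 matrix used in the SPPF examples: element (i, j) is min(i, j),
# counting rows and columns from 1.
n = 9
a = [[float(min(i, j) + 1) for j in range(n)] for i in range(n)]
ap = pack_lower(a)

# When iopt = 0, n further working-storage locations are required after
# the matrix elements; their contents on input are not significant.
ap_iopt0 = ap + [0.0] * n
```

With this layout, ap holds nine values of 1.0, then eight values of 2.0, and so on, in the same order as the input AP array shown in Example 1.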

Function

The functions for these subroutines are described in the sections below.

For SPPF and DPPF

If iopt = 0, the positive definite symmetric matrix A, stored in lower-packed storage mode, is factored using Gaussian elimination, where A is expressed as:

A = LDL^T

where:

L is a unit lower triangular matrix.

L^T is the transpose of matrix L.

D is a diagonal matrix.

If iopt = 1, the positive definite symmetric matrix A is factored using Cholesky factorization, where A is expressed as:

A = LL^T

where L is a lower triangular matrix.

If n is 0, no computation is performed. See references [36] and [38].
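
The two factorizations selected by iopt can be sketched in plain Python as shown below. This is a textbook illustration operating on a full two-dimensional matrix rather than on packed storage, and it does not reproduce the ESSL error handling. The function names ldlt and cholesky_lower are hypothetical.

```python
import math


def ldlt(a):
    """Return (l, d) with A = L D L^T, L unit lower triangular, D diagonal.

    Assumes every d[j] is nonzero; no error recovery is attempted.
    """
    n = len(a)
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = a[j][j] - sum(l[j][k] ** 2 * d[k] for k in range(j))
        for i in range(j + 1, n):
            s = sum(l[i][k] * l[j][k] * d[k] for k in range(j))
            l[i][j] = (a[i][j] - s) / d[j]
    return l, d


def cholesky_lower(a):
    """Return lower triangular L with A = L L^T."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        diag = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if diag <= 0.0:
            # Stop at the first nonpositive pivot, as a Cholesky method must.
            raise ValueError(f"not positive definite at order {j + 1}")
        l[j][j] = math.sqrt(diag)
        for i in range(j + 1, n):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = (a[i][j] - s) / l[j][j]
    return l


# The order-9 matrix of the SPPF examples: element (i, j) is min(i, j),
# counting rows and columns from 1.
n = 9
a = [[float(min(i, j) + 1) for j in range(n)] for i in range(n)]
l_unit, d = ldlt(a)
l_chol = cholesky_lower(a)
```

For this particular matrix, both factors have 1.0 in every position of the lower triangle and D is the identity, which agrees with the all-ones output arrays shown in the examples.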

For SPOF, DPOF, CPOF, and ZPOF

The positive definite matrix A, stored in upper or lower storage mode, is factored using Cholesky factorization, where A is expressed as:

A = LL^T or A = U^TU for SPOF and DPOF

A = LL^H or A = U^HU for CPOF and ZPOF

where:

L is a lower triangular matrix.

L^T is the transpose of matrix L.

L^H is the conjugate transpose of matrix L.

U is an upper triangular matrix.

U^T is the transpose of matrix U.

U^H is the conjugate transpose of matrix U.

If n is 0, no computation is performed. See references [8], [64], and [36].
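
For the complex Hermitian case, the lower Cholesky factorization A = LL^H can be sketched as follows. This is a textbook illustration, not the ESSL code. The function name cholesky_hermitian_lower is hypothetical. It reads only the lower triangle of A and ignores the imaginary parts of the diagonal, in line with the Notes above. The input matrix is the one used in Example 5.

```python
import math


def cholesky_hermitian_lower(a):
    """Return lower triangular L with A = L L^H for Hermitian positive definite A."""
    n = len(a)
    l = [[0j] * n for _ in range(n)]
    for j in range(n):
        # Diagonal entries of L are real; only the real part of a[j][j] is used.
        diag = a[j][j].real - sum(
            (l[j][k] * l[j][k].conjugate()).real for k in range(j)
        )
        if diag <= 0.0:
            raise ValueError(f"not positive definite at order {j + 1}")
        l[j][j] = complex(math.sqrt(diag), 0.0)
        for i in range(j + 1, n):
            s = sum(l[i][k] * l[j][k].conjugate() for k in range(j))
            l[i][j] = (a[i][j] - s) / l[j][j]
    return l


# Input matrix of Example 5 (only the lower triangle is referenced).
a = [
    [25 + 0j, -5 - 5j, 10 + 5j],
    [-5 + 5j, 51 + 0j, 4 - 6j],
    [10 - 5j, 4 + 6j, 71 + 0j],
]
l = cholesky_hermitian_lower(a)
```

The upper factor for uplo = 'U' is the conjugate transpose of this L, since A = U^HU with U = L^H.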

Error Conditions

Resource Errors

Unable to allocate internal work area.

Computational Errors

Matrix A is not positive definite (for SPPF and DPPF when iopt = 0).
- Processing continues to the end of the matrix.
- One or more elements of D contain values less than or equal to 0; all elements of D are checked. The index i of the last nonpositive element encountered is identified in the computational error message.
- The return code is set to 1.
- i can be determined at run time by use of the ESSL error-handling facilities. To obtain this information, you must use ERRSET to change the number of allowable errors for error code 2104 in the ESSL error option table; otherwise, the default value causes your program to terminate when this error occurs. For details, see "What Can You Do about ESSL Computational Errors?".
Matrix A is not positive definite (for SPPF and DPPF when iopt = 1 and for SPOF, DPOF, CPOF, and ZPOF).
- Processing stops at the first occurrence of a nonpositive definite diagonal element.
- The order i of the first minor encountered having a nonpositive determinant is identified in the computational error message.
- The return code is set to 1.
- i can be determined at run time by use of the ESSL error-handling facilities. To obtain this information, you must use ERRSET to change the number of allowable errors for error code 2115 in the ESSL error option table; otherwise, the default value causes your program to terminate when this error occurs. For details, see "What Can You Do about ESSL Computational Errors?".

Input-Argument Errors

n < 0
iopt <> 0 or 1
uplo <> 'U' or 'L'
lda <= 0
n > lda

Example 1

This example shows a factorization of positive definite symmetric matrix A of order 9, stored in lower-packed storage mode, where on input matrix A is:

             *                                             *
             | 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0 |
             | 1.0  2.0  2.0  2.0  2.0  2.0  2.0  2.0  2.0 |
             | 1.0  2.0  3.0  3.0  3.0  3.0  3.0  3.0  3.0 |
             | 1.0  2.0  3.0  4.0  4.0  4.0  4.0  4.0  4.0 |
             | 1.0  2.0  3.0  4.0  5.0  5.0  5.0  5.0  5.0 |
             | 1.0  2.0  3.0  4.0  5.0  6.0  6.0  6.0  6.0 |
             | 1.0  2.0  3.0  4.0  5.0  6.0  7.0  7.0  7.0 |
             | 1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0  8.0 |
             | 1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0  9.0 |
             *                                             *

On output, all elements of this matrix A are 1.0.
Note: The AP arrays are formatted in a triangular arrangement for readability; however, they are stored in lower-packed storage mode.

Call Statement and Input

           AP  N  IOPT
           |   |   |
CALL SPPF( AP, 9,  0 )

AP = (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
      2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0,
      3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0,
      4.0, 4.0, 4.0, 4.0, 4.0, 4.0,
      5.0, 5.0, 5.0, 5.0, 5.0,
      6.0, 6.0, 6.0, 6.0,
      7.0, 7.0, 7.0,
      8.0, 8.0,
      9.0,
       . ,  . ,  . ,  . ,  . ,  . ,  . ,  . ,  .  )

Output

AP = (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
      1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
      1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
      1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
      1.0, 1.0, 1.0, 1.0, 1.0,
      1.0, 1.0, 1.0, 1.0,
      1.0, 1.0, 1.0,
      1.0, 1.0,
      1.0,
      1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)

Example 2

This example shows a factorization of the same positive definite symmetric matrix A of order 9 used in Example 1, stored in lower-packed storage mode.
Note: The AP arrays are formatted in a triangular arrangement for readability; however, they are stored in lower-packed storage mode.

Call Statement and Input

           AP  N  IOPT
           |   |   |
CALL SPPF( AP, 9,  1 )

AP = (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
      2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0,
      3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0,
      4.0, 4.0, 4.0, 4.0, 4.0, 4.0,
      5.0, 5.0, 5.0, 5.0, 5.0,
      6.0, 6.0, 6.0, 6.0,
      7.0, 7.0, 7.0,
      8.0, 8.0,
      9.0)

Output

AP = (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
      1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
      1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
      1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
      1.0, 1.0, 1.0, 1.0, 1.0,
      1.0, 1.0, 1.0, 1.0,
      1.0, 1.0, 1.0,
      1.0, 1.0,
      1.0)

Example 3

This example shows a factorization of the same positive definite symmetric matrix A of order 9 used in Example 1, but stored in lower storage mode.

Call Statement and Input

           UPLO  A  LDA  N
            |    |   |   |
CALL SPOF( 'L' , A , 9 , 9 )

        *                                             *
        | 1.0   .    .    .    .    .    .    .    .  |
        | 1.0  2.0   .    .    .    .    .    .    .  |
        | 1.0  2.0  3.0   .    .    .    .    .    .  |
        | 1.0  2.0  3.0  4.0   .    .    .    .    .  |
A    =  | 1.0  2.0  3.0  4.0  5.0   .    .    .    .  |
        | 1.0  2.0  3.0  4.0  5.0  6.0   .    .    .  |
        | 1.0  2.0  3.0  4.0  5.0  6.0  7.0   .    .  |
        | 1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0   .  |
        | 1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0  9.0 |
        *                                             *

Output

        *                                             *
        | 1.0   .    .    .    .    .    .    .    .  |
        | 1.0  1.0   .    .    .    .    .    .    .  |
        | 1.0  1.0  1.0   .    .    .    .    .    .  |
        | 1.0  1.0  1.0  1.0   .    .    .    .    .  |
A    =  | 1.0  1.0  1.0  1.0  1.0   .    .    .    .  |
        | 1.0  1.0  1.0  1.0  1.0  1.0   .    .    .  |
        | 1.0  1.0  1.0  1.0  1.0  1.0  1.0   .    .  |
        | 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0   .  |
        | 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0 |
        *                                             *

Example 4

This example shows a factorization of the same positive definite symmetric matrix A of order 9 used in Example 1, but stored in upper storage mode.

Call Statement and Input

           UPLO  A  LDA  N
            |    |   |   |
CALL SPOF( 'U' , A , 9 , 9 )

        *                                             *
        | 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0 |
        |  .   2.0  2.0  2.0  2.0  2.0  2.0  2.0  2.0 |
        |  .    .   3.0  3.0  3.0  3.0  3.0  3.0  3.0 |
        |  .    .    .   4.0  4.0  4.0  4.0  4.0  4.0 |
A    =  |  .    .    .    .   5.0  5.0  5.0  5.0  5.0 |
        |  .    .    .    .    .   6.0  6.0  6.0  6.0 |
        |  .    .    .    .    .    .   7.0  7.0  7.0 |
        |  .    .    .    .    .    .    .   8.0  8.0 |
        |  .    .    .    .    .    .    .    .   9.0 |
        *                                             *

Output

        *                                             *
        | 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0 |
        |  .   1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0 |
        |  .    .   1.0  1.0  1.0  1.0  1.0  1.0  1.0 |
        |  .    .    .   1.0  1.0  1.0  1.0  1.0  1.0 |
A    =  |  .    .    .    .   1.0  1.0  1.0  1.0  1.0 |
        |  .    .    .    .    .   1.0  1.0  1.0  1.0 |
        |  .    .    .    .    .    .   1.0  1.0  1.0 |
        |  .    .    .    .    .    .    .   1.0  1.0 |
        |  .    .    .    .    .    .    .    .   1.0 |
        *                                             *

Example 5

This example shows a factorization of positive definite complex Hermitian matrix A of order 3, stored in lower storage mode, where on input matrix A is:

              *                                         *
              |  (25.0, 0.0)  (-5.0, -5.0)  (10.0, 5.0) |
              |  (-5.0, 5.0)   (51.0, 0.0)  (4.0, -6.0) |
              | (10.0, -5.0)    (4.0, 6.0)  (71.0, 0.0) |
              *                                         *

Note:

On input, the imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values. On output, they are set to zero.

Call Statement and Input

           UPLO  A  LDA  N
            |    |   |   |
CALL CPOF( 'L' , A , 3 , 3 )

        *                                     *
        |   (25.0, . )      .           .     |
A    =  |  (-5.0, 5.0) (51.0,  . )      .     |
        | (10.0, -5.0)  (4.0, 6.0) (71.0, . ) |
        *                                     *

Output

        *                                     *
        |  (5.0, 0.0)      .           .      |
A    =  | (-1.0, 1.0)  (7.0, 0.0)      .      |
        | (2.0, -1.0)  (1.0, 1.0)  (8.0, 0.0) |
        *                                     *

Example 6

This example shows a factorization of positive definite complex Hermitian matrix A of order 3, stored in upper storage mode, where on input matrix A is:

               *                                     *
               |  (9.0, 0.0)  (3.0, 3.0) (3.0, -3.0) |
               | (3.0, -3.0) (18.0, 0.0) (8.0, -6.0) |
               |  (3.0, 3.0)  (8.0, 6.0) (43.0, 0.0) |
               *                                     *

Note:

On input, the imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values. On output, they are set to zero.

Call Statement and Input

           UPLO  A  LDA  N
            |    |   |   |
CALL CPOF( 'U' , A , 3 , 3 )

        *                                     *
        | (9.0,  . )   (3.0,3.0)   (3.0,-3.0) |
A    =  |     .       (18.0, . )   (8.0,-6.0) |
        |     .            .      (43.0,  . ) |
        *                                     *

Output

        *                                       *
        | (3.0, 0.0)    (1.0, 1.0)  (1.0, -1.0) |
A    =  |     .         (4.0, 0.0)  (2.0, -1.0) |
        |     .            .         (6.0, 0.0) |
        *                                       *

SPPS and DPPS--Positive Definite Real Symmetric Matrix Solve

These subroutines solve the system Ax = b for x, where A is a positive definite symmetric matrix, and x and b are vectors. The subroutines use the results of the factorization of matrix A, produced by a preceding call to SPPF/SPPFCD or DPPF/DPPFP/DPPFCD, respectively.
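
How a solve step can reuse a Cholesky factor is sketched below: with A = LL^T, the system Ax = b is solved by a forward substitution with L followed by a back substitution with L^T. This is a generic illustration on a full two-dimensional factor, not the ESSL implementation or its packed storage layout. The function name solve_with_cholesky and the order-3 data are hypothetical.

```python
def solve_with_cholesky(l, b):
    """Solve A x = b given the lower triangular factor L of A = L L^T."""
    n = len(l)
    x = list(b)
    for i in range(n):                 # forward substitution: L y = b
        for k in range(i):
            x[i] -= l[i][k] * x[k]
        x[i] /= l[i][i]
    for i in reversed(range(n)):       # back substitution: L^T x = y
        for k in range(i + 1, n):
            x[i] -= l[k][i] * x[k]
        x[i] /= l[i][i]
    return x


# Hypothetical data: L is the factor of A = [[1,1,1],[1,2,2],[1,2,3]],
# and b = A * (1, 2, 3).
l = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0], [1.0, 1.0, 1.0]]
x = solve_with_cholesky(l, [6.0, 11.0, 14.0])
```

Because the factor is reused unchanged, further right-hand sides can be solved without repeating the factorization.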

Table 94. Data Types

A, b, x Subroutine
Short-precision real SPPS
Long-precision real DPPS

Note: The input to these solve subroutines must be the output from the factorization subroutines SPPF/SPPFCD and DPPF/DPPFP/DPPFCD, respectively.

Syntax

Fortran	CALL SPPS \| DPPS (`ap`, `n`, `bx`, `iopt`)
C and C++	spps \| dpps (`ap`, `n`, `bx`, `iopt`);
PL/I	CALL SPPS \| DPPS (`ap`, `n`, `bx`, `iopt`);

On Entry

ap

is the factorization of matrix A, produced by a preceding call to SPPF/SPPFCD or DPPF/DPPFP/DPPFCD, respectively. Specified as: a one-dimensional array, containing numbers of the data type indicated in Table 94, where:

If iopt = 0, the array must contain n(n+1)/2+n elements.

If iopt = 1, the array must contain n(n+1)/2 elements.

n

is the order of matrix A used in the factorization, and the lengths of vectors b and x. Specified as: a fullword integer; n >= 0.

bx

is the vector b of length n, containing the right-hand side of the system. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 94.

iopt

indicates the type of factorization that was performed on matrix A, where:

If iopt = 0, the matrix was factored using the LDL^T method.

If iopt = 1, the matrix was factored using Cholesky factorization.

Specified as: a fullword integer; iopt = 0 or 1.

On Return

bx

is the solution vector x of length n, containing the results of the computation. Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 94.

Notes

The array data specified for input argument ap for these subroutines must be the same as the corresponding output argument for SPPF/SPPFCD and DPPF/DPPFP/DPPFCD, respectively.
The scalar data specified for input argument n for these subroutines must be the same as that specified for SPPF/SPPFCD and DPPF/DPPFP/DPPFCD, respectively.
When you call these subroutines after calling SPPF or DPPF, the value of input argument iopt must be the same as that specified for SPPF and DPPF.
When you call these subroutines after calling SPPFCD or DPPFCD, the value of input argument iopt must be 0.
When you call these subroutines after calling DPPFP, the value of input argument iopt must be 1.
In the input array specified for ap, the first n(n+1)/2 elements are matrix elements. The additional n locations, required in the array when iopt = 0, are used for working storage by this subroutine and should not be altered between calls to the factorization and solve subroutines.
The vectors and matrices used in this computation must have no common elements; otherwise, results are unpredictable. See "Concepts".
For a description of how a positive definite symmetric matrix is stored in lower-packed storage mode in an array, see "Symmetric Matrix".
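
The array sizes given for ap can be illustrated with a small Python sketch. It assumes that lower-packed storage mode stores the lower triangle column by column (a11, a21, ..., an1, a22, a32, ...); that layout is described in "Symmetric Matrix", not in this section, so treat it as an assumption. The function names are illustrative and indices are 0-based.

```python
# Sketch of lower-packed storage, assuming the lower triangle is stored
# column by column (a11, a21, ..., an1, a22, a32, ...). Indices are 0-based.


def packed_length(n, iopt):
    """Number of elements ap must hold for the solve, per the iopt rules."""
    base = n * (n + 1) // 2
    # iopt = 0 needs n extra locations of working storage.
    return base + n if iopt == 0 else base


def packed_index(i, j, n):
    """Position of element (i, j), with i >= j, in the packed array."""
    return j * n - j * (j - 1) // 2 + (i - j)


def pack_lower(a):
    """Pack the lower triangle of a square matrix given as a list of rows."""
    n = len(a)
    return [a[i][j] for j in range(n) for i in range(j, n)]


# Small symmetric test matrix with a(i, j) = min(i, j) in 1-based terms.
a = [[float(min(i, j) + 1) for j in range(4)] for i in range(4)]
ap = pack_lower(a)
print(len(ap), packed_length(4, 1), packed_length(4, 0))
print(ap[packed_index(3, 2, 4)] == a[3][2])
```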

Function

The system Ax = b is solved for x, where A is a positive definite symmetric matrix, stored in lower-packed storage mode in array AP, and x and b are vectors. These subroutines use the results of the factorization of matrix A, produced by a preceding call to SPPF/SPPFCD or DPPF/DPPFP/DPPFCD, respectively.

If n is 0, no computation is performed. See references [36] and [38].
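
The factor-then-solve sequence can be sketched in plain Python as a Cholesky factorization followed by forward and back substitution. This is not the ESSL implementation. It assumes A is the 9 by 9 matrix with a(i, j) = min(i, j), which is consistent with the right-hand side used in the examples for these subroutines; the matrix itself is given in the SPPF examples, not here. Function names are illustrative.

```python
import math

# Sketch of a factor-then-solve sequence for A x = b with A symmetric positive
# definite. Plain Python, not ESSL.
# Assumption: A is the 9 x 9 matrix with a(i, j) = min(i, j) (1-based).


def cholesky(a):
    """Return lower triangular L with A = L L^T."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = s / l[j][j]
    return l


def solve_with_factor(l, b):
    """Solve L L^T x = b by forward substitution, then back substitution."""
    n = len(l)
    y = [0.0] * n
    for i in range(n):
        y[i] = (b[i] - sum(l[i][k] * y[k] for k in range(i))) / l[i][i]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (y[i] - sum(l[k][i] * x[k] for k in range(i + 1, n))) / l[i][i]
    return x


n = 9
a = [[float(min(i, j) + 1) for j in range(n)] for i in range(n)]
b = [9.0, 17.0, 24.0, 30.0, 35.0, 39.0, 42.0, 44.0, 45.0]
x = solve_with_factor(cholesky(a), b)
print(x)
```

The factor is computed once and can be reused for further right-hand sides, which mirrors the split between the factorization and solve subroutines.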

Error Conditions

Computational Errors

None
Note: If a call to SPPF, DPPF, SPPFCD, DPPFCD, or DPPFP resulted in a nonpositive definite matrix, error 2104 or 2115, SPPS or DPPS results may be unpredictable or numerically unstable.

Input-Argument Errors

n < 0
iopt <> 0 or 1

Example 1

This example shows how to solve the system Ax = b, where matrix A is the same matrix factored in the "Example 1" for SPPF and DPPF.

Call Statement and Input

            AP   N   BX   IOPT
            |    |   |     |
CALL SPPS ( AP , 9 , BX ,  0 )

AP       =(same as output AP in
"Example 1" for SPPF and DPPF)
BX       =  (9.0, 17.0, 24.0, 30.0, 35.0, 39.0, 42.0, 44.0, 45.0)

Output

BX       =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)

Example 2

This example shows how to solve the same system as in Example 1, where matrix A is the same matrix factored in the "Example 2" for SPPF and DPPF.

Call Statement and Input

           AP   N   BX  IOPT
           |    |   |    |
CALL SPPS( AP , 9 , BX , 1 )

AP       =(same as output AP in
"Example 2" for SPPF and DPPF)
BX       =  (9.0, 17.0, 24.0, 30.0, 35.0, 39.0, 42.0, 44.0, 45.0)

Output

BX       =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)

SPOSM, DPOSM, CPOSM, and ZPOSM--Positive Definite Real Symmetric or Complex Hermitian Matrix Multiple Right-Hand Side Solve

These subroutines solve the system AX = B for X, using multiple right-hand sides, where X and B are general matrices and:

For SPOSM and DPOSM, A is a positive definite symmetric matrix.
For CPOSM and ZPOSM, A is a positive definite complex Hermitian matrix.

These subroutines use the results of the factorization of matrix A, produced by a preceding call to SPOF/SPOFCD, DPOF/DPOFCD, CPOF, or ZPOF, respectively.

Table 95. Data Types

A, B, X Subroutine
Short-precision real SPOSM
Long-precision real DPOSM
Short-precision complex CPOSM
Long-precision complex ZPOSM

Note: The input to these solve subroutines must be the output from the factorization subroutines SPOF/SPOFCD, DPOF/DPOFCD, CPOF, and ZPOF, respectively.

Syntax

Fortran	CALL SPOSM | DPOSM | CPOSM | ZPOSM (`uplo`, `a`, `lda`, `n`, `b`, `ldb`, `nrhs`)
C and C++	sposm | dposm | cposm | zposm (`uplo`, `a`, `lda`, `n`, `b`, `ldb`, `nrhs`);
PL/I	CALL SPOSM | DPOSM | CPOSM | ZPOSM (`uplo`, `a`, `lda`, `n`, `b`, `ldb`, `nrhs`);

On Entry

uplo

indicates whether the original matrix A is stored in upper or lower storage mode, where:

If uplo = 'U', A is stored in upper storage mode.

If uplo = 'L', A is stored in lower storage mode.

Specified as: a single character. It must be 'U' or 'L'.

a

is the factorization of positive definite matrix A, produced by a preceding call to SPOF/SPOFCD, DPOF/DPOFCD, CPOF, or ZPOF, respectively. Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 95.

lda

is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= n.

n

is the order of matrix A. Specified as: a fullword integer; 0 <= n <= lda.

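b

is the matrix B, containing the nrhs right-hand sides of the system in the columns of B. Specified as: an ldb by (at least) nrhs array, containing numbers of the data type indicated in Table 95.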

ldb

is the leading dimension of the array specified for b. Specified as: a fullword integer; ldb > 0 and ldb >= n.

nrhs

is the number of right-hand sides in the system to be solved. Specified as: a fullword integer; nrhs >= 0.

On Return

b

is the matrix B, containing the nrhs solutions to the system in the columns of B. Returned as: an ldb by (at least) nrhs array, containing numbers of the data type indicated in Table 95.

Notes

All subroutines accept lowercase letters for the uplo argument.
The scalar data specified for input arguments uplo, lda, and n for these subroutines must be the same as the corresponding input arguments specified for SPOF/SPOFCD, DPOF/DPOFCD, CPOF, and ZPOF, respectively.
The array data specified for input argument a for these subroutines must be the same as the corresponding output arguments for SPOF/SPOFCD, DPOF/DPOFCD, CPOF, and ZPOF, respectively.
The vectors and matrices used in this computation must have no common elements; otherwise, results are unpredictable. See "Concepts".
For a description of how the matrices are stored:
- For positive definite symmetric matrices, see "Positive Definite or Negative Definite Symmetric Matrix".
- For positive definite complex Hermitian matrices, see "Positive Definite or Negative Definite Complex Hermitian Matrix".

Function

The system AX = B is solved for X, using multiple right-hand sides, where X and B are general matrices, and A is a positive definite symmetric matrix for SPOSM and DPOSM and a positive definite complex Hermitian matrix for CPOSM and ZPOSM. These subroutines use the results of the factorization of matrix A, produced by a preceding call to SPOF/SPOFCD, DPOF/DPOFCD, CPOF, or ZPOF, respectively. For a description of how A is factored, see SPPF, DPPF, SPOF, DPOF, CPOF, and ZPOF--Positive Definite Real Symmetric or Complex Hermitian Matrix Factorization.

If n or nrhs is 0, no computation is performed. See references [8] and [36].
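
A multiple right-hand side solve can be sketched as one factorization followed by a pair of triangular solves per column of B, with the solutions overwriting the right-hand sides. The Python sketch below is not ESSL code. It assumes A is the 9 by 9 matrix with a(i, j) = min(i, j), which is consistent with the B matrix used in the real examples for these subroutines; function names are illustrative.

```python
import math

# Sketch: factor A once, then solve A X = B column by column.
# Assumption: A is the 9 x 9 matrix with a(i, j) = min(i, j) (1-based).


def cholesky_lower(a):
    """Return lower triangular L with A = L L^T."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError("matrix is not positive definite")
        l[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            l[i][j] = (a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))) / l[j][j]
    return l


def solve_columns(l, b):
    """Return X with L L^T X = B; each column is solved independently."""
    n, nrhs = len(l), len(b[0])
    x = [row[:] for row in b]  # work on a copy, column by column
    for c in range(nrhs):
        for i in range(n):  # forward substitution: L y = b
            x[i][c] = (x[i][c] - sum(l[i][k] * x[k][c] for k in range(i))) / l[i][i]
        for i in reversed(range(n)):  # back substitution: L^T x = y
            x[i][c] = (x[i][c] - sum(l[k][i] * x[k][c] for k in range(i + 1, n))) / l[i][i]
    return x


n = 9
a = [[float(min(i, j) + 1) for j in range(n)] for i in range(n)]
b = [[9.0, 45.0], [17.0, 89.0], [24.0, 131.0], [30.0, 170.0], [35.0, 205.0],
     [39.0, 235.0], [42.0, 259.0], [44.0, 276.0], [45.0, 285.0]]
x = solve_columns(cholesky_lower(a), b)
for row in x:
    print(row)
```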

Error Conditions

Computational Errors

None
Note: If the factorization performed by SPOF, DPOF, CPOF, ZPOF, SPOFCD, or DPOFCD failed because matrix A was not positive definite, the results returned by this subroutine are unpredictable, and there may be a divide-by-zero program exception message.

Input-Argument Errors

uplo <> 'U' or 'L'
lda, ldb <= 0
n < 0
n > lda
n > ldb
nrhs < 0

Example 1

This example shows how to solve the system AX = B for two right-hand sides, where matrix A is the same matrix factored in the "Example 3" for SPOF.

Call Statement and Input

            UPLO  A  LDA  N   B  LDB  NRHS
             |    |   |   |   |   |    |
CALL SPOSM( 'L' , A , 9 , 9 , B , 9 ,  2  )

A        =(same as output A in
"Example 3")

        *             *
        |  9.0   45.0 |
        | 17.0   89.0 |
        | 24.0  131.0 |
        | 30.0  170.0 |
B    =  | 35.0  205.0 |
        | 39.0  235.0 |
        | 42.0  259.0 |
        | 44.0  276.0 |
        | 45.0  285.0 |
        *             *

Output

        *          *
        | 1.0  1.0 |
        | 1.0  2.0 |
        | 1.0  3.0 |
        | 1.0  4.0 |
B    =  | 1.0  5.0 |
        | 1.0  6.0 |
        | 1.0  7.0 |
        | 1.0  8.0 |
        | 1.0  9.0 |
        *          *

Example 2

This example shows how to solve the system AX = B for two right-hand sides, where matrix A is the input matrix factored in "Example 4" for SPOF.

Call Statement and Input

            UPLO  A  LDA  N   B  LDB  NRHS
             |    |   |   |   |   |    |
CALL SPOSM( 'U' , A , 9 , 9 , B , 9 ,  2  )

A        =(same as output A in
"Example 4")

        *             *
        |  9.0   45.0 |
        | 17.0   89.0 |
        | 24.0  131.0 |
        | 30.0  170.0 |
B    =  | 35.0  205.0 |
        | 39.0  235.0 |
        | 42.0  259.0 |
        | 44.0  276.0 |
        | 45.0  285.0 |
        *             *

Output

        *          *
        | 1.0  1.0 |
        | 1.0  2.0 |
        | 1.0  3.0 |
        | 1.0  4.0 |
B    =  | 1.0  5.0 |
        | 1.0  6.0 |
        | 1.0  7.0 |
        | 1.0  8.0 |
        | 1.0  9.0 |
        *          *

Example 3

This example shows how to solve the system AX = B for two right-hand sides, where matrix A is the same matrix factored in the "Example 5" for CPOF.

Call Statement and Input

            UPLO  A  LDA  N   B  LDB  NRHS
             |    |   |   |   |   |    |
CALL CPOSM( 'L' , A , 3 , 3 , B , 3 ,  2  )

A        =(same as output A in
"Example 5")

        *                                *
        |  (60.0, -55.0)    (70.0, 10.0) |
B    =  |   (34.0, 58.0)  (-51.0, 110.0) |
        | (13.0, -152.0)    (75.0, 63.0) |
        *                                *

Output

        *                          *
        | (2.0, -1.0)   (2.0, 0.0) |
B    =  |  (1.0, 1.0)  (-1.0, 2.0) |
        | (0.0, -2.0)   (1.0, 1.0) |
        *                          *

Example 4

This example shows how to solve the system AX = B for two right-hand sides, where matrix A is the input matrix factored in "Example 6" for CPOF.

Call Statement and Input

            UPLO  A  LDA  N   B  LDB  NRHS
             |    |   |   |   |   |    |
CALL CPOSM( 'U' , A , 3 , 3 , B , 3 ,  2  )

A        =(same as output A in
"Example 6")

        *                              *
        | (33.0, -18.0)   (15.0, -3.0) |
B    =  | (45.0, -45.0)    (8.0, -2.0) |
        |  (152.0, 1.0)  (43.0, -29.0) |
        *                              *

Output

        *                          *
        | (2.0, -1.0)   (2.0, 0.0) |
B    =  | (1.0, -1.0)   (0.0, 1.0) |
        |  (3.0, 0.0)  (1.0, -1.0) |
        *                          *
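
The solution in this example can be checked by multiplying A by the output X and comparing the product with the input B. The Python sketch below assumes that A is the complex Hermitian matrix shown with the CPOF example that uses uplo = 'U' (taken here to be the "Example 6" input). It is a residual check only, not a solver.

```python
# Residual check for the complex Hermitian example: A X should reproduce B.
# Assumption: A is the Hermitian matrix shown in the CPOF example with uplo = 'U'.

A = [
    [9 + 0j, 3 + 3j, 3 - 3j],
    [3 - 3j, 18 + 0j, 8 - 6j],
    [3 + 3j, 8 + 6j, 43 + 0j],
]

X = [
    [2 - 1j, 2 + 0j],
    [1 - 1j, 0 + 1j],
    [3 + 0j, 1 - 1j],
]

B = [
    [33 - 18j, 15 - 3j],
    [45 - 45j, 8 - 2j],
    [152 + 1j, 43 - 29j],
]

# Plain matrix product A X, one entry at a time.
AX = [[sum(A[i][k] * X[k][c] for k in range(3)) for c in range(2)] for i in range(3)]
residual = max(abs(AX[i][c] - B[i][c]) for i in range(3) for c in range(2))
print("largest |A X - B| entry:", residual)
```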

SPPFCD, DPPFCD, SPOFCD, and DPOFCD--Positive Definite Real Symmetric Matrix Factorization, Condition Number Reciprocal, and Determinant

The SPPFCD and DPPFCD subroutines factor positive definite symmetric matrix A, stored in lower-packed storage mode, using Gaussian elimination (LDL^T). The reciprocal of the condition number and the determinant of matrix A can also be computed. To solve the system of equations with one or more right-hand sides, follow the call to these subroutines with one or more calls to SPPS or DPPS, respectively.

The SPOFCD and DPOFCD subroutines factor positive definite symmetric matrix A, stored in upper or lower storage mode, using Cholesky factorization (LL^T or U^TU). The reciprocal of the condition number and the determinant of matrix A can also be computed. To solve the system of equations with one or more right-hand sides, follow the call to these subroutines with a call to SPOSM or DPOSM, respectively. To find the inverse of matrix A, follow the call to these subroutines with a call to SPOICD or DPOICD, respectively.

Table 96. Data Types

A, aux, rcond, det Subroutine
Short-precision real SPPFCD and SPOFCD
Long-precision real DPPFCD and DPOFCD

Note: The output factorization from SPPFCD and DPPFCD should be used only as input to the solve subroutines SPPS and DPPS, respectively. The output from SPOFCD and DPOFCD should be used only as input to the following subroutines for performing a solve or inverse: SPOSM/SPOICD and DPOSM/DPOICD, respectively.

Syntax

Fortran	CALL SPPFCD | DPPFCD (`ap`, `n`, `iopt`, `rcond`, `det`, `aux`, `naux`)
	CALL SPOFCD | DPOFCD (`uplo`, `a`, `lda`, `n`, `iopt`, `rcond`, `det`, `aux`, `naux`)
C and C++	sppfcd | dppfcd (`ap`, `n`, `iopt`, `rcond`, `det`, `aux`, `naux`);
	spofcd | dpofcd (`uplo`, `a`, `lda`, `n`, `iopt`, `rcond`, `det`, `aux`, `naux`);
PL/I	CALL SPPFCD | DPPFCD (`ap`, `n`, `iopt`, `rcond`, `det`, `aux`, `naux`);
	CALL SPOFCD | DPOFCD (`uplo`, `a`, `lda`, `n`, `iopt`, `rcond`, `det`, `aux`, `naux`);

On Entry

uplo

indicates whether matrix A is stored in upper or lower storage mode, where:

If uplo = 'U', A is stored in upper storage mode.

If uplo = 'L', A is stored in lower storage mode.

Specified as: a single character. It must be 'U' or 'L'.

ap

is the array, referred to as AP, in which the matrix A, to be factored, is stored in lower-packed storage mode. Specified as: a one-dimensional array of (at least) length n(n+1)/2+n, containing numbers of the data type indicated in Table 96.

a

is the positive definite symmetric matrix A, to be factored. Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 96.

lda

is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= n.

n

is the order n of matrix A. Specified as: a fullword integer, where:

For SPPFCD and DPPFCD, n >= 0.

For SPOFCD and DPOFCD, 0 <= n <= lda.

iopt

indicates the type of computation to be performed, where:

If iopt = 0, the matrix is factored.

If iopt = 1, the matrix is factored, and the reciprocal of the condition number is computed.

If iopt = 2, the matrix is factored, and the determinant is computed.

If iopt = 3, the matrix is factored and the reciprocal of the condition number and the determinant are computed.

Specified as: a fullword integer; iopt = 0, 1, 2, or 3.

rcond

See 'On Return'.

det

See 'On Return'.

aux

has the following meaning:

If naux = 0 and error 2015 is unrecoverable, aux is ignored.

Otherwise, is the storage work area used by these subroutines. Its size is specified by naux. Specified as: an area of storage, containing numbers of the data type indicated in Table 96.

naux

is the size of the work area specified by aux--that is, the number of elements in aux. Specified as: a fullword integer, where:

If naux = 0 and error 2015 is unrecoverable, SPPFCD, DPPFCD, SPOFCD, and DPOFCD dynamically allocate the work area used by the subroutine. The work area is deallocated before control is returned to the calling program.

Otherwise, naux >= n.

On Return

ap

is the transformed matrix A of order n, containing the results of the factorization. See "Function". Returned as: a one-dimensional array of (at least) length n(n+1)/2+n, containing numbers of the data type indicated in Table 96.

a

is the transformed matrix A of order n, containing the results of the factorization. See "Function". Returned as: a two-dimensional array, containing numbers of the data type indicated in Table 96.

rcond

is the estimate of the reciprocal of the condition number, rcond, of matrix A. Returned as: a number of the data type indicated in Table 96; rcond >= 0.

det

is the vector det, containing the two components det₁ and det₂ of the determinant of matrix A. The determinant is:

det₁ × 10^det₂

where 1 <= det₁ < 10. Returned as: an array of length 2, containing numbers of the data type indicated in Table 96.
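
The two-component form can be read as a mantissa det₁ and a power-of-ten exponent det₂, which is how the example outputs DET = (1.0, 0.0) for a determinant of 1 and DET = (3.36, 2.00) for a determinant of 336 are interpreted here. The Python sketch below shows one way such a pair could be accumulated from positive factors (for example, the diagonal of D in an LDL^T factorization) without overflow. The function name and the rescaling loop are assumptions for illustration, not the ESSL algorithm.

```python
# Sketch: accumulate a product of positive factors as (det1, det2), where
# determinant = det1 * 10**det2 and 1 <= det1 < 10.
# The rescaling after every multiplication keeps det1 in range, so the running
# product never overflows or underflows.


def scaled_determinant(factors):
    det1, det2 = 1.0, 0
    for f in factors:
        if f <= 0.0:
            raise ValueError("expected positive factors")
        det1 *= f
        while det1 >= 10.0:
            det1 /= 10.0
            det2 += 1
        while det1 < 1.0:
            det1 *= 10.0
            det2 -= 1
    return det1, det2


print(scaled_determinant([1.0] * 9))
print(scaled_determinant([336.0]))
# A product that would overflow an ordinary floating-point number:
print(scaled_determinant([2e200, 3e200, 0.5]))
```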

Notes

All subroutines accept lowercase letters for the uplo argument.
In your C program, argument rcond must be passed by reference.
When iopt = 0, SPPFCD and DPPFCD provide the same function as a call to SPPF or DPPF, respectively. When iopt = 0, SPOFCD and DPOFCD provide the same function as a call to SPOF or DPOF, respectively.
See "Notes" for information on specifying a value for iopt in the SPPS and DPPS subroutines after calling SPPFCD and DPPFCD, respectively.
In the input and output arrays specified for ap, the first n(n+1)/2 elements are matrix elements. The additional n locations in the array are used for working storage by this subroutine and should not be altered between calls to the factorization and solve subroutines.
For a description of how a positive definite symmetric matrix is stored in lower-packed storage mode in an array, see "Symmetric Matrix". For a description of how a positive definite symmetric matrix is stored in upper or lower storage mode, see "Positive Definite or Negative Definite Symmetric Matrix".
You have the option of having the minimum required value for naux dynamically returned to your program. For details, see "Using Auxiliary Storage in ESSL".

Function

The functions for these subroutines are described in the sections below.

For SPPFCD and DPPFCD

The positive definite symmetric matrix A, stored in lower-packed storage mode, is factored using Gaussian elimination, where A is expressed as:

A = LDL ^T

where:

L is a unit lower triangular matrix.

L^T is the transpose of matrix L.

D is a diagonal matrix.

If n is 0, no computation is performed. See references [36] and [38].

These subroutines call SPPF and DPPF, respectively, to perform the factorization using Gaussian elimination (LDL^T). If you want to use the Cholesky factorization method, you must call SPPF and DPPF directly.
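
The LDL^T form can be illustrated with a short Python sketch that works on a full (unpacked) matrix. It is a textbook factorization without pivoting, shown only to make the roles of L and D concrete; it is not the ESSL implementation, and the 3 by 3 test matrix is made up for the illustration.

```python
# Sketch of A = L D L^T for a symmetric positive definite matrix, where L is
# unit lower triangular and D is diagonal. Full storage, no pivoting.


def ldlt(a):
    """Return (L, d), with d holding the diagonal of D."""
    n = len(a)
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = a[j][j] - sum(l[j][k] ** 2 * d[k] for k in range(j))
        if d[j] == 0.0:
            raise ValueError("zero diagonal element in D")
        for i in range(j + 1, n):
            s = a[i][j] - sum(l[i][k] * l[j][k] * d[k] for k in range(j))
            l[i][j] = s / d[j]
    return l, d


a = [[4.0, 2.0, 2.0],
     [2.0, 5.0, 3.0],
     [2.0, 3.0, 6.0]]
l, d = ldlt(a)

# Rebuild A from the factors to confirm the factorization.
n = len(a)
rebuilt = [[sum(l[i][k] * d[k] * l[j][k] for k in range(n)) for j in range(n)]
           for i in range(n)]
print(d)
print(rebuilt)
```

For a positive definite matrix every element of D is positive, which is the condition checked under "Computational Errors".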

For SPOFCD and DPOFCD

The positive definite symmetric matrix A, stored in upper or lower storage mode, is factored using Cholesky factorization, where A is expressed as:

A = LL ^T or A = U^TU

where:

L is a lower triangular matrix.

L^T is the transpose of matrix L.

U is an upper triangular matrix.

U^T is the transpose of matrix U.

If specified, the estimate of the reciprocal of the condition number and the determinant can also be computed. The estimate of the condition number uses an enhanced version of the algorithm described in references [63] and [64].

If n is 0, no computation is performed. See references [8] and [36].

Error Conditions

Resource Errors

Error 2015 is unrecoverable, naux = 0, and unable to allocate work area.

Computational Errors

Matrix A is not positive definite (for SPPFCD and DPPFCD).
- If matrix A is singular (at least one of the diagonal elements is 0), then rcond and det, if you requested them, are set to 0.
- If matrix A is nonsingular and nonpositive definite (none of the diagonal elements are 0 and at least one diagonal element is negative), then rcond and det, if you requested them, are computed.
- One or more elements of D contain values less than or equal to 0; all elements of D are checked. The index i of the last nonpositive element encountered is identified in the computational error message, issued by SPPF or DPPF, respectively.
- i can be determined at run time by using the ESSL error-handling facilities. To obtain this information, you must use ERRSET to change the number of allowable errors for error code 2104 in the ESSL error option table; otherwise, the default value causes your program to be terminated by SPPF or DPPF, respectively, when this error occurs. If your program is not terminated by SPPF or DPPF, respectively, the return code is set to 2. For details, see "What Can You Do about ESSL Computational Errors?".
Matrix A is not positive definite (for SPOFCD and DPOFCD).
- If matrix A is singular (at least one of the diagonal elements is 0), then rcond and det, if you requested them, are set to 0.
- If matrix A is nonsingular and nonpositive definite (none of the diagonal elements are 0 and at least one diagonal element is negative), then rcond and det, if you requested them, are computed.
- Processing stops at the first occurrence of a nonpositive definite diagonal element.
- The order i of the first minor encountered having a nonpositive determinant is identified in the computational error message.
- i can be determined at run time by using the ESSL error-handling facilities. To obtain this information, you must use ERRSET to change the number of allowable errors for error code 2115 in the ESSL error option table; otherwise, the default value causes your program to be terminated by SPPF or DPPF, respectively, when this error occurs. If your program is not terminated by SPPF or DPPF, respectively, the return code is set to 2. For details, see "What Can You Do about ESSL Computational Errors?".
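
The idea of stopping at the first leading minor with a nonpositive determinant can be sketched in Python as a Cholesky factorization that reports the step at which it fails. This is an illustration of the mathematical condition only. It does not model the ESSL error-handling facilities (ERRSET, error codes, or return codes), and the function name is hypothetical.

```python
import math

# Sketch: report the order i of the first leading minor that is not positive.
# The Cholesky pivot at step j is positive exactly when the leading minor of
# order j + 1 has a positive determinant (given that all earlier ones did).


def first_nonpositive_minor(a):
    """Return 0 if A is positive definite, else the 1-based order i."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            return j + 1  # processing stops at the first failure
        l[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            l[i][j] = (a[i][j] - sum(l[i][k] * l[j][k] for k in range(j))) / l[j][j]
    return 0


print(first_nonpositive_minor([[4.0, 2.0], [2.0, 3.0]]))
print(first_nonpositive_minor([[1.0, 2.0], [2.0, 1.0]]))
print(first_nonpositive_minor([[0.0, 1.0], [1.0, 1.0]]))
```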

Input-Argument Errors

uplo <> 'U' or 'L'
lda <= 0
lda < n
n < 0
iopt <> 0, 1, 2, or 3
Error 2015 is recoverable or naux<>0, and naux is too small--that is, less than the minimum required value. Return code 1 is returned if error 2015 is recoverable.

Example 1

This example computes the factorization, reciprocal of the condition number, and determinant of matrix A. The input is the same as used in "Example 1" for SPPF.

The estimate of the reciprocal of the condition number is based on the following values:

||A||₁ = max(9.0, 17.0, 24.0, 30.0, 35.0, 39.0, 42.0, 44.0, 45.0) = 45.0

Estimate of ||A^-1||₁ = 4.0

On output, the value in det, |A|, is equal to 1.
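
The arithmetic behind RCOND in this example can be reproduced with a few lines of Python. The sketch assumes A is the 9 by 9 matrix with a(i, j) = min(i, j), which gives the column sums listed above, and it reads the quoted estimate 4.0 as the one-norm of the inverse, since that reading reproduces the RCOND value shown in the output. The estimate itself is taken from the text, not computed.

```python
# Sketch: rcond = 1 / (||A||_1 * ||A^-1||_1) with the values quoted in the example.
# Assumption: A is the 9 x 9 matrix with a(i, j) = min(i, j) (1-based).

n = 9
a = [[float(min(i, j) + 1) for j in range(n)] for i in range(n)]

# One-norm of A: the largest absolute column sum.
norm_a = max(sum(abs(a[i][j]) for i in range(n)) for j in range(n))

inverse_norm_estimate = 4.0  # value quoted in the example text

rcond = 1.0 / (norm_a * inverse_norm_estimate)
print(norm_a, rcond)
```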

Call Statement and Input

             AP   N  IOPT  RCOND   DET   AUX  NAUX
             |    |   |      |      |     |    |
CALL DPPFCD( AP , 9 , 3  , RCOND , DET , AUX , 9  )

AP       =(same as input AP in
"Example 1")

Output

AP       =(same as output AP in
"Example 1")
RCOND    =  0.0055555
DET      =  (1.0, 0.0)

Example 2

This example computes the factorization, reciprocal of the condition number, and determinant of matrix A. The input is the same as used in "Example 3" for SPOF.

The estimate of the reciprocal of the condition number is based on the following values:

||A||₁ = max(9.0, 17.0, 24.0, 30.0, 35.0, 39.0, 42.0, 44.0, 45.0) = 45.0

Estimate of ||A^-1||₁ = 4.0

On output, the value in det, |A|, is equal to 1.

Call Statement and Input

             UPLO A  LDA  N IOPT  RCOND   DET   AUX  NAUX
              |   |   |   |   |     |      |     |    |
CALL SPOFCD( 'L', A , 9 , 9 , 3 , RCOND , DET , AUX , 9  )

A        =(same as input A in
"Example 3")

Output

A        =(same as output A in
"Example 3")
RCOND    =  0.0055555
DET      =  (1.0, 0.0)

Example 3

This example computes the factorization, reciprocal of the condition number, and determinant of matrix A. The input is the same as used in "Example 4" for SPOF.

The estimate of the reciprocal of the condition number is based on the following values:

||A||₁ = max(9.0, 17.0, 24.0, 30.0, 35.0, 39.0, 42.0, 44.0, 45.0) = 45.0

Estimate of ||A^-1||₁ = 4.0

On output, the value in det, |A|, is equal to 1.

Call Statement and Input

             UPLO A  LDA  N IOPT  RCOND   DET   AUX  NAUX
              |   |   |   |   |     |      |     |    |
CALL SPOFCD( 'U', A , 9 , 9 , 3 , RCOND , DET , AUX , 9  )

A        =(same as input A in
"Example 4")

Output

A        =(same as output A in
"Example 4")
RCOND    =  0.0055555
DET      =  (1.0, 0.0)

SGEICD and DGEICD--General Matrix Inverse, Condition Number Reciprocal, and Determinant

These subroutines find the inverse, the reciprocal of the condition number, and the determinant of matrix A.

Table 97. Data Types

A, aux, rcond, det Subroutine
Short-precision real SGEICD
Long-precision real DGEICD

Note: If you call these subroutines with iopt = 4, the input must be the output from the factorization subroutines SGEF/SGEFCD/SGETRF or DGEF/DGEFCD/DGEFP/DGETRF, respectively.

Syntax

Fortran	CALL SGEICD | DGEICD (`a`, `lda`, `n`, `iopt`, `rcond`, `det`, `aux`, `naux`)
C and C++	sgeicd | dgeicd (`a`, `lda`, `n`, `iopt`, `rcond`, `det`, `aux`, `naux`);
PL/I	CALL SGEICD | DGEICD (`a`, `lda`, `n`, `iopt`, `rcond`, `det`, `aux`, `naux`);

On Entry

a

has the following meaning, where:

If iopt = 0, 1, 2, or 3, it is matrix A of order n, whose inverse, reciprocal of condition number, and determinant are computed.

If iopt = 4, it is the transformed matrix A of order n, resulting from the factorization performed in a previous call to SGEF/SGEFCD or DGEF/DGEFCD/DGEFP, respectively, whose inverse is computed.

Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 97.

lda

is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= n.

n

is the order of matrix A. Specified as: a fullword integer; 0 <= n <= lda.

iopt

indicates the type of computation to be performed, where:

If iopt = 0, the inverse is computed for matrix A.

If iopt = 1, the inverse and the reciprocal of the condition number are computed for matrix A.

If iopt = 2, the inverse and the determinant are computed for matrix A.

If iopt = 3, the inverse, the reciprocal of the condition number, and the determinant are computed for matrix A.

If iopt = 4, the inverse is computed using the factored matrix A.

Specified as: a fullword integer; iopt = 0, 1, 2, 3, or 4.

rcond

See 'On Return'.

det

See 'On Return'.

aux

has the following meaning, and its size is specified by naux:

If iopt = 0, 1, 2, or 3, then if naux = 0 and error 2015 is unrecoverable, aux is ignored. Otherwise, it is the storage work area used by this subroutine.

If iopt = 4, aux has the following meaning:

For SGEICD, the first n locations in aux must contain the ipvt integer vector of length n, resulting from a previous call to SGEF, SGETRF, or SGEFCD.
For DGEICD, the first ceiling(n/2) locations in aux must contain the ipvt integer vector of length n, resulting from a previous call to DGEF, DGETRF, DGEFCD, or DGEFP.

Specified as: an area of storage, containing numbers of the data type indicated in Table 97.

naux

is the size of the work area specified by aux--that is, the number of elements in aux. Specified as: a fullword integer, where:

If iopt <> 4, then if naux = 0 and error 2015 is unrecoverable, SGEICD and DGEICD dynamically allocate the work area used by the subroutine. The work area is deallocated before control is returned to the calling program.

Otherwise naux must have the following value:

For the RS/6000 POWER or PowerPC processors, naux >= 100n.

For the RS/6000 POWER2 processors, naux >= 200n.
Note: naux values specified for releases prior to ESSL Version 2 Release 2 will still work, but you may not achieve optimal performance.

On Return

a

is the resulting inverse of matrix A of order n. Returned as: an lda by (at least) n array, containing numbers of the data type indicated in Table 97.

rcond

is the reciprocal of the condition number, rcond, of matrix A. Returned as: a real number of the data type indicated in Table 97; rcond >= 0.

det

is the vector det, containing the two components det₁ and det₂ of the determinant of matrix A. The determinant is:

det₁ × 10^det₂

where 1 <= det₁ < 10. Returned as: an array of length 2, containing numbers of the data type indicated in Table 97.

Notes

In your C program, argument rcond must be passed by reference.
If iopt = 4, the following input arguments for SGEICD and DGEICD must be set to the same values as the corresponding arguments in the previous call to SGEF/SGEFCD or DGEF/DGEFCD/DGEFP, respectively:

For _GEF_	For _GEICD
Input arguments `n` and `lda`	Input arguments `n` and `lda`
Output arguments `a` and `ipvt`	Input arguments `a` and `aux`

You have the option of having the value for naux dynamically returned to your program. For details, see "Using Auxiliary Storage in ESSL".

Function

The inverse, the reciprocal of the condition number, and the determinant of a general square matrix A are computed using partial pivoting to preserve accuracy, where:

A^-1 is the inverse of matrix A, where AA^-1 = A^-1A = I, and I is the identity matrix.
1/(||A||₁)(||A^-1||₁) is the reciprocal of the condition number, where ||A||₁ is the one-norm of matrix A.
|A| is the determinant of matrix A, where |A| is expressed as det₁ × 10^det₂, and det₁ and det₂ are the two components returned in det.

The iopt argument is used to determine the combination of output items produced by this subroutine: the inverse, the reciprocal of the condition number, and the determinant.

If n is 0, no computation is performed. See references [36], [38], and [44].
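
The three quantities described above can be sketched in plain Python with Gauss-Jordan elimination and partial pivoting. This is an illustration, not the ESSL algorithm: the sketch forms the inverse explicitly and computes the reciprocal of the condition number from the exact one-norms. The matrix is the input of "Example 1" for these subroutines; function names are illustrative.

```python
# Sketch of inverse, reciprocal condition number, and determinant of a general
# matrix by Gauss-Jordan elimination with partial pivoting. Plain Python, not ESSL.


def one_norm(m):
    """Largest absolute column sum."""
    n = len(m)
    return max(sum(abs(m[i][j]) for i in range(n)) for j in range(n))


def inverse_and_det(a):
    """Return (inverse, determinant) of a square matrix given as a list of rows."""
    n = len(a)
    # Augment A with the identity: [A | I] is reduced to [I | A^-1].
    aug = [[float(v) for v in row] + [float(i == j) for j in range(n)]
           for i, row in enumerate(a)]
    det = 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))  # partial pivoting
        if aug[p][c] == 0.0:
            raise ValueError("matrix is singular")
        if p != c:
            aug[c], aug[p] = aug[p], aug[c]
            det = -det  # a row interchange flips the sign of the determinant
        pivot = aug[c][c]
        det *= pivot
        aug[c] = [v / pivot for v in aug[c]]
        for r in range(n):
            if r != c:
                f = aug[r][c]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[c])]
    return [row[n:] for row in aug], det


a = [
    [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0],
    [4.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 5.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 6.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.0],
    [0.0, 0.0, 0.0, 7.0, 1.0, 1.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0, 8.0, 1.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 9.0, 1.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 10.0, 11.0, 12.0],
]

inv, det = inverse_and_det(a)
norm_inv = one_norm(inv)
rcond = 1.0 / (one_norm(a) * norm_inv)
print(one_norm(a), norm_inv, rcond, det)
```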

Error Conditions

Resource Errors

If iopt = 0, 1, 2, or 3, then error 2015 is unrecoverable, naux = 0, and unable to allocate work area.

Computational Errors

Matrix A is singular or nearly singular.

The index i of the first pivot element having a value equal to 0 is identified in the computational error message.
These subroutines return 0 for rcond and det, if you requested them.
The return code is set to 2.
i can be determined at run time by use of the ESSL error-handling facilities. To obtain this information, you must use ERRSET to change the number of allowable errors for error code 2105 in the ESSL error option table; otherwise, the default value causes your program to terminate when this error occurs. For details, see "What Can You Do about ESSL Computational Errors?".

Input-Argument Errors

lda <= 0
n < 0
n > lda
iopt <> 0, 1, 2, 3, or 4
Error 2015 is recoverable or naux<>0, and naux is too small--that is, less than the minimum required value. Return code 1 is returned if error 2015 is recoverable.

Example 1

This example computes the inverse, the reciprocal of the condition number, and the determinant of matrix A. The reciprocal of the condition number in this example is computed from the following values:

||A||₁ = max(6.0, 8.0, 10.0, 12.0, 13.0, 14.0, 15.0, 15.0, 15.0) = 15.0

||A^-1||₁ = 1226.33

On output, the value in det, |A|, is equal to 336.

Call Statement and Input

             A  LDA  N  IOPT  RCOND   DET   AUX   NAUX
             |   |   |   |      |      |     |     |
CALL DGEICD( A , 9 , 9 , 3  , RCOND , DET , AUX , 293 )
 
        *                                                *
        | 1.0  1.0  1.0  1.0  0.0  0.0   0.0   0.0   0.0 |
        | 1.0  1.0  1.0  1.0  1.0  0.0   0.0   0.0   0.0 |
        | 4.0  1.0  1.0  1.0  1.0  1.0   0.0   0.0   0.0 |
        | 0.0  5.0  1.0  1.0  1.0  1.0   1.0   0.0   0.0 |
A    =  | 0.0  0.0  6.0  1.0  1.0  1.0   1.0   1.0   0.0 |
        | 0.0  0.0  0.0  7.0  1.0  1.0   1.0   1.0   1.0 |
        | 0.0  0.0  0.0  0.0  8.0  1.0   1.0   1.0   1.0 |
        | 0.0  0.0  0.0  0.0  0.0  9.0   1.0   1.0   1.0 |
        | 0.0  0.0  0.0  0.0  0.0  0.0  10.0  11.0  12.0 |
        *                                                *

Output

        *                                                                      *
        |    0.333   -0.667   0.333  0.000  0.000  0.000   0.042 -0.042  0.000 |
        |   56.833  -52.167  -1.167 -0.500 -0.500 -0.357   6.836 -0.479 -0.500 |
        |  -55.167   51.833   0.833  0.500  0.500  0.214  -6.735  0.521  0.500 |
        |   -1.000    1.000   0.000  0.000  0.000  0.143  -0.143  0.000  0.000 |
A    =  |   -1.000    1.000   0.000  0.000  0.000  0.000   0.000  0.000  0.000 |
        |   -1.000    1.000   0.000  0.000  0.000  0.000  -0.125  0.125  0.000 |
        | -226.000  206.000   5.000  3.000  2.000  1.429 -27.179  1.750  2.000 |
        |  560.000 -520.000 -10.000 -6.000 -4.000 -2.857  67.857 -5.000 -5.000 |
        | -325.000  305.000   5.000  3.000  2.000  1.429 -39.554  3.125  3.000 |
        *                                                                      *
 
RCOND    =  0.00005436
DET      =  (3.36, 2.00)

Example 2

This example computes the inverse of matrix A, where iopt = 4 and matrix A is the transformed matrix factored in "Example 1" by SGEF. The input contents of AUX, shown here, are the same as the output contents of IPVT in that example.

Call Statement and Input

             A  LDA  N  IOPT  RCOND   DET   AUX   NAUX
             |   |   |   |      |      |     |     |
CALL SGEICD( A , 9 , 9 , 4  , RCOND , DET , AUX , 300 )

A        =  (same as output A in "Example 1")
AUX      =  (3, 4, 5, 6, 7, 8, 9, 8, 9)

Output

        *                                                                      *
        |    0.333   -0.667   0.333  0.000  0.000  0.000   0.042 -0.042  0.000 |
        |   56.833  -52.167  -1.167 -0.500 -0.500 -0.357   6.836 -0.479 -0.500 |
        |  -55.167   51.833   0.833  0.500  0.500  0.214  -6.735  0.521  0.500 |
        |   -1.000    1.000   0.000  0.000  0.000  0.143  -0.143  0.000  0.000 |
A    =  |   -1.000    1.000   0.000  0.000  0.000  0.000   0.000  0.000  0.000 |
        |   -1.000    1.000   0.000  0.000  0.000  0.000  -0.125  0.125  0.000 |
        | -226.000  206.000   5.000  3.000  2.000  1.429 -27.179  1.750  2.000 |
        |  560.000 -520.000 -10.000 -6.000 -4.000 -2.857  67.857 -5.000 -5.000 |
        | -325.000  305.000   5.000  3.000  2.000  1.429 -39.554  3.125  3.000 |
        *                                                                      *

SPPICD, DPPICD, SPOICD, and DPOICD--Positive Definite Real Symmetric Matrix Inverse, Condition Number Reciprocal, and Determinant

These subroutines find the inverse, the reciprocal of the condition number, and the determinant of positive definite symmetric matrix A using Cholesky factorization, where:

For SPPICD and DPPICD, A is stored in lower-packed storage mode.
For SPOICD and DPOICD, A is stored in upper or lower storage mode.

Table 98. Data Types

A, aux, rcond, det	Subroutine
Short-precision real	SPPICD and SPOICD
Long-precision real	DPPICD and DPOICD

Note: If you call these subroutines with iopt = 4, the input must be the output from the factorization subroutines SPPF, DPPF, SPOF/SPOFCD, or DPOF/DPOFCD, respectively, where Cholesky factorization was performed.

Syntax

Fortran	CALL SPPICD | DPPICD (`ap`, `n`, `iopt`, `rcond`, `det`, `aux`, `naux`) CALL SPOICD | DPOICD (`uplo`, `a`, `lda`, `n`, `iopt`, `rcond`, `det`, `aux`, `naux`)
C and C++	sppicd | dppicd (`ap`, `n`, `iopt`, `rcond`, `det`, `aux`, `naux`); spoicd | dpoicd (`uplo`, `a`, `lda`, `n`, `iopt`, `rcond`, `det`, `aux`, `naux`);
PL/I	CALL SPPICD | DPPICD (`ap`, `n`, `iopt`, `rcond`, `det`, `aux`, `naux`); CALL SPOICD | DPOICD (`uplo`, `a`, `lda`, `n`, `iopt`, `rcond`, `det`, `aux`, `naux`);

On Entry

uplo

indicates whether matrix A is stored in upper or lower storage mode, where:

If uplo = 'U', A is stored in upper storage mode.

If uplo = 'L', A is stored in lower storage mode.

Specified as: a single character. It must be 'U' or 'L'.

ap

is the array, referred to as AP, where:

If iopt = 0, 1, 2, or 3, then AP contains the positive definite real symmetric matrix A, whose inverse, condition number reciprocal, and determinant are computed, where matrix A is stored in lower-packed storage mode.

If iopt = 4, then AP contains the transformed matrix A of order n, resulting from the Cholesky factorization performed in a previous call to SPPF or DPPF, respectively, whose inverse is computed.

Specified as: a one-dimensional array of (at least) length n(n+1)/2, containing numbers of the data type indicated in Table 98.

a

has the following meaning, where:

If iopt = 0, 1, 2, or 3, it is the positive definite real symmetric matrix A, whose inverse, condition number reciprocal, and determinant are computed, where matrix A is stored in upper or lower storage mode.

If iopt = 4, it is the transformed matrix A of order n, containing results of the factorization from a previous call to SPOF/SPOFCD or DPOF/DPOFCD, respectively, whose inverse is computed.

Specified as: an n by (at least) n array, containing numbers of the data type indicated in Table 98.

lda

is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= n.

n

is the order n of matrix A. Specified as: a fullword integer; n >= 0.

iopt

indicates the type of computation to be performed, where:

If iopt = 0, the inverse is computed for matrix A.

If iopt = 1, the inverse and the reciprocal of the condition number are computed for matrix A.

If iopt = 2, the inverse and the determinant are computed for matrix A.

If iopt = 3, the inverse, the reciprocal of the condition number, and the determinant are computed for matrix A.

If iopt = 4, the inverse is computed for the (Cholesky) factored matrix A.

Specified as: a fullword integer; iopt = 0, 1, 2, 3, or 4.

rcond

See 'On Return'.

det

See 'On Return'.

aux

has the following meaning:

If naux = 0 and error 2015 is unrecoverable, aux is ignored.

Otherwise, it is the storage work area used by this subroutine. Its size is specified by naux. Specified as: an area of storage, containing numbers of the data type indicated in Table 98.

naux

is the size of the work area specified by aux--that is, the number of elements in aux. Specified as: a fullword integer, where:

If naux = 0 and error 2015 is unrecoverable, SPPICD, DPPICD, SPOICD, and DPOICD dynamically allocate the work area used by the subroutine. The work area is deallocated before control is returned to the calling program.

Otherwise, naux >= n.

On Return

ap

is the resulting array, referred to as AP, containing the inverse of the matrix in lower-packed storage mode. Returned as: a one-dimensional array of (at least) length n(n+1)/2, containing numbers of the data type indicated in Table 98.

a

is the transformed matrix A of order n, containing the inverse of the matrix in upper or lower storage mode. Returned as: a two-dimensional array, containing numbers of the data type indicated in Table 98.

rcond

is the reciprocal of the condition number, rcond, of matrix A. Returned as: a real number of the data type indicated in Table 98; rcond >= 0.

det

is the vector det, containing the two components det₁ and det₂ of the determinant of matrix A. The determinant is:

|A| = det₁ × 10^(det₂)

where 1 <= det₁ < 10. Returned as: an array of length 2, containing numbers of the data type indicated in Table 98.
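
As an illustration of the two-component form of det, the following Python sketch converts between a determinant value and the pair (det₁, det₂). It is not part of ESSL; the function names combine_det and split_det are hypothetical, and the sketch assumes a positive determinant, as is the case for a positive definite matrix.

```python
import math

def combine_det(det):
    """Return the determinant value encoded by the pair (det1, det2)."""
    det1, det2 = det
    return det1 * 10.0 ** det2

def split_det(value):
    """Split a positive determinant into (det1, det2) with 1 <= det1 < 10."""
    if value <= 0.0:
        raise ValueError("a positive definite matrix has a positive determinant")
    det2 = math.floor(math.log10(value))
    det1 = value / 10.0 ** det2
    # Guard against rounding in log10 pushing det1 just outside [1, 10).
    while det1 >= 10.0:
        det1 /= 10.0
        det2 += 1
    while det1 < 1.0:
        det1 *= 10.0
        det2 -= 1
    return det1, float(det2)

# DET = (1.0, 0.0), as returned in the SPPICD examples, encodes |A| = 1.
value = combine_det((1.0, 0.0))
```

Keeping the mantissa and exponent separate lets the determinant of a large matrix be reported even when the value itself would overflow or underflow the floating-point range.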

Notes

For these subroutines, when you specify iopt = 4, you must do the following:
- For SPPICD and DPPICD, use Cholesky factorization in the previous call to SPPF and DPPF, respectively.
- For SPOICD and DPOICD, specify the same storage mode for matrix A that was specified in the previous call to SPOF/SPOFCD and DPOF/DPOFCD, respectively.
- The scalar data specified for input arguments uplo, lda, and n for these subroutines must be the same as the corresponding input arguments specified for SPOF/SPOFCD and DPOF/DPOFCD, respectively.
All subroutines accept lowercase letters for the uplo argument.
In your C program, argument rcond must be passed by reference.
For a description of how a positive definite symmetric matrix is stored in lower-packed storage mode in an array, see "Symmetric Matrix". For a description of how a positive definite symmetric matrix is stored in upper or lower storage mode, see "Positive Definite or Negative Definite Symmetric Matrix".
You have the option of having the minimum required value for naux dynamically returned to your program. For details, see "Using Auxiliary Storage in ESSL".

Function

These subroutines find the inverse, the reciprocal of the condition number, and the determinant of positive definite symmetric matrix A using Cholesky factorization, where:

A^-1 is the inverse of matrix A, where AA^-1 = A^-1A = I, and I is the identity matrix.
1/(||A||₁ ||A^-1||₁) is the reciprocal of the condition number, where ||A||₁ is the one-norm of matrix A.
|A| is the determinant of matrix A, where |A| is expressed as |A| = det₁ × 10^(det₂), with 1 <= det₁ < 10.

The iopt argument is used to determine the combination of output items produced by this subroutine: the inverse, the reciprocal of the condition number, and the determinant.

If n is 0, no computation is performed. See references [36], [38], and [44].
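
The three quantities described above can be sketched in plain Python as follows. This is a reference illustration only, not the ESSL implementation: the function names are hypothetical, the matrix is held as a full list of rows rather than in packed or upper/lower storage mode, and the reciprocal of the condition number is evaluated directly from its definition using the computed inverse.

```python
import math

def cholesky_lower(a):
    """Return L with A = L L^T; raise ValueError if A is not positive definite."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    for j in range(n):
        d = a[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            raise ValueError(f"leading minor of order {j + 1} is not positive")
        l[j][j] = math.sqrt(d)
        for i in range(j + 1, n):
            s = sum(l[i][k] * l[j][k] for k in range(j))
            l[i][j] = (a[i][j] - s) / l[j][j]
    return l

def one_norm(m):
    """Maximum absolute column sum."""
    n = len(m)
    return max(sum(abs(m[i][j]) for i in range(n)) for j in range(n))

def inverse_rcond_det(a):
    """Inverse, reciprocal condition number, and determinant via Cholesky."""
    n = len(a)
    l = cholesky_lower(a)
    # Invert L by forward substitution, one column of the identity at a time.
    linv = [[0.0] * n for _ in range(n)]
    for j in range(n):
        linv[j][j] = 1.0 / l[j][j]
        for i in range(j + 1, n):
            s = sum(l[i][k] * linv[k][j] for k in range(j, i))
            linv[i][j] = -s / l[i][i]
    # A^-1 = L^-T L^-1; only rows k >= max(i, j) of L^-1 contribute.
    ainv = [[sum(linv[k][i] * linv[k][j] for k in range(max(i, j), n))
             for j in range(n)] for i in range(n)]
    rcond = 1.0 / (one_norm(a) * one_norm(ainv))
    det = math.prod(l[i][i] for i in range(n)) ** 2
    return ainv, rcond, det

# The 9 by 9 matrix used in the examples: element (i, j) is min(i, j), 1-based.
A = [[float(min(i, j) + 1) for j in range(9)] for i in range(9)]
AINV, RCOND, DET = inverse_rcond_det(A)
```

For this matrix the Cholesky factor is the lower triangular matrix of ones, so the sketch reproduces the tridiagonal inverse, RCOND = 1/180, and |A| = 1 reported in the examples.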

Error Conditions

Resource Errors

Error 2015 is unrecoverable, naux = 0, and unable to allocate work area.
Unable to allocate internal work area.

Computational Errors

Matrix A is not positive definite.

These subroutines do not perform the inverse, determinant, and reciprocal of the condition number computations.
For iopt = 1, 2, or 3, the leading minor of order i has a nonpositive determinant. The order i is identified in the computational error message, issued by SPPF, DPPF, SPOF, or DPOF, respectively.
For iopt = 4 for SPPICD and DPPICD, if the Cholesky factorization performed by SPPF or DPPF, respectively, failed due to a nonpositive definite matrix A, the results from STPI or DTPI, respectively, are unpredictable, and a computational error message may be issued.
For iopt = 4 for SPOICD and DPOICD, if the factorization performed by SPOF/SPOFCD or DPOF/DPOFCD, respectively, failed due to a nonpositive definite matrix A, the results from STRI or DTRI, respectively, are unpredictable, and a computational error message may be issued.
i can be determined at run time by using the ESSL error-handling facilities. To obtain this information, you must use ERRSET to change the number of allowable errors for error code 2115 in the ESSL error option table; otherwise, the default value causes your program to be terminated by SPPF, DPPF, SPOF, or DPOF, respectively, when this error occurs. If your program is not terminated by SPPF, DPPF, SPOF, or DPOF, respectively, the return code is set to 2. For details, see "What Can You Do about ESSL Computational Errors?".

Input-Argument Errors

uplo <> 'U' or 'L'
n < 0
lda <= 0
lda < n
iopt <> 0, 1, 2, 3, or 4
Error 2015 is recoverable or naux <> 0, and naux is too small--that is, less than the minimum required value. Return code 1 is returned if error 2015 is recoverable.

Example 1

This example uses SPPICD to compute the inverse, reciprocal of the condition number, and determinant of matrix A, where A is:

             *                                             *
             | 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0 |
             | 1.0  2.0  2.0  2.0  2.0  2.0  2.0  2.0  2.0 |
             | 1.0  2.0  3.0  3.0  3.0  3.0  3.0  3.0  3.0 |
             | 1.0  2.0  3.0  4.0  4.0  4.0  4.0  4.0  4.0 |
             | 1.0  2.0  3.0  4.0  5.0  5.0  5.0  5.0  5.0 |
             | 1.0  2.0  3.0  4.0  5.0  6.0  6.0  6.0  6.0 |
             | 1.0  2.0  3.0  4.0  5.0  6.0  7.0  7.0  7.0 |
             | 1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0  8.0 |
             | 1.0  2.0  3.0  4.0  5.0  6.0  7.0  8.0  9.0 |
             *                                             *

The reciprocal of the condition number in this example is computed from the following values:

||A||₁ = max(9.0, 17.0, 24.0, 30.0, 35.0, 39.0, 42.0, 44.0, 45.0) = 45.0

||A^-1||₁ = 4.0

On output, the value in det, |A|, is equal to 1, and RCOND = 1/180.
Note: The AP arrays are formatted in a triangular arrangement for readability; however, they are stored in lower-packed storage mode.

Call Statement and Input

             AP   N  IOPT RCOND   DET   AUX  NAUX
             |    |   |     |      |     |    |
CALL SPPICD( AP , 9 , 3 , RCOND , DET , AUX , 9  )
 
AP   =   (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
          2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0,
          3.0, 3.0, 3.0, 3.0, 3.0, 3.0, 3.0,
          4.0, 4.0, 4.0, 4.0, 4.0, 4.0,
          5.0, 5.0, 5.0, 5.0, 5.0,
          6.0, 6.0, 6.0, 6.0,
          7.0, 7.0, 7.0,
          8.0, 8.0,
          9.0)

Output

AP   =   (2.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
          2.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
          2.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0,
          2.0, -1.0, 0.0, 0.0, 0.0, 0.0,
          2.0, -1.0, 0.0, 0.0, 0.0,
          2.0, -1.0, 0.0, 0.0,
          2.0, -1.0, 0.0,
          2.0, -1.0,
          1.0)
 
RCOND    =  0.005556
DET      =  (1.0, 0.0)
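
The lower-packed layout of AP used in this example can be sketched in Python as follows. The helper names pack_lower and packed_index are hypothetical and the indices are 0-based, unlike the 1-based Fortran arrays; the sketch only illustrates the column-by-column ordering of the lower triangle.

```python
def pack_lower(a):
    """Pack the lower triangle of a square matrix column by column."""
    n = len(a)
    return [a[i][j] for j in range(n) for i in range(j, n)]

def packed_index(i, j, n):
    """0-based position in the packed array of element (i, j), with i >= j."""
    return j * n - j * (j - 1) // 2 + (i - j)

# The 9 by 9 input matrix of this example: element (i, j) is min(i, j), 1-based.
A = [[float(min(i, j) + 1) for j in range(9)] for i in range(9)]
AP = pack_lower(A)  # n(n+1)/2 = 45 elements
```

Each column contributes one element fewer than the previous one, which is why the AP listings above narrow from nine values to one.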

Example 2

This example uses SPPICD to compute the inverse of matrix A, where iopt = 4, and matrix A is the transformed matrix factored in "Example 2" by SPPF.
Note: The AP arrays are formatted in a triangular arrangement for readability; however, they are stored in lower-packed storage mode.

Call Statement and Input

            AP   N  IOPT RCOND   DET   AUX NAUX
            |    |   |     |      |     |    |
CALL SPPICD(AP , 9 , 4 , RCOND , DET , AUX , 9)

AP       =  (same as output AP in "Example 2" for SPPF)

Output

AP   =   (2.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
          2.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
          2.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0,
          2.0, -1.0, 0.0, 0.0, 0.0, 0.0,
          2.0, -1.0, 0.0, 0.0, 0.0,
          2.0, -1.0, 0.0, 0.0,
          2.0, -1.0, 0.0,
          2.0, -1.0,
          1.0)

Example 3

This example uses SPOICD to compute the inverse, reciprocal of the condition number, and determinant of the same matrix A used in Example 1; however, matrix A is stored in upper storage mode in this example.

The reciprocal of the condition number in this example is computed from the following values:

||A||₁ = max(9.0, 17.0, 24.0, 30.0, 35.0, 39.0, 42.0, 44.0, 45.0) = 45.0

||A^-1||₁ = 4.0

On output, the value in det, |A|, is equal to 1, and RCOND = 1/180.

Call Statement and Input

             UPLO  A  LDA   N  IOPT RCOND   DET   AUX  NAUX
              |    |   |    |   |     |      |     |    |
CALL SPOICD( 'U' , A , 9 ,  9 , 3 , RCOND , DET , AUX , 9  )
 
        *                                             *
        | 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0 |
        |  .   2.0  2.0  2.0  2.0  2.0  2.0  2.0  2.0 |
        |  .    .   3.0  3.0  3.0  3.0  3.0  3.0  3.0 |
        |  .    .    .   4.0  4.0  4.0  4.0  4.0  4.0 |
A    =  |  .    .    .    .   5.0  5.0  5.0  5.0  5.0 |
        |  .    .    .    .    .   6.0  6.0  6.0  6.0 |
        |  .    .    .    .    .    .   7.0  7.0  7.0 |
        |  .    .    .    .    .    .    .   8.0  8.0 |
        |  .    .    .    .    .    .    .    .   9.0 |
        *                                             *

Output

        *                                                     *
        | 2.0  -1.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 |
        |  .    2.0  -1.0   0.0   0.0   0.0   0.0   0.0   0.0 |
        |  .     .    2.0  -1.0   0.0   0.0   0.0   0.0   0.0 |
        |  .     .     .    2.0  -1.0   0.0   0.0   0.0   0.0 |
A    =  |  .     .     .     .    2.0  -1.0   0.0   0.0   0.0 |
        |  .     .     .     .     .    2.0  -1.0   0.0   0.0 |
        |  .     .     .     .     .     .    2.0  -1.0   0.0 |
        |  .     .     .     .     .     .     .    2.0  -1.0 |
        |  .     .     .     .     .     .     .     .    1.0 |
        *                                                     *
 
RCOND    =  0.005555556
DET      =  (1.0, 0.0)

Example 4

This example uses SPOICD to compute the inverse of matrix A, where iopt = 4, and matrix A is the transformed matrix factored in "Example 1" by SPOF.

Call Statement and Input

             UPLO  A  LDA   N  IOPT RCOND   DET   AUX  NAUX
              |    |   |    |   |     |      |     |    |
CALL SPOICD( 'U' , A , 9 ,  9 , 4 , RCOND , DET , AUX , 9  )

A        =  (same as output A in "Example 4" for SPOF)

Output

        *                                                     *
        | 2.0  -1.0   0.0   0.0   0.0   0.0   0.0   0.0   0.0 |
        |  .    2.0  -1.0   0.0   0.0   0.0   0.0   0.0   0.0 |
        |  .     .    2.0  -1.0   0.0   0.0   0.0   0.0   0.0 |
        |  .     .     .    2.0  -1.0   0.0   0.0   0.0   0.0 |
A    =  |  .     .     .     .    2.0  -1.0   0.0   0.0   0.0 |
        |  .     .     .     .     .    2.0  -1.0   0.0   0.0 |
        |  .     .     .     .     .     .    2.0  -1.0   0.0 |
        |  .     .     .     .     .     .     .    2.0  -1.0 |
        |  .     .     .     .     .     .     .     .    1.0 |
        *                                                     *

STRSV, DTRSV, CTRSV, ZTRSV, STPSV, DTPSV, CTPSV, and ZTPSV--Solution of a Triangular System of Equations with a Single Right-Hand Side

STRSV, DTRSV, STPSV, and DTPSV perform one of the following solves for a triangular system of equations with a single right-hand side, using the vector x and triangular matrix A or its transpose:

Solution	Equation
1. x <-- A^-1x	Ax = b
2. x <-- A^-Tx	A^Tx = b

CTRSV, ZTRSV, CTPSV, and ZTPSV perform one of the following solves for a triangular system of equations with a single right-hand side, using the vector x and triangular matrix A, its transpose, or its conjugate transpose:

Solution	Equation
1. x <-- A^-1x	Ax = b
2. x <-- A^-Tx	A^Tx = b
3. x <-- A^-Hx	A^Hx = b

Matrix A can be either upper or lower triangular, where:

For the _TRSV subroutines, it is stored in upper- or lower-triangular storage mode, respectively.
For the _TPSV subroutines, it is stored in upper- or lower-triangular-packed storage mode, respectively.

Note: The term b used in the systems of equations listed above represents the right-hand side of the system. It is important to note that in these subroutines the right-hand side of the equation is actually provided in the input-output argument x.

Table 99. Data Types

A, x	Subroutine
Short-precision real	STRSV and STPSV
Long-precision real	DTRSV and DTPSV
Short-precision complex	CTRSV and CTPSV
Long-precision complex	ZTRSV and ZTPSV

Syntax

Fortran	CALL STRSV | DTRSV | CTRSV | ZTRSV (`uplo`, `transa`, `diag`, `n`, `a`, `lda`, `x`, `incx`) CALL STPSV | DTPSV | CTPSV | ZTPSV (`uplo`, `transa`, `diag`, `n`, `ap`, `x`, `incx`)
C and C++	strsv | dtrsv | ctrsv | ztrsv (`uplo`, `transa`, `diag`, `n`, `a`, `lda`, `x`, `incx`); stpsv | dtpsv | ctpsv | ztpsv (`uplo`, `transa`, `diag`, `n`, `ap`, `x`, `incx`);
PL/I	CALL STRSV | DTRSV | CTRSV | ZTRSV (`uplo`, `transa`, `diag`, `n`, `a`, `lda`, `x`, `incx`); CALL STPSV | DTPSV | CTPSV | ZTPSV (`uplo`, `transa`, `diag`, `n`, `ap`, `x`, `incx`);

On Entry

uplo

indicates whether matrix A is an upper or lower triangular matrix, where:

If uplo = 'U', A is an upper triangular matrix.

If uplo = 'L', A is a lower triangular matrix.

Specified as: a single character. It must be 'U' or 'L'.

transa

indicates the form of matrix A used in the system of equations, where:

If transa = 'N', A is used, resulting in solution 1.

If transa = 'T', A^T is used, resulting in solution 2.

If transa = 'C', A^H is used, resulting in solution 3.

Specified as: a single character. It must be 'N', 'T', or 'C'.

diag

indicates the characteristics of the diagonal of matrix A, where:

If diag = 'U', A is a unit triangular matrix.

If diag = 'N', A is not a unit triangular matrix.

Specified as: a single character. It must be 'U' or 'N'.

n

is the order of triangular matrix A. Specified as: a fullword integer; n >= 0 and n <= lda.

a

is the upper or lower triangular matrix A of order n, stored in upper- or lower-triangular storage mode, respectively. Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 99.

lda

is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= n.

ap

is the upper or lower triangular matrix A of order n, stored in upper- or lower-triangular-packed storage mode, respectively. Specified as: a one-dimensional array of (at least) length n(n+1)/2, containing numbers of the data type indicated in Table 99.

x

is the vector x of length n, containing the right-hand side of the triangular system to be solved. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 99.

incx

is the stride for vector x. Specified as: a fullword integer; incx > 0 or incx < 0.

On Return

x

is the solution vector x of length n, containing the results of the computation. Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 99.

Notes

These subroutines accept lowercase letters for the uplo, transa, and diag arguments.
For STRSV, DTRSV, STPSV, and DTPSV, if you specify 'C' for the transa argument, it is interpreted as though you specified 'T'.
Matrix A and vector x must have no common elements; otherwise, results are unpredictable.
ESSL assumes certain values in your array for parts of a triangular matrix. As a result, you do not have to set these values. For unit diagonal matrices, the elements of the diagonal are assumed to be 1.0 for real matrices and (1.0, 0.0) for complex matrices. When using upper- or lower-triangular storage, the unreferenced elements in the lower and upper triangular part, respectively, are assumed to be zero.
For a description of triangular matrices and how they are stored in upper- and lower-triangular storage mode and in upper- and lower-triangular-packed storage mode, see "Triangular Matrix".

Function

These subroutines solve a triangular system of equations with a single right-hand side. The solution x may be any of the following, where triangular matrix A, its transpose, or its conjugate transpose is used, and where A can be either upper- or lower-triangular:

1. x <-- A^-1x

2. x <-- A^-Tx

3. x <-- A^-Hx (only for CTRSV, ZTRSV, CTPSV, and ZTPSV)

where:

x is a vector of length n.

A is an upper or lower triangular matrix of order n. For _TRSV, it is stored in upper- or lower-triangular storage mode, respectively. For _TPSV, it is stored in upper- or lower-triangular-packed storage mode, respectively.

If n is 0, no computation is performed. See references [32], [36], and [38].
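
The three solves listed above amount to forward or backward substitution, depending on whether the matrix actually applied is lower or upper triangular. The following Python sketch illustrates this for a matrix held as a list of rows. It is not the ESSL implementation: the function name trsv is a hypothetical stand-in, it returns a new list instead of overwriting x, and it assumes a stride of 1. Positions that the subroutines would not reference are set to None.

```python
def trsv(uplo, transa, diag, a, x):
    """Solve op(A) y = x for a triangular A and return y as a new list.

    uplo: 'U' or 'L'; transa: 'N', 'T', or 'C'; diag: 'U' or 'N'.
    Only the triangle named by uplo is read; with diag = 'U' the
    diagonal is taken to be 1 and is never read.
    """
    n = len(x)
    uplo, transa, diag = uplo.upper(), transa.upper(), diag.upper()

    def op(i, j):
        # Element (i, j) of op(A), read from the stored triangle of A.
        if transa == 'N':
            return a[i][j]
        v = a[j][i]
        return v.conjugate() if transa == 'C' else v

    # op(A) is lower triangular when (uplo == 'L') == (transa == 'N').
    lower = (uplo == 'L') == (transa == 'N')
    order = range(n) if lower else range(n - 1, -1, -1)
    y = list(x)
    for i in order:
        cols = range(i) if lower else range(i + 1, n)
        s = y[i] - sum(op(i, j) * y[j] for j in cols)
        y[i] = s if diag == 'U' else s / op(i, i)
    return y

# Complex upper unit triangular matrix with transa = 'C' (solution 3).
A = [[None, 2 + 2j, 3 + 3j, 2 + 2j],
     [None, None, 2 + 2j, 5 + 5j],
     [None, None, None, 3 + 3j],
     [None, None, None, None]]
X = trsv('U', 'C', 'U', A, [5 + 5j, 24 + 4j, 49 + 3j, 80 + 2j])
```

Transposing an upper triangular matrix gives a lower triangular one, so the 'T' and 'C' cases reuse the same loop with the traversal direction reversed and the element lookup swapped.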

Error Conditions

Computational Errors

None

Input-Argument Errors

uplo <> 'L' or 'U'
transa <> 'T', 'N', or 'C'
diag <> 'N' or 'U'
n < 0
lda <= 0
lda < n
incx = 0

Example 1

This example shows the solution x <-- A^-1x. Matrix A is a real 4 by 4 lower unit triangular matrix, stored in lower-triangular storage mode. Vector x is a vector of length 4.
Note: Because matrix A is unit triangular, the diagonal elements are not referenced. ESSL assumes a value of 1.0 for the diagonal elements.

Call Statement and Input

            UPLO TRANSA DIAG  N   A  LDA  X  INCX
             |     |     |    |   |   |   |   |
CALL STRSV( 'L' , 'N' , 'U' , 4 , A , 4 , X , 1  )
 
        *                  *
        |  .    .    .   . |
        | 1.0   .    .   . |
A    =  | 2.0  3.0   .   . |
        | 3.0  4.0  3.0  . |
        *                  *
 
X        =  (1.0, 3.0, 11.0, 24.0)

Output

X        =  (1.0, 2.0, 3.0, 4.0)

Example 2

This example shows the solution x <-- A^-Tx. Matrix A is a real 4 by 4 upper nonunit triangular matrix, stored in upper-triangular storage mode. Vector x is a vector of length 4.

Call Statement and Input

            UPLO TRANSA DIAG  N   A  LDA  X  INCX
             |     |     |    |   |   |   |   |
CALL STRSV( 'U' , 'T' , 'N' , 4 , A , 4 , X , 1  )
 
        *                    *
        | 1.0  2.0  3.0  2.0 |
A    =  |  .   2.0  2.0  5.0 |
        |  .    .   3.0  3.0 |
        |  .    .    .   1.0 |
        *                    *
 
X        =  (5.0, 18.0, 32.0, 41.0)

Output

X        =  (5.0, 4.0, 3.0, 2.0)

Example 3

This example shows the solution x <-- A^-Hx. Matrix A is a complex 4 by 4 upper unit triangular matrix, stored in upper-triangular storage mode. Vector x is a vector of length 4.
Note: Because matrix A is unit triangular, the diagonal elements are not referenced. ESSL assumes a value of (1.0, 0.0) for the diagonal elements.

Call Statement and Input

            UPLO TRANSA DIAG  N   A  LDA  X  INCX
             |     |     |    |   |   |   |   |
CALL CTRSV( 'U' , 'C' , 'U' , 4 , A , 4 , X , 1  )
 
        *                                     *
        | . (2.0, 2.0) (3.0,  3.0) (2.0, 2.0) |
A    =  | .     .      (2.0,  2.0) (5.0, 5.0) |
        | .     .          .       (3.0, 3.0) |
        | .     .          .           .      |
        *                                     *
 
X        =  ((5.0, 5.0), (24.0, 4.0), (49.0, 3.0), (80.0, 2.0))

Output

X        =  ((5.0, 5.0), (4.0, 4.0), (3.0, 3.0), (2.0, 2.0))

Example 4

This example shows the solution x <-- A^-1x. Matrix A is a real 4 by 4 lower unit triangular matrix, stored in lower-triangular-packed storage mode. Vector x is a vector of length 4. Matrix A is:

                          *                    *
                          | 1.0   .    .    .  |
                          | 1.0  1.0   .    .  |
                          | 2.0  3.0  1.0   .  |
                          | 3.0  4.0  3.0  1.0 |
                          *                    *

Note:

Because matrix A is unit triangular, the diagonal elements are not referenced. ESSL assumes a value of 1.0 for the diagonal elements.

Call Statement and Input

            UPLO TRANSA DIAG  N   AP   X  INCX
             |     |     |    |   |    |   |
CALL STPSV( 'L' , 'N' , 'U' , 4 , AP , X , 1  )
 
AP       =  ( . , 1.0, 2.0, 3.0, . , 3.0, 4.0, . , 3.0, . )
X        =  (1.0, 3.0, 11.0, 24.0)

Output

X        =  (1.0, 2.0, 3.0, 4.0)

Example 5

This example shows the solution x <-- A^-Tx. Matrix A is a real 4 by 4 upper nonunit triangular matrix, stored in upper-triangular-packed storage mode. Vector x is a vector of length 4. Matrix A is:

                         *                    *
                         | 1.0  2.0  3.0  2.0 |
                         |  .   2.0  2.0  5.0 |
                         |  .    .   3.0  3.0 |
                         |  .    .    .   1.0 |
                         *                    *

Call Statement and Input

            UPLO TRANSA DIAG  N   AP   X  INCX
             |     |     |    |   |    |   |
CALL STPSV( 'U' , 'T' , 'N' , 4 , AP , X , 1  )
 
AP       =  (1.0, 2.0, 2.0, 3.0, 2.0, 3.0, 2.0, 5.0, 3.0, 1.0)
X        =  (5.0, 18.0, 32.0, 41.0)

Output

X        =  (5.0, 4.0, 3.0, 2.0)
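
The same result can be reproduced by reading the elements of A directly from upper-triangular-packed storage. The following Python sketch is an illustration only, with hypothetical helper names and 0-based indices; it assumes a nonunit diagonal and a stride of 1.

```python
def upper_packed_index(i, j):
    """0-based position of element (i, j), i <= j, in upper-packed storage."""
    return i + j * (j + 1) // 2

def solve_upper_packed_transposed(ap, b):
    """Solve A^T y = b, reading nonunit upper triangular A from packed storage."""
    n = len(b)
    y = list(b)
    # A^T is lower triangular, so substitute forward from the first row.
    for i in range(n):
        s = sum(ap[upper_packed_index(j, i)] * y[j] for j in range(i))
        y[i] = (y[i] - s) / ap[upper_packed_index(i, i)]
    return y

AP = [1.0, 2.0, 2.0, 3.0, 2.0, 3.0, 2.0, 5.0, 3.0, 1.0]
X = solve_upper_packed_transposed(AP, [5.0, 18.0, 32.0, 41.0])
```

Element (j, i) of A supplies element (i, j) of A^T, so no transposed copy of the packed array is needed.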

Example 6

This example shows the solution x <-- A^-Hx. Matrix A is a complex 4 by 4 upper unit triangular matrix, stored in upper-triangular-packed storage mode. Vector x is a vector of length 4. Matrix A is:

                *                                                *
                | (1.0, 0.0)  (2.0, 2.0)  (3.0, 3.0)  (2.0, 2.0) |
                |     .       (1.0, 0.0)  (2.0, 2.0)  (5.0, 5.0) |
                |     .           .       (1.0, 0.0)  (3.0, 3.0) |
                |     .           .           .       (1.0, 0.0) |
                *                                                *

Note:

Because matrix A is unit triangular, the diagonal elements are not referenced. ESSL assumes a value of (1.0, 0.0) for the diagonal elements.

Call Statement and Input

            UPLO TRANSA DIAG  N   AP   X  INCX
             |     |     |    |   |    |   |
CALL CTPSV( 'U' , 'C' , 'U' , 4 , AP , X , 1  )
 
AP       =  ( . , (2.0, 2.0), . , (3.0, 3.0), (2.0, 2.0), . ,
             (2.0, 2.0), (5.0, 5.0), (3.0, 3.0), . )
X        =  ((5.0, 5.0), (24.0, 4.0), (49.0, 3.0), (80.0, 2.0))

Output

X        =  ((5.0, 5.0), (4.0, 4.0), (3.0, 3.0), (2.0, 2.0))

STRSM, DTRSM, CTRSM, and ZTRSM--Solution of Triangular Systems of Equations with Multiple Right-Hand Sides

STRSM and DTRSM perform one of the following solves for a triangular system of equations with multiple right-hand sides, using scalar alpha, rectangular matrix B, and triangular matrix A or its transpose:

Solution	Equation
1. B <-- alpha(A^-1)B	AX = alphaB
2. B <-- alpha(A^-T)B	A^TX = alphaB
3. B <-- alphaB(A^-1)	XA = alphaB
4. B <-- alphaB(A^-T)	XA^T = alphaB

CTRSM and ZTRSM perform one of the following solves for a triangular system of equations with multiple right-hand sides, using scalar alpha, rectangular matrix B, and triangular matrix A, its transpose, or its conjugate transpose:

Solution	Equation
1. B <-- alpha(A^-1)B	AX = alphaB
2. B <-- alpha(A^-T)B	A^TX = alphaB
3. B <-- alphaB(A^-1)	XA = alphaB
4. B <-- alphaB(A^-T)	XA^T = alphaB
5. B <-- alpha(A^-H)B	A^HX = alphaB
6. B <-- alphaB(A^-H)	XA^H = alphaB

Note: The term X used in the systems of equations listed above represents the output solution matrix. It is important to note that in these subroutines the solution matrix is actually returned in the input-output argument b.

Table 100. Data Types

A, B, alpha	Subroutine
Short-precision real	STRSM
Long-precision real	DTRSM
Short-precision complex	CTRSM
Long-precision complex	ZTRSM

Syntax

Fortran	CALL STRSM | DTRSM | CTRSM | ZTRSM (`side`, `uplo`, `transa`, `diag`, `m`, `n`, `alpha`, `a`, `lda`, `b`, `ldb`)
C and C++	strsm | dtrsm | ctrsm | ztrsm (`side`, `uplo`, `transa`, `diag`, `m`, `n`, `alpha`, `a`, `lda`, `b`, `ldb`);
PL/I	CALL STRSM | DTRSM | CTRSM | ZTRSM (`side`, `uplo`, `transa`, `diag`, `m`, `n`, `alpha`, `a`, `lda`, `b`, `ldb`);

On Entry

side

indicates whether the triangular matrix A is located to the left or right of rectangular matrix B in the system of equations, where:

If side = 'L', A is to the left of B, resulting in solution 1, 2, or 5.

If side = 'R', A is to the right of B, resulting in solution 3, 4, or 6.

Specified as: a single character. It must be 'L' or 'R'.

uplo

indicates whether matrix A is an upper or lower triangular matrix, where:

If uplo = 'U', A is an upper triangular matrix.

If uplo = 'L', A is a lower triangular matrix.

Specified as: a single character. It must be 'U' or 'L'.

transa

indicates the form of matrix A used in the system of equations, where:

If transa = 'N', A is used, resulting in solution 1 or 3.

If transa = 'T', A^T is used, resulting in solution 2 or 4.

If transa = 'C', A^H is used, resulting in solution 5 or 6.

Specified as: a single character. It must be 'N', 'T', or 'C'.

diag

indicates the characteristics of the diagonal of matrix A, where:

If diag = 'U', A is a unit triangular matrix.

If diag = 'N', A is not a unit triangular matrix.

Specified as: a single character. It must be 'U' or 'N'.

m

is the number of rows in rectangular matrix B, and:

If side = 'L', m is the order of triangular matrix A.

Specified as: a fullword integer, where:

If side = 'L', 0 <= m <= lda and m <= ldb.

If side = 'R', 0 <= m <= ldb.

n

is the number of columns in rectangular matrix B, and:

If side = 'R', n is the order of triangular matrix A.

Specified as: a fullword integer; n >= 0, and:

If side = 'R', n <= lda.

alpha

is the scalar alpha. Specified as: a number of the data type indicated in Table 100.

a

is the triangular matrix A, of which only the upper or lower triangular portion is used, where:

If side = 'L', A is order m.

If side = 'R', A is order n.

Specified as: a two-dimensional array, containing numbers of the data type indicated in Table 100, where:

If side = 'L', its size must be lda by (at least) m.

If side = 'R', its size must be lda by (at least) n.

lda

is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0, and:

If side = 'L', lda >= m.

If side = 'R', lda >= n.

b

is the m by n rectangular matrix B, which contains the right-hand sides of the triangular system to be solved. Specified as: an ldb by (at least) n array, containing numbers of the data type indicated in Table 100.

ldb

is the leading dimension of the array specified for b. Specified as: a fullword integer; ldb > 0 and ldb >= m.

On Return

b

is the m by n matrix B, containing the results of the computation. Returned as: an ldb by (at least) n array, containing numbers of the data type indicated in Table 100.

Notes

These subroutines accept lowercase letters for the transa, side, diag, and uplo arguments.
For STRSM and DTRSM, if you specify 'C' for the transa argument, it is interpreted as though you specified 'T'.
Matrices A and B must have no common elements or results are unpredictable.
If matrix A is upper triangular (uplo = 'U'), these subroutines refer to only the upper triangular portion of the matrix. If matrix A is lower triangular (uplo = 'L'), these subroutines refer to only the lower triangular portion of the matrix. The unreferenced elements are assumed to be zero.
The elements of the diagonal of a unit triangular matrix are always one, so you do not need to set these values. The ESSL subroutines always assume that the values in these positions are 1.0 for STRSM and DTRSM and (1.0, 0.0) for CTRSM and ZTRSM.
For a description of triangular matrices and how they are stored, see "Triangular Matrix".

Function

These subroutines solve a triangular system of equations with multiple right-hand sides. The solution B may be any of the following, where A is a triangular matrix and B is a rectangular matrix:

1. B <-- alpha(A^-1)B

2. B <-- alpha(A^-T)B

3. B <-- alphaB(A^-1)

4. B <-- alphaB(A^-T)

5. B <-- alpha(A^-H)B (only for CTRSM and ZTRSM)

6. B <-- alphaB(A^-H) (only for CTRSM and ZTRSM)

where:

alpha is a scalar.

B is an m by n rectangular matrix.

A is an upper or lower triangular matrix, where:

If side = 'L', it has order m, and equation 1, 2, or 5 is performed.

If side = 'R', it has order n, and equation 3, 4, or 6 is performed.

If n or m is 0, no computation is performed. See references [32] and [36].
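
The six operations can be sketched with a small reference routine. The Python code below is an unoptimized illustration under stated assumptions, not the ESSL implementation: the function name trsm_ref and its helpers are hypothetical, matrices are plain lists of rows, and unreferenced elements of A are marked with None to show that they are never read.

```python
# Reference sketch of the six operations listed above; not ESSL code.
# Matrices are lists of rows; unreferenced entries of A may be None.

def _transpose(mat):
    return [list(col) for col in zip(*mat)]

def _op_matrix(uplo, transa, diag, a, k):
    """Build dense k by k op(A), reading only the uplo triangle of a."""
    upper = uplo.upper() == 'U'
    unit = diag.upper() == 'U'
    t = [[0.0] * k for _ in range(k)]
    for i in range(k):
        for j in range(k):
            if i == j:
                t[i][j] = 1.0 if unit else a[i][j]
            elif (j > i) == upper:
                t[i][j] = a[i][j]
    mode = transa.upper()
    if mode == 'T':
        t = _transpose(t)
    elif mode == 'C':
        t = [[v.conjugate() for v in row] for row in _transpose(t)]
    return t

def _solve_left(t, lower, rhs):
    """Solve T X = RHS by forward (lower) or back (upper) substitution."""
    k = len(t)
    x = [row[:] for row in rhs]
    for i in (range(k) if lower else reversed(range(k))):
        known = range(i) if lower else range(i + 1, k)
        for c in range(len(x[i])):
            s = sum(t[i][p] * x[p][c] for p in known)
            x[i][c] = (x[i][c] - s) / t[i][i]
    return x

def trsm_ref(side, uplo, transa, diag, alpha, a, b):
    """Return the matrix that the operation would leave in B."""
    m = len(b)
    n = len(b[0]) if m else 0
    if m == 0 or n == 0:
        return [row[:] for row in b]          # no computation is performed
    # op(A) is lower triangular if A is lower and not transposed, or
    # upper and transposed.
    lower = (uplo.upper() == 'L') != (transa.upper() in ('T', 'C'))
    scaled = [[alpha * v for v in row] for row in b]
    if side.upper() == 'L':                   # equations 1, 2, and 5
        t = _op_matrix(uplo, transa, diag, a, m)
        return _solve_left(t, lower, scaled)
    # Equations 3, 4, and 6: X op(A) = alpha B is equivalent to
    # op(A)^T X^T = alpha B^T.
    t = _transpose(_op_matrix(uplo, transa, diag, a, n))
    return _transpose(_solve_left(t, not lower, _transpose(scaled)))

NA = None  # marks elements that are not referenced
a = [[3.0, -1.0,  2.0,  2.0,  1.0],
     [NA,  -2.0,  4.0, -1.0,  3.0],
     [NA,   NA,  -3.0,  0.0,  2.0],
     [NA,   NA,   NA,   4.0, -2.0],
     [NA,   NA,   NA,   NA,   1.0]]
b = [[6.0, 10.0, -2.0], [-16.0, -1.0, 6.0], [-2.0, 1.0, -4.0],
     [14.0, 0.0, -14.0], [-1.0, 2.0, 1.0]]
x = trsm_ref('L', 'U', 'N', 'N', 1.0, a, b)
for row in x:
    print(row)
```

The data in the usage lines is taken from Example 1 below. Unlike the subroutines, the sketch returns a new matrix instead of overwriting B, and a singular A surfaces as a Python ZeroDivisionError.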

Error Conditions

Resource Errors

Unable to allocate internal work area.

Computational Errors

None
Note: If the triangular matrix A is singular, the results returned by this subroutine are unpredictable, and there may be a divide-by-zero program exception message.

Input-Argument Errors

m < 0
n < 0
lda, ldb <= 0
side <> 'L' or 'R'
uplo <> 'L' or 'U'
transa <> 'T', 'N', or 'C'
diag <> 'N' or 'U'
side = 'L' and m > lda
m > ldb
side = 'R' and n > lda

Example 1

This example shows the solution B <-- alpha(A^-1)B, where A is a real 5 by 5 upper triangular matrix that is not unit triangular, and B is a real 5 by 3 rectangular matrix.

Call Statement and Input

            SIDE  UPLO TRANSA  DIAG  M   N   ALPHA   A  LDA  B  LDB
             |     |     |      |    |   |     |     |   |   |   |
CALL STRSM( 'L' , 'U' , 'N'  , 'N' , 5 , 3 ,  1.0  , A , 7 , B , 6 )

        *                             *
        | 3.0  -1.0   2.0   2.0   1.0 |
        |  .   -2.0   4.0  -1.0   3.0 |
        |  .     .   -3.0   0.0   2.0 |
A    =  |  .     .     .    4.0  -2.0 |
        |  .     .     .     .    1.0 |
        |  .     .     .     .     .  |
        |  .     .     .     .     .  |
        *                             *

        *                    *
        |   6.0  10.0   -2.0 |
        | -16.0  -1.0    6.0 |
B    =  |  -2.0   1.0   -4.0 |
        |  14.0   0.0  -14.0 |
        |  -1.0   2.0    1.0 |
        |    .     .      .  |
        *                    *

Output

        *                 *
        |  2.0  3.0   1.0 |
        |  5.0  5.0   4.0 |
B    =  |  0.0  1.0   2.0 |
        |  3.0  1.0  -3.0 |
        | -1.0  2.0   1.0 |
        |   .    .     .  |
        *                 *

Example 2

This example shows the solution B <-- alpha(A^-T)B, where A is a real 5 by 5 upper triangular matrix that is not unit triangular, and B is a real 5 by 4 rectangular matrix.

Call Statement and Input

            SIDE  UPLO TRANSA  DIAG  M   N    ALPHA  A  LDA  B  LDB
             |     |     |      |    |   |     |     |   |   |   |
CALL STRSM( 'L' , 'U' , 'T'  , 'N' , 5 , 4 ,  1.0  , A , 7 , B , 6 )

        *                              *
        | -1.0  -4.0  -2.0   2.0   3.0 |
        |   .    2.0   2.0   2.0   2.0 |
        |   .     .   -3.0  -1.0   4.0 |
A    =  |   .     .     .    1.0   0.0 |
        |   .     .     .     .   -2.0 |
        |   .     .     .     .     .  |
        |   .     .     .     .     .  |
        *                              *

        *                          *
        | -1.0  -2.0   -3.0   -4.0 |
        |  2.0  -2.0  -14.0  -12.0 |
B    =  | 10.0   5.0   -8.0   -7.0 |
        | 14.0  15.0    1.0    8.0 |
        | -3.0   4.0    3.0   16.0 |
        |   .     .      .      .  |
        *                          *

Output

        *                        *
        |  1.0   2.0   3.0   4.0 |
        |  3.0   3.0  -1.0   2.0 |
B    =  | -2.0  -1.0   0.0   1.0 |
        |  4.0   4.0  -3.0  -3.0 |
        |  2.0   2.0   2.0   2.0 |
        |   .     .     .     .  |
        *                        *

Example 3

This example shows the solution B <-- alphaB(A^-1), where A is a real 5 by 5 lower triangular matrix that is not unit triangular, and B is a real 3 by 5 rectangular matrix.

Call Statement and Input

            SIDE  UPLO TRANSA  DIAG  M   N    ALPHA  A  LDA  B  LDB
             |     |     |      |    |   |     |     |   |   |   |
CALL STRSM( 'R' , 'L' , 'N'  , 'N' , 3 , 5 ,  1.0  , A , 7 , B , 4 )

        *                            *
        | 2.0   .     .     .     .  |
        | 2.0  3.0    .     .     .  |
        | 2.0  1.0   1.0    .     .  |
A    =  | 0.0  3.0   0.0  -2.0    .  |
        | 2.0  4.0  -1.0   2.0  -1.0 |
        |  .    .     .     .     .  |
        |  .    .     .     .     .  |
        *                            *

        *                             *
        | 10.0   4.0   0.0  0.0   1.0 |
B    =  | 10.0  14.0  -4.0  6.0  -3.0 |
        | -8.0   2.0  -5.0  4.0  -2.0 |
        |   .     .     .    .     .  |
        *                             *

Output

        *                              *
        |  3.0   4.0  -1.0  -1.0  -1.0 |
B    =  |  2.0   1.0  -1.0   0.0   3.0 |
        | -2.0  -1.0  -3.0   0.0   2.0 |
        |   .     .     .     .     .  |
        *                              *

Example 4

This example shows the solution B <-- alphaB(A^-1), where A is a real 6 by 6 upper triangular matrix that is unit triangular, and B is a real 1 by 6 rectangular matrix.
Note: Because matrix A is unit triangular, the diagonal elements are not referenced. ESSL assumes a value of 1.0 for the diagonal elements.

Call Statement and Input

            SIDE  UPLO TRANSA  DIAG  M   N    ALPHA  A  LDA  B  LDB
             |     |     |      |    |   |     |     |   |   |   |
CALL STRSM( 'R' , 'U' , 'N'  , 'U' , 1 , 6 ,  1.0  , A , 7 , B , 2 )

        *                               *
        | .  2.0  -3.0  1.0   2.0   4.0 |
        | .   .    0.0  1.0   1.0  -2.0 |
        | .   .     .   4.0  -1.0   1.0 |
A    =  | .   .     .    .    0.0  -1.0 |
        | .   .     .    .     .    2.0 |
        | .   .     .    .     .     .  |
        | .   .     .    .     .     .  |
        *                               *

        *                                 *
B    =  | 1.0  4.0  -2.0  10.0  2.0  -6.0 |
        |  .    .     .     .    .     .  |
        *                                 *

Output

        *                                *
B    =  | 1.0  2.0  1.0  3.0  -1.0  -2.0 |
        |  .    .    .    .     .     .  |
        *                                *

Example 5

This example shows the solution B <-- alphaB(A^-1), where A is a complex 5 by 5 lower triangular matrix that is not unit triangular, and B is a complex 3 by 5 rectangular matrix.

Call Statement and Input

            SIDE  UPLO TRANSA  DIAG  M   N    ALPHA   A  LDA  B  LDB
             |     |     |      |    |   |      |     |   |   |   |
CALL CTRSM( 'R' , 'L' , 'N'  , 'N' , 3 , 5 ,  ALPHA , A , 7 , B , 4 )
 
ALPHA    =  (1.0, 0.0)

        *                                                                *
        | (2.0, -3.0)     .            .            .            .       |
        | (2.0, -4.0) (3.0, -1.0)      .            .            .       |
        | (2.0,  2.0) (1.0,  2.0)  (1.0,  1.0)      .            .       |
A    =  | (0.0,  0.0) (3.0, -1.0)  (0.0, -1.0) (-2.0,  1.0)      .       |
        | (2.0,  2.0) (4.0,  0.0) (-1.0,  2.0)  (2.0, -4.0) (-1.0, -4.0) |
        |     .           .            .            .            .       |
        |     .           .            .            .            .       |
        *                                                                *

        *                                                                      *
        | (22.0, -41.0)  (7.0, -26.0)  (9.0, 0.0)  (-15.0, -3.0)  (-15.0, 8.0) |
B    =  | (29.0, -18.0) (24.0, -10.0)  (9.0, 6.0) (-12.0, -24.0) (-19.0, -8.0) |
        |  (-15.0, 2.0) (-3.0, -21.0) (-2.0, 4.0)  (-4.0, -12.0) (-10.0, -6.0) |
        |        .           .             .            .            .         |
        *                                                                      *

Output

        *                                                                 *
        |  (3.0, 0.0)   (4.0, 0.0) (-1.0, -2.0) (-1.0, -1.0) (-1.0, -4.0) |
B    =  | (2.0, -1.0)   (1.0, 2.0) (-1.0, -3.0)   (0.0, 2.0)  (3.0, -4.0) |
        | (-2.0, 1.0) (-1.0, -3.0)  (-3.0, 1.0)   (0.0, 0.0)  (2.0, -2.0) |
        |     .            .            .             .             .     |
        *                                                                 *

Example 6

This example shows the solution B <-- alpha(A^-H)B, where A is a complex 5 by 5 upper triangular matrix that is not unit triangular, and B is a complex 5 by 1 rectangular matrix.

Call Statement and Input

            SIDE  UPLO TRANSA  DIAG  M   N    ALPHA   A  LDA  B  LDB
             |     |     |      |    |   |      |     |   |   |   |
CALL CTRSM( 'L' , 'U' , 'C'  , 'N' , 5 , 1 ,  ALPHA , A , 6 , B , 6 )
 
ALPHA    =  (1.0, 0.0)

        *                                                                *
        | (-4.0, 1.0) (4.0, -3.0)  (-1.0, 3.0)   (0.0, 0.0)  (-1.0, 0.0) |
        |      .      (-2.0, 0.0) (-3.0, -1.0) (-2.0, -1.0)   (4.0, 3.0) |
A    =  |      .           .       (-5.0, 3.0) (-3.0, -3.0) (-5.0, -5.0) |
        |      .           .            .       (4.0, -4.0)   (2.0, 0.0) |
        |      .           .            .           .        (2.0, -1.0) |
        |      .           .            .           .             .      |
        *                                                                *

        *               *
        | (-8.0, -19.0) |
        |  (8.0,  21.0) |
B    =  | (44.0,  -8.0) |
        | (13.0,  -7.0) |
        | (19.0,   2.0) |
        |      .        |
        *               *

Output

        *             *
        |  (3.0, 4.0) |
        | (-4.0, 2.0) |
B    =  | (-5.0, 0.0) |
        |  (1.0, 3.0) |
        |  (3.0, 1.0) |
        |      .      |
        *             *
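
Because the call in this example specifies transa = 'C' for CTRSM, the operation uses the conjugate transpose of A. The following Python fragment is a hand check under that assumption, not ESSL code: it multiplies the output vector by the conjugate transpose of A and compares the product with the input B, and it repeats the comparison with the plain transpose. The variable names are chosen for illustration, and the unreferenced lower part of A is filled with zeros.

```python
# Hand check of Example 6 using the data shown above.
a = [[-4+1j, 4-3j,  -1+3j,  0+0j, -1+0j],
     [0,     -2+0j, -3-1j, -2-1j,  4+3j],
     [0,      0,    -5+3j, -3-3j, -5-5j],
     [0,      0,     0,     4-4j,  2+0j],
     [0,      0,     0,     0,     2-1j]]   # strictly lower part set to 0
b_in = [-8-19j, 8+21j, 44-8j, 13-7j, 19+2j]
x_out = [3+4j, -4+2j, -5+0j, 1+3j, 3+1j]

n = len(a)
# (A^H x)_i = sum over p of conj(A[p][i]) * x[p]
ah_x = [sum(a[p][i].conjugate() * x_out[p] for p in range(n))
        for i in range(n)]
# (A^T x)_i: the same sum without conjugation, for comparison
at_x = [sum(a[p][i] * x_out[p] for p in range(n)) for i in range(n)]

print(ah_x == b_in)   # conjugate transpose reproduces the input B
print(at_x == b_in)   # plain transpose does not
```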

STRI, DTRI, STPI, and DTPI--Triangular Matrix Inverse

These subroutines find the inverse of triangular matrix A:

A <-- A^-1

Matrix A can be either upper or lower triangular, where:

For the _TRI subroutines, it is stored in upper- or lower-triangular storage mode, respectively.
For the _TPI subroutines, it is stored in upper- or lower-triangular-packed storage mode, respectively.

Table 101. Data Types

A Subroutine
Short-precision real STRI and STPI
Long-precision real DTRI and DTPI

Syntax

Fortran	CALL STRI \| DTRI (`uplo`, `diag`, `a`, `lda`, `n`) CALL STPI \| DTPI (`uplo`, `diag`, `ap`, `n`)
C and C++	stri \| dtri (`uplo`, `diag`, `a`, `lda`, `n`); stpi \| dtpi (`uplo`, `diag`, `ap`, `n`);
PL/I	CALL STRI \| DTRI (`uplo`, `diag`, `a`, `lda`, `n`); CALL STPI \| DTPI (`uplo`, `diag`, `ap`, `n`);

On Entry

uplo

indicates whether matrix A is an upper or lower triangular matrix, where:

If uplo = 'U', A is an upper triangular matrix.

If uplo = 'L', A is a lower triangular matrix.

Specified as: a single character. It must be 'U' or 'L'.

diag

indicates the characteristics of the diagonal of matrix A, where:

If diag = 'U', A is a unit triangular matrix.

If diag = 'N', A is not a unit triangular matrix.

Specified as: a single character. It must be 'U' or 'N'.

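a

is the upper or lower triangular matrix A of order n, stored in upper- or lower-triangular storage mode, respectively. Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 101.

lda

is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= n.

ap

is the upper or lower triangular matrix A of order n, stored in upper- or lower-triangular-packed storage mode, respectively. Specified as: a one-dimensional array of (at least) length n(n+1)/2, containing numbers of the data type indicated in Table 101.

n

is the order of matrix A. Specified as: a fullword integer; n >= 0.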

On Return

a: is the inverse of the upper or lower triangular matrix A of order n, stored in upper- or lower-triangular storage mode, respectively. Returned as: an lda by (at least) n array, containing numbers of the data type indicated in Table 101.
ap: is the inverse of the upper or lower triangular matrix A of order n, stored in upper- or lower-triangular-packed storage mode, respectively. Returned as: a one-dimensional array of (at least) length n(n+1)/2, containing numbers of the data type indicated in Table 101.

Notes

These subroutines accept lowercase letters for the uplo and diag arguments.
If matrix A is upper triangular (uplo = 'U'), these subroutines refer to only the upper triangular portion of the matrix. If matrix A is lower triangular (uplo = 'L'), these subroutines refer to only the lower triangular portion of the matrix. The unreferenced elements are assumed to be zero.
The elements of the diagonal of a unit triangular matrix are always one, so you do not need to set these values.
For a description of triangular matrices and how they are stored in upper- and lower-triangular storage mode and in upper- and lower-triangular-packed storage mode, see "Triangular Matrix".

Function

These subroutines find the inverse of triangular matrix A, where A is either upper or lower triangular:

A <-- A^-1

where:

A is the triangular matrix of order n.

A^-1 is the inverse of the triangular matrix of order n.

If n is 0, no computation is performed. See references [8] and [36].
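
One way to picture the computation is to solve A x = e_c by substitution for each unit vector e_c. The Python sketch below does this column by column. It is a textbook illustration, not the algorithm ESSL uses: the function name tri_inverse is hypothetical, matrices are lists of rows, and a zero diagonal element is reported with a Python exception rather than an ESSL error message.

```python
# Textbook sketch of a triangular inverse; not the ESSL algorithm.

def tri_inverse(uplo, diag, a):
    """Return the inverse of triangular matrix a as a list of rows.
    Only the uplo triangle is read; with diag = 'U' the diagonal is
    taken to be 1.0 and is not read either."""
    n = len(a)
    upper = uplo.upper() == 'U'
    unit = diag.upper() == 'U'
    if not unit:
        for i in range(n):
            if a[i][i] == 0:
                raise ZeroDivisionError(
                    f"zero diagonal element in column {i + 1}")

    def elem(i, j):
        if i == j:
            return 1.0 if unit else a[i][i]
        return a[i][j]      # only called for (i, j) inside the triangle

    inv = [[0.0] * n for _ in range(n)]
    for c in range(n):
        # Rows of column c of the inverse that can be nonzero, visited in
        # substitution order (bottom-up for upper, top-down for lower).
        rows = range(c, -1, -1) if upper else range(c, n)
        for i in rows:
            known = range(i + 1, c + 1) if upper else range(c, i)
            s = sum(elem(i, k) * inv[k][c] for k in known)
            rhs = 1.0 if i == c else 0.0
            inv[i][c] = (rhs - s) / elem(i, i)
    return inv

NA = None  # marks elements that are not referenced
a_unit_lower = [[NA,  NA,  NA,  NA,  NA],
                [3.0, NA,  NA,  NA,  NA],
                [4.0, 8.0, NA,  NA,  NA],
                [5.0, 9.0, 8.0, NA,  NA],
                [6.0, 1.0, 4.0, 6.0, NA]]
for row in tri_inverse('L', 'U', a_unit_lower):
    print(row)
```

The usage lines take their data from Example 2 below. Unlike the subroutines, the sketch returns a full square matrix and leaves its input unchanged.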

Error Conditions

Resource Errors

Unable to allocate internal work area.

Computational Errors

Matrix A is singular.

One or more of the diagonal elements of matrix A are zero. The first column, i, of matrix A, in which a zero diagonal element is found, is identified in the computational error message.
The return code is set to 1.
i can be determined at run time by use of the ESSL error-handling facilities. To obtain this information, you must use ERRSET to change the number of allowable errors for error code 2145 in the ESSL error option table; otherwise, the default value causes your program to terminate when this error occurs. For details, see "What Can You Do about ESSL Computational Errors?".

Input-Argument Errors

uplo <> 'U' or 'L'
diag <> 'U' or 'N'
n < 0
lda <= 0
lda < n

Example 1

This example shows how the inverse of matrix A is computed, where A is a 5 by 5 upper triangular matrix that is not unit triangular and is stored in upper-triangular storage mode. Matrix A is:

                    *                                *
                    | 1.00  3.00  4.00   5.00   6.00 |
                    | 0.00  2.00  8.00   9.00   1.00 |
                    | 0.00  0.00  4.00   8.00   4.00 |
                    | 0.00  0.00  0.00  -2.00   6.00 |
                    | 0.00  0.00  0.00   0.00  -1.00 |
                    *                                *

and where the following inverse matrix is computed. Matrix A^-1 is:

                  *                                   *
                  | 1.00  -1.50   2.00   3.75   35.00 |
                  | 0.00   0.50  -1.00  -1.75  -14.00 |
                  | 0.00   0.00   0.25   1.00    7.00 |
                  | 0.00   0.00   0.00  -0.50   -3.00 |
                  | 0.00   0.00   0.00   0.00   -1.00 |
                  *                                   *

Call Statement and Input

           UPLO  DIAG  A  LDA  N
            |     |    |   |   |
CALL STRI( 'U' , 'N' , A , 5 , 5)

        *                                *
        | 1.00  3.00  4.00   5.00   6.00 |
        |  .    2.00  8.00   9.00   1.00 |
A    =  |  .     .    4.00   8.00   4.00 |
        |  .     .     .    -2.00   6.00 |
        |  .     .     .      .    -1.00 |
        *                                *

Output

        *                                   *
        | 1.00  -1.50   2.00   3.75   35.00 |
        |  .     0.50  -1.00  -1.75  -14.00 |
A    =  |  .      .     0.25   1.00    7.00 |
        |  .      .      .    -0.50   -3.00 |
        |  .      .      .      .     -1.00 |
        *                                   *

Example 2

This example shows how the inverse of matrix A is computed, where A is a 5 by 5 lower triangular matrix that is unit triangular and is stored in lower-triangular storage mode. Matrix A is:

                       *                         *
                       | 1.0  0.0  0.0  0.0  0.0 |
                       | 3.0  1.0  0.0  0.0  0.0 |
                       | 4.0  8.0  1.0  0.0  0.0 |
                       | 5.0  9.0  8.0  1.0  0.0 |
                       | 6.0  1.0  4.0  6.0  1.0 |
                       *                         *

and where the following inverse matrix is computed. Matrix A^-1 is:

                   *                                 *
                   |    1.0     0.0   0.0   0.0  0.0 |
                   |   -3.0     1.0   0.0   0.0  0.0 |
                   |   20.0    -8.0   1.0   0.0  0.0 |
                   | -138.0    55.0  -8.0   1.0  0.0 |
                   |  745.0  -299.0  44.0  -6.0  1.0 |
                   *                                 *

Note:

Because matrix A is unit triangular, the diagonal elements are not referenced. ESSL assumes a value of 1.0 for the diagonal elements.

Call Statement and Input

           UPLO  DIAG  A  LDA  N
            |     |    |   |   |
CALL STRI( 'L' , 'U' , A , 5 , 5)

        *                       *
        |  .    .    .    .   . |
        | 3.0   .    .    .   . |
A    =  | 4.0  8.0   .    .   . |
        | 5.0  9.0  8.0   .   . |
        | 6.0  1.0  4.0  6.0  . |
        *                       *

Output

        *                               *
        |     .       .     .     .   . |
        |   -3.0      .     .     .   . |
A    =  |   20.0    -8.0    .     .   . |
        | -138.0    55.0  -8.0    .   . |
        |  745.0  -299.0  44.0  -6.0  . |
        *                               *

Example 3

This example shows how the inverse of matrix A is computed, where A is the same matrix shown in Example 1 and is stored in upper-triangular-packed storage mode. The inverse matrix computed here is the same as the inverse matrix shown in Example 1 and is stored in upper-triangular-packed storage mode.

Call Statement and Input

           UPLO  DIAG  AP   N
            |     |    |    |
CALL STPI( 'U' , 'N' , AP , 5)

AP       =  (1.00, 3.00, 2.00, 4.00, 8.00, 4.00, 5.00, 9.00, 8.00,
             -2.00, 6.00, 1.00, 4.00, 6.00, -1.00)

Output

AP       =  (1.00, -1.50, 0.50, 2.00, -1.00, 0.25, 3.75, -1.75, 1.00,
             -0.50, 35.00, -14.00, 7.00, -3.00, -1.00)

Example 4

This example shows how the inverse of matrix A is computed, where A is the same matrix shown in Example 2 and is stored in lower-triangular-packed storage mode. The inverse matrix computed here is the same as the inverse matrix shown in Example 2 and is stored in lower-triangular-packed storage mode.
Note: Because matrix A is unit triangular, the diagonal elements are not referenced. ESSL assumes a value of 1.0 for the diagonal elements.

Call Statement and Input

           UPLO  DIAG  AP   N
            |     |    |    |
CALL STPI( 'L' , 'U' , AP , 5)

AP       =  ( . , 3.0, 4.0, 5.0, 6.0, . , 8.0, 9.0, 1.0, . , 8.0, 4.0,
             . , 6.0, . )

Output

AP       =  ( . , -3.0, 20.0, -138.0, 745.0, . , -8.0, 55.0, -299.0,
             . , -8.0, 44.0, . , -6.0, . )
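
The AP arrays in Examples 3 and 4 are consistent with packing the referenced triangle column by column. The Python sketch below reproduces that layout. It is an illustration inferred from those two examples: the helper names pack_triangle and packed_index are hypothetical, and "Triangular Matrix" remains the authoritative description of the storage modes.

```python
# Sketch of triangular-packed storage, inferred from Examples 3 and 4.

def pack_triangle(uplo, a):
    """Pack the uplo triangle of square matrix a (list of rows) column by
    column into a one-dimensional list of length n(n+1)/2."""
    n = len(a)
    upper = uplo.upper() == 'U'
    ap = []
    for j in range(n):
        rows = range(0, j + 1) if upper else range(j, n)
        ap.extend(a[i][j] for i in rows)
    return ap

def packed_index(uplo, i, j, n):
    """0-based position in the packed array of element (i, j), where i and
    j are 1-based and (i, j) lies inside the stored triangle."""
    if uplo.upper() == 'U':
        return (i - 1) + j * (j - 1) // 2
    return (i - 1) + (2 * n - j) * (j - 1) // 2

# Matrix A of Example 1, packed as in the input of Example 3.
a = [[1.0, 3.0, 4.0,  5.0,  6.0],
     [0.0, 2.0, 8.0,  9.0,  1.0],
     [0.0, 0.0, 4.0,  8.0,  4.0],
     [0.0, 0.0, 0.0, -2.0,  6.0],
     [0.0, 0.0, 0.0,  0.0, -1.0]]
ap = pack_triangle('U', a)
print(ap)
print(ap[packed_index('U', 2, 4, 5)])   # element A(2,4)
```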

Banded Linear Algebraic Equation Subroutines

This section contains the banded linear algebraic equation subroutine descriptions.

SGBF and DGBF--General Band Matrix Factorization

These subroutines factor general band matrix A, stored in general-band storage mode, using Gaussian elimination. To solve the system of equations with one or more right-hand sides, follow the call to these subroutines with one or more calls to SGBS or DGBS, respectively.

Table 102. Data Types

A Subroutine
Short-precision real SGBF
Long-precision real DGBF

Note: The output from these factorization subroutines should be used only as input to the solve subroutines SGBS and DGBS, respectively.

Syntax

Fortran	CALL SGBF \| DGBF (`agb`, `lda`, `n`, `ml`, `mu`, `ipvt`)
C and C++	sgbf \| dgbf (`agb`, `lda`, `n`, `ml`, `mu`, `ipvt`);
PL/I	CALL SGBF \| DGBF (`agb`, `lda`, `n`, `ml`, `mu`, `ipvt`);

On Entry

agb: is the general band matrix A of order n, stored in general-band storage mode, to be factored. It has an upper band width mu and a lower band width ml. Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 102, where lda >= 2ml+mu+16.
lda: is the leading dimension of the array specified for agb. Specified as: a fullword integer; lda > 0 and lda >= 2ml+mu+16.
n: is the order of the matrix A. Specified as: a fullword integer; n > ml and n > mu.
ml: is the lower band width ml of the matrix A. Specified as: a fullword integer; 0 <= ml < n.
mu: is the upper band width mu of the matrix A. Specified as: a fullword integer; 0 <= mu < n.
ipvt: See 'On Return'.

On Return

agb: is the transformed matrix A of order n, containing the results of the factorization. See "Function". Returned as: an lda by (at least) n array, containing numbers of the data type indicated in Table 102.
ipvt: is the integer vector ipvt of length n, containing the pivot information necessary to construct matrix L from the information contained in the output array agb. Returned as: a one-dimensional array of (at least) length n, containing fullword integers.

Notes

ipvt is not a permutation vector in the strict sense. It is used to record column interchanges in L due to partial pivoting and to improve performance.
The entire lda by n array specified for agb must remain unchanged between calls to the factorization and solve subroutines.
This subroutine can be used for tridiagonal matrices (ml = mu = 1); however, the tridiagonal subroutines SGTF/DGTF and SGTS/DGTS are faster.
For a description of how a general band matrix is stored in general-band storage mode in an array, see "General Band Matrix".

Function

The general band matrix A, stored in general-band storage mode, is factored using Gaussian elimination with partial pivoting to compute the LU factorization of A, where:

ipvt is a vector containing the pivoting information.

L is a unit lower triangular band matrix.

U is an upper triangular band matrix.

The transformed matrix A contains U in packed format, along with the multipliers necessary to construct, with the help of ipvt, a matrix L, such that A = LU. This factorization can then be used by SGBS or DGBS, respectively, to solve the system of equations. See reference [38].
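
As a point of comparison, Gaussian elimination with partial pivoting can be sketched on a dense matrix. The Python code below is a textbook version, not the ESSL algorithm: it ignores the band structure, keeps L, U, and the row permutation as separate objects so that row perm[i] of A equals row i of the product LU, and it does not reproduce the packed output format of agb or the encoding of ipvt. The function name lu_partial_pivot is hypothetical.

```python
# Textbook dense LU factorization with partial pivoting; not ESSL code.

def lu_partial_pivot(a):
    """Return (perm, l, u) such that row perm[i] of a equals row i of l*u,
    with l unit lower triangular and u upper triangular."""
    n = len(a)
    u = [[float(v) for v in row] for row in a]
    l = [[0.0] * n for _ in range(n)]
    perm = list(range(n))
    for k in range(n):
        # Choose the row with the largest magnitude in column k.
        p = max(range(k, n), key=lambda r: abs(u[r][k]))
        if u[p][k] == 0.0:
            raise ZeroDivisionError("matrix is singular")
        if p != k:
            u[k], u[p] = u[p], u[k]
            l[k], l[p] = l[p], l[k]      # carry earlier multipliers along
            perm[k], perm[p] = perm[p], perm[k]
        l[k][k] = 1.0
        for r in range(k + 1, n):
            f = u[r][k] / u[k][k]        # multiplier for row r
            l[r][k] = f
            for c in range(k, n):
                u[r][c] -= f * u[k][c]
    return perm, l, u

a = [[1.0, 1.0, 0.0],
     [4.0, 1.0, 1.0],
     [0.0, 5.0, 1.0]]
perm, l, u = lu_partial_pivot(a)
print(perm)
for row in l:
    print(row)
for row in u:
    print(row)
```

A band-aware implementation would restrict the inner loops to the ml subdiagonals and to the mu + ml superdiagonals that can fill in, which is one reason the agb array needs more than ml + mu + 1 rows.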

Error Conditions

Resource Errors

Unable to allocate internal work area.

Computational Errors

Matrix A is singular.

One or more columns of L and the corresponding diagonal of U contain all zeros (all columns of L are checked). The last column, i, of L with a corresponding U = 0 diagonal element is identified in the computational error message.
The return code is set to 1.
i can be determined at run time by use of the ESSL error-handling facilities. To obtain this information, you must use ERRSET to change the number of allowable errors for error code 2103 in the ESSL error option table; otherwise, the default value causes your program to terminate when this error occurs. For details, see "What Can You Do about ESSL Computational Errors?".

Input-Argument Errors

lda <= 0
ml < 0
ml >= n
mu < 0
mu >= n
lda < 2ml+mu+16

Example

This example shows a factorization of a general band matrix A of order 9, with a lower band width of 2 and an upper band width of 3. On input matrix A is:

            *                                                *
            | 1.0  1.0  1.0  1.0  0.0  0.0   0.0   0.0   0.0 |
            | 1.0  1.0  1.0  1.0  1.0  0.0   0.0   0.0   0.0 |
            | 4.0  1.0  1.0  1.0  1.0  1.0   0.0   0.0   0.0 |
            | 0.0  5.0  1.0  1.0  1.0  1.0   1.0   0.0   0.0 |
            | 0.0  0.0  6.0  1.0  1.0  1.0   1.0   1.0   0.0 |
            | 0.0  0.0  0.0  7.0  1.0  1.0   1.0   1.0   1.0 |
            | 0.0  0.0  0.0  0.0  8.0  1.0   1.0   1.0   1.0 |
            | 0.0  0.0  0.0  0.0  0.0  9.0   1.0   1.0   1.0 |
            | 0.0  0.0  0.0  0.0  0.0  0.0  10.0  11.0  12.0 |
            *                                                *

Matrix A is stored in general-band storage mode in the two-dimensional array AGB of size LDA by N, where LDA = 2ml+mu+16 = 23. The array AGB is declared as AGB(1:23,1:9).
Note: Matrix A is the same matrix used in the examples in subroutines SGEF and DGEF (see "Example 1") and SGEFCD and DGEFCD (see "Example").
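
The input array AGB shown below is consistent with placing element a(i,j) in row ml+mu+1+i-j of column j and leaving all other positions zero. The Python sketch that follows builds the array under that assumption. The mapping is inferred from this example and the helper name to_general_band is hypothetical; "General Band Matrix" remains the authoritative description.

```python
# Sketch of general-band storage, inferred from the example in this section.

def to_general_band(a, ml, mu, lda=None):
    """Return an lda by n array (list of rows) holding band matrix a, with
    element a(i, j) stored in row ml+mu+1+i-j of column j (1-based)."""
    n = len(a)
    if lda is None:
        lda = 2 * ml + mu + 16
    if lda < 2 * ml + mu + 16:
        raise ValueError("lda must be at least 2ml+mu+16")
    agb = [[0.0] * n for _ in range(lda)]
    for j in range(1, n + 1):
        for i in range(max(1, j - mu), min(n, j + ml) + 1):
            agb[ml + mu + i - j][j - 1] = a[i - 1][j - 1]
    return agb

a = [[1, 1, 1, 1, 0, 0,  0,  0,  0],
     [1, 1, 1, 1, 1, 0,  0,  0,  0],
     [4, 1, 1, 1, 1, 1,  0,  0,  0],
     [0, 5, 1, 1, 1, 1,  1,  0,  0],
     [0, 0, 6, 1, 1, 1,  1,  1,  0],
     [0, 0, 0, 7, 1, 1,  1,  1,  1],
     [0, 0, 0, 0, 8, 1,  1,  1,  1],
     [0, 0, 0, 0, 0, 9,  1,  1,  1],
     [0, 0, 0, 0, 0, 0, 10, 11, 12]]
agb = to_general_band(a, ml=2, mu=3)
print(len(agb), len(agb[0]))
for row in agb[:9]:
    print(row)
```

With ml = 2 and mu = 3, the diagonal of A lands in row 6, the superdiagonals in rows 3 through 5, and the subdiagonals in rows 7 and 8. The remaining rows start out as zeros.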

Call Statement and Input

           AGB  LDA   N   ML  MU  IPVT
            |    |    |   |   |    |
CALL SGBF( AGB , 23 , 9 , 2 , 3 , IPVT )

        *                                                                           *
        | 0.0000  0.0000  0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  1.0000  1.0000  1.0000   1.0000   1.0000   1.0000 |
        | 0.0000  0.0000  1.0000  1.0000  1.0000  1.0000   1.0000   1.0000   1.0000 |
        | 0.0000  1.0000  1.0000  1.0000  1.0000  1.0000   1.0000   1.0000   1.0000 |
        | 1.0000  1.0000  1.0000  1.0000  1.0000  1.0000   1.0000   1.0000  12.0000 |
        | 1.0000  1.0000  1.0000  1.0000  1.0000  1.0000   1.0000  11.0000   0.0000 |
        | 4.0000  5.0000  6.0000  7.0000  8.0000  9.0000  10.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000 |
AGB  =  | 0.0000  0.0000  0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000 |
        *                                                                           *

Output

        *                                                                             *
        | 0.0000  0.0000  0.0000  0.0000   0.0000   1.0000   1.0000   1.0000   1.0000 |
        | 0.0000  0.0000  0.0000  0.0000   1.0000   1.0000   1.0000   1.0000   1.0000 |
        | 0.0000  0.0000  0.0000  1.0000   1.0000   1.0000   1.0000   1.0000   1.0000 |
        | 0.0000  0.0000  1.0000  1.0000   1.0000   1.0000   1.0000   1.0000  12.0000 |
        | 0.0000  1.0000  1.0000  1.0000   1.0000   1.0000   1.0000  11.0000   0.3111 |
        | 0.2500  0.2000  0.1600  0.1400   0.1250   0.1100   0.1000   5.5380  -325.00 |
        | 0.0000  0.1500  0.0000  0.0714   0.0000  -0.0556  -0.0306   0.9385   0.0000 |
        | 0.2500  0.1500  0.1000  0.0714  -0.0714  -0.0694  -0.0194   0.0000   0.0000 |
        | 0.2500  0.0000  0.1000  0.0000   0.0536   0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000   0.0000   0.0000 |
AGB  =  | 0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000   0.0000   0.0000 |
        | 0.0000  0.0000  0.0000  0.0000   0.0000   0.0000   0.0000   0.0000   0.0000 |
        *                                                                             *

IPVT     =  (2, -65534, -131070, -196606, -262142, -327678, -327678,
             -327680, -327680)

SGBS and DGBS--General Band Matrix Solve

These subroutines solve the system Ax = b for x, where A is a general band matrix, and x and b are vectors. They use the results of the factorization of matrix A, produced by a preceding call to SGBF or DGBF, respectively.

Table 103. Data Types

A, b, x Subroutine
Short-precision real SGBS
Long-precision real DGBS

Note: The input to these solve subroutines must be the output from the factorization subroutines SGBF and DGBF, respectively.

Syntax

Fortran	CALL SGBS \| DGBS (`agb`, `lda`, `n`, `ml`, `mu`, `ipvt`, `bx`)
C and C++	sgbs \| dgbs (`agb`, `lda`, `n`, `ml`, `mu`, `ipvt`, `bx`);
PL/I	CALL SGBS \| DGBS (`agb`, `lda`, `n`, `ml`, `mu`, `ipvt`, `bx`);

On Entry

agb

is the factorization of general band matrix A, produced by a preceding call to SGBF or DGBF. Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 103, where lda >= 2ml+mu+16.

lda

is the leading dimension of the array specified for agb. Specified as: a fullword integer; lda > 0 and lda >= 2ml+mu+16.

n

is the order of the matrix A. Specified as: a fullword integer; n > ml and n > mu.

ml

is the lower band width ml of the matrix A. Specified as: a fullword integer; 0 <= ml < n.

mu

is the upper band width mu of the matrix A. Specified as: a fullword integer; 0 <= mu < n.

ipvt

is the integer vector ipvt of length n, produced by a preceding call to SGBF or DGBF. It contains the pivot information necessary to construct matrix L from the information contained in the array specified for agb.

Specified as: a one-dimensional array of (at least) length n, containing fullword integers.

bx

is the vector b of length n, containing the right-hand side of the system. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 103.

On Return

bx: is the solution vector x of length n, containing the results of the computation. Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 103.

Notes

The scalar data specified for input arguments lda, n, ml, and mu for these subroutines must be the same as that specified for SGBF and DGBF, respectively.
The array data specified for input arguments agb and ipvt for these subroutines must be the same as the corresponding output arguments for SGBF and DGBF, respectively.
The entire lda by n array specified for agb must remain unchanged between calls to the factorization and solve subroutines.
The vectors and matrices used in this computation must have no common elements; otherwise, results are unpredictable. See "Concepts".
This subroutine can be used for tridiagonal matrices (ml = mu = 1); however, the tridiagonal subroutines, SGTF/DGTF and SGTS/DGTS, are faster.
For a description of how a general band matrix is stored in general-band storage mode in an array, see "General Band Matrix".

Function

The real system Ax = b is solved for x, where A is a real general band matrix, stored in general-band storage mode, and x and b are vectors. These subroutines use the results of the factorization of matrix A, produced by a preceding call to SGBF or DGBF, respectively. The transformed matrix A, used by this computation, consists of the upper triangular matrix U and the multipliers necessary to construct L using ipvt, as defined in "Function". See reference [38].

Error Conditions

Computational Errors

Note:

If the factorization performed by SGBF or DGBF failed due to a singular matrix argument, the results returned by this subroutine are unpredictable, and there may be a divide-by-zero program exception message.

Input-Argument Errors

lda <= 0
ml < 0
ml >= n
mu < 0
mu >= n
lda < 2ml+mu+16

Example

This example shows how to solve the system Ax = b, where general band matrix A is the same matrix factored in "Example" for SGBF and DGBF. The input for AGB and IPVT in this example is the same as the output for that example.

Call Statement and Input

           AGB  LDA   N   ML  MU  IPVT   BX
            |    |    |   |   |    |     |
CALL SGBS( AGB , 23 , 9 , 2 , 3 , IPVT , BX )

IPVT     =  (2, -65534, -131070, -196606, -262142, -327678, -327678,
             -327680, -327680)
BX       =  (4.0000, 5.0000, 9.0000, 10.0000, 11.0000, 12.0000,
             12.0000, 12.0000, 33.0000)
AGB      =(same as output AGB in "Example")

Output

BX       =  (1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000, 1.0000,
             0.9999, 1.0001)

SPBF, DPBF, SPBCHF, and DPBCHF--Positive Definite Symmetric Band Matrix Factorization

These subroutines factor positive definite symmetric band matrix A, stored in lower-band-packed storage mode, using:

Gaussian elimination for SPBF and DPBF
Cholesky factorization for SPBCHF and DPBCHF

To solve the system of equations with one or more right-hand sides, follow the call to these subroutines with one or more calls to SPBS, DPBS, SPBCHS, or DPBCHS, respectively.

Table 104. Data Types

A                       Subroutine
Short-precision real    SPBF and SPBCHF
Long-precision real     DPBF and DPBCHF

Notes:

The output from these factorization subroutines should be used only as input to the solve subroutines SPBS, DPBS, SPBCHS, and DPBCHS, respectively.
For optimal performance:
- For wide band widths, use _PBCHF.
- For narrow band widths, use either _PBF or _PBCHF.
- For very narrow band widths:
  - In short precision, use either SPBF or SPBCHF.
  - In long precision, use DPBF.

Syntax

Fortran	CALL SPBF | DPBF | SPBCHF | DPBCHF (`apb`, `lda`, `n`, `m`)
C and C++	spbf | dpbf | spbchf | dpbchf (`apb`, `lda`, `n`, `m`);
PL/I	CALL SPBF | DPBF | SPBCHF | DPBCHF (`apb`, `lda`, `n`, `m`);

On Entry

apb: is the positive definite symmetric band matrix A of order n, stored in lower-band-packed storage mode, to be factored. It has a half band width of m. Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 104. See "Notes".
lda: is the leading dimension of the array specified for apb. Specified as: a fullword integer; lda > 0 and lda > m.
n: is the order n of matrix A. Specified as: a fullword integer; n > m.
m: is the half band width of the matrix A. Specified as: a fullword integer; 0 <= m < n.

On Return

apb: is the transformed matrix A of order n, containing the results of the factorization. See "Function". Returned as: an lda by (at least) n array, containing numbers of the data type indicated in Table 104. For further details, see "Notes".

Notes

These subroutines can be used for tridiagonal matrices (m = 1); however, the tridiagonal subroutines, SPTF/DPTF and SPTS/DPTS, are faster.
For SPBF and DPBF when m > 0, location APB(2,n) is sometimes set to 0.
For a description of how a positive definite symmetric band matrix is stored in lower-band-packed storage mode in an array, see "Positive Definite Symmetric Band Matrix".
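As a rough illustration of lower-band-packed storage, the following C sketch copies the lower band of a full symmetric matrix into an array laid out like APB. It assumes the mapping APB(i-j+1, j) = a(i,j) for j <= i <= min(n, j+m), which agrees with the input shown in Example 1 below; the function name pack_lower_band and the row-major layout of the full matrix are assumptions made for this sketch only.

```c
#include <stddef.h>

/* Pack the lower band of a symmetric n-by-n matrix into lower-band-packed
   form. `a` is the full matrix in row-major order (0-based). `apb` is
   column-major with leading dimension lda, so Fortran's APB(r, j) lives
   at apb[(r - 1) + (j - 1) * lda]. Requires lda > m. */
void pack_lower_band(const double *a, int n, int m, double *apb, int lda)
{
    for (int j = 0; j < n; ++j) {
        /* Last row that still falls inside the band in column j. */
        int last = (j + m < n - 1) ? j + m : n - 1;
        for (int i = j; i <= last; ++i)
            apb[(i - j) + (size_t)j * lda] = a[(size_t)i * n + j];
    }
}
```

Positions of APB that fall outside the band (shown as "." in the examples) are not written by this sketch.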

Function

The positive definite symmetric band matrix A, stored in lower-band-packed storage mode, is factored using Gaussian elimination in SPBF and DPBF and Cholesky factorization in SPBCHF and DPBCHF. The transformed matrix A contains the results of the factorization in packed format. This factorization can then be used by SPBS, DPBS, SPBCHS, and DPBCHS, respectively, to solve the system of equations.

For performance reasons, divides are done in a way that reduces the effective exponent range for which DPBF works properly, when processing narrow band widths; therefore, you may want to scale your problem.

Error Conditions

Resource Errors

Unable to allocate internal work area.

Computational Errors

Matrix A is not positive definite (for SPBF and DPBF).
- One or more elements of D contain values less than or equal to 0; all elements of D are checked. The index i of the last nonpositive element encountered is identified in the computational error message.
- The return code is set to 1.
- i can be determined at run time by use of the ESSL error-handling facilities. To obtain this information, you must use ERRSET to change the number of allowable errors for error code 2104 in the ESSL error option table; otherwise, the default value causes your program to terminate when this error occurs. For details, see "Coding Your Program".
Matrix A is not positive definite (for SPBCHF and DPBCHF).
- The leading minor of order i has a nonpositive determinant. The order i is identified in the computational error message.
- The return code is set to 1.
- i can be determined at run time by using the ESSL error-handling facilities. To obtain this information, you must use ERRSET to change the number of allowable errors for error code 2115 in the ESSL error option table; otherwise, the default value causes your program to terminate when this error occurs. For details, see "Coding Your Program".

Input-Argument Errors

lda <= 0
m < 0
m >= n
m >= lda

Example 1

This example shows a factorization of a real positive definite symmetric band matrix A of order 9, using Gaussian elimination, where on input, matrix A is:

             *                                             *
             | 1.0  1.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0 |
             | 1.0  2.0  2.0  1.0  0.0  0.0  0.0  0.0  0.0 |
             | 1.0  2.0  3.0  2.0  1.0  0.0  0.0  0.0  0.0 |
             | 0.0  1.0  2.0  3.0  2.0  1.0  0.0  0.0  0.0 |
             | 0.0  0.0  1.0  2.0  3.0  2.0  1.0  0.0  0.0 |
             | 0.0  0.0  0.0  1.0  2.0  3.0  2.0  1.0  0.0 |
             | 0.0  0.0  0.0  0.0  1.0  2.0  3.0  2.0  1.0 |
             | 0.0  0.0  0.0  0.0  0.0  1.0  2.0  3.0  2.0 |
             | 0.0  0.0  0.0  0.0  0.0  0.0  1.0  2.0  3.0 |
             *                                             *

and on output, matrix A is:

             *                                             *
             | 1.0  1.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0 |
             | 1.0  1.0  1.0  1.0  0.0  0.0  0.0  0.0  0.0 |
             | 1.0  1.0  1.0  1.0  1.0  0.0  0.0  0.0  0.0 |
             | 0.0  1.0  1.0  1.0  1.0  1.0  0.0  0.0  0.0 |
             | 0.0  0.0  1.0  1.0  1.0  1.0  1.0  0.0  0.0 |
             | 0.0  0.0  0.0  1.0  1.0  1.0  1.0  1.0  0.0 |
             | 0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0  1.0 |
             | 0.0  0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0 |
             | 0.0  0.0  0.0  0.0  0.0  0.0  1.0  1.0  1.0 |
             *                                             *

where array location APB(2,9) is set to 0.0.

Call Statement and Input

           APB  LDA  N   M
            |    |   |   |
CALL SPBF( APB , 3 , 9 , 2 )
 
        *                                             *
        | 1.0  2.0  3.0  3.0  3.0  3.0  3.0  3.0  3.0 |
APB  =  | 1.0  2.0  2.0  2.0  2.0  2.0  2.0  2.0   .  |
        | 1.0  1.0  1.0  1.0  1.0  1.0  1.0   .    .  |
        *                                             *

Output

        *                                             *
        | 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0 |
APB  =  | 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  0.0 |
        | 1.0  1.0  1.0  1.0  1.0  1.0  1.0   .    .  |
        *                                             *

Example 2

This example shows a Cholesky factorization of the same matrix used in Example 1.

Call Statement and Input

             APB  LDA  N   M
              |    |   |   |
CALL SPBCHF( APB , 3 , 9 , 2 )

APB      =(same as input APB in Example 1)

Output

        *                                             *
        | 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0 |
APB  =  | 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0   .  |
        | 1.0  1.0  1.0  1.0  1.0  1.0  1.0   .    .  |
        *                                             *

SPBS, DPBS, SPBCHS, and DPBCHS--Positive Definite Symmetric Band Matrix Solve

These subroutines solve the system Ax = b for x, where A is a positive definite symmetric band matrix, and x and b are vectors. They use the results of the factorization of matrix A, produced by a preceding call to SPBF, DPBF, SPBCHF, or DPBCHF, respectively, where:

Gaussian elimination was used by SPBF and DPBF.
Cholesky factorization was used by SPBCHF and DPBCHF.

Table 105. Data Types

A, b, x                 Subroutine
Short-precision real    SPBS and SPBCHS
Long-precision real     DPBS and DPBCHS

Notes:

The input to these solve subroutines must be the output from the factorization subroutines SPBF, DPBF, SPBCHF, and DPBCHF, respectively.
For performance tradeoffs, see SPBF, DPBF, SPBCHF, and DPBCHF--Positive Definite Symmetric Band Matrix Factorization.

Syntax

Fortran	CALL SPBS | DPBS | SPBCHS | DPBCHS (`apb`, `lda`, `n`, `m`, `bx`)
C and C++	spbs | dpbs | spbchs | dpbchs (`apb`, `lda`, `n`, `m`, `bx`);
PL/I	CALL SPBS | DPBS | SPBCHS | DPBCHS (`apb`, `lda`, `n`, `m`, `bx`);

On Entry

apb: is the factorization of matrix A, produced by a preceding call to SPBF, DPBF, SPBCHF, or DPBCHF, respectively. Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 105. See "Notes".
lda: is the leading dimension of the array specified for apb. Specified as: a fullword integer; lda > 0 and lda > m.
n: is the order n of matrix A. Specified as: a fullword integer; n > m.
m: is the half band width of the matrix A. Specified as: a fullword integer; 0 <= m < n.
bx: is the vector b of length n, containing the right-hand side of the system. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 105.

On Return

bx: is the solution vector x of length n, containing the results of the computation. Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 105.

Notes

The scalar data specified for input arguments lda, n, and m for these subroutines must be the same as that specified for SPBF, DPBF, SPBCHF, and DPBCHF, respectively.
The array data specified for input argument apb for these subroutines must be the same as the corresponding output argument for SPBF, DPBF, SPBCHF, and DPBCHF, respectively.
These subroutines can be used for tridiagonal matrices (m = 1); however, the tridiagonal subroutines, SPTF/DPTF and SPTS/DPTS, are faster.
The vectors and matrices used in this computation must have no common elements; otherwise, results are unpredictable. See "Concepts".
For a description of how a positive definite symmetric band matrix is stored in lower-band-packed storage mode in an array, see "Positive Definite Symmetric Band Matrix".

Function

The system Ax = b is solved for x, where A is a positive definite symmetric band matrix, stored in lower-band-packed storage mode, and x and b are vectors. These subroutines use the results of the factorization of matrix A, produced by a preceding call to SPBF, DPBF, SPBCHF, or DPBCHF, respectively.
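For the Cholesky case, the solve step amounts to one forward substitution and one back substitution over the band. The following C sketch assumes the factor L (with A = L * L^T) is held in lower-band-packed form with L(i,j) at APB(i-j+1, j); the function name band_cholesky_solve is hypothetical, and the sketch does not read the internal packed format that SPBF and DPBF produce.

```c
#include <stddef.h>

/* Solve A x = b given the band Cholesky factor L (A = L * L^T) in
   lower-band-packed form (column-major, leading dimension lda, half band
   width m). bx holds b on entry and x on return. */
void band_cholesky_solve(const double *apb, int lda, int n, int m, double *bx)
{
    /* Forward substitution, column by column: L y = b. */
    for (int j = 0; j < n; ++j) {
        const double *col = apb + (size_t)j * lda;
        int len = (m < n - 1 - j) ? m : n - 1 - j;
        bx[j] /= col[0];
        for (int r = 1; r <= len; ++r)
            bx[j + r] -= col[r] * bx[j];
    }
    /* Back substitution: L^T x = y. Row j of L^T is column j of L. */
    for (int j = n - 1; j >= 0; --j) {
        const double *col = apb + (size_t)j * lda;
        int len = (m < n - 1 - j) ? m : n - 1 - j;
        for (int r = 1; r <= len; ++r)
            bx[j] -= col[r] * bx[j + r];
        bx[j] /= col[0];
    }
}
```

Because the factor is only read, the same factorization can be reused for any number of right-hand sides.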

Error Conditions

Computational Errors

None
Note: If the factorization subroutine resulted in a nonpositive definite matrix, error 2104 for SPBF and DPBF or error 2115 for SPBCHF and DPBCHF, results of these subroutines may be unpredictable.

Input-Argument Errors

lda <= 0
m < 0
m >= n
m >= lda

Example 1

This example shows how to solve the system Ax = b, where matrix A is the same matrix factored in "Example 1" for SPBF and DPBF, using Gaussian elimination.

Call Statement and Input

           APB  LDA  N   M   BX
            |    |   |   |   |
CALL SPBS( APB , 3 , 9 , 2 , BX )

APB      =(same as output APB in "Example 1")
BX       =  (3.0, 6.0, 9.0, 9.0, 9.0, 9.0, 9.0, 8.0, 6.0)

Output

BX       =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)

Example 2

This example shows how to solve the system Ax = b, where matrix A is the same matrix factored in "Example 2" for SPBCHF and DPBCHF, using Cholesky factorization.

Call Statement and Input

             APB  LDA  N   M   BX
              |    |   |   |   |
CALL SPBCHS( APB , 3 , 9 , 2 , BX )

APB      =(same as output APB in "Example 2")
BX       =  (3.0, 6.0, 9.0, 9.0, 9.0, 9.0, 9.0, 8.0, 6.0)

Output

BX       =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)

SGTF and DGTF--General Tridiagonal Matrix Factorization

These subroutines compute the standard Gaussian factorization with partial pivoting for tridiagonal matrix A, stored in tridiagonal storage mode. To solve a tridiagonal system with one or more right-hand sides, follow the call to these subroutines with one or more calls to SGTS or DGTS, respectively.

Table 106. Data Types

c, d, e, f              Subroutine
Short-precision real    SGTF
Long-precision real     DGTF

Note: The output from these factorization subroutines should be used only as input to the solve subroutines SGTS and DGTS, respectively.

Syntax

Fortran	CALL SGTF | DGTF (`n`, `c`, `d`, `e`, `f`, `ipvt`)
C and C++	sgtf | dgtf (`n`, `c`, `d`, `e`, `f`, `ipvt`);
PL/I	CALL SGTF | DGTF (`n`, `c`, `d`, `e`, `f`, `ipvt`);

On Entry

n: is the order n of tridiagonal matrix A. Specified as: a fullword integer; n >= 0.
c: is the vector c, containing the lower subdiagonal of matrix A in positions 2 through n in an array, referred to as C. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 106.
d: is the vector d, containing the main diagonal of matrix A, in positions 1 through n in an array, referred to as D. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 106.
e: is the vector e, containing the upper subdiagonal of matrix A, in positions 1 through n-1 in an array, referred to as E. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 106.
f: See 'On Return'.
ipvt: See 'On Return'.

On Return

c: is the vector c, containing part of the factorization of matrix A in positions 1 through n in an array, referred to as C. Returned as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 106.
d: is the vector d, containing part of the factorization of matrix A in an array, referred to as D. Returned as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 106.
e: is the vector e, containing part of the factorization of the matrix A in positions 1 through n in an array, referred to as E. Returned as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 106.
f: is the vector f, containing part of the factorization of matrix A in the first n positions in an array, referred to as F. Returned as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 106.
ipvt: is the integer vector ipvt of length n, containing the pivot information. Returned as: a one-dimensional array of (at least) length n, containing fullword integers.

Notes

For a description of how tridiagonal matrices are stored, see "General Tridiagonal Matrix".
ipvt is not a permutation vector in the strict sense. It is used to record column interchanges in the tridiagonal matrix due to partial pivoting.
The factorization matrix A is stored in nonstandard format.

Function

The standard Gaussian elimination with partial pivoting of tridiagonal matrix A is computed. The factorization is returned by overwriting input arrays C, D, and E, and by writing into output array F, along with pivot information in vector ipvt. This factorization can then be used by SGTS or DGTS, respectively, to solve tridiagonal systems of linear equations. See references [43], [51], [52], and [84]. If n is 0, no computation is performed.

Error Conditions

Computational Errors

Matrix A is singular or nearly singular.

A pivot element has a value that cannot be reciprocated or is equal to 0. The index i of the element is identified in the computational error message.
The return code is set to 1.
i can be determined at run time by use of the ESSL error-handling facilities. To obtain this information, you must use ERRSET to change the number of allowable errors for error code 2105 in the ESSL error option table; otherwise, the default value causes your program to terminate when this error occurs. For details, see "What Can You Do about ESSL Computational Errors?".

Input-Argument Errors

n < 0

Example

This example shows how to factor the following tridiagonal matrix A of order 4:

                          *                    *
                          | 2.0  2.0  0.0  0.0 |
                          | 1.0  3.0  2.0  0.0 |
                          | 0.0  1.0  3.0  2.0 |
                          | 0.0  0.0  1.0  3.0 |
                          *                    *

Call Statement and Input

           N   C   D   E   F   IPVT
           |   |   |   |   |    |
CALL DGTF( 4 , C , D , E , F , IPVT )
 
C        =  ( . , 1.0, 1.0, 1.0)
D        =  (2.0, 3.0, 3.0, 3.0)
E        =  (2.0, 2.0, 2.0, . )

Output

C        =  ( . , -0.5, -0.5, -0.5)
D        =  (-0.5, -0.5, -0.5, -0.5)
E        =  (2.0, 2.0, 2.0, . )
IPVT     =  (X'00', X'00', X'00', X'00')

Notes

F is stored in an internal format and is passed unchanged to the solve subroutine.
A "." means you do not have to store a value in that position in the array. However, these storage positions are required and may be overwritten during the computation.

SGTS and DGTS--General Tridiagonal Matrix Solve

These subroutines solve a tridiagonal system of linear equations using the factorization of tridiagonal matrix A, stored in tridiagonal storage mode, produced by SGTF or DGTF, respectively.

Table 107. Data Types

c, d, e, f, b, x        Subroutine
Short-precision real    SGTS
Long-precision real     DGTS

Note: The input to these solve subroutines must be the output from the factorization subroutines SGTF and DGTF, respectively.

Syntax

Fortran	CALL SGTS | DGTS (`n`, `c`, `d`, `e`, `f`, `ipvt`, `bx`)
C and C++	sgts | dgts (`n`, `c`, `d`, `e`, `f`, `ipvt`, `bx`);
PL/I	CALL SGTS | DGTS (`n`, `c`, `d`, `e`, `f`, `ipvt`, `bx`);

On Entry

n: is the order n of tridiagonal matrix A. Specified as: a fullword integer; n >= 0.
c: is the vector c, containing part of the factorization of matrix A from SGTF or DGTF, respectively, in an array, referred to as C. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 107.
d: is the vector d, containing part of the factorization of matrix A from SGTF or DGTF, respectively, in an array, referred to as D. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 107.
e: is the vector e, containing part of the factorization of matrix A from SGTF or DGTF, respectively, in an array, referred to as E. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 107.
f: is the vector f, containing part of the factorization of matrix A from SGTF or DGTF, respectively, in an array, referred to as F. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 107.
ipvt: is the integer vector ipvt of length n, containing the pivot information, produced by a preceding call to SGTF or DGTF, respectively. Specified as: a one-dimensional array of (at least) length n, containing fullword integers.
bx: is the vector b of length n, containing the right-hand side of the system in the first n positions in an array, referred to as BX. Specified as: a one-dimensional array of (at least) length n+1, containing numbers of the data type indicated in Table 107. For details on specifying the length, see "Notes".

On Return

bx: is the solution vector x of length n, containing the solution of the tridiagonal system in the first n positions in an array, referred to as BX. Returned as: a one-dimensional array of (at least) length n+1, containing numbers of the data type indicated in Table 107. For details about the length, see "Notes".

Notes

For a description of how tridiagonal matrices are stored, see "General Tridiagonal Matrix".
Array BX can have a length of n if memory location BX(n+1) is addressable--that is, not in read-protected storage. If it is in read-protected storage, array BX must have a length of n+1. In both cases, the vector b (on input) and vector x (on output) reside in positions 1 through n in array BX. Array location BX(n+1) is not altered by these subroutines.

Function

Given the factorization produced by SGTF or DGTF, respectively, these subroutines use the standard forward elimination and back substitution to solve the tridiagonal system Ax = b, where A is a general tridiagonal matrix. See references [43], [51], [52], and [84].
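The general idea of Gaussian elimination with partial pivoting on a tridiagonal system can be sketched in C as follows. For brevity this sketch fuses the factorization and the solve into one routine, whereas SGTF/DGTF and SGTS/DGTS keep them separate and store the factorization in an internal format. The function name tridiag_solve_pivot, the use of the array f to hold the second superdiagonal created by row interchanges, and the return convention are all assumptions made for illustration.

```c
#include <math.h>

/* Solve a tridiagonal system with partial pivoting (row interchanges).
   0-based layout: c[i] = a(i, i-1) for i >= 1, d[i] = a(i, i),
   e[i] = a(i, i+1) for i <= n-2. f is work space of length n that
   receives the fill-in a(i, i+2). c, d, e, f are overwritten; bx holds b
   on entry and x on return. Returns 0, or the 1-based index of a zero
   pivot. */
int tridiag_solve_pivot(int n, double *c, double *d, double *e,
                        double *f, double *bx)
{
    for (int i = 0; i + 1 < n; ++i) {
        f[i] = 0.0;
        if (fabs(d[i]) >= fabs(c[i + 1])) {
            /* No interchange: eliminate c[i+1] using row i. */
            if (d[i] == 0.0)
                return i + 1;
            double mult = c[i + 1] / d[i];
            d[i + 1] -= mult * e[i];
            bx[i + 1] -= mult * bx[i];
        } else {
            /* Interchange rows i and i+1, then eliminate. */
            double mult = d[i] / c[i + 1];
            double old_next_diag = d[i + 1];
            d[i] = c[i + 1];
            d[i + 1] = e[i] - mult * old_next_diag;
            e[i] = old_next_diag;
            if (i + 2 < n) {
                f[i] = e[i + 1];              /* fill-in a(i, i+2) */
                e[i + 1] = -mult * e[i + 1];
            }
            double t = bx[i];
            bx[i] = bx[i + 1];
            bx[i + 1] = t - mult * bx[i];
        }
    }
    if (n > 0 && d[n - 1] == 0.0)
        return n;

    /* Back substitution with up to two superdiagonals. */
    for (int i = n - 1; i >= 0; --i) {
        double s = bx[i];
        if (i + 1 < n) s -= e[i] * bx[i + 1];
        if (i + 2 < n) s -= f[i] * bx[i + 2];
        bx[i] = s / d[i];
    }
    return 0;
}
```

With the matrix and right-hand side of the example below, no interchanges occur and the sketch returns a solution of all 1.0 values.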

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0

Example

This example solves the tridiagonal system Ax = b, where matrix A is the same matrix factored in "Example" for SGTF and DGTF, and where:

            b = (4.0, 6.0, 6.0, 4.0)
            x = (1.0, 1.0, 1.0, 1.0)

Call Statement and Input

           N   C   D   E   F   IPVT   BX
           |   |   |   |   |    |     |
CALL DGTS( 4 , C , D , E , F , IPVT , BX )

C        =(same as output C in "Example")
D        =(same as output D in "Example")
E        =(same as output E in "Example")
F        =(same as output F in "Example")
IPVT     =(same as output IPVT in "Example")
BX       =  (4.0, 6.0, 6.0, 4.0, . )

Output

BX       =  (1.0, 1.0, 1.0, 1.0, . )

SGTNP, DGTNP, CGTNP, and ZGTNP--General Tridiagonal Matrix Combined Factorization and Solve with No Pivoting

These subroutines solve the tridiagonal system Ax = b using Gaussian elimination, where tridiagonal matrix A is stored in tridiagonal storage mode.

Table 108. Data Types

c, d, e, b, x              Subroutine
Short-precision real       SGTNP
Long-precision real        DGTNP
Short-precision complex    CGTNP
Long-precision complex     ZGTNP

Note: In general, these subroutines provide better performance than the _GTNPF and _GTNPS subroutines; however, in the following instances, you get better performance by using _GTNPF and _GTNPS:

For small n
When performing a single factorization followed by multiple solves

Syntax

Fortran	CALL SGTNP | DGTNP | CGTNP | ZGTNP (`n`, `c`, `d`, `e`, `bx`)
C and C++	sgtnp | dgtnp | cgtnp | zgtnp (`n`, `c`, `d`, `e`, `bx`);
PL/I	CALL SGTNP | DGTNP | CGTNP | ZGTNP (`n`, `c`, `d`, `e`, `bx`);

On Entry

n: is the order n of tridiagonal matrix A. Specified as: a fullword integer; n >= 0.
c: is the vector c, containing the lower subdiagonal of matrix A in positions 2 through n in an array, referred to as C. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 108. On output, C is overwritten; that is, the original input is not preserved.
d: is the vector d, containing the main diagonal of matrix A in positions 1 through n in an array, referred to as D. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 108. On output, D is overwritten; that is, the original input is not preserved.
e: is the vector e, containing the upper subdiagonal of matrix A in positions 1 through n-1 in an array, referred to as E. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 108. On output, E is overwritten; that is, the original input is not preserved.
bx: is the vector b, containing the right-hand side of the system in the first n positions in an array, referred to as BX. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 108.

On Return

bx: is the solution vector x of length n, containing the solution of the tridiagonal system in the first n positions in an array, referred to as BX. Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 108.

Note

For a description of how tridiagonal matrices are stored, see "General Tridiagonal Matrix".

Function

The solution of the tridiagonal system Ax = b is computed by Gaussian elimination.

No pivoting is done. Therefore, these subroutines should not be used when pivoting is necessary to maintain the numerical accuracy of the solution. Overflow may occur if small main diagonal elements are generated. Underflow or accuracy loss may occur if large main diagonal elements are generated.
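Gaussian elimination without pivoting on a tridiagonal system can be sketched in C as follows, for the long-precision real case only. This is the textbook forward-elimination and back-substitution loop (often called the Thomas algorithm), shown to make the role of the main diagonal elements concrete; it is not ESSL's implementation, and the function name tridiag_solve_nopivot is hypothetical. Unlike the ESSL subroutines, this sketch leaves the array e unchanged.

```c
/* Solve a tridiagonal system A x = b without pivoting.
   0-based layout: c[i] = a(i, i-1) for i >= 1, d[i] = a(i, i),
   e[i] = a(i, i+1) for i <= n-2. c and d are overwritten; bx holds b on
   entry and x on return. No check is made for zero or tiny pivots. */
void tridiag_solve_nopivot(int n, double *c, double *d, const double *e,
                           double *bx)
{
    if (n <= 0)
        return;

    /* Forward elimination: each step divides by the updated diagonal. */
    for (int i = 1; i < n; ++i) {
        c[i] /= d[i - 1];              /* multiplier for row i */
        d[i] -= c[i] * e[i - 1];
        bx[i] -= c[i] * bx[i - 1];
    }

    /* Back substitution. */
    bx[n - 1] /= d[n - 1];
    for (int i = n - 2; i >= 0; --i)
        bx[i] = (bx[i] - e[i] * bx[i + 1]) / d[i];
}
```

Every division is by an updated diagonal element d[i], which is why a very small or very large diagonal value produced during elimination can lead to the overflow, underflow, or accuracy loss described above.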

For performance reasons, complex divides are done without scaling. Computing the inverse in this way restricts the range of numbers for which the ZGTNP subroutine works properly.
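To see why an unscaled complex divide restricts the usable range, compare the two generic formulas in the following C sketch. The source does not state which formula ESSL uses internally, so this is only an illustration of the general effect; the type cplx and the function names div_unscaled and div_scaled are hypothetical. The scaled version follows Smith's method.

```c
#include <math.h>

typedef struct { double re, im; } cplx;

/* Textbook quotient x / y. It forms y.re^2 + y.im^2, which can overflow
   or underflow even when the true quotient is representable. */
cplx div_unscaled(cplx x, cplx y)
{
    double den = y.re * y.re + y.im * y.im;
    cplx q = { (x.re * y.re + x.im * y.im) / den,
               (x.im * y.re - x.re * y.im) / den };
    return q;
}

/* Smith's method: divide through by the larger component of y first,
   so the intermediate values stay close to the operands in magnitude. */
cplx div_scaled(cplx x, cplx y)
{
    cplx q;
    if (fabs(y.re) >= fabs(y.im)) {
        double r = y.im / y.re;
        double den = y.re + y.im * r;
        q.re = (x.re + x.im * r) / den;
        q.im = (x.im - x.re * r) / den;
    } else {
        double r = y.re / y.im;
        double den = y.re * r + y.im;
        q.re = (x.re * r + x.im) / den;
        q.im = (x.im * r - x.re) / den;
    }
    return q;
}
```

For moderate operands the two functions agree. When both operands have components near 1e200, the unscaled form overflows in IEEE double precision and returns a non-finite result, while the scaled form still returns the correct quotient. Keeping the diagonal elements near (1.0, 0.0) avoids this situation.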

For performance reasons, divides are done in a way that reduces the effective exponent range for which DGTNP and ZGTNP work properly; therefore, you may want to scale your problem, such that the diagonal elements are close to 1.0 for DGTNP and (1.0, 0.0) for ZGTNP.

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0

Example 1

This example shows a factorization of the real tridiagonal matrix A, of order 4:

                         *                     *
                         | 7.0  4.0  0.0   0.0 |
                         | 1.0  8.0  5.0   0.0 |
                         | 0.0  2.0  9.0   6.0 |
                         | 0.0  0.0  3.0  10.0 |
                         *                     *

It then finds the solution of the tridiagonal system Ax = b, where b is:

                        (11.0, 14.0, 17.0, 13.0)

and x is:

                        (1.0, 1.0, 1.0, 1.0)

On output, arrays C, D, and E are overwritten.

Call Statement and Input

            N   C   D   E   BX
            |   |   |   |   |
CALL DGTNP( 4 , C , D , E , BX )
 
C        =  ( . , 1.0, 2.0, 3.0)
D        =  (7.0, 8.0, 9.0, 10.0)
E        =  (4.0, 5.0, 6.0, . )
BX       =  (11.0, 14.0, 17.0, 13.0)

Output

BX       =  (1.0, 1.0, 1.0, 1.0)

Example 2

This example shows a factorization of the complex tridiagonal matrix A, of order 4:

         *                                                       *
         | (7.0,  7.0)  (4.0,  4.0)  (0.0,  0.0)   (0.0,  0.0)   |
         | (1.0,  1.0)  (8.0,  8.0)  (5.0,  5.0)   (0.0,  0.0)   |
         | (0.0,  0.0)  (2.0,  2.0)  (9.0,  9.0)   (6.0,  6.0)   |
         | (0.0,  0.0)  (0.0,  0.0)  (3.0,  3.0)  (10.0, 10.0)   |
         *                                                       *

It then finds the solution of the tridiagonal system Ax = b, where b is:

          ((-11.0,19.0), (-14.0,50.0), (-17.0,93.0), (-13.0,85.0))

and x is:

          ((0.0,1.0), (1.0,2.0), (2.0,3.0), (3.0,4.0))

On output, arrays C, D, and E are overwritten.

Call Statement and Input

            N   C   D   E   BX
            |   |   |   |   |
CALL ZGTNP( 4 , C , D , E , BX )
 
C        =  ( . , (1.0, 1.0), (2.0, 2.0), (3.0, 3.0))
D        =  ((7.0, 7.0), (8.0, 8.0), (9.0, 9.0), (10.0, 10.0))
E        =  ((4.0, 4.0), (5.0, 5.0), (6.0, 6.0), . )
BX       =  ((-11.0, 19.0), (-14.0, 50.0), (-17.0, 93.0), (-13.0, 85.0))

Output

BX       =  ((0.0, 1.0), (1.0, 2.0), (2.0, 3.0), (3.0, 4.0))

SGTNPF, DGTNPF, CGTNPF, and ZGTNPF--General Tridiagonal Matrix Factorization with No Pivoting

These subroutines factor tridiagonal matrix A, stored in tridiagonal storage mode, using Gaussian elimination. To solve a tridiagonal system of linear equations with one or more right-hand sides, follow the call to these subroutines with one or more calls to SGTNPS, DGTNPS, CGTNPS, or ZGTNPS, respectively.

Table 109. Data Types

c, d, e                    Subroutine
Short-precision real       SGTNPF
Long-precision real        DGTNPF
Short-precision complex    CGTNPF
Long-precision complex     ZGTNPF

Notes:

The output from these factorization subroutines should be used only as input to the solve subroutines SGTNPS, DGTNPS, CGTNPS, and ZGTNPS, respectively.
In general, the _GTNP subroutines provide better performance than the _GTNPF and _GTNPS subroutines; however, in the following instances, you get better performance by using _GTNPF and _GTNPS:
- For small n
- When performing a single factorization followed by multiple solves

Syntax

Fortran	CALL SGTNPF | DGTNPF | CGTNPF | ZGTNPF (`n`, `c`, `d`, `e`, `iopt`)
C and C++	sgtnpf | dgtnpf | cgtnpf | zgtnpf (`n`, `c`, `d`, `e`, `iopt`);
PL/I	CALL SGTNPF | DGTNPF | CGTNPF | ZGTNPF (`n`, `c`, `d`, `e`, `iopt`);

On Entry

n: is the order n of tridiagonal matrix A. Specified as: a fullword integer; n >= 0.
c: is the vector c, containing the lower subdiagonal of matrix A in positions 2 through n in an array, referred to as C. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 109.
d: is the vector d, containing the main diagonal of matrix A in positions 1 through n in an array, referred to as D. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 109.
e: is the vector e, containing the upper subdiagonal of matrix A in positions 1 through n-1 in an array, referred to as E. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 109.
iopt: indicates the type of computation to be performed, where: if iopt = 0 or 1, Gaussian elimination is used to factor the matrix. Specified as: a fullword integer; iopt = 0 or 1.

On Return

c: is the vector c, containing part of the factorization of matrix A in positions 1 through n in an array, referred to as C. Returned as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 109.
d: is the vector d, containing part of the factorization of matrix A in an array, referred to as D. Returned as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 109.
e: is the vector e, containing part of the factorization of matrix A in positions 1 through n in an array, referred to as E. Returned as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 109. It has the same length as E on entry.

Note

For a description of how tridiagonal matrices are stored, see "General Tridiagonal Matrix".

Function

The factorization of a diagonally-dominant tridiagonal matrix A is computed using Gaussian elimination. This factorization can then be used by SGTNPS, DGTNPS, CGTNPS, or ZGTNPS, respectively, to solve the tridiagonal systems of linear equations. See reference [71].

No pivoting is done by these subroutines. Therefore, these subroutines should not be used when pivoting is necessary to maintain the numerical accuracy of the solution. Overflow may occur if small main diagonal elements are generated. Underflow or accuracy loss may occur if large main diagonal elements are generated.

For performance reasons, complex divides are done without scaling. Computing the inverse in this way restricts the range of numbers for which ZGTNPF works properly.

For performance reasons, divides are done in a way that reduces the effective exponent range for which DGTNPF and ZGTNPF work properly; therefore, you may want to scale your problem, such that the diagonal elements are close to 1.0 for DGTNPF and (1.0, 0.0) for ZGTNPF.
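The benefit of separating the factorization from the solve, when several right-hand sides share one matrix, can be sketched in C as follows. This is a plain one-directional elimination without pivoting for the long-precision real case; the format it stores in c and d is not the format that the ESSL subroutines return (compare the Output in the examples below), and the function names tridiag_factor_nopivot and tridiag_solve_factored are hypothetical.

```c
/* Factor a tridiagonal matrix without pivoting.
   0-based layout: c[i] = a(i, i-1) for i >= 1, d[i] = a(i, i),
   e[i] = a(i, i+1) for i <= n-2. On return c[i] holds the elimination
   multiplier for row i and d[i] holds the updated pivot. */
void tridiag_factor_nopivot(int n, double *c, double *d, const double *e)
{
    for (int i = 1; i < n; ++i) {
        c[i] /= d[i - 1];
        d[i] -= c[i] * e[i - 1];
    }
}

/* Solve using a factorization from tridiag_factor_nopivot. Only bx is
   modified, so the factorization can be reused for further right-hand
   sides. bx holds b on entry and x on return. */
void tridiag_solve_factored(int n, const double *c, const double *d,
                            const double *e, double *bx)
{
    if (n <= 0)
        return;
    for (int i = 1; i < n; ++i)            /* forward elimination on b */
        bx[i] -= c[i] * bx[i - 1];
    bx[n - 1] /= d[n - 1];
    for (int i = n - 2; i >= 0; --i)       /* back substitution */
        bx[i] = (bx[i] - e[i] * bx[i + 1]) / d[i];
}
```

A typical use is one call to the factor routine followed by one call to the solve routine per right-hand side, so the cost of updating the diagonal is paid only once.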

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0
iopt <> 0 or 1

Example 1

This example shows a factorization of the tridiagonal matrix A, of order 4:

                          *                    *
                          | 1.0  1.0  0.0  0.0 |
                          | 1.0  2.0  1.0  0.0 |
                          | 0.0  1.0  3.0  1.0 |
                          | 0.0  0.0  1.0  1.0 |
                          *                    *

Call Statement and Input

             N   C   D   E   IOPT
             |   |   |   |    |
CALL DGTNPF( 4 , C , D , E ,  0   )
 
C        =  ( . , 1.0, 1.0, 1.0)
D        =  (1.0, 2.0, 3.0, 1.0)
E        =  (1.0, 1.0, 1.0, .  )

Output

C        =  ( . , -1.0, -1.0, 1.0)
D        =  (-1.0, -1.0, -1.0, -1.0)
E        =  (1.0, 1.0, -1.0, . )

Example 2

This example shows a factorization of the tridiagonal matrix A, of order 4:

             *                                               *
             | (7.0, 7.0) (4.0, 4.0) (0.0, 0.0)   (0.0, 0.0) |
             | (1.0, 1.0) (8.0, 8.0) (5.0, 5.0)   (0.0, 0.0) |
             | (0.0, 0.0) (2.0, 2.0) (9.0, 9.0)   (6.0, 6.0) |
             | (0.0, 0.0) (0.0, 0.0) (3.0, 3.0) (10.0, 10.0) |
             *                                               *

Call Statement and Input

             N   C   D   E   IOPT
             |   |   |   |    |
CALL ZGTNPF( 4 , C , D , E ,  0   )
 
C        =  ( . , (1.0, 1.0), (2.0, 2.0), (3.0, 3.0))
D        =  ((7.0, 7.0), (8.0, 8.0), (9.0, 9.0), (10.0, 10.0))
E        =  ((4.0, 4.0), (5.0, 5.0), (6.0, 6.0), . )

Output

C        =  ( . , (-0.142, 0.0), (-0.269, 0.0), (3.0, 3.0))
D        =  ((-0.0714, 0.0714), (-0.0673, 0.0673), (-0.0854, 0.0854),
             (-0.05, 0.05))
E        =  ((4.0, 4.0), (5.0, 5.0), (-0.6, 0.0), . )

Notes

A "." means you do not have to store a value in that position in the array. However, these storage positions are required and may be overwritten during the computation.

SGTNPS, DGTNPS, CGTNPS, and ZGTNPS--General Tridiagonal Matrix Solve with No Pivoting

These subroutines solve a tridiagonal system of equations using the factorization of matrix A, stored in tridiagonal storage mode, produced by SGTNPF, DGTNPF, CGTNPF, or ZGTNPF, respectively.

Table 110. Data Types

c, d, e, b, x              Subroutine
Short-precision real       SGTNPS
Long-precision real        DGTNPS
Short-precision complex    CGTNPS
Long-precision complex     ZGTNPS

Note: The input to these solve subroutines must be the output from the factorization subroutines SGTNPF, DGTNPF, CGTNPF, and ZGTNPF, respectively.

Syntax

Fortran	CALL SGTNPS \| DGTNPS \| CGTNPS \| ZGTNPS (`n`, `c`, `d`, `e`, `bx`)
C and C++	sgtnps \| dgtnps \| cgtnps \| zgtnps (`n`, `c`, `d`, `e`, `bx`);
PL/I	CALL SGTNPS \| DGTNPS \| CGTNPS \| ZGTNPS (`n`, `c`, `d`, `e`, `bx`);

On Entry

n: is the order n of tridiagonal matrix A. Specified as: a fullword integer; n >= 0.
c: is the vector c, containing part of the factorization of matrix A from SGTNPF, DGTNPF, CGTNPF, and ZGTNPF, respectively, in an array, referred to as C. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 110.
d: is the vector d, containing part of the factorization of matrix A from SGTNPF, DGTNPF, CGTNPF, and ZGTNPF, respectively, in an array, referred to as D. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 110.
e: is the vector e, containing part of the factorization of matrix A from SGTNPF, DGTNPF, CGTNPF, and ZGTNPF, respectively, in an array, referred to as E. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 110.
bx: is the vector b, containing the right-hand side of the system in the first n positions in an array, referred to as BX. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 110.

On Return

bx: is the solution vector x of length n, containing the solution of the tridiagonal system in the first n positions in an array, referred to as BX. Returned as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 110.

Note

For a description of how tridiagonal matrices are stored, see "General Tridiagonal Matrix".

Function

The solution of tridiagonal system Ax = b is computed using the factorization produced by SGTNPF, DGTNPF, CGTNPF, or ZGTNPF, respectively. The factorization is based on Gaussian elimination. See reference [71].
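
As a rough illustration of Gaussian elimination without pivoting on a tridiagonal system, the following Python sketch uses the textbook forward-elimination and back-substitution scheme. It is an assumption-laden sketch, not the ESSL algorithm: it does not reproduce the factored form that SGTNPF, DGTNPF, CGTNPF, or ZGTNPF return in C, D, and E, and the function name is hypothetical. It does reproduce the solution vectors of the two examples that follow.

```python
def solve_tridiagonal_no_pivot(c, d, e, b):
    """Solve Ax = b for tridiagonal A without pivoting (textbook scheme).

    c: subdiagonal, c[0] unused; d: main diagonal; e: superdiagonal,
    e[n-1] unused. Works for real or complex entries. Inputs are copied.
    """
    n = len(d)
    if n == 0:
        return []
    d = list(d)
    x = list(b)
    # Forward elimination: remove the subdiagonal one row at a time.
    for i in range(1, n):
        m = c[i] / d[i - 1]
        d[i] = d[i] - m * e[i - 1]
        x[i] = x[i] - m * x[i - 1]
    # Back substitution.
    x[n - 1] = x[n - 1] / d[n - 1]
    for i in range(n - 2, -1, -1):
        x[i] = (x[i] - e[i] * x[i + 1]) / d[i]
    return x


# Real system from Example 1 (0.0 fills the unused positions).
x1 = solve_tridiagonal_no_pivot(
    [0.0, 1.0, 1.0, 1.0], [1.0, 2.0, 3.0, 1.0], [1.0, 1.0, 1.0, 0.0],
    [2.0, 4.0, 5.0, 2.0])

# Complex system from Example 2.
x2 = solve_tridiagonal_no_pivot(
    [0j, 1 + 1j, 2 + 2j, 3 + 3j],
    [7 + 7j, 8 + 8j, 9 + 9j, 10 + 10j],
    [4 + 4j, 5 + 5j, 6 + 6j, 0j],
    [-11 + 19j, -14 + 50j, -17 + 93j, -13 + 85j])
```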

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0

Example 1

This example finds the solution of tridiagonal system Ax = b, where matrix A is the same matrix factored in "Example 1" for SGTNPF and DGTNPF. b is:

                    (2.0, 4.0, 5.0, 2.0)

and x is:

                    (1.0, 1.0, 1.0, 1.0)

Call Statement and Input

             N   C   D   E   BX
             |   |   |   |   |
CALL DGTNPS( 4 , C , D , E , BX )

C        =(same as output C in "Example 1")
D        =(same as output D in "Example 1")
E        =(same as output E in "Example 1")
BX       =(2.0, 4.0, 5.0, 2.0)

Output

BX       =  (1.0, 1.0, 1.0, 1.0)

Example 2

This example finds the solution of tridiagonal system Ax = b, where matrix A is the same matrix factored in "Example 2" for CGTNPF and ZGTNPF. b is:

          ((-11.0,19.0), (-14.0,50.0), (-17.0,93.0), (-13.0,85.0))

and x is:

          ((0.0,1.0), (1.0,2.0), (2.0,3.0), (3.0,4.0))

Call Statement and Input

             N   C   D   E   BX
             |   |   |   |   |
CALL ZGTNPS( 4 , C , D , E , BX )

C        =(same as output C in "Example 2")
D        =(same as output D in "Example 2")
E        =(same as output E in "Example 2")
BX       =((-11.0, 19.0), (-14.0, 50.0), (-17.0, 93.0), (-13.0, 85.0))

Output

BX       =  ((0.0, 1.0), (1.0, 2.0), (2.0, 3.0), (3.0, 4.0))

SPTF and DPTF--Positive Definite Symmetric Tridiagonal Matrix Factorization

These subroutines factor positive definite symmetric tridiagonal matrix A, stored in symmetric-tridiagonal storage mode, using Gaussian elimination. To solve a tridiagonal system of linear equations with one or more right-hand sides, follow the call to these subroutines with one or more calls to SPTS or DPTS, respectively.

Table 111. Data Types

c, d                    Subroutine
Short-precision real    SPTF
Long-precision real     DPTF

Note: The output from these factorization subroutines should be used only as input to the solve subroutines SPTS and DPTS, respectively.

Syntax

Fortran	CALL SPTF \| DPTF (`n`, `c`, `d`, `iopt`)
C and C++	sptf \| dptf (`n`, `c`, `d`, `iopt`);
PL/I	CALL SPTF \| DPTF (`n`, `c`, `d`, `iopt`);

On Entry

n

is the order n of tridiagonal matrix A. Specified as: a fullword integer; n >= 0.

c

is the vector c, containing the off-diagonal of matrix A in positions 2 through n in an array, referred to as C. Specified as: a one-dimensional array, of (at least) length n, containing numbers of the data type indicated in Table 111.

d

is the vector d, containing the main diagonal of matrix A in positions 1 through n in an array referred to as D. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 111.

iopt

indicates the type of computation to be performed, where:

If iopt = 0 or 1, Gaussian elimination is used to factor the matrix.

Specified as: a fullword integer; iopt = 0 or 1.

On Return

c: is the vector c, containing part of the factorization of matrix A in an array, referred to as C. Returned as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 111.
d: is the vector d, containing part of the factorization of matrix A in positions 1 through n in an array, referred to as D. Returned as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 111. It has the same length as D on entry.

Note

For a description of how positive definite symmetric tridiagonal matrices are stored, see "Positive Definite Symmetric Tridiagonal Matrix".

Function

The factorization of positive definite symmetric tridiagonal matrix A is computed using Gaussian elimination. This factorization can then be used by SPTS or DPTS, respectively, to solve the tridiagonal systems of linear equations. See reference [71].

No pivoting is done. Therefore, these subroutines should not be used when pivoting is necessary to maintain the numerical accuracy of the solution. Overflow may occur if small pivots are generated.

For performance reasons, divides are done in a way that reduces the effective exponent range for which DPTF works properly; therefore, you may want to scale your problem, such that the diagonal elements are close to 1.0 for DPTF.
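
One possible way to bring the diagonal elements to 1.0 while keeping the matrix symmetric is to scale rows and columns by the same factors. The Python sketch below assumes a scaling matrix S = diag(1/sqrt(d(i))), so that SAS has a unit diagonal; the scaled system (SAS)y = Sb is solved in place of Ax = b, and x(i) = s(i)y(i). This choice of scaling and the function name are assumptions for illustration, not something ESSL performs or prescribes.

```python
import math


def scale_symmetric_tridiagonal(c, d, b):
    """Return (c_s, d_s, b_s, s) for the scaled system (S A S) y = S b.

    c: off-diagonal in positions 2..n (c[0] unused); d: main diagonal,
    which must be positive; b: right-hand side. Hypothetical helper.
    """
    n = len(d)
    s = [1.0 / math.sqrt(di) for di in d]        # scaling factors
    d_s = [1.0] * n                              # s(i) * d(i) * s(i) = 1
    c_s = list(c)
    for i in range(1, n):
        c_s[i] = c[i] * s[i - 1] * s[i]          # off-diagonal of S A S
    b_s = [s[i] * b[i] for i in range(n)]
    return c_s, d_s, b_s, s


# Matrix and right-hand side with the same values as the DPTF/DPTS examples.
c_s, d_s, b_s, s = scale_symmetric_tridiagonal(
    [0.0, 1.0, 1.0, 1.0], [1.0, 2.0, 3.0, 1.0], [2.0, 4.0, 5.0, 2.0])
```

After solving the scaled system for y, multiply each y(i) by s(i) to recover x.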

Error Conditions

Computational Errors

None
Note: There is no test for positive definiteness in these subroutines.
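
Because these subroutines do not test for positive definiteness, a caller may want to check it separately. A symmetric matrix is positive definite exactly when every pivot of elimination without pivoting is positive (in exact arithmetic). The following Python sketch applies that test to a symmetric tridiagonal matrix; the function name is hypothetical and the code is not part of ESSL.

```python
def is_positive_definite_tridiagonal(c, d):
    """Check positive definiteness through the elimination pivots.

    c: off-diagonal in positions 2..n (c[0] unused); d: main diagonal.
    Returns False as soon as a pivot is zero or negative.
    """
    pivot = None
    for i in range(len(d)):
        if i == 0:
            pivot = d[0]
        else:
            pivot = d[i] - c[i] * c[i] / pivot
        if pivot <= 0.0:
            return False
    return True


# Same values as the DPTF example: pivots are 1.0, 1.0, 2.0, 0.5.
pd_example = is_positive_definite_tridiagonal(
    [0.0, 1.0, 1.0, 1.0], [1.0, 2.0, 3.0, 1.0])

# Changing d(2) to 1.0 makes the second pivot zero.
pd_modified = is_positive_definite_tridiagonal(
    [0.0, 1.0, 1.0, 1.0], [1.0, 1.0, 3.0, 1.0])
```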

Input-Argument Errors

n < 0
iopt <> 0 or 1

Example

This example shows a factorization of the tridiagonal matrix A, of order 4:

                          *                    *
                          | 1.0  1.0  0.0  0.0 |
                          | 1.0  2.0  1.0  0.0 |
                          | 0.0  1.0  3.0  1.0 |
                          | 0.0  0.0  1.0  1.0 |
                          *                    *

Call Statement and Input

           N   C   D  IOPT
           |   |   |   |
CALL DPTF( 4 , C , D , 0  )
 
C        =  ( . , 1.0, 1.0, 1.0)
D        =  (1.0, 2.0, 3.0, 1.0)

Output

C        =  ( . , -1.0, -1.0, -1.0)
D        =  (-1.0, -1.0, -1.0, -1.0)

Notes

A "." means you do not have to store a value in that position in the array. However, these storage positions are required and may be overwritten during the computation.

SPTS and DPTS--Positive Definite Symmetric Tridiagonal Matrix Solve

These subroutines solve a positive definite symmetric tridiagonal system of equations using the factorization of matrix A, stored in symmetric-tridiagonal storage mode, produced by SPTF and DPTF, respectively.

Table 112. Data Types

c, d, b, x              Subroutine
Short-precision real    SPTS
Long-precision real     DPTS

Note: The input to these solve subroutines must be the output from the factorization subroutines SPTF and DPTF, respectively.

Syntax

Fortran	CALL SPTS \| DPTS (`n`, `c`, `d`, `bx`)
C and C++	spts \| dpts (`n`, `c`, `d`, `bx`);
PL/I	CALL SPTS \| DPTS (`n`, `c`, `d`, `bx`);

On Entry

n: is the order n of tridiagonal matrix A. Specified as: a fullword integer; n >= 0.
c: is the vector c, containing part of the factorization of matrix A from SPTF or DPTF, respectively, in an array, referred to as C. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 112.
d: is the vector d, containing part of the factorization of matrix A from SPTF or DPTF, respectively, in an array, referred to as D. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 112.
bx: is the vector b, containing the right-hand side of the system in the first n positions in an array, referred to as BX. Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 112.

On Return

bx: is the solution vector x of length n, containing the solution of the tridiagonal system in the first n positions in an array, referred to as BX. Returned as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 112.

Note

For a description of how positive definite symmetric tridiagonal matrices are stored, see "Positive Definite Symmetric Tridiagonal Matrix".

Function

The solution of positive definite symmetric tridiagonal system Ax = b is computed using the factorization produced by SPTF or DPTF, respectively. The factorization is based on Gaussian elimination. See reference [71].
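
For orientation, the following Python sketch factors a symmetric tridiagonal matrix as L D L^T (the symmetric form of Gaussian elimination without pivoting) and then solves with the factors. It is a textbook scheme under stated assumptions: the factor layout differs from the C and D arrays that SPTF and DPTF return, and the function names are hypothetical.

```python
def ldlt_factor(c, d):
    """Factor symmetric tridiagonal A = L D L^T without pivoting.

    c: off-diagonal in positions 2..n (c[0] unused); d: main diagonal.
    Returns (l, p): l[i] is the subdiagonal of unit lower bidiagonal L
    (l[0] unused) and p is the diagonal of D.
    """
    n = len(d)
    l = [0.0] * n
    p = list(d)
    for i in range(1, n):
        l[i] = c[i] / p[i - 1]
        p[i] = d[i] - l[i] * c[i]
    return l, p


def ldlt_solve(l, p, b):
    """Solve L D L^T x = b using the factors from ldlt_factor."""
    n = len(p)
    x = list(b)
    for i in range(1, n):              # forward solve: L y = b
        x[i] -= l[i] * x[i - 1]
    for i in range(n):                 # diagonal solve: D z = y
        x[i] /= p[i]
    for i in range(n - 2, -1, -1):     # backward solve: L^T x = z
        x[i] -= l[i + 1] * x[i + 1]
    return x


# Same matrix and right-hand side as the example below.
l, p = ldlt_factor([0.0, 1.0, 1.0, 1.0], [1.0, 2.0, 3.0, 1.0])
x = ldlt_solve(l, p, [2.0, 4.0, 5.0, 2.0])
```

The factorization is computed once and can be reused for any number of right-hand sides, which mirrors the factor-then-solve calling sequence of these subroutines.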

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0

Example

This example finds the solution of tridiagonal system Ax = b, where matrix A is the same matrix factored in "Example" for SPTF and DPTF. b is:

                  (2.0, 4.0, 5.0, 2.0)

and x is:

                  (1.0, 1.0, 1.0, 1.0)

Call Statement and Input

           N   C   D   BX
           |   |   |   |
CALL DPTS( 4 , C , D , BX )
 
C        =  ( . , -1.0, -1.0, -1.0)
D        =  (-1.0, -1.0, -1.0, -1.0)
BX       =  (2.0, 4.0, 5.0, 2.0)

Output

BX       =  (1.0, 1.0, 1.0, 1.0)

STBSV, DTBSV, CTBSV, and ZTBSV--Triangular Band Equation Solve

STBSV and DTBSV solve one of the following triangular banded systems of equations with a single right-hand side, using the vector x and triangular band matrix A or its transpose:

Solution            Equation
1. x <-- A^-1x      Ax = b
2. x <-- A^-Tx      A^Tx = b

CTBSV and ZTBSV solve one of the following triangular banded systems of equations with a single right-hand side, using the vector x and triangular band matrix A, its transpose, or its conjugate transpose:

Solution            Equation
1. x <-- A^-1x      Ax = b
2. x <-- A^-Tx      A^Tx = b
3. x <-- A^-Hx      A^Hx = b

Matrix A can be either upper or lower triangular and is stored in upper- or lower-triangular-band-packed storage mode, respectively.

Table 113. Data Types

A, x                       Subprogram
Short-precision real       STBSV
Long-precision real        DTBSV
Short-precision complex    CTBSV
Long-precision complex     ZTBSV

Syntax

Fortran	CALL STBSV \| DTBSV \| CTBSV \| ZTBSV (`uplo`, `trans`, `diag`, `n`, `k`, `a`, `lda`, `x`, `incx`)
C and C++	stbsv \| dtbsv \| ctbsv \| ztbsv (`uplo`, `trans`, `diag`, `n`, `k`, `a`, `lda`, `x`, `incx`);
PL/I	CALL STBSV \| DTBSV \| CTBSV \| ZTBSV (`uplo`, `trans`, `diag`, `n`, `k`, `a`, `lda`, `x`, `incx`);

On Entry

uplo

indicates whether matrix A is an upper or lower triangular band matrix, where:

If uplo = 'U', A is an upper triangular matrix.

If uplo = 'L', A is a lower triangular matrix.

Specified as: a single character. It must be 'U' or 'L'.

trans

indicates the form of matrix A used in the system of equations, where:

If trans = 'N', A is used, resulting in solution 1.

If trans = 'T', A^T is used, resulting in solution 2.

If trans = 'C', A^H is used, resulting in solution 3.

Specified as: a single character. It must be 'N', 'T', or 'C'.

diag

indicates the characteristics of the diagonal of matrix A, where:

If diag = 'U', A is a unit triangular matrix.

If diag = 'N', A is not a unit triangular matrix.

Specified as: a single character. It must be 'U' or 'N'.

n

is the order of triangular band matrix A. Specified as: a fullword integer; n >= 0.

k

is the upper or lower band width k of the matrix A. Specified as: a fullword integer; k >= 0.

a

is the upper or lower triangular band matrix A of order n, stored in upper- or lower-triangular-band-packed storage mode, respectively. Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 113.

lda

is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= k+1.

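x

is the vector x of length n, containing the right-hand side of the triangular system to be solved. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 113.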

incx

is the stride for vector x. Specified as: a fullword integer; incx > 0 or incx < 0.

On Return

x: is the solution vector x of length n, containing the results of the computation. Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 113.

Notes

These subroutines accept lowercase letters for the uplo, trans, and diag arguments.
For STBSV and DTBSV, if you specify 'C' for the trans argument, it is interpreted as though you specified 'T'.
Matrix A and vector x must have no common elements; otherwise, results are unpredictable.
For unit triangular matrices, the elements of the diagonal are assumed to be 1.0 for real matrices and (1.0, 0.0) for complex matrices, and you do not need to set these values in the array.
For both upper and lower triangular band matrices, if you specify k >= n, ESSL assumes, for purposes of the computation only, that the upper or lower band width of matrix A is n-1; that is, it processes matrix A of order n, as though it is a (nonbanded) triangular matrix. However, ESSL uses the original value for k for the purposes of finding the locations of element a₁₁ and all other elements in the array specified for A, as described in "Triangular Band Matrix". For an illustration of this technique, see "Example 3".
For a description of triangular band matrices and how they are stored in upper- and lower-triangular-band-packed storage mode, see "Triangular Band Matrix".
If you are using a lower triangular band matrix, it may save your program some time if you use this alternate approach instead of using lower-triangular-band-packed storage mode. Leave matrix A in full-matrix storage mode when you pass it to ESSL and specify the lda argument to be lda+1, which is the leading dimension of matrix A plus 1. ESSL then processes the matrix elements in the same way as though you had set them up in lower-triangular-band-packed storage mode.
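
The alternate approach in the last note works because of how column-major addressing lines up. In lower-triangular-band-packed storage, element a(i,j) sits in row i-j+1 of column j. With a leading dimension of lda+1, that position has offset (i-j) + (j-1)(lda+1) = (i-1) + (j-1)lda, which is exactly where a(i,j) lives in full-matrix storage with leading dimension lda. The following Python sketch checks this identity on a small matrix; the helper names are hypothetical and a flat list stands in for Fortran column-major storage.

```python
def full_column_major(rows, lda):
    """Flatten a square matrix (list of rows) into column-major order."""
    n = len(rows)
    flat = [0.0] * (lda * n)
    for j in range(n):
        for i in range(n):
            flat[i + j * lda] = rows[i][j]
    return flat


def lower_band_packed_element(flat, ld_packed, i, j):
    """Read a(i,j), i >= j (1-based), using band-packed addressing.

    The element is taken from row i-j+1, column j of an array whose
    leading dimension is ld_packed.
    """
    return flat[(i - j) + (j - 1) * ld_packed]


# A 4 by 4 lower triangular matrix with distinct entries 10*i + j.
n, lda = 4, 4
rows = [[float(10 * i + j) if i >= j else 0.0 for j in range(1, n + 1)]
        for i in range(1, n + 1)]
flat = full_column_major(rows, lda)

# Reading the full-storage array with leading dimension lda+1 recovers
# every element of the lower triangle.
recovered = {(i, j): lower_band_packed_element(flat, lda + 1, i, j)
             for j in range(1, n + 1) for i in range(j, n + 1)}
```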

Function

These subroutines solve a triangular banded system of equations with a single right-hand side. The solution, x, may be any of the following, where triangular band matrix A, its transpose, or its conjugate transpose is used, and where A can be either upper- or lower-triangular:

1. x <-- A^-1x

2. x <-- A^-Tx

3. x <-- A^-Hx (for CTBSV and ZTBSV only)

where:

x is a vector of length n.

A is an upper or lower triangular band matrix of order n, stored in upper- or lower-triangular-band-packed storage mode, respectively.

See references [34], [46], and [38]. If n is 0, no computation is performed.
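
The computation can be sketched in Python as a substitution loop that reads elements directly from band-packed storage. The sketch is deliberately limited and rests on several assumptions: real data only, a non-unit diagonal (diag = 'N'), trans of 'N' or 'T', and a stride of 1. The packed array is passed as a list of lda rows, matching the way the arrays are printed in the examples. The function name tbsv_sketch is hypothetical.

```python
def tbsv_sketch(uplo, trans, n, k, ab, x):
    """Solve op(A) x = b for a triangular band matrix; returns a new list.

    ab[r][j] is row r+1, column j+1 of the band-packed array. Positions
    shown as "." in the examples may hold anything; they are never read.
    """
    def a(i, j):                       # A(i,j), 0-based, zero outside band
        if uplo == 'U':
            return ab[k + i - j][j] if 0 <= j - i <= k else 0.0
        return ab[i - j][j] if 0 <= i - j <= k else 0.0

    def op(i, j):                      # element of A or of its transpose
        return a(j, i) if trans == 'T' else a(i, j)

    upper = (uplo == 'U') != (trans == 'T')   # is op(A) upper triangular?
    x = list(x)
    order = range(n - 1, -1, -1) if upper else range(n)
    for i in order:
        if upper:
            lo, hi = i + 1, min(n, i + k + 1)
        else:
            lo, hi = max(0, i - k), i
        s = x[i]
        for j in range(lo, hi):
            s -= op(i, j) * x[j]
        x[i] = s / op(i, i)
    return x


# Upper band matrix of Example 1 (None marks unused positions).
ab_u = [[None, None, 1.0, 3.0, 1.0, 2.0, 1.0, 2.0, 0.0],
        [None, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0],
        [1.0, 4.0, 4.0, 4.0, 3.0, 3.0, 3.0, 2.0, 1.0]]
b = [2.0, 7.0, 1.0, 8.0, 2.0, 8.0, 1.0, 8.0, 3.0]
x_upper = tbsv_sketch('U', 'N', 9, 2, ab_u, b)

# Lower band matrix of Example 2, used transposed: same system.
ab_l = [[1.0, 4.0, 4.0, 4.0, 3.0, 3.0, 3.0, 2.0, 1.0],
        [1.0, 2.0, 1.0, 2.0, 1.0, 2.0, 1.0, 2.0, None],
        [1.0, 3.0, 1.0, 2.0, 1.0, 2.0, 0.0, None, None]]
x_lower_t = tbsv_sketch('L', 'T', 9, 2, ab_l, b)
```

Because the diagonal is located from k, not from the effective band width, the same addressing also covers the k >= n case described in the notes.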

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0
k < 0
lda <= 0
lda < k+1
incx = 0
uplo <> 'L' or 'U'
trans <> 'T', 'N', or 'C'
diag <> 'N' or 'U'

Example 1

This example shows the solution x <-- A^-1x. Matrix A is a real 9 by 9 upper triangular band matrix with an upper band width of 2 that is not unit triangular, stored in upper-triangular-band-packed storage mode. Vector x is a vector of length 9, where matrix A is:

             *                                             *
             | 1.0  1.0  1.0  0.0  0.0  0.0  0.0  0.0  0.0 |
             | 0.0  4.0  2.0  3.0  0.0  0.0  0.0  0.0  0.0 |
             | 0.0  0.0  4.0  1.0  1.0  0.0  0.0  0.0  0.0 |
             | 0.0  0.0  0.0  4.0  2.0  2.0  0.0  0.0  0.0 |
             | 0.0  0.0  0.0  0.0  3.0  1.0  1.0  0.0  0.0 |
             | 0.0  0.0  0.0  0.0  0.0  3.0  2.0  2.0  0.0 |
             | 0.0  0.0  0.0  0.0  0.0  0.0  3.0  1.0  0.0 |
             | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  2.0  2.0 |
             | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  1.0 |
             *                                             *

Call Statement and Input

            UPLO TRANS  DIAG  N   K   A  LDA  X  INCX
             |     |     |    |   |   |   |   |   |
CALL STBSV( 'U' , 'N' , 'N' , 9 , 2 , A , 3 , X , 1  )

        *                                             *
        |  .    .   1.0  3.0  1.0  2.0  1.0  2.0  0.0 |
A    =  |  .   1.0  2.0  1.0  2.0  1.0  2.0  1.0  2.0 |
        | 1.0  4.0  4.0  4.0  3.0  3.0  3.0  2.0  1.0 |
        *                                             *
 
X        =  (2.0, 7.0, 1.0, 8.0, 2.0, 8.0, 1.0, 8.0, 3.0)

Output

X        =  (1.0, 1.0, 0.0, 1.0, 0.0, 2.0, 0.0, 1.0, 3.0)

Example 2

This example shows the solution x <-- A^-Tx, solving the same system as in Example 1. Matrix A is a real 9 by 9 lower triangular band matrix with a lower band width of 2 that is not unit triangular, stored in lower-triangular-band-packed storage mode. Vector x is a vector of length 9 where matrix A is:

             *                                             *
             | 1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
             | 1.0  4.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0 |
             | 1.0  2.0  4.0  0.0  0.0  0.0  0.0  0.0  0.0 |
             | 0.0  3.0  1.0  4.0  0.0  0.0  0.0  0.0  0.0 |
             | 0.0  0.0  1.0  2.0  3.0  0.0  0.0  0.0  0.0 |
             | 0.0  0.0  0.0  2.0  1.0  3.0  0.0  0.0  0.0 |
             | 0.0  0.0  0.0  0.0  1.0  2.0  3.0  0.0  0.0 |
             | 0.0  0.0  0.0  0.0  0.0  2.0  1.0  2.0  0.0 |
             | 0.0  0.0  0.0  0.0  0.0  0.0  0.0  2.0  1.0 |
             *                                             *

Call Statement and Input

            UPLO TRANS  DIAG  N   K   A  LDA  X  INCX
             |     |     |    |   |   |   |   |   |
CALL STBSV( 'L' , 'T' , 'N' , 9 , 2 , A , 3 , X , 1  )

        *                                             *
        | 1.0  4.0  4.0  4.0  3.0  3.0  3.0  2.0  1.0 |
A    =  | 1.0  2.0  1.0  2.0  1.0  2.0  1.0  2.0   .  |
        | 1.0  3.0  1.0  2.0  1.0  2.0  0.0   .    .  |
        *                                             *

X        =(same as input X in Example 1)

Output

X        =(same as output X in Example 1)

Example 3

This example shows the solution x <-- A^-Tx, where k > n. Matrix A is a real 4 by 4 upper triangular band matrix with an upper band width of 3, even though k is specified as 5. It is not unit triangular and is stored in upper-triangular-band-packed storage mode. Vector x is a vector of length 4 where matrix A is:

                          *                    *
                          | 1.0  2.0  3.0  2.0 |
                          | 0.0  2.0  2.0  5.0 |
                          | 0.0  0.0  3.0  3.0 |
                          | 0.0  0.0  0.0  1.0 |
                          *                    *

Call Statement and Input

            UPLO TRANS  DIAG  N   K   A  LDA  X  INCX
             |     |     |    |   |   |   |   |   |
CALL STBSV( 'U' , 'T' , 'N' , 4 , 5 , A , 6 , X , 1  )

        *                    *
        |  .    .    .    .  |
        |  .    .    .    .  |
A    =  |  .    .    .   2.0 |
        |  .    .   3.0  5.0 |
        |  .   2.0  2.0  3.0 |
        | 1.0  2.0  3.0  1.0 |
        *                    *
 
X        =  (5.0, 18.0, 32.0, 41.0)

Output

X        =  (5.0, 4.0, 3.0, 2.0)

Example 4

This example shows the solution x <-- A^-Tx. Matrix A is a complex 7 by 7 lower triangular band matrix with a lower band width of 3 that is not unit triangular, stored in lower-triangular-band-packed storage mode. Vector x is a vector of length 7. Matrix A is:

      *                                                                              *
      | (1.0, 0.0) (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) |
      | (1.0, 2.0) (2.0, 1.0) (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) |
      | (1.0, 3.0) (2.0, 2.0) (3.0, 1.0) (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) |
      | (1.0, 4.0) (2.0, 3.0) (3.0, 3.0) (4.0, 1.0) (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) |
      | (0.0, 0.0) (2.0, 4.0) (3.0, 3.0) (4.0, 2.0) (2.0, 1.0) (0.0, 0.0) (0.0, 0.0) |
      | (0.0, 0.0) (0.0, 0.0) (3.0, 3.0) (4.0, 3.0) (5.0, 1.0) (3.0, 1.0) (0.0, 0.0) |
      | (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) (4.0, 4.0) (5.0, 2.0) (6.0, 1.0) (2.0, 1.0) |
      *                                                                              *

Call Statement and Input

            UPLO TRANS  DIAG  N   K   A  LDA  X  INCX
             |     |     |    |   |   |   |   |   |
CALL CTBSV( 'L' , 'T' , 'N' , 7 , 3 , A , 4 , X , 1  )

        *                                                                              *
        | (1.0, 0.0) (2.0, 1.0) (3.0, 1.0) (4.0, 1.0) (2.0, 1.0) (3.0, 1.0) (2.0, 1.0) |
A    =  | (1.0, 2.0) (2.0, 2.0) (3.0, 3.0) (4.0, 2.0) (5.0, 1.0) (6.0, 1.0)     .      |
        | (1.0, 3.0) (2.0, 3.0) (3.0, 3.0) (4.0, 3.0) (5.0, 2.0)     .          .      |
        | (1.0, 4.0) (2.0, 4.0) (3.0, 3.0) (4.0, 4.0)     .          .          .      |
        *                                                                              *

X        =  ((2.0, 2.0), (7.0, 1.0), (1.0, 1.0), (8.0, 1.0),
             (2.0, 0.0), (8.0, 1.0), (1.0, 2.0))

Output

X        =  ((-12.048, -13.136), (6.304, -1.472), (-1.880, 1.040),
             (2.600, -1.800), (-2.160, 1.880), (0.800, -1.400),
             (0.800, 0.600))

Sparse Linear Algebraic Equation Subroutines

This section contains the sparse linear algebraic equation subroutine descriptions.

DGSF--General Sparse Matrix Factorization Using Storage by Indices, Rows, or Columns

This subroutine factors sparse matrix A by Gaussian elimination, using a modified Markowitz count with threshold pivoting. The sparse matrix can be stored by indices, rows, or columns. To solve the system of equations, follow the call to this subroutine with a call to DGSS.

Syntax

Fortran	CALL DGSF (`iopt`, `n`, `nz`, `a`, `ia`, `ja`, `lna`, `iparm`, `rparm`, `oparm`, `aux`, `naux`)
C and C++	dgsf (`iopt`, `n`, `nz`, `a`, `ia`, `ja`, `lna`, `iparm`, `rparm`, `oparm`, `aux`, `naux`);
PL/I	CALL DGSF (`iopt`, `n`, `nz`, `a`, `ia`, `ja`, `lna`, `iparm`, `rparm`, `oparm`, `aux`, `naux`);

On Entry

iopt

indicates the storage technique used for sparse matrix A, where:

If iopt = 0, it is stored by indices.

If iopt = 1, it is stored by rows.

If iopt = 2, it is stored by columns.

Specified as: a fullword integer; iopt = 0, 1, or 2.

n

is the order n of sparse matrix A. Specified as: a fullword integer; n >= 0.

nz

is the number of elements in sparse matrix A, stored in an array, referred to as A. Specified as: a fullword integer; nz > 0.

a

is the sparse matrix A, to be factored, stored in an array, referred to as A. Specified as: an array of length lna, containing long-precision real numbers.

ia

is the array, referred to as IA, where:

If iopt = 0, it contains the row numbers that correspond to the elements in array A.

If iopt = 1, it contains the row pointers.

If iopt = 2, it contains the row numbers that correspond to the elements in array A.

Specified as: an array of length lna, containing fullword integers; IA(i) >= 1. See "Sparse Matrix" for more information on storage techniques.

ja

is the array, referred to as JA, where:

If iopt = 0, it contains the column numbers that correspond to the elements in array A.

If iopt = 1, it contains the column numbers that correspond to the elements in array A.

If iopt = 2, it contains the column pointers.

Specified as: an array of length lna, containing fullword integers; JA(i) >= 1. See "Sparse Matrix" for more information on storage techniques.

lna

is the length of the arrays specified for a, ia, and ja. Specified as: a fullword integer; lna > 2nz. If you do not specify a sufficient amount, it results in an error. See "Error Conditions".

The size of lna depends on the structure of the input matrix. The requirement that lna > 2nz does not guarantee a successful run of the program. If the input matrix is expected to have many fill-ins, lna should be set larger. Larger lna may result in a performance improvement.

For details on how lna relates to storage compressions, see "Performance and Accuracy Considerations".

iparm

is an array of parameters, IPARM(i), where:

IPARM(1) determines whether the default values for iparm and rparm are used by this subroutine.
If IPARM(1) = 0, the following default values are used:

IPARM(2) = 10
IPARM(3) = 1
IPARM(4) = 0
RPARM(1) = 10^-12
RPARM(2) = 0.1

If IPARM(1) = 1, the default values are not used.
IPARM(2) determines the number of minimal Markowitz counts that are examined to determine a pivot. (See reference [95].)
IPARM(3) determines whether this subroutine checks the input values in arrays IA and JA, where:
If IPARM(3) = 0, this subroutine checks the values in arrays IA and JA.
If IPARM(3) = 1, this subroutine assumes that the input values are correct in arrays IA and JA.
IPARM(4) determines whether this subroutine computes the smallest pivot element and the largest element in U, where:
If IPARM(4) = 0, this computation is not performed.
If IPARM(4) = 1, this subroutine computes:

The absolute value of the smallest pivot element.
The absolute value of the largest element in U.

These values are stored in OPARM(2) and OPARM(3), respectively.
IPARM(5) is reserved.

Specified as: a one-dimensional array of (at least) length 5, containing fullword integers, where the iparm values must be:

IPARM(1) = 0 or 1

IPARM(2) >= 1

IPARM(3) = 0 or 1

IPARM(4) = 0 or 1

rparm

is an array of parameters, RPARM(i), where:

RPARM(1) contains the lower bound of the absolute value of all elements in the matrix. If a pivot element is less than this number, the matrix is reported as singular. Any computed element whose absolute value is less than this number is set to 0.
RPARM(2) is the threshold pivot tolerance used to control the choice of pivots.
RPARM(3) is reserved.
RPARM(4) is reserved.
RPARM(5) is reserved.

Specified as: a one-dimensional array of (at least) length 5, containing long-precision real numbers, where the rparm values must be:

RPARM(1) >= 0.0

0.0 <= RPARM(2) <= 1.0

For additional information about rparm, see "Performance and Accuracy Considerations".

oparm

See 'On Return'.

aux

is the storage work area used by this subroutine. Its size is specified by naux. Specified as: an area of storage, containing long-precision real numbers.

naux

is the size of the work area specified by aux--that is, the number of elements in aux. Specified as: a fullword integer; naux >= 10n+100.

On Return

a

is the transformed array, referred to as A, containing the factored matrix A, required as input to DGSS. Returned as: a one-dimensional array of length lna, containing long-precision real numbers.

ia

is the transformed array, referred to as IA, required as input to DGSS. Returned as: a one-dimensional array of length lna, containing fullword integers.

ja

is the transformed array, referred to as JA, required as input to DGSS. Returned as: a one-dimensional array of length lna, containing fullword integers.

oparm

is an array of parameters, OPARM(i), where:

OPARM(1) is the amount of fill-ins for the sparse processing portion of the algorithm.
OPARM(2) contains the absolute value of the smallest pivot element of the matrix. This value is computed and set only if IPARM(4) = 1.
OPARM(3) contains the absolute value of the largest element encountered in U after the factorization. This value is computed and set only if IPARM(4) = 1.
OPARM(4) is reserved.
OPARM(5) is reserved.

Returned as: a one-dimensional array of length 5, containing long-precision real numbers.

aux

is the storage work area used by this subroutine. It contains the information required as input for DGSS. Returned as: an area of storage, containing long-precision real numbers.

Notes

For a description of the three storage techniques used by this subroutine for sparse matrices, see "Sparse Matrix".
You have the option of having the minimum required value for naux dynamically returned to your program. For details, see "Using Auxiliary Storage in ESSL".
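
To make the three storage techniques concrete, the following Python sketch builds the A, IA, and JA arrays for each of them from a dense matrix, using 1-based row and column numbers as in Fortran. It is a sketch under assumptions: the helper names are hypothetical, the pointer arrays are assumed to hold n+1 entries with the last one equal to nz+1, the element order chosen for storage by indices is just one possibility, and the arrays are not padded out to length lna.

```python
def store_by_indices(dense):
    """Return (a, ia, ja): values with their row and column numbers."""
    a, ia, ja = [], [], []
    for i, row in enumerate(dense, start=1):
        for j, v in enumerate(row, start=1):
            if v != 0.0:
                a.append(v)
                ia.append(i)
                ja.append(j)
    return a, ia, ja


def store_by_rows(dense):
    """Return (a, ia, ja): ia holds row pointers, ja column numbers."""
    a, ia, ja = [], [], []
    for row in dense:
        ia.append(len(a) + 1)          # position of first element of row
        for j, v in enumerate(row, start=1):
            if v != 0.0:
                a.append(v)
                ja.append(j)
    ia.append(len(a) + 1)              # one past the last element
    return a, ia, ja


def store_by_columns(dense):
    """Return (a, ia, ja): ia holds row numbers, ja column pointers."""
    n = len(dense)
    a, ia, ja = [], [], []
    for j in range(n):
        ja.append(len(a) + 1)          # position of first element of column
        for i in range(n):
            if dense[i][j] != 0.0:
                a.append(dense[i][j])
                ia.append(i + 1)
    ja.append(len(a) + 1)
    return a, ia, ja


# The 5 by 5 matrix used in the example for this subroutine.
dense = [[2.0, 0.0, 4.0, 0.0, 0.0],
         [1.0, 1.0, 0.0, 0.0, 3.0],
         [0.0, 0.0, 3.0, 4.0, 0.0],
         [2.0, 2.0, 0.0, 1.0, 5.0],
         [0.0, 0.0, 1.0, 1.0, 0.0]]
a_idx, ia_idx, ja_idx = store_by_indices(dense)
a_row, ia_row, ja_row = store_by_rows(dense)
a_col, ia_col, ja_col = store_by_columns(dense)
```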

Function

The matrix A is factored by Gaussian elimination, using a modified Markowitz count with threshold pivoting to compute the sparse LU factorization of A:

LU = PAQ

where:

A is a general sparse matrix of order n, stored by indices, columns, or rows in arrays A, IA, and JA.

L is a unit lower triangular matrix.

U is an upper triangular matrix.

P is a permutation matrix.

Q is a permutation matrix.

To solve the system of equations, follow the call to this subroutine with a call to DGSS. If n is 0, no computation is performed. See references [10], [47], and [87].
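
The idea behind Markowitz pivoting with a threshold can be sketched as follows. In the textbook form, a candidate pivot a(i,j) has cost (r(i)-1)(c(j)-1), where r(i) and c(j) are the numbers of nonzeros in its row and column, and a candidate is acceptable only if its magnitude is at least u times the largest magnitude in its column. The Python sketch below selects the first pivot this way on a dense copy of the matrix. It is not the DGSF algorithm: the modification to the count that DGSF uses, its tie-breaking, and whether its threshold test is taken over rows or columns are not described here, so the column-wise test, the tie-break on magnitude, and the function name are all assumptions.

```python
def markowitz_pivot(dense, u):
    """Pick a pivot by minimum Markowitz cost subject to a threshold.

    Returns (cost, i, j) with 1-based i and j, or None if no element
    passes the threshold test. Ties on cost go to the larger magnitude.
    """
    n = len(dense)
    row_count = [sum(1 for v in row if v != 0.0) for row in dense]
    col_count = [sum(1 for i in range(n) if dense[i][j] != 0.0)
                 for j in range(n)]
    col_max = [max(abs(dense[i][j]) for i in range(n)) for j in range(n)]
    best = None
    for i in range(n):
        for j in range(n):
            v = dense[i][j]
            if v == 0.0 or abs(v) < u * col_max[j]:
                continue               # zero, or fails the threshold test
            cost = (row_count[i] - 1) * (col_count[j] - 1)
            key = (cost, -abs(v))
            if best is None or key < best[0]:
                best = (key, i + 1, j + 1)
    if best is None:
        return None
    return best[0][0], best[1], best[2]


# The 5 by 5 example matrix, with the default threshold RPARM(2) = 0.1.
dense = [[2.0, 0.0, 4.0, 0.0, 0.0],
         [1.0, 1.0, 0.0, 0.0, 3.0],
         [0.0, 0.0, 3.0, 4.0, 0.0],
         [2.0, 2.0, 0.0, 1.0, 5.0],
         [0.0, 0.0, 1.0, 1.0, 0.0]]
choice = markowitz_pivot(dense, 0.1)
```

A low cost limits the fill-in a pivot can create, while the threshold keeps very small elements from being chosen as pivots.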

Error Conditions

Computational Errors

If this subroutine has to perform storage compressions, an attention message is issued. When this occurs, the performance of this subroutine is affected. The performance can be improved by increasing the value specified for lna.
The following errors with their corresponding return codes can occur in this subroutine. Where a value of i is indicated, it can be determined at run time by use of the ESSL error-handling facilities. To obtain this information, you must use ERRSET to change the number of allowable errors for that particular error code in the ESSL error option table; otherwise, the default value causes your program to terminate when the error occurs. For details, see "What Can You Do about ESSL Computational Errors?".
- For error 2117, return code 2 indicates that the pivot element in a column, i, is smaller than the value specified in RPARM(1).
- For error 2118, return code 3 indicates that the pivot element in a row, i, is smaller than the value specified in RPARM(1).
- For error 2120, return code 4 indicates that a row, i, is found empty on factorization. The matrix is singular.
- For error 2121, return code 5 indicates that a column is found empty on factorization. The matrix is singular.
- For error 2119, return code 6 indicates that the storage space indicated by lna is insufficient.
- For error 2122, return code 7 indicates that no pivot element was found in the active submatrix.
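
If your program checks the return code after a recoverable error, the list above can be captured in a lookup table. The following Python sketch is illustrative only: the names DGSF_RETURN_CODES and describe_dgsf_return_code are hypothetical and are not part of ESSL. The return code and error number pairs are transcribed from the list above.

```python
# Hypothetical lookup table (not part of ESSL): DGSF return code ->
# (ESSL error number, meaning), transcribed from the list above.
DGSF_RETURN_CODES = {
    2: (2117, "pivot element in column i is smaller than RPARM(1)"),
    3: (2118, "pivot element in row i is smaller than RPARM(1)"),
    4: (2120, "row i was found empty on factorization; matrix is singular"),
    5: (2121, "a column was found empty on factorization; matrix is singular"),
    6: (2119, "storage space indicated by lna is insufficient"),
    7: (2122, "no pivot element was found in the active submatrix"),
}


def describe_dgsf_return_code(rc):
    """Return a readable message for a DGSF computational-error return code."""
    if rc not in DGSF_RETURN_CODES:
        return f"return code {rc}: not a DGSF computational error listed here"
    error_number, meaning = DGSF_RETURN_CODES[rc]
    return f"return code {rc} (error {error_number}): {meaning}"
```

Return code 1 is deliberately absent from the table, because it is associated with the naux input-argument error (2015) rather than with a computational error.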

Input-Argument Errors

iopt <> 0, 1, or 2
n < 0
nz <= 0
lna <= 2nz
IPARM(1) <> 0 or 1
IPARM(2) <= 0
IPARM(3) <> 0 or 1
IPARM(4) <> 0 or 1
RPARM(1) < 0.0
RPARM(2) < 0.0 or RPARM(2) > 1.0
iopt = 1 and ia(i) >= ia(i+1), i = 1, n
iopt = 2 and ja(i) >= ja(i+1), i = 1, n
iopt = 0 or 1 and ja(i) < 1 or ja(i) > n, i = 1, nz
iopt = 0 or 1 and ia(i) < 1 or ia(i) > n, i = 1, nz
There are duplicate indices in a row or column of the input matrix.
The matrix is singular if a row or column of the input matrix is empty.
naux is too small--that is, less than the minimum required value. Return code 1 is returned if error 2015 is recoverable.

Example

This example factors a 5 by 5 sparse matrix A, which is stored in arrays A, IA, and JA. All three storage techniques (by indices, by rows, and by columns) are shown in this example, and the output is the same regardless of the storage technique used. The matrix is factored using Gaussian elimination with threshold pivoting. Matrix A is:

                       *                         *
                       | 2.0  0.0  4.0  0.0  0.0 |
                       | 1.0  1.0  0.0  0.0  3.0 |
                       | 0.0  0.0  3.0  4.0  0.0 |
                       | 2.0  2.0  0.0  1.0  5.0 |
                       | 0.0  0.0  1.0  1.0  0.0 |
                       *                         *

Note:

In this example, only nonzero elements are used as input to the matrix.

Call Statement and Input (Storage-By-Indices)

          IOPT  N  NZ  A  IA  JA  LNA  IPARM  RPARM  OPARM  AUX  NAUX
           |    |   |  |   |   |   |     |      |      |     |    |
CALL DGSF( 0  , 5, 13, A, IA, JA, 27 , IPARM, RPARM, OPARM, AUX, 150 )

A        =  (2.0, 1.0, 1.0, 3.0, 4.0, 1.0, 5.0, 2.0, 2.0, 1.0, 1.0,
             4.0, 3.0, . , . , . , . , . , . , . , . , . , . , . , . ,
             . , . )
IA       =  (1, 2, 2, 3, 3, 4, 4, 4, 4, 5, 5, 1, 2, . , . , . , . ,
             . , . , . , . , . , . , . , . , . , . )
JA       =  (1, 1, 2, 3, 4, 4, 5, 1, 2, 3, 4, 3, 5, . , . , . , . ,
             . , . , . , . , . , . , . , . , . , . )
IPARM    =  (1, 3, 1, 1)
RPARM    =  (1.D-12, 0.1D0)

Call Statement and Input (Storage-By-Rows)

          IOPT  N  NZ  A  IA  JA  LNA  IPARM  RPARM  OPARM  AUX  NAUX
           |    |   |  |   |   |   |     |      |      |     |    |
CALL DGSF( 1  , 5, 13, A, IA, JA, 27 , IPARM, RPARM, OPARM, AUX, 150 )

A        =  (2.0, 4.0, 1.0, 1.0, 3.0, 3.0, 4.0, 2.0, 2.0, 1.0, 5.0,
             1.0, 1.0, . , . , . , . , . , . , . , . , . , . , . , . ,
             . , . )
IA       =  (1, 3, 6, 8, 12, 14, . , . , . , . , . , . , . , . , . ,
             . , . , . , . , . )
JA       =  (1, 3, 1, 2, 5, 3, 4, 1, 2, 4, 5, 3, 4, . , . , . , . ,
             . , . , . , . , . , . , . , . , . , . )
IPARM    =  (1, 3, 1, 1)
RPARM    =  (1.D-12, 0.1D0)

Call Statement and Input (Storage-By-Columns)

          IOPT  N  NZ  A  IA  JA  LNA  IPARM  RPARM  OPARM  AUX  NAUX
           |    |   |  |   |   |   |     |      |      |     |    |
CALL DGSF( 2  , 5, 13, A, IA, JA, 27 , IPARM, RPARM, OPARM, AUX, 150 )

A        =  (2.0, 1.0, 2.0, 1.0, 2.0, 4.0, 3.0, 1.0, 4.0, 1.0, 1.0,
             3.0, 5.0, . , . , . , . , . , . , . , . , . , . , . , . ,
             . , . )
IA       =  (1, 2, 4, 2, 4, 1, 3, 5, 3, 4, 5, 2, 4, . , . , . , . ,
             . , . , . , . , . , . , . , . , . , . )
JA       =  (1, 4, 6, 9, 12, 14, . , . , . , . , . , . , . , . , . ,
             . , . , . , . , . )
IPARM    =  (1, 3, 0, 1)
RPARM    =  (1.D-12, 0.1D0)
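
The three input layouts above can be cross-checked against the dense form of matrix A. The following Python sketch is illustrative only: the helper names by_rows, by_columns, and from_indices are hypothetical and are not part of ESSL. It assumes 1-based indices and pointers, as in the arrays shown above, and it covers only the first nz elements of each array.

```python
# Dense form of the 5 by 5 example matrix A.
DENSE = [
    [2.0, 0.0, 4.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0, 3.0],
    [0.0, 0.0, 3.0, 4.0, 0.0],
    [2.0, 2.0, 0.0, 1.0, 5.0],
    [0.0, 0.0, 1.0, 1.0, 0.0],
]


def by_rows(dense):
    """Storage-by-rows: A holds values row by row, IA holds row pointers,
    and JA holds column indices (all 1-based)."""
    a, ia, ja = [], [1], []
    for row in dense:
        for j, value in enumerate(row, start=1):
            if value != 0.0:
                a.append(value)
                ja.append(j)
        ia.append(len(a) + 1)
    return a, ia, ja


def by_columns(dense):
    """Storage-by-columns: A holds values column by column, IA holds row
    indices, and JA holds column pointers (all 1-based). Assumes a square matrix."""
    n = len(dense)
    a, ia, ja = [], [], [1]
    for j in range(n):
        for i in range(n):
            value = dense[i][j]
            if value != 0.0:
                a.append(value)
                ia.append(i + 1)
        ja.append(len(a) + 1)
    return a, ia, ja


def from_indices(n, a, ia, ja):
    """Rebuild the dense matrix from storage-by-indices triples, in any order."""
    dense = [[0.0] * n for _ in range(n)]
    for value, i, j in zip(a, ia, ja):
        dense[i - 1][j - 1] = value
    return dense
```

Applying by_rows and by_columns to DENSE yields the first 13 elements of A and JA (or IA) and the six pointers shown in the storage-by-rows and storage-by-columns inputs, and from_indices rebuilds DENSE from the storage-by-indices input.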

Output

A        =  (0.5, . , 0.3, 1.0, . , 1.0, . , 3.0, . , . , . , 1.0,
             1.0, . , . , . , . , . , . , . , -1.7, -0.5, -1.0, -1.0,
             4.0, -3.0, -4.0)
IA       =  (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, . , . , . , . ,
             . , . , . , 2, 1, 1, 3, 3, 5, 5)
JA       =  (1, 0, 5, 2, 0, 4, 0, 2, 0, 0, 0, 3, 4, . , . , . , . ,
             . , . , . , 4, 2, 4, 4, 1, 3, 1)
OPARM    =  (1.000000, 0.333333, 3.000000)

DGSS--General Sparse Matrix or Its Transpose Solve Using Storage by Indices, Rows, or Columns

This subroutine solves either of the following systems:

Ax = b

A^Tx = b

where A is a sparse matrix, A^T is the transpose of sparse matrix A, and x and b are vectors. DGSS uses the results of the factorization of matrix A, produced by a preceding call to DGSF.
Note: The input to this solve subroutine must be the output from the factorization subroutine, DGSF.

Syntax

Fortran	CALL DGSS (`jopt`, `n`, `a`, `ia`, `ja`, `lna`, `bx`, `aux`, `naux`)
C and C++	dgss (`jopt`, `n`, `a`, `ia`, `ja`, `lna`, `bx`, `aux`, `naux`);
PL/I	CALL DGSS (`jopt`, `n`, `a`, `ia`, `ja`, `lna`, `bx`, `aux`, `naux`);

On Entry

jopt

indicates the type of computation to be performed, where:

If jopt = 0, Ax = b is solved, where the right-hand side is not sparse.

If jopt = 1, A^Tx = b is solved, where the right-hand side is not sparse.

If jopt = 10, Ax = b is solved, where the right-hand side is sparse.

If jopt = 11, A^Tx = b is solved, where the right-hand side is sparse.

Specified as: a fullword integer; jopt = 0, 1, 10, or 11.
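
The four jopt values encode two independent choices: which system is solved, and whether the right-hand side is sparse. The following Python sketch shows one way to read them. The function name decode_jopt is hypothetical and is not part of ESSL.

```python
def decode_jopt(jopt):
    """Split a DGSS jopt value into (solve_transpose, sparse_rhs)."""
    if jopt not in (0, 1, 10, 11):
        raise ValueError("jopt must be 0, 1, 10, or 11")
    solve_transpose = jopt % 10 == 1  # 1 and 11 solve A^Tx = b
    sparse_rhs = jopt >= 10           # 10 and 11 mark a sparse right-hand side
    return solve_transpose, sparse_rhs
```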

n

is the order n of sparse matrix A. Specified as: a fullword integer; n >= 0.

a

is the factorization of sparse matrix A, stored in array A, produced by a preceding call to DGSF. Specified as: an array of length lna, containing long-precision real numbers.

ia

is the array, referred to as IA, produced by a preceding call to DGSF. Specified as: an array of length lna, containing fullword integers.

ja

is the array, referred to as JA, produced by a preceding call to DGSF. Specified as: an array of length lna, containing fullword integers.

lna

is the length of the arrays A, IA, and JA. In DGSS, lna must be identical to the value specified in DGSF; otherwise, results are unpredictable. Specified as: a fullword integer; lna > 0.

bx

is the vector b of length n, containing the right-hand side of the system. Specified as: a one-dimensional array of (at least) length n, containing long-precision real numbers.

aux

is the storage work area passed to this subroutine by a preceding call to DGSF. Its size is specified by naux. Specified as: an area of storage, containing long-precision real numbers.

naux

is the size of the work area specified by aux--that is, the number of elements in aux. Specified as: a fullword integer; naux >= 10n+100.

On Return

ia

is the transformed array, referred to as IA, which can be used as input in subsequent calls to this subroutine. This may result in a performance increase. Returned as: an array of length lna, containing fullword integers.

bx

is the solution vector x of length n, containing the results of the computation. Returned as: a one-dimensional array, containing long-precision real numbers.

Notes

The input arguments n, lna, and naux must be the same as those specified for DGSF, and the input arguments a, ia, ja, and aux must be those produced on output by DGSF. Otherwise, results are unpredictable.
You have the option of having the minimum required value for naux dynamically returned to your program. For details, see "Using Auxiliary Storage in ESSL".

Function

The system Ax = b is solved for x, where A is a sparse matrix and x and b are vectors. Depending on the value specified for the jopt argument, DGSS can also solve the system A^Tx = b, where A^T is the transpose of sparse matrix A.

If the value specified for the jopt argument is 0 or 10, the following equation is solved:

Ax = b

If the value specified for the jopt argument is 1 or 11, the following equation is solved:

A^Tx = b

DGSS uses the results of the factorization of matrix A, produced by a preceding call to DGSF. The transformed matrix A consists of the upper triangular matrix U and the lower triangular matrix L.

See references [10], [47], and [87].

Error Conditions

Computational Errors

None

Input-Argument Errors

jopt <> 0, 1, 10, or 11
n < 0
lna <= 0
naux is too small--that is, less than the minimum required value. Return code 1 is returned if error 2015 is recoverable.

Example 1

This example shows how to solve the system Ax = b, where matrix A is a 5 by 5 sparse matrix. The right-hand side is not sparse.
Note: The input for this subroutine is the same as the output from DGSF, except for BX.
Matrix A is:

                *                         *
                | 2.0  0.0  4.0  0.0  0.0 |
                | 1.0  1.0  0.0  0.0  3.0 |
                | 0.0  0.0  3.0  4.0  0.0 |
                | 2.0  2.0  0.0  1.0  5.0 |
                | 0.0  0.0  1.0  1.0  0.0 |
                *                         *

Call Statement and Input

          JOPT  N   A   IA   JA   LNA   BX   AUX   NAUX
           |    |   |    |    |    |     |    |     |
CALL DGSS( 0  , 5 , A , IA , JA , 27  , BX , AUX , 150 )

A        =  (0.5, . , 0.3, 1.0, . , 1.0, . , 3.0, . , . , . , 1.0,
             1.0, . , . , . , . , . , . , . , -1.7, -0.5, -1.0, -1.0,
             4.0, -3.0, -4.0)
IA       =  (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, . , . , . , . ,
             . , . , . , 2, 1, 1, 3, 3, 5, 5)
JA       =  (1, 0, 5, 2, 0, 4, 0, 2, 0, 0, 0, 3, 4, . , . , . , . ,
             . , . , . , 4, 2, 4, 4, 1, 3, 1)
BX       =  (1.0, 1.0, 1.0, 1.0, 1.0)

Output

IA       =  (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, . , . , . , . ,
             . , . , . , 2, 1, 1, 3, 3, 5, 5)
BX       =  (-5.500000, 9.500000, 3.000000, -2.000000, -1.000000)

Example 2

This example shows how to solve the system A^Tx = b, using the same matrix A used in Example 1. The input is also the same as in Example 1, except for the jopt argument. The right-hand side is not sparse.

Call Statement and Input

          JOPT  N   A   IA   JA   LNA   BX   AUX   NAUX
           |    |   |    |    |    |     |    |     |
CALL DGSS( 1  , 5 , A , IA , JA , 27  , BX , AUX , 150 )
 
BX       =  (1.0, 1.0, 1.0, 1.0, 1.0)

Output

IA       =  (0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, . , . , . , . ,
             . , . , . , 2, 1, 1, 3, 3, 5, 5)
BX       =  (0.000000, -3.000000, -2.000000, 2.000000, 7.000000)

Example 3

This example shows how to solve the system Ax = b, using the same matrix A as in Examples 1 and 2. The input is also the same as in Examples 1 and 2, except for the jopt and bx arguments. The right-hand side is sparse.

Call Statement and Input

          JOPT  N   A   IA   JA   LNA   BX   AUX   NAUX
           |    |   |    |    |    |     |    |     |
CALL DGSS( 10 , 5 , A , IA , JA , 27  , BX , AUX , 150 )
 
BX       =  (0.0, 0.0, 0.0, 1.0, 0.0)

Output

IA       =  (1, 4, 2, 5, 3, 5, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
             0, 2, 1, 1, 3, 3, 5, 5)
BX       =  (0.000000, 3.000000, 0.000000, 0.000000, -1.000000)

Example 4

This example shows how to solve the system A^Tx = b, using the same matrix A as in Examples 1, 2, and 3. The input is the same as in Example 3, except for the jopt argument. The right-hand side is sparse.

Call Statement and Input

          JOPT  N   A   IA   JA   LNA   BX   AUX   NAUX
           |    |   |    |    |    |     |    |     |
CALL DGSS( 11 , 5 , A , IA , JA , 27  , BX , AUX , 150 )
 
BX       =  (0.0, 0.0, 0.0, 1.0, 0.0)

Output

IA       =  (1, 4, 2, 5, 3, 5, 2, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
             0, 2, 1, 1, 3, 3, 5, 5)
BX       =  (0.000000, 0.000000, 1.000000, 0.000000, -3.000000)
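
The BX outputs of Examples 1 through 4 can be checked independently of ESSL by substituting each solution back into the system. The following Python sketch is illustrative only and uses the dense form of matrix A. The names EXAMPLES and residual_max are hypothetical.

```python
# Dense form of the 5 by 5 example matrix A.
A = [
    [2.0, 0.0, 4.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.0, 3.0],
    [0.0, 0.0, 3.0, 4.0, 0.0],
    [2.0, 2.0, 0.0, 1.0, 5.0],
    [0.0, 0.0, 1.0, 1.0, 0.0],
]

# (jopt, right-hand side b, solution x) taken from Examples 1 through 4.
EXAMPLES = [
    (0, [1.0, 1.0, 1.0, 1.0, 1.0], [-5.5, 9.5, 3.0, -2.0, -1.0]),
    (1, [1.0, 1.0, 1.0, 1.0, 1.0], [0.0, -3.0, -2.0, 2.0, 7.0]),
    (10, [0.0, 0.0, 0.0, 1.0, 0.0], [0.0, 3.0, 0.0, 0.0, -1.0]),
    (11, [0.0, 0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 1.0, 0.0, -3.0]),
]


def matvec(matrix, x):
    """Dense matrix-vector product."""
    return [sum(m * v for m, v in zip(row, x)) for row in matrix]


def transpose(matrix):
    """Dense transpose."""
    return [list(column) for column in zip(*matrix)]


def residual_max(jopt, b, x):
    """Largest |(Ax - b)_i|, or |(A^Tx - b)_i| when jopt is 1 or 11."""
    matrix = transpose(A) if jopt % 10 == 1 else A
    return max(abs(r - bi) for r, bi in zip(matvec(matrix, x), b))


for jopt, b, x in EXAMPLES:
    print(jopt, residual_max(jopt, b, x))
```

A residual at or near zero for each example indicates that the listed solution is consistent with the listed right-hand side.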

DGKFS--General Sparse Matrix or Its Transpose Factorization, Determinant, and Solve Using Skyline Storage Mode

This subroutine can perform either or both of the following functions for general sparse matrix A, stored in skyline storage mode, and for vectors x and b:

Factor A and, optionally, compute the determinant of A.
Solve the system Ax = b or A^Tx = b using the results of the factorization of matrix A, produced on this call or a preceding call to this subroutine.

You also have the choice of using profile-in or diagonal-out skyline storage mode for A on input or output.
Note: The input to the solve performed by this subroutine must be the output from the factorization performed by this subroutine.

Syntax

Fortran	CALL DGKFS (`n`, `au`, `nu`, `idu`, `al`, `nl`, `idl`, `iparm`, `rparm`, `aux`, `naux`, `bx`, `ldbx`, `mbx`)
C and C++	dgkfs (`n`, `au`, `nu`, `idu`, `al`, `nl`, `idl`, `iparm`, `rparm`, `aux`, `naux`, `bx`, `ldbx`, `mbx`);
PL/I	CALL DGKFS (`n`, `au`, `nu`, `idu`, `al`, `nl`, `idl`, `iparm`, `rparm`, `aux`, `naux`, `bx`, `ldbx`, `mbx`);

On Entry

n

is the order of general sparse matrix A. Specified as: a fullword integer; n >= 0.

au

is the array, referred to as AU, containing one of three forms of the upper triangular part of general sparse matrix A, depending on the type of computation performed, where:

If you are doing a factor and solve or a factor only, and if IPARM(3) = 0, then AU contains the unfactored upper triangle of general sparse matrix A.
If you are doing a factor only, and if IPARM(3) > 0, then AU contains the partially factored upper triangle of general sparse matrix A. The first IPARM(3) columns in the upper triangle of A are already factored. The remaining columns are factored in this computation.
If you are doing a solve only, then AU contains the factored upper triangle of general sparse matrix A, produced by a preceding call to this subroutine.

In each case:

If IPARM(4) = 0, diagonal-out skyline storage mode is used for A.

If IPARM(4) = 1, profile-in skyline storage mode is used for A.

Specified as: a one-dimensional array of (at least) length nu, containing long-precision real numbers.

nu

is the length of array AU. Specified as: a fullword integer; nu >= 0 and nu >= (IDU(n+1)-1).

idu

is the array, referred to as IDU, containing the relative positions of the diagonal elements of matrix A (in one of its three forms) in array AU. Specified as: a one-dimensional array of (at least) length n+1, containing fullword integers.

al

is the array, referred to as AL, containing one of three forms of the lower triangular part of general sparse matrix A, depending on the type of computation performed, where:

If you are doing a factor and solve or a factor only, and if IPARM(3) = 0, then AL contains the unfactored lower triangle of general sparse matrix A.
If you are doing a factor only, and if IPARM(3) > 0, then AL contains the partially factored lower triangle of general sparse matrix A. The first IPARM(3) rows in the lower triangle of A are already factored. The remaining rows are factored in this computation.
If you are doing a solve only, then AL contains the factored lower triangle of general sparse matrix A, produced by a preceding call to this subroutine.

Note: In all these cases, entries in AL for diagonal elements of A are not assumed to have meaningful values.

In each case:

If IPARM(4) = 0, diagonal-out skyline storage mode is used for A.

If IPARM(4) = 1, profile-in skyline storage mode is used for A.

Specified as: a one-dimensional array of (at least) length nl, containing long-precision real numbers.

nl

is the length of array AL. Specified as: a fullword integer; nl >= 0 and nl >= (IDL(n+1)-1).

idl

is the array, referred to as IDL, containing the relative positions of the diagonal elements of matrix A (in one of its three forms) in array AL. Specified as: a one-dimensional array of (at least) length n+1, containing fullword integers.

iparm

is an array of parameters, IPARM(i), where:

IPARM(1) indicates whether certain default values for iparm and rparm are used by this subroutine, where:
If IPARM(1) = 0, the following default values are used. For restrictions, see "Notes".

IPARM(2) = 0
IPARM(3) = 0
IPARM(4) = 0
IPARM(5) = 0
IPARM(10) = 0
IPARM(11) = -1
IPARM(12) = -1
IPARM(13) = -1
IPARM(14) = -1
IPARM(15) = 0
RPARM(10) = 10^-12

If IPARM(1) = 1, the default values are not used.
IPARM(2) indicates the type of computation performed by this subroutine. The following table gives the IPARM(2) values for each variation:

Type of Computation   Ax = b   Ax = b and         A^Tx = b   A^Tx = b and
                               Determinant(A)                Determinant(A)
Factor and Solve      0        10                 100        110
Factor Only           1        11                 N/A        N/A
Solve Only            2        N/A                102        N/A
IPARM(3) indicates whether a full or partial factorization is performed on matrix A, where:
If IPARM(3) = 0, and:

If you are doing a factor and solve or a factor only, then a full factorization is performed for matrix A on rows and columns 1 through n.
If you are doing a solve only, this argument has no effect on the computation, but must be set to 0.

If IPARM(3) > 0, and you are doing a factor only, then a partial factorization is performed on matrix A. Rows 1 through IPARM(3) of columns 1 through IPARM(3) in matrix A must be in factored form from a preceding call to this subroutine. The factorization is performed on rows IPARM(3)+1 through n and columns IPARM(3)+1 through n. For an illustration, see "Notes".
IPARM(4) indicates the input storage mode used for matrix A. This determines the arrangement of data in arrays AU, IDU, AL, and IDL on input, where:
If IPARM(4) = 0, diagonal-out skyline storage mode is used.
If IPARM(4) = 1, profile-in skyline storage mode is used.
IPARM(5) indicates the output storage mode used for matrix A. This determines the arrangement of data in arrays AU, IDU, AL, and IDL on output, where:
If IPARM(5) = 0, diagonal-out skyline storage mode is used.
If IPARM(5) = 1, profile-in skyline storage mode is used.
IPARM(6) through IPARM(9) are reserved.
IPARM(10) has the following meaning, where:
If you are doing a factor and solve or a factor only, then IPARM(10) indicates whether certain default values for iparm and rparm are used by this subroutine, where:

If IPARM(10) = 0, the following default values are used. For restrictions, see "Notes".

IPARM(11) = -1
IPARM(12) = -1
IPARM(13) = -1
IPARM(14) = -1
IPARM(15) = 0
RPARM(10) = 10^-12

If IPARM(10) = 1, the default values are not used.

If you are doing a solve only, this argument is not used.
IPARM(11) through IPARM(15) have the following meaning, where:
If you are doing a factor and solve or a factor only, then IPARM(11) through IPARM(15) control the type of processing to apply to pivot elements occurring in regions 1 through 5, respectively. The pivot elements are u_kk for k = 1, n when doing a full factorization, and for k = IPARM(3)+1, n when doing a partial factorization. The region in which a pivot element falls depends on the sign and magnitude of the pivot element. The regions are determined by RPARM(10). For a description of the regions and associated pivot values, see "Notes". For each region i for i = 1,5, where the pivot occurs in region i, the processing applied to the pivot element is determined by IPARM(10+i), where:

If IPARM(10+i) = -1, the pivot element is trapped and computational error 2126 is generated. See "Error Conditions".
If IPARM(10+i) = 0, for i = 1, 2, 4, and 5, processing continues normally.
Note: A value of 0 is not permitted for region 3, because if processing continues, a divide-by-zero exception occurs.

If IPARM(10+i) = 1, the pivot element is replaced with the value in RPARM(10+i), and processing continues normally.

If you are doing a solve only, these arguments are not used.
IPARM(16) through IPARM(25), see "On Return".

Specified as: a one-dimensional array of (at least) length 25, containing fullword integers, where:

IPARM(1) = 0 or 1

IPARM(2) = 0, 1, 2, 10, 11, 100, 102, or 110

If IPARM(2) = 0, 2, 10, 100, 102, or 110, then IPARM(3) = 0

If IPARM(2) = 1 or 11, then 0 <= IPARM(3) <= n

IPARM(4), IPARM(5) = 0 or 1

If IPARM(2) = 0, 1, 10, 11, 100, or 110, then:

IPARM(10) = 0 or 1

IPARM(11), IPARM(12) = -1, 0, or 1

IPARM(13) = -1 or 1

IPARM(14), IPARM(15) = -1, 0, or 1
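
The IPARM(2) table and the value constraints listed above can be expressed as a small pre-call check. The following Python sketch is illustrative only: IPARM2_MODES and check_iparm are hypothetical names, not part of ESSL, and the sketch checks only the constraints listed above. It does not model how the defaults selected by IPARM(1) = 0 or IPARM(10) = 0 are applied.

```python
# IPARM(2) value -> (factor, solve, determinant, transpose), per the table above.
IPARM2_MODES = {
    0: (True, True, False, False),
    10: (True, True, True, False),
    100: (True, True, False, True),
    110: (True, True, True, True),
    1: (True, False, False, False),
    11: (True, False, True, False),
    2: (False, True, False, False),
    102: (False, True, False, True),
}


def check_iparm(iparm, n):
    """Return the listed constraints that iparm violates.

    iparm is a Python list whose element 0 holds IPARM(1).
    """
    def ip(k):
        return iparm[k - 1]

    problems = []
    if ip(1) not in (0, 1):
        problems.append("IPARM(1) must be 0 or 1")
    if ip(2) not in IPARM2_MODES:
        problems.append("IPARM(2) must be 0, 1, 2, 10, 11, 100, 102, or 110")
        return problems  # the remaining checks depend on a valid IPARM(2)
    factor = IPARM2_MODES[ip(2)][0]
    if ip(2) in (1, 11):
        if not 0 <= ip(3) <= n:
            problems.append("IPARM(3) must satisfy 0 <= IPARM(3) <= n")
    elif ip(3) != 0:
        problems.append("IPARM(3) must be 0 unless IPARM(2) = 1 or 11")
    for k in (4, 5):
        if ip(k) not in (0, 1):
            problems.append(f"IPARM({k}) must be 0 or 1")
    if factor:
        if ip(10) not in (0, 1):
            problems.append("IPARM(10) must be 0 or 1")
        for k in (11, 12, 14, 15):
            if ip(k) not in (-1, 0, 1):
                problems.append(f"IPARM({k}) must be -1, 0, or 1")
        if ip(13) not in (-1, 1):
            problems.append("IPARM(13) must be -1 or 1")
    return problems
```

An empty list means that none of the listed constraints is violated. It does not guarantee that the call succeeds.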

rparm

is an array of parameters, RPARM(i), where:

RPARM(1) through RPARM(9) are reserved.
RPARM(10) has the following meaning, where:
If you are doing a factor and solve or a factor only, RPARM(10) is the tolerance value for small pivots. This sets the bounds for the pivot regions, where pivots are processed according to the options you specify for the five regions in IPARM(11) through IPARM(15), respectively. The suggested value is 10^-15 <= RPARM(10) <= 1.
If you are doing a solve only, this argument is not used.
RPARM(11) through RPARM(15) have the following meaning, where:
If you are doing a factor and solve or a factor only, RPARM(11) through RPARM(15) are the fix-up values to use for the pivots in regions 1 through 5, respectively. For each RPARM(10+i) for i = 1,5, where the pivot occurs in region i:

If IPARM(10+i) = 1, the pivot is replaced with RPARM(10+i), where |RPARM(10+i)| should be a sufficiently large nonzero value to avoid overflow when calculating the reciprocal of the pivot. The suggested value is 10^-15 <= |RPARM(10+i)| <= 1.
If IPARM(10+i) <> 1, RPARM(10+i) is not used.

If you are doing a solve only, these arguments are not used.
RPARM(16) through RPARM(25), see "On Return".

Specified as: a one-dimensional array of (at least) length 25, containing long-precision real numbers, where if IPARM(2) = 0, 1, 10, 11, 100, or 110, then:

RPARM(10) >= 0.0

RPARM(11) through RPARM(15) <> 0.0

aux

has the following meaning:

If naux = 0 and error 2015 is unrecoverable, aux is ignored.

Otherwise, it is the storage work area used by this subroutine. Its size is specified by naux.

Specified as: an area of storage, containing long-precision real numbers.

naux

is the size of the work area specified by aux--that is, the number of elements in aux. Specified as: a fullword integer, where:

If naux = 0 and error 2015 is unrecoverable, DGKFS dynamically allocates the work area used by this subroutine. The work area is deallocated before control is returned to the calling program.

Otherwise,

If you are doing a factor only, use naux >= 5n.

If you are doing a factor and solve or a solve only, use naux >= 5n+4mbx.
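
When you supply your own work area, the two lower bounds above depend only on n, mbx, and the type of computation selected by IPARM(2). The following Python sketch computes the bound. The function name min_naux_dgkfs is hypothetical and is not part of ESSL.

```python
def min_naux_dgkfs(iparm2, n, mbx):
    """Lower bound on naux for a caller-supplied DGKFS work area."""
    if iparm2 in (1, 11):                      # factor only
        return 5 * n
    if iparm2 in (0, 2, 10, 100, 102, 110):    # factor and solve, or solve only
        return 5 * n + 4 * mbx
    raise ValueError("IPARM(2) must be 0, 1, 2, 10, 11, 100, 102, or 110")
```

For example, with n = 9 and mbx = 3, a factor-only call needs at least 45 elements, and a factor and solve call needs at least 57.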

bx

has the following meaning, where:

If you are doing a factor and solve or a solve only, bx is the array, containing the mbx right-hand side vectors b of the system Ax = b or A^Tx = b. Each vector b is length n and is stored in the corresponding column of the array.

If you are doing a factor only, this argument is not used in the computation.

Specified as: an ldbx by (at least) mbx array, containing long-precision real numbers.

ldbx

has the following meaning, where:

If you are doing a factor and solve or a solve only, ldbx is the leading dimension of the array specified for bx.

If you are doing a factor only, this argument is not used in the computation.

Specified as: a fullword integer; ldbx >= n and:

If mbx <> 0, then ldbx > 0.

If mbx = 0, then ldbx >= 0.

mbx

has the following meaning, where:

If you are doing a factor and solve or a solve only, mbx is the number of right-hand side vectors, b, in the array specified for bx.

If you are doing a factor only, this argument is not used in the computation.

Specified as: a fullword integer; mbx >= 0.

On Return

au

is the array, referred to as AU, containing the upper triangular part of the LU factored form of general sparse matrix A, where:

If IPARM(5) = 0, diagonal-out skyline storage mode is used for A.

If IPARM(5) = 1, profile-in skyline storage mode is used for A.

(If mbx = 0 and you are doing a solve only, then au is unchanged on output.) Returned as: a one-dimensional array of (at least) length nu, containing long-precision real numbers.

idu

is the array, referred to as IDU, containing the relative positions of the diagonal elements of the factored output matrix A in array AU. (If mbx = 0 and you are doing a solve only, then idu is unchanged on output.) Returned as: a one-dimensional array of (at least) length n+1, containing fullword integers.

al

is the array, referred to as AL, containing the lower triangular part of the LU factored form of general sparse matrix A, where:

If IPARM(5) = 0, diagonal-out skyline storage mode is used for A.

If IPARM(5) = 1, profile-in skyline storage mode is used for A.
Note: You should assume that entries in AL for diagonal elements of A do not have meaningful values.

(If mbx = 0 and you are doing a solve only, then al is unchanged on output.) Returned as: a one-dimensional array of (at least) length nl, containing long-precision real numbers.

idl

is the array, referred to as IDL, containing the relative positions of the diagonal elements of the factored output matrix A in array AL. (If mbx = 0 and you are doing a solve only, then idl is unchanged on output.) Returned as: a one-dimensional array of (at least) length n+1, containing fullword integers.

iparm

is an array of parameters, IPARM(i), where:

IPARM(1) through IPARM(15) are unchanged.
IPARM(16) has the following meaning, where:
If you are doing a factor and solve or a factor only, and:

If IPARM(16) = -1, your factorization did not complete successfully, resulting in computational error 2126.
If IPARM(16) > 0, it is the row number k, in which the maximum absolute value of the ratio a_kk/u_kk occurred, where:

If IPARM(3) = 0, k can be any of the rows, 1 through n, in the full factorization.
If IPARM(3) > 0, k can be any of the rows, IPARM(3)+1 through n, in the partial factorization.

If you are doing a solve only, this argument is not used in the computation and is unchanged.
IPARM(17) through IPARM(20) are reserved.
IPARM(21) through IPARM(25) have the following meaning, where:
If you are doing a factor and solve or a factor only, IPARM(21) through IPARM(25) have the following meanings for each region i for i = 1,5, respectively:

If IPARM(20+i) = -1, your factorization did not complete successfully, resulting in computational error 2126.
If IPARM(20+i) >= 0, it is the number of pivots in region i for the columns that were factored in matrix A, where:

If IPARM(3) = 0, columns 1 through n were factored in the full factorization.
If IPARM(3) > 0, columns IPARM(3)+1 through n were factored in the partial factorization.

If you are doing a solve only, these arguments are not used in the computation and are unchanged.

Returned as: a one-dimensional array of (at least) length 25, containing fullword integers.

rparm

is an array of parameters, RPARM(i), where:

RPARM(1) through RPARM(15) are unchanged.
RPARM(16) has the following meaning, where:
If you are doing a factor and solve or a factor only, and:

If RPARM(16) = 0.0, your factorization did not complete successfully, resulting in computational error 2126.
If |RPARM(16)| > 0.0, it is the ratio for row k, a_kk/u_kk, having the maximum absolute value. Row k is indicated in IPARM(16), and:

If IPARM(3) = 0, the ratio corresponds to one of the rows, 1 through n, in the full factorization.
If IPARM(3) > 0, the ratio corresponds to one of the rows, IPARM(3)+1 through n, in the partial factorization.

If you are doing a solve only, this argument is not used in the computation and is unchanged.
RPARM(17) and RPARM(18) have the following meaning, where:
If you are computing the determinant of matrix A, then RPARM(17) is the mantissa, detbas, and RPARM(18) is the power of 10, detpwr, used to express the value of the determinant: detbas(10^detpwr), where 1 <= detbas < 10. Also:

If IPARM(3) = 0, the determinant is computed for columns 1 through n in the full factorization.
If IPARM(3) > 0, the determinant is computed for columns IPARM(3)+1 through n in the partial factorization.

If you are not computing the determinant of matrix A, these arguments are not used in the computation and are unchanged.
RPARM(19) through RPARM(25) are reserved.

Returned as: a one-dimensional array of (at least) length 25, containing long-precision real numbers.
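
The determinant is returned in two parts so that values outside the range of a long-precision number can still be represented. The following Python sketch shows how a caller might recombine RPARM(17) and RPARM(18). The function names are hypothetical and are not part of ESSL.

```python
import math


def determinant_from_rparm(detbas, detpwr):
    """Combine RPARM(17) and RPARM(18) into one floating-point value.

    Python raises OverflowError if 10.0 ** detpwr does not fit in a double.
    """
    return detbas * 10.0 ** detpwr


def log10_abs_determinant(detbas, detpwr):
    """Return log10 of |det(A)|, usable even when detpwr is very large."""
    return math.log10(abs(detbas)) + detpwr
```

Working with the logarithm avoids overflow and underflow when the determinant is compared across matrices of large order.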

bx

has the following meaning, where:

If you are doing a factor and solve or a solve only, bx is the array, containing the mbx solution vectors x of the system Ax = b or A^Tx = b. Each vector x is length n and is stored in the corresponding column of the array. (If mbx = 0, then bx is unchanged on output.)

If you are doing a factor only, this argument is not used in the computation and is unchanged.

Returned as: an ldbx by (at least) mbx array, containing long-precision real numbers.

Notes

If you set either IPARM(1) = 0 or IPARM(10) = 0, indicating you want to use the default values for IPARM(11) through IPARM(15) and RPARM(10), then:
- Matrix A must be positive definite.
- No pivots are fixed, using RPARM(11) through RPARM(15) values.
- No small pivots are tolerated; that is, the value should be |pivot| > RPARM(10).
Many of the input and output parameters for iparm and rparm are defined for the five pivot regions handled by this subroutine. The limits of the regions are based on RPARM(10), as shown in Figure 11. The pivot values in each region are:

Region 1: pivot < -RPARM(10)
Region 2: -RPARM(10) <= pivot < 0
Region 3: pivot = 0
Region 4: 0 < pivot <= RPARM(10)
Region 5: pivot > RPARM(10)

Figure 11. Five Pivot Regions
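
The region boundaries above, together with the IPARM(10+i) options and RPARM(10+i) fix-up values, can be sketched as follows. This Python code only illustrates the rules described in this section and is not the ESSL implementation. The function names pivot_region and process_pivot are hypothetical.

```python
def pivot_region(pivot, tol):
    """Return the region number (1 through 5) for a pivot, where tol = RPARM(10)."""
    if tol < 0.0:
        raise ValueError("RPARM(10) must be >= 0.0")
    if pivot == 0.0:
        return 3
    if pivot < -tol:
        return 1
    if pivot < 0.0:
        return 2      # -tol <= pivot < 0
    if pivot <= tol:
        return 4      # 0 < pivot <= tol
    return 5


def process_pivot(pivot, tol, options, fixups):
    """Apply the option for the pivot's region.

    options[i-1] plays the role of IPARM(10+i); fixups[i-1] plays the role
    of RPARM(10+i), for regions i = 1 through 5.
    """
    region = pivot_region(pivot, tol)
    option = options[region - 1]
    if option == -1:
        # Corresponds to the pivot being trapped (computational error 2126).
        raise ArithmeticError(f"pivot in region {region} trapped")
    if option == 1:
        return fixups[region - 1]  # pivot replaced with the fix-up value
    if region == 3:
        raise ValueError("option 0 is not permitted for region 3")
    return pivot                   # option 0: processing continues normally
```

With the default options (-1, -1, -1, -1, 0), only pivots in region 5 are accepted, which matches the statement that no small pivots are tolerated when the defaults are used.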
The IPARM(4) and IPARM(5) arguments allow you to specify the same or different skyline storage modes for your input and output arrays for matrix A. This allows you to change storage modes as needed. However, if you are concerned with performance, you should use diagonal-out skyline storage mode for both input and output, if possible, because there is less overhead.
For a description of how sparse matrices are stored in skyline storage mode, see "Profile-In Skyline Storage Mode" and "Diagonal-Out Skyline Storage Mode".
Following is an illustration of the portion of matrix A factored in the partial factorization when IPARM(3) > 0. In this case, the subroutine assumes that rows and columns 1 through IPARM(3) are already factored and that rows and columns IPARM(3)+1 through n are to be factored in this computation.

You use the partial factorization function when, for design or storage reasons, you must factor the matrix A in stages. When doing a partial factorization, you must use the same skyline storage mode for all parts of the matrix as it is progressively factored.
Your various arrays must have no common elements; otherwise, results are unpredictable.
You have the option of having the minimum required value for naux dynamically returned to your program. For details, see "Using Auxiliary Storage in ESSL".

Function

This subroutine can factor, compute the determinant of, and solve general sparse matrix A, stored in skyline storage mode. For all computations, input matrix A can be stored in either diagonal-out or profile-in skyline storage mode. Output matrix A can also be stored in either of these modes and can be different from the mode used for input.

Matrix A is factored into the following form using specified pivot processing:

A = LU

where:

U is an upper triangular matrix.

L is a lower triangular matrix.

The transformed matrix A, factored into its LU form, is stored in packed format in arrays AU and AL. The inverse of the diagonal of matrix U is stored in the corresponding elements of array AU. The off-diagonal elements of the upper triangular matrix U are stored in the corresponding off-diagonal elements of array AU. The off-diagonal elements of the lower triangular matrix L are stored in the corresponding off-diagonal elements of array AL. (The diagonal elements stored in array AL do not have meaningful values.)

The partial factorization of matrix A, which you can do when you specify the factor-only option, assumes that the first IPARM(3) rows and columns are already factored in the input matrix. It factors the remaining n-IPARM(3) rows and columns in matrix A. (See "Notes" for an illustration.) It updates only the elements in arrays AU and AL corresponding to the part of matrix A that is factored.

The determinant can be computed with any of the factorization computations. With a full factorization, you get the determinant for the whole matrix. With a partial factorization, you get the determinant for only that part of the matrix factored in this computation.
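
As a dense analogue of the factorization and determinant described above, the following Python sketch factors a small matrix without pivoting. It is illustrative only and makes several assumptions: it works on a dense list of lists rather than on skyline storage, it takes L to have a unit diagonal (consistent with the diagonal entries of AL carrying no meaningful values), and all function names are hypothetical. The last function computes the quantity described for IPARM(16) and RPARM(16).

```python
def lu_no_pivoting(a):
    """Dense factorization A = LU with unit lower triangular L, no pivoting."""
    n = len(a)
    l = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    u = [row[:] for row in a]
    for k in range(n):
        if u[k][k] == 0.0:
            raise ZeroDivisionError(f"zero pivot at row {k + 1}")
        for i in range(k + 1, n):
            l[i][k] = u[i][k] / u[k][k]
            for j in range(k, n):
                u[i][j] -= l[i][k] * u[k][j]
    return l, u


def determinant(u):
    """Determinant of A as the product of the pivots u_kk (unit-diagonal L)."""
    det = 1.0
    for k in range(len(u)):
        det *= u[k][k]
    return det


def max_diagonal_ratio(a, u):
    """Return (row k, a_kk/u_kk) for the ratio of largest absolute value."""
    ratios = [a[k][k] / u[k][k] for k in range(len(a))]
    k = max(range(len(ratios)), key=lambda i: abs(ratios[i]))
    return k + 1, ratios[k]
```

A ratio a_kk/u_kk of large magnitude indicates that the pivot is much smaller than the original diagonal element, which can be a sign of cancellation during the factorization.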

The system Ax = b or A^Tx = b, having multiple right-hand sides, is solved for x, using the transformed matrix A produced by this call or a preceding call to this subroutine.

See references [9], [12], [25], [47], and [65]. If n is 0, no computation is performed. If mbx is 0, no solve is performed.

Error Conditions

Resource Errors

Error 2015 is unrecoverable, naux = 0, and unable to allocate work area.
Unable to allocate internal work area.

Computational Errors

If a pivot occurs in region i for i = 1,5 and IPARM(10+i) = 1, the pivot value is replaced with RPARM(10+i), an attention message is issued, and processing continues.
Unacceptable pivot values occurred in the factorization of matrix A.
- One or more diagonal elements of U contain unacceptable pivots and no valid fixup is applicable. The row number i of the first unacceptable pivot element is identified in the computational error message.
- The return code is set to 2.
- i can be determined at run time by use of the ESSL error-handling facilities. To obtain this information, you must use ERRSET to change the number of allowable errors for error code 2126 in the ESSL error option table; otherwise, the default value causes your program to terminate when this error occurs. For details, see "What Can You Do about ESSL Computational Errors?".

Input-Argument Errors

n < 0
nu < 0
IDU(n+1) > nu+1
IDU(i+1) <= IDU(i) for i = 1, n
IDU(i+1) > IDU(i)+i and IPARM(4) = 0 for i = 1, n
IDU(i) > IDU(i-1)+i and IPARM(4) = 1 for i = 2, n
nl < 0
IDL(n+1) > nl+1
IDL(i+1) <= IDL(i) for i = 1, n
IDL(i+1) > IDL(i)+i and IPARM(4) = 0 for i = 1, n
IDL(i) > IDL(i-1)+i and IPARM(4) = 1 for i = 2, n
IPARM(1) <> 0 or 1
IPARM(2) <> 0, 1, 2, 10, 11, 100, 102, or 110
IPARM(3) < 0
IPARM(3) > n
IPARM(3) > 0 and IPARM(2) <> 1 or 11
IPARM(4), IPARM(5) <> 0 or 1
IPARM(2) = 0, 1, 10, 11, 100, or 110 and:

IPARM(10) <> 0 or 1
IPARM(11), IPARM(12) <> -1, 0, or 1
IPARM(13) <> -1 or 1
IPARM(14), IPARM(15) <> -1, 0, or 1
RPARM(10) < 0.0
RPARM(10+i) = 0.0 and IPARM(10+i) = 1 for i = 1,5
IPARM(2) = 0, 2, 10, 100, 102, or 110 and:

ldbx <= 0 and mbx <> 0 and n <> 0
ldbx < 0 and mbx = 0
ldbx < n and mbx <> 0
mbx < 0
Error 2015 is recoverable or naux<>0, and naux is too small--that is, less than the minimum required value. Return code 1 is returned if error 2015 is recoverable.
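
The pointer-array conditions listed above for IDU and IDL can be checked before the call with a sketch like the following. It is written in Python for illustration. The function name check_skyline_pointers and the message texts are hypothetical, and the checks simply mirror the listed conditions; they are not the ESSL error-checking code.

```python
def check_skyline_pointers(ptr, n, length, mode):
    """Return a list of violated conditions for IDU (or IDL).

    ptr:    Python list holding elements 1 through n+1 of the pointer array
    length: nu (or nl), the length of the value array
    mode:   IPARM(4); 0 = diagonal-out, 1 = profile-in
    """
    if n < 0:
        return ["n < 0"]
    errors = []
    if length < 0:
        errors.append("array length < 0")

    def p(i):                      # 1-based access, as in the manual
        return ptr[i - 1]

    if p(n + 1) > length + 1:
        errors.append("pointer(n+1) > length+1")
    for i in range(1, n + 1):
        if p(i + 1) <= p(i):
            errors.append(f"pointer({i + 1}) <= pointer({i})")
        elif mode == 0 and p(i + 1) > p(i) + i:
            errors.append(f"pointer({i + 1}) > pointer({i})+{i}")
    if mode == 1:
        for i in range(2, n + 1):
            if p(i) > p(i - 1) + i:
                errors.append(f"pointer({i}) > pointer({i - 1})+{i}")
    return errors


# IDU from Example 1 (diagonal-out) and from Example 2 (profile-in).
ok_diag_out = check_skyline_pointers(
    [1, 2, 4, 7, 10, 14, 17, 22, 26, 34], 9, 33, 0)
ok_profile_in = check_skyline_pointers(
    [1, 3, 6, 9, 13, 16, 21, 25, 33, 34], 9, 33, 1)
# A made-up array in which column 2 would hold three elements.
bad = check_skyline_pointers([1, 2, 5, 6], 3, 5, 0)
```

Both arrays from the examples pass, and the made-up array fails one condition, because a column or row i cannot hold more than i elements.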

Example 1

This example shows how to factor a 9 by 9 general sparse matrix A and solve the system Ax = b with three right-hand sides. The default values are used for IPARM and RPARM. Input matrix A, shown here, is stored in diagonal-out skyline storage mode. Matrix A is:

        *                                               *
        | 2.0  2.0  2.0  0.0  0.0  0.0   0.0  0.0   0.0 |
        | 2.0  4.0  4.0  2.0  2.0  0.0   0.0  0.0   2.0 |
        | 2.0  4.0  6.0  4.0  4.0  0.0   2.0  0.0   4.0 |
        | 2.0  4.0  6.0  6.0  6.0  2.0   4.0  0.0   6.0 |
        | 0.0  0.0  0.0  2.0  4.0  4.0   4.0  2.0   4.0 |
        | 0.0  2.0  4.0  6.0  8.0  6.0   8.0  4.0  10.0 |
        | 0.0  0.0  0.0  2.0  4.0  6.0   8.0  6.0   8.0 |
        | 0.0  0.0  0.0  2.0  4.0  6.0   8.0  8.0  10.0 |
        | 2.0  4.0  6.0  6.0  8.0  6.0  10.0  8.0  16.0 |
        *                                               *

Output matrix A, shown here, is in LU factored form with U^-1 on the diagonal, and is stored in diagonal-out skyline storage mode. Matrix A is:

        *                                             *
        | 0.5  2.0  2.0  0.0  0.0  0.0  0.0  0.0  0.0 |
        | 1.0  0.5  2.0  2.0  2.0  0.0  0.0  0.0  2.0 |
        | 1.0  1.0  0.5  2.0  2.0  0.0  2.0  0.0  2.0 |
        | 1.0  1.0  1.0  0.5  2.0  2.0  2.0  0.0  2.0 |
        | 0.0  0.0  0.0  1.0  0.5  2.0  2.0  2.0  2.0 |
        | 0.0  1.0  1.0  1.0  1.0  0.5  2.0  2.0  2.0 |
        | 0.0  0.0  0.0  1.0  1.0  1.0  0.5  2.0  2.0 |
        | 0.0  0.0  0.0  1.0  1.0  1.0  1.0  0.5  2.0 |
        | 1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  0.5 |
        *                                             *

Call Statement and Input

            N   AU  NU  IDU  AL  NL  IDL  IPARM  RPARM  AUX  NAUX  BX  LDBX MBX
            |   |   |    |   |   |    |     |      |     |    |    |    |    |
CALL DGKFS( 9 , AU, 33, IDU, AL, 35, IDL, IPARM, RPARM, AUX,  57 , BX , 12 , 3 )

AU       =  (2.0, 4.0, 2.0, 6.0, 4.0, 2.0, 6.0, 4.0, 2.0, 4.0, 6.0,
             4.0, 2.0, 6.0, 4.0, 2.0, 8.0, 8.0, 4.0, 4.0, 2.0, 8.0,
             6.0, 4.0, 2.0, 16.0, 10.0, 8.0, 10.0, 4.0, 6.0, 4.0, 2.0)
IDU      =  (1, 2, 4, 7, 10, 14, 17, 22, 26, 34)
AL       =  (0.0, 0.0, 2.0, 0.0, 4.0, 2.0, 0.0, 6.0, 4.0, 2.0, 0.0,
             2.0, 0.0, 8.0, 6.0, 4.0, 2.0, 0.0, 6.0, 4.0, 2.0, 0.0,
             8.0, 6.0, 4.0, 2.0, 0.0, 8.0, 10.0, 6.0, 8.0, 6.0, 6.0,
             4.0, 2.0)
IDL      =  (1, 2, 4, 7, 11, 13, 18, 22, 27, 36)
IPARM    =  (0, . , . , . , . , . , . , . , . , . , . , . , . , . ,
             . , . , . , . , . , . , . , . , . , . , . )

RPARM    =(not relevant)

        *                       *
        |  6.00   12.00   18.00 |
        | 16.00   32.00   48.00 |
        | 26.00   52.00   78.00 |
        | 36.00   72.00  108.00 |
        | 20.00   40.00   60.00 |
BX   =  | 48.00   96.00  144.00 |
        | 34.00   68.00  102.00 |
        | 38.00   76.00  114.00 |
        | 66.00  132.00  198.00 |
        |   .       .       .   |
        |   .       .       .   |
        |   .       .       .   |
        *                       *

Output

AU       =  (0.5, 0.5, 2.0, 0.5, 2.0, 2.0, 0.5, 2.0, 2.0, 0.5, 2.0,
             2.0, 2.0, 0.5, 2.0, 2.0, 0.5, 2.0, 2.0, 2.0, 2.0, 0.5,
             2.0, 2.0, 2.0, 0.5, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0)
IDU      =(same as input)
AL       =  (0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0,
             1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0,
             1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
             1.0, 1.0)
IDL      =(same as input)
IPARM    =  (0, . , . , . , . , . , . , . , . , . , . , . , . , . ,
             . , 9, . , . , . , . , 0, 0, 0, 0, 9)
RPARM    =  ( . , . , . , . , . , . , . , . , . , . , . , . , . , . ,
             . , 8.0, . , . , . , . , . , . , . , . , . )

        *                  *
        | 1.00  2.00  3.00 |
        | 1.00  2.00  3.00 |
        | 1.00  2.00  3.00 |
        | 1.00  2.00  3.00 |
        | 1.00  2.00  3.00 |
BX   =  | 1.00  2.00  3.00 |
        | 1.00  2.00  3.00 |
        | 1.00  2.00  3.00 |
        | 1.00  2.00  3.00 |
        |  .     .     .   |
        |  .     .     .   |
        |  .     .     .   |
        *                  *
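
The arrangement of the input arrays in this example can be checked with the following Python sketch, which expands AU, IDU, AL, and IDL into a dense matrix. The layout is inferred from the data in this example: in diagonal-out mode, each column of the upper part is stored from the diagonal upward, and each row of the lower part is stored from the diagonal leftward, with a placeholder in the diagonal position of AL. The function name unpack_diagonal_out is hypothetical.

```python
AU = [2.0, 4.0, 2.0, 6.0, 4.0, 2.0, 6.0, 4.0, 2.0, 4.0, 6.0,
      4.0, 2.0, 6.0, 4.0, 2.0, 8.0, 8.0, 4.0, 4.0, 2.0, 8.0,
      6.0, 4.0, 2.0, 16.0, 10.0, 8.0, 10.0, 4.0, 6.0, 4.0, 2.0]
IDU = [1, 2, 4, 7, 10, 14, 17, 22, 26, 34]
AL = [0.0, 0.0, 2.0, 0.0, 4.0, 2.0, 0.0, 6.0, 4.0, 2.0, 0.0,
      2.0, 0.0, 8.0, 6.0, 4.0, 2.0, 0.0, 6.0, 4.0, 2.0, 0.0,
      8.0, 6.0, 4.0, 2.0, 0.0, 8.0, 10.0, 6.0, 8.0, 6.0, 6.0,
      4.0, 2.0]
IDL = [1, 2, 4, 7, 11, 13, 18, 22, 27, 36]


def unpack_diagonal_out(au, idu, al, idl, n):
    """Expand diagonal-out skyline arrays into a dense n by n matrix."""
    a = [[0.0] * n for _ in range(n)]
    for j in range(1, n + 1):                  # column j of the upper part
        start, stop = idu[j - 1], idu[j]       # 1-based positions in AU
        for k in range(stop - start):
            a[j - 1 - k][j - 1] = au[start - 1 + k]   # row j-k, column j
    for i in range(1, n + 1):                  # row i of the lower part
        start, stop = idl[i - 1], idl[i]
        for k in range(1, stop - start):       # k = 0 is the unused diagonal
            a[i - 1][i - 1 - k] = al[start - 1 + k]   # row i, column i-k
    return a


DENSE_A = unpack_diagonal_out(AU, IDU, AL, IDL, 9)
```

The rows of DENSE_A agree with input matrix A as shown at the start of this example.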

Example 2

This example shows how to factor the 9 by 9 general sparse matrix A from Example 1, solve the system A^Tx = b with three right-hand sides, and compute the determinant of A. The default values for pivot processing are used for IPARM. Input matrix A is stored in profile-in skyline storage mode. Output matrix A is in LU factored form with U^-1 on the diagonal, and is stored in diagonal-out skyline storage mode. It is the same as output matrix A in Example 1.

Call Statement and Input

            N  AU  NU  IDU  AL  NL  IDL  IPARM  RPARM  AUX  NAUX  BX  LDBX MBX
            |  |   |    |   |   |    |     |      |     |    |    |    |    |
CALL DGKFS( 9, AU, 33, IDU, AL, 35, IDL, IPARM, RPARM, AUX,  57 , BX , 12 , 3 )

AU       =  (2.0, 2.0, 4.0, 2.0, 4.0, 6.0, 2.0, 4.0, 6.0, 2.0, 4.0,
             6.0, 4.0, 2.0, 4.0, 6.0, 2.0, 4.0, 4.0, 8.0, 8.0, 2.0,
             4.0, 6.0, 8.0, 2.0, 4.0, 6.0, 4.0, 10.0, 8.0, 10.0, 16.0)
IDU      =  (1, 3, 6, 9, 13, 16, 21, 25, 33, 34)
AL       =  (0.0, 2.0, 0.0, 2.0, 4.0, 0.0, 2.0, 4.0, 6.0, 0.0, 2.0,
             0.0, 2.0, 4.0, 6.0, 8.0, 0.0, 2.0, 4.0, 6.0, 0.0, 2.0,
             4.0, 6.0, 8.0, 0.0, 2.0, 4.0, 6.0, 6.0, 8.0, 6.0, 10.0,
             8.0, 0.0)
IDL      =  (1, 3, 6, 10, 12, 17, 21, 26, 35, 36)
IPARM    =  (1, 110, 0, 1, 0, . , . , . , . , 0, . , . , . , . , . ,
             . , . , . , . , . , . , . , . , . , . )

RPARM    =(not relevant)

        *                       *
        | 10.00   20.00   30.00 |
        | 20.00   40.00   60.00 |
        | 28.00   56.00   84.00 |
        | 30.00   60.00   90.00 |
        | 40.00   80.00  120.00 |
BX   =  | 30.00   60.00   90.00 |
        | 44.00   88.00  132.00 |
        | 28.00   56.00   84.00 |
        | 60.00  120.00  180.00 |
        |   .       .       .   |
        |   .       .       .   |
        |   .       .       .   |
        *                       *

Output

AU       =(same as output AU in Example 1)
IDU      =(same as output IDU in Example 1)
AL       =(same as output AL in Example 1)
IDL      =(same as output IDL in Example 1)
IPARM    =  (1, 110, 0, 1, 0, . , . , . , . , 0, . , . , . , . , . ,
             9, . , . , . , . , 0, 0, 0, 0, 9)
RPARM    =  ( . , . , . , . , . , . , . , . , . , . , . , . , . , . ,
             . , 8.0, 5.12, 2.0, . , . , . , . , . , . , . )
BX       =(same as output BX in Example 1)
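
The difference between the two storage modes can be seen by converting the diagonal-out input of Example 1 into the profile-in input of this example. The Python sketch below is based on what the data in the two examples shows: in profile-in mode each column is stored in the opposite order, ending at the diagonal, and each pointer gives the position of the diagonal element. The function name diagonal_out_to_profile_in is hypothetical.

```python
def diagonal_out_to_profile_in(values, ptr):
    """Convert one skyline array and its pointer array between modes.

    values, ptr: a diagonal-out array (for example AU) and its pointers (IDU).
    Returns the profile-in array and pointer array.
    """
    n = len(ptr) - 1
    out, new_ptr = [], []
    for j in range(n):
        segment = values[ptr[j] - 1: ptr[j + 1] - 1]  # diagonal first
        out.extend(reversed(segment))                 # diagonal last
        new_ptr.append(len(out))    # 1-based position of the diagonal
    new_ptr.append(len(out) + 1)    # last pointer is the array length + 1
    return out, new_ptr


AU_DIAG_OUT = [2.0, 4.0, 2.0, 6.0, 4.0, 2.0, 6.0, 4.0, 2.0, 4.0, 6.0,
               4.0, 2.0, 6.0, 4.0, 2.0, 8.0, 8.0, 4.0, 4.0, 2.0, 8.0,
               6.0, 4.0, 2.0, 16.0, 10.0, 8.0, 10.0, 4.0, 6.0, 4.0, 2.0]
IDU_DIAG_OUT = [1, 2, 4, 7, 10, 14, 17, 22, 26, 34]

AU_PROFILE_IN, IDU_PROFILE_IN = diagonal_out_to_profile_in(
    AU_DIAG_OUT, IDU_DIAG_OUT)
```

The converted arrays agree with the input AU and IDU of this example.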

Example 3

This example shows how to factor a 9 by 9 negative-definite general sparse matrix A, solve the system Ax = b with three right-hand sides, and compute the determinant of A. (Default values for pivot processing are not used for IPARM because A is negative-definite.) Input matrix A, shown here, is stored in diagonal-out skyline storage mode:

        *                                                        *
        | -2.0  -2.0  -2.0   0.0   0.0   0.0    0.0   0.0    0.0 |
        | -2.0  -4.0  -4.0  -2.0  -2.0   0.0    0.0   0.0   -2.0 |
        | -2.0  -4.0  -6.0  -4.0  -4.0   0.0   -2.0   0.0   -4.0 |
        | -2.0  -4.0  -6.0  -6.0  -6.0  -2.0   -4.0   0.0   -6.0 |
        |  0.0   0.0   0.0  -2.0  -4.0  -4.0   -4.0  -2.0   -4.0 |
        |  0.0  -2.0  -4.0  -6.0  -8.0  -6.0   -8.0  -4.0  -10.0 |
        |  0.0   0.0   0.0  -2.0  -4.0  -6.0   -8.0  -6.0   -8.0 |
        |  0.0   0.0   0.0  -2.0  -4.0  -6.0   -8.0  -8.0  -10.0 |
        | -2.0  -4.0  -6.0  -6.0  -8.0  -6.0  -10.0  -8.0  -16.0 |
        *                                                        *

Output matrix A, shown here, is in LU factored form with U^-1 on the diagonal, and is stored in diagonal-out skyline storage mode. Matrix A is:

         *                                                      *
         | -0.5  -2.0  -2.0   0.0   0.0   0.0   0.0   0.0   0.0 |
         |  1.0  -0.5  -2.0  -2.0  -2.0   0.0   0.0   0.0  -2.0 |
         |  1.0   1.0  -0.5  -2.0  -2.0   0.0  -2.0   0.0  -2.0 |
         |  1.0   1.0   1.0  -0.5  -2.0  -2.0  -2.0   0.0  -2.0 |
         |  0.0   0.0   0.0   1.0  -0.5  -2.0  -2.0  -2.0  -2.0 |
         |  0.0   1.0   1.0   1.0   1.0  -0.5  -2.0  -2.0  -2.0 |
         |  0.0   0.0   0.0   1.0   1.0   1.0  -0.5  -2.0  -2.0 |
         |  0.0   0.0   0.0   1.0   1.0   1.0   1.0  -0.5  -2.0 |
         |  1.0   1.0   1.0   1.0   1.0   1.0   1.0   1.0  -0.5 |
         *                                                      *

Call Statement and Input

            N  AU  NU  IDU  AL  NL  IDL  IPARM  RPARM  AUX  NAUX  BX  LDBX MBX
            |  |   |    |   |   |    |     |      |     |    |    |    |    |
CALL DGKFS( 9, AU, 33, IDU, AL, 35, IDL, IPARM, RPARM, AUX,  57 , BX , 12 , 3 )

AU       =  (-2.0, -4.0, -2.0, -6.0, -4.0, -2.0, -6.0, -4.0, -2.0,
             -4.0, -6.0, -4.0, -2.0, -6.0, -4.0, -2.0, -8.0, -8.0,
             -4.0, -4.0, -2.0, -8.0, -6.0, -4.0, -2.0, -16.0, -10.0,
             -8.0, -10.0, -4.0, -6.0, -4.0, -2.0)
IDU      =  (1, 2, 4, 7, 10, 14, 17, 22, 26, 34)
AL       =  (0.0, 0.0, -2.0, 0.0, -4.0, -2.0, 0.0, -6.0, -4.0, -2.0,
             0.0, -2.0, 0.0, -8.0, -6.0, -4.0, -2.0, 0.0, -6.0, -4.0,
             -2.0, 0.0, -8.0, -6.0, -4.0, -2.0, 0.0, -8.0, -10.0,
             -6.0, -8.0, -6.0, -6.0, -4.0, -2.0)
IDL      =  (1, 2, 4, 7, 11, 13, 18, 22, 27, 36)
IPARM    =  (1, 10, 0, 0, 0, . , . , . , . , 1, 0, -1, -1, -1, -1, . ,
             . , . , . , . , . , . , . , . , . )

 
RPARM    =  ( . , . , . , . , . , . , . , . , . , 10^-15, . , . ,
              . , . , . , . , . , . , . , . , . , . , . , . , . )
BX       =(same as input BX in Example 1)

Output

AU       =  (-0.5, -0.5, -2.0, -0.5, -2.0, -2.0, -0.5, -2.0, -2.0,
             -0.5, -2.0, -2.0, -2.0, -0.5, -2.0, -2.0, -0.5, -2.0,
             -2.0, -2.0, -2.0, -0.5, -2.0, -2.0, -2.0, -0.5, -2.0,
             -2.0, -2.0, -2.0, -2.0, -2.0, -2.0)
IDU      =(same as input)
AL       =  (0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0,
             1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0,
             1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
             1.0, 1.0)
IDL      =(same as input)
IPARM    =  (1, 10, 0, 0, 0, . , . , . , . , 1, 0, -1, -1, -1, -1, 9,
             . , . , . , . , 9, 0, 0, 0, 0)
 
RPARM    =  ( . , . , . , . , . , . , . , . , . , 10^-15, . , . ,
             . , . , . , 8.0, -5.12, 2.0, . , . , . , . , . , . , . )

        *                     *
        | -1.00  -2.00  -3.00 |
        | -1.00  -2.00  -3.00 |
        | -1.00  -2.00  -3.00 |
        | -1.00  -2.00  -3.00 |
        | -1.00  -2.00  -3.00 |
BX   =  | -1.00  -2.00  -3.00 |
        | -1.00  -2.00  -3.00 |
        | -1.00  -2.00  -3.00 |
        | -1.00  -2.00  -3.00 |
        |   .      .      .   |
        |   .      .      .   |
        |   .      .      .   |
        *                     *

Example 4

This example shows how to factor the first six rows and columns, referred to as matrix A1, of the 9 by 9 general sparse matrix A from Example 1 and compute the determinant of A1. Input matrix A1, shown here, is stored in diagonal-out skyline storage mode. Input matrix A1 is:

                     *                              *
                     | 2.0  2.0  2.0  0.0  0.0  0.0 |
                     | 2.0  4.0  4.0  2.0  2.0  0.0 |
                     | 2.0  4.0  6.0  4.0  4.0  0.0 |
                     | 2.0  4.0  6.0  6.0  6.0  2.0 |
                     | 0.0  0.0  0.0  2.0  4.0  4.0 |
                     | 0.0  2.0  4.0  6.0  8.0  6.0 |
                     *                              *

Output matrix A1, shown here, is in LU factored form with U^-1 on the diagonal, and is stored in diagonal-out skyline storage mode. Output matrix A1 is:

                     *                              *
                     | 0.5  2.0  2.0  0.0  0.0  0.0 |
                     | 1.0  0.5  2.0  2.0  2.0  0.0 |
                     | 1.0  1.0  0.5  2.0  2.0  0.0 |
                     | 1.0  1.0  1.0  0.5  2.0  2.0 |
                     | 0.0  0.0  0.0  1.0  0.5  2.0 |
                     | 0.0  1.0  1.0  1.0  1.0  0.5 |
                     *                              *

Call Statement and Input

            N  AU  NU  IDU  AL  NL  IDL  IPARM  RPARM  AUX  NAUX  BX   LDBX   MBX
            |  |   |    |   |   |    |     |      |     |    |    |     |      |
CALL DGKFS( 6, AU, 33, IDU, AL, 35, IDL, IPARM, RPARM, AUX,  45 , BX , LDBX , MBX )

AU       =(same as input AU in Example 1)
IDU      =  (1, 2, 4, 7, 10, 14, 17)
AL       =(same as input AL in Example 1)
IDL      =  (1, 2, 4, 7, 11, 13, 18)
IPARM    =  (1, 11, 0, 0, 0, . , . , . , . , 0, . , . , . , . , . ,
             . , . , . , . , . , . , . , . , . , . )
RPARM    =(not relevant)
BX       =(not relevant)
LDBX     =(not relevant)
MBX      =(not relevant)

Output

AU       =  (0.5, 0.5, 2.0, 0.5, 2.0, 2.0, 0.5, 2.0, 2.0, 0.5, 2.0,
             2.0, 2.0, 0.5, 2.0, 2.0, 8.0, 8.0, 4.0, 4.0, 2.0, 8.0,
             6.0, 4.0, 2.0, 16.0, 10.0, 8.0, 10.0, 4.0, 6.0, 4.0, 2.0)
IDU      =(same as input)
AL       =  (0.0, 0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0,
             1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 6.0, 4.0, 2.0, 0.0,
             8.0, 6.0, 4.0, 2.0, 0.0, 8.0, 10.0, 6.0, 8.0, 6.0, 6.0,
             4.0, 2.0)
IDL      =(same as input)
IPARM    =  (1, 11, 0, 0, 0, . , . , . , . , 0, . , . , . , . , . , 3,
             . , . , . , . , 0, 0, 0, 0, 6)
RPARM    =  ( . , . , . , . , . , . , . , . , . , . , . , . , . , . ,
             . , 3.0, 6.4, 1.0, . , . , . , . , . , . , . )
BX       =(same as input)
LDBX     =(same as input)
MBX      =(same as input)

Example 5

This example shows how to do a partial factorization of the 9 by 9 general sparse matrix A from Example 1, where the first six rows and columns were factored in Example 4. It factors the remaining three rows and columns and computes the determinant of that part of the matrix. The input matrix, referred to as A2, shown here, is made up of the output factored matrix A1 plus the three remaining unfactored rows and columns of matrix A. Matrix A2 is:

            *                                               *
            | 0.5  2.0  2.0  0.0  0.0  0.0   0.0  0.0   0.0 |
            | 1.0  0.5  2.0  2.0  2.0  0.0   0.0  0.0   2.0 |
            | 1.0  1.0  0.5  2.0  2.0  0.0   2.0  0.0   4.0 |
            | 1.0  1.0  1.0  0.5  2.0  2.0   4.0  0.0   6.0 |
            | 0.0  0.0  0.0  1.0  0.5  2.0   4.0  2.0   4.0 |
            | 0.0  1.0  1.0  1.0  1.0  0.5   8.0  4.0  10.0 |
            | 0.0  0.0  0.0  2.0  4.0  6.0   8.0  6.0   8.0 |
            | 0.0  0.0  0.0  2.0  4.0  6.0   8.0  8.0  10.0 |
            | 2.0  4.0  6.0  6.0  8.0  6.0  10.0  8.0  16.0 |
            *                                               *

Both parts of input matrix A2 are stored in diagonal-out skyline storage mode.

Output matrix A2 is the same as output matrix A in Example 1 and is stored in diagonal-out skyline storage mode.

Call Statement and Input

            N  AU  NU  IDU  AL  NL  IDL  IPARM  RPARM  AUX  NAUX  BX   LDBX   MBX
            |  |   |    |   |   |    |     |      |     |    |    |     |      |
CALL DGKFS( 9, AU, 33, IDU, AL, 35, IDL, IPARM, RPARM, AUX,  45 , BX , LDBX , MBX )

AU       =(same as output AU in Example 4)
IDU      =(same as input IDU in Example 1)
AL       =(same as output AL in Example 4)
IDL      =(same as input IDL in Example 1)
IPARM    =  (1, 11, 6, 0, 0, . , . , . , . , 0, . , . , . , . , . ,
             . , . , . , . , . , . , . , . , . , . )
RPARM    =(not relevant)
BX       =(not relevant)
LDBX     =(not relevant)
MBX      =(not relevant)

Output

AU       =(same as output AU in Example 1)
IDU      =(same as output IDU in Example 1)
AL       =(same as output AL in Example 1)
IDL      =(same as output IDL in Example 1)
IPARM    =  (1, 11, 6, 0, 0, . , . , . , . , 0, . , . , . , . , . , 9,
             . , . , . , . , 0, 0, 0, 0, 3)
RPARM    =  ( . , . , . , . , . , . , . , . , . , . , . , . , . , . ,
             . , 8.0, 8.0, 0.0, . , . , . , . , . , . , . )
BX       =(same as input)
LDBX     =(same as input)
MBX      =(same as input)

Example 6

This example shows how to solve the system Ax = b with one right-hand side for a general sparse matrix A. Input matrix A, used here, is the same as factored output matrix A from Example 1, stored in profile-in skyline storage mode. Matrix A is unchanged on output and remains in profile-in skyline storage mode.

Call Statement and Input

            N  AU  NU  IDU  AL  NL  IDL  IPARM  RPARM  AUX NAUX  BX  LDBX MBX
            |  |   |    |   |   |    |     |      |     |   |    |    |    |
CALL DGKFS( 9, AU, 33, IDU, AL, 35, IDL, IPARM, RPARM, AUX, 49 , BX , 9 ,  1 )

AU       =  (0.5, 2.0, 0.5, 2.0, 2.0, 0.5, 2.0, 2.0, 0.5, 2.0, 2.0,
             2.0, 0.5, 2.0, 2.0, 0.5, 2.0, 2.0, 2.0, 2.0, 0.5, 2.0,
             2.0, 2.0, 0.5, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 2.0, 0.5)
IDU      =  (1, 3, 6, 9, 13, 16, 21, 25, 33, 34)
AL       =  (0.0, 1.0, 0.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0,
             0.0, 1.0, 1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 0.0, 1.0,
             1.0, 1.0, 1.0, 0.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
             1.0, 0.0)
IDL      =  (1, 3, 6, 10, 12, 17, 21, 26, 35, 36)
IPARM    =  (1, 2, 0, 1, 1, . , . , . , . , . , . , . , . , . , . ,
             . , . , . , . , . , . , . , . , . , . )
RPARM    =(not relevant)
BX       =  (12.0, 58.0, 114.0, 176.0, 132.0, 294.0, 240.0, 274.0,
             406.0)

Output

AU       =(same as input)
IDU      =(same as input)
AL       =(same as input)
IDL      =(same as input)
IPARM    =(same as input)
RPARM    =(not relevant)
BX       =  (1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0)
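
The solve step in this example can be reproduced with a dense Python sketch. It is illustrative only and is not the ESSL skyline implementation. It takes the factored matrix in the combined form shown in Example 1 (L below the diagonal with an implicit unit diagonal, U above it, and the reciprocals of the diagonal of U on the diagonal) and applies forward elimination followed by back substitution. The function name solve_with_combined is hypothetical.

```python
F = [  # factored output matrix A from Example 1
    [0.5, 2.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 0.5, 2.0, 2.0, 2.0, 0.0, 0.0, 0.0, 2.0],
    [1.0, 1.0, 0.5, 2.0, 2.0, 0.0, 2.0, 0.0, 2.0],
    [1.0, 1.0, 1.0, 0.5, 2.0, 2.0, 2.0, 0.0, 2.0],
    [0.0, 0.0, 0.0, 1.0, 0.5, 2.0, 2.0, 2.0, 2.0],
    [0.0, 1.0, 1.0, 1.0, 1.0, 0.5, 2.0, 2.0, 2.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 0.5, 2.0, 2.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0, 0.5, 2.0],
    [1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 0.5],
]
B = [12.0, 58.0, 114.0, 176.0, 132.0, 294.0, 240.0, 274.0, 406.0]


def solve_with_combined(f, b):
    """Solve LUx = b using the combined factored matrix f."""
    n = len(f)
    y = b[:]
    for i in range(n):                 # forward elimination: Ly = b
        for j in range(i):
            y[i] -= f[i][j] * y[j]
    x = y[:]
    for i in range(n - 1, -1, -1):     # back substitution: Ux = y
        for j in range(i + 1, n):
            x[i] -= f[i][j] * x[j]
        x[i] *= f[i][i]                # multiply by the stored 1/u_ii
    return x


X = solve_with_combined(F, B)
```

For this input, X is (1.0, 2.0, ..., 9.0), the same as output BX in this example. Storing the reciprocal of each pivot lets the back substitution multiply instead of divide.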

DSKFS--Symmetric Sparse Matrix Factorization, Determinant, and Solve Using Skyline Storage Mode

This subroutine can perform either or both of the following functions for symmetric sparse matrix A, stored in skyline storage mode, and for vectors x and b:

Factor A and, optionally, compute the determinant of A.
Solve the system Ax = b using the results of the factorization of matrix A, produced on this call or a preceding call to this subroutine.

You have the choice of using either Gaussian elimination or Cholesky decomposition. You also have the choice of using profile-in or diagonal-out skyline storage mode for A on input or output.
Note: The input to the solve performed by this subroutine must be the output from the factorization performed by this subroutine.

Syntax

Fortran	CALL DSKFS (`n`, `a`, `na`, `idiag`, `iparm`, `rparm`, `aux`, `naux`, `bx`, `ldbx`, `mbx`)
C and C++	dskfs (`n`, `a`, `na`, `idiag`, `iparm`, `rparm`, `aux`, `naux`, `bx`, `ldbx`, `mbx`);
PL/I	CALL DSKFS (`n`, `a`, `na`, `idiag`, `iparm`, `rparm`, `aux`, `naux`, `bx`, `ldbx`, `mbx`);

On Entry

n

is the order of symmetric sparse matrix A. Specified as: a fullword integer; n >= 0.

a

is the array, referred to as A, containing one of three forms of the upper triangular part of symmetric sparse matrix A, depending on the type of computation performed, where:

If you are doing a factor and solve or a factor only, and if IPARM(3) = 0, then A contains the unfactored upper triangle of symmetric sparse matrix A.
If you are doing a factor only, and if IPARM(3) > 0, then A contains the partially factored upper triangle of symmetric sparse matrix A. The first IPARM(3) columns in the upper triangle of A are already factored. The remaining columns are factored in this computation.
If you are doing a solve only, then A contains the factored upper triangle of sparse matrix A, produced by a preceding call to this subroutine.

In each case:

If IPARM(4) = 0, diagonal-out skyline storage mode is used for A.

If IPARM(4) = 1, profile-in skyline storage mode is used for A.

Specified as: a one-dimensional array of (at least) length na, containing long-precision real numbers.

na

is the length of array A. Specified as: a fullword integer; na >= 0 and na >= (IDIAG(n+1)-1).

idiag

is the array, referred to as IDIAG, containing the relative positions of the diagonal elements of matrix A (in one of its three forms) in array A. Specified as: a one-dimensional array of (at least) length n+1, containing fullword integers.

iparm

is an array of parameters, IPARM(i), where:

IPARM(1) indicates whether certain default values for iparm and rparm are used by this subroutine, where:
If IPARM(1) = 0, the following default values are used. For restrictions, see "Notes".

IPARM(2) = 0
IPARM(3) = 0
IPARM(4) = 0
IPARM(5) = 0
IPARM(10) = 0
IPARM(11) = -1
IPARM(12) = -1
IPARM(13) = -1
IPARM(14) = -1
IPARM(15) = 0
RPARM(10) = 10^-12

If IPARM(1) = 1, the default values are not used.

IPARM(2) indicates the type of computation performed by this subroutine. The following table gives the IPARM(2) values for each variation:

Type of Computation	Gaussian Elimination Ax = b	Gaussian Elimination Ax = b and Determinant(A)	Cholesky Decomposition Ax = b	Cholesky Decomposition Ax = b and Determinant(A)
Factor and Solve	0	10	100	110
Factor Only	1	11	101	111
Solve Only	2	N/A	102	N/A

IPARM(3) indicates whether a full or partial factorization is performed on matrix A, where:
If IPARM(3) = 0, and:

If you are doing a factor and solve or a factor only, then a full factorization is performed for matrix A on rows and columns 1 through n.
If you are doing a solve only, this argument has no effect on the computation, but must be set to 0.

If IPARM(3) > 0, and you are doing a factor only, then a partial factorization is performed on matrix A. Rows 1 through IPARM(3) of columns 1 through IPARM(3) in matrix A must be in factored form from a preceding call to this subroutine. The factorization is performed on rows IPARM(3)+1 through n and columns IPARM(3)+1 through n. For an illustration, see "Notes".
IPARM(4) indicates the input storage mode used for matrix A. This determines the arrangement of data in arrays A and IDIAG on input, where:
If IPARM(4) = 0, diagonal-out skyline storage mode is used.
If IPARM(4) = 1, profile-in skyline storage mode is used.
IPARM(5) indicates the output storage mode used for matrix A. This determines the arrangement of data in arrays A and IDIAG on output, where:
If IPARM(5) = 0, diagonal-out skyline storage mode is used.
If IPARM(5) = 1, profile-in skyline storage mode is used.
IPARM(6) through IPARM(9) are reserved.
IPARM(10) has the following meaning, where:
If you are doing a factor and solve or a factor only, then IPARM(10) indicates whether certain default values for iparm and rparm are used by this subroutine, where:

If IPARM(10) = 0, the following default values are used. For restrictions, see "Notes".

IPARM(11) = -1
IPARM(12) = -1
IPARM(13) = -1
IPARM(14) = -1
IPARM(15) = 0
RPARM(10) = 10^-12

If IPARM(10) = 1, the default values are not used.

If you are doing a solve only, this argument is not used.

IPARM(11) through IPARM(15) have the following meaning, where:

If you are doing a factor and solve or a factor only, then IPARM(11) through IPARM(15) control the type of processing to apply to pivot elements occurring in regions 1 through 5, respectively. The pivot elements are d_kk for Gaussian elimination and r_kk for Cholesky decomposition, for k = 1, n when doing a full factorization and for k = IPARM(3)+1, n when doing a partial factorization. The region in which a pivot element falls depends on the sign and magnitude of the pivot element. The regions are determined by RPARM(10). For a description of the regions and associated pivot values, see "Notes". For each region i for i = 1,5, where the pivot occurs in region i, the processing applied to the pivot element is determined by IPARM(10+i), where:

If IPARM(10+i) = -1, the pivot element is trapped and computational error 2126 is generated. See "Error Conditions".

If IPARM(10+i) = 0, processing continues normally.

Note:

A value of 0 is not permitted for region 3, because if processing continues, a divide-by-zero exception occurs. In addition, if you are doing a Cholesky decomposition, a value of 0 is not permitted in regions 1 and 2, because a square root exception occurs.

If IPARM(10+i) = 1, the pivot element is replaced with the value in RPARM(10+i), and processing continues normally.

If you are doing a solve only, these arguments are not used.

IPARM(16) through IPARM(25), see 'On Return'.

Specified as: a one-dimensional array of (at least) length 25, containing fullword integers, where:

IPARM(1) = 0 or 1

IPARM(2) = 0, 1, 2, 10, 11, 100, 101, 102, 110, or 111

If IPARM(2) = 0, 2, 10, 100, 102, or 110, then IPARM(3) = 0

If IPARM(2) = 1, 11, 101, or 111, then 0 <= IPARM(3) <= n

IPARM(4), IPARM(5) = 0 or 1

If IPARM(2) = 0, 1, 10, or 11, then:

IPARM(10) = 0 or 1

IPARM(11), IPARM(12) = -1, 0, or 1

IPARM(13) = -1 or 1

IPARM(14), IPARM(15) = -1, 0, or 1

If IPARM(2) = 100, 101, 110, or 111, then:

IPARM(10) = 0 or 1

IPARM(11), IPARM(12), IPARM(13) = -1 or 1

IPARM(14), IPARM(15) = -1, 0, or 1
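
The constraints on iparm listed above can be checked before calling DSKFS with a sketch like the following, written in Python for illustration. The function name check_dskfs_iparm and the message texts are hypothetical. The sketch also assumes that when IPARM(1) = 0 the remaining input elements are not inspected, and that when IPARM(10) = 0 the values in IPARM(11) through IPARM(15) are not inspected, because the defaults apply in those cases.

```python
GAUSS_FACTOR = (0, 1, 10, 11)           # Gaussian elimination with a factor step
CHOLESKY_FACTOR = (100, 101, 110, 111)  # Cholesky decomposition with a factor step
SOLVE_ONLY = (2, 102)
NO_PARTIAL = (0, 2, 10, 100, 102, 110)  # IPARM(3) must be 0 for these


def check_dskfs_iparm(iparm, n):
    """Return a list of violated constraints for a 25-element iparm list."""
    def ip(i):                          # 1-based access, as in the manual
        return iparm[i - 1]

    errors = []
    if ip(1) not in (0, 1):
        errors.append("IPARM(1) must be 0 or 1")
    if ip(1) != 1:
        return errors                   # assumption: defaults apply
    if ip(2) not in GAUSS_FACTOR + CHOLESKY_FACTOR + SOLVE_ONLY:
        errors.append("IPARM(2) is not a valid computation type")
        return errors
    if ip(2) in NO_PARTIAL:
        if ip(3) != 0:
            errors.append("IPARM(3) must be 0 for this IPARM(2)")
    elif not 0 <= ip(3) <= n:
        errors.append("IPARM(3) must satisfy 0 <= IPARM(3) <= n")
    for i in (4, 5):
        if ip(i) not in (0, 1):
            errors.append(f"IPARM({i}) must be 0 or 1")
    if ip(2) in SOLVE_ONLY:
        return errors                   # pivot options are not used
    if ip(10) not in (0, 1):
        errors.append("IPARM(10) must be 0 or 1")
    if ip(10) != 1:
        return errors                   # assumption: defaults apply
    # Regions where continuing normally (0) is not permitted.
    no_zero = (13,) if ip(2) in GAUSS_FACTOR else (11, 12, 13)
    for i in range(11, 16):
        allowed = (-1, 1) if i in no_zero else (-1, 0, 1)
        if ip(i) not in allowed:
            errors.append(f"IPARM({i}) must be one of {allowed}")
    return errors


partial = check_dskfs_iparm([1, 11, 6, 0, 0] + [0] * 20, 9)
cholesky_bad = check_dskfs_iparm(
    [1, 110, 0, 0, 0, 0, 0, 0, 0, 1, 0, -1, -1, -1, -1] + [0] * 10, 9)
```

Here partial is empty, and cholesky_bad reports IPARM(11), because a value of 0 is not permitted in region 1 for a Cholesky decomposition.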

rparm

is an array of parameters, RPARM(i), where:

RPARM(1) through RPARM(9) are reserved.
RPARM(10) has the following meaning, where:
If you are doing a factor and solve or a factor only, RPARM(10) is the tolerance value for small pivots. This sets the bounds for the pivot regions, where pivots are processed according to the options you specify for the five regions in IPARM(11) through IPARM(15), respectively. The suggested value is 10^-15 <= RPARM(10) <= 1.
If you are doing a solve only, this argument is not used.
RPARM(11) through RPARM(15) have the following meaning, where:
If you are doing a factor and solve or a factor only, RPARM(11) through RPARM(15) are the fix-up values to use for the pivots in regions 1 through 5, respectively. For each RPARM(10+i) for i = 1,5, where the pivot occurs in region i:

If IPARM(10+i) = 1, the pivot is replaced with RPARM(10+i), where |RPARM(10+i)| should be a sufficiently large nonzero value to avoid overflow when calculating the reciprocal of the pivot. For Gaussian elimination, the suggested value is 10^-15 <= |RPARM(10+i)| <= 1. For Cholesky decomposition, the value must be RPARM(10+i) > 0.
If IPARM(10+i) <> 1, RPARM(10+i) is not used.

If you are doing a solve only, these arguments are not used.
RPARM(16) through RPARM(25), see 'On Return'.

Specified as: a one-dimensional array of (at least) length 25, containing long-precision real numbers, where if IPARM(2) = 0, 1, 10, 11, 100, 101, 110, or 111, then:

RPARM(10) >= 0.0

If IPARM(2) = 0, 1, 10, or 11, then RPARM(11) through RPARM(15) <> 0.0

If IPARM(2) = 100, 101, 110, or 111, then RPARM(11) through RPARM(15) > 0.0

aux

has the following meaning:

If naux = 0 and error 2015 is unrecoverable, aux is ignored.

Otherwise, it is the storage work area used by this subroutine. Its size is specified by naux.

Specified as: an area of storage, containing long-precision real numbers.

naux

is the size of the work area specified by aux--that is, the number of elements in aux. Specified as: a fullword integer, where:

If naux = 0 and error 2015 is unrecoverable, DSKFS dynamically allocates the work area used by this subroutine. The work area is deallocated before control is returned to the calling program.

Otherwise, if you are doing a factor only, you can use naux >= n; however, for optimal performance, use naux >= 3n.

If you are doing a factor and solve or a solve only, use naux >= 3n+4mbx.

For further details on error handling and the special factor-only case, see "Notes".
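
The sizes given above for naux can be computed with a small helper. The following Python sketch only restates the formulas in this description; the function name dskfs_naux and its arguments are hypothetical.

```python
def dskfs_naux(n, mbx, factor_only, optimal=True):
    """Return a work-area size for DSKFS, following the stated formulas.

    factor_only: True for a factor only; False for a factor and solve
                 or a solve only.
    optimal:     for a factor only, choose 3n (True) or the smaller n (False).
    """
    if factor_only:
        return 3 * n if optimal else n
    return 3 * n + 4 * mbx


naux_solve = dskfs_naux(9, 3, factor_only=False)     # 3*9 + 4*3
naux_factor = dskfs_naux(9, 0, factor_only=True)     # 3*9
```

For a 9 by 9 matrix with three right-hand sides, this gives 39 for a factor and solve and 27 for a factor only.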

bx

has the following meaning, where:

If you are doing a factor and solve or a solve only, bx is the array, containing the mbx right-hand side vectors b of the system Ax = b. Each vector b is length n and is stored in the corresponding column of the array.

If you are doing a factor only, this argument is not used in the computation.

Specified as: an ldbx by (at least) mbx array, containing long-precision real numbers.

ldbx

has the following meaning, where:

If you are doing a factor and solve or a solve only, ldbx is the leading dimension of the array specified for bx.

If you are doing a factor only, this argument is not used in the computation.

Specified as: a fullword integer; ldbx >= n and:

If mbx <> 0, then ldbx > 0.

If mbx = 0, then ldbx >= 0.

mbx

has the following meaning, where:

If you are doing a factor and solve or a solve only, mbx is the number of right-hand side vectors, b, in the array specified for bx.

If you are doing a factor only, this argument is not used in the computation.

Specified as: a fullword integer; mbx >= 0.

On Return

a

is the array, referred to as A, containing the upper triangular part of symmetric sparse matrix A in LDL^T or R^TR factored form, where:

If IPARM(5) = 0, diagonal-out skyline storage mode is used for A.

If IPARM(5) = 1, profile-in skyline storage mode is used for A.

(If mbx = 0 and you are doing a solve only, then a is unchanged on output.) Returned as: a one-dimensional array of (at least) length na, containing long-precision real numbers.

idiag

is the array, referred to as IDIAG, containing the relative positions of the diagonal elements of the factored output matrix A in array A. (If mbx = 0 and you are doing a solve only, then idiag is unchanged on output.)

Returned as: a one-dimensional array of (at least) length n+1, containing fullword integers.

iparm

is an array of parameters, IPARM(i), where:

IPARM(1) through IPARM(15) are unchanged.
IPARM(16) has the following meaning, where:
If you are doing a factor and solve or a factor only, and:

If IPARM(16) = -1, your factorization did not complete successfully, resulting in computational error 2126.
If IPARM(16) > 0, it is the row number k, in which the maximum absolute value of the ratio a_kk/d_kk for Gaussian elimination and a_kk/r_kk for Cholesky decomposition occurred, where:

If IPARM(3) = 0, k can be any of the rows, 1 through n, in the full factorization.
If IPARM(3) > 0, k can be any of the rows, IPARM(3)+1 through n, in the partial factorization.

If you are doing a solve only, this argument is not used in the computation and is unchanged.
IPARM(17) through IPARM(20) are reserved.
IPARM(21) through IPARM(25) have the following meaning, where:
If you are doing a factor and solve or a factor only, IPARM(21) through IPARM(25) have the following meanings for each region i for i = 1,5, respectively:

If IPARM(20+i) = -1, your factorization did not complete successfully, resulting in computational error 2126.
If IPARM(20+i) >= 0, it is the number of pivots in region i for the columns that were factored in matrix A, where:

If IPARM(3) = 0, columns 1 through n were factored in the full factorization.
If IPARM(3) > 0, columns IPARM(3)+1 through n were factored in the partial factorization.

If you are doing a solve only, these arguments are not used in the computation and are unchanged.

Returned as: a one-dimensional array of (at least) length 25, containing fullword integers.
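
As a rough illustration of what IPARM(16) and RPARM(16) report, the following Python sketch scans the diagonal of the original matrix and the diagonal of the factor for the ratio with the largest absolute value. The function name, the plain-list inputs, and the tie-breaking rule (first row wins) are assumptions made for illustration; they are not part of ESSL.

```python
def max_pivot_ratio(a_diag, f_diag, first_row=1):
    """Return (k, ratio) for the 1-based row k whose ratio a_kk / f_kk has the
    largest absolute value among rows first_row through n.

    a_diag    -- diagonal elements a_kk of the original matrix A
    f_diag    -- diagonal elements d_kk (Gaussian) or r_kk (Cholesky)
    first_row -- IPARM(3)+1 for a partial factorization, otherwise 1
    """
    best_k, best_ratio = None, 0.0
    for k in range(first_row, len(a_diag) + 1):
        ratio = a_diag[k - 1] / f_diag[k - 1]
        # Keep the first row that reaches the maximum (assumed tie rule).
        if best_k is None or abs(ratio) > abs(best_ratio):
            best_k, best_ratio = k, ratio
    return best_k, best_ratio


# Diagonal of matrix A in Example 1; D is the identity in that example.
a_diag = [1.0, 2.0, 3.0, 4.0, 4.0, 5.0, 3.0, 7.0, 4.0]
d_diag = [1.0] * 9
print(max_pivot_ratio(a_diag, d_diag))               # full factorization
print(max_pivot_ratio(a_diag, d_diag, first_row=7))  # partial, IPARM(3) = 6
```

With the Example 1 data, both calls select row 8 with a ratio of 7.0, which agrees with the IPARM(16) and RPARM(16) values shown in Examples 1 and 5.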

rparm

is an array of parameters, RPARM(i), where:

RPARM(1) through RPARM(15) are unchanged.
RPARM(16) has the following meaning, where:
If you are doing a factor and solve or a factor only, and:

If RPARM(16) = 0.0, your factorization did not complete successfully, resulting in computational error 2126.
If |RPARM(16)| > 0.0, it is the ratio for row k, a_kk/d_kk for Gaussian elimination and a_kk/r_kk for Cholesky decomposition, having the maximum absolute value. Row k is indicated in IPARM(16), and:

If IPARM(3) = 0, the ratio corresponds to one of the rows, 1 through n, in the full factorization.
If IPARM(3) > 0, the ratio corresponds to one of the rows, IPARM(3)+1 through n, in the partial factorization.

If you are doing a solve only, this argument is not used in the computation and is unchanged.
RPARM(17) and RPARM(18) have the following meaning, where:
If you are computing the determinant of matrix A, then RPARM(17) is the mantissa, detbas, and RPARM(18) is the power of 10, detpwr, used to express the value of the determinant: detbas(10^detpwr), where 1 <= |detbas| < 10. Also:

If IPARM(3) = 0, the determinant is computed for columns 1 through n in the full factorization.
If IPARM(3) > 0, the determinant is computed for columns IPARM(3)+1 through n in the partial factorization.

If you are not computing the determinant of matrix A, these arguments are not used in the computation and are unchanged.
RPARM(19) through RPARM(25) are reserved.

Returned as: a one-dimensional array of (at least) length 25, containing long-precision real numbers.
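
The mantissa and power-of-10 form used for RPARM(17) and RPARM(18) can be sketched in Python as follows. This is only an illustration of the representation, not the ESSL computation; the function names are hypothetical, and the handling of a zero determinant is an assumption.

```python
import math


def split_determinant(det):
    """Split det into (detbas, detpwr) so that det = detbas * 10**detpwr
    and the magnitude of detbas lies in [1, 10)."""
    if det == 0.0:
        return 0.0, 0.0  # assumption: the text does not cover this case
    detpwr = math.floor(math.log10(abs(det)))
    detbas = det / 10.0 ** detpwr
    # Guard against rounding in log10 near exact powers of 10.
    if abs(detbas) >= 10.0:
        detbas /= 10.0
        detpwr += 1
    elif abs(detbas) < 1.0:
        detbas *= 10.0
        detpwr -= 1
    return detbas, float(detpwr)


def join_determinant(detbas, detpwr):
    """Rebuild the determinant from its mantissa and power of 10."""
    return detbas * 10.0 ** detpwr


# Example 7: the diagonal of R is 1, 2, ..., 9, so det(A) = (1*2*...*9)**2.
det = 1.0
for r in range(1, 10):
    det *= float(r) ** 2
detbas, detpwr = split_determinant(det)
print(round(detbas, 2), detpwr)
```

For the Example 7 data this gives a mantissa of about 1.32 and a power of 11, matching the RPARM(17) and RPARM(18) values shown there.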

bx

has the following meaning, where:

If you are doing a factor and solve or a solve only, bx is the array, containing the mbx solution vectors x of the system Ax = b. Each vector x is length n and is stored in the corresponding column of the array. (If mbx = 0, then bx is unchanged on output.)

If you are doing a factor only, this argument is not used in the computation and is unchanged.

Returned as: an ldbx by (at least) mbx array, containing long-precision real numbers.

Notes

When doing a solve only, you should specify the same factorization method in IPARM(2), Gaussian elimination or Cholesky decomposition, that you specified for your factorization on a previous call to this subroutine.
If you set either IPARM(1) = 0 or IPARM(10) = 0, indicating you want to use the default values for IPARM(11) through IPARM(15) and RPARM(10), then:
- Matrix A must be positive definite.
- No pivots are fixed, using RPARM(11) through RPARM(15) values.
- No small pivots are tolerated; that is, the value should be |pivot| > RPARM(10).
Many of the input and output parameters for iparm and rparm are defined for the five pivot regions handled by this subroutine. The limits of the regions are based on RPARM(10), as shown in Figure 12. The pivot values in each region are:

Region 1: pivot < -RPARM(10)
Region 2: -RPARM(10) <= pivot < 0
Region 3: pivot = 0
Region 4: 0 < pivot <= RPARM(10)
Region 5: pivot > RPARM(10)

Figure 12. Five Pivot Regions
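
The region limits in Figure 12 can be expressed as a short Python sketch. The function names are hypothetical, and the argument small stands for RPARM(10), which is assumed to be nonnegative.

```python
def pivot_region(pivot, small):
    """Return the region number (1 through 5) for a pivot value,
    where small plays the role of RPARM(10) and small >= 0."""
    if pivot < -small:
        return 1
    if pivot < 0.0:          # here -small <= pivot < 0
        return 2
    if pivot == 0.0:
        return 3
    if pivot <= small:       # here 0 < pivot <= small
        return 4
    return 5


def count_pivots_by_region(pivots, small):
    """Count the pivots in each region, in the order of IPARM(21) to IPARM(25)."""
    counts = [0, 0, 0, 0, 0]
    for p in pivots:
        counts[pivot_region(p, small) - 1] += 1
    return counts


# Example 3: all nine pivots are -1.0 and RPARM(10) is 10^-15.
print(count_pivots_by_region([-1.0] * 9, 1e-15))  # [9, 0, 0, 0, 0]
```

The counts for the Example 3 pivots agree with the IPARM(21) through IPARM(25) values in that example's output.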
The IPARM(4) and IPARM(5) arguments allow you to specify the same or different skyline storage modes for your input and output arrays for matrix A. This allows you to change storage modes as needed. However, if you are concerned with performance, you should use diagonal-out skyline storage mode for both input and output, if possible, because there is less overhead.
For a description of how sparse matrices are stored in skyline storage mode, see "Profile-In Skyline Storage Mode" and "Diagonal-Out Skyline Storage Mode". Those descriptions use different array and variable names from the ones used here. To relate the two sets, use the following table:

Name Here Name in the Storage Description
A AU
na nu
IDIAG IDU
Following is an illustration of the portion of matrix A factored in the partial factorization when IPARM(3) > 0. In this case, the subroutine assumes that rows and columns 1 through IPARM(3) are already factored and that rows and columns IPARM(3)+1 through n are to be factored in this computation.

You use the partial factorization function when, for design or storage reasons, you must factor the matrix A in stages. When doing a partial factorization, you must use the same skyline storage mode for all parts of the matrix as it is progressively factored.
Your various arrays must have no common elements; otherwise, results are unpredictable.
You have the option of having the minimum required value for naux dynamically returned to your program. For details, see "Using Auxiliary Storage in ESSL".

Function

This subroutine can factor, compute the determinant of, and solve symmetric sparse matrix A, stored in skyline storage mode. It can use either Gaussian elimination or Cholesky decomposition. For all computations, input matrix A can be stored in either diagonal-out or profile-in skyline storage mode. Output matrix A can also be stored in either of these modes, and the output mode can differ from the input mode.

For Gaussian elimination, matrix A is factored into the following form using specified pivot processing:

A = LDL^T

where:

D is a diagonal matrix.

L is a lower triangular matrix.

The transformed matrix A, factored into its LDL^T form, is stored in packed format in array A, such that the inverse of the diagonal matrix D is stored in the corresponding elements of array A. The off-diagonal elements of the unit upper triangular matrix L^T are stored in the corresponding off-diagonal elements of array A.
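
The packed LDL^T layout described above can be illustrated on a small dense matrix. The following Python sketch is a textbook dense factorization without pivot processing, not the skyline algorithm used by the subroutine; the function name and the dense n by n table it returns are assumptions made for illustration.

```python
def ldlt_packed(a):
    """Factor dense symmetric a (list of rows) as L D L^T and return a table w
    in which w[j][j] = 1/d_j and w[i][j] = (L^T)[i][j] for i < j.
    Elements below the diagonal of w are left at zero."""
    n = len(a)
    l = [[0.0] * n for _ in range(n)]
    d = [0.0] * n
    for j in range(n):
        d[j] = a[j][j] - sum(l[j][k] ** 2 * d[k] for k in range(j))
        if d[j] == 0.0:
            raise ValueError("zero pivot; no pivot processing in this sketch")
        l[j][j] = 1.0
        for i in range(j + 1, n):
            s = a[i][j] - sum(l[i][k] * l[j][k] * d[k] for k in range(j))
            l[i][j] = s / d[j]
    w = [[0.0] * n for _ in range(n)]
    for j in range(n):
        w[j][j] = 1.0 / d[j]          # inverse of the diagonal element of D
        for i in range(j):
            w[i][j] = l[j][i]         # off-diagonal element of L^T
    return w


# Leading 4 by 4 block of matrix A in Example 1.
a4 = [[1.0, 1.0, 1.0, 1.0],
      [1.0, 2.0, 2.0, 2.0],
      [1.0, 2.0, 3.0, 3.0],
      [1.0, 2.0, 3.0, 4.0]]
for row in ldlt_packed(a4):
    print(row)
```

For this block every stored element is 1.0, which agrees with the corresponding rows and columns of output matrix A in Example 1.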

For Cholesky decomposition, matrix A is factored into the following form using specified pivot processing:

A = R^TR

where R is an upper triangular matrix.

The transformed matrix A, factored into its R^TR form, is stored in packed format in array A, such that the inverse of the diagonal elements of the upper triangular matrix R is stored in the corresponding elements of array A. The off-diagonal elements of matrix R are stored in the corresponding off-diagonal elements of array A.
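
The packed R^TR layout can be illustrated in the same way. The following Python sketch is a textbook dense Cholesky factorization, not the skyline algorithm used by the subroutine; the function name and the dense table it returns are assumptions made for illustration.

```python
import math


def cholesky_packed(a):
    """Factor dense symmetric positive definite a as R^T R and return a table w
    in which w[j][j] = 1/r_jj and w[i][j] = r_ij for i < j."""
    n = len(a)
    r = [[0.0] * n for _ in range(n)]
    for i in range(n):
        s = a[i][i] - sum(r[k][i] ** 2 for k in range(i))
        if s <= 0.0:
            raise ValueError("matrix is not positive definite")
        r[i][i] = math.sqrt(s)
        for j in range(i + 1, n):
            t = a[i][j] - sum(r[k][i] * r[k][j] for k in range(i))
            r[i][j] = t / r[i][i]
    w = [row[:] for row in r]
    for j in range(n):
        w[j][j] = 1.0 / r[j][j]       # inverse of the diagonal element of R
    return w


# Leading 3 by 3 block of matrix A in Example 7.
a3 = [[1.0, 1.0, 1.0],
      [1.0, 5.0, 3.0],
      [1.0, 3.0, 11.0]]
for row in cholesky_packed(a3):
    print([round(v, 3) for v in row])
```

For this block the diagonal of R is 1, 2, 3, so the stored diagonal is 1.0, .5, .333, and the stored off-diagonal elements are all 1.0, in agreement with the first three rows and columns of output matrix A in Example 7.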

The partial factorization of matrix A, which you can do when you specify the factor-only option, assumes that the first IPARM(3) rows and columns are already factored in the input matrix. It factors the remaining n-IPARM(3) rows and columns in matrix A. (See "Notes" for an illustration.) It updates only the elements in array A corresponding to the part of matrix A that is factored.

The system Ax = b, having multiple right-hand sides, is solved for x using the transformed matrix A produced by this call or a previous call to this subroutine.

See references [9], [12], [25], [47], [65]. If n is 0, no computation is performed. If mbx is 0, no solve is performed.

Error Conditions

Resource Errors

Error 2015 is unrecoverable, naux = 0, and unable to allocate work area.
Unable to allocate internal work area.

Computational Errors

If a pivot occurs in region i for i = 1,5 and IPARM(10+i) = 1, the pivot value is replaced with RPARM(10+i), an attention message is issued, and processing continues.
Unacceptable pivot values occurred in the factorization of matrix A.
- One or more diagonal elements of D or R contain unacceptable pivots and no valid fixup is applicable. The row number i of the first unacceptable pivot element is identified in the computational error message.
- The return code is set to 2.
- i can be determined at run time by use of the ESSL error-handling facilities. To obtain this information, you must use ERRSET to change the number of allowable errors for error code 2126 in the ESSL error option table; otherwise, the default value causes your program to terminate when this error occurs. For details, see "What Can You Do about ESSL Computational Errors?".

Input-Argument Errors

n < 0
na < 0
IDIAG(n+1) > na+1
IDIAG(i+1) <= IDIAG(i) for i = 1, n
IDIAG(i+1) > IDIAG(i)+i and IPARM(4) = 0 for i = 1, n
IDIAG(i) > IDIAG(i-1)+i and IPARM(4) = 1 for i = 2, n
IPARM(1) <> 0 or 1
IPARM(2) <> 0, 1, 2, 10, 11, 100, 101, 102, 110, or 111
IPARM(3) < 0
IPARM(3) > n
IPARM(3) > 0 and IPARM(2) <> 1, 11, 101, or 111
IPARM(4), IPARM(5) <> 0 or 1
IPARM(2) = 0, 1, 10, or 11 and:

IPARM(10) <> 0 or 1
IPARM(11), IPARM(12) <> -1, 0, or 1
IPARM(13) <> -1 or 1
IPARM(14), IPARM(15) <> -1, 0, or 1
RPARM(10) < 0.0
RPARM(10+i) = 0.0 and IPARM(10+i) = 1 for i = 1,5
IPARM(2) = 100, 101, 110, or 111 and:

IPARM(10) <> 0 or 1
IPARM(11), IPARM(12), IPARM(13) <> -1 or 1
IPARM(14), IPARM(15) <> -1, 0, or 1
RPARM(10) < 0.0
RPARM(10+i) <= 0.0 and IPARM(10+i) = 1 for i = 1,5
IPARM(2) = 0, 2, 10, 100, 102, or 110 and:

ldbx <= 0 and mbx <> 0 and n <> 0
ldbx < 0 and mbx = 0
ldbx < n and mbx <> 0
mbx < 0
Error 2015 is recoverable or naux<>0, and naux is too small--that is, less than the minimum required value. Return code 1 is returned if error 2015 is recoverable.
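
Some of the conditions on n, na, and IDIAG listed above can be checked before the call. The following Python sketch covers only those conditions; the function name and the message strings are hypothetical, and the length check on idiag is an addition made so the sketch cannot index out of range.

```python
def check_skyline_args(n, na, idiag, mode):
    """Return a list of violated conditions (empty if none are found).

    idiag -- Python list whose element 0 holds IDIAG(1)
    mode  -- IPARM(4): 0 = diagonal-out, 1 = profile-in
    """
    errors = []
    if n < 0:
        errors.append("n < 0")
    if na < 0:
        errors.append("na < 0")
    if errors:
        return errors
    if len(idiag) < n + 1:
        # Not an ESSL condition; added for the safety of this sketch.
        return ["idiag has fewer than n+1 elements"]

    def d(i):
        return idiag[i - 1]           # IDIAG(i), numbered from 1 as in the text

    if d(n + 1) > na + 1:
        errors.append("IDIAG(n+1) > na+1")
    for i in range(1, n + 1):
        if d(i + 1) <= d(i):
            errors.append(f"IDIAG({i + 1}) <= IDIAG({i})")
        if mode == 0 and d(i + 1) > d(i) + i:
            errors.append(f"IDIAG({i + 1}) > IDIAG({i})+{i}")
    for i in range(2, n + 1):
        if mode == 1 and d(i) > d(i - 1) + i:
            errors.append(f"IDIAG({i}) > IDIAG({i - 1})+{i}")
    return errors


# IDIAG from Example 1 (diagonal-out) and Example 2 (profile-in).
print(check_skyline_args(9, 33, [1, 2, 4, 7, 11, 15, 20, 23, 30, 34], 0))
print(check_skyline_args(9, 33, [1, 3, 6, 10, 14, 19, 22, 29, 33, 34], 1))
```

Both calls return an empty list. Passing the profile-in IDIAG with mode 0 is reported as a violation, because the two storage modes place the diagonal elements at different positions.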

Example 1

This example shows how to factor a 9 by 9 symmetric sparse matrix A and solve the system Ax = b with three right-hand sides. It uses Gaussian elimination. The default values are used for IPARM and RPARM. Input matrix A, shown here, is stored in diagonal-out skyline storage mode. Matrix A is:

            *                                             *
            | 1.0  1.0  1.0  1.0  0.0  0.0  0.0  0.0  0.0 |
            | 1.0  2.0  2.0  2.0  1.0  1.0  0.0  1.0  0.0 |
            | 1.0  2.0  3.0  3.0  2.0  2.0  0.0  2.0  0.0 |
            | 1.0  2.0  3.0  4.0  3.0  3.0  0.0  3.0  0.0 |
            | 0.0  1.0  2.0  3.0  4.0  4.0  1.0  4.0  0.0 |
            | 0.0  1.0  2.0  3.0  4.0  5.0  2.0  5.0  1.0 |
            | 0.0  0.0  0.0  0.0  1.0  2.0  3.0  3.0  2.0 |
            | 0.0  1.0  2.0  3.0  4.0  5.0  3.0  7.0  3.0 |
            | 0.0  0.0  0.0  0.0  0.0  1.0  2.0  3.0  4.0 |
            *                                             *

Output matrix A, shown here, is in LDL^T factored form with D^-1 on the diagonal, and is stored in diagonal-out skyline storage mode. Matrix A is:

             *                                             *
             | 1.0  1.0  1.0  1.0  0.0  0.0  0.0  0.0  0.0 |
             | 1.0  1.0  1.0  1.0  1.0  1.0  0.0  1.0  0.0 |
             | 1.0  1.0  1.0  1.0  1.0  1.0  0.0  1.0  0.0 |
             | 1.0  1.0  1.0  1.0  1.0  1.0  0.0  1.0  0.0 |
             | 0.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  0.0 |
             | 0.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0 |
             | 0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0  1.0 |
             | 0.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0  1.0 |
             | 0.0  0.0  0.0  0.0  0.0  1.0  1.0  1.0  1.0 |
             *                                             *

Call Statement and Input

            N  A  NA  IDIAG  IPARM  RPARM  AUX NAUX  BX  LDBX MBX
            |  |  |     |      |      |     |   |    |    |    |
CALL DSKFS( 9, A, 33, IDIAG, IPARM, RPARM, AUX, 39 , BX , 12 , 3 )

A        =  (1.0, 2.0, 1.0, 3.0, 2.0, 1.0, 4.0, 3.0, 2.0, 1.0, 4.0,
             3.0, 2.0, 1.0, 5.0, 4.0, 3.0, 2.0, 1.0, 3.0, 2.0, 1.0,
             7.0, 3.0, 5.0, 4.0, 3.0, 2.0, 1.0, 4.0, 3.0, 2.0, 1.0)
IDIAG    =  (1, 2, 4, 7, 11, 15, 20, 23, 30, 34)
IPARM    =  (0, . , . , . , . , . , . , . , . , . , . , . , . , . ,
             . , . , . , . , . , . , . , . , . , . , . )

RPARM    =(not relevant)

        *                     *
        |  4.00   8.00  12.00 |
        | 10.00  20.00  30.00 |
        | 15.00  30.00  45.00 |
        | 19.00  38.00  57.00 |
        | 19.00  38.00  57.00 |
BX   =  | 23.00  46.00  69.00 |
        | 11.00  22.00  33.00 |
        | 28.00  56.00  84.00 |
        | 10.00  20.00  30.00 |
        |   .      .      .   |
        |   .      .      .   |
        |   .      .      .   |
        *                     *

Output

A        =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
             1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
             1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
IDIAG    =(same as input)
IPARM    =  (0, . , . , . , . , . , . , . , . , . , . , . , . , . ,
             . , 8, . , . , . , . , 0, 0, 0, 0, 9)
RPARM    =  ( . , . , . , . , . , . , . , . , . , . , . , . , . , . ,
             . , 7.0, . , . , . , . , . , . , . , . , . )

        *                  *
        | 1.00  2.00  3.00 |
        | 1.00  2.00  3.00 |
        | 1.00  2.00  3.00 |
        | 1.00  2.00  3.00 |
        | 1.00  2.00  3.00 |
BX   =  | 1.00  2.00  3.00 |
        | 1.00  2.00  3.00 |
        | 1.00  2.00  3.00 |
        | 1.00  2.00  3.00 |
        |  .     .     .   |
        |  .     .     .   |
        |  .     .     .   |
        *                  *
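
The input arrays A and IDIAG in Example 1 can be reproduced from the dense form of matrix A. The following Python sketch packs the upper triangle column by column, starting at the diagonal and moving up to the topmost nonzero element of each column. The function name is hypothetical, and the sketch assumes the column profile is defined by the first nonzero element in each column.

```python
def to_diagonal_out(a):
    """Pack the upper triangle of dense symmetric a into (A, IDIAG) using
    diagonal-out ordering. IDIAG positions are numbered from 1."""
    n = len(a)
    packed, idiag = [], []
    for j in range(n):
        # Topmost row of column j that holds a nonzero; the diagonal element
        # is always stored, even when nothing lies above it.
        top = next((i for i in range(j) if a[i][j] != 0.0), j)
        idiag.append(len(packed) + 1)
        for i in range(j, top - 1, -1):
            packed.append(a[i][j])
    idiag.append(len(packed) + 1)     # one position past the last element
    return packed, idiag


MATRIX_A = [[float(v) for v in row] for row in [
    [1, 1, 1, 1, 0, 0, 0, 0, 0],
    [1, 2, 2, 2, 1, 1, 0, 1, 0],
    [1, 2, 3, 3, 2, 2, 0, 2, 0],
    [1, 2, 3, 4, 3, 3, 0, 3, 0],
    [0, 1, 2, 3, 4, 4, 1, 4, 0],
    [0, 1, 2, 3, 4, 5, 2, 5, 1],
    [0, 0, 0, 0, 1, 2, 3, 3, 2],
    [0, 1, 2, 3, 4, 5, 3, 7, 3],
    [0, 0, 0, 0, 0, 1, 2, 3, 4],
]]
packed, idiag = to_diagonal_out(MATRIX_A)
print(len(packed))
print(idiag)
```

The packed array has 33 elements, and the computed positions match the IDIAG array given in the call statement input.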

Example 2

This example shows how to factor the 9 by 9 symmetric sparse matrix A from Example 1, solve the system Ax = b with three right-hand sides, and compute the determinant of A. It uses Gaussian elimination. The default values for pivot processing are used for IPARM. Input matrix A is stored in profile-in skyline storage mode. Output matrix A is in LDL^T factored form with D^-1 on the diagonal, and is stored in diagonal-out skyline storage mode. It is the same as output matrix A in Example 1.

Call Statement and Input

            N  A  NA  IDIAG  IPARM  RPARM  AUX NAUX  BX  LDBX MBX
            |  |  |     |      |      |     |   |    |    |    |
CALL DSKFS( 9, A, 33, IDIAG, IPARM, RPARM, AUX, 39 , BX , 12 , 3 )

A        =  (1.0, 1.0, 2.0, 1.0, 2.0, 3.0, 1.0, 2.0, 3.0, 4.0, 1.0,
             2.0, 3.0, 4.0, 1.0, 2.0, 3.0, 4.0, 5.0, 1.0, 2.0, 3.0,
             1.0, 2.0, 3.0, 4.0, 5.0, 3.0, 7.0, 1.0, 2.0, 3.0, 4.0)
IDIAG    =  (1, 3, 6, 10, 14, 19, 22, 29, 33, 34)
IPARM    =  (1, 10, 0, 1, 0, . , . , . , . , 0, . , . , . , . , . ,
             . , . , . , . , . , . , . , . , . , . )

RPARM    =(not relevant)

        *                     *
        |  4.00   8.00  12.00 |
        | 10.00  20.00  30.00 |
        | 15.00  30.00  45.00 |
        | 19.00  38.00  57.00 |
        | 19.00  38.00  57.00 |
BX   =  | 23.00  46.00  69.00 |
        | 11.00  22.00  33.00 |
        | 28.00  56.00  84.00 |
        | 10.00  20.00  30.00 |
        |   .      .      .   |
        |   .      .      .   |
        |   .      .      .   |
        *                     *

Output

A        =(same as output A in Example 1)
IDIAG    =(same as input IDIAG in Example 1)
IPARM    =  (1, 10, 0, 1, 0, . , . , . , . , 0, . , . , . , . , . , 8,
             . , . , . , . , 0, 0, 0, 0, 9)
RPARM    =  ( . , . , . , . , . , . , . , . , . , . , . , . , . , . ,
             . , 7.0, 1.0, 0.0, . , . , . , . , . , . , . )
BX       =(same as output BX in Example 1)
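
The profile-in input arrays in Example 2 can be reproduced from the dense form of matrix A in the same spirit. The following Python sketch packs each column from its topmost nonzero element down to the diagonal, so that each IDIAG entry points at the last element stored for its column. The function name is hypothetical, and the sketch assumes the column profile is defined by the first nonzero element in each column.

```python
def to_profile_in(a):
    """Pack the upper triangle of dense symmetric a into (A, IDIAG) using
    profile-in ordering. IDIAG positions are numbered from 1."""
    n = len(a)
    packed, idiag = [], []
    for j in range(n):
        # Topmost row of column j that holds a nonzero, or the diagonal.
        top = next((i for i in range(j) if a[i][j] != 0.0), j)
        for i in range(top, j + 1):
            packed.append(a[i][j])
        idiag.append(len(packed))     # the diagonal is the last element stored
    idiag.append(len(packed) + 1)     # one position past the last element
    return packed, idiag


MATRIX_A = [[float(v) for v in row] for row in [
    [1, 1, 1, 1, 0, 0, 0, 0, 0],
    [1, 2, 2, 2, 1, 1, 0, 1, 0],
    [1, 2, 3, 3, 2, 2, 0, 2, 0],
    [1, 2, 3, 4, 3, 3, 0, 3, 0],
    [0, 1, 2, 3, 4, 4, 1, 4, 0],
    [0, 1, 2, 3, 4, 5, 2, 5, 1],
    [0, 0, 0, 0, 1, 2, 3, 3, 2],
    [0, 1, 2, 3, 4, 5, 3, 7, 3],
    [0, 0, 0, 0, 0, 1, 2, 3, 4],
]]
packed, idiag = to_profile_in(MATRIX_A)
print(idiag)
```

The computed positions match the IDIAG array in the Example 2 input, and the packed values match the input array A shown there.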

Example 3

This example shows how to factor a 9 by 9 negative-definite symmetric sparse matrix A, solve the system Ax = b with three right-hand sides, and compute the determinant of A. It uses Gaussian elimination. (Default values for pivot processing are not used for IPARM because A is negative-definite.) Input matrix A, shown here, is stored in diagonal-out skyline storage mode. Matrix A is:

         *                                                      *
         | -1.0  -1.0  -1.0  -1.0   0.0   0.0   0.0   0.0   0.0 |
         | -1.0  -2.0  -2.0  -2.0  -1.0  -1.0   0.0  -1.0   0.0 |
         | -1.0  -2.0  -3.0  -3.0  -2.0  -2.0   0.0  -2.0   0.0 |
         | -1.0  -2.0  -3.0  -4.0  -3.0  -3.0   0.0  -3.0   0.0 |
         |  0.0  -1.0  -2.0  -3.0  -4.0  -4.0  -1.0  -4.0   0.0 |
         |  0.0  -1.0  -2.0  -3.0  -4.0  -5.0  -2.0  -5.0  -1.0 |
         |  0.0   0.0   0.0   0.0  -1.0  -2.0  -3.0  -3.0  -2.0 |
         |  0.0  -1.0  -2.0  -3.0  -4.0  -5.0  -3.0  -7.0  -3.0 |
         |  0.0   0.0   0.0   0.0   0.0  -1.0  -2.0  -3.0  -4.0 |
         *                                                      *

Output matrix A, shown here, is in LDL^T factored form with D^-1 on the diagonal, and is stored in diagonal-out skyline storage mode. Matrix A is:

         *                                                      *
         | -1.0   1.0   1.0   1.0   0.0   0.0   0.0   0.0   0.0 |
         |  1.0  -1.0   1.0   1.0   1.0   1.0   0.0   1.0   0.0 |
         |  1.0   1.0  -1.0   1.0   1.0   1.0   0.0   1.0   0.0 |
         |  1.0   1.0   1.0  -1.0   1.0   1.0   0.0   1.0   0.0 |
         |  0.0   1.0   1.0   1.0  -1.0   1.0   1.0   1.0   0.0 |
         |  0.0   1.0   1.0   1.0   1.0  -1.0   1.0   1.0   1.0 |
         |  0.0   0.0   0.0   0.0   1.0   1.0  -1.0   1.0   1.0 |
         |  0.0   1.0   1.0   1.0   1.0   1.0   1.0  -1.0   1.0 |
         |  0.0   0.0   0.0   0.0   0.0   1.0   1.0   1.0  -1.0 |
         *                                                      *

Call Statement and Input

           N  A  NA  IDIAG  IPARM  RPARM  AUX  NAUX  BX  LDBX MBX
           |  |  |     |      |      |     |    |    |    |    |
CALL DSKFS(9, A, 33, IDIAG, IPARM, RPARM, AUX,  39 , BX , 12 , 3 )

A        =  (-1.0, -2.0, -1.0, -3.0, -2.0, -1.0, -4.0, -3.0, -2.0,
             -1.0, -4.0, -3.0, -2.0, -1.0, -5.0, -4.0, -3.0, -2.0,
             -1.0, -3.0, -2.0, -1.0, -7.0, -3.0, -5.0, -4.0, -3.0,
             -2.0, -1.0, -4.0, -3.0, -2.0, -1.0)
IDIAG    =  (1, 2, 4, 7, 11, 15, 20, 23, 30, 34)
IPARM    =  (1, 10, 0, 0, 0, . , . , . , . , 1, 0, -1, -1, -1, -1, . ,
             . , . , . , . , . , . , . , . , . )

 
RPARM    =  ( . , . , . , . , . , . , . , . , . , 10^-15, . , . ,
              . , . , . , . , . , . , . , . , . , . , . , . , . )
BX       =(same as input BX in Example 1)

Output

A        =  (-1.0, -1.0, 1.0, -1.0, 1.0, 1.0, -1.0, 1.0, 1.0, 1.0,
             -1.0, 1.0, 1.0, 1.0, -1.0, 1.0, 1.0, 1.0, 1.0, -1.0, 1.0,
             1.0, -1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, -1.0, 1.0, 1.0,
             1.0)
IDIAG    =(same as input)
IPARM    =  (1, 10, 0, 0, 0, . , . , . , . , 1, 0, -1, -1, -1, -1, 8,
             . , . , . , . , 9, 0, 0, 0, 0)
 
RPARM    =  ( . , . , . , . , . , . , . , . , . ,10^-15, . , . ,
              . , . , . , 7.0, -1.0, 0.0, . , . , . , . , . , . , . )

        *                     *
        | -1.00  -2.00  -3.00 |
        | -1.00  -2.00  -3.00 |
        | -1.00  -2.00  -3.00 |
        | -1.00  -2.00  -3.00 |
        | -1.00  -2.00  -3.00 |
BX   =  | -1.00  -2.00  -3.00 |
        | -1.00  -2.00  -3.00 |
        | -1.00  -2.00  -3.00 |
        | -1.00  -2.00  -3.00 |
        |   .      .      .   |
        |   .      .      .   |
        |   .      .      .   |
        *                     *

Example 4

This example shows how to factor the first six rows and columns, referred to as matrix A1, of the 9 by 9 symmetric sparse matrix A from Example 1 and compute the determinant of A1. It uses Gaussian elimination. Input matrix A1, shown here, is stored in diagonal-out skyline storage mode. Input matrix A1 is:

                     *                              *
                     | 1.0  1.0  1.0  1.0  0.0  0.0 |
                     | 1.0  2.0  2.0  2.0  1.0  1.0 |
                     | 1.0  2.0  3.0  3.0  2.0  2.0 |
                     | 1.0  2.0  3.0  4.0  3.0  3.0 |
                     | 0.0  1.0  2.0  3.0  4.0  4.0 |
                     | 0.0  1.0  2.0  3.0  4.0  5.0 |
                     *                              *

Output matrix A1, shown here, is in LDL^T factored form with D^-1 on the diagonal, and is stored in diagonal-out skyline storage mode. Output matrix A1 is:

                     *                              *
                     | 1.0  1.0  1.0  1.0  0.0  0.0 |
                     | 1.0  1.0  1.0  1.0  1.0  1.0 |
                     | 1.0  1.0  1.0  1.0  1.0  1.0 |
                     | 1.0  1.0  1.0  1.0  1.0  1.0 |
                     | 0.0  1.0  1.0  1.0  1.0  1.0 |
                     | 0.0  1.0  1.0  1.0  1.0  1.0 |
                     *                              *

Call Statement and Input

            N   A   NA   IDIAG   IPARM   RPARM   AUX  NAUX  BX   LDBX   MBX
            |   |    |     |       |       |      |    |    |     |      |
CALL DSKFS (6 , A , 33 , IDIAG , IPARM , RPARM , AUX , 27 , BX , LDBX , MBX )

A        =(same as input A in Example 1)
IDIAG    =  (1, 2, 4, 7, 11, 15, 20)
IPARM    =  (1, 11, 0, 0, 0, . , . , . , . , 0, . , . , . , . , . ,
             . , . , . , . , . , . , . , . , . , . )
RPARM    =(not relevant)
BX       =(not relevant)
LDBX     =(not relevant)
MBX      =(not relevant)

Output

A        =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
             1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 3.0, 2.0, 1.0,
             7.0, 3.0, 5.0, 4.0, 3.0, 2.0, 1.0, 4.0, 3.0, 2.0, 1.0)
IDIAG    =(same as input)
IPARM    =  (1, 11, 0, 0, 0, . , . , . , . , 0, . , . , . , . , . , 6,
             . , . , . , . , 0, 0, 0, 0, 6)
RPARM    =  ( . , . , . , . , . , . , . , . , . , . , . , . , . , . ,
             . , 5.0, 1.0, 0.0, . , . , . , . , . , . , . )
BX       =(same as input)
LDBX     =(same as input)
MBX      =(same as input)

Example 5

This example shows how to do a partial factorization of the 9 by 9 symmetric sparse matrix A from Example 1, where the first six rows and columns were factored in Example 4. It factors the remaining three rows and columns and computes the determinant of that part of the matrix. It uses Gaussian elimination. The input matrix, referred to as A2, shown here, is made up of the output factored matrix A1 plus the three remaining unfactored rows and columns of matrix A. Matrix A2 is:

             *                                             *
             | 1.0  1.0  1.0  1.0  0.0  0.0  0.0  0.0  0.0 |
             | 1.0  1.0  1.0  1.0  1.0  1.0  0.0  1.0  0.0 |
             | 1.0  1.0  1.0  1.0  1.0  1.0  0.0  2.0  0.0 |
             | 1.0  1.0  1.0  1.0  1.0  1.0  0.0  3.0  0.0 |
             | 0.0  1.0  1.0  1.0  1.0  1.0  1.0  4.0  0.0 |
             | 0.0  1.0  1.0  1.0  1.0  1.0  2.0  5.0  1.0 |
             | 0.0  0.0  0.0  0.0  1.0  2.0  3.0  3.0  2.0 |
             | 0.0  1.0  2.0  3.0  4.0  5.0  3.0  7.0  3.0 |
             | 0.0  0.0  0.0  0.0  0.0  1.0  2.0  3.0  4.0 |
             *                                             *

Both parts of input matrix A2 are stored in diagonal-out skyline storage mode.

Output matrix A2 is the same as output matrix A in Example 1 and is stored in diagonal-out skyline storage mode.

Call Statement and Input

            N   A   NA   IDIAG   IPARM   RPARM   AUX   NAUX  BX   LDBX   MBX
            |   |   |      |       |       |      |     |    |     |      |
CALL DSKFS (9 , A , 33 , IDIAG , IPARM , RPARM , AUX ,  27 , BX , LDBX , MBX )

A        =(same as output A in Example 4)
IDIAG    =(same as input IDIAG in Example 1)
IPARM    =  (1, 11, 6, 0, 0, . , . , . , . , 0, . , . , . , . , . ,
             . , . , . , . , . , . , . , . , . , . )
RPARM    =(not relevant)
BX       =(not relevant)
LDBX     =(not relevant)
MBX      =(not relevant)

Output

A        =(same as output A in Example 1)
 
IDIAG    =(same as output IDIAG in Example 1)
IPARM    =  (1, 11, 6, 0, 0, . , . , . , . , 0, . , . , . , . , . , 8,
             . , . , . , . , 0, 0, 0, 0, 3)
RPARM    =  ( . , . , . , . , . , . , . , . , . , . , . , . , . , . ,
             . , 7.0, 1.0, 0.0, . , . , . , . , . , . , . )
BX       =(same as input)
LDBX     =(same as input)
MBX      =(same as input)
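
The staged factorization in Examples 4 and 5 can be imitated on a dense table. The following Python sketch factors one column at a time, using only the columns to its left, so a factorization can stop after any column and resume later. It is a dense illustration under stated assumptions, not the skyline algorithm used by the subroutine; the function name and the table layout (1/d on the diagonal, L^T above it, unfactored columns holding elements of A) are hypothetical.

```python
def factor_columns(w, first, last):
    """Factor columns first..last (numbered from 0, inclusive) of w in place.
    On entry, columns 0..first-1 hold the factor and the remaining columns
    hold the upper triangle of A."""
    for j in range(first, last + 1):
        # Solve L v = a(0:j-1, j); element l_ik of L is stored at w[k][i].
        v = [0.0] * j
        for i in range(j):
            v[i] = w[i][j] - sum(w[k][i] * v[k] for k in range(i))
        d = w[j][j]
        for i in range(j):
            l_ji = v[i] * w[i][i]     # v_i / d_i, since w[i][i] holds 1/d_i
            d -= l_ji * v[i]
            w[i][j] = l_ji
        w[j][j] = 1.0 / d


MATRIX_A = [
    [1, 1, 1, 1, 0, 0, 0, 0, 0],
    [1, 2, 2, 2, 1, 1, 0, 1, 0],
    [1, 2, 3, 3, 2, 2, 0, 2, 0],
    [1, 2, 3, 4, 3, 3, 0, 3, 0],
    [0, 1, 2, 3, 4, 4, 1, 4, 0],
    [0, 1, 2, 3, 4, 5, 2, 5, 1],
    [0, 0, 0, 0, 1, 2, 3, 3, 2],
    [0, 1, 2, 3, 4, 5, 3, 7, 3],
    [0, 0, 0, 0, 0, 1, 2, 3, 4],
]
n = len(MATRIX_A)
upper = [[float(MATRIX_A[i][j]) if i <= j else 0.0 for j in range(n)]
         for i in range(n)]

staged = [row[:] for row in upper]
factor_columns(staged, 0, 5)          # as in Example 4: rows and columns 1-6
after_stage_1 = [row[:] for row in staged]
factor_columns(staged, 6, 8)          # as in Example 5: rows and columns 7-9

one_pass = [row[:] for row in upper]
factor_columns(one_pass, 0, 8)        # full factorization in a single pass
print(staged == one_pass)
```

After the first stage, the upper triangle of the table corresponds to matrix A2 shown in Example 5. After the second stage, it is identical to the result of the single-pass factorization.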

Example 6

This example shows how to solve the system Ax = b with one right-hand side for a symmetric sparse matrix A. Input matrix A, used here, is the same as factored output matrix A from Example 1, stored in profile-in skyline storage mode. It specifies Gaussian elimination, as used in Example 1. Because this is a solve only, matrix A is unchanged on output and remains stored in profile-in skyline storage mode.

Call Statement and Input

            N  A  NA  IDIAG  IPARM  RPARM  AUX NAUX  BX  LDBX MBX
            |  |  |     |      |      |     |   |    |    |    |
CALL DSKFS (9, A, 33, IDIAG, IPARM, RPARM, AUX, 31 , BX , 9 ,  1 )

A        =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
             1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
             1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
IDIAG    =  (1, 3, 6, 10, 14, 19, 22, 29, 33, 34)
IPARM    =  (1, 2, 0, 1, 1, . , . , . , . , . , . , . , . , . , . ,
             . , . , . , . , . , . , . , . , . , . )
RPARM    =(not relevant)
BX       =  (10.0, 38.0, 64.0, 87.0, 103.0, 133.0, 80.0, 174.0, 80.0)

Output

A        =(same as input)
IDIAG    =(same as input)
IPARM    =(same as input)
RPARM    =(same as input)
BX       =  (1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0)
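
The solve step in Example 6 can be traced on a dense copy of the factored matrix. The following Python sketch applies a forward substitution with L, a scaling by the stored inverse of D, and a back substitution with L^T. It is a dense illustration under stated assumptions, not the skyline algorithm used by the subroutine; the function name and the dense table are hypothetical, and only the diagonal and upper triangle of the table are read.

```python
def solve_ldlt(w, b):
    """Solve L D L^T x = b, where w[i][i] holds 1/d_i and w[i][j] (i < j)
    holds element (i, j) of L^T."""
    n = len(b)
    x = list(b)
    # Forward substitution, L y = b; element l_ik of L is stored at w[k][i].
    for i in range(n):
        x[i] -= sum(w[k][i] * x[k] for k in range(i))
    # Scaling, z = D^-1 y; the inverse of D is already stored.
    for i in range(n):
        x[i] *= w[i][i]
    # Back substitution, L^T x = z.
    for i in range(n - 1, -1, -1):
        x[i] -= sum(w[i][k] * x[k] for k in range(i + 1, n))
    return x


# Factored output matrix A from Example 1, in dense form.
FACTOR = [[float(v) for v in row] for row in [
    [1, 1, 1, 1, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 0, 1, 0],
    [1, 1, 1, 1, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 1, 0],
    [0, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 1, 1, 1, 1, 1],
    [0, 1, 1, 1, 1, 1, 1, 1, 1],
    [0, 0, 0, 0, 0, 1, 1, 1, 1],
]]
bx = [10.0, 38.0, 64.0, 87.0, 103.0, 133.0, 80.0, 174.0, 80.0]
print(solve_ldlt(FACTOR, bx))
```

With the right-hand side from this example, the sketch returns the values 1.0 through 9.0, in agreement with the output BX shown above.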

Example 7

This example shows how to factor a 9 by 9 symmetric sparse matrix A, solve the system Ax = b with four right-hand sides, and compute the determinant of A. It uses Cholesky decomposition. Input matrix A, shown here, is stored in profile-in skyline storage mode. Matrix A is:

          *                                                    *
          | 1.0  1.0   1.0   0.0   1.0   0.0   0.0   0.0   1.0 |
          | 1.0  5.0   3.0   0.0   3.0   0.0   0.0   0.0   3.0 |
          | 1.0  3.0  11.0   3.0   5.0   3.0   3.0   0.0   5.0 |
          | 0.0  0.0   3.0  17.0   5.0   5.0   5.0   0.0   5.0 |
          | 1.0  3.0   5.0   5.0  29.0   7.0   7.0   0.0   9.0 |
          | 0.0  0.0   3.0   5.0   7.0  39.0   9.0   6.0   9.0 |
          | 0.0  0.0   3.0   5.0   7.0   9.0  53.0   8.0  11.0 |
          | 0.0  0.0   0.0   0.0   0.0   6.0   8.0  66.0  10.0 |
          | 1.0  3.0   5.0   5.0   9.0   9.0  11.0  10.0  89.0 |
          *                                                    *

Output matrix A, shown here, is in R^TR factored form with the inverse of the diagonal of R on the diagonal, and is stored in profile-in skyline storage mode. Matrix A is:

           *                                                  *
           | 1.0  1.0   1.0  0.0  1.0   0.0   0.0   0.0   1.0 |
           | 1.0   .5   1.0  0.0  1.0   0.0   0.0   0.0   1.0 |
           | 1.0  1.0  .333  1.0  1.0   1.0   1.0   0.0   1.0 |
           | 0.0  0.0   1.0  .25  1.0   1.0   1.0   0.0   1.0 |
           | 1.0  1.0   1.0  1.0   .2   1.0   1.0   0.0   1.0 |
           | 0.0  0.0   1.0  1.0  1.0  .167   1.0   1.0   1.0 |
           | 0.0  0.0   1.0  1.0  1.0   1.0  .143   1.0   1.0 |
           | 0.0  0.0   0.0  0.0  0.0   1.0   1.0  .125   1.0 |
           | 1.0  1.0   1.0  1.0  1.0   1.0   1.0   1.0  .111 |
           *                                                  *

Call Statement and Input

            N  A  NA  IDIAG  IPARM  RPARM  AUX NAUX  BX  LDBX MBX
            |  |  |     |      |      |     |   |    |    |    |
CALL DSKFS( 9, A, 34, IDIAG, IPARM, RPARM, AUX, 43 , BX , 10 , 4 )

A        =  (1.0, 1.0, 5.0, 1.0, 3.0, 11.0, 3.0, 17.0, 1.0, 3.0, 5.0,
             5.0, 29.0, 3.0, 5.0, 7.0, 39.0, 3.0, 5.0, 7.0, 9.0, 53.0,
             6.0, 8.0, 66.0, 1.0, 3.0, 5.0, 5.0, 9.0, 9.0, 11.0, 10.0,
             89.0)
IDIAG    =  (1, 3, 6, 8, 13, 17, 22, 25, 34, 35)
IPARM    =  (1, 110, 0, 1, 1, . , . , . , . , 0, . , . , . , . , . ,
             . , . , . , . , . , . , . , . , . , . )

RPARM    =(not relevant)

        *                                *
        |   5.00   10.00   15.00   20.00 |
        |  15.00   30.00   45.00   60.00 |
        |  34.00   68.00  102.00  136.00 |
        |  40.00   80.00  120.00  160.00 |
BX   =  |  66.00  132.00  198.00  264.00 |
        |  78.00  156.00  234.00  312.00 |
        |  96.00  192.00  288.00  384.00 |
        |  90.00  180.00  270.00  360.00 |
        | 142.00  284.00  426.00  568.00 |
        |    .       .       .       .   |
        *                                *

Output

A        =  (1.0, 1.0, .5, 1.0, 1.0, .333, 1.0, .25, 1.0, 1.0, 1.0,
             1.0, .2, 1.0, 1.0, 1.0, .167, 1.0, 1.0, 1.0, 1.0, .143,
             1.0, 1.0, .125, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0,
             .111)
IDIAG    =(same as input)
IPARM    =  (1, 110, 0, 1, 1, . , . , . , . , 0, . , . , . , . , . ,
             9, . , . , . , . , 0, 0, 0, 0, 9)
RPARM    =  ( . , . , . , . , . , . , . , . , . , . , . , . , . , . ,
             . , 9.89, 1.32, 11.0, . , . , . , . , . , . , . )

        *                        *
        | 1.00  2.00  3.00  4.00 |
        | 1.00  2.00  3.00  4.00 |
        | 1.00  2.00  3.00  4.00 |
        | 1.00  2.00  3.00  4.00 |
BX   =  | 1.00  2.00  3.00  4.00 |
        | 1.00  2.00  3.00  4.00 |
        | 1.00  2.00  3.00  4.00 |
        | 1.00  2.00  3.00  4.00 |
        | 1.00  2.00  3.00  4.00 |
        |  .     .     .     .   |
        *                        *

DSRIS--Iterative Linear System Solver for a General or Symmetric Sparse Matrix Stored by Rows

This subroutine solves a general or symmetric sparse linear system of equations, using an iterative algorithm, with or without preconditioning. The methods include conjugate gradient (CG), conjugate gradient squared (CGS), generalized minimum residual (GMRES), a more smoothly converging variant of the CGS method (Bi-CGSTAB), and the transpose-free quasi-minimal residual method (TFQMR). The preconditioners include an incomplete LU factorization, an incomplete Cholesky factorization (for positive definite symmetric matrices), diagonal scaling, and symmetric successive over-relaxation (SSOR) with two possible choices for the diagonal matrix: one uses the absolute values sum of the input matrix, and the other uses the diagonal obtained from the LU factorization. The sparse matrix is stored using storage-by-rows for general matrices and upper- or lower-storage-by-rows for symmetric matrices. Matrix A and vectors x and b are used:

Ax = b

where A, x, and b contain long-precision real numbers.

Syntax

Fortran	CALL DSRIS (`stor`, `init`, `n`, `ar`, `ja`, `ia`, `b`, `x`, `iparm`, `rparm`, `aux1`, `naux1`, `aux2`, `naux2`)
C and C++	dsris (`stor`, `init`, `n`, `ar`, `ja`, `ia`, `b`, `x`, `iparm`, `rparm`, `aux1`, `naux1`, `aux2`, `naux2`);
PL/I	CALL DSRIS (`stor`, `init`, `n`, `ar`, `ja`, `ia`, `b`, `x`, `iparm`, `rparm`, `aux1`, `naux1`, `aux2`, `naux2`);

On Entry

stor

indicates the form of sparse matrix A and the storage mode used, where:

If stor = 'G', A is a general sparse matrix, stored using storage-by-rows.

If stor = 'U', A is a symmetric sparse matrix, stored using upper-storage-by-rows.

If stor = 'L', A is a symmetric sparse matrix, stored using lower-storage-by-rows.

Specified as: a single character. It must be 'G', 'U', or 'L'.

init

indicates the type of computation to be performed, where:

If init = 'I', the preconditioning matrix is computed, the internal representation of the sparse matrix is generated, and the iteration procedure is performed. The coefficient matrix and preconditioner in internal format are saved in aux1.

If init = 'S', the iteration procedure is performed using the coefficient matrix and the preconditioner in internal format, stored in aux1, created in a preceding call to this subroutine with init = 'I'. You use this option to solve the same matrix for different right-hand sides, b, optimizing your performance. As long as you do not change the coefficient matrix and preconditioner in aux1, any number of calls can be made with init = 'S'.

Specified as: a single character. It must be 'I' or 'S'.

n

is the order of the linear system Ax = b and the number of rows and columns in sparse matrix A. Specified as: a fullword integer; n >= 0.

ar

is the sparse matrix A of order n, stored by rows in an array, referred to as AR. The stor argument indicates the storage variation used for storing matrix A. Specified as: a one-dimensional array, containing long-precision real numbers. The number of elements in this array can be determined by subtracting 1 from the value in IA(n+1).

ja

is the array, referred to as JA, containing the column numbers of each nonzero element in sparse matrix A. Specified as: a one-dimensional array, containing fullword integers; 1 <= (JA elements) <= n. The number of elements in this array can be determined by subtracting 1 from the value in IA(n+1).

ia

is the row pointer array, referred to as IA, containing the starting positions of each row of matrix A in array AR and one position past the end of array AR. Specified as: a one-dimensional array of (at least) length n+1, containing fullword integers; IA(i+1) >= IA(i) for i = 1, n.

b

is the vector b of length n, containing the right-hand side of the matrix problem. Specified as: a one-dimensional array of (at least) length n, containing long-precision real numbers.

x

is the vector x of length n, containing your initial guess of the solution of the linear system. Specified as: a one-dimensional array of (at least) length n, containing long-precision real numbers. The elements can have any value, and if no guess is available, the value can be zero.

iparm

is an array of parameters, IPARM(i), where:

IPARM(1) controls the number of iterations.
If IPARM(1) > 0, IPARM(1) is the maximum number of iterations allowed.
If IPARM(1) = 0, the following default values are used:

IPARM(1) = 300
IPARM(2) = 4
IPARM(4) = 4
IPARM(5) = 1
RPARM(1) = 10^-6
RPARM(2) = 1
IPARM(2) is the flag used to select the iterative procedure used in this subroutine.
If IPARM(2) = 1, the conjugate gradient (CG) method is used. Note that this algorithm should only be used with positive definite symmetric matrices.
If IPARM(2) = 2, the conjugate gradient squared (CGS) method is used.
If IPARM(2) = 3, the generalized minimum residual (GMRES) method, restarted after k steps, is used.
If IPARM(2) = 4, the more smoothly converging variant of the CGS method (Bi-CGSTAB) is used.
If IPARM(2) = 5, the transpose-free quasi-minimal residual method (TFQMR) is used.
IPARM(3) has the following meaning, where:
If IPARM(2) <> 3, then IPARM(3) is not used.
If IPARM(2) = 3, then IPARM(3) = k, where k is the number of steps after which the generalized minimum residual method is restarted. A value for k in the range of 5 to 10 is suitable for most problems.
IPARM(4) is the flag that determines the type of preconditioning.
If IPARM(4) = 1, the system is not preconditioned.
If IPARM(4) = 2, the system is preconditioned by a diagonal matrix.
If IPARM(4) = 3, the system is preconditioned by SSOR splitting with the diagonal given by the absolute values sum of the input matrix.
If IPARM(4) = 4, the system is preconditioned by an incomplete LU factorization.
If IPARM(4) = 5, the system is preconditioned by SSOR splitting with the diagonal given by the incomplete LU factorization.

Note: The multithreaded version of DSRIS only runs on multiple threads when IPARM(4) = 1 or 2.

IPARM(5) is the flag used to select the stopping criterion used in the computation, where the following items are used in the definitions of the stopping criteria below:

epsilon is the desired relative accuracy and is stored in RPARM(1).
x_j is the solution found at the j-th iteration.
r_j and r₀ are the preconditioned residuals obtained at iterations j and 0, respectively. (The residual at iteration j is given by b-Ax_j.)

If IPARM(5) = 1, the iterative method is stopped when:

||r_j||₂ / ||x_j||₂ < epsilon

Note: IPARM(5) = 1 is the default value assumed by ESSL if you do not specify one of the values described here; therefore, if you do not update your program to set an IPARM(5) value, you, by default, use the above stopping criterion.

If IPARM(5) = 2, the iterative method is stopped when:

||r_j||₂ / ||r₀||₂ < epsilon

If IPARM(5) = 3, the iterative method is stopped when:

||x_j - x_(j-1)||₂ / ||x_j||₂ < epsilon

Note: Stopping criterion 3 performs poorly with the TFQMR method; therefore, if you specify TFQMR (IPARM(2) = 5), you should not specify stopping criterion 3.

IPARM(6), see 'On Return'.

Specified as: an array of (at least) length 6, containing fullword integers, where:

IPARM(1) >= 0

IPARM(2) = 1, 2, 3, 4, or 5

If IPARM(2) = 3, then IPARM(3) > 0

IPARM(4) = 1, 2, 3, 4, or 5

IPARM(5) = 1, 2, or 3 (Other values default to stopping criterion 1.)

rparm

is an array of parameters, RPARM(i), where:

RPARM(1) is the relative accuracy epsilon used in the stopping criterion. See "Notes".

RPARM(2), see 'On Return'.

RPARM(3) has the following meaning, where:

If IPARM(4) <> 3, then RPARM(3) is not used.
If IPARM(4) = 3, then RPARM(3) is the acceleration parameter used in SSOR. (A value in the range 0.5 to 2.0 is suitable for most problems.)

Specified as: a one-dimensional array of (at least) length 3, containing long-precision real numbers, where:

RPARM(1) >= 0

If IPARM(4) = 3, RPARM(3) > 0

aux1

is working storage for this subroutine, where:

If init = 'I', the working storage is computed. It can contain any values.

If init = 'S', the working storage is used in solving the linear system. It contains the coefficient matrix and preconditioner in internal format, computed in an earlier call to this subroutine.

Specified as: an area of storage, containing naux1 long-precision real numbers.

naux1

is the number of doublewords in the working storage specified in aux1.

Specified as: a fullword integer, where:

In these formulas nw has the following value:

If stor = 'G', then nw = IA(n+1)-1+n.

If stor = 'U' or 'L', then nw = 2(IA(n+1)-1).

If IPARM(4) = 1, use naux1 = (3/2)nw+(7/2)n+40.

If IPARM(4) = 2, use naux1 = (3/2)nw+(9/2)n+40.

If IPARM(4) = 3, 4, or 5, then:

If IPARM(2) <> 1, use naux1 = 3nw+10n+60.

If IPARM(2) = 1, use naux1 = 3nw+(21/2)n+60.

Note: If you receive an attention message, you have not specified sufficient auxiliary storage to achieve optimal performance, but it is enough to perform the computation. To obtain optimal performance, you need to use the amount given by the attention message.

aux2

has the following meaning:

If naux2 = 0 and error 2015 is unrecoverable, aux2 is ignored.

Otherwise, it is working storage used by this subroutine that is available for use by the calling program between calls to this subroutine.

Specified as: an area of storage, containing naux2 long-precision real numbers.

naux2

is the number of doublewords in the working storage specified in aux2. Specified as: a fullword integer, where:

If naux2 = 0 and error 2015 is unrecoverable, DSRIS dynamically allocates the work area used by this subroutine. The work area is deallocated before control is returned to the calling program.

Otherwise,

If IPARM(2) = 1, use naux2 >= 4n.

If IPARM(2) = 2, use naux2 >= 7n.

If IPARM(2) = 3, use naux2 >= (k+2)n+k(k+4)+1, where k = IPARM(3).

If IPARM(2) = 4, use naux2 >= 7n.

If IPARM(2) = 5, use naux2 >= 9n.

On Return

ar

is the sparse matrix A of order n, stored by rows in an array, referred to as AR. The stor argument indicates the storage variation used for storing matrix A. The order of the elements in each row of A in AR may be changed on output.

Returned as: a one-dimensional array, containing long-precision real numbers. The number of elements in this array can be determined by subtracting 1 from the value in IA(n+1).

ja

is the array, referred to as JA, containing the column numbers of each nonzero element in sparse matrix A. These elements correspond to the arrangement of the contents of AR on output.

Returned as: a one-dimensional array, containing fullword integers; 1 <= (JA elements) <= n. The number of elements in this array can be determined by subtracting 1 from the value in IA(n+1).

x

is the vector x of length n, containing the solution of the system Ax = b. Returned as: a one-dimensional array of (at least) length n, containing long-precision real numbers.

iparm

is an array of parameters, IPARM(i), where:

IPARM(1) through IPARM(5) are unchanged.

IPARM(6) contains the number of iterations performed by this subroutine.

Returned as: a one-dimensional array of length 6, containing fullword integers.

rparm

is an array of parameters, RPARM(i), where:

RPARM(1) is unchanged.

RPARM(2) contains the estimate of the error of the solution. If the process converged, RPARM(2) <= epsilon.

RPARM(3) is unchanged.

Returned as: a one-dimensional array of length 3, containing long-precision real numbers.

aux1

is working storage for this subroutine, containing the coefficient matrix and preconditioner in internal format, ready to be passed in a subsequent invocation of this subroutine. Returned as: an area of storage, containing naux1 long-precision real numbers.

Notes

If you want to solve the same sparse linear system of equations multiple times using a different algorithm with the same preconditioner and using a different right-hand side each time, you get the best performance by using the following technique. Call DSRIS the first time with init = 'I'. This solves the system, and then stores the coefficient matrix and preconditioner in internal format in aux1. On the subsequent invocations of DSRIS with different right-hand sides, specify init = 'S'. This indicates to DSRIS to use the contents of aux1, saving the time to convert your coefficient matrix and preconditioner to internal format. If you use this technique, you should not modify the contents of aux1 between calls to DSRIS.
In some cases, you can specify a different algorithm in IPARM(2) when making calls with init = 'S'. (See "Example 2".) However, DSRIS sometimes needs different information in aux1 for different algorithms. When this occurs, DSRIS issues an attention message, continues processing the computation, and then resets the contents of aux1. Your performance is not improved in this case, which is functionally equivalent to calling DSRIS with init = 'I'.
If you use the CG method with init = 'I', you must use the CG method when you specify init = 'S'. However, if you use a different method with init = 'I', you can use any other method, except CG, when you specify init = 'S'.
This subroutine accepts lowercase letters for the stor and init arguments.
Matrix A, vector x, and vector b must have no common elements; otherwise, results are unpredictable.
The algorithm computes a sequence of approximate solution vectors x that converge to the solution. The iterative procedure is stopped when the selected stopping criterion is satisfied or when the maximum number of iterations (in IPARM(1)) has been performed. In this subroutine, a value of RPARM(1) = 0 is permitted, to force the solver to perform exactly IPARM(1) iterations.
For the stopping criteria specified in IPARM(5), the relative accuracy epsilon (in RPARM(1)) must be specified reasonably (10^-4 to 10^-8). If you specify a larger epsilon, the algorithm takes fewer iterations to converge to a solution. If you specify a smaller epsilon, the algorithm requires more iterations and computer time, but converges to a more precise solution. If the value you specify is unreasonably small, the algorithm may fail to converge within the number of iterations it is allowed to perform.
For a description of how sparse matrices are stored by rows, see "Storage-by-Rows".
You have the option of having the minimum required value for naux dynamically returned to your program. For details, see "Using Auxiliary Storage in ESSL".
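
The naux1 and naux2 formulas given under "On Entry" can be evaluated with a short helper. The following Python sketch is illustrative only: the function names are hypothetical and are not part of ESSL, and rounding a fractional result up to the next integer is an assumption, because the formulas do not state how fractions are handled.

```python
import math
from fractions import Fraction


def dsris_nw(stor, ia_last, n):
    """Return nw as defined for naux1; ia_last is the value in IA(n+1)."""
    stor = stor.upper()  # lowercase letters are accepted for stor
    if stor == 'G':
        return ia_last - 1 + n
    if stor in ('U', 'L'):
        return 2 * (ia_last - 1)
    raise ValueError("stor must be 'G', 'U', or 'L'")


def dsris_naux1(stor, ia_last, n, method, precond):
    """Evaluate the naux1 formula; method = IPARM(2), precond = IPARM(4)."""
    nw = dsris_nw(stor, ia_last, n)
    if precond == 1:
        size = Fraction(3, 2) * nw + Fraction(7, 2) * n + 40
    elif precond == 2:
        size = Fraction(3, 2) * nw + Fraction(9, 2) * n + 40
    elif precond in (3, 4, 5):
        if method == 1:
            size = 3 * nw + Fraction(21, 2) * n + 60
        else:
            size = 3 * nw + 10 * n + 60
    else:
        raise ValueError("IPARM(4) must be 1, 2, 3, 4, or 5")
    return math.ceil(size)  # assumption: round a fractional size up


def dsris_naux2(n, method, k=None):
    """Evaluate the minimum naux2; method = IPARM(2), k = IPARM(3)."""
    if method == 1:
        return 4 * n
    if method in (2, 4):
        return 7 * n
    if method == 3:
        if k is None or k <= 0:
            raise ValueError("IPARM(3) must be > 0 when IPARM(2) = 3")
        return (k + 2) * n + k * (k + 4) + 1
    if method == 5:
        return 9 * n
    raise ValueError("IPARM(2) must be 1, 2, 3, 4, or 5")
```

For n = 9, dsris_naux2 gives 36 for CG, 63 for CGS or Bi-CGSTAB, and 109 for GMRES with k = 5, which are the NAUX2 values used in the examples for this subroutine.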

Function

The linear system:

Ax = b

is solved using one of the following methods: conjugate gradient (CG), conjugate gradient squared (CGS), generalized minimum residual (GMRES), more smoothly converging variant of the CGS method (Bi-CGSTAB), or transpose-free quasi-minimal residual method (TFQMR), where:

A is a sparse matrix of order n. The matrix is stored in arrays AR, IA, and JA. If it is general, it is stored by rows. If it is symmetric, it can be stored using upper- or lower-storage-by-rows.

x is a vector of length n.

b is a vector of length n.

One of the following preconditioners is used:

an incomplete LU factorization
an incomplete Cholesky factorization (for positive definite symmetric matrices)
diagonal scaling
symmetric successive over-relaxation (SSOR) with two possible choices for the diagonal matrix:
- the absolute values sum of the input matrix
- the diagonal obtained from the LU factorization

See references [36], [53], [76], [80], [83], and [89].

When you call this subroutine to solve a system for the first time, you specify init = 'I'. After that, you can solve the same system any number of times by calling this subroutine each time with init = 'S'. These subsequent calls use the coefficient matrix and preconditioner, stored in internal format in aux1. You optimize performance by doing this, because certain portions of the computation have already been performed.
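
The three stopping criteria selected by IPARM(5) can be written out as a small check. This Python sketch is a minimal illustration, not the ESSL implementation: the function names are hypothetical, the preconditioned residuals r_j and r₀ are assumed to be supplied by the caller, and no guard is included against a zero denominator.

```python
import math


def norm2(v):
    """Euclidean (2-) norm of a sequence of numbers."""
    return math.sqrt(sum(t * t for t in v))


def dsris_converged(criterion, eps, r_j, r_0, x_j, x_prev):
    """Return True if the stopping test selected by IPARM(5) is satisfied.

    criterion is the IPARM(5) value and eps is RPARM(1). Values other
    than 2 or 3 fall through to criterion 1, as described for IPARM(5).
    """
    if criterion == 2:
        return norm2(r_j) / norm2(r_0) < eps
    if criterion == 3:
        step = [a - b for a, b in zip(x_j, x_prev)]
        return norm2(step) / norm2(x_j) < eps
    return norm2(r_j) / norm2(x_j) < eps
```

Because any IPARM(5) value other than 1, 2, or 3 defaults to criterion 1, a call with criterion = 10 behaves the same as a call with criterion = 1.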

Error Conditions

Resource Errors

Error 2015 is unrecoverable, naux2 = 0, and unable to allocate work area.

Computational Errors

The following errors, with their corresponding return codes, can occur in this subroutine. For details on error handling, see "What Can You Do about ESSL Computational Errors?".

For error 2110, return code 1 indicates that the subroutine exceeded IPARM(1) iterations without converging. Vector x contains the approximate solution computed at the last iteration.
For error 2130, return code 2 indicates that the incomplete LU factorization of A could not be completed, because one pivot was 0.
For error 2124, the subroutine has been called with init = 'S', but the data contained in aux1 was computed for a different algorithm. An attention message is issued. Processing continues, and the contents of aux1 are reset correctly.
For error 2134, return code 3 indicates that the data contained in aux1 is not consistent with the input sparse matrix. The subroutine has been called with init = 'S', and aux1 contains an incomplete factorization and internal data storage for the input matrix A that was computed by a previous call to the subroutine when init = 'I'. This error indicates that aux1 has been modified since the last call to the subroutine, or that the input matrix is not the same as the one that was factored. If the default action has been overridden, the subroutine can be called again with the same parameters, with the exception of IPARM(4) = 1 or 4.
For error 2131, return code 4 indicates that the matrix is singular, because all elements in one row of the matrix contain zero.
For error 2129, return code 5 indicates that the matrix is not positive definite.
For error 2128, return code 8 indicates an internal ESSL error. Please contact your IBM Representative.

Input-Argument Errors

n < 0
stor <> 'G', 'U', or 'L'
init <> 'I' or 'S'
IA(n+1) < 1
IA(i+1)-IA(i) < 0, for any i = 1, n
IPARM(1) < 0
IPARM(2) <> 1, 2, 3, 4, or 5
IPARM(3) <= 0 and IPARM(2) = 3
IPARM(4) <> 1, 2, 3, 4, or 5
RPARM(1) < 0
RPARM(3) <= 0 and IPARM(4) = 3
naux1 is too small--that is, less than the minimum required value. Return code 6 is returned if error 2015 is recoverable.
Error 2015 is recoverable or naux2<>0, and naux2 is too small--that is, less than the minimum required value. Return code 7 is returned for naux2 if error 2015 is recoverable.

Example 1

This example finds the solution of the linear system Ax = b for the sparse matrix A, which is stored by rows in arrays AR, IA, and JA. The system is solved using the Bi-CGSTAB algorithm. The iteration is stopped when the norm of the residual is less than the given threshold specified in RPARM(1). The algorithm is allowed to perform 20 iterations. The process converges after 9 iterations. Matrix A is:

          *                                                   *
          | 2.0  0.0   0.0  0.0   0.0   0.0   0.0   0.0   0.0 |
          | 0.0  2.0  -1.0  0.0   0.0   0.0   0.0   0.0   0.0 |
          | 0.0  1.0   2.0  0.0   0.0   0.0   0.0   0.0   0.0 |
          | 1.0  0.0   0.0  2.0  -1.0   0.0   0.0   0.0   0.0 |
          | 0.0  0.0   0.0  1.0   2.0  -1.0   0.0   0.0   0.0 |
          | 0.0  0.0   0.0  0.0   1.0   2.0  -1.0   0.0   0.0 |
          | 0.0  0.0   0.0  0.0   0.0   1.0   2.0  -1.0   0.0 |
          | 0.0  0.0   0.0  0.0   0.0   0.0   1.0   2.0  -1.0 |
          | 0.0  0.0   0.0  0.0   0.0   0.0   0.0   1.0   2.0 |
          *                                                   *

Call Statement and Input

            STOR INIT   N   AR   JA   IA   B   X   IPARM   RPARM   AUX1   NAUX1   AUX2   NAUX2
             |     |    |   |    |    |    |   |     |       |      |       |      |       |
CALL DSRIS( 'G' , 'I' , 9 , AR , JA , IA , B , X , IPARM , RPARM , AUX1 ,   98  , AUX2 ,   63 )

AR       =  (2.0, 2.0, -1.0, 1.0, 2.0, 1.0, 2.0, -1.0, 1.0, 2.0, -1.0,
             1.0, 2.0, -1.0, 1.0, 2.0, -1.0, 1.0, 2.0, -1.0, 1.0, 2.0)
JA       =  (1, 2, 3, 2, 3, 1, 4, 5, 4, 5, 6, 5, 6, 7, 6, 7, 8, 7, 8,
             9, 8, 9)
IA       =  (1, 2, 4, 6, 9, 12, 15, 18, 21, 23)
B        =  (2.0, 1.0, 3.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0)
X        =  (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
IPARM(1) =  20
IPARM(2) =  4
IPARM(3) =  0
IPARM(4) =  1
IPARM(5) =  10
RPARM(1) =  1.D-7
RPARM(3) =  1.0

Output

X        =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
IPARM(6) =  9
RPARM(2) =  0.29D-16
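
The AR, JA, and IA arrays shown for this example can be reproduced from the dense matrix. The following Python sketch is illustrative only: the helper names are hypothetical and not part of ESSL, and the elements are emitted in increasing column order within each row, which is the order used in this example. Indices are 1-based, as in the arrays above.

```python
def dense_to_storage_by_rows(a):
    """Build 1-based AR, JA, IA arrays from a dense matrix (list of rows)."""
    ar, ja, ia = [], [], [1]
    for row in a:
        for col, value in enumerate(row, start=1):
            if value != 0.0:
                ar.append(value)
                ja.append(col)
        ia.append(len(ar) + 1)  # start of next row; last entry is one past the end
    return ar, ja, ia


def matvec(ar, ja, ia, x):
    """Multiply a matrix stored by rows by the vector x."""
    n = len(ia) - 1
    y = [0.0] * n
    for i in range(n):
        for p in range(ia[i] - 1, ia[i + 1] - 1):
            y[i] += ar[p] * x[ja[p] - 1]
    return y


# Matrix A of Example 1.
a = [
    [2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, -1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 2.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 2.0, -1.0, 0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 2.0, -1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 1.0, 2.0, -1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 2.0, -1.0, 0.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 2.0, -1.0],
    [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 2.0],
]
ar, ja, ia = dense_to_storage_by_rows(a)
b = matvec(ar, ja, ia, [1.0] * 9)  # right-hand side for a solution of all ones
```

Multiplying the stored matrix by a vector of ones gives the right-hand side B used in this example, which is consistent with the solution X returned.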

Example 2

This example finds the solution of the linear system Ax = b for the same sparse matrix A used in Example 1. It also uses the same right-hand side in b and the same initial guesses in x. However, the system is solved using a different algorithm, conjugate gradient squared (CGS). Because INIT is 'S', the best performance is achieved. The iteration is stopped when the norm of the residual is less than the given threshold specified in RPARM(1). The algorithm is allowed to perform 20 iterations. The process converges after 9 iterations.

Call Statement and Input

            STOR INIT   N   AR   JA   IA   B   X   IPARM   RPARM   AUX1   NAUX1   AUX2   NAUX2
             |     |    |   |    |    |    |   |     |       |      |       |      |       |
CALL DSRIS( 'G' , 'S' , 9 , AR , JA , IA , B , X , IPARM , RPARM , AUX1 ,   98  , AUX2 ,   63 )

AR       =(same as input AR in Example 1)
JA       =(same as input JA in Example 1)
IA       =(same as input IA in Example 1)
B        =(same as input B in Example 1)
X        =(same as input X in Example 1)
IPARM(1) =  20
IPARM(2) =  2
IPARM(3) =  0
IPARM(4) =  1
IPARM(5) =  10
RPARM(1) =  1.D-7
RPARM(3) =  1.0

Output

X        =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
IPARM(6) =  9
RPARM(2) =  0.42D-19
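
The rule in "Notes" about which methods can follow a call made with init = 'I' can be stated compactly. This Python sketch is illustrative only, and the function name is hypothetical. It encodes just the CG restriction. Even for a permitted combination, DSRIS may still issue an attention message and reset aux1, as described in "Notes".

```python
CG = 1  # IPARM(2) value that selects the conjugate gradient method
ALL_METHODS = {1, 2, 3, 4, 5}


def methods_allowed_with_init_s(method_used_with_init_i):
    """Return the IPARM(2) values that may be used with init = 'S'."""
    if method_used_with_init_i not in ALL_METHODS:
        raise ValueError("IPARM(2) must be 1, 2, 3, 4, or 5")
    if method_used_with_init_i == CG:
        return {CG}  # CG with init = 'I' requires CG with init = 'S'
    return ALL_METHODS - {CG}  # any other method, except CG
```

Example 1 uses Bi-CGSTAB (IPARM(2) = 4) with init = 'I', so the CGS call in this example (IPARM(2) = 2) is a permitted combination.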

Example 3

This example finds the solution of the linear system Ax = b for the sparse matrix A, which is stored by rows in arrays AR, IA, and JA. The system is solved using the two-term conjugate gradient method (CG), preconditioned by incomplete LU factorization. The iteration is stopped when the norm of the residual is less than the given threshold specified in RPARM(1). The algorithm is allowed to perform 20 iterations. The process converges after 1 iteration. Matrix A is:

         *                                                      *
         |  2.0   0.0  -1.0   0.0   0.0   0.0   0.0   0.0   0.0 |
         |  0.0   2.0   0.0  -1.0   0.0   0.0   0.0   0.0   0.0 |
         | -1.0   0.0   2.0   0.0  -1.0   0.0   0.0   0.0   0.0 |
         |  0.0  -1.0   0.0   2.0   0.0  -1.0   0.0   0.0   0.0 |
         |  0.0   0.0  -1.0   0.0   2.0   0.0  -1.0   0.0   0.0 |
         |  0.0   0.0   0.0  -1.0   0.0   2.0   0.0  -1.0   0.0 |
         |  0.0   0.0   0.0   0.0  -1.0   0.0   2.0   0.0  -1.0 |
         |  0.0   0.0   0.0   0.0   0.0  -1.0   0.0   2.0   0.0 |
         |  0.0   0.0   0.0   0.0   0.0   0.0  -1.0   0.0   2.0 |
         *                                                      *

Call Statement and Input

            STOR INIT   N   AR   JA   IA   B   X   IPARM   RPARM   AUX1   NAUX1   AUX2   NAUX2
             |     |    |   |    |    |    |   |     |       |      |       |      |       |
CALL DSRIS( 'G' , 'I' , 9 , AR , JA , IA , B , X , IPARM , RPARM , AUX1 ,  223  , AUX2 ,   36 )

AR       =  (2.0, -1.0, 2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0, -1.0,
             -1.0, 2.0, -1.0, -1.0, 2.0, -1.0, -1.0, 2.0, -1.0, -1.0,
             2.0, -1.0, 2.0)
JA       =  (1, 3, 2, 4, 1, 3, 5, 2, 4, 6, 3, 5, 7, 4, 6, 8, 5, 7, 9,
             6, 8, 7, 9)
IA       =  (1, 3, 5, 8, 11, 14, 17, 20, 22, 24)
B        =  (1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0)
X        =  (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
IPARM(1) =  20
IPARM(2) =  1
IPARM(3) =  0
IPARM(4) =  4
IPARM(5) =  1
RPARM(1) =  1.D-7
RPARM(3) =  1.0

Output

X        =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
IPARM(6) =  1
RPARM(2) =  0.16D-15

Example 4

This example finds the solution of the linear system Ax = b for the same sparse matrix A used in Example 3. However, matrix A is stored using upper-storage-by-rows in arrays AR, IA, and JA. The system is solved using the generalized minimum residual (GMRES), restarted after 5 steps and preconditioned with SSOR splitting. The iteration is stopped when the norm of the residual is less than the given threshold specified in RPARM(1). The algorithm is allowed to perform 20 iterations. The process converges after 12 iterations.

Call Statement and Input

            STOR INIT   N   AR   JA   IA   B   X   IPARM   RPARM   AUX1   NAUX1   AUX2   NAUX2
             |     |    |   |    |    |    |   |     |       |      |       |      |       |
CALL DSRIS( 'U' , 'I' , 9 , AR , JA , IA , B , X , IPARM , RPARM , AUX1 ,  219  , AUX2 ,  109 )

AR       =  (2.0, -1.0, 2.0, -1.0, 2.0, -1.0, 2.0, -1.0, 2.0, -1.0,
             2.0, -1.0, 2.0, -1.0, 2.0, 2.0)
JA       =  (1, 3, 2, 4, 3, 5, 4, 6, 5, 7, 6, 8, 7, 9, 8, 9)
IA       =  (1, 3, 5, 7, 9, 11, 13, 15, 16, 17)
B        =  (1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0)
X        =  (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
IPARM(1) =  20
IPARM(2) =  3
IPARM(3) =  5
IPARM(4) =  3
IPARM(5) =  1
RPARM(1) =  1.D-7
RPARM(3) =  2.0

Output

X        =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
IPARM(6) =  12
RPARM(2) =  0.33D-7
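
The relationship between the upper-storage-by-rows arrays of this example and the full storage-by-rows arrays of Example 3 can be checked by mirroring each off-diagonal element. The following Python sketch is illustrative only: the function name is hypothetical and not part of ESSL, and sorting each row by column number is a choice made here to match the ordering used in Example 3.

```python
def expand_upper_storage(ar, ja, ia):
    """Expand 1-based upper-storage-by-rows arrays to full storage-by-rows."""
    n = len(ia) - 1
    rows = [[] for _ in range(n)]
    for i in range(n):
        for p in range(ia[i] - 1, ia[i + 1] - 1):
            j = ja[p] - 1
            rows[i].append((j + 1, ar[p]))
            if j != i:
                rows[j].append((i + 1, ar[p]))  # mirrored element below the diagonal
    full_ar, full_ja, full_ia = [], [], [1]
    for entries in rows:
        for col, value in sorted(entries):  # increasing column order
            full_ja.append(col)
            full_ar.append(value)
        full_ia.append(len(full_ar) + 1)
    return full_ar, full_ja, full_ia


# Input arrays of Example 4 (upper-storage-by-rows).
ar_u = [2.0, -1.0, 2.0, -1.0, 2.0, -1.0, 2.0, -1.0, 2.0, -1.0,
        2.0, -1.0, 2.0, -1.0, 2.0, 2.0]
ja_u = [1, 3, 2, 4, 3, 5, 4, 6, 5, 7, 6, 8, 7, 9, 8, 9]
ia_u = [1, 3, 5, 7, 9, 11, 13, 15, 16, 17]
ar_g, ja_g, ia_g = expand_upper_storage(ar_u, ja_u, ia_u)
```

The expanded arrays agree with the AR, JA, and IA input of Example 3, which stores the same matrix with stor = 'G'.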

DSMCG--Sparse Positive Definite or Negative Definite Symmetric Matrix Iterative Solve Using Compressed-Matrix Storage Mode

This subroutine solves a symmetric, positive definite or negative definite linear system, using the conjugate gradient method, with or without preconditioning by an incomplete Cholesky factorization, for a sparse matrix stored in compressed-matrix storage mode. Matrix A and vectors x and b are used:

Ax = b

where A, x, and b contain long-precision real numbers.

Notes:

This subroutine is provided only for migration purposes. You get better performance and a wider choice of algorithms if you use the DSRIS subroutine.
If your sparse matrix is stored by rows, as defined in "Storage-by-Rows", you should first use the utility subroutine DSRSM to convert your sparse matrix to compressed-matrix storage mode. See "DSRSM--Convert a Sparse Matrix from Storage-by-Rows to Compressed-Matrix Storage Mode".

Syntax

Fortran	CALL DSMCG (`m`, `nz`, `ac`, `ka`, `lda`, `b`, `x`, `iparm`, `rparm`, `aux1`, `naux1`, `aux2`, `naux2`)
C and C++	dsmcg (`m`, `nz`, `ac`, `ka`, `lda`, `b`, `x`, `iparm`, `rparm`, `aux1`, `naux1`, `aux2`, `naux2`);
PL/I	CALL DSMCG (`m`, `nz`, `ac`, `ka`, `lda`, `b`, `x`, `iparm`, `rparm`, `aux1`, `naux1`, `aux2`, `naux2`);

On Entry

m

is the order of the linear system Ax = b and the number of rows in sparse matrix A. Specified as: a fullword integer; m >= 0.

nz

is the maximum number of nonzero elements in each row of sparse matrix A. Specified as: a fullword integer; nz >= 0.

ac

is the array, referred to as AC, containing the values of the nonzero elements of the sparse matrix, stored in compressed-matrix storage mode. Specified as: an lda by (at least) nz array, containing long-precision real numbers.

ka

is the array, referred to as KA, containing the column numbers of the matrix A elements stored in the corresponding positions in array AC. Specified as: an lda by (at least) nz array, containing fullword integers, where 1 <= (elements of KA) <= m.

lda

is the leading dimension of the arrays specified for ac and ka. Specified as: a fullword integer; lda > 0 and lda >= m.

b

is the vector b of length m, containing the right-hand side of the matrix problem. Specified as: a one-dimensional array of (at least) length m, containing long-precision real numbers.

x

is the vector x of length m, containing your initial guess of the solution of the linear system. Specified as: a one-dimensional array of (at least) length m, containing long-precision real numbers. The elements can have any value, and if no guess is available, the value can be zero.

iparm

is an array of parameters, IPARM(i), where:

IPARM(1) controls the number of iterations.
If IPARM(1) > 0, IPARM(1) is the maximum number of iterations allowed.
If IPARM(1) = 0, the following default values are used:

IPARM(1) = 300
IPARM(2) = 1
IPARM(3) = 0
RPARM(1) = 10^-6
IPARM(2) is the flag used to select the stopping criterion.
If IPARM(2) = 0, the conjugate gradient iterative procedure is stopped when:

||r||₂ / ||x||₂ < epsilon

where r = b-Ax is the residual, and epsilon is the desired relative accuracy. epsilon is stored in RPARM(1).
If IPARM(2) = 1, the conjugate gradient iterative procedure is stopped when:

||r||₂ / (lambda ||x||₂) < epsilon

where lambda is an estimate of the minimum eigenvalue of the iteration matrix. lambda is computed adaptively by this program and, on output, is stored in RPARM(2).
If IPARM(2) = 2, the conjugate gradient iterative procedure is stopped when:

||r||₂ / (lambda ||x||₂) < epsilon

where lambda is a predetermined estimate of the minimum eigenvalue of the iteration matrix. This eigenvalue estimate, on input, is stored in RPARM(2) and may be obtained by an earlier call to this subroutine with the same matrix.
IPARM(3) is the flag that determines whether the system is to be solved using the conjugate gradient method, preconditioned by an incomplete Cholesky factorization with no fill-in.
If IPARM(3) = 0, the system is not preconditioned.
If IPARM(3) = 10, the system is preconditioned by an incomplete Cholesky factorization.
If IPARM(3) = -10, the system is preconditioned by an incomplete Cholesky factorization, where the factorization matrix was computed in an earlier call to this subroutine and is stored in aux2.
IPARM(4), see 'On Return'.

Specified as: an array of (at least) length 4, containing fullword integers, where:

IPARM(1) >= 0

IPARM(2) = 0, 1, or 2

IPARM(3) = 0, 10, or -10

rparm

is an array of parameters, RPARM(i), where epsilon is stored in RPARM(1), and lambda is stored in RPARM(2).

RPARM(1) > 0, is the relative accuracy epsilon used in the stopping criterion.

RPARM(2) > 0, is the estimate of the smallest eigenvalue, lambda, of the iteration matrix. It is only used when IPARM(2) = 2.

RPARM(3), see 'On Return'.

Specified as: a one-dimensional array of (at least) length 3, containing long-precision real numbers.

aux1

has the following meaning:

If naux1 = 0 and error 2015 is unrecoverable, aux1 is ignored.

Otherwise, it is a storage work area used by this subroutine, which is available for use by the calling program between calls to this subroutine. Its size is specified by naux1.

Specified as: an area of storage, containing long-precision real numbers.

naux1

is the size of the work area specified by aux1--that is, the number of elements in aux1. Specified as: a fullword integer, where:

If naux1 = 0 and error 2015 is unrecoverable, DSMCG dynamically allocates the work area used by this subroutine. The work area is deallocated before control is returned to the calling program.

Otherwise, naux1 must have at least the following value, where:

If IPARM(2) = 0 or 2, use naux1 >= 3m.

If IPARM(2) = 1 and IPARM(1) <> 0, use naux1 >= 3m+2(IPARM(1)).

If IPARM(2) = 1 and IPARM(1) = 0, use naux1 >= 3m+600.

aux2

is a storage work area used by this subroutine. If IPARM(3) = -10, aux2 must contain the incomplete Cholesky factorization of matrix A, computed in an earlier call to DSMCG. The size of aux2 is specified by naux2. Specified as: an area of storage, containing long-precision real numbers.

naux2

is the size of the work area specified by aux2--that is, the number of elements in aux2. Specified as: a fullword integer. When IPARM(3) = 10 or -10, naux2 must have at least the following value: naux2 >= 1.5m(nz-1)+2(m+6).

On Return

x

is the vector x of length m, containing the solution of the system Ax = b. Returned as: a one-dimensional array of (at least) length m, containing long-precision real numbers.

iparm

is an array of parameters, IPARM(i), where:

IPARM(1) is unchanged.

IPARM(2) is unchanged.

IPARM(3) is unchanged.

IPARM(4) contains the number of iterations performed by this subroutine.

Returned as: a one-dimensional array of length 4, containing fullword integers.

rparm

is an array of parameters, RPARM(i), where:

RPARM(1) is unchanged.

RPARM(2) is unchanged if IPARM(2) = 0 or 2. If IPARM(2) = 1, RPARM(2) contains lambda, an estimate of the smallest eigenvalue of the iteration matrix.

RPARM(3) contains the estimate of the error of the solution. If the process converged, RPARM(3) <= epsilon.

Returned as: a one-dimensional array of length 3, containing long-precision real numbers; lambda > 0.

aux2

is the storage work area used by this subroutine.

If IPARM(3) = 10, aux2 contains the incomplete Cholesky factorization of matrix A.

If IPARM(3) = -10, aux2 is unchanged.

See "Notes" for additional information on aux2. Returned as: an area of storage, containing long-precision real numbers.

Notes

When IPARM(3) = -10, this subroutine uses the incomplete Cholesky factorization in aux2, computed in an earlier call to this subroutine. When IPARM(3) = 10, this subroutine computes the incomplete Cholesky factorization and stores it in aux2.
If you solve the same sparse linear system of equations several times with different right-hand sides using the preconditioned algorithm, specify IPARM(3) = 10 on the first invocation. The incomplete factorization is stored in aux2. You may save computing time on subsequent calls by setting IPARM(3) = -10. In this way, the algorithm reuses the incomplete factorization that was computed the first time. Therefore, you should not modify the contents of aux2 between calls.
Matrix A must have no common elements with vectors x and b; otherwise, results are unpredictable.
In the iterative solvers for sparse matrices, the relative accuracy epsilon (RPARM(1)) must be specified "reasonably" (10^-4 to 10^-8). The algorithm computes a sequence of approximate solution vectors x that converge to the solution. The iterative procedure is stopped when the norm of the residual is sufficiently small--that is, when:

||b-Ax||₂ / (lambda ||x||₂) < epsilon

where lambda is an estimate of the minimum eigenvalue of the iteration matrix, which is either estimated adaptively or given by the user. As a result, if you specify a larger epsilon, the algorithm takes fewer iterations to converge to a solution. If you specify a smaller epsilon, the algorithm requires more iterations and computer time, but converges to a more precise solution. If the value you specify is unreasonably small, the algorithm may fail to converge within the number of iterations it is allowed to perform.
For a description of how sparse matrices are stored in compressed-matrix storage mode, see "Compressed-Matrix Storage Mode".
On output, array AC and vector b are not bitwise identical to what they were on input, because the matrix A and the right-hand side are scaled before starting the iterative process and are unscaled before returning control to the user. In addition, arrays AC and KA may be rearranged on output, but still contain a mathematically equivalent mapping of the elements in matrix A.
You have the option of having the minimum required value for naux dynamically returned to your program. For details, see "Using Auxiliary Storage in ESSL".

Function

The sparse positive definite or negative definite linear system:

Ax = b

is solved, where:

A is a symmetric, positive definite or negative definite sparse matrix of order m, stored in compressed-matrix storage mode in AC and KA.

x is a vector of length m.

b is a vector of length m.

The system is solved using the two-term conjugate gradient method, with or without preconditioning by an incomplete Cholesky factorization. In both cases, the matrix is scaled by the square root of the diagonal.

See references [36], [59], and [62].

If your program uses a sparse matrix stored by rows and you want to use this subroutine, first convert your sparse matrix to compressed-matrix storage mode by using the subroutine DSRSM. See "DSRSM--Convert a Sparse Matrix from Storage-by-Rows to Compressed-Matrix Storage Mode".
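As a rough illustration of the method described in this section, the following Python sketch applies the symmetric diagonal scaling and then runs a textbook conjugate gradient iteration on the matrix used in Example 1. It is not the ESSL implementation: the function names (`cg_solve`, `example_matrix`), the dense storage, and the choice to apply the relative-residual test to the scaled system are assumptions made for illustration, and the iteration counts it reports need not match those returned by DSMCG.

```python
import math

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def matvec(a, v):
    return [dot(row, v) for row in a]

def cg_solve(a, b, eps=1e-7, max_iter=20):
    """Textbook conjugate gradient on the diagonally scaled system.

    Returns (x, iterations). Illustrative only; not the ESSL algorithm.
    """
    m = len(b)
    # Symmetric scaling by the square root of the diagonal:
    # A' = S A S and b' = S b, with S = diag(1 / sqrt(|a_ii|)).
    s = [1.0 / math.sqrt(abs(a[i][i])) for i in range(m)]
    a_s = [[s[i] * a[i][j] * s[j] for j in range(m)] for i in range(m)]
    b_s = [s[i] * b[i] for i in range(m)]

    y = [0.0] * m   # zero initial guess for the scaled unknown y = S^-1 x
    r = b_s[:]      # residual r = b' - A'y
    p = r[:]        # search direction
    rr = dot(r, r)
    iterations = 0
    while iterations < max_iter and rr > 0.0:
        norm_y = math.sqrt(dot(y, y))
        if norm_y > 0.0 and math.sqrt(rr) < eps * norm_y:
            break   # relative residual test, in the spirit of IPARM(2) = 0
        ap = matvec(a_s, p)
        alpha = rr / dot(p, ap)
        y = [yi + alpha * pi for yi, pi in zip(y, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rr_new = dot(r, r)
        beta = rr_new / rr
        p = [ri + beta * pi for ri, pi in zip(r, p)]
        rr = rr_new
        iterations += 1
    # Undo the scaling: x = S y.
    return [s[i] * y[i] for i in range(m)], iterations

def example_matrix():
    """Dense copy of the 9 by 9 matrix A shown in Example 1."""
    m = 9
    a = [[0.0] * m for _ in range(m)]
    for i in range(m):
        a[i][i] = 2.0
    # Symmetric off-diagonal pairs (1-based row, column) holding -1.0.
    for i, j in [(1, 4), (2, 3), (4, 5), (5, 6), (6, 7), (7, 8), (8, 9)]:
        a[i - 1][j - 1] = a[j - 1][i - 1] = -1.0
    return a

x, iterations = cg_solve(example_matrix(),
                         [1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0])
print([round(xi, 6) for xi in x], iterations)
```

The sketch keeps the matrix dense to stay short; a production solver would work directly on the compressed-matrix arrays AC and KA.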

Error Conditions

Resource Errors

Error 2015 is unrecoverable, naux1 = 0, and unable to allocate work area.

Computational Errors

The following errors, with their corresponding return codes, can occur in this subroutine. Where a value of i is indicated, it can be determined at run time by use of the ESSL error-handling facilities. To obtain this information, you must use ERRSET to change the number of allowable errors for that particular error code in the ESSL error option table; otherwise, the default value causes your program to terminate when the error occurs. For details, see "What Can You Do about ESSL Computational Errors?".

For error 2110, return code 1 indicates that the subroutine exceeded IPARM(1) iterations without converging. Vector x contains the approximate solution computed at the last iteration.
For error 2111, return code 2 indicates that aux2 contains an incorrect factorization. The subroutine has been called with IPARM(3) = -10, and aux2 is expected to contain an incomplete factorization of the input matrix A that was computed by a previous call to the subroutine with IPARM(3) = 10. This error indicates that aux2 has been modified since the last call to the subroutine, or that the input matrix is not the same as the one that was factored. If the default action has been overridden, the subroutine can be called again with the same parameters, except that IPARM(3) must be set to 0 or 10.
For error 2109, return code 3 indicates that the inner product (y,Ay) is negative in the iterative procedure after iteration i. This should not occur, because the input matrix is assumed to be positive or negative definite. Vector x contains the results of the last iteration. The value i is identified in the computational error message.
For error 2108, return code 4 indicates that the matrix is not positive definite. AC is partially modified and does not represent the same matrix as on entry.

Input-Argument Errors

m < 0
lda < 1
lda < m
nz < 0
nz = 0 and m > 0
IPARM(1) < 0
IPARM(2) <> 0, 1, or 2
IPARM(3) <> 0, 10, or -10
RPARM(1) < 0
RPARM(2) < 0
Error 2015 is recoverable or naux1<>0, and naux1 is too small--that is, less than the minimum required value. Return code 5 is returned if error 2015 is recoverable.
naux2 is too small--that is, less than the minimum required value. Return code 5 is returned if error 2015 is recoverable.

Example 1

This example finds the solution of the linear system Ax = b for the sparse matrix A, which is stored in compressed-matrix storage mode in arrays AC and KA. The system is solved using the conjugate gradient method. Matrix A is:

         *                                                      *
         |  2.0   0.0   0.0  -1.0   0.0   0.0   0.0   0.0   0.0 |
         |  0.0   2.0  -1.0   0.0   0.0   0.0   0.0   0.0   0.0 |
         |  0.0  -1.0   2.0   0.0   0.0   0.0   0.0   0.0   0.0 |
         | -1.0   0.0   0.0   2.0  -1.0   0.0   0.0   0.0   0.0 |
         |  0.0   0.0   0.0  -1.0   2.0  -1.0   0.0   0.0   0.0 |
         |  0.0   0.0   0.0   0.0  -1.0   2.0  -1.0   0.0   0.0 |
         |  0.0   0.0   0.0   0.0   0.0  -1.0   2.0  -1.0   0.0 |
         |  0.0   0.0   0.0   0.0   0.0   0.0  -1.0   2.0  -1.0 |
         |  0.0   0.0   0.0   0.0   0.0   0.0   0.0  -1.0   2.0 |
         *                                                      *

Note: For input matrix KA, ( . ) indicates any value between 1 and 9.

Call Statement and Input

            M   NZ  AC  KA LDA  B   X  IPARM  RPARM  AUX1 NAUX1  AUX2 NAUX2
            |   |   |   |   |   |   |    |      |     |     |     |     |
CALL DSMCG( 9 , 3 , AC, KA, 9 , B , X, IPARM, RPARM, AUX1, 27  , AUX2,  0  )

IPARM(1) =  20
IPARM(2) =  0
IPARM(3) =  0
RPARM(1) =  1.D-7

        *                  *
        |  2.0  -1.0   0.0 |
        |  2.0  -1.0   0.0 |
        | -1.0   2.0   0.0 |
        | -1.0   2.0  -1.0 |
AC   =  | -1.0   2.0  -1.0 |
        | -1.0   2.0  -1.0 |
        | -1.0   2.0  -1.0 |
        | -1.0   2.0  -1.0 |
        | -1.0   2.0   0.0 |
        *                  *

        *         *
        | 1  4  . |
        | 2  3  . |
        | 2  3  . |
        | 1  4  5 |
KA   =  | 4  5  6 |
        | 5  6  7 |
        | 6  7  8 |
        | 7  8  9 |
        | 8  9  . |
        *         *

B        =  (1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0)
X        =  (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)

Output

X        =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
IPARM(4) =  5
RPARM(2) =  0
RPARM(3) =  0.351D-15
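The following Python sketch is one way to check the Example 1 input by hand: it expands AC and KA back into a dense matrix and confirms that multiplying by the output vector X = (1.0, ..., 1.0) reproduces B. The function name `expand_compressed_matrix` is hypothetical, and the padding positions shown as ( . ) in KA are filled with 1 here, which is an arbitrary choice within the allowed range.

```python
def expand_compressed_matrix(ac, ka, m):
    """Rebuild a dense m by m matrix from AC (values) and KA (1-based columns)."""
    a = [[0.0] * m for _ in range(m)]
    for i in range(m):
        for value, col in zip(ac[i], ka[i]):
            if value != 0.0:  # padding entries hold 0.0; their KA value is ignored
                a[i][col - 1] += value
    return a

AC = [[2.0, -1.0, 0.0], [2.0, -1.0, 0.0], [-1.0, 2.0, 0.0],
      [-1.0, 2.0, -1.0], [-1.0, 2.0, -1.0], [-1.0, 2.0, -1.0],
      [-1.0, 2.0, -1.0], [-1.0, 2.0, -1.0], [-1.0, 2.0, 0.0]]
KA = [[1, 4, 1], [2, 3, 1], [2, 3, 1], [1, 4, 5], [4, 5, 6],
      [5, 6, 7], [6, 7, 8], [7, 8, 9], [8, 9, 1]]

A = expand_compressed_matrix(AC, KA, 9)
x = [1.0] * 9
b = [sum(aij * xj for aij, xj in zip(row, x)) for row in A]
print(b)
```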

Example 2

This example finds the solution of the linear system Ax = b for the same sparse matrix A as in Example 1, which is stored in compressed-matrix storage mode in arrays AC and KA. The system is solved using the conjugate gradient method, preconditioned with an incomplete Cholesky factorization. The smallest eigenvalue of the iteration matrix is computed and used in stopping the computation.
Note: For input matrix KA, ( . ) indicates any value between 1 and 9.

Call Statement and Input

            M   NZ  AC  KA LDA  B   X  IPARM  RPARM  AUX1 NAUX1  AUX2 NAUX2
            |   |   |   |   |   |   |    |      |     |     |     |     |
CALL DSMCG( 9 , 3 , AC, KA, 9 , B , X, IPARM, RPARM, AUX1, 67  , AUX2, 74  )

IPARM(1) =  20
IPARM(2) =  1
IPARM(3) =  10
RPARM(1) =  1.D-7
AC       =  (same as input AC in Example 1)
KA       =  (same as input KA in Example 1)
B        =  (1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0)
X        =  (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)

Output

X        =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
IPARM(4) =  1
RPARM(2) =  1
RPARM(3) =  0.100D-15

DSDCG--Sparse Positive Definite or Negative Definite Symmetric Matrix Iterative Solve Using Compressed-Diagonal Storage Mode

This subroutine solves a symmetric, positive definite or negative definite linear system, using the two-term conjugate gradient method, with or without preconditioning by an incomplete Cholesky factorization, for a sparse matrix stored in compressed-diagonal storage mode. Matrix A and vectors x and b are used:

Ax = b

where A, x, and b contain long-precision real numbers.

Syntax

Fortran	CALL DSDCG (`iopt`, `m`, `nd`, `ad`, `lda`, `la`, `b`, `x`, `iparm`, `rparm`, `aux1`, `naux1`, `aux2`, `naux2`)
C and C++	dsdcg (`iopt`, `m`, `nd`, `ad`, `lda`, `la`, `b`, `x`, `iparm`, `rparm`, `aux1`, `naux1`, `aux2`, `naux2`);
PL/I	CALL DSDCG (`iopt`, `m`, `nd`, `ad`, `lda`, `la`, `b`, `x`, `iparm`, `rparm`, `aux1`, `naux1`, `aux2`, `naux2`);

On Entry

iopt

indicates the type of storage used, where:

If iopt = 0, all the nonzero diagonals of the sparse matrix are stored in compressed-diagonal storage mode.

If iopt = 1, the sparse matrix, stored in compressed-diagonal storage mode, is symmetric. Only the main diagonal and one of each pair of identical diagonals are stored in array AD.

Specified as: a fullword integer; iopt = 0 or 1.

m

is the order of the linear system Ax = b and the number of rows in sparse matrix A. Specified as: a fullword integer; m >= 0.

nd

is the number of nonzero diagonals stored in the columns of array AD, the number of columns in the array AD, and the number of elements in array LA. Specified as: a fullword integer; it must have the following value, where:

If m > 0, then nd > 0.

If m = 0, then nd >= 0.

ad

is the array, referred to as AD, containing the values of the nonzero elements of the sparse matrix stored in compressed-diagonal storage mode. If iopt = 1, the main diagonal and one of each pair of identical diagonals are stored in this array.

Specified as: an lda by (at least) nd array, containing long-precision real numbers.

lda

is the leading dimension of the array specified for ad. Specified as: a fullword integer; lda > 0 and lda >= m.

la

is the array, referred to as LA, containing the diagonal numbers k for the diagonals stored in each corresponding column in array AD. For an explanation of how diagonal numbers are assigned, see "Compressed-Diagonal Storage Mode".

Specified as: a one-dimensional array of (at least) length nd, containing fullword integers, where 1-m <= (elements of LA) <= m-1.

b

is the vector b of length m, containing the right-hand side of the matrix problem. Specified as: a one-dimensional array of (at least) length m, containing long-precision real numbers.

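x

is the vector x of length m, containing your initial guess of the solution of the linear system. Specified as: a one-dimensional array of (at least) length m, containing long-precision real numbers.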

iparm

is an array of parameters, IPARM(i), where:

IPARM(1) controls the number of iterations.
If IPARM(1) > 0, IPARM(1) is the maximum number of iterations allowed.
If IPARM(1) = 0, the following default values are used:

IPARM(1) = 300
IPARM(2) = 1
IPARM(3) = 0
RPARM(1) = 10^-6
IPARM(2) is the flag used to select the stopping criterion.
If IPARM(2) = 0, the conjugate gradient iterative procedure is stopped when:

||r||₂ / ||x||₂ < epsilon

where r = b-Ax is the residual and epsilon is the desired relative accuracy. epsilon is stored in RPARM(1).
If IPARM(2) = 1, the conjugate gradient iterative procedure is stopped when:

||r||₂ / (lambda ||x||₂) < epsilon

where lambda is an estimate of the minimum eigenvalue of the iteration matrix. lambda is computed adaptively by this program and, on output, is stored in RPARM(2).
If IPARM(2) = 2, the conjugate gradient iterative procedure is stopped when:

||r||₂ / (lambda ||x||₂) < epsilon

where lambda is a predetermined estimate of the minimum eigenvalue of the iteration matrix. This eigenvalue estimate, on input, is stored in RPARM(2) and may be obtained by an earlier call to this subroutine with the same matrix.
IPARM(3) is the flag that determines whether the system is to be solved using the conjugate gradient method, preconditioned by an incomplete Cholesky factorization with no fill-in.
If IPARM(3) = 0, the system is not preconditioned.
If IPARM(3) = 10, the system is preconditioned by an incomplete Cholesky factorization.
If IPARM(3) = -10, the system is preconditioned by an incomplete Cholesky factorization, where the factorization matrix was computed in an earlier call to this subroutine and is stored in aux2.
IPARM(4), see 'On Return'.

Specified as: an array of (at least) length 4, containing fullword integers, where:

IPARM(1) >= 0

IPARM(2) = 0, 1, or 2

IPARM(3) = 0, 10, or -10
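The three stopping criteria selected by IPARM(2) differ only in whether the eigenvalue estimate lambda enters the test. The following Python sketch is a minimal reading of those formulas; the function name `converged` and its argument names are hypothetical, and the test is written in multiplied-out form so that a zero vector x does not cause a division by zero, which is a choice of this sketch and not something the text specifies.

```python
import math

def converged(r, x, eps, iparm2=0, lam=1.0):
    """Evaluate the stopping test for residual r = b - Ax.

    iparm2 = 0:      ||r|| / ||x|| < eps
    iparm2 = 1 or 2: ||r|| / (lam * ||x||) < eps, where lam is the estimate
                     of the minimum eigenvalue of the iteration matrix.
    """
    norm_r = math.sqrt(sum(ri * ri for ri in r))
    norm_x = math.sqrt(sum(xi * xi for xi in x))
    scale = norm_x if iparm2 == 0 else lam * norm_x
    return norm_r < eps * scale

# The same residual passes the plain test but fails once a small
# eigenvalue estimate is taken into account.
print(converged([1e-9, 0.0], [1.0, 0.0], 1e-7))
print(converged([1e-9, 0.0], [1.0, 0.0], 1e-7, iparm2=2, lam=1e-3))
```

A small lambda makes the test stricter, which reflects that a small residual does not guarantee a small error when the iteration matrix is poorly conditioned.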

rparm

is an array of parameters, RPARM(i), where epsilon is stored in RPARM(1), and lambda is stored in RPARM(2).

RPARM(1) > 0, is the relative accuracy epsilon used in the stopping criterion.

RPARM(2) > 0, is the estimate of the smallest eigenvalue, lambda, of the iteration matrix. It is only used when IPARM(2) = 2.

RPARM(3), see 'On Return'.

Specified as: a one-dimensional array of (at least) length 3, containing long-precision real numbers.

aux1

has the following meaning:

If naux1 = 0 and error 2015 is unrecoverable, aux1 is ignored.

Otherwise, it is a storage work area used by this subroutine, which is available for use by the calling program between calls to this subroutine. Its size is specified by naux1.

Specified as: an area of storage, containing long-precision real numbers.

naux1

is the size of the work area specified by aux1--that is, the number of elements in aux1.

Specified as: a fullword integer, where:

If naux1 = 0 and error 2015 is unrecoverable, DSDCG dynamically allocates the work area used by this subroutine. The work area is deallocated before control is returned to the calling program.

Otherwise, it must have at least the following value, where:

If IPARM(2) = 0 or 2, use naux1 >= 3m.

If IPARM(2) = 1 and IPARM(1) <> 0, use naux1 >= 3m+2(IPARM(1)).

If IPARM(2) = 1 and IPARM(1) = 0, use naux1 >= 3m+600.
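The naux1 rules above can be collected into a small helper. The following Python sketch assumes only the three cases listed; the function name `dsdcg_min_naux1` is hypothetical and is not part of ESSL.

```python
def dsdcg_min_naux1(m, iparm1, iparm2):
    """Minimum naux1 for DSDCG according to the rules listed above."""
    if iparm2 in (0, 2):
        return 3 * m
    if iparm2 == 1:
        # IPARM(1) = 0 selects the default of 300 iterations, hence 3m+600.
        max_iterations = iparm1 if iparm1 != 0 else 300
        return 3 * m + 2 * max_iterations
    raise ValueError("IPARM(2) must be 0, 1, or 2")

print(dsdcg_min_naux1(9, 20, 0))
print(dsdcg_min_naux1(9, 20, 1))
print(dsdcg_min_naux1(9, 0, 1))
```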

aux2

is the storage work area used by this subroutine. If IPARM(3) = -10, aux2 must contain the incomplete Cholesky factorization of matrix A, computed in an earlier call to DSDCG. Its size is specified by naux2. Specified as: an area of storage, containing long-precision real numbers.

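naux2

is the size of the work area specified by aux2--that is, the number of elements in aux2. Specified as: a fullword integer. When IPARM(3) = 10 or -10, naux2 must have at least the following value, where: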

If iopt = 0, use naux2 >= m(1.5nd+2)1.5+2(m+6).

If iopt = 1, use naux2 >= m(3nd+2)+8.

On Return

x

is the vector x of length m, containing the solution of the system Ax = b. Returned as: a one-dimensional array, containing long-precision real numbers.

iparm

is an array of parameters, IPARM(i), where:

IPARM(1) is unchanged.

IPARM(2) is unchanged.

IPARM(3) is unchanged.

IPARM(4) contains the number of iterations performed by this subroutine.

Returned as: a one-dimensional array of length 4, containing fullword integers.

rparm

is an array of parameters, RPARM(i), where:

RPARM(1) is unchanged.

RPARM(2) is unchanged if IPARM(2) = 0 or 2. If IPARM(2) = 1, RPARM(2) contains lambda, an estimate of the smallest eigenvalue of the iteration matrix.

RPARM(3) contains the estimate of the error of the solution. If the process converged, RPARM(3) <= epsilon.

Returned as: a one-dimensional array of length 3, containing long-precision real numbers; lambda > 0.

aux2

is the storage work area used by this subroutine.

If IPARM(3) = 10, aux2 contains the incomplete Cholesky factorization of matrix A.

If IPARM(3) = -10, aux2 is unchanged.

See "Notes" for additional information on aux2. Returned as: an area of storage, containing long-precision real numbers.

Notes

When IPARM(3) = -10, this subroutine uses the incomplete Cholesky factorization in aux2, computed in an earlier call to this subroutine. When IPARM(3) = 10, this subroutine computes the incomplete Cholesky factorization and stores it in aux2.
If you solve the same sparse linear system of equations several times with different right-hand sides using the preconditioned algorithm, specify IPARM(3) = 10 on the first invocation. The incomplete factorization is then stored in aux2. You can save computing time on subsequent calls by setting IPARM(3) = -10, so that the algorithm reuses the incomplete factorization computed the first time. For this reason, do not modify the contents of aux2 between calls.
Matrix A must have no common elements with vectors x and b; otherwise, results are unpredictable.
In the iterative solvers for sparse matrices, the relative accuracy epsilon (RPARM(1)) must be specified "reasonably" (10^-4 to 10^-8). The algorithm computes a sequence of approximate solution vectors x that converge to the solution. The iterative procedure is stopped when the norm of the residual is sufficiently small--that is, when:

||b-Ax||₂ / (lambda ||x||₂) < epsilon

where lambda is an estimate of the minimum eigenvalue of the iteration matrix, which is either estimated adaptively or given by the user. As a result, if you specify a larger epsilon, the algorithm takes fewer iterations to converge to a solution. If you specify a smaller epsilon, the algorithm requires more iterations and computer time, but converges to a more precise solution. If the value you specify is unreasonably small, the algorithm may fail to converge within the number of iterations it is allowed to perform.
For a description of how sparse matrices are stored in compressed-diagonal storage mode, see "Compressed-Diagonal Storage Mode".
On output, array AD and vector b are not bitwise identical to what they were on input, because the matrix A and the right-hand side are scaled before starting the iterative process and are unscaled before returning control to the user. In addition, arrays AD and LA may be rearranged on output, but still contain a mathematically equivalent mapping of the elements in matrix A.
You have the option of having the minimum required value for naux dynamically returned to your program. For details, see "Using Auxiliary Storage in ESSL".

Function

The sparse positive definite or negative definite linear system:

Ax = b

is solved, where:

A is a symmetric, positive definite or negative definite sparse matrix of order m, stored in compressed-diagonal storage mode in arrays AD and LA.

x is a vector of length m.

b is a vector of length m.

See references [36], [59], and [62].

Error Conditions

Resource Errors

Error 2015 is unrecoverable, naux1 = 0, and unable to allocate work area.

Computational Errors

For error 2110, return code 1 indicates that the subroutine exceeded IPARM(1) iterations without converging. Vector x contains the approximate solution computed at the last iteration.
For error 2111, return code 2 indicates that aux2 contains an incorrect factorization. The subroutine has been called with IPARM(3) = -10, and aux2 is expected to contain an incomplete factorization of the input matrix A that was computed by a previous call to the subroutine with IPARM(3) = 10. This error indicates that aux2 has been modified since the last call to the subroutine, or that the input matrix is not the same as the one that was factored. If the default action has been overridden, the subroutine can be called again with the same parameters, except that IPARM(3) must be set to 0 or 10.
For error 2109, return code 3 indicates that the inner product (y,Ay) is negative in the iterative procedure after iteration i. This should not occur, because the input matrix is assumed to be positive or negative definite. Vector x contains the results of the last iteration. The value i is identified in the computational error message.
For error 2108, return code 4 indicates that the matrix is not positive definite. AD is partially modified and does not represent the same matrix as on entry.

Input-Argument Errors

iopt <> 0 or 1
m < 0
lda < 1
lda < m
nd < 0
nd = 0 and m > 0
|LA(i)| > m-1 for i = 1, nd
IPARM(1) < 0
IPARM(2) <> 0, 1, or 2
IPARM(3) <> 0, 10, or -10
RPARM(1) < 0
RPARM(2) < 0
Error 2015 is recoverable or naux1<>0, and naux1 is too small--that is, less than the minimum required value. Return code 5 is returned if error 2015 is recoverable.
naux2 is too small--that is, less than the minimum required value. Return code 5 is returned if error 2015 is recoverable.

Example 1

This example finds the solution of the linear system Ax = b for sparse matrix A, which is stored in compressed-diagonal storage mode in arrays AD and LA. The system is solved using the two-term conjugate gradient method. In this example, IOPT = 0. Matrix A is:

         *                                                      *
         |  2.0   0.0  -1.0   0.0   0.0   0.0   0.0   0.0   0.0 |
         |  0.0   2.0   0.0  -1.0   0.0   0.0   0.0   0.0   0.0 |
         | -1.0   0.0   2.0   0.0  -1.0   0.0   0.0   0.0   0.0 |
         |  0.0  -1.0   0.0   2.0   0.0  -1.0   0.0   0.0   0.0 |
         |  0.0   0.0  -1.0   0.0   2.0   0.0  -1.0   0.0   0.0 |
         |  0.0   0.0   0.0  -1.0   0.0   2.0   0.0  -1.0   0.0 |
         |  0.0   0.0   0.0   0.0  -1.0   0.0   2.0   0.0  -1.0 |
         |  0.0   0.0   0.0   0.0   0.0  -1.0   0.0   2.0   0.0 |
         |  0.0   0.0   0.0   0.0   0.0   0.0  -1.0   0.0   2.0 |
         *                                                      *

Call Statement and Input

           IOPT  M   ND  AD  LDA  LA   B   X   IPARM   RPARM   AUX1  NAUX1  AUX2  NAUX2
            |    |   |   |    |   |    |   |     |       |      |      |     |      |
CALL DSDCG( 0  , 9 , 3 , AD , 9 , LA , B , X , IPARM , RPARM , AUX1 , 283 , AUX2 ,  0  )

IPARM(1) =  20
IPARM(2) =  0
IPARM(3) =  0
RPARM(1) =  1.D-7

        *                 *
        | 2.0   0.0  -1.0 |
        | 2.0   0.0  -1.0 |
        | 2.0  -1.0  -1.0 |
        | 2.0  -1.0  -1.0 |
AD   =  | 2.0  -1.0  -1.0 |
        | 2.0  -1.0  -1.0 |
        | 2.0  -1.0  -1.0 |
        | 2.0  -1.0   0.0 |
        | 2.0  -1.0   0.0 |
        *                 *

LA       =  (0, -2, 2)
B        =  (1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0)
X        =  (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)

Output

X        =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
IPARM(4) =  5
RPARM(2) =  0
RPARM(3) =  0.46D-16
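To make the layout of AD and LA in this example concrete, the following Python sketch expands them into a dense matrix and checks that A times the output X reproduces B. It assumes the convention that can be read off the example, namely that column j of AD holds A(i, i+LA(j)) in row i, with zero padding where i+LA(j) falls outside the matrix; see "Compressed-Diagonal Storage Mode" for the authoritative definition. The function name `expand_diagonals` is hypothetical.

```python
def expand_diagonals(ad, la, m):
    """Rebuild a dense matrix from AD and LA for iopt = 0.

    Assumed convention: ad[i][j] holds A(i, i + la[j]); entries whose
    column index falls outside the matrix are padding and are skipped.
    """
    a = [[0.0] * m for _ in range(m)]
    for j, k in enumerate(la):
        for i in range(m):
            col = i + k
            if 0 <= col < m:
                a[i][col] = ad[i][j]
    return a

AD = [[2.0, 0.0, -1.0], [2.0, 0.0, -1.0], [2.0, -1.0, -1.0],
      [2.0, -1.0, -1.0], [2.0, -1.0, -1.0], [2.0, -1.0, -1.0],
      [2.0, -1.0, -1.0], [2.0, -1.0, 0.0], [2.0, -1.0, 0.0]]
LA = [0, -2, 2]

A = expand_diagonals(AD, LA, 9)
b = [sum(row) for row in A]  # A times a vector of ones
print(b)
```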

Example 2

This example finds the solution of the linear system Ax = b for the same sparse matrix A as in Example 1, which is stored in compressed-diagonal storage mode in arrays AD and LA. The system is solved using the two-term conjugate gradient method, preconditioned by an incomplete Cholesky factorization (IPARM(3) = 10). In this example, IOPT = 1, indicating that the matrix is symmetric, and only the main diagonal and one of each pair of identical diagonals are stored in array AD.

Call Statement and Input

           IOPT  M   ND  AD  LDA  LA   B   X   IPARM   RPARM   AUX1  NAUX1  AUX2  NAUX2
            |    |   |   |    |   |    |   |     |       |      |      |     |      |
CALL DSDCG( 1  , 9 , 2 , AD , 9 , LA , B , X , IPARM , RPARM , AUX1 , 283 , AUX2 , 80  )

IPARM(1) =  20
IPARM(2) =  0
IPARM(3) =  10
RPARM(1) =  1.D-7

        *           *
        | 2.0   0.0 |
        | 2.0   0.0 |
        | 2.0  -1.0 |
        | 2.0  -1.0 |
AD   =  | 2.0  -1.0 |
        | 2.0  -1.0 |
        | 2.0  -1.0 |
        | 2.0  -1.0 |
        | 2.0  -1.0 |
        *           *

LA       =  (0, -2)
B        =  (1.0, 1.0, 0.0, 0.0, 0.0, 0.0, 0.0, 1.0, 1.0)
X        =  (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)

Output

X        =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
IPARM(4) =  1
RPARM(2) =  0
RPARM(3) =  0.89D-16
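With IOPT = 1 only one diagonal of each symmetric pair is stored, so the other has to be mirrored when the matrix is expanded. The following Python sketch shows one way to do that for the input above, under the same assumed convention that row i of column j in AD holds A(i, i+LA(j)). The function name `expand_symmetric_diagonals` is hypothetical.

```python
def expand_symmetric_diagonals(ad, la, m):
    """Rebuild a dense symmetric matrix from AD and LA for iopt = 1.

    Each stored diagonal k also supplies the mirrored diagonal -k.
    """
    a = [[0.0] * m for _ in range(m)]
    for j, k in enumerate(la):
        for i in range(m):
            col = i + k
            if 0 <= col < m:
                a[i][col] = ad[i][j]
                a[col][i] = ad[i][j]  # mirror across the main diagonal
    return a

AD = [[2.0, 0.0], [2.0, 0.0], [2.0, -1.0], [2.0, -1.0], [2.0, -1.0],
      [2.0, -1.0], [2.0, -1.0], [2.0, -1.0], [2.0, -1.0]]
LA = [0, -2]

A = expand_symmetric_diagonals(AD, LA, 9)
b = [sum(row) for row in A]  # A times a vector of ones
print(b)
```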

DSMGCG--General Sparse Matrix Iterative Solve Using Compressed-Matrix Storage Mode

This subroutine solves a general sparse linear system of equations using an iterative algorithm, conjugate gradient squared or generalized minimum residual, with or without preconditioning by an incomplete LU factorization. The subroutine is suitable for positive real matrices--that is, when the symmetric part of the matrix, (A+A^T)/2, is positive definite. The sparse matrix is stored in compressed-matrix storage mode. Matrix A and vectors x and b are used:

Ax = b

where A, x, and b contain long-precision real numbers.
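The positive-real condition stated above can be checked numerically for a small dense matrix by forming the symmetric part (A+A^T)/2 and attempting a Cholesky factorization, which succeeds only for a positive definite matrix. The following Python sketch is a check you might run before calling the subroutine; it is not something DSMGCG does for you, and the function names are hypothetical. In floating-point arithmetic the result is only indicative for matrices that are close to singular.

```python
import math

def symmetric_part(a):
    """Return (A + A^T) / 2 for a square matrix given as a list of rows."""
    m = len(a)
    return [[0.5 * (a[i][j] + a[j][i]) for j in range(m)] for i in range(m)]

def is_positive_definite(s):
    """Attempt a Cholesky factorization of the symmetric matrix s."""
    m = len(s)
    l = [[0.0] * m for _ in range(m)]
    for j in range(m):
        d = s[j][j] - sum(l[j][k] ** 2 for k in range(j))
        if d <= 0.0:
            return False  # a non-positive pivot: s is not positive definite
        l[j][j] = math.sqrt(d)
        for i in range(j + 1, m):
            l[i][j] = (s[i][j] - sum(l[i][k] * l[j][k] for k in range(j))) / l[j][j]
    return True

def is_positive_real(a):
    return is_positive_definite(symmetric_part(a))

print(is_positive_real([[2.0, -1.0], [1.0, 2.0]]))
print(is_positive_real([[1.0, 3.0], [1.0, 1.0]]))
```

In the first matrix the off-diagonal terms cancel in the symmetric part, leaving a positive diagonal; in the second, the symmetric part has off-diagonal entries larger than its diagonal and the factorization breaks down.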

Notes:

This subroutine is provided only for migration purposes. You get better performance and a wider choice of algorithms if you use the DSRIS subroutine.
If your sparse matrix is stored by rows, as defined in "Storage-by-Rows", you should first use the utility subroutine DSRSM to convert your sparse matrix to compressed-matrix storage mode. See DSRSM--Convert a Sparse Matrix from Storage-by-Rows to Compressed-Matrix Storage Mode.

Syntax

Fortran	CALL DSMGCG (`m`, `nz`, `ac`, `ka`, `lda`, `b`, `x`, `iparm`, `rparm`, `aux1`, `naux1`, `aux2`, `naux2`)
C and C++	dsmgcg (`m`, `nz`, `ac`, `ka`, `lda`, `b`, `x`, `iparm`, `rparm`, `aux1`, `naux1`, `aux2`, `naux2`);
PL/I	CALL DSMGCG (`m`, `nz`, `ac`, `ka`, `lda`, `b`, `x`, `iparm`, `rparm`, `aux1`, `naux1`, `aux2`, `naux2`);

On Entry

m

is the order of the linear system Ax = b and the number of rows in sparse matrix A. Specified as: a fullword integer; m >= 0.

nz

is the maximum number of nonzero elements in each row of sparse matrix A. Specified as: a fullword integer; nz >= 0.

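ac

is the array, referred to as AC, containing the values of the nonzero elements of the sparse matrix, stored in compressed-matrix storage mode. Specified as: an lda by (at least) nz array, containing long-precision real numbers.

ka

is the array, referred to as KA, containing the column numbers of the matrix A elements stored in the corresponding positions in array AC. Specified as: an lda by (at least) nz array, containing fullword integers, where 1 <= (elements of KA) <= m.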

lda

is the leading dimension of the arrays specified for ac and ka. Specified as: a fullword integer; lda > 0 and lda >= m.

b

is the vector b of length m, containing the right-hand side of the matrix problem. Specified as: a one-dimensional array of (at least) length m, containing long-precision real numbers.

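x

is the vector x of length m, containing your initial guess of the solution of the linear system. Specified as: a one-dimensional array of (at least) length m, containing long-precision real numbers.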

iparm

is an array of parameters, IPARM(i), where:

IPARM(1) controls the number of iterations.
If IPARM(1) > 0, IPARM(1) is the maximum number of iterations allowed.
If IPARM(1) = 0, the following default values are used:

IPARM(1) = 300
IPARM(2) = 0
IPARM(3) = 10
RPARM(1) = 10^-6
IPARM(2) is the flag used to select the iterative procedure used in this subroutine.
If IPARM(2) = 0, the conjugate gradient squared method is used.
If IPARM(2) = k, the generalized minimum residual method, restarted after k steps, is used. Note that the size of the work area aux1 becomes larger as k increases. A value for k in the range of 5 to 10 is suitable for most problems.
IPARM(3) is the flag that determines whether the system is to be preconditioned by an incomplete LU factorization with no fill-in.
If IPARM(3) = 0, the system is not preconditioned.
If IPARM(3) = 10, the system is preconditioned by an incomplete LU factorization.
If IPARM(3) = -10, the system is preconditioned by an incomplete LU factorization, where the factorization matrix was computed in an earlier call to this subroutine and is stored in aux2.
IPARM(4), see 'On Return'.

Specified as: an array of (at least) length 4, containing fullword integers, where:

IPARM(1) >= 0

IPARM(2) >= 0

IPARM(3) = 0, 10, or -10

rparm

is an array of parameters, RPARM(i), where:

RPARM(1) > 0, is the relative accuracy epsilon used in the stopping criterion. The iterative procedure is stopped when:

||b-Ax||₂ / ||x||₂ < epsilon

RPARM(2) is reserved.

RPARM(3), see 'On Return'.

Specified as: a one-dimensional array of (at least) length 3, containing long-precision real numbers.

aux1

has the following meaning:

If naux1 = 0 and error 2015 is unrecoverable, aux1 is ignored.

Otherwise, it is a storage work area used by this subroutine, which is available for use by the calling program between calls to this subroutine. Its size is specified by naux1.

Specified as: an area of storage, containing long-precision real numbers.

naux1

is the size of the work area specified by aux1--that is, the number of elements in aux1. Specified as: a fullword integer, where:

If naux1 = 0 and error 2015 is unrecoverable, DSMGCG dynamically allocates the work area used by this subroutine. The work area is deallocated before control is returned to the calling program.

Otherwise, it must have at least the following value, where:

If IPARM(2) = 0, use naux1 >= 7m.

If IPARM(2) > 0, use naux1 >= (k+2)m+k(k+4)+1, where k = IPARM(2).

aux2

is the storage work area used by this subroutine. If IPARM(3) = -10, aux2 must contain the incomplete LU factorization of matrix A, computed in an earlier call to DSMGCG. The size of aux2 is specified by naux2.

Specified as: an area of storage, containing long-precision real numbers.

naux2

is the size of the work area specified by aux2--that is, the number of elements in aux2. Specified as: a fullword integer. When IPARM(3) = 10, naux2 must have at least the following value: naux2 >= 3+2m+1.5nz(m).
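The work-area sizes for DSMGCG can be computed up front from m, nz, and IPARM(2). The following Python sketch collects the formulas given above; the function names are hypothetical. It reads 1.5nz(m) as 1.5 times nz times m, and rounding a fractional result up to the next integer is an assumption of this sketch, not something the text states.

```python
import math

def dsmgcg_min_naux1(m, iparm2):
    """Minimum naux1: 7m for conjugate gradient squared (IPARM(2) = 0),
    otherwise (k+2)m + k(k+4) + 1 for GMRES restarted after k = IPARM(2) steps."""
    if iparm2 < 0:
        raise ValueError("IPARM(2) must be >= 0")
    if iparm2 == 0:
        return 7 * m
    k = iparm2
    return (k + 2) * m + k * (k + 4) + 1

def dsmgcg_min_naux2(m, nz):
    """Minimum naux2 when IPARM(3) = 10: 3 + 2m + 1.5*nz*m.

    Rounding up is an assumption made here because naux2 is an integer.
    """
    return math.ceil(3 + 2 * m + 1.5 * nz * m)

print(dsmgcg_min_naux1(9, 0))
print(dsmgcg_min_naux1(9, 5))
print(dsmgcg_min_naux2(9, 3))
```

The growth of naux1 with k is the reason a restart value in the range of 5 to 10 is usually a reasonable compromise.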

On Return

x

is the vector x of length m, containing the solution of the system Ax = b. Returned as: a one-dimensional array of (at least) length m, containing long-precision real numbers.

iparm

is an array of parameters, IPARM(i), where:

IPARM(1) is unchanged.

IPARM(2) is unchanged.

IPARM(3) is unchanged.

IPARM(4) contains the number of iterations performed by this subroutine.

Returned as: a one-dimensional array of length 4, containing fullword integers.

rparm

is an array of parameters, RPARM(i), where:

RPARM(1) is unchanged.

RPARM(2) is reserved.

RPARM(3) contains the estimate of the error of the solution. If the process converged, RPARM(3) <= RPARM(1).

Returned as: a one-dimensional array of length 3, containing long-precision real numbers.

aux2

is the storage work area used by this subroutine.

If IPARM(3) = 10, aux2 contains the incomplete LU factorization of matrix A.

If IPARM(3) = -10, aux2 is unchanged.

See "Notes" for additional information on aux2. Returned as: an area of storage, containing long-precision real numbers.

Notes

When IPARM(3) = -10, this subroutine uses the incomplete LU factorization in aux2, computed in an earlier call to this subroutine. When IPARM(3) = 10, this subroutine computes the incomplete LU factorization and stores it in aux2.
If you solve the same sparse linear system of equations several times with different right-hand sides using the preconditioned algorithm, specify IPARM(3) = 10 on the first invocation. The incomplete factorization is then stored in aux2. You can save computing time on subsequent calls by setting IPARM(3) = -10, so that the algorithm reuses the incomplete factorization computed the first time. For this reason, do not modify the contents of aux2 between calls.
Matrix A must have no common elements with vectors x and b; otherwise, results are unpredictable.
In the iterative solvers for sparse matrices, the relative accuracy epsilon (RPARM(1)) must be specified "reasonably" (10^-4 to 10^-8). The algorithm computes a sequence of approximate solution vectors x that converge to the solution. The iterative procedure is stopped when the norm of the residual is sufficiently small--that is, when:

||b-Ax||₂ / ||x||₂ < epsilon

As a result, if you specify a larger epsilon, the algorithm takes fewer iterations to converge to a solution. If you specify a smaller epsilon, the algorithm requires more iterations and computer time, but converges to a more precise solution. If the value you specify is unreasonably small, the algorithm may fail to converge within the number of iterations it is allowed to perform.
For a description of how sparse matrices are stored in compressed-matrix storage mode, see "Compressed-Matrix Storage Mode".
On output, array AC is not bitwise identical to what it was on input because the matrix A is scaled before starting the iterative process and is unscaled before returning control to the user.
You have the option of having the minimum required value for naux dynamically returned to your program. For details, see "Using Auxiliary Storage in ESSL".

Function

The linear system:

Ax = b

is solved using either the conjugate gradient squared method or the generalized minimum residual method, with or without preconditioning by an incomplete LU factorization, where:

A is a sparse matrix of order m, stored in compressed-matrix storage mode in arrays AC and KA.

x is a vector of length m.

b is a vector of length m.

See references [36], [80], and [82].

Error Conditions

Resource Errors

Error 2015 is unrecoverable, naux1 = 0, and unable to allocate work area.

Computational Errors

The following errors, with their corresponding return codes, can occur in this subroutine. For details on error handling, see "What Can You Do about ESSL Computational Errors?".

For error 2110, return code 1 indicates that the subroutine exceeded IPARM(1) iterations without converging. Vector x contains the approximate solution computed at the last iteration.
For error 2111, return code 2 indicates that aux2 contains an incorrect factorization. The subroutine has been called with IPARM(3) = -10, and aux2 is expected to contain an incomplete factorization of the input matrix A that was computed by a previous call to the subroutine with IPARM(3) = 10. This error indicates that aux2 has been modified since the last call to the subroutine, or that the input matrix is not the same as the one that was factored. If the default action has been overridden, the subroutine can be called again with the same parameters, except that IPARM(3) must be set to 0 or 10.
For error 2112, return code 3 indicates that the incomplete LU factorization of A could not be completed, because one pivot was 0.
For error 2116, return code 4 indicates that the matrix is singular, because all elements in one row of the matrix contain 0. Array AC is partially modified and does not represent the same matrix as on entry.

Input-Argument Errors

m < 0
lda < 1
lda < m
nz < 0
nz = 0 and m > 0
IPARM(1) < 0
IPARM(2) < 0
IPARM(3) <> 0, 10, or -10
RPARM(1) < 0
RPARM(2) < 0
Error 2015 is recoverable or naux1<>0, and naux1 is too small--that is, less than the minimum required value. Return code 5 is returned if error 2015 is recoverable.
naux2 is too small--that is, less than the minimum required value. Return code 5 is returned if error 2015 is recoverable.

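Example 1

This example finds the solution of the linear system Ax = b for the sparse matrix A, which is stored in compressed-matrix storage mode in arrays AC and KA. The system is solved using the conjugate gradient squared method. Matrix A is: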

        *                                                   *
        | 2.0  0.0   0.0  0.0   0.0   0.0   0.0   0.0   0.0 |
        | 0.0  2.0  -1.0  0.0   0.0   0.0   0.0   0.0   0.0 |
        | 0.0  1.0   2.0  0.0   0.0   0.0   0.0   0.0   0.0 |
        | 1.0  0.0   0.0  2.0  -1.0   0.0   0.0   0.0   0.0 |
        | 0.0  0.0   0.0  1.0   2.0  -1.0   0.0   0.0   0.0 |
        | 0.0  0.0   0.0  0.0   1.0   2.0  -1.0   0.0   0.0 |
        | 0.0  0.0   0.0  0.0   0.0   1.0   2.0  -1.0   0.0 |
        | 0.0  0.0   0.0  0.0   0.0   0.0   1.0   2.0  -1.0 |
        | 0.0  0.0   0.0  0.0   0.0   0.0   0.0   1.0   2.0 |
        *                                                   *

Note: For input matrix KA, ( . ) indicates any value between 1 and 9.

Call Statement and Input

             M   NZ   AC   KA   LDA   B   X   IPARM   RPARM   AUX1   NAUX1   AUX2   NAUX2
             |    |    |    |    |    |   |     |       |      |       |      |       |
CALL DSMGCG( 9 ,  3 , AC , KA ,  9  , B , X , IPARM , RPARM , AUX1 ,  63   , AUX2 ,   0 )

IPARM(1) =  20
IPARM(2) =  0
IPARM(3) =  0
RPARM(1) =  1.D-7

        *                 *
        | 2.0   0.0   0.0 |
        | 2.0  -1.0   0.0 |
        | 1.0   2.0   0.0 |
        | 1.0   2.0  -1.0 |
AC   =  | 1.0   2.0  -1.0 |
        | 1.0   2.0  -1.0 |
        | 1.0   2.0  -1.0 |
        | 1.0   2.0  -1.0 |
        | 1.0   2.0   0.0 |
        *                 *

        *         *
        | 1  .  . |
        | 2  3  . |
        | 2  3  . |
        | 1  4  5 |
KA   =  | 4  5  6 |
        | 5  6  7 |
        | 6  7  8 |
        | 7  8  9 |
        | 8  9  . |
        *         *

B        =  (2.0, 1.0, 3.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0)
X        =  (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)
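As an illustration of how AC and KA encode matrix A, the following C sketch multiplies a matrix held in compressed-matrix storage by a vector and checks that A times a vector of ones reproduces B from this example. It is not part of ESSL: the names cms_matvec and cms_example1_matches are hypothetical, the arrays are laid out in row-major order with zero-based C indexing (the Fortran interface is column-major), and each ( . ) placeholder in KA is filled with 1, which is one of the values the note above permits.

```c
#include <assert.h>

/* y = A*x for a sparse matrix in compressed-matrix storage.
 * ac[i*nz + j] is the j-th stored value of row i (row-major for this sketch);
 * ka[i*nz + j] is its 1-based column number, as in array KA.
 * Padding slots hold 0.0 in ac, so any column number from 1 to m works there. */
void cms_matvec(int m, int nz, const double *ac, const int *ka,
                const double *x, double *y)
{
    assert(m >= 0 && nz >= 0);
    for (int i = 0; i < m; i++) {
        double sum = 0.0;
        for (int j = 0; j < nz; j++) {
            int col = ka[i * nz + j];
            assert(col >= 1 && col <= m);
            sum += ac[i * nz + j] * x[col - 1];
        }
        y[i] = sum;
    }
}

/* Returns 1 if A*(1,...,1) equals B for the Example 1 data, else 0. */
int cms_example1_matches(void)
{
    static const double ac[9 * 3] = {
        2.0,  0.0,  0.0,   2.0, -1.0,  0.0,   1.0, 2.0,  0.0,
        1.0,  2.0, -1.0,   1.0,  2.0, -1.0,   1.0, 2.0, -1.0,
        1.0,  2.0, -1.0,   1.0,  2.0, -1.0,   1.0, 2.0,  0.0 };
    static const int ka[9 * 3] = {
        1, 1, 1,   2, 3, 1,   2, 3, 1,
        1, 4, 5,   4, 5, 6,   5, 6, 7,
        6, 7, 8,   7, 8, 9,   8, 9, 1 };
    static const double b[9] = {2.0, 1.0, 3.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0};
    double x[9], y[9];
    for (int i = 0; i < 9; i++) x[i] = 1.0;
    cms_matvec(9, 3, ac, ka, x, y);
    for (int i = 0; i < 9; i++)
        if (y[i] != b[i]) return 0;
    return 1;
}
```

Because B equals A times a vector of ones, the solver is expected to return a solution vector of ones, which agrees with the output shown for this example.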

Output

X        =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
IPARM(4) =  9
RPARM(3) =  0.150D-19

Example 2

This example finds the solution of the linear system Ax = b for the same sparse matrix A as in Example 1, which is stored in compressed-matrix storage mode in arrays AC and KA. The system is solved using the generalized minimum residual method, restarted after 5 steps and preconditioned with an incomplete LU factorization. Most of the input is the same as in Example 1.
Note: For input matrix KA, ( . ) indicates any value between 1 and 9.

Call Statement and Input

             M   NZ   AC   KA   LDA   B   X   IPARM   RPARM   AUX1   NAUX1   AUX2   NAUX2
             |    |    |    |    |    |   |     |       |      |       |      |       |
CALL DSMGCG( 9 ,  3 , AC , KA ,  9  , B , X , IPARM , RPARM , AUX1 ,  109  , AUX2 ,  46 )

IPARM(1) =  20
IPARM(2) =  5
IPARM(3) =  10
RPARM(1) =  1.D-7
AC       =(same as input AC in Example 1)
KA       =(same as input KA in Example 1)
B        =  (2.0, 1.0, 3.0, 2.0, 2.0, 2.0, 2.0, 2.0, 3.0)
X        =  (0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0)

Output

X        =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
IPARM(4) =  2
RPARM(3) =  0.290D-15

DSDGCG--General Sparse Matrix Iterative Solve Using Compressed-Diagonal Storage Mode

This subroutine solves a general sparse linear system of equations using an iterative algorithm, conjugate gradient squared or generalized minimum residual, with or without preconditioning by an incomplete LU factorization. The subroutine is suitable for positive real matrices--that is, when the symmetric part of the matrix, (A+A^T)/2, is positive definite. The sparse matrix is stored in compressed-diagonal storage mode. Matrix A and vectors x and b are used:

Ax = b

where A, x, and b contain long-precision real numbers.

Syntax

Fortran	CALL DSDGCG (`m`, `nd`, `ad`, `lda`, `la`, `b`, `x`, `iparm`, `rparm`, `aux1`, `naux1`, `aux2`, `naux2`)
C and C++	dsdgcg (`m`, `nd`, `ad`, `lda`, `la`, `b`, `x`, `iparm`, `rparm`, `aux1`, `naux1`, `aux2`, `naux2`);
PL/I	CALL DSDGCG (`m`, `nd`, `ad`, `lda`, `la`, `b`, `x`, `iparm`, `rparm`, `aux1`, `naux1`, `aux2`, `naux2`);

On Entry

m

is the order of the linear system Ax = b and the number of rows in sparse matrix A. Specified as: a fullword integer; m >= 0.

nd

is the number of nonzero diagonals stored in the columns of array AD, the number of columns in array AD, and the number of elements in array LA. Specified as: a fullword integer; it must have the following value, where:

If m > 0, then nd > 0.

If m = 0, then nd >= 0.

ad

is the array, referred to as AD, containing the values of the nonzero elements of the sparse matrix, stored in compressed-diagonal storage mode. Specified as: an lda by (at least) nd array, containing long-precision real numbers.

lda

is the leading dimension of the array specified for ad. Specified as: a fullword integer; lda > 0 and lda >= m.

la

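is the array, referred to as LA, containing the diagonal numbers of the diagonals stored in the corresponding columns of array AD. Specified as: a one-dimensional array of (at least) length nd, containing fullword integers, where 1-m <= (elements of LA) <= (m-1).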

b

is the vector b of length m, containing the right-hand side of the matrix problem. Specified as: a one-dimensional array of (at least) length m, containing long-precision real numbers.

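x

is the vector x of length m, containing the initial guess of the solution of the linear system. Specified as: a one-dimensional array of (at least) length m, containing long-precision real numbers.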

iparm

is an array of parameters, IPARM(i), where:

IPARM(1) controls the number of iterations.
If IPARM(1) > 0, IPARM(1) is the maximum number of iterations allowed.
If IPARM(1) = 0, the following default values are used:

IPARM(1) = 300
IPARM(2) = 0
IPARM(3) = 10
RPARM(1) = 10^-6

IPARM(2) is the flag used to select the iterative procedure used in this subroutine.
If IPARM(2) = 0, the conjugate gradient squared method is used.
If IPARM(2) = k, the generalized minimum residual method, restarted after k steps, is used. Note that the size of the work area aux1 becomes larger as k increases. A value for k in the range of 5 to 10 is suitable for most problems.
IPARM(3) is the flag that determines whether the system is to be preconditioned by an incomplete LU factorization with no fill-in.
If IPARM(3) = 0, the system is not preconditioned.
If IPARM(3) = 10, the system is preconditioned by an incomplete LU factorization.
If IPARM(3) = -10, the system is preconditioned by an incomplete LU factorization, where the factorization matrix was computed in an earlier call to this subroutine and is stored in aux2.
IPARM(4), see 'On Return'.

Specified as: an array of (at least) length 4, containing fullword integers, where:

IPARM(1) >= 0

IPARM(2) >= 0

IPARM(3) = 0, 10, or -10
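The IPARM settings described above can be collected in a small helper. The following C sketch is only an illustration: fill_iparm is a hypothetical name, not an ESSL routine, and C elements 0 through 3 stand for IPARM(1) through IPARM(4). It rejects the values that this section lists as invalid.

```c
#include <assert.h>

/* Fills iparm[0..3], which stand for IPARM(1)..IPARM(4).
 * Returns 0 on success, or -1 if a value violates the documented
 * constraints (IPARM(1) >= 0, IPARM(2) >= 0, IPARM(3) = 0, 10, or -10). */
int fill_iparm(int iparm[4], int max_iter, int restart_k, int precond)
{
    assert(iparm != 0);
    if (max_iter < 0 || restart_k < 0) return -1;
    if (precond != 0 && precond != 10 && precond != -10) return -1;
    iparm[0] = max_iter;   /* IPARM(1): 0 selects the documented defaults */
    iparm[1] = restart_k;  /* IPARM(2): 0 = conjugate gradient squared,
                              k > 0 = generalized minimum residual restarted
                              after k steps */
    iparm[2] = precond;    /* IPARM(3): 0, 10, or -10 */
    iparm[3] = 0;          /* IPARM(4): set by the subroutine on return */
    return 0;
}
```

For instance, fill_iparm(iparm, 20, 5, 10) corresponds to the settings used in Example 2 of this subroutine.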

rparm

is an array of parameters, RPARM(i), where:

If RPARM(1) > 0, it is the relative accuracy epsilon used in the stopping criterion. The iterative procedure is stopped when:

||b-Ax||₂ / ||x||₂ < epsilon

RPARM(2) is reserved.

RPARM(3), see 'On Return'.

Specified as: a one-dimensional array of (at least) length 3, containing long-precision real numbers.
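The stopping criterion above can be sketched in C as follows. This is an illustration, not the ESSL implementation: residual_small is a hypothetical helper, the matrix is dense and row-major for brevity, and squared norms are compared so that no square root is needed (equivalent when epsilon > 0). Treating x = 0 as not converged is an assumption of this sketch.

```c
#include <assert.h>

/* Returns 1 when ||b - A*x||_2 / ||x||_2 < eps, else 0.
 * a is a dense m-by-m matrix in row-major order (illustration only).
 * The test is evaluated as ||b - A*x||^2 < eps^2 * ||x||^2. */
int residual_small(int m, const double *a, const double *x,
                   const double *b, double eps)
{
    assert(m > 0 && eps > 0.0);
    double r2 = 0.0, x2 = 0.0;
    for (int i = 0; i < m; i++) {
        double ri = b[i];
        for (int j = 0; j < m; j++)
            ri -= a[i * m + j] * x[j];
        r2 += ri * ri;       /* accumulate squared residual norm */
        x2 += x[i] * x[i];   /* accumulate squared solution norm */
    }
    if (x2 == 0.0) return 0; /* ratio undefined for x = 0 (assumption) */
    return r2 < eps * eps * x2;
}
```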

aux1

has the following meaning:

If naux1 = 0 and error 2015 is unrecoverable, aux1 is ignored.

Otherwise, it is a storage work area used by this subroutine, which is available for use by the calling program between calls to this subroutine. Its size is specified by naux1.

Specified as: an area of storage, containing long-precision real numbers.

naux1

is the size of the work area specified by aux1--that is, the number of elements in aux1. Specified as: a fullword integer, where:

If naux1 = 0 and error 2015 is unrecoverable, DSDGCG dynamically allocates the work area used by this subroutine. The work area is deallocated before control is returned to the calling program.

Otherwise, naux1 > 0 and must have at least the following value, where:

If IPARM(2) = 0, use naux1 >= 7m.

If IPARM(2) > 0, use naux1 >= (k+2)m+k(k+4)+1, where k = IPARM(2).
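The two lower bounds for naux1 can be evaluated with a short helper. The following C sketch assumes a hypothetical function name, min_naux1, and simply applies the formulas above. It does not cover the case naux1 = 0, in which the work area is allocated dynamically.

```c
#include <assert.h>

/* Minimum naux1 for order m and k = IPARM(2):
 * 7m for the conjugate gradient squared method (k = 0),
 * (k+2)m + k(k+4) + 1 for the generalized minimum residual method (k > 0). */
long min_naux1(long m, long k)
{
    assert(m >= 0 && k >= 0);
    if (k == 0) return 7 * m;
    return (k + 2) * m + k * (k + 4) + 1;
}
```

With m = 9, this gives 63 for k = 0 and 109 for k = 5, which are the NAUX1 values used in Example 1 and Example 2 of this subroutine.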

aux2

is a storage work area used by this subroutine. If IPARM(3) = -10, aux2 must contain the incomplete LU factorization of matrix A, computed in an earlier call to DSDGCG. The size of aux2 is specified by naux2.

Specified as: an area of storage, containing long-precision real numbers.

naux2

is the size of the work area specified by aux2. Specified as: a fullword integer.

On Return

x

is the vector x of length m, containing the solution of the system Ax = b. Returned as: a one-dimensional array of (at least) length m, containing long-precision real numbers.

iparm

is an array of parameters, IPARM(i), where:

IPARM(1) is unchanged.

IPARM(2) is unchanged.

IPARM(3) is unchanged.

IPARM(4) contains the number of iterations performed by this subroutine.

Returned as: a one-dimensional array of length 4, containing fullword integers.

rparm

is an array of parameters, RPARM(i), where:

RPARM(1) is unchanged.

RPARM(2) is reserved.

RPARM(3) contains the estimate of the error of the solution. If the process converged, RPARM(3) <= RPARM(1).

Returned as: a one-dimensional array of length 3, containing long-precision real numbers.

aux2

is the storage work area used by this subroutine.

If IPARM(3) = 10, aux2 contains the incomplete LU factorization of matrix A.

If IPARM(3) = -10, aux2 is unchanged.

See "Notes" for additional information on aux2. Returned as: an area of storage, containing long-precision real numbers.

Notes

When IPARM(3) = -10, this subroutine uses the incomplete LU factorization in aux2, computed in an earlier call to this subroutine. When IPARM(3) = 10, this subroutine computes the incomplete LU factorization and stores it in aux2.
If you solve the same sparse linear system of equations several times with different right-hand sides, using the preconditioned algorithm, specify IPARM(3) = 10 on the first invocation. The incomplete factorization is stored in aux2. You may save computing time on subsequent calls by setting IPARM(3) = -10. In this way, the algorithm reutilizes the incomplete factorization that was computed the first time. Therefore, you should not modify the contents of aux2 between calls.
Matrix A must have no common elements with vectors x and b; otherwise, results are unpredictable.
In the iterative solvers for sparse matrices, the relative accuracy epsilon (RPARM(1)) must be specified "reasonably" (10^-4 to 10^-8). The algorithm computes a sequence of approximate solution vectors x that converge to the solution. The iterative procedure is stopped when the norm of the residual is sufficiently small--that is, when:

||b-Ax||₂ / ||x||₂ < epsilon

As a result, if you specify a larger epsilon, the algorithm takes fewer iterations to converge to a solution. If you specify a smaller epsilon, the algorithm requires more iterations and computer time, but converges to a more precise solution. If the value you specify is unreasonably small, the algorithm may fail to converge within the number of iterations it is allowed to perform.
For a description of how sparse matrices are stored in compressed-diagonal storage mode, see "Compressed-Diagonal Storage Mode".
On output, array AD is not bitwise identical to what it was on input, because matrix A is scaled before starting the iterative process and is unscaled before returning control to the user.
You have the option of having the minimum required value for naux dynamically returned to your program. For details, see "Using Auxiliary Storage in ESSL".

Function

The linear system:

Ax = b

is solved using either the conjugate gradient squared method or the generalized minimum residual method, with or without preconditioning by an incomplete LU factorization, where:

A is a sparse matrix of order m, stored in compressed-diagonal storage mode in arrays AD and LA.

x is a vector of length m.

b is a vector of length m.

See references [36], [80], and [82].

Error Conditions

Resource Errors

Error 2015 is unrecoverable, naux1 = 0, and unable to allocate work area.

Computational Errors

The following errors, with their corresponding return codes, can occur in this subroutine. For details on error handling, see "What Can You Do about ESSL Computational Errors?".

For error 2110, return code 1 indicates that the subroutine exceeded IPARM(1) iterations without converging. Vector x contains the approximate solution computed at the last iteration.
For error 2111, return code 2 indicates that aux2 contains an incorrect factorization. The subroutine has been called with IPARM(3) = -10, and aux2 contains an incomplete factorization of the input matrix A that was computed by a previous call to the subroutine when IPARM(3) = 10. This error indicates that aux2 has been modified since the last call to the subroutine, or that the input matrix is not the same as the one that was factored. If the default action has been overridden, the subroutine can be called again with the same parameters, with the exception of IPARM(3) = 0 or 10.
For error 2112, return code 3 indicates that the incomplete LU factorization of A could not be completed, because one pivot was 0.
For error 2116, return code 4 indicates that the matrix is singular, because all elements in one row of the matrix contain 0. Array AD is partially modified and does not represent the same matrix as on entry.

Input-Argument Errors

m < 0
lda < 1
lda < m
nd < 0
nd = 0 and m > 0
IPARM(1) < 0
IPARM(2) < 0
IPARM(3) <> 0, 10, or -10
RPARM(1) < 0
Error 2015 is recoverable or naux1<>0, and naux1 is too small--that is, less than the minimum required value. Return code 5 is returned if error 2015 is recoverable.
naux2 is too small--that is, less than the minimum required value. Return code 5 is returned if error 2015 is recoverable.

Example 1

This example finds the solution of the linear system Ax = b for the sparse matrix A, which is stored in compressed-diagonal storage mode in arrays AD and LA. The system is solved using the conjugate gradient squared method. Matrix A is:

        *                                                    *
        | 2.0  0.0  -1.0   0.0   0.0   0.0   0.0   0.0   0.0 |
        | 0.0  2.0   0.0  -1.0   0.0   0.0   0.0   0.0   0.0 |
        | 0.0  0.0   2.0   0.0  -1.0   0.0   0.0   0.0   0.0 |
        | 0.0  0.0   0.0   2.0   0.0  -1.0   0.0   0.0   0.0 |
        | 1.0  0.0   0.0   0.0   2.0   0.0  -1.0   0.0   0.0 |
        | 0.0  1.0   0.0   0.0   0.0   2.0   0.0  -1.0   0.0 |
        | 0.0  0.0   1.0   0.0   0.0   0.0   2.0   0.0  -1.0 |
        | 0.0  0.0   0.0   1.0   0.0   0.0   0.0   2.0   0.0 |
        | 0.0  0.0   0.0   0.0   1.0   0.0   0.0   0.0   2.0 |
        *                                                    *

Call Statement and Input

             M   ND   AD   LDA   LA   B   X   IPARM   RPARM   AUX1   NAUX1   AUX2   NAUX2
             |    |    |    |     |   |   |     |       |      |       |      |       |
CALL DSDGCG( 9 ,  3 , AD ,  9  , LA , B , X , IPARM , RPARM , AUX1 ,  63   , AUX2 ,   0 )

IPARM(1) =  20
IPARM(2) =  0
IPARM(3) =  0
RPARM(1) =  1.D-7

        *                *
        | 2.0  -1.0  0.0 |
        | 2.0  -1.0  0.0 |
        | 2.0  -1.0  0.0 |
        | 2.0  -1.0  0.0 |
AD   =  | 2.0  -1.0  1.0 |
        | 2.0  -1.0  1.0 |
        | 2.0  -1.0  1.0 |
        | 2.0   0.0  1.0 |
        | 2.0   0.0  1.0 |
        *                *

LA       =  (0, 2, -4)
B        =  (1, 1, 1, 1, 2, 2, 2, 3, 3)
X        =  (0, 0, 0, 0, 0, 0, 0, 0, 0)
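As an illustration of how AD and LA encode matrix A, the following C sketch multiplies a matrix held in compressed-diagonal storage by a vector and checks that A times a vector of ones reproduces B from this example. It is not part of ESSL: the names cds_matvec and cds_example1_matches are hypothetical, the arrays use row-major order and zero-based C indexing, and the layout rule in the comment (row i of column j of AD holds the element of row i on diagonal LA(j)) is inferred from the data of this example.

```c
#include <assert.h>

/* y = A*x for a sparse matrix in compressed-diagonal storage.
 * ad[i*nd + j] holds a(i, i + la[j]) in zero-based indexing (row-major for
 * this sketch); la[j] is the diagonal number of column j of AD.
 * Positions that fall outside the matrix are padding and are skipped. */
void cds_matvec(int m, int nd, const double *ad, const int *la,
                const double *x, double *y)
{
    assert(m > 0 && nd > 0);
    for (int i = 0; i < m; i++) y[i] = 0.0;
    for (int j = 0; j < nd; j++) {
        int k = la[j];
        assert(k >= 1 - m && k <= m - 1);
        for (int i = 0; i < m; i++) {
            int col = i + k;
            if (col >= 0 && col < m)
                y[i] += ad[i * nd + j] * x[col];
        }
    }
}

/* Returns 1 if A*(1,...,1) equals B for the Example 1 data, else 0. */
int cds_example1_matches(void)
{
    static const double ad[9 * 3] = {
        2.0, -1.0, 0.0,   2.0, -1.0, 0.0,   2.0, -1.0, 0.0,
        2.0, -1.0, 0.0,   2.0, -1.0, 1.0,   2.0, -1.0, 1.0,
        2.0, -1.0, 1.0,   2.0,  0.0, 1.0,   2.0,  0.0, 1.0 };
    static const int la[3] = {0, 2, -4};
    static const double b[9] = {1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 3.0, 3.0};
    double x[9], y[9];
    for (int i = 0; i < 9; i++) x[i] = 1.0;
    cds_matvec(9, 3, ad, la, x, y);
    for (int i = 0; i < 9; i++)
        if (y[i] != b[i]) return 0;
    return 1;
}
```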

Output

X        =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
IPARM(4) =  8
RPARM(3) =  0.308D-17

Example 2

This example finds the solution of the linear system Ax = b for the same sparse matrix A as in Example 1, which is stored in compressed-diagonal storage mode in arrays AD and LA. The system is solved using the generalized minimum residual method, restarted after 5 steps and preconditioned with an incomplete LU factorization. Most of the input is the same as in Example 1.

Call Statement and Input

             M   ND   AD   LDA   LA   B   X   IPARM   RPARM   AUX1   NAUX1   AUX2   NAUX2
             |    |    |    |     |   |   |     |       |      |       |      |       |
CALL DSDGCG( 9 ,  3 , AD ,  9  , LA , B , X , IPARM , RPARM , AUX1 ,  109  , AUX2 ,  46 )

IPARM(1) =  20
IPARM(2) =  5
IPARM(3) =  10
RPARM(1) =  1.D-7
AD       =(same as input AD in Example 1)
LA       =(same as input LA in Example 1)
B        =  (1, 1, 1, 1, 2, 2, 2, 3, 3)
X        =  (0, 0, 0, 0, 0, 0, 0, 0, 0)

Output

X        =  (1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0, 1.0)
IPARM(4) =  6
RPARM(3) =  0.250D-15

Linear Least Squares Subroutines

This section contains the linear least squares subroutine descriptions.

SGESVF and DGESVF--Singular Value Decomposition for a General Matrix

These subroutines compute the singular value decomposition of general matrix A in preparation for solving linear least squares problems. To compute the minimal norm linear least squares solution of AX is congruent to B, follow the call to these subroutines with a call to SGESVS or DGESVS, respectively.

Table 114. Data Types

A, B, s, aux	Subroutine
Short-precision real	SGESVF
Long-precision real	DGESVF

Syntax

Fortran	CALL SGESVF | DGESVF (`iopt`, `a`, `lda`, `b`, `ldb`, `nb`, `s`, `m`, `n`, `aux`, `naux`)
C and C++	sgesvf | dgesvf (`iopt`, `a`, `lda`, `b`, `ldb`, `nb`, `s`, `m`, `n`, `aux`, `naux`);
PL/I	CALL SGESVF | DGESVF (`iopt`, `a`, `lda`, `b`, `ldb`, `nb`, `s`, `m`, `n`, `aux`, `naux`);

On Entry

iopt

indicates the type of computation to be performed, where:

If iopt = 0 or 10, singular values are computed.

If iopt = 1 or 11, singular values and V are computed.

If iopt = 2 or 12, singular values, V, and U^TB are computed.

Specified as: a fullword integer; iopt = 0, 1, 2, 10, 11, or 12.

If iopt < 10, singular values are unordered.

If iopt >= 10, singular values are sorted in descending order and, if applicable, the columns of V and the rows of U^TB are swapped to correspond to the sorted singular values.
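The meaning of the six iopt values can be summarized in code. The following C sketch is only an illustration of the encoding described above. The type svf_options and the function decode_iopt are hypothetical names, not part of ESSL.

```c
#include <assert.h>

typedef struct {
    int want_v;    /* 1 if V is computed */
    int want_utb;  /* 1 if U^TB is computed */
    int sorted;    /* 1 if singular values are sorted in descending order */
} svf_options;

/* Fills *opt and returns 0 for iopt = 0, 1, 2, 10, 11, or 12.
 * Returns -1 for any other value. */
int decode_iopt(int iopt, svf_options *opt)
{
    assert(opt != 0);
    int sorted = iopt >= 10;
    int kind = sorted ? iopt - 10 : iopt;  /* 0, 1, or 2 when valid */
    if (kind < 0 || kind > 2) return -1;
    opt->want_v = kind >= 1;
    opt->want_utb = kind == 2;
    opt->sorted = sorted;
    return 0;
}
```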

a

is the m by n general matrix A, whose singular value decomposition is to be computed. Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 114.

lda

is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= max(m, n).

b

has the following meaning, where:

If iopt = 0, 1, 10, or 11, this argument is not used in the computation.

If iopt = 2 or 12, it is the m by nb matrix B.

Specified as: an ldb by (at least) nb array, containing numbers of the data type indicated in Table 114.

If this subroutine is followed by a call to SGESVS or DGESVS, B should contain the right-hand side of the linear least squares problem, AX is congruent to B. (The nb column vectors of B contain right-hand sides for nb distinct linear least squares problems.) However, if the matrix U^T is desired on output, B should be equal to the identity matrix of order m.

ldb

has the following meaning, where:

If iopt = 0, 1, 10, or 11, this argument is not used in the computation.

If iopt = 2 or 12, it is the leading dimension of the array specified for b.

Specified as: a fullword integer. It must have the following values, where:

If iopt = 0, 1, 10, or 11, ldb > 0.

If iopt = 2 or 12, ldb > 0 and ldb >= max(m, n).

nb

has the following meaning, where:

If iopt = 0, 1, 10, or 11, this argument is not used in the computation.

If iopt = 2 or 12, it is the number of columns in matrix B.

Specified as: a fullword integer; if iopt = 2 or 12, nb > 0.

s

See 'On Return'.

m

is the number of rows in matrices A and B. Specified as: a fullword integer; m >= 0.

n

is the number of columns in matrix A and the number of elements in vector s. Specified as: a fullword integer; n >= 0.

aux

has the following meaning:

If naux = 0 and error 2015 is unrecoverable, aux is ignored.

Otherwise, it is the storage work area used by this subroutine. Its size is specified by naux.

Specified as: an area of storage, containing numbers of the data type indicated in Table 114.

naux

is the size of the work area specified by aux--that is, the number of elements in aux. Specified as: a fullword integer, where:

If naux = 0 and error 2015 is unrecoverable, SGESVF and DGESVF dynamically allocate the work area used by the subroutine. The work area is deallocated before control is returned to the calling program.

Otherwise, it must have the following value, where:

If iopt = 0 or 10, naux >= n+max(m, n).

If iopt = 1 or 11, naux >= 2n+max(m, n).

If iopt = 2 or 12, naux >= 2n+max(m, n, nb).
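The lower bounds for naux can be evaluated with a short helper. The following C sketch assumes a hypothetical function name, min_naux_gesvf, and applies the three formulas above. It does not cover the case naux = 0, in which the work area is allocated dynamically.

```c
#include <assert.h>

static long max2(long a, long b) { return a > b ? a : b; }

/* Minimum naux for the given iopt, m, n, and nb.
 * nb is only used when iopt = 2 or 12. */
long min_naux_gesvf(int iopt, long m, long n, long nb)
{
    int kind = iopt >= 10 ? iopt - 10 : iopt;  /* 0, 1, or 2 when valid */
    assert(kind >= 0 && kind <= 2);
    assert(m >= 0 && n >= 0);
    long mx = max2(m, n);
    if (kind == 0) return n + mx;          /* iopt = 0 or 10 */
    if (kind == 1) return 2 * n + mx;      /* iopt = 1 or 11 */
    return 2 * n + max2(mx, nb);           /* iopt = 2 or 12 */
}
```

The results agree with the NAUX values used in the examples for this subroutine: 7 for iopt = 0 with m = 4 and n = 3, and 9 for iopt = 1 with m = n = 3.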

On Return

a

has the following meaning, where:

If iopt = 0 or 10, A is overwritten; that is, the original input is not preserved.

If iopt = 1, 2, 11, or 12, A contains the real orthogonal matrix V, of order n, in its first n rows and n columns. If iopt = 11 or 12, the columns of V are swapped to correspond to the sorted singular values. If m > n, rows n+1, n+2, ..., m of array A are overwritten; that is, the original input is not preserved.

Returned as: an lda by (at least) n array, containing numbers of the data type indicated in Table 114.

b

has the following meaning, where:

If iopt = 0, 1, 10, or 11, B is not used in the computation.

If iopt = 2 or 12, B is overwritten by the n by nb matrix U^TB.

If iopt = 12, the rows of U^TB are swapped to correspond to the sorted singular values. If m > n, rows n+1, n+2, ..., m of array B are overwritten; that is, the original input is not preserved.

Returned as: an ldb by (at least) nb array, containing numbers of the data type indicated in Table 114.

s

is the vector s of length n, containing the singular values of matrix A. Returned as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 114; s_i >= 0, where:

If iopt < 10, the singular values are unordered in s.

If iopt >= 10, the singular values are sorted in descending order in s; that is, s₁ >= s₂ >= ... >= s_n >= 0. If applicable, the columns of V and the rows of U^TB are swapped to correspond to the sorted singular values.
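The relationship between the unordered and the sorted results can be sketched in C. The code below is an illustration under stated assumptions, not the ESSL implementation: sort_svd_desc and sort_example4_gives_example5 are hypothetical names, matrices are row-major, and a simple selection sort is used. The demonstration function applies the permutation to the output values of Example 4 (iopt = 2) and compares the result with the output values of Example 5 (iopt = 12) for this subroutine.

```c
#include <assert.h>

static void swap_d(double *a, double *b) { double t = *a; *a = *b; *b = t; }

/* Sorts s[0..n-1] into descending order and applies the same permutation
 * to the columns of v (n by n) and the rows of utb (n by nb). */
void sort_svd_desc(int n, int nb, double *s, double *v, double *utb)
{
    assert(n >= 0 && nb >= 0);
    for (int i = 0; i + 1 < n; i++) {
        int p = i;                        /* index of largest remaining value */
        for (int j = i + 1; j < n; j++)
            if (s[j] > s[p]) p = j;
        if (p == i) continue;
        swap_d(&s[i], &s[p]);
        for (int r = 0; r < n; r++)       /* swap columns i and p of V */
            swap_d(&v[r * n + i], &v[r * n + p]);
        for (int c = 0; c < nb; c++)      /* swap rows i and p of U^TB */
            swap_d(&utb[i * nb + c], &utb[p * nb + c]);
    }
}

/* Returns 1 if sorting the Example 4 output yields the Example 5 output. */
int sort_example4_gives_example5(void)
{
    double s[2] = {0.773, 9.508};
    double v[4] = {0.922, -0.386, -0.386, -0.922};
    double utb[4] = {-1.310, -2.321, -13.867, -18.963};
    sort_svd_desc(2, 2, s, v, utb);
    return s[0] == 9.508 && s[1] == 0.773
        && v[0] == -0.386 && v[1] == 0.922
        && v[2] == -0.922 && v[3] == -0.386
        && utb[0] == -13.867 && utb[1] == -18.963
        && utb[2] == -1.310 && utb[3] == -2.321;
}
```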

Notes

The following items must have no common elements; otherwise, results are unpredictable: matrices A and B, vector s, and the data area specified for aux.
When you specify iopt = 0, 1, 10, or 11, you must also specify:
- A dummy argument for b
- A positive value for ldb
See "Example 1" and "Example 2".
You have the option of having the minimum required value for naux dynamically returned to your program. For details, see "Using Auxiliary Storage in ESSL".

Function

The singular value decomposition of a real general matrix is computed as follows:

A = U SIGMA V^T

where:

U^T U = V^T V = V V^T = I

A is an m by n real general matrix.

V is a real general orthogonal matrix of order n. On output, V overwrites the first n rows and n columns of A.

U^TB is an n by nb real general matrix. On output, U^TB overwrites the first n rows and nb columns of B.

SIGMA is an n by n real diagonal matrix. The diagonal elements of SIGMA are the singular values of A, returned in the vector s.

If m or n is equal to 0, no computation is performed.

One of the following algorithms is used:

Golub-Reinsch Algorithm (See pages 134 to 151 in reference [93].)
1. Reduce the real general matrix A to bidiagonal form using Householder transformations.
2. Iteratively reduce the bidiagonal form to diagonal form using a variant of the QR algorithm.
Chan Algorithm (See reference [13].)
1. Compute the QR decomposition of matrix A using Householder transformations; that is, A = QR.
2. Apply the Golub-Reinsch Algorithm to the matrix R.
  If R = XWY^T is the singular value decomposition of R, the singular value decomposition of matrix A is given by:
  
  A = QR = (QX) W Y^T
  
  where:
  
  U = QX
  SIGMA = W
  V = Y

Also, see references [13], [55], [72], and pages 134 to 151 in reference [93]. These algorithms have a tendency to generate underflows that may hurt overall performance. The system default is to mask underflow, which improves the performance of these subroutines.

Error Conditions

Resource Errors

Error 2015 is unrecoverable, naux = 0, and unable to allocate work area.

Computational Errors

Singular value (i) failed to converge after (x) iterations.

The singular values (s_j, j = n, n-1, ..., i+1) are correct. If iopt < 10, they are unordered. Otherwise, they are ordered.
a has been modified.
If iopt = 2 or 12, then b has been modified.
The return code is set to 1.
i and x can be determined at run time by use of the ESSL error-handling facilities. To obtain this information, you must use ERRSET to change the number of allowable errors for error code 2107 in the ESSL error option table; otherwise, the default value causes your program to terminate when this error occurs. See "What Can You Do about ESSL Computational Errors?".

Input-Argument Errors

iopt <> 0, 1, 2, 10, 11, or 12
lda <= 0
max(m, n) > lda
ldb <= 0 and iopt = 2 or 12
max(m, n) > ldb and iopt = 2 or 12
nb <= 0 and iopt = 2 or 12
m < 0
n < 0
Error 2015 is recoverable or naux<>0, and naux is too small--that is, less than the minimum required value. Return code 2 is returned if error 2015 is recoverable.

Example 1

This example shows how to find only the singular values, s, of a real long-precision general matrix A, where:

M is greater than N.
NAUX is greater than or equal to N+max(M, N) = 7.
LDB has been set to 1 to avoid a Fortran error message.
DUMMY is a placeholder for argument b, which is not used in the computation.
The singular values are returned in S.
On output, matrix A is overwritten; that is, the original input is not preserved.

Call Statement and Input

            IOPT  A  LDA    B    LDB  NB  S   M   N   AUX  NAUX
             |    |   |     |     |   |   |   |   |    |    |
CALL DGESVF( 0  , A , 4 , DUMMY , 1 , 0 , S , 4 , 3 , AUX , 7  )
 
        *                  *
        |  1.0   2.0   3.0 |
A    =  |  4.0   5.0   6.0 |
        |  7.0   8.0   9.0 |
        | 10.0  11.0  12.0 |
        *                  *

Output

S        =   (25.462, 1.291, 0.000)

Example 2

This example computes the singular values, s, of a real long-precision general matrix A and the matrix V, where:

M is equal to N.
NAUX is greater than or equal to 2N+max(M, N) = 9.
LDB has been set to 1 to avoid a Fortran error message.
DUMMY is a placeholder for argument b, which is not used in the computation.
The singular values are returned in S.
The matrix V is returned in A.

Call Statement and Input

            IOPT  A  LDA    B    LDB  NB  S   M   N   AUX  NAUX
             |    |   |     |     |   |   |   |   |    |    |
CALL DGESVF( 1  , A , 3 , DUMMY , 1 , 0 , S , 3 , 3 , AUX , 9  )
 
        *                *
        |  2.0  1.0  1.0 |
A    =  |  4.0  1.0  0.0 |
        | -2.0  2.0  1.0 |
        *                *

Output

        *                        *
        | -0.994   0.105  -0.041 |
A    =  | -0.112  -0.870   0.480 |
        | -0.015  -0.482  -0.876 |
        *                        *
 
S        =    (4.922, 2.724, 0.597)

Example 3

This example computes the singular values, s, and computes matrices V and U^TB in preparation for solving the underdetermined system AX is congruent to B, where:

M is less than N.
NAUX is greater than or equal to 2N+max(M, N, NB) = 9.
The singular values are returned in S.
The matrix V is returned in A.
The matrix U^TB is returned in B.

Call Statement and Input

            IOPT  A  LDA  B  LDB  NB  S   M   N   AUX  NAUX
             |    |   |   |   |   |   |   |   |    |    |
CALL DGESVF( 2  , A , 3 , B , 3 , 1 , S , 2 , 3 , AUX , 9  )

        *               *
        | 1.0  2.0  2.0 |
A    =  | 2.0  4.0  5.0 |
        |  .    .    .  |
        *               *

        *     *
        | 1.0 |
B    =  | 4.0 |
        |  .  |
        *     *

Output

        *                        *
        | -0.304  -0.894   0.328 |
A    =  | -0.608   0.447   0.656 |
        | -0.733   0.000  -0.680 |
        *                        *
 
        *        *
        | -4.061 |
B    =  |  0.000 |
        | -0.714 |
        *        *
 
S        =   (7.342, 0.000, 0.305)

Example 4

This example computes the singular values, s, and matrices V and U^TB in preparation for solving the overdetermined system AX is congruent to B, where:

M is greater than N.
NAUX is greater than or equal to 2N+max(M, N, NB) = 7.
The singular values are returned in S.
The matrix V is returned in A.
The matrix U^TB is returned in B.

Call Statement and Input

            IOPT  A  LDA  B  LDB  NB  S   M   N   AUX  NAUX
             |    |   |   |   |   |   |   |   |    |    |
CALL DGESVF( 2  , A , 3 , B , 3 , 2 , S , 3 , 2 , AUX , 7  )

        *          *
        | 1.0  4.0 |
A    =  | 2.0  5.0 |
        | 3.0  6.0 |
        *          *

        *           *
        | 7.0  10.0 |
B    =  | 8.0  11.0 |
        | 9.0  12.0 |
        *           *

Output

        *                *
        |  0.922  -0.386 |
A    =  | -0.386  -0.922 |
        |   .       .    |
        *                *

        *                  *
        |  -1.310   -2.321 |
B    =  | -13.867  -18.963 |
        |    .        .    |
        *                  *
 
S        =   (0.773, 9.508)

Example 5

This example computes the singular values, s, and matrices V and U^TB in preparation for solving the overdetermined system AX is congruent to B. The singular values are sorted in descending order, and the columns of V and the rows of U^TB are swapped to correspond to the sorted singular values.

M is greater than N.
NAUX is greater than or equal to 2N+max(M, N, NB) = 7.
The singular values are returned in S.
The matrix V is returned in A.
The matrix U^TB is returned in B.

Call Statement and Input

            IOPT  A  LDA  B  LDB  NB  S   M   N   AUX  NAUX
             |    |   |   |   |   |   |   |   |    |    |
CALL DGESVF( 12 , A , 3 , B , 3 , 2 , S , 3 , 2 , AUX , 7  )

        *          *
        | 1.0  4.0 |
A    =  | 2.0  5.0 |
        | 3.0  6.0 |
        *          *

        *           *
        | 7.0  10.0 |
B    =  | 8.0  11.0 |
        | 9.0  12.0 |
        *           *

Output

        *                *
        | -0.386   0.922 |
A    =  | -0.922  -0.386 |
        |   .       .    |
        *                *

        *                  *
        | -13.867  -18.963 |
B    =  |  -1.310   -2.321 |
        |   .         .    |
        *                  *
 
S        =   (9.508, 0.773)

SGESVS and DGESVS--Linear Least Squares Solution for a General Matrix Using the Singular Value Decomposition

These subroutines compute the minimal norm linear least squares solution of AX is congruent to B, where A is a general matrix, using the singular value decomposition computed by SGESVF or DGESVF.

Table 115. Data Types

V, UB, X, s, tau	Subroutine
Short-precision real	SGESVS
Long-precision real	DGESVS

Syntax

Fortran	CALL SGESVS | DGESVS (`v`, `ldv`, `ub`, `ldub`, `nb`, `s`, `x`, `ldx`, `m`, `n`, `tau`)
C and C++	sgesvs | dgesvs (`v`, `ldv`, `ub`, `ldub`, `nb`, `s`, `x`, `ldx`, `m`, `n`, `tau`);
PL/I	CALL SGESVS | DGESVS (`v`, `ldv`, `ub`, `ldub`, `nb`, `s`, `x`, `ldx`, `m`, `n`, `tau`);

On Entry

v

is the orthogonal matrix V of order n in the singular value decomposition of matrix A. It is produced by a preceding call to SGESVF or DGESVF, where it corresponds to output argument a.

Specified as: an ldv by (at least) n array, containing numbers of the data type indicated in Table 115.

ldv

is the leading dimension of the array specified for v. Specified as: a fullword integer; ldv > 0 and ldv >= n.

ub

is an n by nb matrix, containing U^TB. It is produced by a preceding call to SGESVF or DGESVF, where it corresponds to output argument b. On output, U^TB is overwritten; that is, the original input is not preserved.

Specified as: an ldub by (at least) nb array, containing numbers of the data type indicated in Table 115.

ldub

is the leading dimension of the array specified for ub. Specified as: a fullword integer; ldub > 0 and ldub >= n.

nb

is the number of columns in matrices X and U^TB. Specified as: a fullword integer; nb > 0.

s

is the vector s of length n, containing the singular values of matrix A. It is produced by a preceding call to SGESVF or DGESVF, where it corresponds to output argument s.

Specified as: a one-dimensional array of (at least) length n, containing numbers of the data type indicated in Table 115; s_i >= 0.

x

See 'On Return'.

ldx

is the leading dimension of the array specified for x. Specified as: a fullword integer; ldx > 0 and ldx >= n.

m

is the number of rows in matrix A. Specified as: a fullword integer; m >= 0.

n

is the number of columns in matrix A, the order of matrix V, the number of elements in vector s, the number of rows in matrix UB, and the number of rows in matrix X. Specified as: a fullword integer; n >= 0.

tau

is the error tolerance tau. Any singular values in vector s that are less than tau are treated as zeros when computing matrix X. Specified as: a number of the data type indicated in Table 115; tau >= 0. For more information on the values for tau, see "Notes".
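
The leading dimension arguments (ldv, ldub, and ldx) describe how each matrix is laid out in storage. The following is a minimal sketch, assuming the column-major (Fortran) storage convention, of how element (i, j) is located in an array whose leading dimension exceeds the number of rows actually used. The helper name colmajor_get and the 0.0 padding values are illustrative only and are not part of ESSL; the matrix is V from Example 2 of this subroutine, stored with ldv = 3 and n = 2.

```python
def colmajor_get(flat, ld, i, j):
    """Return element (i, j), 0-based, of a column-major array with leading dimension ld."""
    return flat[i + j * ld]


# V from DGESVS Example 2: n = 2 columns stored with ldv = 3.
# The third row is unused padding (shown as "." in the example); 0.0 is a placeholder.
LDV, N = 3, 2
v_flat = [0.922, -0.386, 0.0,    # column 1 of V plus one padding element
          -0.386, -0.922, 0.0]   # column 2 of V plus one padding element

# Recover the n by n matrix V as a list of rows, skipping the padding row.
v = [[colmajor_get(v_flat, LDV, i, j) for j in range(N)] for i in range(N)]
```

Consecutive columns are ldv elements apart, which is why ldv >= n is required.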

On Return

x

is an n by nb matrix, containing the minimal norm linear least squares solutions of AX is congruent to B. The nb column vectors of X contain minimal norm solution vectors for nb distinct linear least squares problems.

Returned as: an ldx by (at least) nb array, containing numbers of the data type indicated in Table 115.

Notes

V, X, s, and U^TB can have no common elements; otherwise the results are unpredictable.
In problems involving experimental data, tau should reflect the absolute accuracy of the matrix elements:

tau >= max(|DELTA_ij|)

where DELTA_ij are the errors in a_ij. In problems where the matrix elements are known exactly or are only affected by roundoff errors:

tau >= epsilon sqrt(mn) max(s_i)

where:

epsilon is equal to 0.11920E-06 for SGESVS and 0.22204D-15 for DGESVS.
s is a vector containing the singular values of matrix A.

For more information, see references [13], [55], [72], and pages 134 to 151 in reference [93].
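
The two choices of tau described in these notes can be sketched as follows. The roundoff rule used here, epsilon times sqrt(mn) times the largest singular value, is an assumption: it is the rule that reproduces the TAU values shown in Examples 1 and 2 of this subroutine. The function names are hypothetical and are not part of ESSL.

```python
import math

EPS_SHORT = 0.11920e-06   # epsilon quoted in the notes for SGESVS
EPS_LONG = 0.22204e-15    # epsilon quoted in the notes for DGESVS


def experimental_tau(delta):
    """Smallest tau satisfying tau >= max(|DELTA_ij|) for a matrix of errors."""
    return max(abs(d) for row in delta for d in row)


def roundoff_tau(s, m, n, eps=EPS_LONG):
    """Assumed roundoff rule: eps * sqrt(m*n) * largest singular value."""
    return eps * math.sqrt(m * n) * max(s)


# Singular values and dimensions taken from Examples 1 and 2.
tau_ex1 = roundoff_tau([7.342, 0.000, 0.305], m=2, n=3)
tau_ex2 = roundoff_tau([0.773, 9.508], m=3, n=2)
```

Under this assumption, tau_ex1 and tau_ex2 agree with the TAU inputs of the two examples to the four digits shown there.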

Function

The minimal norm linear least squares solution of AX is congruent to B, where A is a real general matrix, is computed using the singular value decomposition, produced by a preceding call to SGESVF or DGESVF. From SGESVF or DGESVF, the singular value decomposition of A is given by the following:

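A = U SIGMA V^T

The linear least squares solution X, for AX is congruent to B, is given by the following formula:

X = V SIGMA⁺ U^TB

where:

U^TB, V, and the singular values in s (the diagonal elements of SIGMA) are produced by SGESVF or DGESVF.

SIGMA⁺ is the diagonal matrix whose i-th diagonal element is 1/s_i, or 0 if s_i is treated as zero because it is less than tau.

If m or n is equal to 0, no computation is performed. See references [13], [55], [72], and pages 134 to 151 in reference [93].

These algorithms have a tendency to generate underflows that may hurt overall performance. The system default is to mask underflow, which improves the performance of these subroutines.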
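
The formula for X can be illustrated with a short sketch that scales each row of U^TB by the reciprocal of the corresponding singular value and then multiplies by V. This is not the ESSL implementation; the function name is hypothetical, matrices are plain lists of rows, and the guard against an exact zero singular value when tau = 0 is an added assumption. The input data is taken from Example 1 of this subroutine.

```python
def svd_least_squares(v, ub, s, tau):
    """Form X = V * SIGMA+ * (U^T B) from matrices given as lists of rows."""
    n = len(s)
    nb = len(ub[0]) if ub else 0
    # Diagonal of SIGMA+: singular values less than tau are treated as zero.
    # The extra test for an exact zero (an assumption) avoids dividing by zero.
    s_plus = [0.0 if (si < tau or si == 0.0) else 1.0 / si for si in s]
    # y = SIGMA+ * (U^T B): scale row k of U^T B by s_plus[k].
    y = [[s_plus[k] * ub[k][j] for j in range(nb)] for k in range(n)]
    # X = V * y
    return [[sum(v[i][k] * y[k][j] for k in range(n)) for j in range(nb)]
            for i in range(n)]


# Input from Example 1 (values as printed, to three decimal places).
V = [[-0.304, -0.894, 0.328],
     [-0.608, 0.447, 0.656],
     [-0.733, 0.000, -0.680]]
UB = [[-4.061], [0.000], [-0.714]]
S = [7.342, 0.000, 0.305]
TAU = 0.3993e-14

X = svd_least_squares(V, UB, S, TAU)
```

Because the inputs are rounded to three decimal places, X matches the output of Example 1 only to about two decimal places.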

Error Conditions

Computational Errors

None

Input-Argument Errors

ldv <= 0
n > ldv
ldub <= 0
n > ldub
ldx <= 0
n > ldx
nb <= 0
m < 0
n < 0
tau < 0
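
A caller can test for these conditions before invoking the subroutine. The sketch below mirrors the list above; the function name and the idea of returning the violated conditions as strings are illustrative choices, not ESSL behavior.

```python
def gesvs_argument_errors(ldv, ldub, nb, ldx, m, n, tau):
    """Return the input-argument error conditions violated by these values."""
    checks = [
        (ldv <= 0, "ldv <= 0"),
        (n > ldv, "n > ldv"),
        (ldub <= 0, "ldub <= 0"),
        (n > ldub, "n > ldub"),
        (ldx <= 0, "ldx <= 0"),
        (n > ldx, "n > ldx"),
        (nb <= 0, "nb <= 0"),
        (m < 0, "m < 0"),
        (n < 0, "n < 0"),
        (tau < 0, "tau < 0"),
    ]
    return [name for violated, name in checks if violated]


# Arguments of Example 1 (valid) and a deliberately invalid variation.
ok = gesvs_argument_errors(ldv=3, ldub=3, nb=1, ldx=3, m=2, n=3, tau=0.3993e-14)
bad = gesvs_argument_errors(ldv=2, ldub=3, nb=0, ldx=3, m=2, n=3, tau=-1.0)
```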

Example 1

This example finds the linear least squares solution for the underdetermined system AX is congruent to B, using the singular value decomposition computed by DGESVF. Matrix A is:

                       *               *
                       | 1.0  2.0  2.0 |
                       | 2.0  4.0  5.0 |
                       *               *

and matrix B is:

                       *     *
                       | 1.0 |
                       | 4.0 |
                       *     *

On output, matrix U^TB is overwritten.
Note: This example corresponds to Example 3 of DGESVF.

Call Statement and Input

             V  LDV  UB  LDUB  NB  S   X  LDX  M   N   TAU
             |   |   |    |    |   |   |   |   |   |    |
CALL DGESVS( V , 3 , UB , 3  , 1 , S , X , 3 , 2 , 3 , TAU )

        *                        *
        | -0.304  -0.894   0.328 |
V    =  | -0.608   0.447   0.656 |
        | -0.733   0.000  -0.680 |
        *                        *

        *        *
        | -4.061 |
UB   =  |  0.000 |
        | -0.714 |
        *        *
 
S        =  (7.342, 0.000, 0.305)
TAU      =  0.3993D-14

Output

        *        *
        | -0.600 |
X    =  | -1.200 |
        |  2.000 |
        *        *
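
The output above can be checked against the two properties of a minimal norm solution of an underdetermined system: AX reproduces B, and X has no component in the null space of A. In this sketch the right-hand side B = (1.0, 4.0) is an assumption taken from Example 1 of DGELLS, which uses the same matrix A and reports the same X. The null-space vector (2, -1, 0) is also supplied by hand and is proportional to the column of V that belongs to the zero singular value.

```python
A = [[1.0, 2.0, 2.0],
     [2.0, 4.0, 5.0]]
B = [1.0, 4.0]                # assumed right-hand side (from DGELLS Example 1)
X = [-0.600, -1.200, 2.000]   # output of this example
NULL_VEC = [2.0, -1.0, 0.0]   # assumed null-space vector of A


def matvec(a, x):
    """Multiply a matrix given as a list of rows by a vector."""
    return [sum(aij * xj for aij, xj in zip(row, x)) for row in a]


def dot(x, y):
    """Inner product of two vectors."""
    return sum(xi * yi for xi, yi in zip(x, y))


residual = [ax - b for ax, b in zip(matvec(A, X), B)]   # A X - B
null_check = matvec(A, NULL_VEC)                        # confirms A * NULL_VEC = 0
x_dot_null = dot(X, NULL_VEC)                           # component of X in the null space
```

A has rank 2 and three columns, so its null space is one-dimensional. A zero inner product with NULL_VEC therefore indicates that X lies entirely in the row space of A, which is the condition for minimal norm.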

Example 2

This example finds the linear least squares solution for the overdetermined system AX is congruent to B, using the singular value decomposition computed by DGESVF. Matrix A is:

                          *          *
                          | 1.0  4.0 |
                          | 2.0  5.0 |
                          | 3.0  6.0 |
                          *          *

and matrix B is:

                          *           *
                          | 7.0  10.0 |
                          | 8.0  11.0 |
                          | 9.0  12.0 |
                          *           *

On output, matrix U^TB is overwritten.
Note: This example corresponds to Example 4 of DGESVF.

Call Statement

             V  LDV  UB  LDUB  NB  S   X  LDX  M   N   TAU
             |   |   |    |    |   |   |   |   |   |    |
CALL DGESVS( V , 3 , UB , 3  , 2 , S , X , 2 , 3 , 2 , TAU )

Input

        *                *
        |  0.922  -0.386 |
V    =  | -0.386  -0.922 |
        |   .       .    |
        *                *

        *                  *
        |  -1.310   -2.321 |
UB   =  | -13.867  -18.963 |
        |    .        .    |
        *                  *
 
S        =  (0.773, 9.508)
TAU      =  0.5171D-14

Output

        *                *
X    =  | -1.000  -2.000 |
        |  2.000   3.000 |
        *                *

SGELLS and DGELLS--Linear Least Squares Solution for a General Matrix Using a QR Decomposition with Column Pivoting

These subroutines compute the minimal norm linear least squares solution of AX is congruent to B, using a QR decomposition with column pivoting.

Table 116. Data Types

A, B, X, rn, tau, aux	Subroutine
Short-precision real	SGELLS
Long-precision real	DGELLS

Syntax

Fortran	CALL SGELLS | DGELLS (`iopt`, `a`, `lda`, `b`, `ldb`, `x`, `ldx`, `rn`, `tau`, `m`, `n`, `nb`, `k`, `aux`, `naux`)
C and C++	sgells | dgells (`iopt`, `a`, `lda`, `b`, `ldb`, `x`, `ldx`, `rn`, `tau`, `m`, `n`, `nb`, `k`, `aux`, `naux`);
PL/I	CALL SGELLS | DGELLS (`iopt`, `a`, `lda`, `b`, `ldb`, `x`, `ldx`, `rn`, `tau`, `m`, `n`, `nb`, `k`, `aux`, `naux`);

On Entry

iopt

indicates the type of computation to be performed, where:

If iopt = 0, X is computed.

If iopt = 1, X and the Euclidean Norm of the residual vectors are computed.

Specified as: a fullword integer; iopt = 0 or 1.

a

is the m by n coefficient matrix A. On output, A is overwritten; that is, the original input is not preserved. Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 116.

lda

is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= m.

b

is the m by nb matrix B, containing the right-hand sides of the linear systems. The nb column vectors of B contain right-hand sides for nb distinct linear least squares problems. On output, B is overwritten; that is, the original input is not preserved.

Specified as: an ldb by (at least) nb array, containing numbers of the data type indicated in Table 116.

ldb

is the leading dimension of the array specified for b. Specified as: a fullword integer; ldb > 0 and ldb >= m.

x

See 'On Return'.

ldx

is the leading dimension of the array specified for x. Specified as: a fullword integer; ldx > 0 and ldx >= n.

rn

See 'On Return'.

tau

is the tolerance tau, used to determine the subset of the columns of A used in the solution. Specified as: a number of the data type indicated in Table 116; tau >= 0. For more information on how to select a value for tau, see "Notes".

m

is the number of rows in matrices A and B. Specified as: a fullword integer; m >= 0.

n

is the number of columns in matrix A and the number of rows in matrix X. Specified as: a fullword integer; n >= 0.

nb

is the number of columns in matrices B and X and the number of elements in vector rn. Specified as: a fullword integer; nb > 0.

k

See 'On Return'.

aux

has the following meaning:

If naux = 0 and error 2015 is unrecoverable, aux is ignored.

Otherwise, it is the storage work area used by this subroutine. Its size is specified by naux.

Specified as: an area of storage, containing numbers of the data type indicated in Table 116. On output, the contents of aux are overwritten.

naux

is the size of the work area specified by aux--that is, the number of elements in aux. Specified as: a fullword integer, where:

If naux = 0 and error 2015 is unrecoverable, SGELLS and DGELLS dynamically allocate the work area used by the subroutine. The work area is deallocated before control is returned to the calling program.

Otherwise, it must have the following value:

naux >= 3n+max(n, nb) for SGELLS

naux >= [ceiling(2.5n)+max(n, nb)] for DGELLS
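
The minimum work area size can be computed directly from n and nb. The following sketch evaluates the two formulas above; the function name and the long_precision flag are illustrative and are not part of ESSL.

```python
def min_naux(n, nb, long_precision):
    """Minimum naux for SGELLS (long_precision=False) or DGELLS (long_precision=True)."""
    if long_precision:
        # ceiling(2.5n) in integer arithmetic: (5n + 1) // 2
        return (5 * n + 1) // 2 + max(n, nb)
    return 3 * n + max(n, nb)


naux_example_1 = min_naux(n=3, nb=1, long_precision=True)
naux_example_2 = min_naux(n=2, nb=2, long_precision=True)
```

For the DGELLS examples later in this section, this gives 11 for n = 3, nb = 1 and 7 for n = 2, nb = 2, matching the NAUX values used in those call statements.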

On Return

x

is the solution matrix X, with n rows and nb columns, where:

If k <> 0, the nb column vectors of X contain minimal norm least squares solutions for nb distinct linear least squares problems. The elements in each solution vector correspond to the original columns of A.

If k = 0, the nb column vectors of X are set to 0.

Returned as: an ldx by (at least) nb array, containing numbers of the data type indicated in Table 116.

rn

is the vector rn of length nb, where:

If iopt = 0 or k = 0, rn is not used in the computation.

If iopt = 1, rn_i is the Euclidean Norm of the residual vector for the linear least squares problem defined by the i-th column vector of B.

Returned as: a one-dimensional array of (at least) length nb, containing numbers of the data type indicated in Table 116.

k

is the number of columns of matrix A used in the solution. Returned as: a fullword integer; k = (the number of diagonal elements of matrix R exceeding tau in magnitude).

Notes

In your C program, argument k must be passed by reference.
If ldb >= max(m, n), matrix X and matrix B can be the same; otherwise, matrix X and matrix B can have no common elements, or the results are unpredictable.
The following items must have no common elements; otherwise, results are unpredictable:
- Matrices A and X, vector rn, and the data area specified for aux
- Matrices A and B, vector rn, and the data area specified for aux.
If the relative uncertainty in the matrix B is rho, then:

tau >= rho||A||_F

See references [44], [59], and [72] for additional guidance on determining suitable values for tau.
When you specify iopt = 0, you must also specify a dummy argument for rn. For more details, see "Example 1".
You have the option of having the minimum required value for naux dynamically returned to your program. For details, see "Using Auxiliary Storage in ESSL".
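
The bound on tau given in these notes, tau >= rho||A||_F, can be evaluated as in the following sketch. The function names and the sample value of rho are hypothetical; the matrix is the one used in Example 2 of this subroutine.

```python
import math


def frobenius_norm(a):
    """Frobenius norm: square root of the sum of squares of all elements."""
    return math.sqrt(sum(x * x for row in a for x in row))


def tolerance_from_uncertainty(a, rho):
    """Smallest tau satisfying tau >= rho * ||A||_F."""
    return rho * frobenius_norm(a)


A = [[1.0, 4.0],
     [2.0, 5.0],
     [3.0, 6.0]]
RHO = 1.0e-6   # hypothetical relative uncertainty, for illustration only
tau = tolerance_from_uncertainty(A, RHO)
```

Here the sum of squares is 91, so tau is rho times the square root of 91.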

Function

The minimal norm linear least squares solution of AX is congruent to B is computed using a QR decomposition with column pivoting, where:

A is an m by n real general matrix.

B is an m by nb real general matrix.

X is an n by nb real general matrix.

Optionally, the Euclidean Norms of the residual vectors can be computed. Following are the steps involved in finding the minimal norm linear least squares solution of AX is congruent to B. A is decomposed, using Householder transformations and column pivoting, into the following form:

AP = QR

where:

P is a permutation matrix.

Q is an orthogonal matrix.

R is an upper triangular matrix.

k is the first index, where:

|r_(k+1,k+1)| <= tau

If k = n, the minimal norm linear least squares solution is obtained by solving RX = Q^TB and reordering X to correspond to the original columns of A.

If k < n, R has the following form:

         *            *
R    =   | R₁₁    R₁₂ |
         |  0      0  |
         *            *

where R₁₁ is a k by k upper triangular matrix, and the elements below row k are treated as zero.

To find the minimal norm linear least squares solution, it is necessary to zero the submatrix R₁₂ using Householder transformations. See references [44], [59], and [72].

If m or n is equal to 0, no computation is performed.

These algorithms have a tendency to generate underflows that may hurt overall performance. The system default is to mask underflow, which improves the performance of these subroutines.
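
The way column pivoting and tau together determine k can be sketched as follows. This sketch uses modified Gram-Schmidt orthogonalization with column pivoting instead of the Householder transformations used by these subroutines. The magnitudes of the diagonal elements of R are the same in exact arithmetic, but the numerical behavior differs. The function name is hypothetical and the code stops at the first diagonal element whose magnitude does not exceed tau.

```python
import math


def pivoted_qr_rank(a, tau):
    """Return (k, perm): number of columns used and the pivoted column order.

    a is an m by n matrix given as a list of rows.
    """
    m = len(a)
    n = len(a[0]) if m else 0
    cols = [[a[i][j] for i in range(m)] for j in range(n)]   # work on columns
    perm = list(range(n))
    k = 0
    for step in range(min(m, n)):
        # Column pivoting: bring the remaining column of largest norm forward.
        norms = [math.sqrt(sum(x * x for x in cols[j])) for j in range(step, n)]
        p = step + norms.index(max(norms))
        cols[step], cols[p] = cols[p], cols[step]
        perm[step], perm[p] = perm[p], perm[step]
        r_diag = norms[p - step]   # magnitude of the next diagonal element of R
        if r_diag <= tau:
            break                  # this and later diagonal elements are negligible
        k += 1
        # Remove the new orthonormal direction from the remaining columns.
        q = [x / r_diag for x in cols[step]]
        for j in range(step + 1, n):
            proj = sum(qi * xi for qi, xi in zip(q, cols[j]))
            cols[j] = [xi - proj * qi for xi, qi in zip(cols[j], q)]
    return k, perm


k_full, perm_full = pivoted_qr_rank([[1.0, 4.0], [2.0, 5.0], [3.0, 6.0]], 0.0)
k_deficient, _ = pivoted_qr_rank([[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]], 1e-12)
```

In the second call the columns are multiples of each other, so only one diagonal element exceeds the tolerance. A small positive tau is used there because roundoff can leave a tiny nonzero remainder that tau = 0 would accept.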

Error Conditions

Resource Errors

Error 2015 is unrecoverable, naux = 0, and unable to allocate work area.

Computational Errors

None

Input-Argument Errors

iopt <> 0 or 1
lda <= 0
m > lda
ldb <= 0
m > ldb
ldx <= 0
n > ldx
m < 0
n < 0
nb <= 0
tau < 0
Error 2015 is recoverable or naux<>0, and naux is too small--that is, less than the minimum required value. Return code 1 is returned if error 2015 is recoverable.

Example 1

This example solves the underdetermined system AX is congruent to B. On output, A and B are overwritten. DUMMY is used as a placeholder for argument rn, which is not used in the computation.

Call Statement and Input

            IOPT  A  LDA  B  LDB  X  LDX   RN     TAU   M   N   NB  K   AUX  NAUX
             |    |   |   |   |   |   |     |      |    |   |   |   |    |    |
CALL DGELLS( 0  , A , 2 , B , 2 , X , 3 , DUMMY , TAU , 2 , 3 , 1 , K , AUX , 11 )

        *               *
A    =  | 1.0  2.0  2.0 |
        | 2.0  4.0  5.0 |
        *               *
 
        *     *
B    =  | 1.0 |
        | 4.0 |
        *     *
 
TAU      =   0.0

Output

        *        *
        | -0.600 |
X    =  | -1.200 |
        |  2.000 |
        *        *
 
K        =   2

Example 2

This example solves the overdetermined system AX is congruent to B. On output, A and B are overwritten. DUMMY is used as a placeholder for argument rn, which is not used in the computation.

Call Statement and Input

            IOPT  A  LDA  B  LDB  X  LDX   RN     TAU   M   N   NB  K   AUX  NAUX
             |    |   |   |   |   |   |     |      |    |   |   |   |    |    |
CALL DGELLS( 0  , A , 3 , B , 3 , X , 2 , DUMMY , TAU , 3 , 2 , 2 , K , AUX , 7  )

        *          *
        | 1.0  4.0 |
A    =  | 2.0  5.0 |
        | 3.0  6.0 |
        *          *

        *           *
        | 7.0  10.0 |
B    =  | 8.0  11.0 |
        | 9.0  12.0 |
        *           *
 
TAU      =   0.0

Output

        *                *
X    =  | -1.000  -2.000 |
        |  2.000   3.000 |
        *                *
 
K        =   2

Example 3

This example solves the overdetermined system AX is congruent to B and computes the Euclidean Norms of the residual vectors. On output, A and B are overwritten.

Call Statement and Input

            IOPT  A  LDA  B  LDB  X  LDX  RN   TAU   M   N   NB  K   AUX  NAUX
             |    |   |   |   |   |   |   |     |    |   |   |   |    |    |
CALL DGELLS( 1  , A , 3 , B , 3 , X , 2 , RN , TAU , 3 , 2 , 2 , K , AUX , 7  )

        *           *
        | 1.1  -4.3 |
A    =  | 2.0  -5.0 |
        | 3.0  -6.0 |
        *           *

        *            *
        | -7.0  10.0 |
B    =  | -8.0  11.0 |
        | -9.0  12.0 |
        *            *
 
TAU      =   0.0

Output

        *               *
X    =  | 0.543  -1.360 |
        | 1.785  -2.699 |
        *               *

        *       *
RN   =  | 0.196 |
        | 0.275 |
        *       *
 
K        =   2
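
The output of this example can be cross-checked independently. The sketch below solves each least squares problem through the normal equations, a different technique from the QR decomposition used by DGELLS, and is suitable here only because A is small and has full column rank. It then computes the Euclidean norm of each residual vector. All function names are hypothetical.

```python
import math

A = [[1.1, -4.3],
     [2.0, -5.0],
     [3.0, -6.0]]
B = [[-7.0, 10.0],
     [-8.0, 11.0],
     [-9.0, 12.0]]


def solve_two_column_lsq(a, b_col):
    """Minimize ||a x - b_col|| for a full-rank m by 2 matrix via the normal equations."""
    g11 = sum(r[0] * r[0] for r in a)
    g12 = sum(r[0] * r[1] for r in a)
    g22 = sum(r[1] * r[1] for r in a)
    h1 = sum(r[0] * bi for r, bi in zip(a, b_col))
    h2 = sum(r[1] * bi for r, bi in zip(a, b_col))
    det = g11 * g22 - g12 * g12
    # Cramer's rule for the 2 by 2 system (A^T A) x = A^T b
    return [(g22 * h1 - g12 * h2) / det, (g11 * h2 - g12 * h1) / det]


def residual_norm(a, x, b_col):
    """Euclidean norm of a x - b_col."""
    return math.sqrt(sum((r[0] * x[0] + r[1] * x[1] - bi) ** 2
                         for r, bi in zip(a, b_col)))


columns = [[row[j] for row in B] for j in range(2)]          # columns of B
solutions = [solve_two_column_lsq(A, c) for c in columns]    # columns of X
norms = [residual_norm(A, x, c) for x, c in zip(solutions, columns)]
```

The entries of solutions correspond to the columns of X, and norms corresponds to RN. Both agree with the output above to the three decimal places shown.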

