Guide and Reference


Performance and Accuracy Considerations

  1. In ESSL, the SSCAL and DSCAL subroutines provide the fastest way to zero out contiguous (stride 1) arrays, by specifying incx = 1 and alpha = 0.

  2. Where possible, use the matrix-vector linear algebra subprograms, rather than the vector-scalar subprograms, to optimize performance. Because data is presented in matrices rather than vectors, a single ESSL subprogram can perform multiple operations.

  3. Where possible, use subprograms that do multiple computations, such as SNDOT and SNAXPY, rather than subprograms that do individual computations, such as SDOT and SAXPY. The combined subprograms give better performance.

  4. Many of the short-precision subprograms provide increased accuracy by accumulating results in long precision. This is noted in the functional description of each subprogram.

  5. In some of the subprograms, because implementation techniques vary to optimize performance, accuracy of the results may vary for different array sizes. In the subprograms in which this occurs, a general description of the implementation techniques is given in the functional description for each subprogram.

  6. To select the sparse matrix subroutine that gives you the best performance, you must consider the layout of the data in your matrix. From this, you can determine the most efficient storage mode for your sparse matrix. ESSL provides two versions of each of its sparse matrix-vector subroutines that you can use. One operates on sparse matrices stored in compressed-matrix storage mode, and the other operates on sparse matrices stored in compressed-diagonal storage mode. These two storage modes are described in "Sparse Matrix".

    Compressed-matrix storage mode is generally applicable. It should be used when each row of the matrix contains approximately the same number of nonzero elements. However, if the matrix has a special form--that is, where the nonzero elements are concentrated along a few diagonals--compressed-diagonal storage mode gives improved performance.

  7. There are some ESSL-specific rules that apply to the results of computations on the workstation processors using the ANSI/IEEE standards. For details, see "What Data Type Standards Are Used by ESSL, and What Exceptions Should You Know About?".

Vector-Scalar Subprograms

This section contains the vector-scalar subprogram descriptions.

ISAMAX, IDAMAX, ICAMAX, and IZAMAX--Position of the First or Last Occurrence of the Vector Element Having the Largest Magnitude

ISAMAX and IDAMAX find the position i of the first or last occurrence of a vector element having the maximum absolute value. ICAMAX and IZAMAX find the position i of the first or last occurrence of a vector element having the largest sum of the absolute values of the real and imaginary parts of the vector elements.

You get the position of the first or last occurrence of an element by specifying positive or negative stride, respectively, for vector x. Regardless of the stride, the position i is always relative to the location specified in the calling sequence for vector x (in argument x).

Table 36. Data Types
x                          Subprogram
Short-precision real       ISAMAX
Long-precision real        IDAMAX
Short-precision complex    ICAMAX
Long-precision complex     IZAMAX

Syntax

Fortran ISAMAX | IDAMAX | ICAMAX | IZAMAX (n, x, incx)
C and C++ isamax | idamax | icamax | izamax (n, x, incx);
PL/I ISAMAX | IDAMAX | ICAMAX | IZAMAX (n, x, incx);

On Entry

n

is the number of elements in vector x. Specified as: a fullword integer; n >= 0.

x

is the vector x of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 36.

incx

is the stride for vector x. Specified as: a fullword integer. It can have any value.

On Return

Function value

is the position i of the element in the array, where:

If incx >= 0, i is the position of the first occurrence.

If incx < 0, i is the position of the last occurrence.

Returned as: a fullword integer; 0 <= i <= n.

Note

Declare the ISAMAX, IDAMAX, ICAMAX, and IZAMAX functions in your program as returning a fullword integer value.

Function

ISAMAX and IDAMAX find the first element xk, where k is defined as the smallest index k, such that:

|xk| = max{|xj| for j = 1, n}

ICAMAX and IZAMAX find the first element xk, where k is defined as the smallest index k, such that:

|ak|+|bk| = max{|aj|+|bj| for j = 1, n}
where xk = (ak, bk)

By specifying a positive or negative stride for vector x, the first or last occurrence, respectively, is found in the array. The position i, returned as the value of the function, is always figured relative to the location specified in the calling sequence for vector x (in argument x). Therefore, depending on the stride specified for incx, i has the following values:

For incx >= 0, i = k
For incx < 0, i = n-k+1

See reference [73]. The result is returned as a function value. If n is 0, then 0 is returned as the value of the function.
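The stride rules above can be checked against a small reference model. The following Python sketch is not ESSL code: `strided` and `iamax_model` are hypothetical names, arrays are 0-based Python lists, and unused storage positions (shown as "." in the examples) are filled with 0.0. It takes |real part| + |imaginary part| as the magnitude, which reduces to the absolute value for real input.

```python
def strided(X, n, inc):
    """Return the n logical elements of the vector stored in X with stride inc."""
    if inc >= 0:
        return [X[j * inc] for j in range(n)]
    # Negative stride: logical element 1 sits at the far end of the storage.
    return [X[(n - 1 - j) * -inc] for j in range(n)]


def iamax_model(n, X, incx):
    """Model of ISAMAX/IDAMAX/ICAMAX/IZAMAX: 1-based array position."""
    if n < 0:
        raise ValueError("n must be >= 0")
    if n == 0:
        return 0
    mags = [abs(complex(v).real) + abs(complex(v).imag)
            for v in strided(X, n, incx)]
    k = mags.index(max(mags)) + 1          # smallest index k in vector order
    return k if incx >= 0 else n - k + 1   # map k back to the array position


# Data of Example 4 in this section: n = 8, incx = -2
X = [3.0, 0.0, 5.0, 0.0, -8.0, 0.0, 6.0, 0.0,
     8.0, 0.0, 4.0, 0.0, 8.0, 0.0, 2.0]
print(iamax_model(8, X, -2))   # 7
```

With the negative stride, the vector is read starting at X(15), the first 8.0 met is at k = 2, and n-k+1 = 7 is its position counted from the start of the array.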

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0

Example 1

This example shows a vector, x, with a stride of 1.

Function Reference and Input
               N   X   INCX
               |   |    |
IMAX = ISAMAX( 9 , X ,  1   )
 
X        =  (1.0, 2.0, 7.0, -8.0, -5.0, -10.0, -9.0, 10.0, 6.0)

Output
IMAX     =  6

Example 2

This example shows a vector, x, with a stride greater than 1.

Function Reference and Input
               N   X   INCX
               |   |    |
IMAX = ISAMAX( 5 , X ,  2   )
 
X        =  (1.0, . , 7.0, . , -5.0, . , -9.0, . , 6.0)

Output
IMAX     =  4

Example 3

This example shows a vector, x, with a stride of 0.

Function Reference and Input
               N   X   INCX
               |   |    |
IMAX = ISAMAX( 9 , X ,  0   )
 
X        =  (1.0, . , . , . , . , . , . , . , .)

Output
IMAX     =  1

Example 4

This example shows a vector, x, with a negative stride. Processing begins at element X(15), which is 2.0.

Function Reference and Input
               N   X   INCX
               |   |    |
IMAX = ISAMAX( 8 , X , -2   )
 
X        =  (3.0, . , 5.0, . , -8.0, . , 6.0, . , 8.0, . ,
             4.0, . , 8.0, . , 2.0)

Output
IMAX     =  7

Example 5

This example shows a vector, x, containing complex numbers and having a stride of 1.

Function Reference and Input
               N   X   INCX
               |   |    |
IMAX = ICAMAX( 5 , X ,  1   )
 
X        =  ((9.0 , 2.0) , (7.0 , -8.0) , (-5.0 , -10.0) , (-4.0 , 10.0),
             (6.0 , 3.0))

Output
IMAX     =  2

ISAMIN and IDAMIN--Position of the First or Last Occurrence of the Vector Element Having Minimum Absolute Value

These subprograms find the position i of the first or last occurrence of a vector element having the minimum absolute value.

You get the position of the first or last occurrence of an element by specifying positive or negative stride, respectively, for vector x. Regardless of the stride, the position i is always relative to the location specified in the calling sequence for vector x (in argument x).

Table 37. Data Types
x                          Subprogram
Short-precision real       ISAMIN
Long-precision real        IDAMIN

Syntax

Fortran ISAMIN | IDAMIN (n, x, incx)
C and C++ isamin | idamin (n, x, incx);
PL/I ISAMIN | IDAMIN (n, x, incx);

On Entry

n

is the number of elements in vector x. Specified as: a fullword integer; n >= 0.

x

is the vector x of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 37.

incx

is the stride for vector x. Specified as: a fullword integer. It can have any value.

On Return

Function value

is the position i of the element in the array, where:

If incx >= 0, i is the position of the first occurrence.

If incx < 0, i is the position of the last occurrence.

Returned as: a fullword integer; 0 <= i <= n.

Note

Declare the ISAMIN and IDAMIN functions in your program as returning a fullword integer value.

Function

These subprograms find the first element xk, where k is defined as the smallest index k, such that:

|xk| = min{|xj| for j = 1, n}

By specifying a positive or negative stride for vector x, the first or last occurrence, respectively, is found in the array. The position i, returned as the value of the function, is always figured relative to the location specified in the calling sequence for vector x (in argument x). Therefore, depending on the stride specified for incx, i has the following values:

For incx >= 0, i = k
For incx < 0, i = n-k+1

See reference [73]. The result is returned as a function value. If n is 0, then 0 is returned as the value of the function.
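Another way to read the rule is in array order: collect every array position that holds the minimum absolute value, then return the first one for a nonnegative stride and the last one for a negative stride. The Python sketch below illustrates this under stated assumptions; `isamin_model` is a hypothetical name, lists are 0-based, and it is a model of the documented behavior rather than the ESSL implementation.

```python
def isamin_model(n, X, incx):
    """Model of ISAMIN/IDAMIN: 1-based array position of the minimum |x|."""
    if n < 0:
        raise ValueError("n must be >= 0")
    if n == 0:
        return 0
    step = abs(incx)
    # Absolute values in storage order, regardless of the sign of incx.
    vals = [abs(X[p * step]) for p in range(n)]
    smallest = min(vals)
    positions = [p + 1 for p, v in enumerate(vals) if v == smallest]
    # First occurrence for incx >= 0, last occurrence for incx < 0.
    return positions[0] if incx >= 0 else positions[-1]


# Data of Examples 3 and 4 in this section (0.0 marks unused storage)
X = [2.0, 0.0, -1.0, 0.0, 4.0, 0.0, 1.0]
print(isamin_model(4, X, 2), isamin_model(4, X, -2))   # 2 4
```

This gives the same result as computing k in vector order and applying i = n-k+1, because a negative stride only reverses the order in which the same array elements are visited.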

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0

Example 1

This example shows a vector, x, with a stride of 1.

Function Reference and Input
               N   X   INCX
               |   |    |
IMIN = ISAMIN( 6 , X ,  1  )
 
X        =  (3.0, 4.0, 1.0, 8.0, 1.0, 3.0)

Output
IMIN     =  3

Example 2

This example shows a vector, x, with a stride greater than 1.

Function Reference and Input
               N   X   INCX
               |   |    |
IMIN = ISAMIN( 4 , X ,  2  )
 
X        =  (-3.0, . , -9.0, . , -8.0, . , 3.0)

Output
IMIN     =  1

Example 3

This example shows a vector, x, with a positive stride and two elements with the minimum absolute value. The position of the first occurrence is returned.

Function Reference and Input
               N   X   INCX
               |   |    |
IMIN = ISAMIN( 4 , X ,  2  )
 
X        =  (2.0, . , -1.0, . , 4.0, . , 1.0)

Output
IMIN     =  2

Example 4

This example shows a vector, x, with a negative stride and two elements with the minimum absolute value. The position of the last occurrence is returned. Processing begins at element X(7), which is 1.0.

Function Reference and Input
               N   X   INCX
               |   |    |
IMIN = ISAMIN( 4 , X , -2  )
 
X        =  (2.0, . , -1.0, . , 4.0, . , 1.0)

Output
IMIN     =  4

ISMAX and IDMAX--Position of the First or Last Occurrence of the Vector Element Having the Maximum Value

These subprograms find the position i of the first or last occurrence of a vector element having the maximum value.

You get the position of the first or last occurrence of an element by specifying positive or negative stride, respectively, for vector x. Regardless of the stride, the position i is always relative to the location specified in the calling sequence for vector x (in argument x).

Table 38. Data Types
x                          Subprogram
Short-precision real       ISMAX
Long-precision real        IDMAX

Syntax

Fortran ISMAX | IDMAX (n, x, incx)
C and C++ ismax | idmax (n, x, incx);
PL/I ISMAX | IDMAX (n, x, incx);

On Entry

n

is the number of elements in vector x. Specified as: a fullword integer; n >= 0.

x

is the vector x of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 38.

incx

is the stride for vector x. Specified as: a fullword integer. It can have any value.

On Return

Function value

is the position i of the element in the array, where:

If incx >= 0, i is the position of the first occurrence.

If incx < 0, i is the position of the last occurrence.

Returned as: a fullword integer; 0 <= i <= n.

Note

Declare the ISMAX and IDMAX functions in your program as returning a fullword integer value.

Function

These subprograms find the first element xk, where k is defined as the smallest index k, such that:

xk = max{xj for j = 1, n}

By specifying a positive or negative stride for vector x, the first or last occurrence, respectively, is found in the array. The position i, returned as the value of the function, is always figured relative to the location specified in the calling sequence for vector x (in argument x). Therefore, depending on the stride specified for incx, i has the following values:

For incx >= 0, i = k
For incx < 0, i = n-k+1

See reference [73]. The result is returned as a function value. If n is 0, then 0 is returned as the value of the function.
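The first-or-last behavior can also be pictured as a single pass over the array in which ties are resolved differently depending on the sign of the stride. The Python sketch below is one possible model, not the ESSL implementation; `ismax_model` is a hypothetical name and lists are 0-based.

```python
def ismax_model(n, X, incx):
    """Model of ISMAX/IDMAX: 1-based array position of the maximum value."""
    if n < 0:
        raise ValueError("n must be >= 0")
    if n == 0:
        return 0
    step = abs(incx)
    best_pos, best_val = 1, X[0]
    for p in range(1, n):                  # walk the array in storage order
        v = X[p * step]
        # A strictly larger value always wins. With a negative stride a tie
        # also wins, so the last occurrence in the array is kept.
        if v > best_val or (incx < 0 and v == best_val):
            best_pos, best_val = p + 1, v
    return best_pos


# Data of Examples 3 and 4 in this section (0.0 marks unused storage)
X = [2.0, 0.0, 4.0, 0.0, 4.0, 0.0, 1.0]
print(ismax_model(4, X, 2), ismax_model(4, X, -2))   # 2 3
```

Unlike ISAMAX, the comparison uses the signed values, so a large negative element is never selected as the maximum.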

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0

Example 1

This example shows a vector, x, with a stride of 1.

Function Reference and Input
              N   X  INCX
              |   |   |
IMAX = ISMAX( 6 , X , 1  )
 
X        =  (3.0, 4.0, 1.0, 8.0, 1.0, 8.0)

Output
IMAX     =  4

Example 2

This example shows a vector, x, with a stride greater than 1.

Function Reference and Input
              N   X  INCX
              |   |   |
IMAX = ISMAX( 4 , X , 2  )
 
X        =  (-3.0, . , 9.0, . , -8.0, . , 3.0)

Output
IMAX     =  2

Example 3

This example shows a vector, x, with a positive stride and two elements with the maximum value. The position of the first occurrence is returned.

Function Reference and Input
              N   X  INCX
              |   |   |
IMAX = ISMAX( 4 , X , 2  )
 
X        =  (2.0, . , 4.0, . , 4.0, . , 1.0)

Output
IMAX     =  2

Example 4

This example shows a vector, x, with a negative stride and two elements with the maximum value. The position of the last occurrence is returned. Processing begins at element X(7), which is 1.0.

Function Reference and Input
              N   X   INCX
              |   |    |
IMAX = ISMAX( 4 , X , -2  )
 
X        =  (2.0, . , 4.0, . , 4.0, . , 1.0)

Output
IMAX     =  3

ISMIN and IDMIN--Position of the First or Last Occurrence of the Vector Element Having Minimum Value

These subprograms find the position i of the first or last occurrence of a vector element having the minimum value.

You get the position of the first or last occurrence of an element by specifying positive or negative stride, respectively, for vector x. Regardless of the stride, the position i is always relative to the location specified in the calling sequence for vector x (in argument x).

Table 39. Data Types
x                          Subprogram
Short-precision real       ISMIN
Long-precision real        IDMIN

Syntax

Fortran ISMIN | IDMIN (n, x, incx)
C and C++ ismin | idmin (n, x, incx);
PL/I ISMIN | IDMIN (n, x, incx);

On Entry

n

is the number of elements in vector x. Specified as: a fullword integer; n >= 0.

x

is the vector x of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 39.

incx

is the stride for vector x. Specified as: a fullword integer. It can have any value.

On Return

Function value

is the position i of the element in the array, where:

If incx >= 0, i is the position of the first occurrence.

If incx < 0, i is the position of the last occurrence.

Returned as: a fullword integer; 0 <= i <= n.

Note

Declare the ISMIN and IDMIN functions in your program as returning a fullword integer value.

Function

These subprograms find the first element xk, where k is defined as the smallest index k, such that:

xk = min{xj for j = 1, n}

By specifying a positive or negative stride for vector x, the first or last occurrence, respectively, is found in the array. The position i, returned as the value of the function, is always figured relative to the location specified in the calling sequence for vector x (in argument x). Therefore, depending on the stride specified for incx, i has the following values:

For incx >= 0, i = k
For incx < 0, i = n-k+1

See reference [73]. The result is returned as a function value. If n is 0, then 0 is returned as the value of the function.
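Because ISMIN compares signed values while ISAMIN compares absolute values, the two can return different positions for the same data. The Python sketch below contrasts them for stride 1 only; `ismin_stride1` and `isamin_stride1` are hypothetical helper names, and the code relies on Python's `min` returning the first of several equal minima.

```python
def ismin_stride1(x):
    """1-based position of the first minimum value (stride 1 only)."""
    if not x:
        return 0
    return min(range(len(x)), key=lambda j: x[j]) + 1


def isamin_stride1(x):
    """1-based position of the first minimum absolute value (stride 1 only)."""
    if not x:
        return 0
    return min(range(len(x)), key=lambda j: abs(x[j])) + 1


# The four vector elements used in Example 2 of ISMIN and of ISAMIN
x = [-3.0, -9.0, -8.0, 3.0]
print(ismin_stride1(x), isamin_stride1(x))   # 2 1
```

Here -9.0 is the smallest value (position 2), while -3.0 and 3.0 share the smallest magnitude, so the magnitude-based search returns position 1.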

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0

Example 1

This example shows a vector, x, with a stride of 1.

Function Reference and Input
              N   X  INCX
              |   |   |
IMIN = ISMIN( 6 , X , 1  )
 
X        =  (3.0, 4.0, 1.0, 8.0, 1.0, 3.0)

Output
IMIN     =  3

Example 2

This example shows a vector, x, with a stride greater than 1.

Function Reference and Input
              N   X  INCX
              |   |   |
IMIN = ISMIN( 4 , X , 2  )
 
X        =  (-3.0, . , -9.0, . , -8.0, . , 3.0)

Output
IMIN     =  2

Example 3

This example shows a vector, x, with a positive stride and two elements with the minimum value. The position of the first occurrence is returned.

Function Reference and Input
              N   X  INCX
              |   |   |
IMIN = ISMIN( 4 , X , 2  )
 
X        =  (2.0, . , 1.0, . , 4.0, . , 1.0)

Output
IMIN     =  2

Example 4

This example shows a vector, x, with a negative stride and two elements with the minimum value. The position of the last occurrence is returned. Processing begins at element X(7), which is 1.0.

Function Reference and Input
              N   X   INCX
              |   |    |
IMIN = ISMIN( 4 , X , -2  )
 
X        =  (2.0, . , 1.0, . , 4.0, . , 1.0)

Output
IMIN     =  4

SASUM, DASUM, SCASUM, and DZASUM--Sum of the Magnitudes of the Elements in a Vector

SASUM and DASUM compute the sum of the absolute values of the elements in vector x. SCASUM and DZASUM compute the sum of the absolute values of the real and imaginary parts of the elements in vector x.

Table 40. Data Types
x                          Result                  Subprogram
Short-precision real       Short-precision real    SASUM
Long-precision real        Long-precision real     DASUM
Short-precision complex    Short-precision real    SCASUM
Long-precision complex     Long-precision real     DZASUM

Syntax

Fortran SASUM | DASUM | SCASUM | DZASUM (n, x, incx)
C and C++ sasum | dasum | scasum | dzasum (n, x, incx);
PL/I SASUM | DASUM | SCASUM | DZASUM (n, x, incx);

On Entry

n

is the number of elements in vector x. Specified as: a fullword integer; n >= 0.

x

is the vector x of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 40.

incx

is the stride for vector x. Specified as: a fullword integer. It can have any value.

On Return

Function value

is the result of the summation. Returned as: a number of the data type indicated in Table 40.

Note

Declare this function in your program as returning a value of the type indicated in Table 40.

Function

SASUM and DASUM compute the sum of the absolute values of the elements of x, which is expressed as follows:

result = |x1| + |x2| + ... + |xn|

SCASUM and DZASUM compute the sum of the absolute values of the real and imaginary parts of the elements of x, which is expressed as follows:

result = (|a1| + |b1|) + (|a2| + |b2|) + ... + (|an| + |bn|)
where xj = (aj, bj)

See reference [73]. The result is returned as a function value. If n is 0, then 0.0 is returned as the value of the function. For SASUM and SCASUM, intermediate results are accumulated in long precision.
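As a cross-check of the summation and of the stride handling, the following Python sketch models the computation. It is an illustration only: `asum_model` is a hypothetical name, lists are 0-based, and Python floats are double precision, so the model does not reproduce short-precision rounding of the final result.

```python
def asum_model(n, X, incx):
    """Model of SASUM/DASUM/SCASUM/DZASUM: sum of |real| + |imaginary| parts."""
    if n < 0:
        raise ValueError("n must be >= 0")
    step = abs(incx)          # the sign of the stride does not change the sum
    total = 0.0
    for j in range(n):
        z = complex(X[j * step])
        total += abs(z.real) + abs(z.imag)
    return total


# Stride 0 revisits the same element n times: 7 * |-2.0|
print(asum_model(7, [-2.0], 0))   # 14.0
```

The same function covers the complex subprograms because a real element simply has a zero imaginary part.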

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0

Example 1

This example shows a vector, x, with a stride of 1.

Function Reference and Input
              N   X  INCX
              |   |   |
SUMM = SASUM( 7 , X , 1  )
 
X        =  (1.0, -3.0, -6.0, 7.0, 5.0, 2.0, -4.0)

Output
SUMM     =  28.0

Example 2

This example shows a vector, x, with a stride greater than 1.

Function Reference and Input
              N   X  INCX
              |   |   |
SUMM = SASUM( 4 , X , 2  )
 
X        =  (1.0, . , -6.0, . , 5.0, . , -4.0)

Output
SUMM     =  16.0

Example 3

This example shows a vector, x, with negative stride. Processing begins at element X(7), which is -4.0.

Function Reference and Input
              N   X   INCX
              |   |    |
SUMM = SASUM( 4 , X , -2  )
 
X        =  (1.0, . , -6.0, . , 5.0, . , -4.0)

Output
SUMM     =  16.0

Example 4

This example shows a vector, x, with a stride of 0. The result in SUMM is n × |x1|.

Function Reference and Input
              N   X  INCX
              |   |   |
SUMM = SASUM( 7 , X , 0  )
 
X        =  (-2.0, . , . , . , . , . , .)

Output
SUMM     =  14.0

Example 5

This example shows a vector, x, containing complex numbers and having a stride of 1.

Function Reference and Input
               N   X  INCX
               |   |   |
SUMM = SCASUM( 5 , X , 1  )
 
X        =  ((1.0, 2.0), (-3.0, 4.0), (5.0, -6.0 ), (-7.0, -8.0),
             (9.0, 10.0))

Output
SUMM     =  55.0

SAXPY, DAXPY, CAXPY, and ZAXPY--Multiply a Vector X by a Scalar, Add to a Vector Y, and Store in the Vector Y

These subprograms perform the following computation, using the scalar alpha and vectors x and y:

y <-- y + alpha × x

Table 41. Data Types
alpha, x, y                Subprogram
Short-precision real       SAXPY
Long-precision real        DAXPY
Short-precision complex    CAXPY
Long-precision complex     ZAXPY

Syntax

Fortran CALL SAXPY | DAXPY | CAXPY | ZAXPY (n, alpha, x, incx, y, incy)
C and C++ saxpy | daxpy | caxpy | zaxpy (n, alpha, x, incx, y, incy);
PL/I CALL SAXPY | DAXPY | CAXPY | ZAXPY (n, alpha, x, incx, y, incy);

On Entry

n

is the number of elements in vectors x and y. Specified as: a fullword integer; n >= 0.

alpha

is the scalar alpha. Specified as: a number of the data type indicated in Table 41.

x

is the vector x of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 41.

incx

is the stride for vector x. Specified as: a fullword integer. It can have any value.

y

is the vector y of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incy|, containing numbers of the data type indicated in Table 41.

incy

is the stride for vector y. Specified as: a fullword integer. It can have any value.

On Return

y

is the vector y, containing the results of the computation y + alpha × x. Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 41.

Notes

  1. If you specify the same vector for x and y, incx and incy must be equal; otherwise, results are unpredictable.

  2. If you specify different vectors for x and y, they must have no common elements; otherwise, results are unpredictable. See "Concepts".

Function

The computation is expressed as follows:

yi <-- yi + alpha × xi    for i = 1, n


See reference [73]. If alpha or n is zero, no computation is performed. For CAXPY, intermediate results are accumulated in long precision.
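The effect of positive, negative, and zero strides on the update can be traced with a small model. The Python sketch below is an illustration under stated assumptions, not ESSL code: `first_index` and `axpy_model` are hypothetical names, lists are 0-based, and the list Y is updated in place.

```python
def first_index(n, inc):
    """0-based storage index of logical element 1 for a vector of n elements."""
    return 0 if inc >= 0 else (n - 1) * -inc


def axpy_model(n, alpha, X, incx, Y, incy):
    """Model of SAXPY/DAXPY/CAXPY/ZAXPY: y <-- y + alpha * x, in place."""
    if n < 0:
        raise ValueError("n must be >= 0")
    if n == 0 or alpha == 0:
        return Y                      # no computation is performed
    ix, iy = first_index(n, incx), first_index(n, incy)
    for _ in range(n):
        Y[iy] += alpha * X[ix]
        ix += incx                    # a stride of 0 keeps reusing X[ix]
        iy += incy                    # a negative stride walks backward
    return Y


# Data of Example 2 in this section: incx = 1, incy = -1
Y = axpy_model(5, 2.0, [1.0, 2.0, 3.0, 4.0, 5.0], 1,
               [5.0, 4.0, 3.0, 2.0, 1.0], -1)
print(Y)   # [15.0, 12.0, 9.0, 6.0, 3.0]
```

With incy = -1 the first update lands on Y(5), so x1 is paired with the last stored element of Y.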

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0

Example 1

This example shows vectors x and y with positive strides.

Call Statement and Input
            N  ALPHA  X  INCX  Y  INCY
            |    |    |   |    |   |
CALL SAXPY( 5 , 2.0 , X , 1  , Y , 2  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (1.0, . , 1.0, . , 1.0, . , 1.0, . , 1.0)

Output
Y        =  (3.0, . , 5.0, . , 7.0, . , 9.0, . , 11.0)

Example 2

This example shows vectors x and y having strides of opposite signs. For y, which has negative stride, processing begins at element Y(5), which is 1.0.

Call Statement and Input
            N  ALPHA  X  INCX  Y   INCY
            |    |    |   |    |    |
CALL SAXPY( 5 , 2.0 , X , 1  , Y , -1  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (5.0, 4.0, 3.0, 2.0, 1.0)

Output
Y        =  (15.0, 12.0, 9.0, 6.0, 3.0)

Example 3

This example shows a vector, x, with 0 stride. Vector x is treated like a vector of length n, all of whose elements are the same as the single element in x.

Call Statement and Input
            N  ALPHA  X  INCX  Y  INCY
            |    |    |   |    |   |
CALL SAXPY( 5 , 2.0 , X , 0  , Y , 1  )
 
X        =  (1.0)
Y        =  (5.0, 4.0, 3.0, 2.0, 1.0)

Output
Y        =  (7.0, 6.0, 5.0, 4.0, 3.0)

Example 4

This example shows how SAXPY can be used to compute a scalar value. In this case, vectors x and y contain scalar values and the strides for both vectors are 0. The number of elements to be processed, n, is 1.

Call Statement and Input
            N  ALPHA  X  INCX  Y  INCY
            |    |    |   |    |   |
CALL SAXPY( 1 , 2.0 , X , 0  , Y , 0  )
 
X        =  (1.0)
Y        =  (5.0)

Output
Y        =  (7.0)

Example 5

This example shows how to use CAXPY, where vectors x and y contain complex numbers. In this case, vectors x and y have positive strides.

Call Statement and Input
            N  ALPHA  X  INCX  Y  INCY
            |    |    |   |    |   |
CALL CAXPY( 3 ,ALPHA, X , 1  , Y , 2  )
 
ALPHA    =  (2.0, 3.0)
X        =  ((1.0, 2.0), (2.0, 0.0), (3.0, 5.0))
Y        =  ((1.0, 1.0 ), . , (0.0, 2.0), . , (5.0, 4.0))

Output
Y        =  ((-3.0, 8.0), . , (4.0, 8.0), . , (-4.0, 23.0))

SCOPY, DCOPY, CCOPY, and ZCOPY--Copy a Vector

These subprograms copy vector x to another vector, y:

y <-- x

Table 42. Data Types
x, y                       Subprogram
Short-precision real       SCOPY
Long-precision real        DCOPY
Short-precision complex    CCOPY
Long-precision complex     ZCOPY

Syntax

Fortran CALL SCOPY | DCOPY | CCOPY | ZCOPY (n, x, incx, y, incy)
C and C++ scopy | dcopy | ccopy | zcopy (n, x, incx, y, incy);
PL/I CALL SCOPY | DCOPY | CCOPY | ZCOPY (n, x, incx, y, incy);

On Entry

n

is the number of elements in vectors x and y. Specified as: a fullword integer; n >= 0.

x

is the vector x of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 42.

incx

is the stride for vector x. Specified as: a fullword integer. It can have any value.

y

See 'On Return'.

incy

is the stride for vector y. Specified as: a fullword integer. It can have any value.

On Return

y

is the vector y of length n. Returned as: a one-dimensional array of (at least) length 1+(n-1)|incy|, containing numbers of the data type indicated in Table 42.

Notes

  1. If you specify the same vector for x and y, incx and incy must be equal; otherwise, results are unpredictable.

  2. If you specify different vectors for x and y, they must have no common elements; otherwise, results are unpredictable. See "Concepts".

Function

The copy is expressed as follows:

yi <-- xi    for i = 1, n


See reference [73]. If n is 0, no copy is performed.
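Two common uses of the stride arguments, reversing a vector and replicating one element, can be traced with a small model. The Python sketch below is illustrative only; `copy_model` is a hypothetical name, lists are 0-based, and Y is assumed to be preallocated with at least 1+(n-1)|incy| entries.

```python
def copy_model(n, X, incx, Y, incy):
    """Model of SCOPY/DCOPY/CCOPY/ZCOPY: y <-- x, written into Y in place."""
    if n < 0:
        raise ValueError("n must be >= 0")
    # Storage index of logical element 1 depends on the sign of the stride.
    ix = 0 if incx >= 0 else (n - 1) * -incx
    iy = 0 if incy >= 0 else (n - 1) * -incy
    for _ in range(n):
        Y[iy] = X[ix]
        ix += incx
        iy += incy
    return Y


x = [1.0, 2.0, 3.0, 4.0, 5.0]
print(copy_model(5, x, 1, [0.0] * 5, -1))       # reverse copy
print(copy_model(5, [13.0], 0, [0.0] * 5, 1))   # replicate one element
```

The first call stores x1 at Y(5) and works backward, and the second call reads the single element of X five times.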

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0

Example 1

This example shows input vector x and output vector y with positive strides.

Call Statement and Input
            N   X  INCX  Y  INCY
            |   |   |    |   |
CALL SCOPY( 5 , X , 1  , Y , 2  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)

Output
Y        =  (1.0, . , 2.0, . , 3.0, . , 4.0, . , 5.0)

Example 2

This example shows how to obtain a reverse copy of the input vector x by specifying strides with the same absolute value, but with opposite signs, for input vector x and output vector y. For y, which has a negative stride, results are stored beginning at element Y(5).

Call Statement and Input
            N   X  INCX  Y   INCY
            |   |   |    |    |
CALL SCOPY( 5 , X , 1  , Y , -1  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)

Output
Y        =  (5.0, 4.0, 3.0, 2.0, 1.0)

Example 3

This example shows an input vector, x, with 0 stride. Vector x is treated like a vector of length n, all of whose elements are the same as the single element in x. This is a technique for replicating an element of a vector.

Call Statement and Input
            N   X  INCX  Y  INCY
            |   |   |    |   |
CALL SCOPY( 5 , X , 0  , Y , 1  )
 
X        =  (13.0)

Output
Y        =  (13.0, 13.0, 13.0, 13.0, 13.0)

Example 4

This example shows input vector x and output vector y, containing complex numbers and having positive strides.

Call Statement and Input
            N   X  INCX  Y  INCY
            |   |   |    |   |
CALL CCOPY( 4 , X , 1  , Y , 2  )
 
X        =  ((1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0))

Output
Y        =  ((1.0, 1.0), . , (2.0, 2.0), . , (3.0, 3.0), . ,
             (4.0, 4.0))

SDOT, DDOT, CDOTU, ZDOTU, CDOTC, and ZDOTC--Dot Product of Two Vectors

SDOT, DDOT, CDOTU, and ZDOTU compute the dot product of vectors x and y:

result = x1 × y1 + x2 × y2 + ... + xn × yn

CDOTC and ZDOTC compute the dot product of the complex conjugate of vector x with vector y:

result = conj(x1) × y1 + conj(x2) × y2 + ... + conj(xn) × yn
where conj(xj) is the complex conjugate of xj


Table 43. Data Types
x, y, Result               Subprogram
Short-precision real       SDOT
Long-precision real        DDOT
Short-precision complex    CDOTU and CDOTC
Long-precision complex     ZDOTU and ZDOTC

Syntax

Fortran SDOT | DDOT | CDOTU | ZDOTU | CDOTC | ZDOTC (n, x, incx, y, incy)
C and C++ sdot | ddot | cdotu | zdotu | cdotc | zdotc (n, x, incx, y, incy);
PL/I SDOT | DDOT | CDOTU | ZDOTU | CDOTC | ZDOTC (n, x, incx, y, incy);

On Entry

n

is the number of elements in vectors x and y. Specified as: a fullword integer; n >= 0.

x

is the vector x of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 43.

incx

is the stride for vector x. Specified as: a fullword integer. It can have any value.

y

is the vector y of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incy|, containing numbers of the data type indicated in Table 43.

incy

is the stride for vector y. Specified as: a fullword integer. It can have any value.

On Return

Function value

is the result of the dot product computation. Returned as: a number of the data type indicated in Table 43.

Note

Declare this function in your program as returning a value of the data type indicated in Table 43.

Function

SDOT, DDOT, CDOTU, and ZDOTU compute the dot product of the vectors x and y, which is expressed as follows:

result = x1 × y1 + x2 × y2 + ... + xn × yn

CDOTC and ZDOTC compute the dot product of the complex conjugate of vector x with vector y, which is expressed as follows:

result = conj(x1) × y1 + conj(x2) × y2 + ... + conj(xn) × yn
where conj(xj) is the complex conjugate of xj

See reference [73]. The result is returned as a function value. If n is 0, then zero is returned as the value of the function.

For SDOT, CDOTU, and CDOTC, intermediate results are accumulated in long precision.
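The difference between the unconjugated and conjugated forms, and the pairing of elements under mixed strides, can be checked with a small model. The Python sketch below is an illustration, not ESSL code: `dot_model` and its `conjugate_x` flag are hypothetical, lists are 0-based, and Python arithmetic is double precision throughout, so short-precision rounding of the result is not modeled.

```python
def dot_model(n, X, incx, Y, incy, conjugate_x=False):
    """Model of SDOT/DDOT/CDOTU/ZDOTU, or CDOTC/ZDOTC when conjugate_x is True."""
    if n < 0:
        raise ValueError("n must be >= 0")
    # Storage index of logical element 1 depends on the sign of the stride.
    ix = 0 if incx >= 0 else (n - 1) * -incx
    iy = 0 if incy >= 0 else (n - 1) * -incy
    total = 0
    for _ in range(n):
        xv = X[ix]
        if conjugate_x:
            xv = complex(xv).conjugate()
        total += xv * Y[iy]
        ix += incx
        iy += incy
    return total


# Data of Examples 6 and 7 in this section (0j marks unused storage)
X = [1+2j, 3-4j, -5+6j]
Y = [10+9j, 0j, -6+5j, 0j, 2+1j]
print(dot_model(3, X, 1, Y, 2))                     # (-22+75j)
print(dot_model(3, X, 1, Y, 2, conjugate_x=True))   # (-14-37j)
```

Only the x operand is conjugated in the second call, which is why the two results differ in both the real and the imaginary part.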

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0

Example 1

This example shows how to compute the dot product of two vectors, x and y, having strides of 1.

Function Reference and Input
             N   X  INCX  Y  INCY
             |   |   |    |   |
DOTT = SDOT( 5 , X , 1  , Y , 1  )
 
X        =  (1.0, 2.0, -3.0, 4.0, 5.0)
Y        =  (9.0, 8.0, 7.0, -6.0, 5.0)

Output
DOTT     =  (9.0 + 16.0 - 21.0 - 24.0 + 25.0) = 5.0

Example 2

This example shows how to compute the dot product of a vector, x, with a stride of 1, and a vector, y, with a stride greater than 1.

Function Reference and Input
             N   X  INCX  Y  INCY
             |   |   |    |   |
DOTT = SDOT( 5 , X , 1  , Y , 2  )
 
X        =  (1.0, 2.0, -3.0, 4.0, 5.0)
Y        =  (9.0, . , 7.0, . , 5.0, . , -3.0, . , 1.0)

Output
DOTT     =  (9.0 + 14.0 - 15.0 - 12.0 + 5.0) = 1.0

Example 3

This example shows how to compute the dot product of a vector, x, with a negative stride, and a vector, y, with a stride greater than 1. For x, processing begins at element X(5), which is 5.0.

Function Reference and Input
             N   X   INCX  Y  INCY
             |   |    |    |   |
DOTT = SDOT( 5 , X , -1  , Y , 2  )
 
X        =  (1.0, 2.0, -3.0, 4.0, 5.0)
Y        =  (9.0, . , 7.0, . , 5.0, . , -3.0, . , 1.0)

Output
DOTT     =  (45.0 + 28.0 - 15.0 - 6.0 + 1.0) = 53.0

Example 4

This example shows how to compute the dot product of a vector, x, with a stride of 0, and a vector, y, with a stride of 1. The result in DOTT is x1(y1+...+yn).

Function Reference and Input
             N   X  INCX  Y  INCY
             |   |   |    |   |
DOTT = SDOT( 5 , X , 0  , Y , 1  )
 
X        =  (1.0, . , . , . , .)
Y        =  (9.0, 8.0, 7.0, -6.0, 5.0)

Output
DOTT     =  (1.0) × (9.0 + 8.0 + 7.0 - 6.0 + 5.0) = 23.0

Example 5

This example shows how to compute the dot product of two vectors, x and y, with strides of 0. The result in DOTT is n × x1 × y1.

Function Reference and Input
             N   X  INCX  Y  INCY
             |   |   |    |   |
DOTT = SDOT( 5 , X , 0  , Y , 0  )
 
X        =  (1.0, . , . , . , .)
Y        =  (9.0, . , . , . , .)

Output
DOTT     =  (5) × (1.0) × (9.0) = 45.0

Example 6

This example shows how to compute the dot product of two vectors, x and y, containing complex numbers, where x has a stride of 1, and y has a stride greater than 1.

Function Reference and Input
              N   X  INCX  Y  INCY
              |   |   |    |   |
DOTT = CDOTU( 3 , X , 1  , Y , 2  )
 
X        =  ((1.0, 2.0), (3.0, -4.0), (-5.0, 6.0))
Y        =  ((10.0, 9.0), . , (-6.0, 5.0), . , (2.0, 1.0))

Output
DOTT     =  ((10.0 - 18.0 - 10.0) - (18.0 - 20.0 + 6.0),
             (9.0 + 15.0 - 5.0) + (20.0 + 24.0 + 12.0))
         =  (-22.0, 75.0)

Example 7

This example shows how to compute the dot product of the conjugate of a vector, x, with vector y, both containing complex numbers, where x has a stride of 1, and y has a stride greater than 1.

Function Reference and Input
              N   X  INCX  Y  INCY
              |   |   |    |   |
DOTT = CDOTC( 3 , X , 1  , Y , 2  )
 
X        =  ((1.0, 2.0), (3.0, -4.0), (-5.0, 6.0))
Y        =  ((10.0, 9.0), . , (-6.0, 5.0), . , (2.0, 1.0))

Output
DOTT     =  ((10.0 - 18.0 - 10.0) + (18.0 - 20.0 + 6.0),
             (9.0  +  15.0  -  5.0)  -  (20.0  +  24.0  +  12.0))
         =  (-14.0, -37.0)
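
The difference between Examples 6 and 7 is only whether each element of x is conjugated before it is multiplied. The following C++ sketch shows both cases in one routine. The name ref_cdot and the conjugate_x flag are hypothetical and are not part of the ESSL interface, and the long-precision accumulation is an assumption of this sketch.

```cpp
#include <complex>
#include <cstddef>

// Hypothetical reference routine for illustration; not the ESSL code.
// conjugate_x = false behaves like CDOTU, conjugate_x = true like CDOTC.
std::complex<float> ref_cdot(int n, const std::complex<float> *x, int incx,
                             const std::complex<float> *y, int incy,
                             bool conjugate_x)
{
    std::complex<double> sum(0.0, 0.0);
    if (n <= 0) return std::complex<float>(0.0f, 0.0f);
    std::ptrdiff_t ix = (incx < 0) ? static_cast<std::ptrdiff_t>(1 - n) * incx : 0;
    std::ptrdiff_t iy = (incy < 0) ? static_cast<std::ptrdiff_t>(1 - n) * incy : 0;
    for (int i = 0; i < n; ++i) {
        std::complex<double> xv(x[ix].real(), x[ix].imag());
        std::complex<double> yv(y[iy].real(), y[iy].imag());
        if (conjugate_x) xv = std::conj(xv);  // CDOTC uses the conjugate of x
        sum += xv * yv;
        ix += incx;
        iy += incy;
    }
    return std::complex<float>(static_cast<float>(sum.real()),
                               static_cast<float>(sum.imag()));
}
```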

SNAXPY and DNAXPY--Compute SAXPY or DAXPY N Times

These subprograms compute SAXPY or DAXPY, respectively, n times:

yi <-- yi + alphai xi    for i = 1, n

where each alphai is a scalar value, contained in the vector a, and each xi and yi are vectors, contained in vectors (or matrices) x and y, respectively. For an explanation of the SAXPY and DAXPY computations, see SAXPY, DAXPY, CAXPY, and ZAXPY--Multiply a Vector X by a Scalar, Add to a Vector Y, and Store in the Vector Y.

Table 44. Data Types
a, x, y                 Subprogram
Short-precision real    SNAXPY
Long-precision real     DNAXPY

Syntax

Fortran CALL SNAXPY | DNAXPY (n, m, a, inca, x, incxi, incxo, y, incyi, incyo)
C and C++ snaxpy | dnaxpy (n, m, a, inca, x, incxi, incxo, y, incyi, incyo);
PL/I CALL SNAXPY | DNAXPY (n, m, a, inca, x, incxi, incxo, y, incyi, incyo);

On Entry

n

is the number of SAXPY or DAXPY computations to be performed and the number of elements in vector a. Specified as: a fullword integer; n >= 0.

m

is the number of elements in vectors xi and yi for each SAXPY or DAXPY computation. Specified as: a fullword integer; m >= 0.

a

is the vector a of length n, containing the scalar values alphai, used in each computation of yi + alphai xi. Specified as: a one-dimensional array of (at least) length 1+(n-1)|inca|, containing numbers of the data type indicated in Table 44.

inca

is the stride for vector a. Specified as: a fullword integer. It can have any value.

x

is the vector (or matrix) x, containing the xi vectors of length m, used in the n computations of yi + alphai xi. Specified as: a one- or two-dimensional array of (at least) length (1+(n-1)(incxo)) + (m-1)|incxi|, containing numbers of the data type indicated in Table 44.

incxi

is the stride for x in the inner loop--that is, the stride identifying the elements in each vector xi. Specified as: a fullword integer. It can have any value.

incxo

is the stride for x in the outer loop--that is, the stride identifying each vector xi in x. Specified as: a fullword integer; incxo >= 0.

y

is the vector (or matrix) y, containing the yi vectors of length m, used in the n computations of yi + alphai xi. Specified as: a one- or two-dimensional array of (at least) length (1+(n-1)(incyo)) + (m-1)|incyi|, containing numbers of the data type indicated in Table 44.

incyi

is the stride for y in the inner loop--that is, the stride identifying the elements in each vector yi in y. Specified as: a fullword integer; incyi > 0 or incyi < 0.

incyo

is the stride for y in the outer loop--that is, the stride identifying each vector yi in y. Specified as: a fullword integer; incyo >= 0.

On Return

y

is the vector (or matrix) y, containing the yi vectors of length m, which contain the results of the n SAXPY or DAXPY computations, yi + alphai xi for i = 1, n. Returned as: a one- or two-dimensional array, containing numbers of the data type indicated in Table 44.

Note

Vector y must have no common elements with vector a or vector x; otherwise, results are unpredictable. See "Concepts".

Function

The SAXPY or DAXPY computations:

y <-- y + alpha x

are performed n times. This is expressed as follows:

yi <-- yi + alphai xi    for i = 1, n

where each alphai is a scalar value, contained in the vector a, and each xi and yi are vectors, contained in vectors (or matrices) x and y, respectively.

Each SAXPY or DAXPY computation uses the length of the xi and yi vectors, m, for its input argument, n. It also uses the strides for the inner loop, incxi and incyi, for its parameters incx and incy, respectively. See "Function" in SAXPY, DAXPY, CAXPY, and ZAXPY--Multiply a Vector X by a Scalar, Add to a Vector Y, and Store in the Vector Y for a description of how the computation is done.

The outer loop of the SNAXPY or DNAXPY computation uses the strides of inca, incxo, and incyo to locate the elements in a and vectors in x and y for each i-th computation. These are:

For i = 1, n:
alpha((i-1)inca+1)    for inca >= 0
alpha((i-n)inca+1)    for inca < 0
x((i-1)incxo+1)
y((i-1)incyo+1)

If m or n is 0, no computation is performed.
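
The inner and outer loops described above can be sketched in C++ as follows. This is an illustration under stated assumptions, not the ESSL implementation: the name ref_snaxpy is hypothetical, the index expressions are the 0-based equivalents of the 1-based expressions listed above, and no input-argument checking is done.

```cpp
#include <cstddef>

// Hypothetical reference routine for illustration; not the ESSL code.
void ref_snaxpy(int n, int m, const float *a, int inca,
                const float *x, int incxi, int incxo,
                float *y, int incyi, int incyo)
{
    if (n <= 0 || m <= 0) return;  // no computation is performed
    for (int i = 0; i < n; ++i) {
        // Outer loop: locate alphai, xi, and yi. A negative inca walks
        // vector a backward, starting from its far end.
        const std::ptrdiff_t ia = (inca >= 0)
            ? static_cast<std::ptrdiff_t>(i) * inca
            : static_cast<std::ptrdiff_t>(i - (n - 1)) * inca;
        const float *xi = x + static_cast<std::ptrdiff_t>(i) * incxo;
        float *yi = y + static_cast<std::ptrdiff_t>(i) * incyo;
        // Inner loop: one SAXPY of length m with strides incxi and incyi.
        std::ptrdiff_t ix = (incxi < 0) ? static_cast<std::ptrdiff_t>(1 - m) * incxi : 0;
        std::ptrdiff_t iy = (incyi < 0) ? static_cast<std::ptrdiff_t>(1 - m) * incyi : 0;
        for (int k = 0; k < m; ++k) {
            yi[iy] += a[ia] * xi[ix];
            ix += incxi;
            iy += incyi;
        }
    }
}
```

When x and y are matrices stored by columns, incxo and incyo are the leading dimensions of the arrays, which is how the examples that follow use them (10 for X and 5 for Y).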

Error Conditions

Computational Errors

None

Input-Argument Errors
  1. n < 0
  2. m < 0
  3. incxo < 0
  4. incyi = 0
  5. incyo < 0

Example 1

This example shows vectors, contained in matrices, with the stride of the inner loops incxi and incyi equal to 1.

Call Statement and Input
             N   M   A  INCA  X  INCXI INCXO  Y  INCYI INCYO
             |   |   |   |    |    |     |    |    |     |
CALL SNAXPY( 3 , 4 , A , 1  , X ,  1  , 10  , Y ,  1  ,  5  )
 
A        =  (3.0, 2.0, 4.0)
        *               *
        | 1.0  4.0  3.0 |
        | 2.0  3.0  4.0 |
        | 3.0  2.0  2.0 |
        | 4.0  1.0  1.0 |
X    =  |  .    .    .  |
        |  .    .    .  |
        |  .    .    .  |
        |  .    .    .  |
        |  .    .    .  |
        |  .    .    .  |
        *               *
        *               *
        | 4.0  1.0  3.0 |
        | 3.0  2.0  4.0 |
Y    =  | 2.0  3.0  2.0 |
        | 1.0  4.0  1.0 |
        |  .    .    .  |
        *               *

Output
        *                 *
        |  7.0  9.0  15.0 |
        |  9.0  8.0  20.0 |
Y    =  | 11.0  7.0  10.0 |
        | 13.0  6.0   5.0 |
        |   .    .     .  |
        *                 *

Example 2

This example shows vectors, contained in matrices, with a stride of the inner loop incxi greater than 1.

Call Statement and Input
             N   M   A  INCA  X  INCXI INCXO  Y  INCYI INCYO
             |   |   |   |    |    |     |    |    |     |
CALL SNAXPY( 3 , 4 , A , 1  , X ,  2  , 10  , Y ,  1  ,  5  )
 
A        =  (3.0, 2.0, 4.0)
        *               *
        | 1.0  4.0  3.0 |
        |  .    .    .  |
        | 2.0  3.0  4.0 |
        |  .    .    .  |
X    =  | 3.0  2.0  2.0 |
        |  .    .    .  |
        | 4.0  1.0  1.0 |
        |  .    .    .  |
        |  .    .    .  |
        |  .    .    .  |
        *               *
        *               *
        | 4.0  1.0  3.0 |
        | 3.0  2.0  4.0 |
Y    =  | 2.0  3.0  2.0 |
        | 1.0  4.0  1.0 |
        |  .    .    .  |
        *               *

Output
Y        =(same as output Y in Example 1)

Example 3

This example shows vectors, contained in matrices, with a negative stride, incyi, for the inner loop.

Call Statement and Input
             N   M   A  INCA  X  INCXI INCXO  Y  INCYI INCYO
             |   |   |   |    |    |     |    |    |     |
CALL SNAXPY( 3 , 4 , A , 1  , X ,  1  , 10  , Y , -1  ,  5  )
 
A        =  (3.0, 2.0, 4.0)
        *               *
        | 1.0  4.0  3.0 |
        | 2.0  3.0  4.0 |
        | 3.0  2.0  2.0 |
        | 4.0  1.0  1.0 |
X    =  |  .    .    .  |
        |  .    .    .  |
        |  .    .    .  |
        |  .    .    .  |
        |  .    .    .  |
        |  .    .    .  |
        *               *
        *               *
        | 1.0  4.0  1.0 |
        | 2.0  3.0  2.0 |
Y    =  | 3.0  2.0  4.0 |
        | 4.0  1.0  3.0 |
        |  .    .    .  |
        *               *

Output
        *                 *
        | 13.0  6.0   5.0 |
        | 11.0  7.0  10.0 |
Y    =  |  9.0  8.0  20.0 |
        |  7.0  9.0  15.0 |
        |   .    .     .  |
        *                 *

Example 4

This example shows vectors, contained in matrices, with a negative stride, inca, for vector a. For vector a, processing begins at element A(5), which is 3.0.

Call Statement and Input
             N   M   A   INCA  X  INCXI INCXO  Y  INCYI INCYO
             |   |   |    |    |    |     |    |    |     |
CALL SNAXPY( 3 , 4 , A , -2  , X ,  1  , 10  , Y ,  1  ,  5  )
 
A        =  (4.0, . , 2.0, . , 3.0)
        *               *
        | 1.0  4.0  3.0 |
        | 2.0  3.0  4.0 |
        | 3.0  2.0  2.0 |
        | 4.0  1.0  1.0 |
X    =  |  .    .    .  |
        |  .    .    .  |
        |  .    .    .  |
        |  .    .    .  |
        |  .    .    .  |
        |  .    .    .  |
        *               *
        *               *
        | 4.0  1.0  3.0 |
        | 3.0  2.0  4.0 |
Y    =  | 2.0  3.0  2.0 |
        | 1.0  4.0  1.0 |
        |  .    .    .  |
        *               *

Output
Y        =(same as output Y in Example 1)

SNDOT and DNDOT--Compute Special Dot Products N Times

These subprograms compute one of the following special dot products n times:
si <-- xi * yi         Store positive dot product
si <-- -xi * yi        Store negative dot product
si <-- si + xi * yi    Accumulate positive dot product
si <-- si - xi * yi    Accumulate negative dot product
for i = 1, n

where each si is an element in vector s, and each xi and yi are vectors contained in vectors (or matrices) x and y, respectively.

Table 45. Data Types
s, x, y                 Subprogram
Short-precision real    SNDOT
Long-precision real     DNDOT

Syntax

Fortran CALL SNDOT | DNDOT (n, m, s, incs, isw, x, incxi, incxo, y, incyi, incyo)
C and C++ sndot | dndot (n, m, s, incs, isw, x, incxi, incxo, y, incyi, incyo);
PL/I CALL SNDOT | DNDOT (n, m, s, incs, isw, x, incxi, incxo, y, incyi, incyo);

On Entry

n

is the number of dot product computations to be performed and the number of elements in the vector s. Specified as: a fullword integer; n >= 0.

m

is the number of elements in vectors xi and yi for each dot product computation. Specified as: a fullword integer; m >= 0.

s

is the vector s, containing the n scalar values si, where:

If isw = 1 or 2, si is not used in the computation (no input value is specified).

If isw = 3 or 4, si is used in the computation (an input value is specified).

Specified as: a one-dimensional array of (at least) length 1+(n-1)|incs|, containing numbers of the data type indicated in Table 45.

incs

is the stride for vector s. Specified as: a fullword integer; incs > 0 or incs < 0.

isw

indicates the type of computation to perform, depending on the value specified:

If isw = 1,    si <-- xi * yi

If isw = 2,    si <-- -xi * yi

If isw = 3,    si <-- si + xi * yi

If isw = 4,    si <-- si - xi * yi

where i = 1, n

Specified as: a fullword integer. Its value must be 1, 2, 3, or 4.

x

is the vector (or matrix) x, containing the xi vectors of length m, used in the n dot product computations. Specified as: a one- or two-dimensional array of (at least) length (1+(n-1)(incxo))+(m-1)|incxi|, containing numbers of the data type indicated in Table 45.

incxi

is the stride for x in the inner loop--that is, the stride identifying the elements in each vector xi. Specified as: a fullword integer. It can have any value.

incxo

is the stride for x in the outer loop--that is, the stride identifying each vector xi in x. Specified as: a fullword integer; incxo >= 0.

y

is the vector (or matrix) y, containing the yi vectors of length m, used in the n dot product computations. Specified as: a one- or two-dimensional array of (at least) length (1+(n-1)(incyo)) + (m-1)|incyi|, containing numbers of the data type indicated in Table 45.

incyi

is the stride for y in the inner loop--that is, the stride identifying the elements in each vector yi. Specified as: a fullword integer. It can have any value.

incyo

is the stride for y in the outer loop--that is, the stride identifying each vector yi in y. Specified as: a fullword integer; incyo >= 0.

On Return

s

is the vector s of length n, containing the results of the n dot product computations. The type of dot product computation depends on the value specified for isw.

If isw = 1,    si <-- xi * yi

If isw = 2,    si <-- -xi * yi

If isw = 3,    si <-- si + xi * yi

If isw = 4,    si <-- si - xi * yi

where i = 1, n

Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 45.

Function

The four possible computations that can be performed by these subprograms are:
si <-- xi * yi         Store positive dot product
si <-- -xi * yi        Store negative dot product
si <-- si + xi * yi    Accumulate positive dot product
si <-- si - xi * yi    Accumulate negative dot product
for i = 1, n

where each si is a scalar element in the vector s of length n, and each of the n xi and yi vectors of length m are contained in vectors (or matrices) x and y, respectively. Each computation uses the dot product, which is expressed:

xi * yi = u1 v1 + u2 v2 + ... + um vm

where uk and vk, for k = 1, m, are the elements of xi and yi, respectively. To find the elements for the computation, it uses the strides incs, incxi, incxo, incyi, and incyo, as described in "On Entry".

If m or n is 0, no computation is performed. For SNDOT, intermediate results are accumulated in long precision.
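
The four computations selected by isw can be sketched in C++ as follows. This is an illustration, not the ESSL implementation: the name ref_sndot is hypothetical, and the handling of a negative incs (s1 is stored at the far end of s) is inferred from Examples 2 and 4 below. The sketch returns without doing anything for an invalid isw, whereas the source lists that case as an input-argument error.

```cpp
#include <cstddef>

// Hypothetical reference routine for illustration; not the ESSL code.
void ref_sndot(int n, int m, float *s, int incs, int isw,
               const float *x, int incxi, int incxo,
               const float *y, int incyi, int incyo)
{
    if (n <= 0 || m <= 0 || isw < 1 || isw > 4) return;
    for (int i = 0; i < n; ++i) {
        // Outer loop: locate si, xi, and yi.
        const std::ptrdiff_t is = (incs >= 0)
            ? static_cast<std::ptrdiff_t>(i) * incs
            : static_cast<std::ptrdiff_t>(i - (n - 1)) * incs;
        const float *xi = x + static_cast<std::ptrdiff_t>(i) * incxo;
        const float *yi = y + static_cast<std::ptrdiff_t>(i) * incyo;
        // Inner loop: dot product of length m, accumulated in long precision.
        std::ptrdiff_t ix = (incxi < 0) ? static_cast<std::ptrdiff_t>(1 - m) * incxi : 0;
        std::ptrdiff_t iy = (incyi < 0) ? static_cast<std::ptrdiff_t>(1 - m) * incyi : 0;
        double dot = 0.0;
        for (int k = 0; k < m; ++k) {
            dot += static_cast<double>(xi[ix]) * static_cast<double>(yi[iy]);
            ix += incxi;
            iy += incyi;
        }
        switch (isw) {
        case 1: s[is] = static_cast<float>(dot); break;           // store positive
        case 2: s[is] = static_cast<float>(-dot); break;          // store negative
        case 3: s[is] = static_cast<float>(s[is] + dot); break;   // accumulate positive
        case 4: s[is] = static_cast<float>(s[is] - dot); break;   // accumulate negative
        }
    }
}
```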

Error Conditions

Computational Errors

None

Input-Argument Errors
  1. n < 0
  2. m < 0
  3. incs = 0
  4. isw < 1 or isw > 4
  5. incxo < 0
  6. incyo < 0

Example 1

This example shows a store positive dot product computation using vectors with positive strides.

Call Statement and Input
            N   M   S  INCS ISW  X  INCXI INCXO  Y  INCYI INCYO
            |   |   |   |    |   |    |     |    |    |     |
CALL SNDOT( 3 , 4 , S , 1  , 1 , X ,  1  ,  4  , Y ,  1  ,  4  )
 
        *               *
        | 1.0  2.0  3.0 |
X    =  | 2.0  3.0  4.0 |
        | 3.0  4.0  5.0 |
        | 4.0  5.0  6.0 |
        *               *
 
        *               *
        | 4.0  3.0  2.0 |
Y    =  | 3.0  2.0  1.0 |
        | 2.0  1.0  4.0 |
        | 1.0  4.0  3.0 |
        *               *

Output
S        =  (20.0, 36.0, 48.0)

Example 2

This example shows a store negative dot product computation using vectors with positive and negative strides.

Call Statement and Input
            N   M   S   INCS ISW  X  INCXI INCXO  Y  INCYI INCYO
            |   |   |    |    |   |    |     |    |    |     |
CALL SNDOT( 3 , 4 , S , -1  , 2 , X ,  2  , 10  , Y , -1  ,  6  )
        *               *
        | 1.0  2.0  3.0 |
        |  .    .    .  |
        | 2.0  3.0  4.0 |
        |  .    .    .  |
X    =  | 3.0  4.0  5.0 |
        |  .    .    .  |
        | 4.0  5.0  6.0 |
        |  .    .    .  |
        |  .    .    .  |
        |  .    .    .  |
        *               *
        *               *
        | 4.0  3.0  2.0 |
        | 3.0  2.0  1.0 |
Y    =  | 2.0  1.0  4.0 |
        | 1.0  4.0  3.0 |
        |  .    .    .  |
        |  .    .    .  |
        *               *

Output
S        =  (-42.0, -34.0, -30.0)

Example 3

This example shows an accumulate positive dot product computation using vectors with positive and negative strides.

Call Statement and Input
            N   M   S  INCS ISW  X  INCXI INCXO  Y  INCYI INCYO
            |   |   |   |    |   |    |     |    |    |     |
CALL SNDOT( 3 , 4 , S , 1  , 3 , X , -2  , 10  , Y ,  2  , 10  )
 
S        =  (2.0, 5.0, 8.0)
        *               *
        | 1.0  2.0  3.0 |
        |  .    .    .  |
        | 2.0  3.0  4.0 |
        |  .    .    .  |
X    =  | 3.0  4.0  5.0 |
        |  .    .    .  |
        | 4.0  5.0  6.0 |
        |  .    .    .  |
        |  .    .    .  |
        |  .    .    .  |
        *               *
        *               *
        | 4.0  3.0  2.0 |
        |  .    .    .  |
        | 3.0  2.0  1.0 |
        |  .    .    .  |
Y    =  | 2.0  1.0  4.0 |
        |  .    .    .  |
        | 1.0  4.0  3.0 |
        |  .    .    .  |
        |  .    .    .  |
        |  .    .    .  |
        *               *

Output
S        =  (32.0, 39.0, 50.0)

Example 4

This example shows an accumulate negative dot product computation using vectors with positive and negative strides.

Call Statement and Input
            N   M   S   INCS ISW  X  INCXI INCXO  Y  INCYI INCYO
            |   |   |    |    |   |    |     |    |    |     |
CALL SNDOT( 3 , 4 , S , -1  , 4 , X ,  1  ,  6  , Y ,  2  , 10  )
S        =  (3.0, 6.0, 9.0)
        *               *
        | 1.0  2.0  3.0 |
        | 2.0  3.0  4.0 |
X    =  | 3.0  4.0  5.0 |
        | 4.0  5.0  6.0 |
        |  .    .    .  |
        |  .    .    .  |
        *               *
        *               *
        | 4.0  3.0  2.0 |
        |  .    .    .  |
        | 3.0  2.0  1.0 |
        |  .    .    .  |
Y    =  | 2.0  1.0  4.0 |
        |  .    .    .  |
        | 1.0  4.0  3.0 |
        |  .    .    .  |
        |  .    .    .  |
        |  .    .    .  |
        *               *

Output
S        =  (-45.0, -30.0, -11.0)

SNRM2, DNRM2, SCNRM2, and DZNRM2--Euclidean Length of a Vector with Scaling of Input to Avoid Destructive Underflow and Overflow

These subprograms compute the Euclidean length (l2 norm) of vector x, with scaling of input to avoid destructive underflow and overflow.

Table 46. Data Types
x                          Result                  Subprogram
Short-precision real       Short-precision real    SNRM2
Long-precision real        Long-precision real     DNRM2
Short-precision complex    Short-precision real    SCNRM2
Long-precision complex     Long-precision real     DZNRM2
Note: If your data might cause the computation to overflow or underflow, use these subroutines instead of SNORM2, DNORM2, CNORM2, and ZNORM2, whose intermediate computational results may exceed the maximum or minimum limits of the machine. "Notes" in the description of SNORM2, DNORM2, CNORM2, and ZNORM2 explains how to estimate whether your data will cause an overflow or underflow.

Syntax

Fortran SNRM2 | DNRM2 | SCNRM2 | DZNRM2 (n, x, incx)
C and C++ snrm2 | dnrm2 | scnrm2 | dznrm2 (n, x, incx);
PL/I SNRM2 | DNRM2 | SCNRM2 | DZNRM2 (n, x, incx);

On Entry

n

is the number of elements in vector x. Specified as: a fullword integer; n >= 0.

x

is the vector x of length n, whose Euclidean length is to be computed. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 46.

incx

is the stride for vector x. Specified as: a fullword integer. It can have any value.

On Return

Function value

is the Euclidean length (l2 norm) of the vector x. Returned as: a number of the data type indicated in Table 46.

Note

Declare this function in your program as returning a value of the data type indicated in Table 46.

Function

The Euclidean length (l2 norm) of vector x is expressed as follows, with scaling of input to avoid destructive underflow and overflow:

result = sqrt(|x1|^2 + |x2|^2 + ... + |xn|^2)

See reference [73]. The result is returned as the function value. If n is 0, then 0.0 is returned as the value of the function.

For SNRM2 and SCNRM2, the sum of the squares of the absolute values of the elements is accumulated in long precision. The square root of this long-precision sum is then computed and, if necessary, is unscaled.

Although these subroutines eliminate destructive underflow, nondestructive underflows may occur if the input elements differ greatly in magnitude. This does not affect accuracy, but it degrades performance. The system default is to mask underflow, which improves the performance of these subroutines.
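
One common way to scale the input is to divide every element by the largest magnitude before squaring, and to multiply the square root by that magnitude afterward. The following C++ sketch uses that scheme. The source does not say which scaling scheme ESSL uses, so this is an assumption for illustration only, and the name ref_dnrm2 is hypothetical.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical reference routine for illustration; not the ESSL code.
// Two passes: find the largest magnitude, then sum the squares of the
// scaled elements so that no intermediate square overflows or underflows
// destructively.
double ref_dnrm2(int n, const double *x, int incx)
{
    if (n <= 0) return 0.0;
    const std::ptrdiff_t start =
        (incx < 0) ? static_cast<std::ptrdiff_t>(1 - n) * incx : 0;
    double scale = 0.0;
    std::ptrdiff_t ix = start;
    for (int i = 0; i < n; ++i) {
        const double v = std::fabs(x[ix]);
        if (v > scale) scale = v;
        ix += incx;
    }
    if (scale == 0.0) return 0.0;  // all elements are zero
    double sum = 0.0;
    ix = start;
    for (int i = 0; i < n; ++i) {
        const double t = x[ix] / scale;  // |t| <= 1, so t * t cannot overflow
        sum += t * t;
        ix += incx;
    }
    return scale * std::sqrt(sum);       // unscale the result
}
```

With elements near 1.0D+200, squaring an element directly would exceed the long-precision range, while the scaled sum stays between 1 and n.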

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0

Important Information About the Following Examples

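Workstations use the ANSI/IEEE 32-bit and 64-bit binary floating-point formats. The approximate ranges of normalized magnitudes are 1.17549E-38 to 3.40282E+38 for short precision and 2.22507D-308 to 1.79769D+308 for long precision.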

Example 1

This example shows a vector, x, whose elements must be scaled to prevent overflow.

Function Reference and Input
               N   X  INCX
               |   |   |
DNORM = DNRM2( 6 , X , 1  )
 
X      = (0.68056D+200, 0.25521D+200, 0.34028D+200,
          0.85071D+200, 0.25521D+200, 0.85071D+200)

Output
DNORM    =  0.1469D+201

Example 2

This example shows a vector, x, whose elements must be scaled to prevent destructive underflow.

Function Reference and Input
               N   X  INCX
               |   |   |
DNORM = DNRM2( 4 , X , 2  )
 
X     = (0.10795D-200, . , 0.10795D-200, . , 0.10795D-200,
         . , 0.10795D-200)

Output
DNORM    =  0.21590D-200

Example 3

This example shows a vector, x, with a stride of 0. The result in SNORM is sqrt(n) |x1|.

Function Reference and Input
               N   X  INCX
               |   |   |
SNORM = SNRM2( 4 , X , 0  )
 
X        =  (4.0)

Output
SNORM    =  8.0

Example 4

This example shows a vector, x, containing complex numbers, and whose elements must be scaled to prevent overflow.

Function Reference and Input
                 N   X  INCX
                 |   |   |
DZNORM = DZNRM2( 3 , X , 1  )
 
X     = ((0.68056D+200, 0.25521D+200), (0.34028D+200, 0.85071D+200),
         (0.25521D+200, 0.85071D+200))

Output
DZNORM   =  0.1469D+201

Example 5

This example shows a vector, x, containing complex numbers, and whose elements must be scaled to prevent destructive underflow.

Function Reference and Input
                 N   X  INCX
                 |   |   |
DZNORM = DZNRM2( 2 , X , 2  )
 
X     = ((0.10795D-200, 0.10795D-200), . ,
         (0.10795D-200, 0.10795D-200))

Output
DZNORM   =  0.2159D-200

SNORM2, DNORM2, CNORM2, and ZNORM2--Euclidean Length of a Vector with No Scaling of Input

These subprograms compute the Euclidean length (l2 norm) of vector x with no scaling of input.

Table 47. Data Types
x                          Result                  Subprogram
Short-precision real       Short-precision real    SNORM2
Long-precision real        Long-precision real     DNORM2
Short-precision complex    Short-precision real    CNORM2
Long-precision complex     Long-precision real     ZNORM2

Syntax

Fortran SNORM2 | DNORM2 | CNORM2 | ZNORM2 (n, x, incx)
C and C++ snorm2 | dnorm2 | cnorm2 | znorm2 (n, x, incx);
PL/I SNORM2 | DNORM2 | CNORM2 | ZNORM2 (n, x, incx);

On Entry

n

is the number of elements in vector x. Specified as: a fullword integer; n >= 0.

x

is the vector x of length n, whose Euclidean length is to be computed. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 47.

incx

is the stride for vector x. Specified as: a fullword integer. It can have any value.

On Return

Function value

is the Euclidean length (l2 norm) of the vector x. Returned as: a number of the data type indicated in Table 47.

Notes

  1. This subroutine does not underflow or overflow if the values of the elements in vector x conform to the following conditions. If these conditions are violated, overflow or destructive underflow may occur:

  2. Declare this function in your program as returning a value of the data type indicated in Table 47.

Function

The Euclidean length (l2 norm) of vector x is expressed as follows with no scaling of input:

result = sqrt(|x1|^2 + |x2|^2 + ... + |xn|^2)

See reference [73]. The result is returned as the function value. If n is 0, then 0.0 is returned as the value of the function.

For SNORM2 and CNORM2, the sum of the squares of the absolute values of the elements is accumulated in long precision. The square root of this long-precision sum is then computed.

This subroutine should not be used if the values in vector x do not conform to the restriction given in "Notes".
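
For comparison with the scaled subroutines, the unscaled computation can be sketched in C++ as follows. The name ref_snorm2 is hypothetical and this is not the ESSL implementation. The sketch squares each element directly and accumulates the sum in long precision, as described above for SNORM2.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical reference routine for illustration; not the ESSL code.
// No scaling is applied to the input elements.
float ref_snorm2(int n, const float *x, int incx)
{
    if (n <= 0) return 0.0f;
    std::ptrdiff_t ix = (incx < 0) ? static_cast<std::ptrdiff_t>(1 - n) * incx : 0;
    double sum = 0.0;  // long-precision accumulator
    for (int i = 0; i < n; ++i) {
        const double v = static_cast<double>(x[ix]);
        sum += v * v;
        ix += incx;
    }
    return static_cast<float>(std::sqrt(sum));
}
```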

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0

Example 1

This example shows a vector, x, with a stride of 1.

Function Reference and Input
                N   X  INCX
                |   |   |
SNORM = SNORM2( 6 , X , 1  )
 
X        =  (3.0, 4.0, 1.0, 8.0, 1.0, 3.0)

Output
SNORM    =  10.0

Example 2

This example shows a vector, x, with a stride greater than 1.

Function Reference and Input
                N   X  INCX
                |   |   |
SNORM = SNORM2( 6 , X , 2  )
 
X        =  (3.0, . , 4.0, . , 1.0, . , 8.0, . , 1.0, . , 3.0)

Output
SNORM    =  10.0

Example 3

This example shows a vector, x, with a stride of 0. The result in SNORM is sqrt(n) |x1|.

Function Reference and Input
                N   X  INCX
                |   |   |
SNORM = SNORM2( 4 , X , 0  )
 
X        =  (4.0)

Output
SNORM    =  8.0

Example 4

This example shows a vector, x, containing complex numbers and having a stride of 1.

Function Reference and Input
                N   X  INCX
                |   |   |
CNORM = CNORM2( 3 , X , 1  )
 
X        =  ((3.0, 4.0), (1.0, 8.0), (-1.0, 3.0))

Output
CNORM    =  10.0

SROTG, DROTG, CROTG, and ZROTG--Construct a Givens Plane Rotation

SROTG and DROTG construct a real Givens plane rotation, and CROTG and ZROTG construct a complex Givens plane rotation. The computations use rotational elimination parameters a and b. Values are returned for r, as well as the cosine c and the sine s of the angle of rotation. SROTG and DROTG also return a value for z.
Note: Throughout this description, the symbols r and z are used to represent two of the output values returned for this computation. It is important to note that the values for r and z are actually returned in the input-output arguments a and b, respectively, overwriting the original values passed in a and b.

Table 48. Data Types
a, b, r, s                 c                       z                       Subprogram
Short-precision real       Short-precision real    Short-precision real    SROTG
Long-precision real        Long-precision real     Long-precision real     DROTG
Short-precision complex    Short-precision real    (No value returned)     CROTG
Long-precision complex     Long-precision real     (No value returned)     ZROTG

Syntax

Fortran CALL SROTG | DROTG | CROTG | ZROTG (a, b, c, s)
C and C++ srotg | drotg | crotg | zrotg (a, b, c, s);
PL/I CALL SROTG | DROTG | CROTG | ZROTG (a, b, c, s);

On Entry

a

is the rotational elimination parameter a. Specified as: a number of the data type indicated in Table 48.

b

is the rotational elimination parameter b. Specified as: a number of the data type indicated in Table 48.

c

See 'On Return'.

s

See 'On Return'.

On Return

a

is the value computed for r.

For SROTG and DROTG:

r = sigma sqrt(a^2 + b^2)

where:

sigma = SIGN(a) if |a| > |b|
sigma = SIGN(b) if |a| <= |b|

For CROTG and ZROTG:

r = psi sqrt(|a|^2 + |b|^2)    if |a| <> 0
r = b    if |a| = 0

where:

psi = a/|a|

Returned as: a number of the data type indicated in Table 48.

b

is the value computed for z.

For SROTG and DROTG:

z = s    if |a| > |b|
z = 1/c    if |a| <= |b| and c <> 0 and r <> 0
z = 1    if |a| <= |b| and c = 0 and r <> 0
z = 0    if r = 0

For CROTG and ZROTG: no value is returned, and the input value is not changed.

Returned as: a number of the data type indicated in Table 48.

c

is the cosine c of the angle of (Givens) rotation. For SROTG and DROTG:
c = a/r    if r <> 0
c = 1    if r = 0

For CROTG and ZROTG:

c = |a| / sqrt(|a|^2 + |b|^2)    if |a| <> 0
c = 0    if |a| = 0

Returned as: a number of the data type indicated in Table 48.

s

is the sine s of the angle of (Givens) rotation.

For SROTG and DROTG:

s = b/r    if r <> 0
s = 0    if r = 0

For CROTG and ZROTG:

s = psi conj(b) / sqrt(|a|^2 + |b|^2)    if |a| <> 0
s = (1.0, 0.0)    if |a| = 0

where psi = a/|a| and conj(b) is the complex conjugate of b

Returned as: a number of the data type indicated in Table 48.

Note

In your C program, arguments a, b, c, and s must be passed by reference.

Function

SROTG and DROTG

A real Givens plane rotation is constructed for values a and b by computing values for r, c, s, and z, where:

r = sigma sqrt(a^2 + b^2)

where:

sigma = SIGN(a)    if |a| > |b|
sigma = SIGN(b)    if |a| <= |b|

c = a/r    if r <> 0

c = 1    if r = 0

s = b/r    if r <> 0

s = 0    if r = 0

z = s    if |a| > |b|

z = 1/c    if |a| <= |b| and c <> 0 and r <> 0

z = 1    if |a| <= |b| and c = 0 and r <> 0

z = 0    if r = 0

See reference [73].

Following are some important points about the computation:

  1. The numbers for c, s, and r satisfy:



    Figure ESYGR71 not displayed.


  2. Where necessary, scaling is used to avoid overflow and destructive underflow in the computation of r, which is expressed as follows:



    Figure ESYGR72 not displayed.

  3. sigma is not essential to the computation of a Givens rotation matrix, but its use permits later stable reconstruction of c and s from just one stored number, z. See reference [85]. c and s are reconstructed from z as follows:



    Figure ESYGR73 not displayed.
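
The real construction can be sketched in C++ as follows. This is an illustration, not the ESSL implementation, and the names ref_drotg and ref_from_z are hypothetical. The scaling by the larger magnitude is one common choice. Because the reconstruction figure is not available here, the rule in ref_from_z is derived from the definitions of sigma and z given above (z = s implies c > 0 and |z| < 1; z = 1/c implies s > 0 and |z| > 1) and should be read as an assumption.

```cpp
#include <cmath>

// Hypothetical sketch; not the ESSL code. On return, a holds r and b holds z.
void ref_drotg(double &a, double &b, double &c, double &s)
{
    const double absa = std::fabs(a);
    const double absb = std::fabs(b);
    const double big = (absa > absb) ? absa : absb;
    if (big == 0.0) {  // r = 0
        c = 1.0; s = 0.0; a = 0.0; b = 0.0;
        return;
    }
    // sigma = SIGN(a) if |a| > |b|, otherwise SIGN(b)
    const double sigma = ((absa > absb ? a : b) < 0.0) ? -1.0 : 1.0;
    // Scale by the larger magnitude so the squares cannot overflow.
    const double ta = a / big;
    const double tb = b / big;
    const double r = sigma * big * std::sqrt(ta * ta + tb * tb);
    c = a / r;
    s = b / r;
    double z;
    if (absa > absb)   z = s;
    else if (c != 0.0) z = 1.0 / c;
    else               z = 1.0;
    a = r;
    b = z;
}

// Recover c and s from the single stored number z (derived rule, see above).
void ref_from_z(double z, double &c, double &s)
{
    const double absz = std::fabs(z);
    if (absz < 1.0)      { s = z;       c = std::sqrt(1.0 - s * s); }
    else if (absz > 1.0) { c = 1.0 / z; s = std::sqrt(1.0 - c * c); }
    else                 { c = 0.0;     s = 1.0; }
}
```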

CROTG and ZROTG

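A complex Givens plane rotation is constructed for values a and b by computing values for r, c, and s, where:

r = psi sqrt(|a|^2 + |b|^2)    if |a| <> 0
r = b    if |a| = 0

where:

psi = a/|a|

c = |a| / sqrt(|a|^2 + |b|^2)    if |a| <> 0
c = 0    if |a| = 0

s = psi conj(b) / sqrt(|a|^2 + |b|^2)    if |a| <> 0
s = (1.0, 0.0)    if |a| = 0

and conj(b) is the complex conjugate of b.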

See reference [73].

Following are some important points about the computation:

  1. The numbers for c, s, and r satisfy:



    Figure ESYGR77 not displayed.


  2. Where necessary, scaling is used to avoid overflow and destructive underflow in the computation of r, which is expressed as follows:



    Figure ESYGR78 not displayed.
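
The complex construction can be sketched in C++ in the same way. This is an illustration, not the ESSL implementation. The name ref_zrotg is hypothetical, and the formulas for c and s are assumptions taken from the conventional BLAS definition, chosen because they reproduce the values in Examples 5 and 6 below.

```cpp
#include <cmath>
#include <complex>

// Hypothetical sketch; not the ESSL code. On return, a holds r.
// b is not changed, matching the description of CROTG and ZROTG above.
void ref_zrotg(std::complex<double> &a, const std::complex<double> &b,
               double &c, std::complex<double> &s)
{
    const double absa = std::abs(a);
    if (absa == 0.0) {  // |a| = 0
        c = 0.0;
        s = std::complex<double>(1.0, 0.0);
        a = b;
        return;
    }
    const double absb = std::abs(b);
    // Scale so that the squares cannot overflow or underflow destructively.
    const double scale = absa + absb;
    const double ta = absa / scale;
    const double tb = absb / scale;
    const double norm = scale * std::sqrt(ta * ta + tb * tb);
    const std::complex<double> psi = a / absa;
    c = absa / norm;
    s = psi * std::conj(b) / norm;
    a = psi * norm;
}
```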

Error Conditions

Computational Errors

None

Input-Argument Errors

None

Example 1

This example shows the construction of a real Givens plane rotation, where r is 0.

Call Statement and Input
             A     B    C   S
             |     |    |   |
CALL SROTG( 0.0 , 0.0 , C , S )

Output
A        =  0.0
B        =  0.0
C        =  1.0
S        =  0.0

Example 2

This example shows the construction of a real Givens plane rotation, where c is 0.

Call Statement and Input
             A     B    C   S
             |     |    |   |
CALL SROTG( 0.0 , 2.0 , C , S )

Output
A        =  2.0
B        =  1.0
C        =  0.0
S        =  1.0

Example 3

This example shows the construction of a real Givens plane rotation, where |b| > |a|.

Call Statement and Input
             A      B    C   S
             |      |    |   |
CALL SROTG( 6.0 , -8.0 , C , S )

Output
A        =  -10.0
B        =  -1.666... (the digit 6 repeats)
C        =  -0.6
S        =  0.8

Example 4

This example shows the construction of a real Givens plane rotation, where |a| > |b|.

Call Statement and Input
             A     B    C   S
             |     |    |   |
CALL SROTG( 8.0 , 6.0 , C , S )

Output
A        =  10.0
B        =  0.6
C        =  0.8
S        =  0.6

Example 5

This example shows the construction of a complex Givens plane rotation, where |a| = 0.

Call Statement and Input
            A   B   C   S
            |   |   |   |
CALL CROTG( A , B , C , S )
 
A        =  (0.0, 0.0)
B        =  (1.0, 0.0)
 

Output
A        =  (1.0, 0.0)
C        =  0.0
S        =  (1.0, 0.0)

Example 6

This example shows the construction of a complex Givens plane rotation, where |a| <> 0.

Call Statement and Input
            A   B   C   S
            |   |   |   |
CALL CROTG( A , B , C , S )
 
A        =  (3.0, 4.0)
B        =  (4.0, 6.0)

Output
A        =  (5.26, 7.02)
C        =  0.57
S        =  (0.82, -0.05)

SROT, DROT, CROT, ZROT, CSROT, and ZDROT--Apply a Plane Rotation

SROT and DROT apply a real plane rotation to real vectors; CROT and ZROT apply a complex plane rotation to complex vectors; and CSROT and ZDROT apply a real plane rotation to complex vectors. The plane rotation is applied to n points, where the points to be rotated are contained in vectors x and y, and where the cosine and sine of the angle of rotation are c and s, respectively.

Table 49. Data Types
x, y                       c                       s                          Subprogram
Short-precision real       Short-precision real    Short-precision real       SROT
Long-precision real        Long-precision real     Long-precision real        DROT
Short-precision complex    Short-precision real    Short-precision complex    CROT
Long-precision complex     Long-precision real     Long-precision complex     ZROT
Short-precision complex    Short-precision real    Short-precision real       CSROT
Long-precision complex     Long-precision real     Long-precision real        ZDROT

Syntax

Fortran CALL SROT | DROT | CROT | ZROT | CSROT | ZDROT (n, x, incx, y, incy, c, s)
C and C++ srot | drot | crot | zrot | csrot | zdrot (n, x, incx, y, incy, c, s);
PL/I CALL SROT | DROT | CROT | ZROT | CSROT | ZDROT (n, x, incx, y, incy, c, s);

On Entry

n

is the number of points to be rotated--that is, the number of elements in vectors x and y. Specified as: a fullword integer; n >= 0.

x

is the vector x of length n, containing the xi coordinates of the points to be rotated. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 49.

incx

is the stride for vector x. Specified as: a fullword integer. It can have any value.

y

is the vector y of length n, containing the yi coordinates of the points to be rotated. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incy|, containing numbers of the data type indicated in Table 49.

incy

is the stride for vector y. Specified as: a fullword integer. It can have any value.

c

is the cosine, c, of the angle of rotation. Specified as: a number of the data type indicated in Table 49.

s

is the sine, s, of the angle of rotation. Specified as: a number of the data type indicated in Table 49.

On Return

x

is the vector x of length n, containing the rotated xi coordinates, where:
xi <-- cxi+syi    for i = 1, n

Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 49.

y

is the vector y of length n, containing the rotated yi coordinates, where:

For SROT, DROT, CSROT, and ZDROT:

yi <-- -sxi+cyi    for i = 1, n

For CROT and ZROT:

yi <-- -conj(s)xi+cyi    for i = 1, n

where conj(s) is the complex conjugate of s.

Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 49.

Note

The vectors x and y must have no common elements; otherwise, results are unpredictable. See "Concepts".

Function

Applying a plane rotation to n points, where the points to be rotated are contained in vectors x and y, is expressed as follows, where c and s are the cosine and sine of the angle of rotation, respectively.

For SROT, DROT, CSROT, and ZDROT:

xi <-- cxi+syi
yi <-- -sxi+cyi    for i = 1, n

For CROT and ZROT:

xi <-- cxi+syi
yi <-- -conj(s)xi+cyi    for i = 1, n

where conj(s) is the complex conjugate of s.

See references [54] and [73]. No computation is performed if n is 0 or if c is 1.0 and s is zero. For SROT, CROT, and CSROT, intermediate results are accumulated in long precision.
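
The rotation and its stride handling can be modeled in a few lines of Python. This is a reference sketch only, not the ESSL implementation: the name rot, the use of Python lists with 0-based indexing, and the ValueError for n < 0 are illustrative assumptions. The complex update uses the conjugate of s, which reproduces the CROT output in Example 4; for a real s the conjugate is s itself.

```python
import math

def rot(n, x, incx, y, incy, c, s):
    """Apply a plane rotation in place to n points held in strided lists x and y."""
    if n < 0:
        raise ValueError("n must be >= 0")
    if n == 0 or (c == 1.0 and s == 0):
        return  # no computation is performed in these cases
    # A negative stride starts at the far end of the array and walks backward.
    ix = 0 if incx >= 0 else (n - 1) * -incx
    iy = 0 if incy >= 0 else (n - 1) * -incy
    for _ in range(n):
        xi, yi = x[ix], y[iy]
        x[ix] = c * xi + s * yi
        # For a real s, conjugate() returns s unchanged, giving -s*xi + c*yi.
        y[iy] = -s.conjugate() * xi + c * yi
        ix += incx
        iy += incy

# Real rotation with incx = 1 and incy = 2; the 0.0 entries of y are the
# skipped elements. s = sqrt(3)/2 is an assumed value paired with c = 0.5.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [-1.0, 0.0, -2.0, 0.0, -3.0, 0.0, -4.0, 0.0, -5.0]
rot(5, x, 1, y, 2, 0.5, math.sqrt(3.0) / 2.0)
print([round(v, 3) for v in x])
print([round(v, 3) for v in y[::2]])
```

Everything here is computed in Python double precision, so the sketch does not model the short-precision behavior of SROT, CROT, and CSROT.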

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0

Example 1

This example shows how to apply a real plane rotation to real vectors x and y having positive strides.

Call Statement and Input
           N   X  INCX  Y  INCY   C    S
           |   |   |    |   |     |    |
CALL SROT( 5 , X , 1  , Y , 2  , 0.5 , S )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (-1.0, . , -2.0, . , -3.0, . , -4.0, . , -5.0)
S        =  0.866 (approximately sqrt(3)/2)

Output
X        =  (-0.366, -0.732, -1.098, -1.464, -1.830)
Y        =  (-1.366, -2.732, -4.098, -5.464, -6.830)

Example 2

This example shows how to apply a real plane rotation to real vectors x and y having strides of opposite sign.

Call Statement and Input
           N   X  INCX  Y   INCY   C    S
           |   |   |    |    |     |    |
CALL SROT( 5 , X , 1  , Y , -1  , 0.5 , S )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (-5.0, -4.0, -3.0, -2.0, -1.0)
S        =  0.866 (approximately sqrt(3)/2)

Output
X        =  (same as output X in Example 1)
Y        =  (-6.830, -5.464, -4.098, -2.732, -1.366)

Example 3

This example shows how scalar values in vectors x and y can be processed by specifying 0 strides and the number of elements to be processed, n, equal to 1.

Call Statement and Input
           N   X  INCX  Y  INCY   C    S
           |   |   |    |   |     |    |
CALL SROT( 1 , X , 0  , Y , 0  , 0.5 , S )
 
X        =  (1.0)
Y        =  (-1.0)
S        =  0.866 (approximately sqrt(3)/2)

Output
X        =  (-0.366)
Y        =  (-1.366)

Example 4

This example shows how to apply a complex plane rotation to complex vectors x and y having positive strides.

Call Statement and Input
           N   X  INCX  Y  INCY   C    S
           |   |   |    |   |     |    |
CALL CROT( 3 , X , 1  , Y , 2  , 0.5 , S )
 
X        =  ((1.0, 2.0), (2.0, 3.0), (3.0, 4.0))
Y        =  ((-1.0, 5.0), . , (-2.0, 4.0), . , (-3.0, 3.0))
S        =  (0.75, 0.50)

Output
X        =  ((-2.750, 4.250), (-2.500, 3.500), (-2.250, 2.750))
Y        =  ((-2.250, 1.500), . , (-4.000, 0.750), . ,
             (-5.750, 0.000))

Example 5

This example shows how to apply a real plane rotation to complex vectors x and y having positive strides.

Call Statement and Input
            N   X  INCX  Y  INCY   C    S
            |   |   |    |   |     |    |
CALL CSROT( 3 , X , 1  , Y , 2  , 0.5 , S )
 
X        =  ((1.0, 2.0), (2.0, 3.0), (3.0, 4.0))
Y        =  ((-1.0, 5.0), . , (-2.0, 4.0), . , (-3.0, 3.0))
S        =  0.866 (approximately sqrt(3)/2)

Output
X        =  ((-0.366, 5.330), (-0.732, 4.964), (-1.098, 4.598))
Y        =  ((-1.366, 0.768), . , (-2.732, -0.598), . ,
             (-4.098, -1.964))

SSCAL, DSCAL, CSCAL, ZSCAL, CSSCAL, and ZDSCAL--Multiply a Vector X by a Scalar and Store in the Vector X

These subprograms perform the following computation, using the scalar alpha and the vector x:

x <-- alpha x

Table 50. Data Types
alpha                      x                          Subprogram
Short-precision real       Short-precision real       SSCAL
Long-precision real        Long-precision real        DSCAL
Short-precision complex    Short-precision complex    CSCAL
Long-precision complex     Long-precision complex     ZSCAL
Short-precision real       Short-precision complex    CSSCAL
Long-precision real        Long-precision complex     ZDSCAL

Syntax

Fortran CALL SSCAL | DSCAL | CSCAL | ZSCAL | CSSCAL | ZDSCAL (n, alpha, x, incx)
C and C++ sscal | dscal | cscal | zscal | csscal | zdscal (n, alpha, x, incx);
PL/I CALL SSCAL | DSCAL | CSCAL | ZSCAL | CSSCAL | ZDSCAL (n, alpha, x, incx);

On Entry

n

is the number of elements in vector x. Specified as: a fullword integer; n >= 0.

alpha

is the scalar alpha. Specified as: a number of the data type indicated in Table 50.

x

is the vector x of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 50.

incx

is the stride for vector x. Specified as: a fullword integer. It can have any value.

On Return

x

is the vector x of length n, containing the result of the computation alpha x. Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 50.

Note

The fastest way in ESSL to zero out contiguous (stride 1) arrays is to call SSCAL or DSCAL, specifying incx = 1 and alpha = 0.

Function

The computation is expressed as follows:

xi <-- alpha xi    for i = 1, n

See reference [73]. If n is 0, no computation is performed. For CSCAL, intermediate results are accumulated in long precision.
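
The scaling loop and its stride convention can be sketched in Python. This is a reference model, not the ESSL code: the name scal, the 0-based Python lists, and the ValueError for n < 0 are illustrative assumptions.

```python
def scal(n, alpha, x, incx):
    """Scale n strided elements of the list x by alpha, in place."""
    if n < 0:
        raise ValueError("n must be >= 0")
    # A negative stride starts at the far end of the array and walks backward.
    ix = 0 if incx >= 0 else (n - 1) * -incx
    for _ in range(n):
        x[ix] = alpha * x[ix]
        ix += incx

# Same data scaled with stride 1 and with stride -1: the results agree.
forward = [1.0, 2.0, 3.0, 4.0, 5.0]
backward = [1.0, 2.0, 3.0, 4.0, 5.0]
scal(5, 2.0, forward, 1)
scal(5, 2.0, backward, -1)
print(forward == backward)

# Complex alpha and complex x, as handled by CSCAL and ZSCAL.
cx = [1 + 2j, 2 + 0j, 3 + 5j]
scal(3, 2 + 3j, cx, 1)
print(cx)
```

Calling scal(n, 0.0, x, 1) on finite data models the zeroing idiom described in the Note. The sketch says nothing about the speed of the ESSL routine.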

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0

Example 1

This example shows a vector, x, with a stride of 1.

Call Statement and Input
            N  ALPHA  X  INCX
            |    |    |   |
CALL SSCAL( 5 , 2.0 , X , 1  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)

Output
X        =  (2.0, 4.0, 6.0, 8.0, 10.0)

Example 2

This example shows a vector, x, with a stride greater than 1.

Call Statement and Input
            N  ALPHA  X  INCX
            |    |    |   |
CALL SSCAL( 5 , 2.0 , X , 2  )
 
X        =  (1.0, . , 2.0, . , 3.0, . , 4.0, . , 5.0)

Output
X        =  (2.0, . , 4.0, . , 6.0, . , 8.0, . , 10.0)

Example 3

This example illustrates that when the strides for two similar computations (Example 1 and Example 3) have the same absolute value but have opposite signs, the output is the same. This example is the same as Example 1, except the stride for x is negative (-1). For performance reasons, it is better to specify the positive stride. For x, processing begins at element X(5), which is 5.0, and results are stored beginning at the same element.

Call Statement and Input
            N  ALPHA  X   INCX
            |    |    |    |
CALL SSCAL( 5 , 2.0 , X , -1  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)

Output
X        =  (2.0, 4.0, 6.0, 8.0, 10.0)

Example 4

This example shows how SSCAL can be used to compute a scalar value. In this case, input vector x contains a scalar value, and the stride is 0. The number of elements to be processed, n, is 1.

Call Statement and Input
            N  ALPHA  X  INCX
            |    |    |   |
CALL SSCAL( 1 , 2.0 , X , 0  )
 
X        =  (1.0)

Output
X        =  (2.0)

Example 5

This example shows a scalar, alpha, and a vector, x, containing complex numbers, where vector x has a stride of 1.

Call Statement and Input
            N  ALPHA  X  INCX
            |    |    |   |
CALL CSCAL( 3 ,ALPHA, X , 1  )
 
ALPHA    =  (2.0, 3.0)
X        =  ((1.0, 2.0), (2.0, 0.0), (3.0, 5.0))

Output
X        =  ((-4.0, 7.0), (4.0, 6.0), (-9.0, 19.0))

Example 6

This example shows a scalar, alpha, containing a real number, and a vector, x, containing complex numbers, where vector x has a stride of 1.

Call Statement and Input
             N  ALPHA  X  INCX
             |    |    |   |
CALL CSSCAL( 3 , 2.0 , X , 1  )
 
X        =  ((1.0, 2.0), (2.0, 0.0), (3.0, 5.0))

Output
X        =  ((2.0, 4.0), (4.0, 0.0), (6.0, 10.0))

SSWAP, DSWAP, CSWAP, and ZSWAP--Interchange the Elements of Two Vectors

These subprograms interchange the elements of vectors x and y:

y <--> x

Table 51. Data Types
x, y                       Subprogram
Short-precision real       SSWAP
Long-precision real        DSWAP
Short-precision complex    CSWAP
Long-precision complex     ZSWAP

Syntax

Fortran CALL SSWAP | DSWAP | CSWAP | ZSWAP (n, x, incx, y, incy)
C and C++ sswap | dswap | cswap | zswap (n, x, incx, y, incy);
PL/I CALL SSWAP | DSWAP | CSWAP | ZSWAP (n, x, incx, y, incy);

On Entry

n

is the number of elements in vectors x and y. Specified as: a fullword integer; n >= 0.

x

is the vector x of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 51.

incx

is the stride for vector x. Specified as: a fullword integer. It can have any value.

y

is the vector y of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incy|, containing numbers of the data type indicated in Table 51.

incy

is the stride for vector y. Specified as: a fullword integer. It can have any value.

On Return

x

is the vector x of length n, containing the elements that were swapped from vector y. Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 51.

y

is the vector y of length n, containing the elements that were swapped from vector x. Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 51.

Notes

  1. If you specify the same vector for x and y, then incx and incy must be equal; otherwise, results are unpredictable.

  2. If you specify different vectors for x and y, they must have no common elements; otherwise, results are unpredictable. See "Concepts".

Function

The elements of vectors x and y are interchanged as follows:

xi <--> yi    for i = 1, n

See reference [73]. If n is 0, no elements are interchanged.
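
A small Python model makes the element pairing explicit, in particular for strides of opposite sign. It is a sketch under stated assumptions: the name swap, the 0-based Python lists, and the ValueError for n < 0 are illustrative and are not part of ESSL.

```python
def swap(n, x, incx, y, incy):
    """Interchange n strided elements of the lists x and y, in place."""
    if n < 0:
        raise ValueError("n must be >= 0")
    # A negative stride starts at the far end of the array and walks backward.
    ix = 0 if incx >= 0 else (n - 1) * -incx
    iy = 0 if incy >= 0 else (n - 1) * -incy
    for _ in range(n):
        x[ix], y[iy] = y[iy], x[ix]
        ix += incx
        iy += incy

# Strides 1 and -1 pair the first element of x with the last element of y,
# so each output vector is a reversed copy of the other input vector.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [-1.0, -2.0, -3.0, -4.0, -5.0]
swap(5, x, 1, y, -1)
print(x)
print(y)
```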

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0

Example 1

This example shows vectors x and y with positive strides.

Call Statement and Input
            N   X  INCX  Y  INCY
            |   |   |    |   |
CALL SSWAP( 5 , X , 1  , Y , 2  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (-1.0, . , -2.0, . , -3.0, . , -4.0, . , -5.0)

Output
X        =  (-1.0, -2.0, -3.0, -4.0, -5.0)
Y        =  (1.0, . , 2.0, . , 3.0, . , 4.0, . , 5.0)

Example 2

This example shows how to obtain output vectors x and y that are reverse copies of the input vectors y and x. You must specify strides with the same absolute value, but with opposite signs. For y, which has negative stride, processing begins at element Y(5), which is -5.0, and the results of the swap are stored beginning at the same element.

Call Statement and Input
            N   X  INCX  Y   INCY
            |   |   |    |    |
CALL SSWAP( 5 , X , 1  , Y , -1  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (-1.0, -2.0, -3.0, -4.0, -5.0)

Output
X        =  (-5.0, -4.0, -3.0, -2.0, -1.0)
Y        =  (5.0, 4.0, 3.0, 2.0, 1.0)

Example 3

This example shows how SSWAP can be used to interchange scalar values in vectors x and y by specifying 0 strides and the number of elements to be processed as 1.

Call Statement and Input
            N   X  INCX  Y  INCY
            |   |   |    |   |
CALL SSWAP( 1 , X , 0  , Y , 0  )
 
X        =  (1.0)
Y        =  (-4.0)

Output
X        =  (-4.0)
Y        =  (1.0)

Example 4

This example shows vectors x and y, containing complex numbers and having positive strides.

Call Statement and Input
            N   X  INCX  Y  INCY
            |   |   |    |   |
CALL CSWAP( 4 , X , 1  , Y , 2  )
 
X        =  ((1.0, 6.0), (2.0, 7.0), (3.0, 8.0), (4.0, 9.0))
Y        =  ((-1.0, -1.0), . , (-2.0, -2.0), . , (-3.0, -3.0), . ,
             (-4.0, -4.0))

Output
X        =  ((-1.0, -1.0), (-2.0, -2.0), (-3.0, -3.0), (-4.0, -4.0))
Y        =  ((1.0, 6.0), . , (2.0, 7.0), . , (3.0, 8.0), . ,
             (4.0, 9.0))

SVEA, DVEA, CVEA, and ZVEA--Add a Vector X to a Vector Y and Store in a Vector Z

These subprograms perform the following computation, using vectors x, y, and z:

z <-- x+y

Table 52. Data Types
x, y, z                    Subprogram
Short-precision real       SVEA
Long-precision real        DVEA
Short-precision complex    CVEA
Long-precision complex     ZVEA

Syntax

Fortran CALL SVEA | DVEA | CVEA | ZVEA (n, x, incx, y, incy, z, incz)
C and C++ svea | dvea | cvea | zvea (n, x, incx, y, incy, z, incz);
PL/I CALL SVEA | DVEA | CVEA | ZVEA (n, x, incx, y, incy, z, incz);

On Entry

n

is the number of elements in vectors x, y, and z. Specified as: a fullword integer; n >= 0.

x

is the vector x of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 52.

incx

is the stride for vector x. Specified as: a fullword integer. It can have any value.

y

is the vector y of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incy|, containing numbers of the data type indicated in Table 52.

incy

is the stride for vector y. Specified as: a fullword integer. It can have any value.

z

See 'On Return'.

incz

is the stride for vector z. Specified as: a fullword integer. It can have any value.

On Return

z

is the vector z of length n, containing the result of the computation. Returned as: a one-dimensional array of (at least) length 1+(n-1)|incz|, containing numbers of the data type indicated in Table 52.

Notes

  1. If you specify the same vector for x and z, then incx and incz must be equal; otherwise, results are unpredictable. The same is true for y and z.

  2. If you specify different vectors for x and z, they must have no common elements; otherwise, results are unpredictable. The same is true for y and z. See "Concepts".

Function

The computation is expressed as follows:

zi <-- xi+yi    for i = 1, n

If n is 0, no computation is performed.
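
The way positive, negative, and zero strides interact for the three vectors can be modeled in Python. This is a reference sketch, not the ESSL implementation: the names first_index and vea, the 0-based Python lists, and the ValueError for n < 0 are illustrative assumptions.

```python
def first_index(n, inc):
    """0-based position of the first element processed for a given stride."""
    return 0 if inc >= 0 else (n - 1) * -inc

def vea(n, x, incx, y, incy, z, incz):
    """Store x + y into z for n strided elements; x and y are left unchanged."""
    if n < 0:
        raise ValueError("n must be >= 0")
    ix, iy, iz = first_index(n, incx), first_index(n, incy), first_index(n, incz)
    for _ in range(n):
        z[iz] = x[ix] + y[iy]
        ix += incx
        iy += incy
        iz += incz

# Zero stride on x repeats its single element; negative stride on z stores
# the first result in the last position of z.
z = [0.0] * 5
vea(5, [1.0], 0, [5.0, 4.0, 3.0, 2.0, 1.0], 1, z, -1)
print(z)

# Zero stride on z: every result overwrites the same element, so only the
# sum from the last pair processed remains.
z0 = [0.0]
vea(5, [1.0, 2.0, 3.0, 4.0, 5.0], 1, [5.0], 0, z0, 0)
print(z0)
```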

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0

Example 1

This example shows vectors x, y, and z, with positive strides.

Call Statement and Input
           N   X  INCX  Y  INCY  Z  INCZ
           |   |   |    |   |    |   |
CALL SVEA( 5 , X , 1  , Y , 2  , Z , 1  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (1.0, . , 1.0, . , 1.0, . , 1.0, . , 1.0)

Output
Z        =  (2.0, 3.0, 4.0, 5.0, 6.0)

Example 2

This example shows vectors x and y having strides of opposite sign, and an output vector z having a positive stride. For y, which has negative stride, processing begins at element Y(5), which is 1.0.

Call Statement and Input
           N   X  INCX  Y   INCY  Z  INCZ
           |   |   |    |    |    |   |
CALL SVEA( 5 , X , 1  , Y , -1  , Z , 2  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (5.0, 4.0, 3.0, 2.0, 1.0)

Output
Z        =  (2.0, . , 4.0, . , 6.0, . , 8.0, . , 10.0)

Example 3

This example shows a vector, x, with 0 stride and a vector, z, with negative stride. x is treated like a vector of length n, all of whose elements are the same as the single element in x. For vector z, results are stored beginning in element Z(5).

Call Statement and Input
           N   X  INCX  Y  INCY  Z   INCZ
           |   |   |    |   |    |    |
CALL SVEA( 5 , X , 0  , Y , 1  , Z , -1  )
 
X        =  (1.0)
Y        =  (5.0, 4.0, 3.0, 2.0, 1.0)

Output
Z        =  (2.0, 3.0, 4.0, 5.0, 6.0)

Example 4

This example shows a vector, y, with 0 stride. y is treated like a vector of length n, all of whose elements are the same as the single element in y.

Call Statement and Input
           N   X  INCX  Y  INCY  Z  INCZ
           |   |   |    |   |    |   |
CALL SVEA( 5 , X , 1  , Y , 0  , Z , 1  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (5.0)

Output
Z        =  (6.0, 7.0, 8.0, 9.0, 10.0)

Example 5

This example shows the output vector, z, with 0 stride, where the vector x has positive stride, and the vector y has 0 stride. The number of elements to be processed, n, is greater than 1.

Call Statement and Input
           N   X  INCX  Y  INCY  Z  INCZ
           |   |   |    |   |    |   |
CALL SVEA( 5 , X , 1  , Y , 0  , Z , 0  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (5.0)

Output
Z        =  (10.0)

Example 6

This example shows the output vector z, with 0 stride, where the vector x has 0 stride, and the vector y has negative stride. The number of elements to be processed, n, is greater than 1.

Call Statement and Input
           N   X  INCX  Y   INCY  Z  INCZ
           |   |   |    |    |    |   |
CALL SVEA( 5 , X , 0  , Y , -1  , Z , 0  )
 
X        =  (1.0)
Y        =  (5.0, 4.0, 3.0, 2.0, 1.0)

Output
Z        =  (6.0)

Example 7

This example shows how SVEA can be used to compute a scalar value. In this case, vectors x and y contain scalar values. The strides of all vectors, x, y, and z, are 0. The number of elements to be processed, n, is 1.

Call Statement and Input
           N   X  INCX  Y  INCY  Z  INCZ
           |   |   |    |   |    |   |
CALL SVEA( 1 , X , 0  , Y , 0  , Z , 0  )
 
X        =  (1.0)
Y        =  (5.0)

Output
Z        =  (6.0)

Example 8

This example shows vectors x and y, containing complex numbers and having positive strides.

Call Statement and Input
           N   X  INCX  Y  INCY  Z  INCZ
           |   |   |    |   |    |   |
CALL CVEA( 3 , X , 1  , Y , 2  , Z , 1  )
 
X        =  ((1.0, 2.0), (3.0, 4.0), (5.0, 6.0))
Y        =  ((7.0, 8.0), . , (9.0, 10.0), . , (11.0, 12.0))

Output
Z        =  ((8.0, 10.0), (12.0, 14.0), (16.0, 18.0))

SVES, DVES, CVES, and ZVES--Subtract a Vector Y from a Vector X and Store in a Vector Z

These subprograms perform the following computation, using vectors x, y, and z:

z <-- x-y

Table 53. Data Types
x, y, z                    Subprogram
Short-precision real       SVES
Long-precision real        DVES
Short-precision complex    CVES
Long-precision complex     ZVES

Syntax

Fortran CALL SVES | DVES | CVES | ZVES (n, x, incx, y, incy, z, incz)
C and C++ sves | dves | cves | zves (n, x, incx, y, incy, z, incz);
PL/I CALL SVES | DVES | CVES | ZVES (n, x, incx, y, incy, z, incz);

On Entry

n

is the number of elements in vectors x, y, and z. Specified as: a fullword integer; n >= 0.

x

is the vector x of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 53.

incx

is the stride for vector x. Specified as: a fullword integer. It can have any value.

y

is the vector y of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incy|, containing numbers of the data type indicated in Table 53.

incy

is the stride for vector y. Specified as: a fullword integer. It can have any value.

z

See 'On Return'.

incz

is the stride for vector z. Specified as: a fullword integer. It can have any value.

On Return

z

is the vector z of length n, containing the result of the computation. Returned as: a one-dimensional array of (at least) length 1+(n-1)|incz|, containing numbers of the data type indicated in Table 53.

Notes

  1. If you specify the same vector for x and z, then incx and incz must be equal; otherwise, results are unpredictable. The same is true for y and z.

  2. If you specify different vectors for x and z, they must have no common elements; otherwise, results are unpredictable. The same is true for y and z. See "Concepts".

Function

The computation is expressed as follows:

zi <-- xi-yi    for i = 1, n

If n is 0, no computation is performed.
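
Note 1 allows the same vector to be passed for x and z when incx and incz are equal. The Python sketch below models that in-place use. The name ves, the 0-based Python lists, and the ValueError for n < 0 are illustrative assumptions, and the sequential loop is a reference model rather than the ESSL implementation.

```python
def ves(n, x, incx, y, incy, z, incz):
    """Store x - y into z for n strided elements."""
    if n < 0:
        raise ValueError("n must be >= 0")
    # A negative stride starts at the far end of the array and walks backward.
    ix = 0 if incx >= 0 else (n - 1) * -incx
    iy = 0 if incy >= 0 else (n - 1) * -incy
    iz = 0 if incz >= 0 else (n - 1) * -incz
    for _ in range(n):
        z[iz] = x[ix] - y[iy]
        ix += incx
        iy += incy
        iz += incz

# x is passed as both input and output with equal strides (incx = incz = 1);
# y has stride 0, so its single element is subtracted from every element of x.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
ves(5, x, 1, [5.0], 0, x, 1)
print(x)
```

The values left in x are the same as the Z output of Example 4, which uses the same x and y with a separate output vector.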

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0

Example 1

This example shows vectors x, y, and z, with positive strides.

Call Statement and Input
           N   X  INCX  Y  INCY  Z  INCZ
           |   |   |    |   |    |   |
CALL SVES( 5 , X , 1  , Y , 2  , Z , 1  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (1.0, . , 1.0, . , 1.0, . , 1.0, . , 1.0)

Output
Z        =  (0.0, 1.0, 2.0, 3.0, 4.0)

Example 2

This example shows vectors x and y having strides of opposite sign, and an output vector z having a positive stride. For y, which has negative stride, processing begins at element Y(5), which is 1.0.

Call Statement and Input
           N   X  INCX  Y   INCY  Z  INCZ
           |   |   |    |    |    |   |
CALL SVES( 5 , X , 1  , Y , -1  , Z , 2  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (5.0, 4.0, 3.0, 2.0, 1.0)

Output
Z        =  (0.0, . , 0.0, . , 0.0, . , 0.0, . , 0.0)

Example 3

This example shows a vector, x, with 0 stride, and a vector, z, with negative stride. x is treated like a vector of length n, all of whose elements are the same as the single element in x. For vector z, results are stored beginning in element Z(5).

Call Statement and Input
           N   X  INCX  Y  INCY  Z   INCZ
           |   |   |    |   |    |    |
CALL SVES( 5 , X , 0  , Y , 1  , Z , -1  )
 
X        =  (1.0)
Y        =  (5.0, 4.0, 3.0, 2.0, 1.0)

Output
Z        =  (0.0, -1.0, -2.0, -3.0, -4.0)

Example 4

This example shows a vector, y, with 0 stride. y is treated like a vector of length n, all of whose elements are the same as the single element in y.

Call Statement and Input
           N   X  INCX  Y  INCY  Z  INCZ
           |   |   |    |   |    |   |
CALL SVES( 5 , X , 1  , Y , 0  , Z , 1  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (5.0)

Output
Z        =  (-4.0, -3.0, -2.0, -1.0, 0.0)

Example 5

This example shows the output vector z, with 0 stride, where the vector x has positive stride, and the vector y has 0 stride. The number of elements to be processed, n, is greater than 1.

Call Statement and Input
           N   X  INCX  Y  INCY  Z  INCZ
           |   |   |    |   |    |   |
CALL SVES( 5 , X , 1  , Y , 0  , Z , 0  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (5.0)

Output
Z        =  (0.0)

Example 6

This example shows the output vector z, with 0 stride, where the vector x has 0 stride, and the vector y has negative stride. The number of elements to be processed, n, is greater than 1.

Call Statement and Input
           N   X  INCX  Y   INCY  Z  INCZ
           |   |   |    |    |    |   |
CALL SVES( 5 , X , 0  , Y , -1  , Z , 0  )
 
X        =  (1.0)
Y        =  (5.0, 4.0, 3.0, 2.0, 1.0)

Output
Z        =  (-4.0)

Example 7

This example shows how SVES can be used to compute a scalar value. In this case, vectors x and y contain scalar values. The strides of all vectors, x, y, and z, are 0. The number of elements to be processed, n, is 1.

Call Statement and Input
           N   X  INCX  Y  INCY  Z  INCZ
           |   |   |    |   |    |   |
CALL SVES( 1 , X , 0  , Y , 0  , Z , 0  )
 
X        =  (1.0)
Y        =  (5.0)

Output
Z        =  (-4.0)

Example 8

This example shows vectors x and y, containing complex numbers and having positive strides.

Call Statement and Input
           N   X  INCX  Y  INCY  Z  INCZ
           |   |   |    |   |    |   |
CALL CVES( 3 , X , 1  , Y , 2  , Z , 1  )
 
X        =  ((1.0, 2.0), (3.0, 4.0), (5.0, 6.0))
Y        =  ((7.0, 8.0), . , (9.0, 10.0), . , (11.0, 12.0))

Output
Z        =  ((-6.0, -6.0), (-6.0, -6.0), (-6.0, -6.0))

SVEM, DVEM, CVEM, and ZVEM--Multiply a Vector X by a Vector Y and Store in a Vector Z

These subprograms perform the following computation, using vectors x, y, and z:

z <-- xy

Table 54. Data Types
x, y, z                    Subprogram
Short-precision real       SVEM
Long-precision real        DVEM
Short-precision complex    CVEM
Long-precision complex     ZVEM

Syntax

Fortran CALL SVEM | DVEM | CVEM | ZVEM (n, x, incx, y, incy, z, incz)
C and C++ svem | dvem | cvem | zvem (n, x, incx, y, incy, z, incz);
PL/I CALL SVEM | DVEM | CVEM | ZVEM (n, x, incx, y, incy, z, incz);

On Entry

n

is the number of elements in vectors x, y, and z. Specified as: a fullword integer; n >= 0.

x

is the vector x of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 54.

incx

is the stride for vector x. Specified as: a fullword integer. It can have any value.

y

is the vector y of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incy|, containing numbers of the data type indicated in Table 54.

incy

is the stride for vector y. Specified as: a fullword integer. It can have any value.

z

See 'On Return'.

incz

is the stride for vector z. Specified as: a fullword integer. It can have any value.

On Return

z

is the vector z of length n, containing the result of the computation. Returned as: a one-dimensional array of (at least) length 1+(n-1)|incz|, containing numbers of the data type indicated in Table 54.

Notes

  1. If you specify the same vector for x and z, then incx and incz must be equal; otherwise, results are unpredictable. The same is true for y and z.

  2. If you specify different vectors for x and z, they must have no common elements; otherwise, results are unpredictable. The same is true for y and z. See "Concepts".

Function

The computation is expressed as follows:

zi <-- xiyi    for i = 1, n

If n is 0, no computation is performed. For CVEM, intermediate results are accumulated in long precision (short-precision Multiply followed by a long-precision Add), with the final result truncated to short precision.
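
The elementwise product, including the complex case, can be modeled in Python as follows. This is a sketch under stated assumptions: the name vem, the 0-based Python lists, and the ValueError for n < 0 are illustrative. Python works in double precision throughout, so the sketch does not model the short-precision multiply and truncation described for CVEM.

```python
def vem(n, x, incx, y, incy, z, incz):
    """Store the elementwise product of x and y into z for n strided elements."""
    if n < 0:
        raise ValueError("n must be >= 0")
    # A negative stride starts at the far end of the array and walks backward.
    ix = 0 if incx >= 0 else (n - 1) * -incx
    iy = 0 if incy >= 0 else (n - 1) * -incy
    iz = 0 if incz >= 0 else (n - 1) * -incz
    for _ in range(n):
        z[iz] = x[ix] * y[iy]
        ix += incx
        iy += incy
        iz += incz

# Complex operands with incx = 1, incy = 2, incz = 1; the 0j entries of y
# stand for the elements skipped by the stride.
x = [complex(1, 2), complex(3, 4), complex(5, 6)]
y = [complex(7, 8), 0j, complex(9, 10), 0j, complex(11, 12)]
z = [0j] * 3
vem(3, x, 1, y, 2, z, 1)
print(z)
```

Each product follows the usual complex rule (a+bi)(c+di) = (ac-bd) + (ad+bc)i, so the first element is (1*7 - 2*8) + (1*8 + 2*7)i = -9 + 22i.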

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0

Example 1

This example shows vectors x, y, and z, with positive strides.

Call Statement and Input
           N   X  INCX  Y  INCY  Z  INCZ
           |   |   |    |   |    |   |
CALL SVEM( 5 , X , 1  , Y , 2  , Z , 1  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (1.0, . , 1.0, . , 1.0, . , 1.0, . , 1.0)

Output
Z        =  (1.0, 2.0, 3.0, 4.0, 5.0)

Example 2

This example shows vectors x and y having strides of opposite sign, and an output vector z having a positive stride. For y, which has negative stride, processing begins at element Y(5), which is 1.0.

Call Statement and Input
           N   X  INCX  Y   INCY  Z  INCZ
           |   |   |    |    |    |   |
CALL SVEM( 5 , X , 1  , Y , -1  , Z , 2  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (5.0, 4.0, 3.0, 2.0, 1.0)

Output
Z        =  (1.0, . , 4.0, . , 9.0, . , 16.0, . , 25.0)

Example 3

This example shows a vector, x, with 0 stride, and a vector, z, with negative stride. x is treated like a vector of length n, all of whose elements are the same as the single element in x. For vector z, results are stored beginning in element Z(5).

Call Statement and Input
           N   X  INCX  Y  INCY  Z   INCZ
           |   |   |    |   |    |    |
CALL SVEM( 5 , X , 0  , Y , 1  , Z , -1  )
 
X        =  (1.0)
Y        =  (5.0, 4.0, 3.0, 2.0, 1.0)

Output
Z        =  (1.0, 2.0, 3.0, 4.0, 5.0)

Example 4

This example shows a vector, y, with 0 stride. y is treated like a vector of length n, all of whose elements are the same as the single element in y.

Call Statement and Input
           N   X  INCX  Y  INCY  Z  INCZ
           |   |   |    |   |    |   |
CALL SVEM( 5 , X , 1  , Y , 0  , Z , 1  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (5.0)

Output
Z        =  (5.0, 10.0, 15.0, 20.0, 25.0)

Example 5

This example shows the output vector, z, with 0 stride, where the vector x has positive stride, and the vector y has 0 stride. The number of elements to be processed, n, is greater than 1.

Call Statement and Input
           N   X  INCX  Y  INCY  Z  INCZ
           |   |   |    |   |    |   |
CALL SVEM( 5 , X , 1  , Y , 0  , Z , 0  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (5.0)

Output
Z        =  (25.0)

Example 6

This example shows the output vector z, with 0 stride, where the vector x has 0 stride, and the vector y has negative stride. The number of elements to be processed, n, is greater than 1.

Call Statement and Input
           N   X  INCX  Y   INCY  Z  INCZ
           |   |   |    |    |    |   |
CALL SVEM( 5 , X , 0  , Y , -1  , Z , 0  )
 
X        =  (1.0)
Y        =  (5.0, 4.0, 3.0, 2.0, 1.0)

Output
Z        =  (5.0)

Example 7

This example shows how SVEM can be used to compute a scalar value. In this case, vectors x and y contain scalar values. The strides of all vectors, x, y, and z, are 0. The number of elements to be processed, n, is 1.

Call Statement and Input
           N   X  INCX  Y  INCY  Z  INCZ
           |   |   |    |   |    |   |
CALL SVEM( 1 , X , 0  , Y , 0  , Z , 0  )
 
X        =  (1.0)
Y        =  (5.0)

Output
Z        =  (5.0)

Example 8

This example shows vectors x and y, containing complex numbers and having positive strides.

Call Statement and Input
           N   X  INCX  Y  INCY  Z  INCZ
           |   |   |    |   |    |   |
CALL CVEM( 3 , X , 1  , Y , 2  , Z , 1  )
 
X        =  ((1.0, 2.0), (3.0, 4.0), (5.0, 6.0))
Y        =  ((7.0, 8.0), . , (9.0, 10.0), . , (11.0, 12.0))

Output
Z        =  ((-9.0, 22.0), (-13.0, 66.0), (-17.0, 126.0))

SYAX, DYAX, CYAX, ZYAX, CSYAX, and ZDYAX--Multiply a Vector X by a Scalar and Store in a Vector Y

These subprograms perform the following computation, using the scalar alpha and vectors x and y:

y <-- alpha x

Table 55. Data Types
alpha                      x, y                       Subprogram
Short-precision real       Short-precision real       SYAX
Long-precision real        Long-precision real        DYAX
Short-precision complex    Short-precision complex    CYAX
Long-precision complex     Long-precision complex     ZYAX
Short-precision real       Short-precision complex    CSYAX
Long-precision real        Long-precision complex     ZDYAX

Syntax

Fortran CALL SYAX | DYAX | CYAX | ZYAX | CSYAX | ZDYAX (n, alpha, x, incx, y, incy)
C and C++ syax | dyax | cyax | zyax | csyax | zdyax (n, alpha, x, incx, y, incy);
PL/I CALL SYAX | DYAX | CYAX | ZYAX | CSYAX | ZDYAX (n, alpha, x, incx, y, incy);

On Entry

n

is the number of elements in vectors x and y. Specified as: a fullword integer; n >= 0.

alpha

is the scalar alpha. Specified as: a number of the data type indicated in Table 55.

x

is the vector x of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 55.

incx

is the stride for vector x. Specified as: a fullword integer. It can have any value.

y

See 'On Return'.

incy

is the stride for vector y. Specified as: a fullword integer. It can have any value.

On Return

y

is the vector y of length n, containing the result of the computation alpha x. Returned as: a one-dimensional array of (at least) length 1+(n-1)|incy|, containing numbers of the data type indicated in Table 55.

Notes

  1. If you specify the same vector for x and y, then incx and incy must be equal; otherwise, results are unpredictable.

  2. If you specify different vectors for x and y, they must have no common elements; otherwise, results are unpredictable. See "Concepts".

Function

The computation is expressed as follows:

yi <-- alpha xi    for i = 1, n

See reference [73]. If n is 0, no computation is performed. For CYAX, intermediate results are accumulated in long precision.
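
Unlike the in-place scaling subprograms, these subprograms leave x unchanged and write the scaled values to y. The Python sketch below models that behavior. The name yax, the 0-based Python lists, and the ValueError for n < 0 are illustrative assumptions, not part of ESSL.

```python
def yax(n, alpha, x, incx, y, incy):
    """Store alpha * x into y for n strided elements; x is left unchanged."""
    if n < 0:
        raise ValueError("n must be >= 0")
    # A negative stride starts at the far end of the array and walks backward.
    ix = 0 if incx >= 0 else (n - 1) * -incx
    iy = 0 if incy >= 0 else (n - 1) * -incy
    for _ in range(n):
        y[iy] = alpha * x[ix]
        ix += incx
        iy += incy

# incy = -1: the first result is stored in the last position of y,
# so y receives the scaled elements of x in reverse order.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [0.0] * 5
yax(5, 2.0, x, 1, y, -1)
print(y)

# incx = 0: the single element of x is reused for every element of y.
filled = [0.0] * 5
yax(5, 2.0, [1.0], 0, filled, 1)
print(filled)
```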

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0

Example 1

This example shows vectors x and y with positive strides.

Call Statement and Input
           N  ALPHA  X  INCX  Y  INCY
           |    |    |   |    |   |
CALL SYAX( 5 , 2.0 , X , 1  , Y , 2  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)

Output
Y        =  (2.0, . , 4.0, . , 6.0, . , 8.0, . , 10.0)

Example 2

This example shows vectors x and y that have strides of opposite signs. For y, which has negative stride, results are stored beginning in element Y(5).

Call Statement and Input
           N  ALPHA  X  INCX  Y   INCY
           |    |    |   |    |    |
CALL SYAX( 5 , 2.0 , X , 1  , Y , -1  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)

Output
Y        =  (10.0, 8.0, 6.0, 4.0, 2.0)

Example 3

This example shows a vector, x, with 0 stride. x is treated like a vector of length n, all of whose elements are the same as the single element in x.

Call Statement and Input
           N  ALPHA  X  INCX  Y  INCY
           |    |    |   |    |   |
CALL SYAX( 5 , 2.0 , X , 0  , Y , 1  )
 
X        =  (1.0)

Output
Y        =  (2.0, 2.0, 2.0, 2.0, 2.0)

Example 4

This example shows how SYAX can be used to compute a scalar value. In this case both vectors x and y contain scalar values, and the strides for both vectors are 0. The number of elements to be processed, n, is 1.

Call Statement and Input
           N  ALPHA  X  INCX  Y  INCY
           |    |    |   |    |   |
CALL SYAX( 1 , 2.0 , X , 0  , Y , 0  )
 
X        =  (1.0)

Output
Y        =  (2.0)

Example 5

This example shows a scalar, alpha, and vectors x and y, containing complex numbers, where both vectors have a stride of 1.

Call Statement and Input
           N  ALPHA  X  INCX  Y  INCY
           |    |    |   |    |   |
CALL CYAX( 3 ,ALPHA, X , 1  , Y , 1  )
 
ALPHA    =  (2.0, 3.0)
X        =  ((1.0, 2.0), (2.0, 0.0), (3.0, 5.0))

Output
Y        =  ((-4.0, 7.0), (4.0, 6.0), (-9.0, 19.0))

Example 6

This example shows a scalar, alpha, containing a real number, and vectors x and y, containing complex numbers, where both vectors have a stride of 1.

Call Statement and Input
            N  ALPHA  X  INCX  Y  INCY
            |    |    |   |    |   |
CALL CSYAX( 3 , 2.0 , X , 1  , Y , 1  )
 
X        =  ((1.0, 2.0), (2.0, 0.0), (3.0, 5.0))

Output
Y        =  ((2.0, 4.0), (4.0, 0.0), (6.0, 10.0))

SZAXPY, DZAXPY, CZAXPY, and ZZAXPY--Multiply a Vector X by a Scalar, Add to a Vector Y, and Store in a Vector Z

These subprograms perform the following computation, using the scalar alpha and vectors x, y, and z:

z <-- y + alpha x

Table 56. Data Types
alpha, x, y, z           Subprogram
Short-precision real     SZAXPY
Long-precision real      DZAXPY
Short-precision complex  CZAXPY
Long-precision complex   ZZAXPY

Syntax

Fortran CALL SZAXPY | DZAXPY | CZAXPY | ZZAXPY (n, alpha, x, incx, y, incy, z, incz)
C and C++ szaxpy | dzaxpy | czaxpy | zzaxpy (n, alpha, x, incx, y, incy, z, incz);
PL/I CALL SZAXPY | DZAXPY | CZAXPY | ZZAXPY (n, alpha, x, incx, y, incy, z, incz);

On Entry

n

is the number of elements in vectors x, y, and z. Specified as: a fullword integer; n >= 0.

alpha

is the scalar alpha. Specified as: a number of the data type indicated in Table 56.

x

is the vector x of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 56.

incx

is the stride for vector x. Specified as: a fullword integer. It can have any value.

y

is the vector y of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incy|, containing numbers of the data type indicated in Table 56.

incy

is the stride for vector y. Specified as: a fullword integer. It can have any value.

z

See 'On Return'.

incz

is the stride for vector z. Specified as: a fullword integer. It can have any value.

On Return

z

is the vector z of length n, containing the result of the computation y + alpha x. Returned as: a one-dimensional array of (at least) length 1+(n-1)|incz|, containing numbers of the data type indicated in Table 56.

Notes

  1. If you specify the same vector for x and z, then incx and incz must be equal; otherwise, results are unpredictable. The same is true for y and z.

  2. If you specify different vectors for x and z, they must have no common elements; otherwise, results are unpredictable. The same is true for y and z. See "Concepts".

Function

The computation is expressed as follows:

z_i <-- y_i + alpha x_i    for i = 1, n

See reference [73]. If n is 0, no computation is performed. For CZAXPY, intermediate results are accumulated in long precision.
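
The following C sketch shows one way to read this computation when the three vectors have independent strides. It is an illustration only, for the short-precision real case: ref_szaxpy is a hypothetical helper name, indexing is 0-based, and the start-at-the-far-end treatment of negative strides is an assumption taken from Examples 2 and 3.

```c
#include <stddef.h>

/* Offset of the first element processed for a vector of n elements.
 * Assumption: negative strides start at the last storage location. */
static ptrdiff_t first_offset(int n, int inc)
{
    return (inc < 0) ? (ptrdiff_t)(1 - n) * inc : 0;
}

/* Hypothetical reference helper (not an ESSL routine).
 * Computes z_i <- y_i + alpha * x_i for i = 1..n. */
void ref_szaxpy(int n, float alpha, const float *x, int incx,
                const float *y, int incy, float *z, int incz)
{
    if (n <= 0)
        return; /* n = 0: no computation */

    ptrdiff_t ix = first_offset(n, incx);
    ptrdiff_t iy = first_offset(n, incy);
    ptrdiff_t iz = first_offset(n, incz);

    for (int i = 0; i < n; i++) {
        z[iz] = y[iy] + alpha * x[ix];
        ix += incx;
        iy += incy;
        iz += incz;
    }
}
```

With the inputs of Example 3 (incx = 0, incy = 1, incz = -1), the sketch stores 7.0 in the last location of Z first and finishes with Z = (3.0, 4.0, 5.0, 6.0, 7.0).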

Error Conditions

Computational Errors

None

Input-Argument Errors

n < 0

Example 1

This example shows vectors x and y with positive strides.

Call Statement and Input
             N  ALPHA  X  INCX  Y  INCY  Z  INCZ
             |    |    |   |    |   |    |   |
CALL SZAXPY( 5 , 2.0 , X , 1  , Y , 2  , Z , 1  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (1.0, . , 1.0, . , 1.0, . , 1.0, . , 1.0)

Output
Z        =  (3.0, 5.0, 7.0, 9.0, 11.0)

Example 2

This example shows vectors x and y having strides of opposite sign, and an output vector z having a positive stride. For y, which has negative stride, processing begins at element Y(5), which is 1.0.

Call Statement and Input
             N  ALPHA  X  INCX  Y   INCY  Z  INCZ
             |    |    |   |    |    |    |   |
CALL SZAXPY( 5 , 2.0 , X , 1  , Y , -1  , Z , 2  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (5.0, 4.0, 3.0, 2.0, 1.0)

Output
Z        =  (3.0, . , 6.0, . , 9.0, . , 12.0, . , 15.0)

Example 3

This example shows a vector, x, with 0 stride, and a vector, z, with negative stride. x is treated like a vector of length n, all of whose elements are the same as the single element in x. For vector z, results are stored beginning in element Z(5).

Call Statement and Input
             N  ALPHA  X  INCX  Y  INCY  Z   INCZ
             |    |    |   |    |   |    |    |
CALL SZAXPY( 5 , 2.0 , X , 0  , Y , 1  , Z , -1  )
 
X        =  (1.0)
Y        =  (5.0, 4.0, 3.0, 2.0, 1.0)

Output
Z        =  (3.0, 4.0, 5.0, 6.0, 7.0)

Example 4

This example shows a vector, y, with 0 stride. y is treated like a vector of length n, all of whose elements are the same as the single element in y.

Call Statement and Input
             N  ALPHA  X  INCX  Y  INCY  Z  INCZ
             |    |    |   |    |   |    |   |
CALL SZAXPY( 5 , 2.0 , X , 1  , Y , 0  , Z , 1  )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (5.0)

Output
Z        =  (7.0, 9.0, 11.0, 13.0, 15.0)

Example 5

This example shows how SZAXPY can be used to compute a scalar value. In this case, vectors x and y contain scalar values. The strides of all vectors, x, y, and z, are 0. The number of elements to be processed, n, is 1.

Call Statement and Input
             N  ALPHA  X  INCX  Y  INCY  Z  INCZ
             |    |    |   |    |   |    |   |
CALL SZAXPY( 1 , 2.0 , X , 0  , Y , 0  , Z , 0  )
 
X        =  (1.0)
Y        =  (5.0)

Output
Z        =  (7.0)

Example 6

This example shows vectors x and y, containing complex numbers and having positive strides.

Call Statement and Input
             N  ALPHA  X  INCX  Y  INCY  Z  INCZ
             |    |    |   |    |   |    |   |
CALL CZAXPY( 3 ,ALPHA, X , 1  , Y , 2  , Z , 1  )
 
ALPHA    =  (2.0, 3.0)
X        =  ((1.0, 2.0), (2.0, 0.0), (3.0, 5.0))
Y        =  ((1.0, 1.0), . , (0.0, 2.0), . , (5.0, 4.0))

Output
Z        =  ((-3.0, 8.0), (4.0, 8.0), (-4.0, 23.0))

Sparse Vector-Scalar Subprograms

This section contains the sparse vector-scalar subprogram descriptions.

SSCTR, DSCTR, CSCTR, and ZSCTR--Scatter the Elements of a Sparse Vector X in Compressed-Vector Storage Mode into Specified Elements of a Sparse Vector Y in Full-Vector Storage Mode

These subprograms scatter the elements of sparse vector x, stored in compressed-vector storage mode, into specified elements of sparse vector y, stored in full-vector storage mode.

Table 57. Data Types
x, y                     Subprogram
Short-precision real     SSCTR
Long-precision real      DSCTR
Short-precision complex  CSCTR
Long-precision complex   ZSCTR

Syntax

Fortran CALL SSCTR | DSCTR | CSCTR | ZSCTR (nz, x, indx, y)
C and C++ ssctr | dsctr | csctr | zsctr (nz, x, indx, y);
PL/I CALL SSCTR | DSCTR | CSCTR | ZSCTR (nz, x, indx, y);

On Entry

nz

is the number of elements in sparse vector x, stored in compressed-vector storage mode. Specified as: a fullword integer; nz >= 0.

x

is the sparse vector x, containing nz elements, stored in compressed-vector storage mode in an array, referred to as X. Specified as: a one-dimensional array of (at least) length nz, containing numbers of the data type indicated in Table 57.

indx

is the array, referred to as INDX, containing the nz indices that indicate the positions of the elements of the sparse vector x when in full-vector storage mode. They also indicate the positions in vector y into which the elements are copied.

Specified as: a one-dimensional array of (at least) length nz, containing fullword integers.

y

See 'On Return'.

On Return

y

is the sparse vector y, stored in full-vector storage mode, of (at least) length max(INDX(i)) for i = 1, nz, into which nz elements of vector x are copied at positions indicated by the indices array INDX.

Returned as: a one-dimensional array of (at least) length max(INDX(i)) for i = 1, nz, containing numbers of the data type indicated in Table 57.

Notes

  1. Each value specified in array INDX must be unique; otherwise, results are unpredictable.

  2. Vectors x and y must have no common elements; otherwise, results are unpredictable. See "Concepts".

  3. For a description of how sparse vectors are stored, see "Sparse Vector".

Function

The copy is expressed as follows:

y_INDX(i) <-- x_i    for i = 1, nz

where:

x is a sparse vector, stored in compressed-vector storage mode.
INDX is the indices array for sparse vector x.
y is a sparse vector, stored in full-vector storage mode.

See reference [29]. If nz is 0, no copy is performed.
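
A minimal C sketch of the scatter, for the short-precision real case, may make the role of INDX concrete. It is not ESSL code: ref_ssctr is a hypothetical helper name, and the sketch assumes that INDX holds 1-based positions, as in the examples for this subroutine, so it subtracts 1 before indexing the 0-based C array.

```c
/* Hypothetical reference helper (not an ESSL routine).
 * Scatter: y_INDX(i) <- x_i for i = 1..nz.
 * Assumption: indx[] holds 1-based positions, as in the Fortran examples.
 * Elements of y whose positions do not appear in indx[] are left unchanged. */
void ref_ssctr(int nz, const float *x, const int *indx, float *y)
{
    for (int i = 0; i < nz; i++)
        y[indx[i] - 1] = x[i];
}
```

Because each iteration writes one location of y, a repeated value in INDX would make the final content of that location depend on processing order, which is consistent with Note 1 requiring the values to be unique.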

Error Conditions

Computational Errors

None

Input-Argument Errors

nz < 0

Example 1

This example shows how to use SSCTR to copy a sparse vector x of length 5 into the following vector y, where the elements of array INDX are in ascending order:

          Y = (6.0, 2.0, 4.0, 7.0, 6.0, 10.0, -2.0, 8.0, 9.0, 0.0 )

Call Statement and Input
            NZ  X   INDX   Y
            |   |    |     |
CALL SSCTR( 5 , X , INDX , Y )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
INDX     =  (1, 3, 4, 7, 10)

Output
Y        =  (1.0, 2.0, 2.0, 3.0, 6.0, 10.0, 4.0, 8.0, 9.0, 5.0)

Example 2

This example shows how to use SSCTR to copy a sparse vector x of length 5 into the following vector y, where the elements of array INDX are in random order:

         Y = (6.0, 2.0, 4.0, 7.0, 6.0, 10.0, -2.0, 8.0, 9.0, 0.0 )

Call Statement and Input
            NZ  X   INDX   Y
            |   |    |     |
CALL SSCTR( 5 , X , INDX , Y )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
INDX     =  (4, 3, 1, 10, 7)

Output
Y        =  (3.0, 2.0, 2.0, 1.0, 6.0, 10.0, 5.0, 8.0, 9.0, 4.0)

Example 3

This example shows how to use CSCTR to copy a sparse vector x of length 3 into the following vector y, where the elements of array INDX are in random order:

         Y = ((6.0, 5.0), (-2.0, 3.0), (15.0, 4.0), (9.0, 0.0))

Call Statement and Input
            NZ  X   INDX   Y
            |   |    |     |
CALL CSCTR( 3 , X , INDX , Y )
 
X        =  ((1.0, 2.0), (3.0, 4.0), (5.0, 6.0))
INDX     =  (4, 1, 3)

Output
Y        =  ((3.0, 4.0), (-2.0, 3.0), (5.0, 6.0), (1.0, 2.0))

SGTHR, DGTHR, CGTHR, and ZGTHR--Gather Specified Elements of a Sparse Vector Y in Full-Vector Storage Mode into a Sparse Vector X in Compressed-Vector Storage Mode

These subprograms gather specified elements of vector y, stored in full-vector storage mode, into sparse vector x, stored in compressed-vector storage mode.

Table 58. Data Types
x, y                     Subprogram
Short-precision real     SGTHR
Long-precision real      DGTHR
Short-precision complex  CGTHR
Long-precision complex   ZGTHR

Syntax

Fortran CALL SGTHR | DGTHR | CGTHR | ZGTHR (nz, y, x, indx)
C and C++ sgthr | dgthr | cgthr | zgthr (nz, y, x, indx);
PL/I CALL SGTHR | DGTHR | CGTHR | ZGTHR (nz, y, x, indx);

On Entry

nz

is the number of elements in sparse vector x, stored in compressed-vector storage mode. Specified as: a fullword integer; nz >= 0.

y

is the sparse vector y, stored in full-vector storage mode, of (at least) length max(INDX(i)) for i = 1, nz, from which nz elements are copied from positions indicated by the indices array INDX.

Specified as: a one-dimensional array of (at least) length max(INDX(i)) for i = 1, nz, containing numbers of the data type indicated in Table 58.

x

See 'On Return'.

indx

is the array, referred to as INDX, containing the nz indices that indicate the positions of the elements of the sparse vector x when in full-vector storage mode. They also indicate the positions in vector y from which elements are copied.

Specified as: a one-dimensional array of (at least) length nz, containing fullword integers.

On Return

x

is the sparse vector x, containing nz elements, stored in compressed-vector storage mode in an array, referred to as X, into which are copied the elements of vector y from positions indicated by the indices array INDX.

Returned as: a one-dimensional array of (at least) length nz, containing numbers of the data type indicated in Table 58.

Notes

  1. Vectors x and y must have no common elements; otherwise, results are unpredictable. See "Concepts".

  2. For a description of how sparse vectors are stored, see "Sparse Vector".

Function

The copy is expressed as follows:

x_i <-- y_INDX(i)    for i = 1, nz

where:

x is a sparse vector, stored in compressed-vector storage mode.
INDX is the indices array for sparse vector x.
y is a sparse vector, stored in full-vector storage mode.

See reference [29]. If nz is 0, no copy is performed.
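
The gather is the inverse access pattern of the scatter and can be sketched in C as follows for the short-precision real case. The helper name ref_sgthr is hypothetical, and the sketch again assumes 1-based values in INDX, as used in the examples for this subroutine.

```c
/* Hypothetical reference helper (not an ESSL routine).
 * Gather: x_i <- y_INDX(i) for i = 1..nz.
 * Assumption: indx[] holds 1-based positions, as in the Fortran examples.
 * y is only read; x receives nz values in the order given by indx[]. */
void ref_sgthr(int nz, const float *y, float *x, const int *indx)
{
    for (int i = 0; i < nz; i++)
        x[i] = y[indx[i] - 1];
}
```

A value of 0.0 in y is copied like any other value, which is why Example 2 ends with a 0.0 stored in the compressed vector.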

Error Conditions

Computational Errors

None

Input-Argument Errors

nz < 0

Example 1

This example shows how to use SGTHR to copy specified elements of a vector y into a sparse vector x of length 5, where the elements of array INDX are in ascending order.

Call Statement and Input
            NZ  Y   X   INDX
            |   |   |    |
CALL SGTHR( 5 , Y , X , INDX )
 
Y        =  (6.0, 2.0, 4.0, 7.0, 6.0, 10.0, -2.0, 8.0, 9.0, 0.0)
INDX     =  (1, 3, 4, 7, 9)

Output
X        =  (6.0, 4.0, 7.0, -2.0, 9.0)

Example 2

This example shows how to use SGTHR to copy specified elements of a vector y into a sparse vector x of length 5, where the elements of array INDX are in random order. (Note that the element 0.0 occurs in output vector x. This does not produce an error.)

Call Statement and Input
            NZ  Y   X   INDX
            |   |   |    |
CALL SGTHR( 5 , Y , X , INDX )
 
Y        =  (6.0, 2.0, 4.0, 7.0, 6.0, 10.0, -2.0, 8.0, 9.0, 0.0)
INDX     =  (4, 3, 1, 10, 7)

Output
X        =  (7.0, 4.0, 6.0, 0.0, -2.0)

Example 3

This example shows how to use CGTHR to copy specified elements of a vector, y, into a sparse vector, x, of length 3, where the elements of array INDX are in random order.

Call Statement and Input
            NZ  Y   X   INDX
            |   |   |    |
CALL CGTHR( 3 , Y , X , INDX )
 
Y        =  ((6.0, 5.0), (-2.0, 3.0), (15.0, 4.0), (9.0, 0.0))
INDX     =  (4, 1, 3)

Output
X        =  ((9.0, 0.0), (6.0, 5.0), (15.0, 4.0))

SGTHRZ, DGTHRZ, CGTHRZ, and ZGTHRZ--Gather Specified Elements of a Sparse Vector Y in Full-Vector Mode into a Sparse Vector X in Compressed-Vector Mode, and Zero the Same Specified Elements of Y

These subprograms gather specified elements of sparse vector y, stored in full-vector storage mode, into sparse vector x, stored in compressed-vector storage mode, and zero the same specified elements of vector y.

Table 59. Data Types
x, y                     Subprogram
Short-precision real     SGTHRZ
Long-precision real      DGTHRZ
Short-precision complex  CGTHRZ
Long-precision complex   ZGTHRZ

Syntax

Fortran CALL SGTHRZ | DGTHRZ | CGTHRZ | ZGTHRZ (nz, y, x, indx)
C and C++ sgthrz | dgthrz | cgthrz | zgthrz (nz, y, x, indx);
PL/I CALL SGTHRZ | DGTHRZ | CGTHRZ | ZGTHRZ (nz, y, x, indx);

On Entry

nz

is the number of elements in sparse vector x, stored in compressed-vector storage mode. Specified as: a fullword integer; nz >= 0.

y

is the sparse vector y, stored in full-vector storage mode, of (at least) length max(INDX(i)) for i = 1, nz, from which nz elements are copied from positions indicated by the indices array INDX.

Specified as: a one-dimensional array of (at least) length max(INDX(i)) for i = 1, nz, containing numbers of the data type indicated in Table 59.

x

See 'On Return'.

indx

is the array, referred to as INDX, containing the nz indices that indicate the positions of the elements of the sparse vector x when in full-vector storage mode. They also indicate the positions in vector y from which elements are copied and then set to zero.

Specified as: a one-dimensional array of (at least) length nz, containing fullword integers.

On Return

y

is the sparse vector y, stored in full-vector storage mode, of (at least) length max(INDX(i)) for i = 1, nz, whose elements are set to zero at positions indicated by the indices array INDX.

Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 59.

x

is the sparse vector x, containing nz elements, stored in compressed-vector storage mode in an array, referred to as X, into which are copied the elements of vector y from positions indicated by the indices array INDX.

Returned as: a one-dimensional array of (at least) length nz, containing numbers of the data type indicated in Table 59.

Notes

  1. Each value specified in array INDX must be unique; otherwise, results are unpredictable.

  2. Vectors x and y must have no common elements; otherwise, results are unpredictable. See "Concepts".

  3. For a description of how sparse vectors are stored, see "Sparse Vector".

Function

The copy is expressed as follows:

x_i <-- y_INDX(i)
y_INDX(i) <-- 0.0    (for SGTHRZ and DGTHRZ)
y_INDX(i) <-- (0.0,0.0)    (for CGTHRZ and ZGTHRZ)
   for i = 1, nz

where:

x is a sparse vector, stored in compressed-vector storage mode.
INDX is the indices array for sparse vector x.
y is a sparse vector, stored in full-vector storage mode.

See reference [29]. If nz is 0, no computation is performed.
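
The two steps of this operation, copy and then zero, can be sketched in C for the short-precision real case. This is an illustration under stated assumptions rather than the ESSL implementation: ref_sgthrz is a hypothetical helper name, and INDX is assumed to hold 1-based positions as in the examples.

```c
/* Hypothetical reference helper (not an ESSL routine).
 * For i = 1..nz:  x_i <- y_INDX(i), then y_INDX(i) <- 0.0.
 * Assumption: indx[] holds 1-based positions, as in the Fortran examples. */
void ref_sgthrz(int nz, float *y, float *x, const int *indx)
{
    for (int i = 0; i < nz; i++) {
        int k = indx[i] - 1; /* 0-based location in y */
        x[i] = y[k];         /* gather the element first */
        y[k] = 0.0f;         /* then clear it in the full vector */
    }
}
```

If a position appeared twice in INDX, the second visit would gather the 0.0 written by the first, so the result would depend on processing order. This is one way to see why Note 1 requires unique values.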

Error Conditions

Computational Errors

None

Input-Argument Errors

nz < 0

Example 1

This example shows how to use SGTHRZ to copy specified elements of a vector y into a sparse vector x of length 5, where the elements of array INDX are in ascending order.

Call Statement and Input
             NZ  Y   X   INDX
             |   |   |    |
CALL SGTHRZ( 5 , Y , X , INDX )
 
Y        =  (6.0, 2.0, 4.0, 7.0, 6.0, 10.0, -2.0, 8.0, 9.0, 0.0)
INDX     =  (1, 3, 4, 7, 9)

Output
Y        =  (0.0, 2.0, 0.0, 0.0, 6.0, 10.0, 0.0, 8.0, 0.0, 0.0)
X        =  (6.0, 4.0, 7.0, -2.0, 9.0)

Example 2

This example shows how to use SGTHRZ to copy specified elements of a vector y into a sparse vector x of length 5, where the elements of array INDX are in random order. (Note that the element 0.0 occurs in output vector x. This does not produce an error.)

Call Statement and Input
             NZ  Y   X   INDX
             |   |   |    |
CALL SGTHRZ( 5 , Y , X , INDX )
 
Y        =  (6.0, 2.0, 4.0, 7.0, 6.0, 10.0, -2.0, 8.0, 9.0, 0.0)
INDX     =  (4, 3, 1, 10, 7)

Output
Y        =  (0.0, 2.0, 0.0, 0.0, 6.0, 10.0, 0.0, 8.0, 9.0, 0.0)
X        =  (7.0, 4.0, 6.0, 0.0, -2.0)

Example 3

This example shows how to use CGTHRZ to copy specified elements of a vector y into a sparse vector x of length 3, where the elements of array INDX are in random order.

Call Statement and Input
             NZ  Y   X   INDX
             |   |   |    |
CALL CGTHRZ( 3 , Y , X , INDX )
 
Y        =  ((6.0, 5.0), (-2.0, 3.0), (15.0, 4.0), (9.0, 0.0))
INDX     =  (4, 1, 3)

Output
Y        =  ((0.0, 0.0), (-2.0, 3.0), (0.0, 0.0), (0.0, 0.0))
X        =  ((9.0, 0.0), (6.0, 5.0), (15.0, 4.0))

SAXPYI, DAXPYI, CAXPYI, and ZAXPYI--Multiply a Sparse Vector X in Compressed-Vector Storage Mode by a Scalar, Add to a Sparse Vector Y in Full-Vector Storage Mode, and Store in the Vector Y

These subprograms multiply sparse vector x, stored in compressed-vector storage mode, by scalar alpha, add it to sparse vector y, stored in full-vector storage mode, and store the result in vector y.

Table 60. Data Types
alpha, x, y              Subprogram
Short-precision real     SAXPYI
Long-precision real      DAXPYI
Short-precision complex  CAXPYI
Long-precision complex   ZAXPYI

Syntax

Fortran CALL SAXPYI | DAXPYI | CAXPYI | ZAXPYI (nz, alpha, x, indx, y)
C and C++ saxpyi | daxpyi | caxpyi | zaxpyi (nz, alpha, x, indx, y);
PL/I CALL SAXPYI | DAXPYI | CAXPYI | ZAXPYI (nz, alpha, x, indx, y);

On Entry

nz

is the number of elements in sparse vector x, stored in compressed-vector storage mode. Specified as: a fullword integer; nz >= 0.

alpha

is the scalar alpha. Specified as: a number of the data type indicated in Table 60.

x

is the sparse vector x, containing nz elements, stored in compressed-vector storage mode in an array, referred to as X. Specified as: a one-dimensional array of (at least) length nz, containing numbers of the data type indicated in Table 60.

indx

is the array, referred to as INDX, containing the nz indices that indicate the positions of the elements of the sparse vector x when in full-vector storage mode. They also indicate the positions of the elements in vector y that are used in the computation.

Specified as: a one-dimensional array of (at least) length nz, containing fullword integers.

y

is the sparse vector y, stored in full-vector storage mode, of (at least) length max(INDX(i)) for i = 1, nz. Specified as: a one-dimensional array of (at least) length max(INDX(i)) for i = 1, nz, containing numbers of the data type indicated in Table 60.

On Return

y

is the sparse vector y, stored in full-vector storage mode, of (at least) length max(INDX(i)) for i = 1, nz, containing the results of the computation, stored at positions indicated by the indices array INDX.

Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 60.

Notes

  1. Each value specified in array INDX must be unique; otherwise, results are unpredictable.

  2. Vectors x and y must have no common elements; otherwise, results are unpredictable. See "Concepts".

  3. For a description of how sparse vectors are stored, see "Sparse Vector".

Function

The computation is expressed as follows:

y_INDX(i) <-- y_INDX(i) + alpha x_i    for i = 1, nz

where:

x is a sparse vector, stored in compressed-vector storage mode.
INDX is the indices array for sparse vector x.
y is a sparse vector, stored in full-vector storage mode.

See reference [29]. If alpha or nz is zero, no computation is performed. For SAXPYI and CAXPYI, intermediate results are accumulated in long precision.
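
The update can be sketched in C for the short-precision real case as follows. The helper name ref_saxpyi is hypothetical, INDX is assumed to hold 1-based positions as in the examples, and forming each intermediate result in a double is only one plausible reading of the long-precision accumulation mentioned above, not a statement of how ESSL implements it.

```c
/* Hypothetical reference helper (not an ESSL routine).
 * For i = 1..nz:  y_INDX(i) <- y_INDX(i) + alpha * x_i.
 * Assumptions: indx[] holds 1-based positions; each intermediate result
 * is formed in double precision before being rounded back to float. */
void ref_saxpyi(int nz, float alpha, const float *x,
                const int *indx, float *y)
{
    if (nz <= 0 || alpha == 0.0f)
        return; /* alpha = 0 or nz = 0: no computation is performed */

    for (int i = 0; i < nz; i++) {
        int k = indx[i] - 1; /* 0-based location in y */
        double t = (double)y[k] + (double)alpha * (double)x[i];
        y[k] = (float)t;
    }
}
```

Only the nz locations of y named in INDX are touched; all other elements of y keep their input values, as the examples for this subroutine show.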

Error Conditions

Computational Errors

None

Input-Argument Errors

nz < 0

Example 1

This example shows how to use SAXPYI to perform a computation using a sparse vector x of length 5, where the elements of array INDX are in ascending order.

Call Statement and Input
             NZ ALPHA  X   INDX   Y
             |    |    |    |     |
CALL SAXPYI( 5 , 2.0 , X , INDX , Y )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
INDX     =  (1, 3, 4, 7, 10)
Y        =  (1.0, 5.0, 4.0, 3.0, 6.0, 10.0, -2.0, 8.0, 9.0, 0.0)

Output
Y        =  (3.0, 5.0, 8.0, 9.0, 6.0, 10.0, 6.0, 8.0, 9.0, 10.0)

Example 2

This example shows how to use SAXPYI to perform a computation using a sparse vector x of length 5, where the elements of array INDX are in random order.

Call Statement and Input
             NZ ALPHA  X   INDX   Y
             |    |    |    |     |
CALL SAXPYI( 5 , 2.0 , X , INDX , Y )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
INDX     =  (4, 3, 1, 10, 7)
Y        =  (1.0, 5.0, 4.0, 3.0, 6.0, 10.0, -2.0, 8.0, 9.0, 0.0)

Output
Y        =  (7.0, 5.0, 8.0, 5.0, 6.0, 10.0, 8.0, 8.0, 9.0, 8.0)

Example 3

This example shows how to use CAXPYI to perform a computation using a sparse vector x of length 3, where the elements of array INDX are in random order.

Call Statement and Input
             NZ  ALPHA   X   INDX   Y
             |     |     |    |     |
CALL CAXPYI( 3 , ALPHA , X , INDX , Y )
 
ALPHA    =  (2.0, 3.0)
X        =  ((1.0, 2.0), (3.0, 4.0), (5.0, 6.0))
INDX     =  (4, 1, 3)
Y        =  ((6.0, 5.0), (-2.0, 3.0), (15.0, 4.0), (9.0, 0.0))

Output
Y        =  ((0.0, 22.0), (-2.0, 3.0), (7.0, 31.0), (5.0, 7.0))

SDOTI, DDOTI, CDOTUI, ZDOTUI, CDOTCI, and ZDOTCI--Dot Product of a Sparse Vector X in Compressed-Vector Storage Mode and a Sparse Vector Y in Full-Vector Storage Mode

SDOTI, DDOTI, CDOTUI, and ZDOTUI compute the dot product of sparse vector x, stored in compressed-vector storage mode, and full vector y, stored in full-vector storage mode.

CDOTCI and ZDOTCI compute the dot product of the complex conjugate of sparse vector x, stored in compressed-vector storage mode, and full vector y, stored in full-vector storage mode.

Table 61. Data Types
x, y, Result             Subprogram
Short-precision real     SDOTI
Long-precision real      DDOTI
Short-precision complex  CDOTUI
Long-precision complex   ZDOTUI
Short-precision complex  CDOTCI
Long-precision complex   ZDOTCI

Syntax

Fortran SDOTI | DDOTI | CDOTUI | ZDOTUI | CDOTCI | ZDOTCI (nz, x, indx, y)
C and C++ sdoti | ddoti | cdotui | zdotui | cdotci | zdotci (nz, x, indx, y);
PL/I SDOTI | DDOTI | CDOTUI | ZDOTUI | CDOTCI | ZDOTCI (nz, x, indx, y);

On Entry

nz

is the number of elements in sparse vector x, stored in compressed-vector storage mode. Specified as: a fullword integer; nz >= 0.

x

is the sparse vector x, containing nz elements, stored in compressed-vector storage mode in an array, referred to as X. Specified as: a one-dimensional array of (at least) length nz, containing numbers of the data type indicated in Table 61.

indx

is the array, referred to as INDX, containing the nz indices that indicate the positions of the elements of the sparse vector x when in full-vector storage mode. They also indicate the positions of elements in vector y that are used in the computation.

Specified as: a one-dimensional array of (at least) length nz, containing fullword integers.

y

is the sparse vector y, stored in full-vector storage mode, of (at least) length max(INDX(i)) for i = 1, nz. Specified as: a one-dimensional array of (at least) length max(INDX(i)) for i = 1, nz, containing numbers of the data type indicated in Table 61.

On Return

Function value

is the result of the dot product computation.

Returned as: a number of the data type indicated in Table 61.

Notes

  1. Declare this function in your program as returning a value of the data type indicated in Table 61.

  2. For a description of how sparse vectors are stored, see "Sparse Vector".

Function

For SDOTI, DDOTI, CDOTUI, and ZDOTUI, the dot product computation is expressed as follows:

result <-- x_1 y_INDX(1) + x_2 y_INDX(2) + ... + x_nz y_INDX(nz)

For CDOTCI and ZDOTCI, the dot product computation is expressed as follows:

result <-- conj(x_1) y_INDX(1) + conj(x_2) y_INDX(2) + ... + conj(x_nz) y_INDX(nz)

where:

x is a sparse vector, stored in compressed-vector storage mode.



Figure ESYGR91 not displayed.

INDX is the indices array for sparse vector x.

y is a sparse vector, stored in full-vector storage mode.

See reference [29]. The result is returned as the function value. If nz is 0, then zero is returned as the value of the function.

For SDOTI, CDOTUI, and CDOTCI, intermediate results are accumulated in long precision.
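
For the short-precision real case, the dot product can be sketched in C as shown here. The helper name ref_sdoti is hypothetical, INDX is assumed to hold 1-based positions as in the examples, and the double accumulator is an illustrative reading of the long-precision accumulation described above rather than a description of the ESSL internals.

```c
/* Hypothetical reference helper (not an ESSL routine).
 * Returns the sum over i = 1..nz of x_i * y_INDX(i).
 * Assumptions: indx[] holds 1-based positions; the sum is accumulated
 * in double precision and rounded to float once at the end. */
float ref_sdoti(int nz, const float *x, const int *indx, const float *y)
{
    double sum = 0.0; /* nz = 0 falls through and returns zero */

    for (int i = 0; i < nz; i++)
        sum += (double)x[i] * (double)y[indx[i] - 1];

    return (float)sum;
}
```

Only the elements of y at the positions listed in INDX contribute to the sum, so the cost of the sketch grows with nz and not with the length of y.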

Error Conditions

Computational Errors

None

Input-Argument Errors

nz < 0

Example 1

This example shows how to use SDOTI to compute a dot product using a sparse vector x of length 5, where the elements of array INDX are in ascending order.

Function Reference and Input
              NZ  X   INDX   Y
              |   |    |     |
DOTT = SDOTI( 5 , X , INDX , Y )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
INDX     =  (1, 3, 4, 7, 10)
Y        =  (1.0, 5.0, 4.0, 3.0, 6.0, 10.0, -2.0, 8.0, 9.0, 0.0)

Output
DOTT     =  (1.0 + 8.0 + 9.0 -8.0 + 0.0) = 10.0

Example 2

This example shows how to use SDOTI to compute a dot product using a sparse vector x of length 5, where the elements of array INDX are in random order.

Function Reference and Input
              NZ  X   INDX   Y
              |   |    |     |
DOTT = SDOTI( 5 , X , INDX , Y )
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
INDX     =  (4, 3, 1, 10, 7)
Y        =  (1.0, 5.0, 4.0, 3.0, 6.0, 10.0, -2.0, 8.0, 9.0, 0.0)

Output
DOTT     =  (3.0 + 8.0 + 3.0 + 0.0 -10.0) = 4.0

Example 3

This example shows how to use CDOTUI to compute a dot product using a sparse vector x of length 3, where the elements of array INDX are in ascending order.

Function Reference and Input
               NZ  X   INDX   Y
               |   |    |     |
DOTT = CDOTUI( 3 , X , INDX , Y )
 
X        =  ((1.0, 2.0), (3.0, 4.0), (5.0, 6.0))
INDX     =  (1, 3, 4)
Y        =  ((6.0, 5.0), (-2.0, 3.0), (15.0, 4.0), (9.0, 0.0))

Output
DOTT     =  (70.0, 143.0)

Example 4

This example shows how to use CDOTCI to compute a dot product using the complex conjugate of a sparse vector x of length 3, where the elements of array INDX are in random order.

Function Reference and Input
               NZ  X   INDX   Y
               |   |    |     |
DOTT = CDOTCI( 3 , X , INDX , Y )
 
X        =  ((1.0, 2.0), (3.0, 4.0), (5.0, 6.0))
INDX     =  (4, 1, 3)
Y        =  ((6.0, 5.0), (-2.0, 3.0), (15.0, 4.0), (9.0, 0.0))

Output
DOTT     =  (146.0, -97.0)

Matrix-Vector Subprograms

This section contains the matrix-vector subprogram descriptions.

SGEMV, DGEMV, CGEMV, ZGEMV, SGEMX, DGEMX, SGEMTX, and DGEMTX--Matrix-Vector Product for a General Matrix, Its Transpose, or Its Conjugate Transpose

SGEMV and DGEMV compute the matrix-vector product for either a real general matrix or its transpose, using the scalars alpha and beta, vectors x and y, and matrix A or its transpose:

y <-- beta y + alpha A x

y <-- beta y + alpha A^T x

CGEMV and ZGEMV compute the matrix-vector product for either a complex general matrix, its transpose, or its conjugate transpose, using the scalars alpha and beta, vectors x and y, and matrix A, its transpose, or its conjugate transpose:

y <-- beta y + alpha A x
y <-- beta y + alpha A^T x
y <-- beta y + alpha A^H x

SGEMX and DGEMX compute the matrix-vector product for a real general matrix, using the scalar alpha, vectors x and y, and matrix A:

y <-- y + alpha A x

SGEMTX and DGEMTX compute the matrix-vector product for the transpose of a real general matrix, using the scalar alpha, vectors x and y, and the transpose of matrix A:

y <-- y + alpha A^T x

Table 62. Data Types
alpha, beta, x, y, A     Subprogram
Short-precision real     SGEMV, SGEMX, and SGEMTX
Long-precision real      DGEMV, DGEMX, and DGEMTX
Short-precision complex  CGEMV
Long-precision complex   ZGEMV

Note: SGEMV and DGEMV are Level 2 BLAS subroutines. It is suggested that these subroutines be used instead of SGEMX, DGEMX, SGEMTX, and DGEMTX, which are provided only for compatibility with earlier releases of ESSL.
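
To make the real forms of the computation concrete, here is a minimal C sketch of y <-- beta y + alpha A x and its transposed variant. It is not the ESSL implementation. The helper name ref_sgemv and its return codes are hypothetical, the vectors are assumed to have unit stride, the matrix is assumed to be stored column by column with leading dimension lda, and 'C' is treated like 'T' on the assumption that the conjugate transpose of a real matrix is its transpose.

```c
#include <stddef.h>

/* Hypothetical reference helper (not an ESSL routine), short-precision real.
 * Computes y <- beta*y + alpha*op(A)*x, where A is m by n and element
 * (i, j) (0-based) is stored at a[i + j*lda].
 * Assumptions: unit strides for x and y; 'C' behaves like 'T'.
 * Returns 0 on success, -1 for an invalid argument. */
int ref_sgemv(char transa, int m, int n, float alpha,
              const float *a, int lda, const float *x,
              float beta, float *y)
{
    if (m < 0 || n < 0 || lda <= 0 || lda < m)
        return -1;

    if (transa == 'N') {
        for (int i = 0; i < m; i++) {        /* y has length m */
            float sum = 0.0f;
            for (int j = 0; j < n; j++)
                sum += a[i + (size_t)j * lda] * x[j];
            y[i] = beta * y[i] + alpha * sum;
        }
    } else if (transa == 'T' || transa == 'C') {
        /* A is read in place; the transpose is never formed in memory. */
        for (int j = 0; j < n; j++) {        /* y has length n */
            float sum = 0.0f;
            for (int i = 0; i < m; i++)
                sum += a[i + (size_t)j * lda] * x[i];
            y[j] = beta * y[j] + alpha * sum;
        }
    } else {
        return -1;
    }
    return 0;
}
```

Note that this sketch always reads y, even when beta is 0, so y should hold valid numbers on entry. For a 3 by 2 matrix with columns (1, 3, 5) and (2, 4, 6) and x = (1, 1), the 'N' form with alpha = 1 and beta = 0 gives y = (3, 7, 11).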

Syntax

Fortran CALL SGEMV | DGEMV | CGEMV | ZGEMV (transa, m, n, alpha, a, lda, x, incx, beta, y, incy)

CALL SGEMX | DGEMX | SGEMTX | DGEMTX ( m, n, alpha, a, lda, x, incx, y, incy)

C and C++ sgemv | dgemv | cgemv | zgemv (transa, m, n, alpha, a, lda, x, incx, beta, y, incy);

sgemx | dgemx | sgemtx | dgemtx ( m, n, alpha, a, lda, x, incx, y, incy);

PL/I CALL SGEMV | DGEMV | CGEMV | ZGEMV (transa, m, n, alpha, a, lda, x, incx, beta, y, incy);

CALL SGEMX | DGEMX | SGEMTX | DGEMTX ( m, n, alpha, a, lda, x, incx, y, incy);

On Entry

transa

indicates the form of matrix A to use in the computation, where:

If transa = 'N', A is used in the computation.

If transa = 'T', A^T is used in the computation.

If transa = 'C', A^H is used in the computation.

Specified as: a single character. It must be 'N', 'T', or 'C'.

m

is the number of rows in matrix A, and:

For SGEMV, DGEMV, CGEMV, and ZGEMV:

If transa = 'N', it is the length of vector y.
If transa = 'T' or 'C', it is the length of vector x.

For SGEMX and DGEMX, it is the length of vector y.

For SGEMTX and DGEMTX, it is the length of vector x.

Specified as: a fullword integer; 0 <= m <= lda.

n

is the number of columns in matrix A, and:

For SGEMV, DGEMV, CGEMV, and ZGEMV:

If transa = 'N', it is the length of vector x.
If transa = 'T' or 'C', it is the length of vector y.

For SGEMX and DGEMX, it is the length of vector x.

For SGEMTX and DGEMTX, it is the length of vector y.

Specified as: a fullword integer; n >= 0.

alpha

is the scaling constant alpha. Specified as: a number of the data type indicated in Table 62.

a

is the m by n matrix A, where:

For SGEMV, DGEMV, CGEMV, and ZGEMV:

If transa = 'N', A is used in the computation.
If transa = 'T', $A^T$ is used in the computation.
If transa = 'C', $A^H$ is used in the computation.

For SGEMX and DGEMX, A is used in the computation.

For SGEMTX and DGEMTX, $A^T$ is used in the computation.

Note: No data should be moved to form $A^T$ or $A^H$; that is, the matrix A should always be stored in its untransposed form.

Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 62.

lda

is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= m.

x

is the vector x, where:

For SGEMV, DGEMV, CGEMV, and ZGEMV:

If transa = 'N', it has length n.
If transa = 'T' or 'C', it has length m.

For SGEMX and DGEMX, it has length n.

For SGEMTX and DGEMTX, it has length m.

Specified as: a one-dimensional array, containing numbers of the data type indicated in Table 62, where:

For SGEMV, DGEMV, CGEMV, and ZGEMV:

If transa = 'N', it must have at least 1+(n-1)|incx| elements.
If transa = 'T' or 'C', it must have at least 1+(m-1)|incx| elements.

For SGEMX and DGEMX, it must have at least 1+(n-1)|incx| elements.

For SGEMTX and DGEMTX, it must have at least 1+(m-1)|incx| elements.

incx

is the stride for vector x. Specified as: a fullword integer. It can have any value.

beta

is the scaling constant beta. Specified as: a number of the data type indicated in Table 62.

y

is the vector y, where:

For SGEMV, DGEMV, CGEMV, and ZGEMV:

If transa = 'N', it has length m.
If transa = 'T' or 'C', it has length n.

For SGEMX and DGEMX, it has length m.

For SGEMTX and DGEMTX, it has length n.

Specified as: a one-dimensional array, containing numbers of the data type indicated in Table 62, where:

For SGEMV, DGEMV, CGEMV, and ZGEMV:

If transa = 'N', it must have at least 1+(m-1)|incy| elements.
If transa = 'T' or 'C', it must have at least 1+(n-1)|incy| elements.

For SGEMX and DGEMX, it must have at least 1+(m-1)|incy| elements.

For SGEMTX and DGEMTX, it must have at least 1+(n-1)|incy| elements.

incy

is the stride for vector y. Specified as: a fullword integer; incy > 0 or incy < 0.

On Return

y

is the vector y, containing the result of the computation, where:

For SGEMV, DGEMV, CGEMV, and ZGEMV:

If transa = 'N', it has length m.
If transa = 'T' or 'C', it has length n.

For SGEMX and DGEMX, it has length m.

For SGEMTX and DGEMTX, it has length n.

Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 62.

Notes

  1. For SGEMV and DGEMV, if you specify 'C' for the transa argument, it is interpreted as though you specified 'T'.

  2. The SGEMV, DGEMV, CGEMV, and ZGEMV subroutines accept lowercase letters for the transa argument.

  3. In the SGEMV, DGEMV, CGEMV, and ZGEMV subroutines, incx = 0 is valid; however, the Level 2 BLAS standard considers incx = 0 to be invalid. See references [34] and [35].

  4. Vector y must have no common elements with matrix A or vector x; otherwise, results are unpredictable. See "Concepts".

Function

The possible computations that can be performed by these subroutines are described in the following sections. Varying implementation techniques are used for this computation to improve performance. As a result, accuracy of the computational result may vary for different computations.

For SGEMV, CGEMV, SGEMX, and SGEMTX, intermediate results are accumulated in long precision. Occasionally, for performance reasons, these intermediate results are stored.

See references [34], [35], [38], [46], and [73]. No computation is performed if m or n is 0 or if alpha is zero and beta is one.

General Matrix

For SGEMV, DGEMV, CGEMV, and ZGEMV, the matrix-vector product for a general matrix:

$y \leftarrow \beta y + \alpha A x$

is expressed as follows:



Figure ESYGR92 not displayed.


For SGEMX and DGEMX, the matrix-vector product for a real general matrix:

$y \leftarrow y + \alpha A x$

is expressed as follows:



Figure ESYGR93 not displayed.


In these expressions:

y is a vector of length m.
alpha is a scalar.
beta is a scalar.
A is an m by n matrix.
x is a vector of length n.
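
The figures referenced above are not displayed. As a reconstruction from the definitions in this list, and not a copy of the original figures, the two computations can be written element-wise, with $a_{ij}$ denoting the element in row i and column j of A:

$$y_i \leftarrow \beta y_i + \alpha \sum_{j=1}^{n} a_{ij} x_j, \qquad i = 1, \ldots, m$$

for SGEMV, DGEMV, CGEMV, and ZGEMV, and

$$y_i \leftarrow y_i + \alpha \sum_{j=1}^{n} a_{ij} x_j, \qquad i = 1, \ldots, m$$

for SGEMX and DGEMX.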

Transpose of a General Matrix

For SGEMV, DGEMV, CGEMV and ZGEMV, the matrix-vector product for the transpose of a general matrix:

$y \leftarrow \beta y + \alpha A^{T} x$

is expressed as follows:



Figure ESYGR94 not displayed.


For SGEMTX and DGEMTX, the matrix-vector product for the transpose of a real general matrix:

$y \leftarrow y + \alpha A^{T} x$

is expressed as follows:



Figure ESYGR95 not displayed.


In these expressions:

y is a vector of length n.
alpha is a scalar.
beta is a scalar.
$A^T$ is the transpose of matrix A, where A is an m by n matrix.
x is a vector of length m.
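
The real general and transpose cases differ only in which index of A is paired with x. The following is a minimal C sketch of both, written as plain loops; it is not the ESSL implementation. The name ref_sgemv is hypothetical, the array is assumed to be column-major (Fortran order) with 0-based C indexing, and the double accumulator is chosen only to mirror the statement that SGEMV accumulates intermediate results in long precision.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference loop, not the ESSL routine.
   Computes y <- beta*y + alpha*op(A)*x for a real matrix stored
   column-major with leading dimension lda. */
void ref_sgemv(char transa, int m, int n, float alpha,
               const float *a, int lda,
               const float *x, int incx,
               float beta, float *y, int incy)
{
    /* For real data, 'C' is treated like 'T' (see Note 1). */
    int t = !(transa == 'N' || transa == 'n');
    int leny = t ? n : m;
    int lenx = t ? m : n;

    assert(m >= 0 && n >= 0 && lda > 0 && lda >= m && incy != 0);
    if (m == 0 || n == 0 || (alpha == 0.0f && beta == 1.0f))
        return;                       /* no computation is performed */

    /* With a negative stride, logical element 1 sits at the far end. */
    ptrdiff_t x0 = incx < 0 ? (ptrdiff_t)(1 - lenx) * incx : 0;
    ptrdiff_t y0 = incy < 0 ? (ptrdiff_t)(1 - leny) * incy : 0;

    for (int i = 0; i < leny; i++) {
        double sum = 0.0;             /* long-precision accumulator */
        for (int j = 0; j < lenx; j++) {
            /* A(i,j) for 'N', A(j,i) for 'T'; A is never rearranged. */
            float aij = t ? a[j + (size_t)i * lda] : a[i + (size_t)j * lda];
            sum += (double)aij * x[x0 + (ptrdiff_t)j * incx];
        }
        float *yi = &y[y0 + (ptrdiff_t)i * incy];
        /* When beta is zero, the old contents of y are not used. */
        float prev = (beta == 0.0f) ? 0.0f : beta * *yi;
        *yi = (float)(prev + alpha * sum);
    }
}
```

Passing the same array with transa = 'T' reads A(j,i) in place, which is why no data needs to be moved to form the transpose.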

Conjugate Transpose of a General Matrix

For CGEMV and ZGEMV, the matrix-vector product for the conjugate transpose of a general matrix:

$y \leftarrow \beta y + \alpha A^{H} x$

is expressed as follows:



Figure ESYGR96 not displayed.


where:

y is a vector of length n.
alpha is a scalar.
beta is a scalar.
$A^H$ is the conjugate transpose of matrix A, where A is an m by n matrix.
x is a vector of length m.
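
Because the figures for the transposed forms are not displayed, the following element-wise statement is offered as a reconstruction from the definitions above, not as a copy of the original figures. With $a_{ij}$ the element in row i and column j of A, and $\bar{a}_{ij}$ its complex conjugate, the transpose computation is:

$$y_j \leftarrow \beta y_j + \alpha \sum_{i=1}^{m} a_{ij} x_i, \qquad j = 1, \ldots, n$$

and the conjugate transpose computation is:

$$y_j \leftarrow \beta y_j + \alpha \sum_{i=1}^{m} \bar{a}_{ij} x_i, \qquad j = 1, \ldots, n$$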

Error Conditions

Resource Errors

Unable to allocate internal work area (for SGEMV, DGEMV, CGEMV, and ZGEMV).

Computational Errors

None

Input-Argument Errors
  1. transa <> 'N', 'T', or 'C'
  2. m < 0
  3. m > lda
  4. n < 0
  5. lda <= 0
  6. incy = 0

Example 1

This example shows the computation for TRANSA equal to 'N', where the real general matrix A is used in the computation. Because lda is 10 and n is 3, array A must be declared as A(E1:E2,F1:F2), where E2-E1+1=10 and F2-F1+1 >= 3. In this example, array A is declared as A(1:10,0:2).

Call Statement and Input
           TRANSA M   N  ALPHA    A      LDA  X  INCX BETA   Y  INCY
             |    |   |    |      |       |   |   |     |    |   |
CALL SGEMV( 'N' , 4 , 3 , 1.0 , A(1,0) , 10 , X , 1  , 1.0 , Y , 2  )
    *               *
    | 1.0  2.0  3.0 |
    | 2.0  2.0  4.0 |
    | 3.0  2.0  2.0 |
    | 4.0  2.0  1.0 |
A = |  .    .    .  |
    |  .    .    .  |
    |  .    .    .  |
    |  .    .    .  |
    |  .    .    .  |
    |  .    .    .  |
    *               *
X        =  (3.0, 2.0, 1.0)
Y        =  (4.0, . , 5.0, . , 2.0, . , 3.0)

Output
Y        =  (14.0, . , 19.0, . , 17.0, . , 20.0)

Example 2

This example shows the computation for TRANSA equal to 'T', where the transpose of the real general matrix A is used in the computation. Array A must follow the same rules as given in Example 1. In this example, array A is declared as A(-1:8,1:3).

Call Statement and Input
           TRANSA M   N  ALPHA     A     LDA   X  INCX  BETA  Y  INCY
             |    |   |    |       |      |    |   |     |    |   |
CALL SGEMV( 'T' , 4 , 3 , 1.0 , A(-1,1) , 10 , X , 1  , 2.0 , Y , 2  )
A        =(same as input A in Example 1)
X        =  (3.0, 2.0, 1.0, 4.0)
Y        =  (1.0, . , 2.0, . , 3.0)

Output
Y        =  (28.0, . , 24.0, . , 29.0)

Example 3

This example shows the computation for TRANSA equal to 'N', where the complex general matrix A is used in the computation.

Call Statement and Input
           TRANSA M   N   ALPHA   A   LDA  X  INCX  BETA   Y  INCY
             |    |   |     |     |    |   |   |     |     |   |
CALL CGEMV( 'N' , 5 , 3 , ALPHA , A , 10 , X , 1  , BETA , Y , 1  )
 
ALPHA    =  (1.0, 0.0)
    *                                    *
    | (1.0, 2.0)  (3.0, 5.0)  (2.0, 0.0) |
    | (2.0, 3.0)  (7.0, 9.0)  (4.0, 8.0) |
    | (7.0, 4.0)  (1.0, 4.0)  (6.0, 0.0) |
    | (8.0, 2.0)  (2.0, 5.0)  (8.0, 0.0) |
A = | (9.0, 1.0)  (3.0, 6.0)  (1.0, 0.0) |
    |     .           .           .      |
    |     .           .           .      |
    |     .           .           .      |
    |     .           .           .      |
    |     .           .           .      |
    *                                    *
X        =  ((1.0, 2.0), (4.0, 0.0), (1.0, 1.0))
BETA     =  (1.0, 0.0)
Y        =  ((1.0, 2.0), (4.0, 0.0), (1.0, -1.0), (3.0, 4.0),
             (2.0, 0.0))

Output
Y        =  ((12.0, 28.0), (24.0, 55.0), (10.0, 39.0), (23.0, 50.0),
             (22.0, 44.0))

Example 4

This example shows the computation for TRANSA equal to 'T', where the transpose of complex general matrix A is used in the computation. Because beta is zero, the result of the computation is $\alpha A^{T} x$.

Call Statement and Input
           TRANSA M   N   ALPHA   A   LDA  X  INCX  BETA   Y  INCY
             |    |   |     |     |    |   |   |     |     |   |
CALL CGEMV( 'T' , 5 , 3 , ALPHA , A , 10 , X , 1  , BETA , Y , 1  )
ALPHA    =  (1.0, 0.0)
A        =(same as input A in Example 3)
X        =  ((1.0, 2.0), (4.0, 0.0), (1.0, 1.0), (3.0, 4.0),
             (2.0, 0.0))
BETA     =  (0.0, 0.0)
Y        =(not relevant)

Output
Y        =  ((42.0, 67.0), (10.0, 87.0), (50.0, 74.0))

Example 5

This example shows the computation for TRANSA equal to 'C', where the conjugate transpose of the complex general matrix A is used in the computation.

Call Statement and Input
           TRANSA M   N   ALPHA   A   LDA  X  INCX  BETA   Y  INCY
             |    |   |     |     |    |   |   |     |     |   |
CALL CGEMV( 'C' , 5 , 3 , ALPHA , A , 10 , X , 1  , BETA , Y , 1  )
ALPHA    =  (-1.0, 0.0)
A        =(same as input A in Example 3)
X        =  ((1.0, 2.0), (4.0, 0.0), (1.0, 1.0), (3.0, 4.0),
             (2.0, 0.0))
BETA     =  (1.0, 0.0)
Y        =  ((1.0, 2.0), (4.0, 0.0), (1.0, -1.0))

Output
Y        =  ((-73.0, -13.0), (-74.0, 57.0), (-49.0, -11.0))

Example 6

This example shows a matrix, A, contained in a larger array, A. The strides of vectors x and y are positive. Because lda is 10 and n is 3, array A must be declared as A(E1:E2,F1:F2), where E2-E1+1=10 and F2-F1+1 >= 3. For this example, array A is declared as A(1:10,0:2).

Call Statement and Input
            M   N  ALPHA    A     LDA   X  INCX  Y  INCY
            |   |    |      |      |    |   |    |   |
CALL SGEMX( 4 , 3 , 1.0 , A(1,0) , 10 , X , 1  , Y , 2  )
    *               *
    | 1.0  2.0  3.0 |
    | 2.0  2.0  4.0 |
    | 3.0  2.0  2.0 |
    | 4.0  2.0  1.0 |
A = |  .    .    .  |
    |  .    .    .  |
    |  .    .    .  |
    |  .    .    .  |
    |  .    .    .  |
    |  .    .    .  |
    *               *
X        =  (3.0, 2.0, 1.0)
Y        =  (4.0, . , 5.0, . , 2.0, . , 3.0)

Output
Y        =  (14.0, . , 19.0, . , 17.0, . , 20.0)

Example 7

This example shows a matrix, A, contained in a larger array, A. The strides of vectors x and y are of opposite sign. For y, which has negative stride, processing begins at element Y(7), which is 4.0. Array A must follow the same rules as given in Example 6. For this example, array A is declared as A(-1:8,1:3).

Call Statement and Input
            M   N  ALPHA     A     LDA   X  INCX  Y   INCY
            |   |    |       |      |    |   |    |    |
CALL SGEMX( 4 , 3 , 1.0 , A(-1,1) , 10 , X , 1  , Y , -2  )
A        =(same as input A in Example 6)
X        =  (3.0, 2.0, 1.0)
Y        =  (3.0, . , 2.0, . , 5.0, . , 4.0)

Output
Y        =  (20.0, . , 17.0, . , 19.0, . , 14.0)
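
The starting element quoted in this example follows from the usual BLAS stride convention, which the inputs and outputs of the examples in this section are consistent with. A minimal C sketch of that convention follows; the helper name strided_pos is hypothetical, and it returns 0-based positions, so Fortran's Y(7) corresponds to position 6.

```c
#include <assert.h>
#include <stddef.h>

/* 0-based array position of logical element k (0 <= k < n) of a vector
   stored with stride inc. For inc < 0 the vector is traversed backward,
   so logical element 0 is the last one in storage. */
ptrdiff_t strided_pos(int n, int inc, int k)
{
    assert(n > 0 && k >= 0 && k < n);
    ptrdiff_t start = inc < 0 ? (ptrdiff_t)(1 - n) * inc : 0;
    return start + (ptrdiff_t)k * inc;
}
```

For this example (n = 4, incy = -2), the four logical elements of y sit at positions 6, 4, 2, and 0, which is why the output appears in reverse order relative to Example 6.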

Example 8

This example shows a matrix, A, contained in a larger array, A, and the first element of the matrix is not the first element of the array. Array A must follow the same rules as given in Example 6. For this example, array A is declared as A(1:10,1:3).

Call Statement and Input
            M   N  ALPHA    A     LDA   X  INCX  Y  INCY
            |   |    |      |      |    |   |    |   |
CALL SGEMX( 4 , 3 , 1.0 , A(5,1) , 10 , X , 1  , Y , 1  )
    *               *
    | .    .    .   |
    | .    .    .   |
    | .    .    .   |
    | .    .    .   |
A = | 1.0  2.0  3.0 |
    | 2.0  2.0  4.0 |
    | 3.0  2.0  2.0 |
    | 4.0  2.0  1.0 |
    |  .    .    .  |
    |  .    .    .  |
    *               *
X        =  (3.0, 2.0, 1.0)
Y        =  (4.0, 5.0, 2.0, 3.0)

Output
Y        =  (14.0, 19.0, 17.0, 20.0)
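
Examples 6 through 8 pass different starting elements of array A, such as A(1,0), A(-1,1), and A(5,1). The C sketch below shows how such a Fortran element reference maps to an offset into column-major storage. The function name elem_offset and its lower-bound parameters are hypothetical and serve only to illustrate the arithmetic.

```c
#include <assert.h>
#include <stddef.h>

/* Offset (in elements) of Fortran element A(i,j) from the start of an
   array declared A(lo_i:lo_i+lda-1, lo_j:...), stored column-major. */
size_t elem_offset(int lda, int lo_i, int lo_j, int i, int j)
{
    assert(lda > 0 && i >= lo_i && i - lo_i < lda && j >= lo_j);
    return (size_t)(i - lo_i) + (size_t)(j - lo_j) * (size_t)lda;
}
```

With the declaration A(1:10,1:3) used in this example, A(5,1) lies 4 elements past the start of the array, so the matrix occupies rows 5 through 8. With A(-1:8,1:3), as in Example 7, A(-1,1) is at offset 0.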

Example 9

This example shows a matrix, A, and an array, A, having the same number of rows. For this case, m and lda are equal. Because lda is 4 and n is 3, array A must be declared as A(E1:E2,F1:F2), where E2-E1+1=4 and F2-F1+1 >= 3. For this example, array A is declared as A(1:4,0:2).

Call Statement and Input
            M   N  ALPHA    A     LDA  X  INCX  Y  INCY
            |   |    |      |      |   |   |    |   |
CALL SGEMX( 4 , 3 , 1.0 , A(1,0) , 4 , X , 1  , Y , 1  )
 
    *               *
    | 1.0  2.0  3.0 |
A = | 2.0  2.0  4.0 |
    | 3.0  2.0  2.0 |
    | 4.0  2.0  1.0 |
    *               *
 
X        =  (3.0, 2.0, 1.0)
Y        =  (4.0, 5.0, 2.0, 3.0)

Output
Y        =  (14.0, 19.0, 17.0, 20.0)

Example 10

This example shows a matrix, A, and an array, A, having the same number of rows. For this case, m and lda are equal. Because lda is 4 and n is 3, array A must be declared as A(E1:E2,F1:F2), where E2-E1+1=4 and F2-F1+1 >= 3. For this example, array A is declared as A(1:4,0:2).

Call Statement and Input
             M   N  ALPHA    A     LDA  X  INCX  Y  INCY
             |   |    |      |      |   |   |    |   |
CALL SGEMTX( 4 , 3 , 1.0 , A(1,0) , 4 , X , 1  , Y , 1  )
 
    *               *
    | 1.0  2.0  3.0 |
A = | 2.0  2.0  4.0 |
    | 3.0  2.0  2.0 |
    | 4.0  2.0  1.0 |
    *               *
 
X        =  (3.0, 2.0, 1.0, 4.0)
Y        =  (1.0, 2.0, 3.0)

Output
Y        =  (27.0, 22.0, 26.0)

Example 11

This example shows a computation in which alpha is greater than 1. Array A must follow the same rules as given in Example 10. For this example, array A is declared as A(-1:2,1:3).

Call Statement and Input
             M   N  ALPHA     A     LDA  X  INCX  Y  INCY
             |   |    |       |      |   |   |    |   |
CALL SGEMTX( 4 , 3 , 2.0 , A(-1,1) , 4 , X , 1  , Y , 1  )
A        =(same as input A in Example 10)
X        =  (3.0, 2.0, 1.0, 4.0)
Y        =  (1.0, 2.0, 3.0)

Output
Y        =  (53.0, 42.0, 49.0)

SGER, DGER, CGERU, ZGERU, CGERC, and ZGERC--Rank-One Update of a General Matrix

SGER, DGER, CGERU, and ZGERU compute the rank-one update of a general matrix, using the scalar alpha, matrix A, vector x, and the transpose of vector y:

$A \leftarrow A + \alpha x y^{T}$

CGERC and ZGERC compute the rank-one update of a general matrix, using the scalar alpha, matrix A, vector x, and the conjugate transpose of vector y:

$A \leftarrow A + \alpha x y^{H}$

Table 63. Data Types

alpha, A, x, y             Subprogram
Short-precision real       SGER
Long-precision real        DGER
Short-precision complex    CGERU and CGERC
Long-precision complex     ZGERU and ZGERC

Note: For compatibility with earlier releases of ESSL, you can use the names SGER1 and DGER1 for SGER and DGER, respectively.

Syntax

Fortran      CALL SGER | DGER | CGERU | ZGERU | CGERC | ZGERC (m, n, alpha, x, incx, y, incy, a, lda)
C and C++    sger | dger | cgeru | zgeru | cgerc | zgerc (m, n, alpha, x, incx, y, incy, a, lda);
PL/I         CALL SGER | DGER | CGERU | ZGERU | CGERC | ZGERC (m, n, alpha, x, incx, y, incy, a, lda);

On Entry

m

is the number of rows in matrix A and the number of elements in vector x. Specified as: a fullword integer; 0 <= m <= lda.

n

is the number of columns in matrix A and the number of elements in vector y. Specified as: a fullword integer; n >= 0.

alpha

is the scaling constant alpha. Specified as: a number of the data type indicated in Table 63.

x

is the vector x of length m. Specified as: a one-dimensional array of (at least) length 1+(m-1)|incx|, containing numbers of the data type indicated in Table 63.

incx

is the stride for vector x. Specified as: a fullword integer. It can have any value.

y

is the vector y of length n, whose transpose or conjugate transpose is used in the computation.

Note: No data should be moved to form $y^T$ or $y^H$; that is, the vector y should always be stored in its untransposed form.

Specified as: a one-dimensional array of (at least) length 1+(n-1)|incy|, containing numbers of the data type indicated in Table 63.

incy

is the stride for vector y. Specified as: a fullword integer. It can have any value.

a

is the m by n matrix A. Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 63.

lda

is the size of the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= m.

On Return

a

is the m by n matrix A, containing the result of the computation.

Returned as: a two-dimensional array, containing numbers of the data type indicated in Table 63.

Notes

  1. In these subroutines, incx = 0 and incy = 0 are valid; however, the Level 2 BLAS standard considers incx = 0 and incy = 0 to be invalid. See references [34] and [35].

  2. Matrix A must have no common elements with vectors x and y; otherwise, results are unpredictable. See "Concepts".

Function

SGER, DGER, CGERU, and ZGERU compute the rank-one update of a general matrix:

$A \leftarrow A + \alpha x y^{T}$

where:

A is an m by n matrix.
alpha is a scalar.
x is a vector of length m.
$y^T$ is the transpose of vector y of length n.

It is expressed as follows:



Figure ESYGR97 not displayed.


It can also be expressed as:



Figure ESYGR98 not displayed.


CGERC and ZGERC compute a slightly different rank-one update of a general matrix:

$A \leftarrow A + \alpha x y^{H}$

where:

A is an m by n matrix.
alpha is a scalar.
x is a vector of length m.
$y^H$ is the conjugate transpose of vector y of length n.

It is expressed as follows:



Figure ESYGR99 not displayed.


It can also be expressed as:



Figure ESYGR100 not displayed.
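
Because the figures above are not displayed, the two updates are restated here element-wise. This is a reconstruction from the definitions in this section, not a copy of the original figures. For i = 1, ..., m and j = 1, ..., n, the update using the transpose of y is:

$$a_{ij} \leftarrow a_{ij} + \alpha x_i y_j$$

and the update using the conjugate transpose of y, with $\bar{y}_j$ the complex conjugate of $y_j$, is:

$$a_{ij} \leftarrow a_{ij} + \alpha x_i \bar{y}_j$$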


See references [34], [35], and [73]. No computation is performed if m, n, or alpha is zero. For CGERU and CGERC, intermediate results are accumulated in long precision. For SGER, intermediate results are accumulated in long precision on some platforms.
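
As an illustration of the real rank-one update, here is a minimal C sketch written as plain loops. It is not the ESSL implementation: the name ref_sger is hypothetical, the array is assumed to be column-major with 0-based C indexing, and negative strides are handled with the usual BLAS convention.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference loop, not the ESSL routine.
   Computes A <- A + alpha*x*y^T for a real m by n matrix stored
   column-major with leading dimension lda. */
void ref_sger(int m, int n, float alpha,
              const float *x, int incx,
              const float *y, int incy,
              float *a, int lda)
{
    assert(m >= 0 && n >= 0 && lda > 0 && lda >= m);
    if (m == 0 || n == 0 || alpha == 0.0f)
        return;                       /* no computation is performed */

    /* With a negative stride, logical element 1 sits at the far end. */
    ptrdiff_t x0 = incx < 0 ? (ptrdiff_t)(1 - m) * incx : 0;
    ptrdiff_t y0 = incy < 0 ? (ptrdiff_t)(1 - n) * incy : 0;

    for (int j = 0; j < n; j++) {
        float ayj = alpha * y[y0 + (ptrdiff_t)j * incy];
        for (int i = 0; i < m; i++)   /* column j gets x scaled by alpha*y(j) */
            a[i + (size_t)j * lda] += x[x0 + (ptrdiff_t)i * incx] * ayj;
    }
}
```

Each column j of A receives the vector x scaled by alpha times element j of y, so rows m+1 through lda of the array are never touched.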

Error Conditions

Resource Errors

Unable to allocate internal work area.

Computational Errors

None

Input-Argument Errors
  1. m < 0
  2. n < 0
  3. lda <= 0
  4. m > lda

Example 1

This example shows a matrix, A, contained in a larger array, A. The strides of vectors x and y are positive. Because lda is 10 and n is 3, array A must be declared as A(E1:E2,F1:F2), where E2-E1+1=10 and F2-F1+1 >= 3. For this example, array A is declared as A(1:10,0:2).

Call Statement and Input
           M   N  ALPHA  X  INCX  Y  INCY    A     LDA
           |   |    |    |   |    |   |      |      |
CALL SGER( 4 , 3 , 1.0 , X , 1  , Y , 2  , A(1,0) , 10 )
 
X        =  (3.0, 2.0, 1.0, 4.0)
Y        =  (1.0, . , 2.0, . , 3.0)
    *               *
    | 1.0  2.0  3.0 |
    | 2.0  2.0  4.0 |
    | 3.0  2.0  2.0 |
    | 4.0  2.0  1.0 |
A = |  .    .    .  |
    |  .    .    .  |
    |  .    .    .  |
    |  .    .    .  |
    |  .    .    .  |
    |  .    .    .  |
    *               *

Output
    *                 *
    | 4.0   8.0  12.0 |
    | 4.0   6.0  10.0 |
    | 4.0   4.0   5.0 |
    | 8.0  10.0  13.0 |
A = |  .     .     .  |
    |  .     .     .  |
    |  .     .     .  |
    |  .     .     .  |
    |  .     .     .  |
    |  .     .     .  |
    *                 *

Example 2

This example shows a matrix, A, contained in a larger array, A. The strides of vectors x and y are of opposite sign. For y, which has negative stride, processing begins at element Y(5), which is 1.0. Array A must follow the same rules as given in Example 1. For this example, array A is declared as A(-1:8,1:3).

Call Statement and Input
           M   N  ALPHA  X  INCX  Y   INCY     A     LDA
           |   |    |    |   |    |    |       |      |
CALL SGER( 4 , 3 , 1.0 , X , 1  , Y , -2  , A(-1,1) , 10 )
X        =  (3.0, 2.0, 1.0, 4.0)
Y        =  (3.0, . , 2.0, . , 1.0)
A        =(same as input A in Example 1)

Output
A        =(same as output A in Example 1)

Example 3

This example shows a matrix, A, contained in a larger array, A, and the first element of the matrix is not the first element of the array. Array A must follow the same rules as given in Example 1. For this example, array A is declared as A(1:10,1:3).

Call Statement and Input
           M   N  ALPHA  X  INCX  Y  INCY    A     LDA
           |   |    |    |   |    |   |      |      |
CALL SGER( 4 , 3 , 1.0 , X , 3  , Y , 1  , A(4,1) , 10 )
 
X        =  (3.0, . , . , 2.0, . , . , 1.0, . , . , 4.0)
Y        =  (1.0, 2.0, 3.0)
    *               *
    | .    .    .   |
    | .    .    .   |
    | .    .    .   |
    | 1.0  2.0  3.0 |
A = | 2.0  2.0  4.0 |
    | 3.0  2.0  2.0 |
    | 4.0  2.0  1.0 |
    |  .    .    .  |
    |  .    .    .  |
    |  .    .    .  |
    *               *

Output
    *                 *
    | .     .     .   |
    | .     .     .   |
    | .     .     .   |
    | 4.0   8.0  12.0 |
A = | 4.0   6.0  10.0 |
    | 4.0   4.0   5.0 |
    | 8.0  10.0  13.0 |
    |  .     .     .  |
    |  .     .     .  |
    |  .     .     .  |
    *                 *

Example 4

This example shows a matrix, A, and array, A, having the same number of rows. For this case, m and lda are equal. Because lda is 4 and n is 3, array A must be declared as A(E1:E2,F1:F2), where E2-E1+1=4 and F2-F1+1 >= 3. For this example, array A is declared as A(1:4,0:2).

Call Statement and Input
           M   N  ALPHA  X  INCX  Y  INCY    A     LDA
           |   |    |    |   |    |   |      |      |
CALL SGER( 4 , 3 , 1.0 , X , 1  , Y , 1  , A(1,0) , 4 )
 
X        =  (3.0, 2.0, 1.0, 4.0)
Y        =  (1.0, 2.0, 3.0)
    *               *
    | 1.0  2.0  3.0 |
A = | 2.0  2.0  4.0 |
    | 3.0  2.0  2.0 |
    | 4.0  2.0  1.0 |
    *               *

Output
    *                 *
    | 4.0   8.0  12.0 |
A = | 4.0   6.0  10.0 |
    | 4.0   4.0   5.0 |
    | 8.0  10.0  13.0 |
    *                 *

Example 5

This example shows a computation in which the scalar value for alpha is greater than 1. Array A must follow the same rules as given in Example 4. For this example, array A is declared as A(-1:2,1:3).

Call Statement and Input
           M   N  ALPHA  X  INCX  Y  INCY     A     LDA
           |   |    |    |   |    |   |       |      |
CALL SGER( 4 , 3 , 2.0 , X , 1  , Y , 1  , A(-1,1) , 4 )
X        =  (3.0, 2.0, 1.0, 4.0)
Y        =  (1.0, 2.0, 3.0)
A        =(same as input A in Example 4)

Output
    *                  *
    |  7.0  14.0  21.0 |
A = |  6.0  10.0  16.0 |
    |  5.0   6.0   8.0 |
    | 12.0  18.0  25.0 |
    *                  *

Example 6

This example shows a rank-one update in which all data items contain complex numbers, and the transpose $y^T$ is used in the computation. Matrix A is contained in a larger array, A. The strides of vectors x and y are positive. The Fortran DIMENSION statement for array A must follow the same rules as given in Example 1. For this example, array A is declared as A(1:10,0:2).

Call Statement and Input
            M   N   ALPHA   X  INCX  Y  INCY    A     LDA
            |   |     |     |   |    |   |      |      |
CALL CGERU( 5 , 3 , ALPHA , X , 1  , Y , 1  , A(1,0) , 10 )
 
ALPHA    =  (1.0, 0.0)
X        =  ((1.0, 2.0), (4.0, 0.0), (1.0, 1.0), (3.0, 4.0),
             (2.0, 0.0))
Y        =  ((1.0, 2.0), (4.0, 0.0), (1.0, -1.0))
    *                                    *
    | (1.0, 2.0)  (3.0, 5.0)  (2.0, 0.0) |
    | (2.0, 3.0)  (7.0, 9.0)  (4.0, 8.0) |
    | (7.0, 4.0)  (1.0, 4.0)  (6.0, 0.0) |
    | (8.0, 2.0)  (2.0, 5.0)  (8.0, 0.0) |
A = | (9.0, 1.0)  (3.0, 6.0)  (1.0, 0.0) |
    |     .           .           .      |
    |     .           .           .      |
    |     .           .           .      |
    |     .           .           .      |
    |     .           .           .      |
    *                                    *

Output
    *                                             *
    | (-2.0,   6.0)   (7.0,  13.0)   (5.0,   1.0) |
    |  (6.0,  11.0)  (23.0,   9.0)   (8.0,   4.0) |
    |  (6.0,   7.0)   (5.0,   8.0)   (8.0,   0.0) |
    |  (3.0,  12.0)  (14.0,  21.0)  (15.0,   1.0) |
A = | (11.0,   5.0)  (11.0,   6.0)   (3.0,  -2.0) |
    |      .              .              .        |
    |      .              .              .        |
    |      .              .              .        |
    |      .              .              .        |
    |      .              .              .        |
    *                                             *

Example 7

This example shows a rank-one update in which all data items contain complex numbers, and the conjugate transpose $y^H$ is used in the computation. Matrix A is contained in a larger array, A. The strides of vectors x and y are positive. The Fortran DIMENSION statement for array A must follow the same rules as given in Example 1. For this example, array A is declared as A(1:10,0:2).

Call Statement and Input
            M   N   ALPHA   X  INCX  Y  INCY    A     LDA
            |   |     |     |   |    |   |      |      |
CALL CGERC( 5 , 3 , ALPHA , X , 1  , Y , 1  , A(1,0) , 10 )
ALPHA    =  (1.0, 0.0)
X        =  ((1.0, 2.0), (4.0, 0.0), (1.0, 1.0), (3.0, 4.0),
             (2.0, 0.0))
Y        =  ((1.0, 2.0), (4.0, 0.0), (1.0, -1.0))
A        =(same as input A in Example 6 )

Output
    *                                            *
    |  (6.0,   2.0)   (7.0,  13.0)  (1.0,   3.0) |
    |  (6.0,  -5.0)  (23.0,   9.0)  (8.0,  12.0) |
    | (10.0,   3.0)   (5.0,   8.0)  (6.0,   2.0) |
    | (19.0,   0.0)  (14.0,  21.0)  (7.0,   7.0) |
A = | (11.0,  -3.0)  (11.0,   6.0)  (3.0,   2.0) |
    |      .              .             .        |
    |      .              .             .        |
    |      .              .             .        |
    |      .              .             .        |
    |      .              .             .        |
    *                                            *
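
Examples 6 and 7 differ only in whether y is conjugated before the outer product is formed. The C sketch below makes that difference explicit. It assumes a compiler with C99 <complex.h> support and unit strides; the name ref_cger and its conj_y flag are hypothetical and are not part of the ESSL interface.

```c
#include <assert.h>
#include <complex.h>
#include <stddef.h>

/* Hypothetical reference loop, not the ESSL routine.
   conj_y == 0: A <- A + alpha*x*y^T  (as in CGERU)
   conj_y != 0: A <- A + alpha*x*y^H  (as in CGERC)
   A is column-major with leading dimension lda; x and y have stride 1. */
void ref_cger(int conj_y, int m, int n, float complex alpha,
              const float complex *x, const float complex *y,
              float complex *a, int lda)
{
    assert(m >= 0 && n >= 0 && lda > 0 && lda >= m);
    for (int j = 0; j < n; j++) {
        /* The only difference between the two updates is this line. */
        float complex yj = conj_y ? conjf(y[j]) : y[j];
        for (int i = 0; i < m; i++)
            a[i + (size_t)j * lda] += alpha * x[i] * yj;
    }
}
```

Because the second element of y in these examples is real, the second column of the output is the same in both examples.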

SSPMV, DSPMV, CHPMV, ZHPMV, SSYMV, DSYMV, CHEMV, ZHEMV, SSLMX, and DSLMX--Matrix-Vector Product for a Real Symmetric or Complex Hermitian Matrix

SSPMV, DSPMV, CHPMV, ZHPMV, SSYMV, DSYMV, CHEMV, and ZHEMV compute the matrix-vector product for either a real symmetric matrix or a complex Hermitian matrix, using the scalars alpha and beta, matrix A, and vectors x and y:

$y \leftarrow \beta y + \alpha A x$

SSLMX and DSLMX compute the matrix-vector product for a real symmetric matrix, using the scalar alpha, matrix A, and vectors x and y:

$y \leftarrow y + \alpha A x$

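The following storage modes are used:

For SSPMV, DSPMV, CHPMV, and ZHPMV, matrix A is stored in upper- or lower-packed storage mode.
For SSYMV, DSYMV, CHEMV, and ZHEMV, matrix A is stored in upper or lower storage mode.
For SSLMX and DSLMX, matrix A is stored in lower-packed storage mode.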

Table 64. Data Types

alpha, beta, A, x, y       Subprogram
Short-precision real       SSPMV, SSYMV, and SSLMX
Long-precision real        DSPMV, DSYMV, and DSLMX
Short-precision complex    CHPMV and CHEMV
Long-precision complex     ZHPMV and ZHEMV

Note: SSPMV and DSPMV are Level 2 BLAS subroutines. You should use these subroutines instead of SSLMX and DSLMX, which are provided only for compatibility with earlier releases of ESSL.

Syntax

Fortran      CALL SSPMV | DSPMV | CHPMV | ZHPMV (uplo, n, alpha, ap, x, incx, beta, y, incy)
             CALL SSYMV | DSYMV | CHEMV | ZHEMV (uplo, n, alpha, a, lda, x, incx, beta, y, incy)
             CALL SSLMX | DSLMX (n, alpha, ap, x, incx, y, incy)

C and C++    sspmv | dspmv | chpmv | zhpmv (uplo, n, alpha, ap, x, incx, beta, y, incy);
             ssymv | dsymv | chemv | zhemv (uplo, n, alpha, a, lda, x, incx, beta, y, incy);
             sslmx | dslmx (n, alpha, ap, x, incx, y, incy);

PL/I         CALL SSPMV | DSPMV | CHPMV | ZHPMV (uplo, n, alpha, ap, x, incx, beta, y, incy);
             CALL SSYMV | DSYMV | CHEMV | ZHEMV (uplo, n, alpha, a, lda, x, incx, beta, y, incy);
             CALL SSLMX | DSLMX (n, alpha, ap, x, incx, y, incy);

On Entry

uplo

indicates the storage mode used for matrix A, where:

If uplo = 'U', A is stored in upper-packed or upper storage mode.

If uplo = 'L', A is stored in lower-packed or lower storage mode.

Specified as: a single character. It must be 'U' or 'L'.

n

is the number of elements in vectors x and y and the order of matrix A. Specified as: a fullword integer; n >= 0.

alpha

is the scaling constant alpha. Specified as: a number of the data type indicated in Table 64.

ap

has the following meaning:

For SSPMV and DSPMV, ap is the real symmetric matrix A of order n, stored in upper- or lower-packed storage mode.

For CHPMV and ZHPMV, ap is the complex Hermitian matrix A of order n, stored in upper- or lower-packed storage mode.

For SSLMX and DSLMX, ap is the real symmetric matrix A of order n, stored in lower-packed storage mode.

Specified as: a one-dimensional array of (at least) length n(n+1)/2, containing numbers of the data type indicated in Table 64.

a

has the following meaning:

For SSYMV and DSYMV, a is the real symmetric matrix A of order n, stored in upper or lower storage mode.

For CHEMV and ZHEMV, a is the complex Hermitian matrix A of order n, stored in upper or lower storage mode.

Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 64.

lda

is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= n.

x

is the vector x of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 64.

incx

is the stride for vector x. Specified as: a fullword integer, where:

For SSPMV, DSPMV, CHPMV, ZHPMV, SSYMV, DSYMV, CHEMV, and ZHEMV, incx < 0 or incx > 0.

For SSLMX and DSLMX, incx can have any value.

beta

is the scaling constant beta. Specified as: a number of the data type indicated in Table 64.

y

is the vector y of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incy|, containing numbers of the data type indicated in Table 64.

incy

is the stride for vector y. Specified as: a fullword integer; incy > 0 or incy < 0.

On Return

y

is the vector y of length n, containing the result of the computation. Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 64.

Notes

  1. All subroutines accept lowercase letters for the uplo argument.

  2. The vector y must have no common elements with vector x or matrix A; otherwise, results are unpredictable. See "Concepts".

  3. On input, the imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values.

  4. For a description of how symmetric matrices are stored in upper- or lower-packed storage mode and upper or lower storage mode, see "Symmetric Matrix". For a description of how complex Hermitian matrices are stored in upper- or lower-packed storage mode and upper or lower storage mode, see "Complex Hermitian Matrix".
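
As a rough guide to the packed layouts mentioned in the last note, the C sketch below maps an element of a real symmetric matrix to its position in the ap array. The column-by-column ordering is inferred from the AP arrays shown in Examples 1 and 2 of this section; "Symmetric Matrix" remains the authoritative description. The name packed_pos is hypothetical, and all indices are 0-based.

```c
#include <assert.h>
#include <stddef.h>

/* 0-based position in ap of element (i,j) of a real symmetric matrix
   of order n. An index pair outside the stored triangle is first
   reflected into it, because A(i,j) = A(j,i). */
size_t packed_pos(char uplo, int n, int i, int j)
{
    assert(n > 0 && i >= 0 && i < n && j >= 0 && j < n);
    int upper = (uplo == 'U' || uplo == 'u');
    int r = i, c = j;
    if (upper ? (r > c) : (r < c)) { r = j; c = i; }
    if (upper)   /* column c holds c+1 elements, rows 0..c */
        return (size_t)r + (size_t)c * (c + 1) / 2;
    /* lower: column c holds n-c elements, rows c..n-1 */
    return (size_t)r + (size_t)c * (2 * n - c - 1) / 2;
}
```

For a complex Hermitian matrix the same positions would apply, but a reflected element would also need to be conjugated; that case is not handled by this sketch.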

Function

These subroutines perform the computations described in the two sections below. See references [34], [35], and [73]. For SSPMV, DSPMV, CHPMV, ZHPMV, SSYMV, DSYMV, CHEMV, and ZHEMV, if n is zero or if alpha is zero and beta is one, no computation is performed. For SSLMX and DSLMX, if n or alpha is zero, no computation is performed.

For SSLMX, SSPMV, SSYMV, CHPMV, and CHEMV, intermediate results are accumulated in long precision. However, several intermediate stores may occur for each element of the vector y.

For SSPMV, DSPMV, CHPMV, ZHPMV, SSYMV, DSYMV, CHEMV, and ZHEMV

These subroutines compute the matrix-vector product for either a real symmetric matrix or a complex Hermitian matrix:

$y \leftarrow \beta y + \alpha A x$

where:

y is a vector of length n.
alpha is a scalar.
beta is a scalar.
A is a real symmetric or complex Hermitian matrix of order n.
x is a vector of length n.

It is expressed as follows:



Figure ESYGR101 not displayed.


For SSLMX and DSLMX

These subroutines compute the matrix-vector product for a real symmetric matrix stored in lower-packed storage mode:

y <-- y + alpha A x

where:

y is a vector of length n.
alpha is a scalar.
A is a real symmetric matrix of order n.
x is a vector of length n.

It is expressed as follows:



Figure ESYGR102 not displayed.


Error Conditions

Computational Errors

None

Input-Argument Errors
  1. uplo <>  'L' or 'U'
  2. n < 0
  3. lda < n
  4. lda <= 0
  5. incx = 0
  6. incy = 0

Example 1

This example shows vectors x and y with positive strides and a real symmetric matrix A of order 3, stored in lower-packed storage mode. Matrix A is:

                      *               *
                      | 8.0  4.0  2.0 |
                      | 4.0  6.0  7.0 |
                      | 2.0  7.0  3.0 |
                      *               *

Call Statement and Input
            UPLO  N  ALPHA  AP   X  INCX BETA  Y  INCY
             |    |    |     |   |   |    |    |   |
CALL SSPMV( 'L' , 3 , 1.0 , AP , X , 1 , 1.0 , Y , 2 )
 
AP       =  (8.0, 4.0, 2.0, 6.0, 7.0, 3.0)
X        =  (3.0, 2.0, 1.0)
Y        =  (5.0, . , 3.0, . , 2.0)

Output
Y        =  (39.0, . , 34.0, . , 25.0)

Example 2

This example shows vectors x and y having strides of opposite signs. For x, which has a negative stride, processing begins at element X(5), which is 1.0. The real symmetric matrix A of order 3 is stored in upper-packed storage mode. It uses the same input matrix A as in Example 1.

Call Statement and Input
            UPLO  N  ALPHA  AP   X  INCX  BETA  Y  INCY
             |    |    |     |   |    |    |    |   |
CALL SSPMV( 'U' , 3 , 1.0 , AP , X , -2 , 2.0 , Y , 1 )
 
AP       =  (8.0, 4.0, 6.0, 2.0, 7.0, 3.0)
X        =  (4.0, . , 2.0, . , 1.0)
Y        =  (6.0, 5.0, 4.0)

Output
Y        =  (36.0, 54.0, 36.0)

Example 3

This example shows vectors x and y with positive strides and a complex Hermitian matrix A of order 3, stored in lower-packed storage mode. Matrix A is:

            *                                      *
            |  (1.0, 0.0)  (3.0, 5.0)  (2.0, -3.0) |
            | (3.0, -5.0)  (7.0, 0.0)  (4.0, -8.0) |
            |  (2.0, 3.0)  (4.0, 8.0)   (6.0, 0.0) |
            *                                      *
Note: On input, the imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values.

Call Statement and Input
            UPLO  N   ALPHA   AP   X  INCX BETA   Y  INCY
             |    |     |      |   |   |    |     |   |
CALL CHPMV( 'L' , 3 , ALPHA , AP , X , 1 , BETA , Y , 2 )
ALPHA    =  (1.0, 0.0)
AP       =  ((1.0, . ), (3.0, -5.0), (2.0, 3.0), (7.0, . ),
             (4.0, 8.0), (6.0, . ))
X        =  ((1.0, 2.0), (4.0, 0.0), (3.0, 4.0))
BETA     =  (1.0, 0.0)
Y        =  ((1.0, 0.0), . , (2.0, -1.0), . , (2.0, 1.0))

Output
Y        =  ((32.0, 21.0), . , (87.0, -8.0), . , (32.0, 64.0))

Example 4

This example shows vectors x and y having strides of opposite signs. For x, which has a negative stride, processing begins at element X(5), which is (1.0, 2.0). The complex Hermitian matrix A of order 3 is stored in upper-packed storage mode. It uses the same input matrix A as in Example 3.
Note: On input, the imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values.

Call Statement and Input
            UPLO  N   ALPHA   AP   X  INCX  BETA   Y INCY
             |    |     |     |    |    |    |     |   |
CALL CHPMV( 'U' , 3 , ALPHA , AP , X , -2 , BETA , Y , 2 )
ALPHA    =  (1.0, 0.0)
AP       =  ((1.0, . ), (3.0, 5.0), (7.0, . ), (2.0, -3.0),
             (4.0, -8.0), (6.0, . ))
X        =  ((3.0, 4.0), . , (4.0, 0.0), . , (1.0, 2.0))
BETA     =  (0.0, 0.0)
Y        =  (not relevant)

Output
Y        =  ((31.0, 21.0), . , (85.0, -7.0), . , (30.0, 63.0))
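
The complex Hermitian case can be modeled in the same way. The sketch below is a hedged reference model, not ESSL code: `hermitian_element` and `hpmv` are hypothetical names, the placeholder `JUNK` stands in for the diagonal imaginary parts that do not have to be set, and the sketch assumes that y is not read on input when beta is zero, which is how Example 4 presents it.

```python
def hermitian_element(uplo, n, ap, i, j):
    """Element (i, j), 0-based, of a Hermitian matrix held in packed storage."""
    upper = uplo.upper() == "U"
    r, c = (min(i, j), max(i, j)) if upper else (max(i, j), min(i, j))
    k = r + c * (c + 1) // 2 if upper else r + c * (2 * n - c - 1) // 2
    v = ap[k]
    if i == j:
        return complex(v.real, 0.0)          # diagonal imaginary part is ignored
    # The stored triangle holds a(r,c); the mirrored element is its conjugate.
    return v if (r, c) == (i, j) else v.conjugate()


def hpmv(uplo, n, alpha, ap, x, incx, beta, y, incy):
    """Reference sketch of y <- beta*y + alpha*A*x for Hermitian packed A."""
    x0 = 0 if incx > 0 else (n - 1) * (-incx)
    y0 = 0 if incy > 0 else (n - 1) * (-incy)
    xs = [x[x0 + k * incx] for k in range(n)]
    for i in range(n):
        acc = sum((hermitian_element(uplo, n, ap, i, j) * xs[j]
                   for j in range(n)), 0j)
        pos = y0 + i * incy
        # Assumption: with beta = 0 the input value of y is not referenced.
        y[pos] = alpha * acc if beta == 0 else beta * y[pos] + alpha * acc
    return y


JUNK = 99.0  # stands for an unset imaginary part on the diagonal
AP = [complex(1.0, JUNK), 3.0 + 5.0j, complex(7.0, JUNK),
      2.0 - 3.0j, 4.0 - 8.0j, complex(6.0, JUNK)]
X = [3.0 + 4.0j, None, 4.0 + 0.0j, None, 1.0 + 2.0j]
result = hpmv("U", 3, 1.0 + 0.0j, AP, X, -2, 0.0 + 0.0j, [None] * 5, 2)
print(result)
```

The call uses the input of Example 4; the junk values placed in the diagonal imaginary parts do not affect the result.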

Example 5

This example shows vectors x and y with positive strides and a real symmetric matrix A of order 3, stored in lower storage mode. It uses the same input matrix A as in Example 1.

Call Statement and Input
            UPLO  N  ALPHA  A  LDA  X  INCX BETA  Y INCY
             |    |    |    |   |   |   |    |    |   |
CALL SSYMV( 'L' , 3 , 1.0 , A , 3 , X , 1 , 1.0 , Y , 2 )
 
    *               *
    | 8.0  .    .   |
A = | 4.0  6.0  .   |
    | 2.0  7.0  3.0 |
    *               *
 
X        =  (3.0, 2.0, 1.0)
Y        =  (5.0, . , 3.0, . , 2.0)

Output
Y        =  (39.0, . , 34.0, . , 25.0)

Example 6

This example shows vectors x and y having strides of opposite signs. For x, which has a negative stride, processing begins at element X(5), which is 1.0. The real symmetric matrix A of order 3 is stored in upper storage mode. It uses the same input matrix A as in Example 1.

Call Statement and Input
            UPLO  N  ALPHA  A  LDA  X  INCX  BETA  Y  INCY
             |    |    |    |   |   |    |    |    |   |
CALL SSYMV( 'U' , 3 , 1.0 , A , 4 , X , -2 , 2.0 , Y , 1 )
 
    *               *
    | 8.0  4.0  2.0 |
A = |  .   6.0  7.0 |
    |  .    .   3.0 |
    |  .    .    .  |
    *               *
 
X        =  (4.0, . , 2.0, . , 1.0)
Y        =  (6.0, 5.0, 4.0)

Output
Y        =  (36.0, 54.0, 36.0)
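
To show how lda and the uplo argument interact in Example 6, here is a minimal sketch of the same computation for upper or lower storage mode. It is not ESSL code; the name `symv` is hypothetical, and the array A is modeled as a flat Python list in column-major (Fortran) order, with `None` marking elements that are never referenced.

```python
def symv(uplo, n, alpha, a, lda, x, incx, beta, y, incy):
    """Reference sketch of y <- beta*y + alpha*A*x for full-storage A.

    a is a flat list holding an lda-by-n array in column-major order.
    """
    if lda <= 0 or lda < n:
        raise ValueError("lda must be positive and at least n")
    upper = uplo.upper() == "U"

    def elem(i, j):
        # Only the triangle named by uplo is referenced; mirror the other one.
        if (upper and i > j) or (not upper and i < j):
            i, j = j, i
        return a[i + j * lda]

    x0 = 0 if incx > 0 else (n - 1) * (-incx)
    y0 = 0 if incy > 0 else (n - 1) * (-incy)
    xs = [x[x0 + k * incx] for k in range(n)]
    for i in range(n):
        acc = sum(elem(i, j) * xs[j] for j in range(n))
        pos = y0 + i * incy
        y[pos] = beta * y[pos] + alpha * acc
    return y


NA = None  # unreferenced array element
A = [8.0, NA, NA, NA,      # column 1 of the 4-by-3 array
     4.0, 6.0, NA, NA,     # column 2
     2.0, 7.0, 3.0, NA]    # column 3
y6 = symv("U", 3, 1.0, A, 4, [4.0, NA, 2.0, NA, 1.0], -2, 2.0, [6.0, 5.0, 4.0], 1)
print(y6)
```

Because lda = 4 exceeds n = 3, each column carries one padding element; the sketch yields (36.0, 54.0, 36.0) for the input of Example 6 without touching the padding or the lower triangle.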

Example 7

This example shows vectors x and y with positive strides and a complex Hermitian matrix A of order 3, stored in lower storage mode. It uses the same input matrix A as in Example 3.
Note: On input, the imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values.

Call Statement and Input
            UPLO  N   ALPHA   A  LDA  X  INCX BETA   Y INCY
             |    |     |     |   |   |   |    |     |   |
CALL CHEMV( 'L' , 3 , ALPHA , A , 3 , X , 1 , BETA , Y , 2 )
 
ALPHA    =  (1.0, 0.0)
 
        *                                    *
        | (1.0,  . )       .          .      |
A    =  | (3.0, -5.0)  (7.0, .  )     .      |
        | (2.0,  3.0)  (4.0, 8.0) (6.0,  . ) |
        *                                    *
 
X        =  ((1.0, 2.0), (4.0, 0.0), (3.0, 4.0))
BETA     =  (1.0, 0.0)
Y        =  ((1.0, 0.0), . , (2.0, -1.0), . , (2.0, 1.0))

Output
Y        =  ((32.0, 21.0), . , (87.0, -8.0), . , (32.0, 64.0))

Example 8

This example shows vectors x and y having strides of opposite signs. For x, which has a negative stride, processing begins at element X(5), which is (1.0, 2.0). The complex Hermitian matrix A of order 3 is stored in upper storage mode. It uses the same input matrix A as in Example 3.
Note: On input, the imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values.

Call Statement and Input
            UPLO  N   ALPHA   A  LDA  X  INCX  BETA   Y INCY
             |    |     |     |   |   |    |    |     |   |
CALL CHEMV( 'U' , 3 , ALPHA , A , 3 , X , -2 , BETA , Y , 2 )
 
ALPHA    =  (1.0, 0.0)
        *                                    *
        | (1.0, . )  (3.0, 5.0)  (2.0, -3.0) |
A    =  |     .      (7.0,  . )  (4.0, -8.0) |
        |     .          .       (6.0,   . ) |
        *                                    *
X        =  ((3.0, 4.0), . , (4.0, 0.0), . , (1.0, 2.0))
BETA     =  (0.0, 0.0)
Y        =  (not relevant)

Output
Y        =  ((31.0, 21.0), . , (85.0, -7.0), . , (30.0, 63.0))

Example 9

This example shows vectors x and y with positive strides and a real symmetric matrix A of order 3. Matrix A is:

                    *               *
                    | 8.0  4.0  2.0 |
                    | 4.0  6.0  7.0 |
                    | 2.0  7.0  3.0 |
                    *               *

Call Statement and Input
            N  ALPHA  AP   X  INCX  Y  INCY
            |    |    |    |   |    |   |
CALL SSLMX( 3 , 1.0 , AP , X , 1  , Y , 2  )
 
AP       =  (8.0, 4.0, 2.0, 6.0, 7.0, 3.0)
X        =  (3.0, 2.0, 1.0)
Y        =  (5.0, . , 3.0, . , 2.0)

Output
Y        =  (39.0, . , 34.0, . , 25.0)

SSPR, DSPR, CHPR, ZHPR, SSYR, DSYR, CHER, ZHER, SSLR1, and DSLR1--Rank-One Update of a Real Symmetric or Complex Hermitian Matrix

SSPR, DSPR, SSYR, DSYR, SSLR1, and DSLR1 compute the rank-one update of a real symmetric matrix, using the scalar alpha, matrix A, vector x, and its transpose xT:

A <-- A + alpha x xT

CHPR, ZHPR, CHER, and ZHER compute the rank-one update of a complex Hermitian matrix, using the scalar alpha, matrix A, vector x, and its conjugate transpose xH:

A <-- A + alpha x xH

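The following storage modes are used:

For SSPR, DSPR, CHPR, and ZHPR, matrix A is stored in upper- or lower-packed storage mode.
For SSYR, DSYR, CHER, and ZHER, matrix A is stored in upper or lower storage mode.
For SSLR1 and DSLR1, matrix A is stored in lower-packed storage mode.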


Table 65. Data Types
A, x                      alpha                   Subprogram
Short-precision real      Short-precision real    SSPR, SSYR, and SSLR1
Long-precision real       Long-precision real     DSPR, DSYR, and DSLR1
Short-precision complex   Short-precision real    CHPR and CHER
Long-precision complex    Long-precision real     ZHPR and ZHER
Note: SSPR and DSPR are Level 2 BLAS subroutines. You should use these subroutines instead of SSLR1 and DSLR1, which are only provided for compatibility with earlier releases of ESSL.

Syntax

Fortran      CALL SSPR | DSPR | CHPR | ZHPR (uplo, n, alpha, x, incx, ap)
             CALL SSYR | DSYR | CHER | ZHER (uplo, n, alpha, x, incx, a, lda)
             CALL SSLR1 | DSLR1 (n, alpha, x, incx, ap)

C and C++    sspr | dspr | chpr | zhpr (uplo, n, alpha, x, incx, ap);
             ssyr | dsyr | cher | zher (uplo, n, alpha, x, incx, a, lda);
             sslr1 | dslr1 (n, alpha, x, incx, ap);

PL/I         CALL SSPR | DSPR | CHPR | ZHPR (uplo, n, alpha, x, incx, ap);
             CALL SSYR | DSYR | CHER | ZHER (uplo, n, alpha, x, incx, a, lda);
             CALL SSLR1 | DSLR1 (n, alpha, x, incx, ap);

On Entry

uplo

indicates the storage mode used for matrix A, where:

If uplo = 'U', A is stored in upper-packed or upper storage mode.

If uplo = 'L', A is stored in lower-packed or lower storage mode.

Specified as: a single character. It must be 'U' or 'L'.

n

is the number of elements in vector x and the order of matrix A. Specified as: a fullword integer; n >= 0.

alpha

is the scaling constant alpha. Specified as: a number of the data type indicated in Table 65.

x

is the vector x of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 65.

incx

is the stride for vector x. Specified as: a fullword integer, where:

For SSPR, DSPR, CHPR, ZHPR, SSYR, DSYR, CHER, and ZHER, incx < 0 or incx > 0.

For SSLR1 and DSLR1, incx can have any value.

ap

has the following meaning:

For SSPR and DSPR, ap is the real symmetric matrix A of order n, stored in upper- or lower-packed storage mode.

For CHPR and ZHPR, ap is the complex Hermitian matrix A of order n, stored in upper- or lower-packed storage mode.

For SSLR1 and DSLR1, ap is the real symmetric matrix A of order n, stored in lower-packed storage mode.

Specified as: a one-dimensional array of (at least) length n(n+1)/2, containing numbers of the data type indicated in Table 65.

a

has the following meaning:

For SSYR and DSYR, a is the real symmetric matrix A of order n, stored in upper or lower storage mode.

For CHER and ZHER, a is the complex Hermitian matrix A of order n, stored in upper or lower storage mode.

Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 65.

lda

is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= n.
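
The relationship between a full symmetric matrix and the packed array ap can be sketched as follows. This is an illustration only: the helper name `pack` is hypothetical, and the column-by-column ordering is inferred from the AP arrays shown in the examples of this subroutine description.

```python
def pack(a, uplo):
    """Pack the upper or lower triangle of a square matrix column by column.

    a is a list of rows; the result has n(n+1)/2 elements.
    """
    n = len(a)
    if uplo.upper() == "U":
        # Column j contributes rows 0..j of the upper triangle.
        return [a[i][j] for j in range(n) for i in range(j + 1)]
    # Column j contributes rows j..n-1 of the lower triangle.
    return [a[i][j] for j in range(n) for i in range(j, n)]


A = [[8.0, 4.0, 2.0],
     [4.0, 6.0, 7.0],
     [2.0, 7.0, 3.0]]

ap_lower = pack(A, "L")
ap_upper = pack(A, "U")
print(ap_lower)
print(ap_upper)
```

For the order-3 matrix used in the examples, both packed arrays have 3(3+1)/2 = 6 elements and differ only in element order.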

On Return

ap

is the matrix A of order n, containing the results of the computation. Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 65.

a

is the matrix A of order n, containing the results of the computation. Returned as: a two-dimensional array, containing numbers of the data type indicated in Table 65.

Notes

  1. All subroutines accept lowercase letters for the uplo argument.

  2. The vector x must have no common elements with matrix A; otherwise, results are unpredictable. See "Concepts".

  3. On input, the imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values. On output, if alpha <> 0.0, they are set to zero.

  4. For a description of how symmetric matrices are stored in upper- or lower-packed storage mode and upper or lower storage mode, see "Symmetric Matrix". For a description of how complex Hermitian matrices are stored in upper- or lower-packed storage mode and upper or lower storage mode, see "Complex Hermitian Matrix".

Function

These subroutines perform the computations described in the two sections below. See references [34], [35], and [73]. If n or alpha is 0, no computation is performed.

For CHPR and CHER, intermediate results are accumulated in long precision. For SSPR, SSYR, and SSLR1, intermediate results are accumulated in long precision on some platforms.

For SSPR, DSPR, SSYR, DSYR, SSLR1, and DSLR1

These subroutines compute the rank-one update of a real symmetric matrix:

A <-- A + alpha x xT

where:

A is a real symmetric matrix of order n.
alpha is a scalar.
x is a vector of length n.
xT is the transpose of vector x.

It is expressed as follows:



Figure ESYGR103 not displayed.
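
The real symmetric rank-one update can be modeled with a short sketch that walks the packed array in storage order. It is a reference model under the assumptions stated here, not the ESSL implementation: the name `spr` is hypothetical, and the packed ordering is taken from the AP arrays in Example 1 and Example 2.

```python
def spr(uplo, n, alpha, x, incx, ap):
    """Reference sketch of A <- A + alpha*x*xT on a packed triangle, in place."""
    if n == 0 or alpha == 0:
        return ap                            # no computation is performed
    x0 = 0 if incx > 0 else (n - 1) * (-incx)
    xs = [x[x0 + k * incx] for k in range(n)]
    upper = uplo.upper() == "U"
    k = 0                                    # running position in ap
    for j in range(n):
        rows = range(j + 1) if upper else range(j, n)
        for i in rows:
            ap[k] += alpha * xs[i] * xs[j]   # only the stored triangle is updated
            k += 1
    return ap


ap1 = spr("L", 3, 1.0, [3.0, 2.0, 1.0], 1, [8.0, 4.0, 2.0, 6.0, 7.0, 3.0])
ap2 = spr("U", 3, 1.0, [1.0, None, 2.0, None, 3.0], -2,
          [8.0, 4.0, 6.0, 2.0, 7.0, 3.0])
print(ap1)
print(ap2)
```

The two calls use the input of Example 1 and Example 2 of this subroutine description. Because x x^T is symmetric, updating the stored triangle alone is sufficient.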


For CHPR, ZHPR, CHER, and ZHER

These subroutines compute the rank-one update of a complex Hermitian matrix:

A <-- A + alpha x xH

where:

A is a complex Hermitian matrix of order n.
alpha is a scalar.
x is a vector of length n.
xH is the conjugate transpose of vector x.

It is expressed as follows:



Figure ESYGR104 not displayed.
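
For the complex Hermitian case, the sketch below models A <-- A + alpha x xH on a packed triangle and shows the diagonal handling described in the notes. It is illustrative only: `hpr` is a hypothetical name, and `JUNK` stands in for diagonal imaginary parts that are not set on input.

```python
def hpr(uplo, n, alpha, x, incx, ap):
    """Reference sketch of A <- A + alpha*x*xH (alpha real), updating ap in place."""
    if n == 0 or alpha == 0:
        return ap                            # no computation is performed
    x0 = 0 if incx > 0 else (n - 1) * (-incx)
    xs = [x[x0 + k * incx] for k in range(n)]
    upper = uplo.upper() == "U"
    k = 0
    for j in range(n):
        rows = range(j + 1) if upper else range(j, n)
        for i in rows:
            upd = alpha * xs[i] * xs[j].conjugate()
            if i == j:
                # Input imaginary part is ignored; output imaginary part is zero.
                ap[k] = complex(ap[k].real + upd.real, 0.0)
            else:
                ap[k] += upd
            k += 1
    return ap


JUNK = 99.0  # stands for an unset imaginary part on the diagonal
AP = [complex(1.0, JUNK), 3.0 - 5.0j, 2.0 + 3.0j,
      complex(7.0, JUNK), 4.0 + 8.0j, complex(6.0, JUNK)]
X = [1.0 + 2.0j, 4.0 + 0.0j, 3.0 + 4.0j]
ap3 = hpr("L", 3, 1.0, X, 1, AP)
print(ap3)
```

The call uses the input of Example 3 of this subroutine description; the diagonal elements come back with zero imaginary parts regardless of what was stored there.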


Error Conditions

Computational Errors

None

Input-Argument Errors
  1. uplo <>  'L' or 'U'
  2. n < 0
  3. incx = 0
  4. lda <= 0
  5. lda < n

Example 1

This example shows a vector x with a positive stride, and a real symmetric matrix A of order 3, stored in lower-packed storage mode. Matrix A is:

                       *               *
                       | 8.0  4.0  2.0 |
                       | 4.0  6.0  7.0 |
                       | 2.0  7.0  3.0 |
                       *               *

Call Statement and Input
          UPLO   N  ALPHA  X  INCX  AP
            |    |    |    |   |    |
CALL SSPR( 'L' , 3 , 1.0 , X , 1  , AP )
 
X        =  (3.0, 2.0, 1.0)
AP       =  (8.0, 4.0, 2.0, 6.0, 7.0, 3.0)

Output
AP       =  (17.0, 10.0, 5.0, 10.0, 9.0, 4.0)

Example 2

This example shows a vector x with a negative stride, and a real symmetric matrix A of order 3, stored in upper-packed storage mode. It uses the same input matrix A as in Example 1.

Call Statement and Input
          UPLO   N  ALPHA  X  INCX  AP
            |    |    |    |    |   |
CALL SSPR( 'U' , 3 , 1.0 , X , -2 , AP )
 
X        =  (1.0, . , 2.0, . , 3.0)
AP       =  (8.0, 4.0, 6.0, 2.0, 7.0, 3.0)

Output
AP       =  (17.0, 10.0, 10.0, 5.0, 9.0, 4.0)

Example 3

This example shows a vector x with a positive stride, and a complex Hermitian matrix A of order 3, stored in lower-packed storage mode. Matrix A is:

              *                                      *
              |  (1.0, 0.0)  (3.0, 5.0)  (2.0, -3.0) |
              | (3.0, -5.0)  (7.0, 0.0)  (4.0, -8.0) |
              |  (2.0, 3.0)  (4.0, 8.0)   (6.0, 0.0) |
              *                                      *
Note: On input, the imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values. On output, if alpha <> 0.0, they are set to zero.

Call Statement and Input
           UPLO  N  ALPHA  X  INCX AP
            |    |    |    |   |   |
CALL CHPR( 'L' , 3 , 1.0 , X , 1 , AP )
 
X        =  ((1.0, 2.0), (4.0, 0.0), (3.0, 4.0))
AP       =  ((1.0, . ), (3.0, -5.0), (2.0, 3.0), (7.0, . ),
             (4.0, 8.0), (6.0, . ))

Output
AP       =  ((6.0, 0.0), (7.0, -13.0), (13.0, 1.0), (23.0, 0.0),
             (16.0, 24.0), (31.0, 0.0))

Example 4

This example shows a vector x with a negative stride, and a complex Hermitian matrix A of order 3, stored in upper-packed storage mode. It uses the same input matrix A as in Example 3.
Note: On input, the imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values. On output, if alpha <> 0.0, they are set to zero.

Call Statement and Input
           UPLO  N  ALPHA  X  INCX  AP
            |    |    |    |    |   |
CALL CHPR( 'U' , 3 , 1.0 , X , -2 , AP )
 
X        =  ((3.0, 4.0), . , (4.0, 0.0), . , (1.0, 2.0))
AP       =  ((1.0, . ), (3.0, 5.0), (7.0, . ), (2.0, -3.0),
             (4.0, -8.0), (6.0, . ))

Output
AP       =  ((6.0, 0.0), (7.0, 13.0), (23.0, 0.0), (13.0, -1.0),
             (16.0, -24.0), (31.0, 0.0))

Example 5

This example shows a vector x with a positive stride, and a real symmetric matrix A of order 3, stored in lower storage mode. It uses the same input matrix A as in Example 1.

Call Statement and Input
           UPLO  N  ALPHA  X  INCX A  LDA
            |    |    |    |   |   |   |
CALL SSYR( 'L' , 3 , 1.0 , X , 1 , A , 3 )
 
X        =  (3.0, 2.0, 1.0)
 
        *               *
        | 8.0   .    .  |
A    =  | 4.0  6.0   .  |
        | 2.0  7.0  3.0 |
        *               *

Output
        *                 *
        | 17.0    .    .  |
A    =  | 10.0  10.0   .  |
        |  5.0   9.0  4.0 |
        *                 *

Example 6

This example shows a vector x with a negative stride, and a real symmetric matrix A of order 3, stored in upper storage mode. It uses the same input matrix A as in Example 1.

Call Statement and Input
           UPLO  N  ALPHA  X  INCX  A  LDA
            |    |    |    |    |   |   |
CALL SSYR( 'U' , 3 , 1.0 , X , -2 , A , 4 )
 
X        =  (1.0, . , 2.0, . , 3.0)
 
        *               *
        | 8.0  4.0  2.0 |
A    =  |  .   6.0  7.0 |
        |  .    .   3.0 |
        |  .    .    .  |
        *               *

Output
        *                 *
        | 17.0  10.0  5.0 |
A    =  |   .   10.0  9.0 |
        |   .     .   4.0 |
        |   .     .    .  |
        *                 *

Example 7

This example shows a vector x with a positive stride, and a complex Hermitian matrix A of order 3, stored in lower storage mode. It uses the same input matrix A as in Example 3.
Note: On input, the imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values. On output, if alpha <> 0.0, they are set to zero.

Call Statement and Input
           UPLO  N  ALPHA  X  INCX A  LDA
            |    |    |    |   |   |   |
CALL CHER( 'L' , 3 , 1.0 , X , 1 , A , 3 )
 
X        =  ((1.0, 2.0), (4.0, 0.0), (3.0, 4.0))
 
        *                                    *
        |   (1.0, . )       .            .   |
A    =  | (3.0, -5.0)   (7.0, . )        .   |
        |  (2.0, 3.0)  (4.0, 8.0)  (6.0, . ) |
        *                                    *

Output
        *                                         *
        |   (6.0, 0.0)        .            .      |
A    =  | (7.0, -13.0)   (23.0, 0.0)       .      |
        |  (13.0, 1.0)  (16.0, 24.0)  (31.0, 0.0) |
        *                                         *

Example 8

This example shows a vector x with a negative stride, and a complex Hermitian matrix A of order 3, stored in upper storage mode. It uses the same input matrix A as in Example 3.
Note: On input, the imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values. On output, if alpha <> 0.0, they are set to zero.

Call Statement and Input
           UPLO  N  ALPHA  X  INCX  A  LDA
            |    |    |    |    |   |   |
CALL CHER( 'U' , 3 , 1.0 , X , -2 , A , 3 )
 
X        =  ((3.0, 4.0), . , (4.0, 0.0), . , (1.0, 2.0))
 
        *                                    *
        | (1.0, . )  (3.0, 5.0)  (2.0, -3.0) |
A    =  |     .      (7.0,  . )  (4.0, -8.0) |
        |     .          .         (6.0, . ) |
        *                                    *

Output
        *                                        *
        | (6.0, 0.0)  (7.0, 13.0)   (13.0, -1.0) |
A    =  |     .       (23.0, 0.0)  (16.0, -24.0) |
        |     .            .         (31.0, 0.0) |
        *                                        *

Example 9

This example shows a vector x with a positive stride, and a real symmetric matrix A of order 3, stored in lower-packed storage mode. It uses the same input matrix A as in Example 1.

Call Statement and Input
            N  ALPHA  X  INCX  AP
            |    |    |   |    |
CALL SSLR1( 3 , 1.0 , X , 1  , AP )
 
X        =  (3.0, 2.0, 1.0)
AP       =  (8.0, 4.0, 2.0, 6.0, 7.0, 3.0)

Output
AP       =  (17.0, 10.0, 5.0, 10.0, 9.0, 4.0)

SSPR2, DSPR2, CHPR2, ZHPR2, SSYR2, DSYR2, CHER2, ZHER2, SSLR2, and DSLR2--Rank-Two Update of a Real Symmetric or Complex Hermitian Matrix

SSPR2, DSPR2, SSYR2, DSYR2, SSLR2, and DSLR2 compute the rank-two update of a real symmetric matrix, using the scalar alpha, matrix A, vectors x and y, and their transposes xT and yT:

A <-- A + alpha x yT + alpha y xT

CHPR2, ZHPR2, CHER2, and ZHER2 compute the rank-two update of a complex Hermitian matrix, using the scalar alpha, matrix A, vectors x and y, and their conjugate transposes xH and yH:

A <-- A + alpha x yH + conj(alpha) y xH

where conj(alpha) is the complex conjugate of alpha.

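The following storage modes are used:

For SSPR2, DSPR2, CHPR2, and ZHPR2, matrix A is stored in upper- or lower-packed storage mode.
For SSYR2, DSYR2, CHER2, and ZHER2, matrix A is stored in upper or lower storage mode.
For SSLR2 and DSLR2, matrix A is stored in lower-packed storage mode.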


Table 66. Data Types
alpha, A, x, y            Subprogram
Short-precision real      SSPR2, SSYR2, and SSLR2
Long-precision real       DSPR2, DSYR2, and DSLR2
Short-precision complex   CHPR2 and CHER2
Long-precision complex    ZHPR2 and ZHER2
Note: SSPR2 and DSPR2 are Level 2 BLAS subroutines. You should use these subroutines instead of SSLR2 and DSLR2, which are only provided for compatibility with earlier releases of ESSL.

Syntax

Fortran      CALL SSPR2 | DSPR2 | CHPR2 | ZHPR2 (uplo, n, alpha, x, incx, y, incy, ap)
             CALL SSYR2 | DSYR2 | CHER2 | ZHER2 (uplo, n, alpha, x, incx, y, incy, a, lda)
             CALL SSLR2 | DSLR2 (n, alpha, x, incx, y, incy, ap)

C and C++    sspr2 | dspr2 | chpr2 | zhpr2 (uplo, n, alpha, x, incx, y, incy, ap);
             ssyr2 | dsyr2 | cher2 | zher2 (uplo, n, alpha, x, incx, y, incy, a, lda);
             sslr2 | dslr2 (n, alpha, x, incx, y, incy, ap);

PL/I         CALL SSPR2 | DSPR2 | CHPR2 | ZHPR2 (uplo, n, alpha, x, incx, y, incy, ap);
             CALL SSYR2 | DSYR2 | CHER2 | ZHER2 (uplo, n, alpha, x, incx, y, incy, a, lda);
             CALL SSLR2 | DSLR2 (n, alpha, x, incx, y, incy, ap);

On Entry

uplo

indicates the storage mode used for matrix A, where:

If uplo = 'U', A is stored in upper-packed or upper storage mode.

If uplo = 'L', A is stored in lower-packed or lower storage mode.

Specified as: a single character. It must be 'U' or 'L'.

n

is the number of elements in vectors x and y and the order of matrix A. Specified as: a fullword integer; n >= 0.

alpha

is the scaling constant alpha. Specified as: a number of the data type indicated in Table 66.

x

is the vector x of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 66.

incx

is the stride for vector x.

Specified as: a fullword integer, where:

For SSPR2, DSPR2, CHPR2, ZHPR2, SSYR2, DSYR2, CHER2, and ZHER2, incx < 0 or incx > 0.

For SSLR2 and DSLR2, incx can have any value.

y

is the vector y of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incy|, containing numbers of the data type indicated in Table 66.

incy

is the stride for vector y. Specified as: a fullword integer, where:

For SSPR2, DSPR2, CHPR2, ZHPR2, SSYR2, DSYR2, CHER2, and ZHER2, incy < 0 or incy > 0.

For SSLR2 and DSLR2, incy can have any value.

ap

has the following meaning:

For SSPR2 and DSPR2, ap is the real symmetric matrix A of order n, stored in upper- or lower-packed storage mode.

For CHPR2 and ZHPR2, ap is the complex Hermitian matrix A of order n, stored in upper- or lower-packed storage mode.

For SSLR2 and DSLR2, ap is the real symmetric matrix A of order n, stored in lower-packed storage mode.

Specified as: a one-dimensional array of (at least) length n(n+1)/2, containing numbers of the data type indicated in Table 66.

a

has the following meaning:

For SSYR2 and DSYR2, a is the real symmetric matrix A of order n, stored in upper or lower storage mode.

For CHER2 and ZHER2, a is the complex Hermitian matrix A of order n, stored in upper or lower storage mode.

Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 66.

lda

is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= n.

On Return

ap

is the matrix A of order n, containing the results of the computation. Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 66.

a

is the matrix A of order n, containing the results of the computation. Returned as: a two-dimensional array, containing numbers of the data type indicated in Table 66.

Notes

  1. All subroutines accept lowercase letters for the uplo argument.

  2. The vectors x and y must have no common elements with matrix A; otherwise, results are unpredictable. See "Concepts".

  3. On input, the imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values. On output, if alpha <>  zero, the imaginary parts of the diagonal elements are set to zero.

  4. For a description of how symmetric matrices are stored in upper- or lower-packed storage mode and upper or lower storage mode, see "Symmetric Matrix". For a description of how complex Hermitian matrices are stored in upper- or lower-packed storage mode and upper or lower storage mode, see "Complex Hermitian Matrix".

Function

These subroutines perform the computations described in the two sections below. See references [34], [35], and [73]. If n or alpha is zero, no computation is performed.

For SSPR2, SSYR2, SSLR2, CHPR2, and CHER2, intermediate results are accumulated in long precision.

For SSPR2, DSPR2, SSYR2, DSYR2, SSLR2, and DSLR2

These subroutines compute the rank-two update of a real symmetric matrix:

A <-- A + alpha x yT + alpha y xT

where:

A is a real symmetric matrix of order n.
alpha is a scalar.
x is a vector of length n.
xT is the transpose of vector x.
y is a vector of length n.
yT is the transpose of vector y.

It is expressed as follows:



Figure ESYGR106 not displayed.
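
A minimal sketch of the real symmetric rank-two update on a packed triangle follows. It is a reference model, not the ESSL implementation; the name `spr2` is hypothetical, and the packed ordering is taken from the AP arrays in Example 1 and Example 2 of this subroutine description.

```python
def spr2(uplo, n, alpha, x, incx, y, incy, ap):
    """Reference sketch of A <- A + alpha*x*yT + alpha*y*xT, updating ap in place."""
    if n == 0 or alpha == 0:
        return ap                            # no computation is performed
    x0 = 0 if incx > 0 else (n - 1) * (-incx)
    y0 = 0 if incy > 0 else (n - 1) * (-incy)
    xs = [x[x0 + k * incx] for k in range(n)]
    ys = [y[y0 + k * incy] for k in range(n)]
    upper = uplo.upper() == "U"
    k = 0                                    # running position in ap
    for j in range(n):
        rows = range(j + 1) if upper else range(j, n)
        for i in rows:
            ap[k] += alpha * (xs[i] * ys[j] + ys[i] * xs[j])
            k += 1
    return ap


Y = [5.0, None, 3.0, None, 2.0]
r1 = spr2("L", 3, 1.0, [3.0, 2.0, 1.0], 1, Y, 2,
          [8.0, 4.0, 2.0, 6.0, 7.0, 3.0])
r2 = spr2("U", 3, 1.0, [1.0, None, 2.0, None, 3.0], -2, Y, 2,
          [8.0, 4.0, 6.0, 2.0, 7.0, 3.0])
print(r1)
print(r2)
```

The two calls use the input of Example 1 and Example 2. The sum x yT + y xT is symmetric, so only the stored triangle needs to be updated.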


For CHPR2, ZHPR2, CHER2, and ZHER2

These subroutines compute the rank-two update of a complex Hermitian matrix:

A <-- A + alpha x yH + conj(alpha) y xH

where:

A is a complex Hermitian matrix of order n.
alpha is a scalar, and conj(alpha) is its complex conjugate.
x is a vector of length n.
xH is the conjugate transpose of vector x.
y is a vector of length n.
yH is the conjugate transpose of vector y.

It is expressed as follows:



Figure ESYGR108 not displayed.
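
The complex Hermitian rank-two update can be sketched in the same style. The formula used here, A + alpha x yH + conj(alpha) y xH, is the standard Level 2 BLAS definition for these subroutine names and is stated as an assumption because the figure is not displayed; `hpr2` and `JUNK` are hypothetical names, and the code is a reference model rather than ESSL code.

```python
def hpr2(uplo, n, alpha, x, incx, y, incy, ap):
    """Reference sketch of A <- A + alpha*x*yH + conj(alpha)*y*xH on packed ap."""
    if n == 0 or alpha == 0:
        return ap                            # no computation is performed
    x0 = 0 if incx > 0 else (n - 1) * (-incx)
    y0 = 0 if incy > 0 else (n - 1) * (-incy)
    xs = [x[x0 + k * incx] for k in range(n)]
    ys = [y[y0 + k * incy] for k in range(n)]
    upper = uplo.upper() == "U"
    k = 0
    for j in range(n):
        rows = range(j + 1) if upper else range(j, n)
        for i in rows:
            upd = (alpha * xs[i] * ys[j].conjugate()
                   + alpha.conjugate() * ys[i] * xs[j].conjugate())
            if i == j:
                # Input imaginary part is ignored; output imaginary part is zero.
                ap[k] = complex(ap[k].real + upd.real, 0.0)
            else:
                ap[k] += upd
            k += 1
    return ap


JUNK = 99.0  # stands for an unset imaginary part on the diagonal
AP = [complex(1.0, JUNK), 3.0 - 5.0j, 2.0 + 3.0j,
      complex(7.0, JUNK), 4.0 + 8.0j, complex(6.0, JUNK)]
X = [1.0 + 2.0j, 4.0 + 0.0j, 3.0 + 4.0j]
Y = [1.0 + 0.0j, None, 2.0 - 1.0j, None, 2.0 + 1.0j]
r3 = hpr2("L", 3, 1.0 + 0.0j, X, 1, Y, 2, AP)
print(r3)
```

The call uses the input of Example 3 of this subroutine description, and the result agrees with the AP output listed there.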


Error Conditions

Computational Errors

None

Input-Argument Errors
  1. uplo <>  'L' or 'U'
  2. n < 0
  3. incx = 0
  4. incy = 0
  5. lda <= 0
  6. lda < n

Example 1

This example shows vectors x and y with positive strides and a real symmetric matrix A of order 3, stored in lower-packed storage mode. Matrix A is:

                         *               *
                         | 8.0  4.0  2.0 |
                         | 4.0  6.0  7.0 |
                         | 2.0  7.0  3.0 |
                         *               *

Call Statement and Input
            UPLO  N  ALPHA  X  INCX  Y  INCY AP
             |    |    |    |   |    |   |   |
CALL SSPR2( 'L' , 3 , 1.0 , X , 1 ,  Y , 2 , AP )
 
X        =  (3.0, 2.0, 1.0)
Y        =  (5.0, . , 3.0, . , 2.0)
AP       =  (8.0, 4.0, 2.0, 6.0, 7.0, 3.0)

Output
AP       =  (38.0, 23.0, 13.0, 18.0, 14.0, 7.0)

Example 2

This example shows vectors x and y having strides of opposite signs. For x, which has a negative stride, processing begins at element X(5), which is 3.0. The real symmetric matrix A of order 3 is stored in upper-packed storage mode. It uses the same input matrix A as in Example 1.

Call Statement and Input
            UPLO  N  ALPHA  X   INCX Y  INCY AP
             |    |    |    |    |   |   |   |
CALL SSPR2( 'U' , 3 , 1.0 , X , -2 , Y , 2 , AP )
 
X        =  (1.0, . , 2.0, . , 3.0)
Y        =  (5.0, . , 3.0, . , 2.0)
AP       =  (8.0, 4.0, 6.0, 2.0, 7.0, 3.0)

Output
AP       =  (38.0, 23.0, 18.0, 13.0, 14.0, 7.0)

Example 3

This example shows vectors x and y with positive strides and a complex Hermitian matrix A of order 3, stored in lower-packed storage mode. Matrix A is:

                    *                                      *
                    |  (1.0, 0.0)  (3.0, 5.0)  (2.0, -3.0) |
                    | (3.0, -5.0)  (7.0, 0.0)  (4.0, -8.0) |
                    |  (2.0, 3.0)  (4.0, 8.0)   (6.0, 0.0) |
                    *                                      *
Note: On input, the imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values. On output, if alpha <>  zero, the imaginary parts of the diagonal elements are set to zero.

Call Statement and Input
            UPLO  N   ALPHA   X  INCX Y  INCY AP
             |    |     |     |   |   |   |   |
CALL CHPR2( 'L' , 3 , ALPHA , X , 1 , Y , 2 , AP )
 
ALPHA    =  (1.0, 0.0)
X        =  ((1.0, 2.0), (4.0, 0.0), (3.0, 4.0))
Y        =  ((1.0, 0.0), . , (2.0, -1.0), . , (2.0, 1.0))
AP       =  ((1.0, . ), (3.0, -5.0), (2.0, 3.0), (7.0, . ),
             (4.0, 8.0), (6.0, . ))

Output
AP       =  ((3.0, 0.0), (7.0, -10.0), (9.0, 4.0), (23.0, 0.0),
             (14.0, 23.0), (26.0, 0.0))

Example 4

This example shows vectors x and y having strides of opposite signs. For x, which has a negative stride, processing begins at element X(5), which is (1.0, 2.0). The complex Hermitian matrix A of order 3 is stored in upper-packed storage mode. It uses the same input matrix A as in Example 3.
Note: On input, the imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values. On output, if alpha <>  zero, the imaginary parts of the diagonal elements are set to zero.

Call Statement and Input
            UPLO  N   ALPHA   X  INCX  Y  INCY AP
             |    |     |     |    |   |   |   |
CALL CHPR2( 'U' , 3 , ALPHA , X , -2 , Y , 2 , AP )
 
ALPHA    =  (1.0, 0.0)
X        =  ((3.0, 4.0), . , (4.0, 0.0), . , (1.0, 2.0))
Y        =  ((1.0, 0.0), . , (2.0, -1.0), . , (2.0, 1.0))
AP       =  ((1.0, . ), (3.0, 5.0), (7.0, . ), (2.0, -3.0),
             (4.0, -8.0), (6.0, . ))

Output
AP       =  ((3.0, 0.0), (7.0, 10.0), (23.0, 0.0), (9.0, -4.0),
             (14.0, -23.0), (26.0, 0.0))

Example 5

This example shows vectors x and y with positive strides, and a real symmetric matrix A of order 3, stored in lower storage mode. It uses the same input matrix A as in Example 1.

Call Statement and Input
            UPLO  N  ALPHA  X  INCX Y  INCY A  LDA
             |    |    |    |   |   |   |   |   |
CALL SSYR2( 'L' , 3 , 1.0 , X , 1 , Y , 2 , A , 3 )
 
X        =  (3.0, 2.0, 1.0)
Y        =  (5.0, . , 3.0, . , 2.0)
 
        *               *
        | 8.0   .    .  |
A    =  | 4.0  6.0   .  |
        | 2.0  7.0  3.0 |
        *               *

Output
        *                 *
        | 38.0    .    .  |
A    =  | 23.0  18.0   .  |
        | 13.0  14.0  7.0 |
        *                 *

Example 6

This example shows vectors x and y having strides of opposite signs. For x, which has a negative stride, processing begins at element X(5), which is 3.0. The real symmetric matrix A of order 3 is stored in upper storage mode. It uses the same input matrix A as in Example 1.

Call Statement and Input
            UPLO  N  ALPHA  X  INCX  Y  INCY A  LDA
             |    |    |    |    |   |   |   |   |
CALL SSYR2( 'U' , 3 , 1.0 , X , -2 , Y , 2 , A , 4 )
 
X        =  (1.0, . , 2.0, . , 3.0)
Y        =  (5.0, . , 3.0, . , 2.0)
 
        *               *
        | 8.0  4.0  2.0 |
A    =  |  .   6.0  7.0 |
        |  .    .   3.0 |
        |  .    .    .  |
        *               *

Output
        *                  *
        | 38.0  23.0  13.0 |
A    =  |   .   18.0  14.0 |
        |   .     .    7.0 |
        |   .     .     .  |
        *                  *
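
The effect of a negative stride can be sketched as follows. This is a hedged illustration: `gather` is a hypothetical helper that uses 0-based Python lists, so Fortran element X(5) is list position 4, and `None` marks array elements that are skipped.

```python
def gather(arr, n, inc):
    """Return the n logical elements of a strided vector held in a 0-based list.

    For inc < 0 the first logical element is the last one stored,
    at 0-based position (n - 1) * |inc|.
    """
    if inc == 0:
        raise ValueError("stride must be nonzero")
    start = 0 if inc > 0 else (n - 1) * (-inc)
    return [arr[start + i * inc] for i in range(n)]

X = [1.0, None, 2.0, None, 3.0]
Y = [5.0, None, 3.0, None, 2.0]
x = gather(X, 3, -2)    # starts at X(5) and walks backward
y = gather(Y, 3, 2)     # starts at Y(1) and walks forward
print(x, y)
```

Both vectors need at least 1+(n-1)|inc| = 5 array elements here, regardless of the sign of the stride.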

Example 7

This example shows vectors x and y with positive strides, and a complex Hermitian matrix A of order 3, stored in lower storage mode. It uses the same input matrix A as in Example 3.
Note: On input, the imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values. On output, if alpha is not zero, the imaginary parts of the diagonal elements are set to zero.

Call Statement and Input
            UPLO  N   ALPHA   X  INCX Y  INCY A  LDA
             |    |     |     |   |   |   |   |   |
CALL CHER2( 'L' , 3 , ALPHA , X , 1 , Y , 2 , A , 3 )
 
ALPHA    =  (1.0, 0.0)
X        =  ((1.0, 2.0), (4.0, 0.0), (3.0, 4.0))
Y        =  ((1.0, 0.0), . , (2.0, -1.0), . , (2.0, 1.0))
        *                                    *
        |   (1.0, . )       .          .     |
A    =  | (3.0, -5.0)   (7.0, . )      .     |
        |  (2.0, 3.0)  (4.0, 8.0)  (6.0, . ) |
        *                                    *

Output
        *                                          *
        |   (3.0, 0.0)       .             .       |
A    =  | (7.0, -10.0)  (23.0, 0.0 )       .       |
        |   (9.0, 4.0)  (14.0, 23.0)  (26.0, 0.0 ) |
        *                                          *
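
A reference model of the Hermitian update reproduces this output. The sketch below is an assumption-laden illustration rather than ESSL code: `her2_lower` is a hypothetical name, the imaginary parts of the input diagonal are set to zero explicitly, and `None` marks the unreferenced upper triangle.

```python
def her2_lower(alpha, x, y, a):
    """Model of A <- A + alpha*x*y^H + conj(alpha)*y*x^H on the lower triangle."""
    n = len(x)
    for i in range(n):
        for j in range(i + 1):
            a[i][j] += (alpha * x[i] * y[j].conjugate()
                        + alpha.conjugate() * y[i] * x[j].conjugate())
        # The diagonal of a Hermitian matrix is real; this model always
        # stores a zero imaginary part there (alpha is nonzero in this example).
        a[i][i] = complex(a[i][i].real, 0.0)
    return a

alpha = complex(1.0, 0.0)
x = [1 + 2j, 4 + 0j, 3 + 4j]
y = [1 + 0j, 2 - 1j, 2 + 1j]                # logical elements of Y for INCY = 2
a = [[1 + 0j, None, None],
     [3 - 5j, 7 + 0j, None],
     [2 + 3j, 4 + 8j, 6 + 0j]]
result = her2_lower(alpha, x, y, a)
print(result)
```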

Example 8

This example shows vectors x and y having strides of opposite signs. For x, which has negative stride, processing begins at element X(5), which is (1.0, 2.0). The complex Hermitian matrix A of order 3 is stored in upper storage mode. It uses the same input matrix A as in Example 3.
Note: On input, the imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values. On output, if alpha is not zero, the imaginary parts of the diagonal elements are set to zero.

Call Statement and Input
            UPLO  N   ALPHA   X  INCX  Y  INCY A  LDA
             |    |     |     |    |   |   |   |   |
CALL CHER2( 'U' , 3 , ALPHA , X , -2 , Y , 2 , A , 3 )
 
ALPHA    =  (1.0, 0.0)
X        =  ((3.0, 4.0), . , (4.0, 0.0), . , (1.0, 2.0))
Y        =  ((1.0, 0.0), . , (2.0, -1.0), . , (2.0, 1.0))
        *                                    *
        | (1.0, . ) (3.0, 5.0)   (2.0, -3.0) |
A    =  |     .      (7.0, . )   (4.0, -8.0) |
        |     .          .         (6.0, . ) |
        *                                    *

Output
        *                                        *
        | (3.0, 0.0)  (7.0, 10.0)    (9.0, -4.0) |
A    =  |     .       (23.0, 0.0)  (14.0, -23.0) |
        |     .            .         (26.0, 0.0) |
        *                                        *

Example 9

This example shows vectors x and y with positive strides and a real symmetric matrix A of order 3, stored in lower-packed storage mode. It uses the same input matrix A as in Example 1.

Call Statement and Input
            N  ALPHA  X  INCX  Y  INCY  AP
            |    |    |   |    |   |    |
CALL SSLR2( 3 , 1.0 , X , 1  , Y , 2  , AP )
 
X        =  (3.0, 2.0, 1.0)
Y        =  (5.0, . , 3.0, . , 2.0)
AP       =  (8.0, 4.0, 2.0, 6.0, 7.0, 3.0)

Output
AP       =  (38.0, 23.0, 13.0, 18.0, 14.0, 7.0)
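
The packed layout used here stores the columns of the lower triangle one after another. As a minimal sketch, assuming 0-based indices and hypothetical helper names (`lp_index`, `spr2_lower`), the same update can be applied directly to the packed array:

```python
def lp_index(i, j, n):
    """0-based position of a(i, j), i >= j, in lower-packed storage of order n."""
    return j * n - j * (j - 1) // 2 + (i - j)

def spr2_lower(alpha, x, y, ap):
    """Model of A <- A + alpha*x*y^T + alpha*y*x^T on a lower-packed array."""
    n = len(x)
    for j in range(n):
        for i in range(j, n):               # walk down column j of the lower triangle
            ap[lp_index(i, j, n)] += alpha * (x[i] * y[j] + y[i] * x[j])
    return ap

ap = spr2_lower(1.0,
                [3.0, 2.0, 1.0],            # X
                [5.0, 3.0, 2.0],            # logical elements of Y for INCY = 2
                [8.0, 4.0, 2.0, 6.0, 7.0, 3.0])
print(ap)
```

The packed array holds n(n+1)/2 = 6 elements for n = 3, so no storage is spent on the unreferenced triangle.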

SGBMV, DGBMV, CGBMV, and ZGBMV--Matrix-Vector Product for a General Band Matrix, Its Transpose, or Its Conjugate Transpose

SGBMV and DGBMV compute the matrix-vector product for either a real general band matrix or its transpose, where the general band matrix is stored in BLAS-general-band storage mode. They use the scalars alpha and beta, vectors x and y, and general band matrix A or its transpose:

y <-- beta y + alpha A x
y <-- beta y + alpha A^T x

CGBMV and ZGBMV compute the matrix-vector product for either a complex general band matrix, its transpose, or its conjugate transpose, where the general band matrix is stored in BLAS-general-band storage mode. They use the scalars alpha and beta, vectors x and y, and general band matrix A, its transpose, or its conjugate transpose:

y <-- beta y + alpha A x
y <-- beta y + alpha A^T x
y <-- beta y + alpha A^H x

Table 67. Data Types
alpha, beta, x, y, A        Subprogram
Short-precision real        SGBMV
Long-precision real         DGBMV
Short-precision complex     CGBMV
Long-precision complex      ZGBMV

Syntax

Fortran CALL SGBMV | DGBMV | CGBMV | ZGBMV (transa, m, n, ml, mu, alpha, a, lda, x, incx, beta, y, incy)
C and C++ sgbmv | dgbmv | cgbmv | zgbmv (transa, m, n, ml, mu, alpha, a, lda, x, incx, beta, y, incy);
PL/I CALL SGBMV | DGBMV | CGBMV | ZGBMV (transa, m, n, ml, mu, alpha, a, lda, x, incx, beta, y, incy);

On Entry

transa

indicates the form of matrix A to use in the computation, where:

If transa = 'N', A is used in the computation.

If transa = 'T', A^T is used in the computation.

If transa = 'C', A^H is used in the computation.

Specified as: a single character. It must be 'N', 'T', or 'C'.

m

is the number of rows in matrix A, and:

If transa = 'N', it is the length of vector y.

If transa = 'T' or 'C', it is the length of vector x.

Specified as: a fullword integer; m >= 0.

n

is the number of columns in matrix A, and:

If transa = 'N', it is the length of vector x.

If transa = 'T' or 'C', it is the length of vector y.

Specified as: a fullword integer; n >= 0.

ml

is the lower band width ml of the matrix A. Specified as: a fullword integer; ml >= 0.

mu

is the upper band width mu of the matrix A. Specified as: a fullword integer; mu >= 0.

alpha

is the scaling constant alpha. Specified as: a number of the data type indicated in Table 67.

a

is the m by n general band matrix A, stored in BLAS-general-band storage mode. It has an upper band width mu and a lower band width ml. Also:

If transa = 'N', A is used in the computation.

If transa = 'T', A^T is used in the computation.

If transa = 'C', A^H is used in the computation.
Note: No data should be moved to form A^T or A^H; that is, the matrix A should always be stored in its untransposed form in BLAS-general-band storage mode.

Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 67, where lda >= ml+mu+1.

lda

is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= ml+mu+1.

x

is the vector x, where:

If transa = 'N', it has length n.

If transa = 'T' or 'C', it has length m.

Specified as: a one-dimensional array, containing numbers of the data type indicated in Table 67, where:

If transa = 'N', it must have at least 1+(n-1)|incx| elements.

If transa = 'T' or 'C', it must have at least 1+(m-1)|incx| elements.

incx

is the stride for vector x. Specified as: a fullword integer; incx > 0 or incx < 0.

beta

is the scaling constant beta. Specified as: a number of the data type indicated in Table 67.

y

is the vector y, where:

If transa = 'N', it has length m.

If transa = 'T' or 'C', it has length n.

Specified as: a one-dimensional array, containing numbers of the data type indicated in Table 67, where:

If transa = 'N', it must have at least 1+(m-1)|incy| elements.

If transa = 'T' or 'C', it must have at least 1+(n-1)|incy| elements.

incy

is the stride for vector y. Specified as: a fullword integer; incy > 0 or incy < 0.

On Return

y

is the vector y, containing the result of the computation, where:

If transa = 'N', it has length m.

If transa = 'T' or 'C', it has length n.

Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 67.

Notes

  1. For SGBMV and DGBMV, if you specify 'C' for the transa argument, it is interpreted as though you specified 'T'.

  2. All subroutines accept lowercase letters for the transa argument.

  3. Vector y must have no common elements with matrix A or vector x; otherwise, results are unpredictable. See "Concepts".

  4. To achieve optimal performance, use lda = mu+ml+1.

  5. For general band matrices, if you specify ml >= m or mu >= n, ESSL assumes, only for purposes of the computation, that the lower band width is m-1 or the upper band width is n-1, respectively. However, ESSL uses the original values for ml and mu for the purposes of finding the locations of element a11 and all other elements in the array specified for A, as described in "General Band Matrix". For an illustration of this technique, see "Example 4".

  6. For a description of how a general band matrix is stored in BLAS-general-band storage mode in an array, see "General Band Matrix".
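
As a rough illustration of BLAS-general-band storage mode, the sketch below copies the band of a dense matrix into an lda by n array. It assumes the usual BLAS convention that element a(i,j) is stored in row mu+1+i-j of column j (row mu+i-j with 0-based indices); `to_blas_band` is a hypothetical helper, and `None` marks array positions that are not referenced.

```python
def to_blas_band(dense, ml, mu, lda):
    """Copy the band of an m-by-n dense matrix into an lda-by-n band array."""
    m, n = len(dense), len(dense[0])
    if lda < ml + mu + 1:
        raise ValueError("lda must be at least ml+mu+1")
    band = [[None] * n for _ in range(lda)]
    for j in range(n):
        # Rows of column j that fall inside the band.
        for i in range(max(0, j - mu), min(m, j + ml + 1)):
            band[mu + i - j][j] = dense[i][j]
    return band

# A 5 by 4 matrix with ml = 3 and mu = 2, stored with lda = 8.
dense = [[1.0, 1.0, 1.0, 0.0],
         [2.0, 2.0, 2.0, 2.0],
         [3.0, 3.0, 3.0, 3.0],
         [4.0, 4.0, 4.0, 4.0],
         [0.0, 5.0, 5.0, 5.0]]
band = to_blas_band(dense, ml=3, mu=2, lda=8)
for row in band:
    print(row)
```

With lda = 8 the last two rows stay unused; choosing lda = mu+ml+1 = 6 would avoid that padding.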

Function

The possible computations that can be performed by these subroutines are described in the following sections. Varying implementation techniques are used for this computation to improve performance. As a result, accuracy of the computational result may vary for different computations.

In all the computations, general band matrix A is stored in its untransposed form in an array, using BLAS-general-band storage mode.

For SGBMV and CGBMV, intermediate results are accumulated in long precision. Occasionally, for performance reasons, these intermediate results are truncated to short precision and stored.

See references [34], [35], [38], [46], and [73]. No computation is performed if m or n is 0 or if alpha is zero and beta is one.

General Band Matrix

For SGBMV, DGBMV, CGBMV, and ZGBMV, the matrix-vector product for a general band matrix is expressed as follows:

y <-- beta y + alpha A x

where:

x is a vector of length n.
y is a vector of length m.
alpha is a scalar.
beta is a scalar.
A is an m by n general band matrix, having a lower band width of ml and an upper band width of mu.

Transpose of a General Band Matrix

For SGBMV, DGBMV, CGBMV, and ZGBMV, the matrix-vector product for the transpose of a general band matrix is expressed as:

y <-- beta y + alpha A^T x

where:

x is a vector of length m.
y is a vector of length n.
alpha is a scalar.
beta is a scalar.
A^T is the transpose of an m by n general band matrix A, having a lower band width of ml and an upper band width of mu.

Conjugate Transpose of a General Band Matrix

For CGBMV and ZGBMV, the matrix-vector product for the conjugate transpose of a general band matrix is expressed as follows:

y <-- beta y + alpha A^H x

where:

x is a vector of length m.
y is a vector of length n.
alpha is a scalar.
beta is a scalar.
A^H is the conjugate transpose of an m by n general band matrix A, having a lower band width of ml and an upper band width of mu.
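
The three computations can be modeled by one loop over the stored band. The following Python sketch is a reference model under stated assumptions, not the ESSL implementation: `gbmv` is a hypothetical function, vectors are plain lists with unit stride, the band array uses 0-based indices with a(i,j) in row mu+i-j of column j, and `None` marks unreferenced positions.

```python
def gbmv(transa, m, n, ml, mu, alpha, band, x, beta, y):
    """Return beta*y + alpha*op(A)*x for A held in BLAS-general-band storage."""
    t = transa.upper()
    if t not in ("N", "T", "C"):
        raise ValueError("transa must be 'N', 'T', or 'C'")
    out = [beta * v for v in y]
    for j in range(n):
        for i in range(max(0, j - mu), min(m, j + ml + 1)):
            a = band[mu + i - j][j]                      # element a(i, j)
            if t == "N":
                out[i] += alpha * a * x[j]               # contributes to (A x)(i)
            elif t == "T":
                out[j] += alpha * a * x[i]               # contributes to (A^T x)(j)
            else:
                out[j] += alpha * a.conjugate() * x[i]   # contributes to (A^H x)(j)
    return out

# A 5 by 4 real band matrix with ml = 3 and mu = 2.
band = [[None, None, 1.0, 2.0],
        [None, 1.0, 2.0, 3.0],
        [1.0, 2.0, 3.0, 4.0],
        [2.0, 3.0, 4.0, 5.0],
        [3.0, 4.0, 5.0, None],
        [4.0, 5.0, None, None]]
y_out = gbmv("N", 5, 4, 3, 2, 2.0, band,
             [1.0, 2.0, 3.0, 4.0], 10.0, [1.0, 2.0, 3.0, 4.0, 5.0])
print(y_out)
```

For real data the conjugate is a no-op, which matches the rule that 'C' is treated as 'T' by the real subroutines.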

Error Conditions

Computational Errors

None

Input-Argument Errors
  1. transa <> 'N', 'T', or 'C'
  2. m < 0
  3. n < 0
  4. ml < 0
  5. mu < 0
  6. lda <= 0
  7. lda < ml+mu+1
  8. incx = 0
  9. incy = 0

Example 1

This example shows how to use SGBMV to perform the computation y <-- beta y + alpha A x, where TRANSA is equal to 'N', and the following real general band matrix A is used in the computation. Matrix A is:

                     *                    *
                     | 1.0  1.0  1.0  0.0 |
                     | 2.0  2.0  2.0  2.0 |
                     | 3.0  3.0  3.0  3.0 |
                     | 4.0  4.0  4.0  4.0 |
                     | 0.0  5.0  5.0  5.0 |
                     *                    *

Call Statement and Input
           TRANSA M   N   ML  MU  ALPHA  A  LDA  X  INCX  BETA   Y  INCY
             |    |   |   |   |     |    |   |   |   |     |     |   |
CALL SGBMV( 'N' , 5 , 4 , 3 , 2 ,  2.0 , A , 8 , X , 1  , 10.0 , Y , 2  )
        *                    *
        |  .    .   1.0  2.0 |
        |  .   1.0  2.0  3.0 |
        | 1.0  2.0  3.0  4.0 |
A    =  | 2.0  3.0  4.0  5.0 |
        | 3.0  4.0  5.0   .  |
        | 4.0  5.0   .    .  |
        |  .    .    .    .  |
        |  .    .    .    .  |
        *                    *
X        =  (1.0, 2.0, 3.0, 4.0)
Y        =  (1.0, . , 2.0, . , 3.0, . , 4.0, . , 5.0, . )

Output
Y        =  (22.0, . , 60.0, . , 90.0, . , 120.0, . , 140.0, . )

Example 2

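This example shows how to use SGBMV to perform the computation y <-- beta y + alpha A^T x, where TRANSA is equal to 'T', and the transpose of a real general band matrix A is used in the computation. It uses the same matrix A, ALPHA, and BETA as Example 1. Because TRANSA is 'T', vector x has length m = 5 and vector y has length n = 4; the output shown is consistent with X = (1.0, 2.0, 3.0, 4.0, 5.0) and Y = (1.0, . , 2.0, . , 3.0, . , 4.0, . ).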

Call Statement and Input
           TRANSA M   N   ML  MU  ALPHA  A  LDA  X  INCX  BETA   Y  INCY
             |    |   |   |   |     |    |   |   |   |     |     |   |
CALL SGBMV( 'T' , 5 , 4 , 3 , 2 ,  2.0 , A , 8 , X , 1  , 10.0 , Y , 2  )

Output
Y        =  (70.0, . , 130.0, . , 140.0, . , 148.0, . )

Example 3

This example shows how to use CGBMV to perform the computation y <-- beta y + alpha A^H x, where TRANSA is equal to 'C', and the conjugate transpose of the following general band matrix A is used in the computation. Matrix A is:

             *                                                *
             | (1.0, 1.0)  (1.0, 1.0)  (1.0, 1.0)  (0.0, 0.0) |
             | (2.0, 2.0)  (2.0, 2.0)  (2.0, 2.0)  (2.0, 2.0) |
             | (3.0, 3.0)  (3.0, 3.0)  (3.0, 3.0)  (3.0, 3.0) |
             | (4.0, 4.0)  (4.0, 4.0)  (4.0, 4.0)  (4.0, 4.0) |
             | (0.0, 0.0)  (5.0, 5.0)  (5.0, 5.0)  (5.0, 5.0) |
             *                                                *

Call Statement and Input
           TRANSA M   N   ML  MU  ALPHA   A  LDA  X  INCX  BETA   Y  INCY
             |    |   |   |   |     |     |   |   |   |     |     |   |
CALL CGBMV( 'C' , 5 , 4 , 3 , 2 , ALPHA , A , 8 , X , 1  , BETA , Y , 2  )
        *                                                *
        |     .           .       (1.0, 1.0)  (2.0, 2.0) |
        |     .       (1.0, 1.0)  (2.0, 2.0)  (3.0, 3.0) |
        | (1.0, 1.0)  (2.0, 2.0)  (3.0, 3.0)  (4.0, 4.0) |
A    =  | (2.0, 2.0)  (3.0, 3.0)  (4.0, 4.0)  (5.0, 5.0) |
        | (3.0, 3.0)  (4.0, 4.0)  (5.0, 5.0)      .      |
        | (4.0, 4.0)  (5.0, 5.0)      .           .      |
        |     .           .           .           .      |
        |     .           .           .           .      |
        *                                                *
X        =  ((1.0, 2.0), (2.0, 3.0), (3.0, 4.0), (4.0, 5.0),
             (5.0, 6.0))
ALPHA    =  (1.0, 1.0)
BETA     =  (10.0, 0.0)
Y        =  ((1.0, 2.0), . , (2.0, 3.0), . , (3.0, 4.0), . ,
             (4.0, 5.0), . )

Output
Y        =  ((70.0, 100.0), . , (130.0, 170.0), . ,
             (140.0, 180.0), . , (148.0, 186.0), . )

Example 4

This example shows how to use SGBMV to perform the computation y <-- beta y + alpha A x, where ml >= m and mu >= n, TRANSA is equal to 'N', and the following real general band matrix A is used in the computation. Matrix A is:

                       *                         *
                       | 1.0  1.0  1.0  1.0  1.0 |
                       | 2.0  2.0  2.0  2.0  2.0 |
                       | 3.0  3.0  3.0  3.0  3.0 |
                       | 4.0  4.0  4.0  4.0  4.0 |
                       *                         *

Call Statement and Input
           TRANSA M   N   ML  MU  ALPHA  A  LDA   X  INCX  BETA   Y  INCY
             |    |   |   |   |     |    |   |    |   |     |     |   |
CALL SGBMV( 'N' , 4 , 5 , 6 , 5 ,  2.0 , A , 12 , X , 1  , 10.0 , Y , 2  )
        *                         *
        |  .    .    .    .    .  |
        |  .    .    .    .   1.0 |
        |  .    .    .   1.0  2.0 |
        |  .    .   1.0  2.0  3.0 |
        |  .   1.0  2.0  3.0  4.0 |
A    =  | 1.0  2.0  3.0  4.0   .  |
        | 2.0  3.0  4.0   .    .  |
        | 3.0  4.0   .    .    .  |
        | 4.0   .    .    .    .  |
        |  .    .    .    .    .  |
        |  .    .    .    .    .  |
        |  .    .    .    .    .  |
        *                         *
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (1.0, . , 2.0, . , 3.0, . , 4.0, . )

Output
Y        =  (40.0, . , 80.0, . , 120.0, . , 160.0, . )
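
The distinction between the band widths used for the computation and those used for locating elements can be sketched as follows. This is an illustrative model with assumed names (`eff_ml`, `eff_mu`) and 0-based indices; `None` marks unreferenced array positions.

```python
m, n, ml, mu, lda = 4, 5, 6, 5, 12          # arguments from Example 4
eff_ml = min(ml, m - 1)                     # 3: lower band width used in the computation
eff_mu = min(mu, n - 1)                     # 4: upper band width used in the computation

dense = [[float(i + 1)] * n for i in range(m)]
band = [[None] * n for _ in range(lda)]
for j in range(n):
    for i in range(max(0, j - eff_mu), min(m, j + eff_ml + 1)):
        band[mu + i - j][j] = dense[i][j]   # the row offset uses the ORIGINAL mu

alpha, beta = 2.0, 10.0
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [1.0, 2.0, 3.0, 4.0]                    # logical elements of Y for INCY = 2
out = [beta * v for v in y]
for j in range(n):
    for i in range(max(0, j - eff_mu), min(m, j + eff_ml + 1)):
        out[i] += alpha * band[mu + i - j][j] * x[j]
print(out)
```

Because the offset is computed from mu = 5 rather than from the effective value 4, the first row of the array is never referenced, which matches the layout shown in this example.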

SSBMV, DSBMV, CHBMV, and ZHBMV--Matrix-Vector Product for a Real Symmetric or Complex Hermitian Band Matrix

SSBMV and DSBMV compute the matrix-vector product for a real symmetric band matrix. CHBMV and ZHBMV compute the matrix-vector product for a complex Hermitian band matrix. The band matrix A is stored in either upper- or lower-band-packed storage mode. The computation uses the scalars alpha and beta, vectors x and y, and band matrix A:

y <-- beta y + alpha A x

Table 68. Data Types
alpha, beta, x, y, A        Subprogram
Short-precision real        SSBMV
Long-precision real         DSBMV
Short-precision complex     CHBMV
Long-precision complex      ZHBMV

Syntax

Fortran CALL SSBMV | DSBMV | CHBMV | ZHBMV (uplo, n, k, alpha, a, lda, x, incx, beta, y, incy)
C and C++ ssbmv | dsbmv | chbmv | zhbmv (uplo, n, k, alpha, a, lda, x, incx, beta, y, incy);
PL/I CALL SSBMV | DSBMV | CHBMV | ZHBMV (uplo, n, k, alpha, a, lda, x, incx, beta, y, incy);

On Entry

uplo

indicates the storage mode used for matrix A, where either the upper or lower triangle can be stored:

If uplo = 'U', A is stored in upper-band-packed storage mode.

If uplo = 'L', A is stored in lower-band-packed storage mode.

Specified as: a single character. It must be 'U' or 'L'.

n

is the order of matrix A and the number of elements in vectors x and y. Specified as: a fullword integer; n >= 0.

k

is the half band width k of the matrix A. Specified as: a fullword integer; k >= 0.

alpha

is the scaling constant alpha. Specified as: a number of the data type indicated in Table 68.

a

is the real symmetric or complex Hermitian band matrix A of order n, having a half band width of k, where:

If uplo = 'U', A is stored in upper-band-packed storage mode.

If uplo = 'L', A is stored in lower-band-packed storage mode.

Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 68, where lda >= k+1.

lda

is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= k+1.

x

is the vector x of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 68.

incx

is the stride for vector x. Specified as: a fullword integer; incx > 0 or incx < 0.

beta

is the scaling constant beta. Specified as: a number of the data type indicated in Table 68.

y

is the vector y of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incy|, containing numbers of the data type indicated in Table 68.

incy

is the stride for vector y. Specified as: a fullword integer; incy > 0 or incy < 0.

On Return

y

is the vector y of length n, containing the result of the computation. Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 68.

Notes

  1. All subroutines accept lowercase letters for the uplo argument.

  2. Vector y must have no common elements with matrix A or vector x; otherwise, results are unpredictable. See "Concepts".

  3. To achieve optimal performance in these subroutines, use lda = k+1.

  4. The imaginary parts of the diagonal elements of the complex Hermitian matrix A are assumed to be zero, so you do not have to set these values.

  5. For real symmetric and complex Hermitian band matrices, if you specify k >= n, ESSL assumes, only for purposes of the computation, that the half band width of matrix A is n-1; that is, it processes matrix A, of order n, as though it is a (nonbanded) real symmetric or complex Hermitian matrix. However, ESSL uses the original value for k for the purposes of finding the locations of element a11 and all other elements in the array specified for A, as described in the storage modes referenced in the next note. For an illustration of this technique, see "Example 3".

  6. For a description of how a real symmetric band matrix is stored, see "Upper-Band-Packed Storage Mode" or "Lower-Band-Packed Storage Mode". For a description of how a complex Hermitian band matrix is stored, see "Complex Hermitian Matrix".
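
As a hedged illustration of the two band-packed layouts, the sketch below packs a dense symmetric band matrix either way. It assumes the usual BLAS convention with 0-based indices: for upper storage, a(i,j) with i <= j goes to row k+i-j of column j; for lower storage, a(i,j) with i >= j goes to row i-j. `pack_sym_band` is a hypothetical helper, and `None` marks unreferenced positions.

```python
def pack_sym_band(dense, k, uplo, lda):
    """Pack one triangle of a symmetric band matrix into an lda-by-n array."""
    n = len(dense)
    if lda < k + 1:
        raise ValueError("lda must be at least k+1")
    band = [[None] * n for _ in range(lda)]
    upper = uplo.upper() == "U"
    for j in range(n):
        if upper:
            for i in range(max(0, j - k), j + 1):
                band[k + i - j][j] = dense[i][j]
        else:
            for i in range(j, min(n, j + k + 1)):
                band[i - j][j] = dense[i][j]
    return band

# Order 7, half band width 3: a(i,j) = min(i,j) inside the band (1-based values).
n, k = 7, 3
dense = [[float(min(i, j) + 1) if abs(i - j) <= k else 0.0 for j in range(n)]
         for i in range(n)]
upper = pack_sym_band(dense, k, "U", lda=5)
lower = pack_sym_band(dense, k, "L", lda=5)
print(upper[0], lower[3])
```

In upper storage the diagonal lands in row k+1 of the array, and in lower storage it lands in row 1 (1-based rows).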

Function

These subroutines perform the following matrix-vector product, using a real symmetric or complex Hermitian band matrix A, stored in either upper- or lower-band-packed storage mode:

y <-- beta y + alpha A x

where:

x and y are vectors of length n.
alpha and beta are scalars.
A is a real symmetric or complex Hermitian band matrix of order n, having a half band width of k.

For SSBMV and CHBMV, intermediate results are accumulated in long precision. Occasionally, for performance reasons, these intermediate results are truncated to short precision and stored.

See references [34], [38], [46], and [73]. No computation is performed if n is 0 or if alpha is zero and beta is one.
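
The product can be formed from the stored triangle alone, because each off-diagonal element also supplies its mirror image. The Python sketch below is a reference model under stated assumptions, not the ESSL implementation: `band_mv` is a hypothetical name, vectors are unit-stride lists, indices are 0-based, and `None` marks unreferenced positions.

```python
def band_mv(uplo, n, k, alpha, band, x, beta, y):
    """Return beta*y + alpha*A*x for a symmetric or Hermitian band-packed A."""
    up = uplo.upper() == "U"
    out = [beta * v for v in y]
    for j in range(n):
        diag = band[k][j] if up else band[0][j]
        out[j] += alpha * diag.real * x[j]           # imaginary part of the diagonal is ignored
        rows = range(max(0, j - k), j) if up else range(j + 1, min(n, j + k + 1))
        for i in rows:
            a = band[k + i - j][j] if up else band[i - j][j]
            out[i] += alpha * a * x[j]               # stored element a(i, j)
            out[j] += alpha * a.conjugate() * x[i]   # mirrored element a(j, i)
    return out

# Order 7, half band width 3, upper-band-packed, a(i,j) = min(i,j) inside the band.
band = [[None, None, None, 1.0, 2.0, 3.0, 4.0],
        [None, None, 1.0, 2.0, 3.0, 4.0, 5.0],
        [None, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0],
        [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]]
v = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
y_out = band_mv("U", 7, 3, 2.0, band, v, 10.0, v)
print(y_out)
```

For real data the conjugate has no effect, so the same loop serves the symmetric and the Hermitian case.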

Error Conditions

Computational Errors

None

Input-Argument Errors
  1. uplo <> 'U' or 'L'
  2. n < 0
  3. k < 0
  4. lda <= 0
  5. lda < k+1
  6. incx = 0
  7. incy = 0

Example 1

This example shows how to use SSBMV to perform the matrix-vector product, where the real symmetric band matrix A of order 7 and half band width of 3 is stored in upper-band-packed storage mode. Matrix A is:

                  *                                   *
                  | 1.0  1.0  1.0  1.0  0.0  0.0  0.0 |
                  | 1.0  2.0  2.0  2.0  2.0  0.0  0.0 |
                  | 1.0  2.0  3.0  3.0  3.0  3.0  0.0 |
                  | 1.0  2.0  3.0  4.0  4.0  4.0  4.0 |
                  | 0.0  2.0  3.0  4.0  5.0  5.0  5.0 |
                  | 0.0  0.0  3.0  4.0  5.0  6.0  6.0 |
                  | 0.0  0.0  0.0  4.0  5.0  6.0  7.0 |
                  *                                   *

Call Statement and Input
            UPLO  N   K  ALPHA  A  LDA  X  INCX  BETA   Y  INCY
             |    |   |    |    |   |   |   |     |     |   |
CALL SSBMV( 'U' , 7 , 3 , 2.0 , A , 5 , X , 1  , 10.0 , Y , 2  )
 
        *                                   *
        |  .    .    .   1.0  2.0  3.0  4.0 |
        |  .    .   1.0  2.0  3.0  4.0  5.0 |
A    =  |  .   1.0  2.0  3.0  4.0  5.0  6.0 |
        | 1.0  2.0  3.0  4.0  5.0  6.0  7.0 |
        |  .    .    .    .    .    .    .  |
        *                                   *
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0)
Y        =  (1.0, . , 2.0, . , 3.0, . , 4.0, . , 5.0, . , 6.0, . , 7.0)

Output
Y        =  (30.0, . , 78.0, . , 148.0, . , 244.0, . , 288.0, . ,
             316.0, . , 322.0)

Example 2

This example shows how to use CHBMV to perform the matrix-vector product, where the complex Hermitian band matrix A of order 7 and half band width of 3 is stored in lower-band-packed storage mode. Matrix A is:


        *                                                                                     *
        |  (1.0, 0.0) (1.0, -1.0) (1.0, -1.0) (1.0, -1.0)  (0.0, 0.0)  (0.0, 0.0)  (0.0, 0.0) |
        |  (1.0, 1.0)  (2.0, 0.0) (2.0, -2.0) (2.0, -2.0) (2.0, -2.0)  (0.0, 0.0)  (0.0, 0.0) |
        |  (1.0, 1.0)  (2.0, 2.0)  (3.0, 0.0) (3.0, -3.0) (3.0, -3.0) (3.0, -3.0)  (0.0, 0.0) |
        |  (1.0, 1.0)  (2.0, 2.0)  (3.0, 3.0)  (4.0, 0.0) (4.0, -4.0) (4.0, -4.0) (4.0, -4.0) |
        |  (0.0, 0.0)  (2.0, 2.0)  (3.0, 3.0)  (4.0, 4.0)  (5.0, 0.0) (5.0, -5.0) (5.0, -5.0) |
        |  (0.0, 0.0)  (0.0, 0.0)  (3.0, 3.0)  (4.0, 4.0)  (5.0, 5.0)  (6.0, 0.0) (6.0, -6.0) |
        |  (0.0, 0.0)  (0.0, 0.0)  (0.0, 0.0)  (4.0, 4.0)  (5.0, 5.0)  (6.0, 6.0)  (7.0, 0.0) |
        *                                                                                     *

Note: The imaginary parts of the diagonal elements of a complex Hermitian matrix are assumed to be zero, so you do not need to set these values.

Call Statement and Input
            UPLO  N   K   ALPHA   A  LDA  X  INCX  BETA   Y  INCY
             |    |   |     |     |   |   |   |     |     |   |
CALL CHBMV( 'L' , 7 , 3 , ALPHA , A , 5 , X , 1  , BETA , Y , 2  )
 
ALPHA    =  (2.0, 0.0)
BETA     =  (10.0, 0.0)


        *                                                                             *
        |  (1.0, . )  (2.0, . )  (3.0, . )  (4.0, . )  (5.0, . )  (6.0, . ) (7.0, . ) |
        | (1.0, 1.0) (2.0, 2.0) (3.0, 3.0) (4.0, 4.0) (5.0, 5.0) (6.0, 6.0)     .     |
A    =  | (1.0, 1.0) (2.0, 2.0) (3.0, 3.0) (4.0, 4.0) (5.0, 5.0)   .            .     |
        | (1.0, 1.0) (2.0, 2.0) (3.0, 3.0) (4.0, 4.0)   .          .            .     |
        |     .          .          .          .        .          .            .     |
        *                                                                             *

X        =  ((1.0, 1.0), (2.0, 2.0), (3.0, 3.0), (4.0, 4.0),
             (5.0, 5.0), (6.0, 6.0), (7.0, 7.0))
Y        =  ((1.0, 1.0), . , (2.0, 2.0), . , (3.0, 3.0), . ,
             (4.0, 4.0), . , (5.0, 5.0), . , (6.0, 6.0), . ,
             (7.0, 7.0))

Output
Y        =  ((48.0, 12.0), . , (124.0, 32.0), . , (228.0, 68.0), . ,
             (360.0, 128.0), . , (360.0, 216.0), . ,
             (300.0, 332.0), . , (168.0, 476.0))
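
One way to check this output is to expand the lower-band-packed array shown in the call into a full Hermitian matrix and multiply directly. The sketch below is illustrative only; the variable names are assumptions, indices are 0-based, the unused fifth row of the array is omitted, and the unset imaginary parts of the diagonal are taken as zero.

```python
n, k = 7, 3
alpha, beta = complex(2.0, 0.0), complex(10.0, 0.0)

# Lower-band-packed array from the call: row d holds the d-th subdiagonal.
band = [[complex(j + 1, 0.0) for j in range(n)]]
for d in range(1, k + 1):
    band.append([complex(j + 1, j + 1) if j + d < n else None for j in range(n)])

full = [[0j] * n for _ in range(n)]
for j in range(n):
    full[j][j] = complex(band[0][j].real, 0.0)       # diagonal is real
    for i in range(j + 1, min(n, j + k + 1)):
        full[i][j] = band[i - j][j]                  # stored lower element
        full[j][i] = band[i - j][j].conjugate()      # implied upper element

x = [complex(i + 1, i + 1) for i in range(n)]
y = [complex(i + 1, i + 1) for i in range(n)]        # logical elements of Y for INCY = 2
out = [beta * y[i] + alpha * sum(full[i][j] * x[j] for j in range(n))
       for i in range(n)]
print(out)
```

The elements above the diagonal are never stored; they are the conjugates of the stored elements below it.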

Example 3

This example shows how to use SSBMV to perform the matrix-vector product, where k >= n. Matrix A is a real 5 by 5 symmetric band matrix with a half band width of 5, stored in upper-band-packed storage mode. Matrix A is:

                       *                         *
                       | 1.0  1.0  1.0  1.0  1.0 |
                       | 1.0  2.0  2.0  2.0  2.0 |
                       | 1.0  2.0  3.0  3.0  3.0 |
                       | 1.0  2.0  3.0  4.0  4.0 |
                       | 1.0  2.0  3.0  4.0  5.0 |
                       *                         *

Call Statement and Input
            UPLO  N   K  ALPHA  A  LDA  X  INCX  BETA   Y  INCY
             |    |   |    |    |   |   |   |     |     |   |
CALL SSBMV( 'U' , 5 , 5 , 2.0 , A , 7 , X , 1  , 10.0 , Y , 2  )
 
        *                         *
        |  .    .    .    .    .  |
        |  .    .    .    .   1.0 |
        |  .    .    .   1.0  2.0 |
A    =  |  .    .   1.0  2.0  3.0 |
        |  .   1.0  2.0  3.0  4.0 |
        | 1.0  2.0  3.0  4.0  5.0 |
        |  .    .    .    .    .  |
        *                         *
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0)
Y        =  (1.0, . , 2.0, . , 3.0, . , 4.0, . , 5.0, . )

Output
Y        =  (40.0, . , 78.0, . , 112.0, . , 140.0, . , 160.0, . )

STRMV, DTRMV, CTRMV, ZTRMV, STPMV, DTPMV, CTPMV, and ZTPMV--Matrix-Vector Product for a Triangular Matrix, Its Transpose, or Its Conjugate Transpose

STRMV, DTRMV, STPMV, and DTPMV compute one of the following matrix-vector products, using the vector x and triangular matrix A or its transpose:

x <-- Ax
x <-- A^T x

CTRMV, ZTRMV, CTPMV, and ZTPMV compute one of the following matrix-vector products, using the vector x and triangular matrix A, its transpose, or its conjugate transpose:

x <-- Ax
x <-- A^T x
x <-- A^H x

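Matrix A can be either upper or lower triangular, where:

For STRMV, DTRMV, CTRMV, and ZTRMV, it is stored in upper- or lower-triangular storage mode, respectively.
For STPMV, DTPMV, CTPMV, and ZTPMV, it is stored in upper- or lower-triangular-packed storage mode, respectively.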

Table 69. Data Types
A, x                        Subprogram
Short-precision real        STRMV and STPMV
Long-precision real         DTRMV and DTPMV
Short-precision complex     CTRMV and CTPMV
Long-precision complex      ZTRMV and ZTPMV

Syntax

Fortran CALL STRMV | DTRMV | CTRMV | ZTRMV (uplo, transa, diag, n, a, lda, x, incx)

CALL STPMV | DTPMV | CTPMV | ZTPMV (uplo, transa, diag, n, ap, x, incx)

C and C++ strmv | dtrmv | ctrmv | ztrmv (uplo, transa, diag, n, a, lda, x, incx);

stpmv | dtpmv | ctpmv | ztpmv (uplo, transa, diag, n, ap, x, incx);

PL/I CALL STRMV | DTRMV | CTRMV | ZTRMV (uplo, transa, diag, n, a, lda, x, incx);

CALL STPMV | DTPMV | CTPMV | ZTPMV (uplo, transa, diag, n, ap, x, incx);

On Entry

uplo

indicates whether matrix A is an upper or lower triangular matrix, where:

If uplo = 'U', A is an upper triangular matrix.

If uplo = 'L', A is a lower triangular matrix.

Specified as: a single character. It must be 'U' or 'L'.

transa

indicates the form of matrix A to use in the computation, where:

If transa = 'N', A is used in the computation.

If transa = 'T', A^T is used in the computation.

If transa = 'C', A^H is used in the computation.

Specified as: a single character. It must be 'N', 'T', or 'C'.

diag

indicates the characteristics of the diagonal of matrix A, where:

If diag = 'U', A is a unit triangular matrix.

If diag = 'N', A is not a unit triangular matrix.

Specified as: a single character. It must be 'U' or 'N'.

n

is the order of triangular matrix A. Specified as: a fullword integer; 0 <= n <= lda.

a

is the upper or lower triangular matrix A of order n, stored in upper- or lower-triangular storage mode, respectively.
Note: No data should be moved to form A^T or A^H; that is, the matrix A should always be stored in its untransposed form.
Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 69.

lda

is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= n.

ap

is the upper or lower triangular matrix A of order n, stored in upper- or lower-triangular-packed storage mode, respectively. Specified as: a one-dimensional array of (at least) length n(n+1)/2, containing numbers of the data type indicated in Table 69.

x

is the vector x of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 69.

incx

is the stride for vector x. Specified as: a fullword integer; incx > 0 or incx < 0.

On Return

x

is the vector x of length n, containing the results of the computation. Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 69.

Notes

  1. These subroutines accept lowercase letters for the uplo, transa, and diag arguments.

  2. For STRMV, DTRMV, STPMV, and DTPMV, if you specify 'C' for the transa argument, it is interpreted as though you specified 'T'.

  3. Matrix A and vector x must have no common elements; otherwise, results are unpredictable.

  4. ESSL assumes certain values in your array for parts of a triangular matrix. As a result, you do not have to set these values. For unit triangular matrices, the elements of the diagonal are assumed to be 1.0 for real matrices and (1.0, 0.0) for complex matrices. When using upper- or lower-triangular storage, the unreferenced elements in the lower and upper triangular part, respectively, are assumed to be zero.

  5. For a description of triangular matrices and how they are stored in upper- and lower-triangular storage mode and in upper- and lower-triangular-packed storage mode, see "Triangular Matrix".
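
As a small illustration of the packed modes, the sketch below builds the one-dimensional array by storing the columns of the chosen triangle one after another. `pack_triangular` is a hypothetical helper, not an ESSL routine, and `None` marks elements of the two-dimensional array that are not referenced.

```python
def pack_triangular(a, uplo):
    """Return the packed array for the upper or lower triangle of a, column by column."""
    n = len(a)
    ap = []
    for j in range(n):
        rows = range(0, j + 1) if uplo.upper() == "U" else range(j, n)
        ap.extend(a[i][j] for i in rows)
    return ap

# A 4 by 4 upper triangular matrix with an explicit unit diagonal.
a = [[1.0, 2.0, 3.0, 2.0],
     [None, 1.0, 2.0, 5.0],
     [None, None, 1.0, 3.0],
     [None, None, None, 1.0]]
ap = pack_triangular(a, "U")
print(len(ap), ap)
```

The packed array has n(n+1)/2 = 10 elements for n = 4, which is the minimum length required for the ap argument.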

Function

These subroutines can perform the following matrix-vector product computations, using the triangular matrix A, its transpose, or its conjugate transpose, where A can be either upper or lower triangular:

x <-- Ax
x <-- A^T x
x <-- A^H x (for CTRMV, ZTRMV, CTPMV, and ZTPMV only)

where:

x is a vector of length n.
A is an upper or lower triangular matrix of order n. For _TRMV, it is stored in upper- or lower-triangular storage mode, respectively. For _TPMV, it is stored in upper- or lower-triangular-packed storage mode, respectively.

See references [32] and [38]. If n is 0, no computation is performed.
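
The computation can be modeled without ever reading the parts of the array that ESSL does not reference. The following Python sketch is a reference model under stated assumptions: `trmv` and `tri_elem` are hypothetical names, A is a list of rows in triangular (not packed) storage, x has unit stride, and the result is built in a temporary list before overwriting x.

```python
def tri_elem(uplo, diag, a, r, c):
    """Element (r, c) of triangular A, honoring the unit-diagonal and zero assumptions."""
    if r == c:
        return 1.0 if diag == "U" else a[r][c]
    stored = (r < c) if uplo == "U" else (r > c)
    return a[r][c] if stored else 0.0

def trmv(uplo, transa, diag, a, x):
    """Overwrite x with op(A)*x, where op(A) is A, A^T, or A^H."""
    uplo, transa, diag = uplo.upper(), transa.upper(), diag.upper()
    n = len(x)
    out = []
    for i in range(n):
        s = 0.0
        for j in range(n):
            if transa == "N":
                e = tri_elem(uplo, diag, a, i, j)
            else:
                e = tri_elem(uplo, diag, a, j, i)    # transposed access
                if transa == "C":
                    e = e.conjugate()
            s += e * x[j]
        out.append(s)
    x[:] = out
    return x

# Unit lower triangular matrix; None marks elements that are never read.
a = [[None, None, None, None],
     [1.0, None, None, None],
     [2.0, 3.0, None, None],
     [3.0, 4.0, 3.0, None]]
x = trmv("L", "N", "U", a, [1.0, 2.0, 3.0, 4.0])
print(x)
```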

Error Conditions

Computational Errors

None

Input-Argument Errors
  1. uplo <>  'L' or 'U'
  2. transa <>  'T', 'N', or 'C'
  3. diag <>  'N' or 'U'
  4. n < 0
  5. lda <= 0
  6. lda < n
  7. incx = 0

Example 1

This example shows the computation x <-- Ax. Matrix A is a real 4 by 4 lower triangular matrix that is unit triangular, stored in lower-triangular storage mode. Vector x is a vector of length 4. Matrix A is:

                            *                    *
                            | 1.0   .    .    .  |
                            | 1.0  1.0   .    .  |
                            | 2.0  3.0  1.0   .  |
                            | 3.0  4.0  3.0  1.0 |
                            *                    *
Note: Because matrix A is unit triangular, the diagonal elements are not referenced. ESSL assumes a value of 1.0 for the diagonal elements.

Call Statement and Input
            UPLO TRANSA DIAG  N   A  LDA  X  INCX
             |     |     |    |   |   |   |   |
CALL STRMV( 'L' , 'N' , 'U' , 4 , A , 4 , X , 1  )
 
        *                   *
        |  .    .    .    . |
A    =  | 1.0   .    .    . |
        | 2.0  3.0   .    . |
        | 3.0  4.0  3.0   . |
        *                   *
 
X        =  (1.0, 2.0, 3.0, 4.0)

Output
X        =  (1.0, 3.0, 11.0, 24.0)

Example 2

This example shows the computation x <-- ATx. Matrix A is a real 4 by 4 upper triangular matrix that is unit triangular, stored in upper-triangular storage mode. Vector x is a vector of length 4. Matrix A is:

                        *                    *
                        | 1.0  2.0  3.0  2.0 |
                        |  .   1.0  2.0  5.0 |
                        |  .    .   1.0  3.0 |
                        |  .    .    .   1.0 |
                        *                    *
Note: Because matrix A is unit triangular, the diagonal elements are not referenced. ESSL assumes a value of 1.0 for the diagonal elements.

Call Statement and Input
            UPLO TRANSA DIAG  N   A  LDA  X  INCX
             |     |     |    |   |   |   |   |
CALL STRMV( 'U' , 'T' , 'U' , 4 , A , 4 , X , 1  )
 
        *                    *
        |  .   2.0  3.0  2.0 |
A    =  |  .    .   2.0  5.0 |
        |  .    .    .   3.0 |
        |  .    .    .    .  |
        *                    *
 
X        =  (5.0, 4.0, 3.0, 2.0)

Output
X        =  (5.0, 14.0, 26.0, 41.0)

Example 3

This example shows the computation x <-- AHx. Matrix A is a complex 4 by 4 upper triangular matrix that is unit triangular, stored in upper-triangular storage mode. Vector x is a vector of length 4. Matrix A is:

               *                                                *
               | (1.0, 0.0)  (2.0, 2.0)  (3.0, 3.0)  (2.0, 2.0) |
               |     .       (1.0, 0.0)  (2.0, 2.0)  (5.0, 5.0) |
               |     .           .       (1.0, 0.0)  (3.0, 3.0) |
               |     .           .           .       (1.0, 0.0) |
               *                                                *
Note: Because matrix A is unit triangular, the diagonal elements are not referenced. ESSL assumes a value of (1.0, 0.0) for the diagonal elements.

Call Statement and Input
            UPLO TRANSA DIAG  N   A  LDA  X  INCX
             |     |     |    |   |   |   |   |
CALL CTRMV( 'U' , 'C' , 'U' , 4 , A , 4 , X , 1  )
 
        *                                       *
        | .  (2.0, 2.0)  (3.0, 3.0)  (2.0, 2.0) |
A    =  | .      .       (2.0, 2.0)  (5.0, 5.0) |
        | .      .           .       (3.0, 3.0) |
        | .      .           .           .      |
        *                                       *
 
X        =  ((5.0, 5.0), (4.0, 4.0), (3.0, 3.0), (2.0, 2.0))

Output
X        =  ((5.0, 5.0), (24.0, 4.0), (49.0, 3.0), (80.0, 2.0))

Example 4

This example shows the computation x <-- Ax. Matrix A is a real 4 by 4 lower triangular matrix that is unit triangular, stored in lower-triangular-packed storage mode. Vector x is a vector of length 4. Matrix A is:

                          *                    *
                          | 1.0   .    .    .  |
                          | 1.0  1.0   .    .  |
                          | 2.0  3.0  1.0   .  |
                          | 3.0  4.0  3.0  1.0 |
                          *                    *
Note: Because matrix A is unit triangular, the diagonal elements are not referenced. ESSL assumes a value of 1.0 for the diagonal elements.

Call Statement and Input
            UPLO TRANSA DIAG  N   AP   X  INCX
             |     |     |    |   |    |   |
CALL STPMV( 'L' , 'N' , 'U' , 4 , AP , X , 1  )
 
AP       =  ( . , 1.0, 2.0, 3.0, . , 3.0, 4.0, . , 3.0, . )
X        =  (1.0, 2.0, 3.0, 4.0)

Output
X        =  (1.0, 3.0, 11.0, 24.0)

Example 5

This example shows the computation x <-- ATx. Matrix A is a real 4 by 4 upper triangular matrix that is not unit triangular, stored in upper-triangular-packed storage mode. Vector x is a vector of length 4. Matrix A is:

                          *                    *
                          | 1.0  2.0  3.0  2.0 |
                          |  .   2.0  2.0  5.0 |
                          |  .    .   3.0  3.0 |
                          |  .    .    .   1.0 |
                          *                    *

Call Statement and Input
            UPLO TRANSA DIAG  N   AP   X  INCX
             |     |     |    |   |    |   |
CALL STPMV( 'U' , 'T' , 'N' , 4 , AP , X , 1  )
 
AP       =  (1.0, 2.0, 2.0, 3.0, 2.0, 3.0, 2.0, 5.0, 3.0, 1.0)
X        =  (5.0, 4.0, 3.0, 2.0)

Output
X        =  (5.0, 18.0, 32.0, 41.0)

Example 6

This example shows the computation x <-- AHx. Matrix A is a complex 4 by 4 upper triangular matrix that is unit triangular, stored in upper-triangular-packed storage mode. Vector x is a vector of length 4. Matrix A is:

          *                                             *
          | (1.0, 0.0) (2.0, 2.0) (3.0, 3.0) (2.0, 2.0) |
          |     .      (1.0, 0.0) (2.0, 2.0) (5.0, 5.0) |
          |     .          .      (1.0, 0.0) (3.0, 3.0) |
          |     .          .          .      (1.0, 0.0) |
          *                                             *
Note: Because matrix A is unit triangular, the diagonal elements are not referenced. ESSL assumes a value of (1.0, 0.0) for the diagonal elements.

Call Statement and Input
            UPLO TRANSA DIAG  N   AP   X  INCX
             |     |     |    |   |    |   |
CALL CTPMV( 'U' , 'C' , 'U' , 4 , AP , X , 1  )
 
AP       =  ( . , (2.0, 2.0), . , (3.0, 3.0), (2.0, 2.0), . ,
             (2.0, 2.0), (5.0, 5.0), (3.0, 3.0), . )
X        =  ((5.0, 5.0), (4.0, 4.0), (3.0, 3.0), (2.0, 2.0))

Output
X        =  ((5.0, 5.0), (24.0, 4.0), (49.0, 3.0), (80.0, 2.0))
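
The packed examples above can be reproduced with a short C sketch. It assumes that the triangle is packed column by column, as the AP arrays in Examples 4 and 5 suggest; the names packed_index and ref_tpmv are hypothetical, indices are 0-based, and only long-precision real data, stride-1 x, and transa = 'N' or 'T' are handled. It is not ESSL's implementation.

```c
#include <assert.h>
#include <stddef.h>

/* 0-based position of A(i,j) in a triangle of order n packed by columns. */
size_t packed_index(char uplo, int n, int i, int j)
{
    if (uplo == 'U') {
        assert(i <= j);                       /* upper triangle only */
        return (size_t)i + (size_t)j * (j + 1) / 2;
    }
    assert(i >= j);                           /* lower triangle only */
    return (size_t)i + (size_t)j * (2 * n - j - 1) / 2;
}

/* Reference sketch: x <- A*x or x <- A^T*x with A in packed storage. */
void ref_tpmv(char uplo, char trans, char diag,
              int n, const double *ap, double *x)
{
    assert(uplo == 'U' || uplo == 'L');
    assert(trans == 'N' || trans == 'T');
    assert(diag == 'U' || diag == 'N');
    assert(n >= 0);

    /* op(A) is effectively lower triangular for ('L','N') and ('U','T'). */
    int lower = (uplo == 'L') == (trans == 'N');

    for (int r = 0; r < n; ++r) {
        int i  = lower ? n - 1 - r : r;       /* row order keeps needed x[j] intact */
        int lo = lower ? 0 : i + 1;
        int hi = lower ? i : n;
        double sum = 0.0;
        for (int j = lo; j < hi; ++j) {
            size_t p = (trans == 'N') ? packed_index(uplo, n, i, j)
                                      : packed_index(uplo, n, j, i);
            sum += ap[p] * x[j];
        }
        /* A unit diagonal is assumed to be 1.0 and is never read from ap[]. */
        double d = (diag == 'U') ? 1.0 : ap[packed_index(uplo, n, i, i)];
        x[i] = d * x[i] + sum;
    }
}
```

For n = 4, the diagonal elements land at positions 0, 4, 7, 9 in lower-packed storage and at positions 0, 2, 5, 9 in upper-packed storage, which matches the unreferenced positions shown in the AP arrays of Examples 4 and 6.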

STBMV, DTBMV, CTBMV, and ZTBMV--Matrix-Vector Product for a Triangular Band Matrix, Its Transpose, or Its Conjugate Transpose

STBMV and DTBMV compute one of the following matrix-vector products, using the vector x and triangular band matrix A or its transpose:

x <-- Ax
x <-- ATx

CTBMV and ZTBMV compute one of the following matrix-vector products, using the vector x and triangular band matrix A, its transpose, or its conjugate transpose:

x <-- Ax
x <-- ATx
x <-- AHx

Matrix A can be either upper or lower triangular and is stored in upper- or lower-triangular-band-packed storage mode, respectively.

Table 70. Data Types
A, x                       Subprogram
Short-precision real       STBMV
Long-precision real        DTBMV
Short-precision complex    CTBMV
Long-precision complex     ZTBMV

Syntax

Fortran CALL STBMV | DTBMV | CTBMV | ZTBMV (uplo, transa, diag, n, k, a, lda, x, incx)
C and C++ stbmv | dtbmv | ctbmv | ztbmv (uplo, transa, diag, n, k, a, lda, x, incx);
PL/I CALL STBMV | DTBMV | CTBMV | ZTBMV (uplo, transa, diag, n, k, a, lda, x, incx);

On Entry

uplo

indicates whether matrix A is an upper or lower triangular band matrix, where:

If uplo = 'U', A is an upper triangular matrix.

If uplo = 'L', A is a lower triangular matrix.

Specified as: a single character. It must be 'U' or 'L'.

transa

indicates the form of matrix A to use in the computation, where:

If transa = 'N', A is used in the computation.

If transa = 'T', AT is used in the computation.

If transa = 'C', AH is used in the computation.

Specified as: a single character. It must be 'N', 'T', or 'C'.

diag

indicates the characteristics of the diagonal of matrix A, where:

If diag = 'U', A is a unit triangular matrix.

If diag = 'N', A is not a unit triangular matrix.

Specified as: a single character. It must be 'U' or 'N'.

n

is the order of triangular band matrix A. Specified as: a fullword integer; n >= 0.

k

is the upper or lower band width k of the matrix A. Specified as: a fullword integer; k >= 0.

a

is the upper or lower triangular band matrix A of order n, stored in upper- or lower-triangular-band-packed storage mode, respectively.
Note: No data should be moved to form AT or AH; that is, the matrix A should always be stored in its untransposed form.
Specified as: an lda by (at least) n array, containing numbers of the data type indicated in Table 70.

lda

is the leading dimension of the array specified for a. Specified as: a fullword integer; lda > 0 and lda >= k+1.

x

is the vector x of length n. Specified as: a one-dimensional array of (at least) length 1+(n-1)|incx|, containing numbers of the data type indicated in Table 70.

incx

is the stride for vector x. Specified as: a fullword integer; incx > 0 or incx < 0.

On Return

x

is the vector x of length n, containing the results of the computation. Returned as: a one-dimensional array, containing numbers of the data type indicated in Table 70.

Notes

  1. These subroutines accept lowercase letters for the uplo, transa, and diag arguments.

  2. For STBMV and DTBMV, if you specify 'C' for the transa argument, it is interpreted as though you specified 'T'.

  3. Matrix A and vector x must have no common elements; otherwise, results are unpredictable.

  4. To achieve optimal performance in these subroutines, use lda = k+1.

  5. For unit triangular matrices, the elements of the diagonal are assumed to be 1.0 for real matrices and (1.0, 0.0) for complex matrices. As a result, you do not have to set these values.

  6. For both upper and lower triangular band matrices, if you specify k >= n, ESSL assumes, only for purposes of the computation, that the upper or lower band width of matrix A is n-1; that is, it processes matrix A, of order n, as though it is a (nonbanded) triangular matrix. However, ESSL uses the original value for k for the purposes of finding the locations of element a11 and all other elements in the array specified for A, as described in "Triangular Band Matrix". For an illustration of this technique, see "Example 4".

  7. For a description of triangular band matrices and how they are stored in upper- and lower-triangular-band-packed storage mode, see "Triangular Band Matrix".

  8. If you are using a lower triangular band matrix, you may want to use this alternate approach instead of using lower-triangular-band-packed storage mode. Leave matrix A in full-matrix storage mode when you pass it to ESSL and specify the lda argument to be lda+1, which is the leading dimension of matrix A plus 1. ESSL then processes the matrix elements in the same way as though you had set them up in lower-triangular-band-packed storage mode.
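
The alternate approach in note 8 rests on a simple offset identity. As a sketch, assume 0-based row and column indices i and j, column-by-column storage, and that lower-triangular-band-packed storage places element a_ij in row i-j of column j (as the array in Example 2 suggests). The offset of an element in full-matrix storage with leading dimension lda then coincides with its offset in band-packed storage with leading dimension lda+1:

```latex
\[
\underbrace{i + j\,\mathrm{lda}}_{\text{full-matrix offset of } a_{ij}}
  \;=\;
\underbrace{(i - j) + j\,(\mathrm{lda} + 1)}_{\text{band-packed offset, leading dimension } \mathrm{lda}+1},
\qquad j \le i \le \min(n-1,\; j + k).
\]
```

No such identity holds for an upper triangular band matrix, because its band-packed row index k+i-j depends on k, which is consistent with the note being limited to the lower triangular case.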

Function

These subroutines can perform the following matrix-vector product computations, using the triangular band matrix A, its transpose, or its conjugate transpose, where A can be either upper or lower triangular:

x <-- Ax
x <-- ATx
x <-- AHx (for CTBMV and ZTBMV only)

where:

x is a vector of length n.
A is an upper or lower triangular band matrix of order n, stored in upper- or lower-triangular-band-packed storage mode, respectively.

See references [34], [46], and [38]. If n is 0, no computation is performed.
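
The following C sketch illustrates the computation for band-packed storage. It is a reference sketch under stated assumptions, not ESSL's implementation: the names band_elem and ref_tbmv are hypothetical, indices are 0-based, the array is held column by column, and only long-precision real data, stride-1 x, and transa = 'N' or 'T' are handled. The element positions are inferred from the arrays shown in the examples for these subroutines.

```c
#include <assert.h>
#include <stddef.h>

/* Element A(i,j), 0-based, of a triangular band matrix with band width k,
 * read from the band-packed array a[] with leading dimension lda:
 *   upper: a[(k + i - j) + j*lda]   for j-k <= i <= j
 *   lower: a[(i - j)     + j*lda]   for j   <= i <= j+k               */
double band_elem(char uplo, int k, const double *a, int lda, int i, int j)
{
    int row = (uplo == 'U') ? k + i - j : i - j;
    assert(row >= 0 && row <= k);
    return a[row + (size_t)j * lda];
}

/* Reference sketch: x <- A*x (trans 'N') or x <- A^T*x (trans 'T'). */
void ref_tbmv(char uplo, char trans, char diag, int n, int k,
              const double *a, int lda, double *x)
{
    assert(uplo == 'U' || uplo == 'L');
    assert(trans == 'N' || trans == 'T');
    assert(diag == 'U' || diag == 'N');
    assert(n >= 0 && k >= 0 && lda > 0 && lda >= k + 1);

    /* op(A) is effectively lower triangular for ('L','N') and ('U','T'). */
    int lower = (uplo == 'L') == (trans == 'N');

    for (int r = 0; r < n; ++r) {
        int i = lower ? n - 1 - r : r;   /* row order keeps needed x[j] intact */
        /* Off-diagonal columns of row i of op(A) inside the band, clipped to
         * the matrix; the clipping is what makes k >= n behave like n-1. */
        int lo = lower ? (i - k > 0 ? i - k : 0) : i + 1;
        int hi = lower ? i : (i + k + 1 < n ? i + k + 1 : n);
        double sum = 0.0;
        for (int j = lo; j < hi; ++j) {
            double v = (trans == 'N') ? band_elem(uplo, k, a, lda, i, j)
                                      : band_elem(uplo, k, a, lda, j, i);
            sum += v * x[j];
        }
        double d = (diag == 'U') ? 1.0 : band_elem(uplo, k, a, lda, i, i);
        x[i] = d * x[i] + sum;
    }
}
```

When k >= n, the loop bounds limit the computation to the n-1 diagonals that exist, while band_elem still uses the original k to locate each element, which mirrors the behavior described in note 6.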

Error Conditions

Computational Errors

None

Input-Argument Errors
  1. uplo <> 'L' or 'U'
  2. transa <> 'T', 'N', or 'C'
  3. diag <> 'N' or 'U'
  4. n < 0
  5. k < 0
  6. lda <= 0
  7. lda < k+1
  8. incx = 0

Example 1

This example shows the computation x <-- Ax. Matrix A is a real 7 by 7 upper triangular band matrix with a half band width of 3 that is not unit triangular, stored in upper-triangular-band-packed storage mode. Vector x is a vector of length 7. Matrix A is:

                  *                                   *
                  | 1.0  1.0  1.0  1.0  0.0  0.0  0.0 |
                  | 0.0  2.0  2.0  2.0  2.0  0.0  0.0 |
                  | 0.0  0.0  3.0  3.0  3.0  3.0  0.0 |
                  | 0.0  0.0  0.0  4.0  4.0  4.0  4.0 |
                  | 0.0  0.0  0.0  0.0  5.0  5.0  5.0 |
                  | 0.0  0.0  0.0  0.0  0.0  6.0  6.0 |
                  | 0.0  0.0  0.0  0.0  0.0  0.0  7.0 |
                  *                                   *

Call Statement and Input
            UPLO TRANSA DIAG  N   K   A  LDA  X  INCX
             |     |     |    |   |   |   |   |   |
CALL STBMV( 'U' , 'N' , 'N' , 7 , 3 , A , 5 , X , 1  )
 
        *                                   *
        |  .    .    .   1.0  2.0  3.0  4.0 |
        |  .    .   1.0  2.0  3.0  4.0  5.0 |
A    =  |  .   1.0  2.0  3.0  4.0  5.0  6.0 |
        | 1.0  2.0  3.0  4.0  5.0  6.0  7.0 |
        |  .    .    .    .    .    .    .  |
        *                                   *
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0)

Output
X        =  (10.0, 28.0, 54.0, 88.0, 90.0, 78.0, 49.0)

Example 2

This example shows the computation x <-- ATx. Matrix A is a real 7 by 7 lower triangular band matrix with a half band width of 3 that is not unit triangular, stored in lower-triangular-band-packed storage mode. Vector x is a vector of length 7. Matrix A is:

                  *                                   *
                  | 1.0  0.0  0.0  0.0  0.0  0.0  0.0 |
                  | 1.0  2.0  0.0  0.0  0.0  0.0  0.0 |
                  | 1.0  2.0  3.0  0.0  0.0  0.0  0.0 |
                  | 1.0  2.0  3.0  4.0  0.0  0.0  0.0 |
                  | 0.0  2.0  3.0  4.0  5.0  0.0  0.0 |
                  | 0.0  0.0  3.0  4.0  5.0  6.0  0.0 |
                  | 0.0  0.0  0.0  4.0  5.0  6.0  7.0 |
                  *                                   *

Call Statement and Input
            UPLO TRANSA DIAG  N   K   A  LDA  X  INCX
             |     |     |    |   |   |   |   |   |
CALL STBMV( 'L' , 'T' , 'N' , 7 , 3 , A , 5 , X , 1  )
 
        *                                   *
        | 1.0  2.0  3.0  4.0  5.0  6.0  7.0 |
        | 1.0  2.0  3.0  4.0  5.0  6.0   .  |
A    =  | 1.0  2.0  3.0  4.0  5.0   .    .  |
        | 1.0  2.0  3.0  4.0   .    .    .  |
        |  .    .    .    .    .    .    .  |
        *                                   *
 
X        =  (1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0)

Output
X        =  (10.0, 28.0, 54.0, 88.0, 90.0, 78.0, 49.0)

Example 3

This example shows the computation x <-- AHx. Matrix A is a complex 7 by 7 upper triangular band matrix with a half band width of 3 that is not unit triangular, stored in upper-triangular-band-packed storage mode. Vector x is a vector of length 7. Matrix A is:


        *                                                                              *
        | (1.0, 1.0) (1.0, 1.0) (1.0, 1.0) (1.0, 1.0) (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) |
        | (0.0, 0.0) (2.0, 2.0) (2.0, 2.0) (2.0, 2.0) (2.0, 2.0) (0.0, 0.0) (0.0, 0.0) |
        | (0.0, 0.0) (0.0, 0.0) (3.0, 3.0) (3.0, 3.0) (3.0, 3.0) (3.0, 3.0) (0.0, 0.0) |
        | (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) (4.0, 4.0) (4.0, 4.0) (4.0, 4.0) (4.0, 4.0) |
        | (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) (5.0, 5.0) (5.0, 5.0) (5.0, 5.0) |
        | (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) (6.0, 6.0) (6.0, 6.0) |
        | (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) (0.0, 0.0) (7.0, 7.0) |
        *                                                                              *

Call Statement and Input
            UPLO TRANSA DIAG  N   K   A  LDA  X  INCX
             |     |     |    |   |   |   |   |   |
CALL CTBMV( 'U' , 'C' , 'N' , 7 , 3 , A , 5 , X , 1  )


        *                                                                                    *
        |     .           .           .       (1.0, 1.0)  (2.0, 2.0)  (3.0, 3.0)  (4.0, 4.0) |
        |     .           .       (1.0, 1.0)  (2.0, 2.0)  (3.0, 3.0)  (4.0, 4.0)  (5.0, 5.0) |
A    =  |     .       (1.0, 1.0)  (2.0, 2.0)  (3.0, 3.0)  (4.0, 4.0)  (5.0, 5.0)  (6.0, 6.0) |
        | (1.0, 1.0)  (2.0, 2.0)  (3.0, 3.0)  (4.0, 4.0)  (5.0, 5.0)  (6.0, 6.0)  (7.0, 7.0) |
        |     .           .           .           .           .           .           .      |
        *                                                                                    *

X        =  ((1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0),
             (5.0, 10.0), (6.0, 12.0), (7.0, 14.0))

Output
X        =  ((3.0, 1.0), (15.0, 5.0), (42.0, 14.0), (90.0, 30.0),
             (162.0, 54.0), (258.0, 86.0), (378.0, 126.0))

Example 4

This example shows the computation x <-- ATx, where k > n. Matrix A is a real 4 by 4 upper triangular band matrix with a half band width of 5 that is not unit triangular, stored in upper-triangular-band-packed storage mode. Vector x is a vector of length 4. Matrix A is:

                         *                    *
                         | 1.0  1.0  1.0  1.0 |
                         |  .   2.0  2.0  2.0 |
                         |  .    .   3.0  3.0 |
                         |  .    .    .   4.0 |
                         *                    *

Call Statement and Input
            UPLO TRANSA DIAG  N   K   A  LDA  X  INCX
             |     |     |    |   |   |   |   |   |
CALL STBMV( 'U' , 'T' , 'N' , 4 , 5 , A , 6 , X , 1  )
 
        *                     *
        |  .    .    .    .   |
A    =  |  .    .    .    .   |
        |  .    .    .   1.0  |
        |  .    .   1.0  2.0  |
        |  .   1.0  2.0  3.0  |
        | 1.0  2.0  3.0  4.0  |
        *                     *
 
X        =  (1.0, 2.0, 3.0, 4.0)

Output
X        =  (1.0, 5.0, 14.0, 30.0)

Sparse Matrix-Vector Subprograms

This section contains the sparse matrix-vector subprogram descriptions.

DSMMX--Matrix-Vector Product for a Sparse Matrix in Compressed-Matrix Storage Mode

This subprogram computes the matrix-vector product for sparse matrix A, stored in compressed-matrix storage mode, using the matrix and vectors x and y:

y <-- Ax

where A, x, and y contain long-precision real numbers. You can use DSMTM to transpose matrix A before calling this subroutine. The resulting computation performed by this subroutine is then y <-- ATx.

Syntax

Fortran CALL DSMMX (m, nz, ac, ka, lda, x, y)
C and C++ dsmmx (m, nz, ac, ka, lda, x, y);
PL/I CALL DSMMX (m, nz, ac, ka, lda, x, y);

On Entry

m

is the number of rows in sparse matrix A and the number of elements in vector y. Specified as: a fullword integer; m >= 0.

nz

is the maximum number of nonzero elements in each row of sparse matrix A. Specified as: a fullword integer; nz >= 0.

ac

is the m by n sparse matrix A, stored in compressed-matrix storage mode in an array, referred to as AC. Specified as: an lda by (at least) nz array, containing long-precision real numbers.

ka

is the array, referred to as KA, containing the column numbers of the matrix A elements stored in the corresponding positions in array AC. Specified as: an lda by (at least) nz array, containing fullword integers, where 1 <= (elements of KA) <= n.

lda

is the size of the leading dimension of the arrays specified for ac and ka. Specified as: a fullword integer; lda > 0 and lda >= m.

x

is the vector x of length n. Specified as: a one-dimensional array of (at least) length n, containing long-precision real numbers.

y

See 'On Return'.

On Return

y

is the vector y of length m, containing the result of the computation. Returned as: a one-dimensional array of (at least) length m, containing long-precision real numbers.

Notes

  1. Matrix A must have no common elements with vectors x and y; otherwise, results are unpredictable.

  2. For the KA array, where there are no corresponding nonzero elements in AC, you must still fill in a number between 1 and n. See the "Example".

  3. For a description of how sparse matrices are stored in compressed-matrix storage mode, see "Compressed-Matrix Storage Mode".

  4. If your sparse matrix is stored by rows, as defined in "Storage-by-Rows", you should first use the DSRSM utility subroutine, described in DSRSM--Convert a Sparse Matrix from Storage-by-Rows to Compressed-Matrix Storage Mode, to convert your sparse matrix to compressed-matrix storage mode.

Function

The matrix-vector product is computed for a sparse matrix, stored in compressed-matrix storage mode:

y <-- Ax

where:

A is an m by n sparse matrix, stored in compressed-matrix storage mode in arrays AC and KA.
x is a vector of length n.
y is a vector of length m.

It is expressed as follows:



Figure ESYGR109 not displayed.


See reference [67]. If m is 0, no computation is performed; if nz is 0, output vector y is set to zero, because matrix A contains all zeros.

If your program uses a sparse matrix stored by rows and you want to use this subroutine, you should first convert your sparse matrix to compressed-matrix storage mode by using the DSRSM utility subroutine described in DSRSM--Convert a Sparse Matrix from Storage-by-Rows to Compressed-Matrix Storage Mode.
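
The computation can be sketched in C as follows. This is a minimal reference sketch, not ESSL's implementation: the name ref_dsmmx is hypothetical, AC and KA are assumed to be held column by column with leading dimension lda, and KA is assumed to hold 1-based column numbers as in the Fortran interface.

```c
#include <assert.h>
#include <stddef.h>

/* Reference sketch: y <- A*x for A in compressed-matrix storage mode.
 * Row i of A is described by AC(i, 1..nz) and KA(i, 1..nz). */
void ref_dsmmx(int m, int nz, const double *ac, const int *ka, int lda,
               const double *x, double *y)
{
    assert(m >= 0 && nz >= 0 && lda > 0 && lda >= m);

    for (int i = 0; i < m; ++i) {
        double sum = 0.0;
        for (int t = 0; t < nz; ++t) {
            size_t p = (size_t)i + (size_t)t * lda;
            /* Padding entries have AC = 0.0, so they add nothing, but their
             * KA value is still used as an index and must be valid. */
            sum += ac[p] * x[ka[p] - 1];
        }
        y[i] = sum;   /* nz = 0 leaves every y[i] equal to zero */
    }
}
```

The sketch shows why the KA array must contain a valid column number even where AC holds no nonzero element: every stored position is multiplied, including the padding.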

Error Conditions

Computational Errors

None

Input-Argument Errors
  1. m < 0
  2. lda <= 0
  3. m > lda
  4. nz < 0

Example

This example shows the matrix-vector product computed for the following sparse matrix A, which is stored in compressed-matrix storage mode in arrays AC and KA. Matrix A is:

                     *                              *
                     | 4.0  0.0  7.0  0.0  0.0  0.0 |
                     | 3.0  4.0  0.0  2.0  0.0  0.0 |
                     | 0.0  2.0  4.0  0.0  4.0  0.0 |
                     | 0.0  0.0  7.0  4.0  0.0  1.0 |
                     | 1.0  0.0  0.0  3.0  4.0  0.0 |
                     | 1.0  1.0  0.0  0.0  3.0  4.0 |
                     *                              *

Call Statement and Input
            M   NZ  AC   KA  LDA  X   Y
            |   |   |    |    |   |   |
CALL DSMMX( 6 , 4 , AC , KA , 6 , X , Y )
        *                    *
        | 4.0  7.0  0.0  0.0 |
        | 4.0  3.0  2.0  0.0 |
AC   =  | 4.0  2.0  4.0  0.0 |
        | 4.0  7.0  1.0  0.0 |
        | 4.0  1.0  3.0  0.0 |
        | 4.0  1.0  1.0  3.0 |
        *                    *
        *            *
        | 1  3  1  1 |
        | 2  1  4  1 |
KA   =  | 3  2  5  1 |
        | 4  3  6  1 |
        | 5  1  4  1 |
        | 6  1  2  5 |
        *            *
X        =  (1.0, 2.0, 3.0, 4.0, 5.0, 6.0)

Output
Y        =  (25.0, 19.0, 36.0, 43.0, 33.0, 42.0)

DSMTM--Transpose a Sparse Matrix in Compressed-Matrix Storage Mode

This subprogram transposes sparse matrix A, stored in compressed-matrix storage mode, where A contains long-precision real numbers.

Syntax

Fortran CALL DSMTM (m, nz, ac, ka, lda, n, nt, at, kt, ldt, aux, naux)
C and C++ dsmtm (m, nz, ac, ka, lda, n, nt, at, kt, ldt, aux, naux);
PL/I CALL DSMTM (m, nz, ac, ka, lda, n, nt, at, kt, ldt, aux, naux);

On Entry

m

is the number of rows in sparse matrix A. Specified as: a fullword integer; m >= 0.

nz

is the maximum number of nonzero elements in each row of sparse matrix A. Specified as: a fullword integer; nz >= 0.

ac

is the m by n sparse matrix A, stored in compressed-matrix storage mode in an array, referred to as AC. Specified as: an lda by (at least) nz array, containing long-precision real numbers.

ka

is the array, referred to as KA, containing the column numbers of the matrix A elements stored in the corresponding positions in array AC. Specified as: an lda by (at least) nz array, containing fullword integers, where 1 <= (elements of KA) <= n.

lda

is the size of the leading dimension of the arrays specified for ac and ka. Specified as: a fullword integer; lda > 0 and lda >= m.

n

is the number of columns in sparse matrix A. Specified as: a fullword integer; 0 <= n <= ldt and n >=  (maximum column index in KA).

nt

is the number of columns in output arrays AT and KT that are available for use. Specified as: a fullword integer; nt > 0.

at

See 'On Return'.

kt

See 'On Return'.

ldt

is the size of the leading dimension of the arrays specified for at and kt. Specified as: a fullword integer; ldt > 0 and ldt >= n.

aux

has the following meaning:

If naux = 0 and error 2015 is unrecoverable, aux is ignored.

Otherwise, it is a storage work area used by this subroutine. Its size is specified by naux.

Specified as: an area of storage, containing long-precision real numbers. They can have any value.

naux

is the size of the work area specified by aux--that is, the number of elements in aux. Specified as: a fullword integer, where:

If naux = 0 and error 2015 is unrecoverable, DSMTM dynamically allocates the work area used by this subroutine. The work area is deallocated before control is returned to the calling program.

Otherwise, naux >= n.

On Return

n

is the number of rows in the transposed matrix AT. Returned as: a fullword integer; n = (maximum column index in KA).

nt

is the maximum number of nonzero elements, nt, in each row of the transposed matrix AT. Returned as: a fullword integer; nt <= m.

at

is the n by (at least) m sparse matrix transpose AT, stored in compressed-matrix storage mode in an array, referred to as AT. Returned as: an ldt by (at least) nt array, containing long-precision real numbers.

kt

is the array, referred to as KT, containing the column numbers of the transposed matrix AT elements, stored in the corresponding positions in array AT. Returned as: an ldt by (at least) nt array, containing fullword integers, where 1 <= (elements of KT) <= m.

Notes

  1. In your C program, arguments n and nt must be passed by reference.

  2. The value specified for input argument nt should be greater than or equal to the number of nonzero elements you estimate to be in each row of the transposed sparse matrix AT. The output value is less than or equal to the input value you specify.

  3. For the KA array, where there are no corresponding nonzero elements in AC, you must still fill in a number between 1 and n. See the "Example".

  4. For a description of how sparse matrices are stored in compressed-matrix storage mode, see "Compressed-Matrix Storage Mode".

  5. If your sparse matrix is stored by rows, as defined in "Storage-by-Rows", you should first use the DSRSM utility subroutine, described in DSRSM--Convert a Sparse Matrix from Storage-by-Rows to Compressed-Matrix Storage Mode, to convert your sparse matrix to compressed-matrix storage mode.

  6. You have the option of having the minimum required value for naux dynamically returned to your program. For details, see "Using Auxiliary Storage in ESSL".

Function

A sparse matrix A, stored in arrays AC and KA in compressed-matrix storage mode, is transposed, forming AT, and is stored in arrays AT and KT in compressed-matrix storage mode. See reference [67]. Use this subroutine when you want to compute a matrix-vector product with the transposed matrix AT: first transpose matrix A with this subroutine, then call DSMMX with the transposed matrix AT. The resulting computation is y <-- ATx.

If your program uses a sparse matrix stored by rows and you want to use this subroutine, you should first convert your sparse matrix to compressed-matrix storage mode by using the DSRSM utility subroutine described in DSRSM--Convert a Sparse Matrix from Storage-by-Rows to Compressed-Matrix Storage Mode.

Error Conditions

Resource Errors

Error 2015 is unrecoverable, naux = 0, and the work area cannot be allocated.

Computational Errors

None

Input-Argument Errors
  1. m, n < 0
  2. lda, ldt < 1
  3. lda < m
  4. ldt < n
  5. nz < 0
  6. n is less than the maximum column index in KA.
  7. nt or ldt is too small.
  8. When the following two errors occur, arrays AT, KT, and AUX are overwritten:
    naux < n
    nt <= 0
  9. Error 2015 is recoverable or naux<>0, and naux is too small--that is, less than the minimum required value. Return code 1 is returned if error 2015 is recoverable.

Example

This example shows how to transpose the following 5 by 4 sparse matrix A, which is stored in compressed-matrix storage mode in arrays AC and KA. Matrix A is:

                        *                        *
                        | 11.0   0.0   0.0   0.0 |
                        | 21.0   0.0  23.0   0.0 |
                        |  0.0   0.0  33.0  34.0 |
                        |  0.0  42.0   0.0  44.0 |
                        | 51.0   0.0  53.0   0.0 |
                        *                        *

The resulting 4 by 5 matrix transpose AT, stored in compressed-matrix storage mode in arrays AT and KT, is as follows. Matrix AT is:

                     *                              *
                     | 11.0  21.0   0.0   0.0  51.0 |
                     |  0.0   0.0   0.0  42.0   0.0 |
                     |  0.0  23.0  33.0   0.0  53.0 |
                     |  0.0   0.0  34.0  44.0   0.0 |
                     *                              *

As shown here, the value of N is larger than the actual number of columns in the matrix A. On output, the exact number of rows in the transposed matrix is returned in the output argument N.

On output, row 6 of AT and KT is not accessed or modified by the subroutine. Column 4 and row 5 are accessed and modified. They are of no use in further computations and will not be used, because NT = 3 and M = 4.

Call Statement and Input
            M   NZ  AC   KA  LDA  N   NT  AT   KT   LDT  AUX  NAUX
            |   |   |    |    |   |   |   |    |     |    |    |
CALL DSMTM( 5 , 2 , AC , KA , 5 , 5 , 4 , AT , KT ,  6 , AUX , 5  )
        *            *
        | 11.0   0.0 |
        | 21.0  23.0 |
AC   =  | 33.0  34.0 |
        | 42.0  44.0 |
        | 51.0  53.0 |
        *            *
        *      *
        | 1  1 |
        | 1  3 |
KA   =  | 3  4 |
        | 2  4 |
        | 1  3 |
        *      *

Output
N        =  4
NT       =  3
        *                       *
        | 11.0  21.0  51.0  0.0 |
        | 42.0   0.0   0.0  0.0 |
AT   =  | 33.0  23.0  53.0  0.0 |
        | 34.0  44.0   0.0  0.0 |
        |  0.0   0.0   0.0  0.0 |
        |   .     .     .    .  |
        *                       *
        *            *
        | 1  2  5  1 |
        | 4  1  1  1 |
KT   =  | 3  2  5  1 |
        | 3  4  1  1 |
        | 1  1  1  1 |
        | .  .  .  . |
        *            *
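
The example above can be reproduced with the following C sketch. It is one possible way to build the transpose, not ESSL's implementation: the name ref_dsmtm and its return codes are hypothetical, the work area is allocated internally instead of being passed as aux, arrays are assumed to be held column by column, and entries of AC equal to 0.0 are treated as padding.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Reference sketch. On entry *n and *nt give the usable rows and columns of
 * AT/KT; on return they give the number of rows of A^T and the largest
 * number of nonzeros in any of its rows. Returns 0 on success, 1 if *n or
 * *nt is too small, -1 if the work area cannot be allocated. */
int ref_dsmtm(int m, int nz, const double *ac, const int *ka, int lda,
              int *n, int *nt, double *at, int *kt, int ldt)
{
    assert(m >= 0 && nz >= 0 && lda > 0 && lda >= m);
    assert(*n >= 0 && *n <= ldt && *nt > 0);
    int rows = *n, cols = *nt, max_col = 0, max_cnt = 0, status = 0;

    /* Work area: number of entries already placed in each row of A^T. */
    int *cnt = calloc((size_t)rows + 1, sizeof *cnt);
    if (cnt == NULL) return -1;

    /* Pad the usable part of AT and KT with value 0.0 and column number 1. */
    for (int t = 0; t < cols; ++t)
        for (int r = 0; r < rows; ++r) {
            at[r + (size_t)t * ldt] = 0.0;
            kt[r + (size_t)t * ldt] = 1;
        }

    /* Visit AC column by column; entry (i, c) of A becomes (c, i) of A^T. */
    for (int t = 0; t < nz && status == 0; ++t)
        for (int i = 0; i < m; ++i) {
            double v = ac[i + (size_t)t * lda];
            int c = ka[i + (size_t)t * lda];      /* 1-based column number */
            if (c < 1 || c > rows) { status = 1; break; }
            if (c > max_col) max_col = c;
            if (v == 0.0) continue;               /* padding entry of AC */
            if (cnt[c - 1] == cols) { status = 1; break; }
            size_t p = (size_t)(c - 1) + (size_t)cnt[c - 1] * ldt;
            at[p] = v;
            kt[p] = i + 1;                        /* 1-based row of A */
            if (++cnt[c - 1] > max_cnt) max_cnt = cnt[c - 1];
        }

    free(cnt);
    if (status == 0) { *n = max_col; *nt = max_cnt; }
    return status;
}
```

Called with the input of the example (n = 5, nt = 4, ldt = 6), the sketch returns n = 4 and nt = 3, fills rows 1 through 5 and columns 1 through 4 of AT and KT, and leaves row 6 untouched.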

DSDMX--Matrix-Vector Product for a Sparse Matrix or Its Transpose in Compressed-Diagonal Storage Mode

This subprogram computes the matrix-vector product for square sparse matrix A, stored in compressed-diagonal storage mode, using either the matrix or its transpose, and vectors x and y:

y <-- Ax
y <-- ATx

where A, x, and y contain long-precision real numbers.

Syntax

Fortran CALL DSDMX (iopt, n, nd, ad, lda, trans, la, x, y)
C and C++ dsdmx (iopt, n, nd, ad, lda, trans, la, x, y);
PL/I CALL DSDMX (iopt, n, nd, ad, lda, trans, la, x, y);

On Entry

iopt

indicates the storage variation used for sparse matrix A, stored in compressed-diagonal storage mode, where:

If iopt = 0, matrix A is a general sparse matrix, where all the nonzero diagonals in matrix A are used to set up the storage arrays.

If iopt = 1, matrix A is a symmetric sparse matrix, where only the nonzero main diagonal and one of each of the unique nonzero diagonals are used to set up the storage arrays.

Specified as: a fullword integer; iopt = 0 or 1.

n

is the order of sparse matrix A and the number of elements in vectors x and y. Specified as: a fullword integer; n >= 0.

nd

is the number of diagonals stored in the columns of array AD, as well as the number of columns in AD and the number of elements in array LA. Specified as: a fullword integer; nd >= 0.

ad

is the sparse matrix A of order n, stored in compressed-diagonal storage mode in an array, referred to as AD. The iopt argument indicates the storage variation used for storing matrix A. The trans argument indicates the following:

If trans = 'N', A is used in the computation.

If trans = 'T', AT is used in the computation.
Note: No data should be moved to form AT; that is, the matrix A should always be stored in its untransposed form.

Specified as: an lda by (at least) nd array, containing long-precision real numbers; lda >= n.

lda

is the size of the leading dimension of the array specified for ad. Specified as: a fullword integer; lda > 0 and lda >= n.

trans

indicates the form of matrix A to use in the computation, where:

If trans = 'N', A is used in the computation.

If trans = 'T', AT is used in the computation.

Specified as: a single character; trans = 'N' or 'T'.

la

is the array, referred to as LA, containing the diagonal numbers k for the diagonals stored in each corresponding column in array AD. (For an explanation of how diagonal numbers are assigned, see "Compressed-Diagonal Storage Mode".)

Specified as: a one-dimensional array of (at least) length nd, containing fullword integers; 1-n <= LA(i) <= n-1.

x

is the vector x of length n. Specified as: a one-dimensional array, containing long-precision real numbers.

y

See 'On Return'.

On Return

y

is the vector y of length n, containing the result of the computation. Returned as: a one-dimensional array, containing long-precision real numbers.

Notes

  1. All subroutines accept lowercase letters for the trans argument.

  2. Matrix A must have no common elements with vectors x and y; otherwise, results are unpredictable.

  3. For a description of how sparse matrices are stored in compressed-diagonal storage mode, see "Compressed-Diagonal Storage Mode".
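
As an illustration of the storage layout these notes refer to, the following is a minimal sketch, in C, of packing a dense matrix into arrays AD and LA for the general storage variation (iopt = 0). The function name pack_diagonals and its helpers are hypothetical and are not part of ESSL. The sketch assumes the layout seen in the examples for this subroutine: column j of AD holds the diagonal numbered LA(j), with element a(i, i+k) stored in row i, and with zeros in the positions where i+k falls outside the matrix. The order in which the diagonals are emitted (main diagonal first, then ascending k) is a choice made here for illustration only.

```c
#include <assert.h>
#include <stddef.h>

/* Returns 1 if diagonal k of the dense row-major n x n matrix a has a nonzero. */
static int diagonal_has_nonzero(int n, const double *a, int k)
{
    for (int i = 0; i < n; ++i) {
        int c = i + k;                      /* column of element a(i, i+k) */
        if (c >= 0 && c < n && a[i * n + c] != 0.0)
            return 1;
    }
    return 0;
}

/* Copies diagonal k into one column of AD; out-of-range rows are set to 0. */
static void store_diagonal(int n, const double *a, int k, double *col)
{
    for (int i = 0; i < n; ++i) {
        int c = i + k;
        col[i] = (c >= 0 && c < n) ? a[i * n + c] : 0.0;
    }
}

/* Hypothetical helper (not an ESSL routine).
 * a  : dense n x n matrix, row-major (C layout)
 * ad : output, column-major (Fortran layout), room for lda * (2n - 1) doubles
 * la : output, room for 2n - 1 integers
 * Returns nd, the number of diagonals stored. */
int pack_diagonals(int n, const double *a, double *ad, int lda, int *la)
{
    assert(lda >= n);
    int nd = 0;

    /* Main diagonal first, if it has any nonzero element. */
    if (diagonal_has_nonzero(n, a, 0)) {
        store_diagonal(n, a, 0, ad);
        la[nd++] = 0;
    }
    /* Remaining diagonals in ascending order of diagonal number k. */
    for (int k = 1 - n; k <= n - 1; ++k) {
        if (k == 0 || !diagonal_has_nonzero(n, a, k))
            continue;
        store_diagonal(n, a, k, ad + (size_t)nd * lda);
        la[nd++] = k;
    }
    return nd;
}
```

Applied to the matrix A of order 6 in Example 1, with lda = 6, this sketch yields nd = 5 and the same AD and LA arrays shown in that example.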

Function

The matrix-vector product of a square sparse matrix, or of its transpose, is computed for a matrix stored in compressed-diagonal storage mode:

y <-- Ax
y <-- A^T x

where:

A is a sparse matrix of order n, stored in compressed-diagonal storage mode in AD and LA, using the storage variation for either general or symmetric sparse matrices, as indicated by the iopt argument.
x and y are vectors of length n.

It is expressed as follows for y <-- Ax:



Figure ESYGR110 not displayed.


It is expressed as follows for y <-- A^T x:



Figure ESYGR111 not displayed.


If n is 0, no computation is performed; if nd is 0, output vector y is set to zero, because matrix A contains all zeros.
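
The computation described above can be sketched in C as follows. This is an illustrative reference routine only, not the ESSL implementation, and the name dsdmx_sketch is hypothetical. It assumes AD is laid out in column-major (Fortran) order, and that column j of AD holds a(i, i+k) in row i for k = LA(j), as in the examples for this subroutine. For the symmetric variation (iopt = 1) it assumes each stored off-diagonal also stands for its mirror image, so the trans argument has no effect in that case.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical reference sketch of y <-- Ax or y <-- A^T x for a matrix in
 * compressed-diagonal storage. Not the ESSL routine.
 * ad is column-major: AD(i, j), 1-based, is ad[(i - 1) + (j - 1) * lda]. */
void dsdmx_sketch(int iopt, int n, int nd, const double *ad, int lda,
                  char trans, const int *la, const double *x, double *y)
{
    assert(lda >= n);

    /* If nd is 0, y stays all zero; if n is 0, nothing is touched. */
    for (int i = 0; i < n; ++i)
        y[i] = 0.0;

    for (int j = 0; j < nd; ++j) {
        const double *col = ad + (size_t)j * lda;
        int k = la[j];                  /* diagonal number of this column */

        for (int i = 0; i < n; ++i) {
            int c = i + k;              /* col[i] represents A(i, c)      */
            if (c < 0 || c >= n)
                continue;               /* padding position in AD         */
            double a = col[i];

            if (iopt == 1) {
                /* Symmetric: A(i, c) and A(c, i) share one stored value. */
                y[i] += a * x[c];
                if (k != 0)
                    y[c] += a * x[i];
            } else if (trans == 'T' || trans == 't') {
                y[c] += a * x[i];       /* (A^T x)_c += A(i, c) * x_i     */
            } else {
                y[i] += a * x[c];       /* (A x)_i   += A(i, c) * x_c     */
            }
        }
    }
}
```

The sketch performs no argument checking beyond the lda assertion, and it makes no attempt to reproduce the performance characteristics of the ESSL subroutine.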

Error Conditions

Computational Errors

None

Input-Argument Errors
  1. iopt <> 0 or 1
  2. n < 0
  3. lda <= 0
  4. n > lda
  5. trans <> 'N' or 'T'
  6. nd < 0
  7. LA(j) <= -n or LA(j) >= n, for any j = 1, nd
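
The input-argument conditions listed above can be mirrored in a small pre-call check. The following C sketch is an assumption-laden illustration: the function name dsdmx_check_args is hypothetical, the return value is simply the list number of the first condition that is violated (0 if none), and it does not model how ESSL itself reports errors. It tests the nd entries of LA, on the assumption that LA holds exactly nd diagonal numbers, and it accepts lowercase trans values, as the notes for this subroutine state.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not an ESSL routine).
 * Returns 0 if no listed input-argument condition is violated; otherwise
 * returns the list number (1..7) of the first violated condition. */
int dsdmx_check_args(int iopt, int n, int nd, int lda, char trans,
                     const int *la)
{
    if (iopt != 0 && iopt != 1)
        return 1;                        /* iopt <> 0 or 1          */
    if (n < 0)
        return 2;                        /* n < 0                   */
    if (lda <= 0)
        return 3;                        /* lda <= 0                */
    if (n > lda)
        return 4;                        /* n > lda                 */
    if (trans != 'N' && trans != 'T' && trans != 'n' && trans != 't')
        return 5;                        /* trans <> 'N' or 'T'     */
    if (nd < 0)
        return 6;                        /* nd < 0                  */

    assert(nd == 0 || la != NULL);       /* precondition of sketch  */
    for (int j = 0; j < nd; ++j) {
        if (la[j] <= -n || la[j] >= n)
            return 7;                    /* diagonal number out of range */
    }
    return 0;
}
```

For the arguments of Example 1 (iopt = 0, n = 6, nd = 5, lda = 6, trans = 'N', LA = (0, -5, -4, -1, 2)), this check returns 0.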

Example 1

This example shows the matrix-vector product, using trans = 'N', computed for the following sparse matrix A of order 6. The matrix is stored in compressed-diagonal storage mode in arrays AD and LA, using the storage variation for general sparse matrices, storing all nonzero diagonals. Matrix A is:

                     *                              *
                     | 4.0  0.0  7.0  0.0  0.0  0.0 |
                     | 3.0  4.0  0.0  2.0  0.0  0.0 |
                     | 0.0  2.0  4.0  0.0  4.0  0.0 |
                     | 0.0  0.0  7.0  4.0  0.0  1.0 |
                     | 1.0  0.0  0.0  3.0  4.0  0.0 |
                     | 1.0  1.0  0.0  0.0  3.0  4.0 |
                     *                              *

Call Statement and Input
           IOPT  N   ND  AD  LDA TRANS  LA   X   Y
            |    |   |   |    |    |    |    |   |
CALL DSDMX( 0  , 6 , 5 , AD , 6 , 'N' , LA , X , Y )
        *                         *
        | 4.0  0.0  0.0  0.0  7.0 |
        | 4.0  0.0  0.0  3.0  2.0 |
AD   =  | 4.0  0.0  0.0  2.0  4.0 |
        | 4.0  0.0  0.0  7.0  1.0 |
        | 4.0  0.0  1.0  3.0  0.0 |
        | 4.0  1.0  1.0  3.0  0.0 |
        *                         *
LA       =  (0, -5, -4, -1, 2)
X        =  (1.0, 2.0, 3.0, 4.0, 5.0, 6.0)

Output
Y        =  (25.0, 19.0, 36.0, 43.0, 33.0, 42.0)

Example 2

This example shows the matrix-vector product, using trans = 'N', computed for the following sparse matrix A of order 6. The matrix is stored in compressed-diagonal storage mode in arrays AD and LA, using the storage variation for symmetric sparse matrices, storing the nonzero main diagonal and one of each of the unique nonzero diagonals. Matrix A is:

                  *                                    *
                  | 11.0   0.0  13.0   0.0  15.0   0.0 |
                  |  0.0  22.0   0.0  24.0   0.0  26.0 |
                  | 13.0   0.0  33.0   0.0  35.0   0.0 |
                  |  0.0  24.0   0.0  44.0   0.0  46.0 |
                  | 15.0   0.0  35.0   0.0  55.0   0.0 |
                  |  0.0  26.0   0.0  46.0   0.0  66.0 |
                  *                                    *

Call Statement and Input
           IOPT  N   ND  AD  LDA TRANS  LA   X   Y
            |    |   |   |    |    |    |    |   |
CALL DSDMX( 1  , 6 , 3 , AD , 6 , 'N' , LA , X , Y )
        *                  *
        | 11.0  13.0   0.0 |
        | 22.0  24.0   0.0 |
AD   =  | 33.0  35.0   0.0 |
        | 44.0  46.0   0.0 |
        | 55.0   0.0  15.0 |
        | 66.0   0.0  26.0 |
        *                  *
LA       =  (0, 2, -4)
X        =  (1.0, 2.0, 3.0, 4.0, 5.0, 6.0)

Output
Y        =  (125.0, 296.0, 287.0, 500.0, 395.0, 632.0)

Example 3

This example is the same as Example 1, except that it computes the matrix-vector product for the transpose of the matrix, using trans = 'T'. It uses the same sparse matrix A of order 6 as Example 1, stored in compressed-diagonal storage mode in arrays AD and LA, using the storage variation for general sparse matrices, storing all nonzero diagonals.

Call Statement and Input
           IOPT  N   ND  AD  LDA TRANS  LA   X   Y
            |    |   |   |    |    |    |    |   |
CALL DSDMX( 0  , 6 , 5 , AD , 6 , 'T' , LA , X , Y )
AD       =  (same as input AD in Example 1)
LA       =  (same as input LA in Example 1)
X        =  (same as input X in Example 1)

Output
Y        =  (21.0, 20.0, 47.0, 35.0, 50.0, 28.0)

