Guide and Reference

Fourier Transforms (Message Passing)

This chapter describes the Fourier Transforms subroutines.

Overview of the Fourier Transforms Subroutines

The Fourier transform subroutines perform mixed-radix transforms in two and three dimensions. See references [1] and [3].

Descriptive Name	Short-Precision Subroutine	Long-Precision Subroutine	Section
Complex Fourier Transforms in Two Dimensions	PSCFT2	PDCFT2	PSCFT2 and PDCFT2--Complex Fourier Transforms in Two Dimensions
Real-to-Complex Fourier Transforms in Two Dimensions	PSRCFT2	PDRCFT2	PSRCFT2 and PDRCFT2--Real-to-Complex Fourier Transforms in Two Dimensions
Complex-to-Real Fourier Transforms in Two Dimensions	PSCRFT2	PDCRFT2	PSCRFT2 and PDCRFT2--Complex-to-Real Fourier Transforms in Two Dimensions
Complex Fourier Transforms in Three Dimensions	PSCFT3	PDCFT3	PSCFT3 and PDCFT3--Complex Fourier Transforms in Three Dimensions
Real-to-Complex Fourier Transforms in Three Dimensions	PSRCFT3	PDRCFT3	PSRCFT3 and PDRCFT3--Real-to-Complex Fourier Transforms in Three Dimensions
Complex-to-Real Fourier Transforms in Three Dimensions	PSCRFT3	PDCRFT3	PSCRFT3 and PDCRFT3--Complex-to-Real Fourier Transforms in Three Dimensions

Acceptable Lengths for the Transforms

Use the following formula to determine acceptable transform lengths:

n = (2^h) (3^i) (5^j) (7^k) (11^m) for n <= 37748736

where:

h = 1, 2, ..., 25

i = 0, 1, 2

j, k, m = 0, 1

Figure 12 lists all the acceptable values for transform lengths in the Fourier transform subroutines.

Figure 12. Table of Acceptable Lengths for the Transforms

2 4 6 8 10 12 14 16 18 20 22 24 28 30 32 36 40 42 44 48 56 60 64 66 70 72 80 84 88 90 96 110 112 120 126 128 132 140 144 154 160 168 176 180 192 198 210 220 224 240 252 256 264 280 288 308 320 330 336 352 360 384 396 420 440 448 462 480 504 512 528 560 576 616 630 640 660 672 704 720 768 770 792 840 880 896 924 960 990 1008 1024 1056 1120 1152 1232 1260 1280 1320 1344 1386 1408 1440 1536 1540 1584 1680 1760 1792 1848 1920 1980 2016 2048 2112 2240 2304 2310 2464 2520 2560 2640 2688 2772 2816 2880 3072 3080 3168 3360 3520 3584 3696 3840 3960 4032 4096 4224 4480 4608 4620 4928 5040 5120 5280 5376 5544 5632 5760 6144 6160 6336 6720 6930 7040 7168 7392 7680 7920 8064 8192 8448 8960 9216 9240 9856 10080 10240 10560 10752 11088 11264 11520 12288 12320 12672 13440 13860 14080 14336 14784 15360 15840 16128 16384 16896 17920 18432 18480 19712 20160 20480 21120 21504 22176 22528 23040 24576 24640 25344 26880 27720 28160 28672 29568 30720 31680 32256 32768 33792 35840 36864 36960 39424 40320 40960 42240 43008 44352 45056 46080 49152 49280 50688 53760 55440 56320 57344 59136 61440 63360 64512 65536 67584 71680 73728 73920 78848 80640 81920 84480 86016 88704 90112 92160 98304 98560 101376 107520 110880 112640 114688 118272 122880 126720 129024 131072 135168 143360 147456 147840 157696 161280 163840 168960 172032 177408 180224 184320 196608 197120 202752 215040 221760 225280 229376 236544 245760 253440 258048 262144 270336 286720 294912 295680 315392 322560 327680 337920 344064 354816 360448 368640 393216 394240 405504 430080 443520 450560 458752 473088 491520 506880 516096 524288 540672 573440 589824 591360 630784 645120 655360 675840 688128 709632 720896 737280 786432 788480 811008 860160 887040 901120 917504 946176 983040 1013760 1032192 1048576 1081344 1146880 1179648 1182720 1261568 1290240 1310720 1351680 1376256 1419264 1441792 1474560 1572864 1576960 1622016 1720320 1774080 1802240 1835008 1892352 1966080 2027520 2064384 2097152 2162688 2293760 2359296 2365440 2523136 
2580480 2621440 2703360 2752512 2838528 2883584 2949120 3145728 3153920 3244032 3440640 3548160 3604480 3670016 3784704 3932160 4055040 4128768 4194304 4325376 4587520 4718592 4730880 5046272 5160960 5242880 5406720 5505024 5677056 5767168 5898240 6291456 6307840 6488064 6881280 7096320 7208960 7340032 7569408 7864320 8110080 8257536 8388608 8650752 9175040 9437184 9461760 10092544 10321920 10485760 10813440 11010048 11354112 11534336 11796480 12582912 12615680 12976128 13762560 14192640 14417920 14680064 15138816 15728640 16220160 16515072 16777216 17301504 18350080 18874368 18923520 20185088 20643840 20971520 21626880 22020096 22708224 23068672 23592960 25165824 25231360 25952256 27525120 28385280 28835840 29360128 30277632 31457280 32440320 33030144 33554432 34603008 36700160 37748736
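The formula for acceptable lengths can also be checked programmatically. The following C sketch tests whether a candidate length factors as required. The names is_acceptable_length and strip_factor are hypothetical helpers for illustration and are not part of the library.

```c
/* Largest acceptable transform length, taken from the formula above. */
#define MAX_TRANSFORM_LENGTH 37748736L

/* Divide *n by p while it divides evenly, at most max_count times.
   Returns the number of divisions performed. */
static int strip_factor(long *n, long p, int max_count)
{
    int count = 0;
    while (count < max_count && *n % p == 0) {
        *n /= p;
        count++;
    }
    return count;
}

/* Hypothetical helper (not a library routine): returns 1 if
   n = (2^h)(3^i)(5^j)(7^k)(11^m) with h = 1..25, i = 0..2,
   and j, k, m = 0 or 1, and n <= 37748736; otherwise returns 0. */
int is_acceptable_length(long n)
{
    long rest = n;
    int h;

    if (n < 2 || n > MAX_TRANSFORM_LENGTH)
        return 0;

    h = strip_factor(&rest, 2, 25);   /* h must be at least 1 */
    strip_factor(&rest, 3, 2);        /* i = 0, 1, or 2 */
    strip_factor(&rest, 5, 1);        /* j = 0 or 1 */
    strip_factor(&rest, 7, 1);        /* k = 0 or 1 */
    strip_factor(&rest, 11, 1);       /* m = 0 or 1 */

    /* Anything left over is a factor the formula does not allow. */
    return h >= 1 && rest == 1;
}
```

For example, is_acceptable_length(48) returns 1, whereas is_acceptable_length(50) returns 0 because 50 contains the factor 5 twice.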

Fourier Transforms Subroutines

This section contains the Fourier transform subroutine descriptions.

PSCFT2 and PDCFT2--Complex Fourier Transforms in Two Dimensions

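These subroutines compute the mixed-radix two-dimensional discrete Fourier transform of complex data:

y_k1,k2 = scale × Σ(j1 = 0, ..., n1-1) Σ(j2 = 0, ..., n2-1) x_j1,j2 × W_n1^(Isign × j1 × k1) × W_n2^(Isign × j2 × k2)

for:

k1 = 0, 1, ..., n1-1

k2 = 0, 1, ..., n2-1

where:

W_n1 = e^(-2πi/n1)

W_n2 = e^(-2πi/n2)

and where: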

x_j1,j2 are elements of array X.

y_k1,k2 are elements of array Y.

Isign is + or - (determined by argument isign).

scale is a scalar value.

For scale = 1 and isign being positive, you obtain the discrete Fourier transform. For scale = 1/((n1)(n2)) and isign being negative, you obtain the inverse Fourier transform.

See references [1] and [3].

Table 101. Data Types

X, Y	scale	Subroutine
Short-precision complex	Short-precision real	PSCFT2
Long-precision complex	Long-precision real	PDCFT2

Syntax

Fortran	CALL PSCFT2 | PDCFT2 (x, y, n1, n2, isign, scale, icontxt, ip)
C and C++	pscft2 | pdcft2 (x, y, n1, n2, isign, scale, icontxt, ip);

On Entry

x

is the local array X, containing the two-dimensional data to be transformed that has been block-column distributed over a 1 × q process grid, where q is the number of processes. (The value of ldx is set in the IP array.)

Scope: local

Specified as: an array of (at least) length ldx × LOCq(n2), containing numbers of the data type indicated in Table 101. This array must be aligned on a doubleword boundary.

y

See 'On Return'.

n1

is the length of the first dimension of the two-dimensional data in the array to be transformed.

Scope: global

Specified as: a fullword integer; n1 <= 37748736 and must be one of the values listed in Figure 12.

n2

is the length of the second dimension of the two-dimensional data in the array to be transformed.

Scope: global

Specified as: a fullword integer; n2 <= 37748736 and must be one of the values listed in Figure 12.

isign

controls the direction of the transform, determining the sign, Isign, of the exponent of W_n, where:

If isign = positive value, Isign = + (transforming time to frequency).

If isign = negative value, Isign = - (transforming frequency to time).

Scope: global

Specified as: a fullword integer; where isign > 0 or isign < 0.

scale

is the scaling constant scale.

Scope: global

Specified as: a number of the data type indicated in Table 101, where scale > 0.0 or scale < 0.0.

icontxt

is the BLACS context parameter.

Scope: global

Specified as: the fullword integer that was returned by a prior call to BLACS_GRIDINIT or BLACS_GRIDMAP.

ip

is an array of parameters, IP(i), where:

IP(1) indicates whether the default values for ip are used or you set the values for ip.
If IP(1) = 0, then the following default values are used:
- y is returned in transposed form; that is, global y has dimensions n2 × n1
- ldx, the leading dimension of the array specified for X, equals n1
- ldy, the leading dimension of the array specified for Y, equals n2
The remaining parameters of the array IP are ignored.
If IP(1) <> 0, then you set the remaining values of ip to indicate whether y is stored in normal or transposed form, and indicate values for ldx and ldy.
IP(2) indicates whether y is to be stored in normal or transposed form.
If IP(2) = 0, then y is to be stored in transposed form on output.
If IP(2) = 1, then y is to be stored in normal form on output.
IP(3-19) are reserved.
IP(20) indicates the value of the leading dimension, ldx, of the array specified for X, where:
If IP(20) = 0, then ldx = n1.
If IP(20) <> 0, then ldx is this value of IP(20).
IP(21) indicates the value of the leading dimension, ldy, of the array specified for Y, where:
If IP(21) = 0 and y is to be stored in normal form, then ldy = n1.
If IP(21) = 0 and y is to be stored in transposed form, then ldy = n2.
If IP(21) <> 0, then ldy is this value of IP(21).
IP(22-40) are reserved.

Scope: global

Specified as: a one-dimensional array of (at least) length 40, containing fullword integers, where:

IP(1) is any integer

IP(2) = 0 or 1

IP(20) >= n1 or IP(20) = 0

IP(21) >= n1 (for normal form) or IP(21) = 0

IP(21) >= n2 (for transposed form) or IP(21) = 0
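The rules for the ip array can be summarized in code. The following C sketch resolves the storage form of Y and the effective values of ldx and ldy from an ip array, returning -1 where the rules above are violated. It is illustrative only: the type cft2_layout and the function resolve_cft2_layout are hypothetical, and the sketch assumes the C array element ip[i-1] corresponds to IP(i).

```c
/* Hypothetical result type (not part of the library). */
typedef struct {
    int transposed;  /* 1 if Y is returned in transposed form, 0 if normal */
    int ldx;         /* effective leading dimension of X */
    int ldy;         /* effective leading dimension of Y */
} cft2_layout;

/* Hypothetical helper: applies the IP(1), IP(2), IP(20), and IP(21)
   rules for the complex two-dimensional transform.
   Assumption: ip[0] is IP(1), so IP(20) is ip[19] and IP(21) is ip[20].
   Returns 0 on success, -1 if a value is not allowed. */
int resolve_cft2_layout(const int ip[40], int n1, int n2, cft2_layout *out)
{
    int transposed, min_ldy, ldx, ldy;

    if (ip[0] == 0) {
        /* Defaults: transposed Y, ldx = n1, ldy = n2; the rest is ignored. */
        out->transposed = 1;
        out->ldx = n1;
        out->ldy = n2;
        return 0;
    }

    if (ip[1] != 0 && ip[1] != 1)
        return -1;                      /* IP(2) must be 0 or 1 */

    transposed = (ip[1] == 0);
    min_ldy = transposed ? n2 : n1;     /* minimum ldy depends on the form */

    ldx = (ip[19] == 0) ? n1 : ip[19];
    ldy = (ip[20] == 0) ? min_ldy : ip[20];

    if (ldx < n1 || ldy < min_ldy)
        return -1;                      /* leading dimension too small */

    out->transposed = transposed;
    out->ldx = ldx;
    out->ldy = ldy;
    return 0;
}
```

With every element of ip set to 0 and n1 = 8, n2 = 6, the sketch reports transposed form with ldx = 8 and ldy = 6, matching the defaults listed under IP(1).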

On Return

y

is the local array Y that is block-column distributed and contains the results of the computation, where:

If IP(1) = 0, the local array Y is stored in transposed form and has dimensions n2 × LOCq(n1).

If IP(1) <> 0 and IP(2) = 0, the local array Y is stored in transposed form and has dimensions ldy × LOCq(n1).

If IP(1) <> 0 and IP(2) = 1, the local array Y is stored in normal form and has dimensions ldy × LOCq(n2).

Scope: local

Returned as: an ldy × LOCq(n2) array (for normal form) or an ldy × LOCq(n1) array (for transposed form), containing the numbers of the data type indicated in Table 101. This array must be aligned on a doubleword boundary.
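The local dimensions above depend on LOCq(_), which is defined in "Two-Dimensional Sequence". As a rough sketch only, the following C function assumes that columns are assigned in consecutive blocks of ceiling(n/q) columns, one block per process in grid order. That assumption reproduces the 3, 3, 2 split shown in Example 2, but it should be confirmed against that section. The name local_columns is hypothetical.

```c
/* Hypothetical helper (not part of the library).
   Assumption: an n-column sequence is block-column distributed over a
   1-by-q process grid in consecutive blocks of ceiling(n/q) columns.
   Returns the number of columns held by the process in grid column
   mycol (0 <= mycol < q). */
int local_columns(int n, int q, int mycol)
{
    int block = (n + q - 1) / q;      /* ceiling(n/q) */
    int start = mycol * block;        /* first global column on this process */
    int remaining;

    if (start >= n)
        return 0;                     /* this process holds no columns */

    remaining = n - start;
    return (remaining < block) ? remaining : block;
}
```

Under this assumption, 8 columns over 3 processes give local_columns(8, 3, 0) = 3, local_columns(8, 3, 1) = 3, and local_columns(8, 3, 2) = 2.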

Notes and Coding Rules

You may specify the same array for both X and Y. In this case, output overwrites input. If you specify different arrays X and Y, they must have no common elements; otherwise, results are unpredictable.
For the output array Y, these subroutines may use any extra space available when ldy is greater than its minimum value.
For more information on LOCq(_) and how sequences are block-column distributed, see "Two-Dimensional Sequence".
In general, distributing your data evenly provides the best work load balance among the processes and allows the use of the most efficient collective communication. However, for your specific problem size and number of processes available, experimentation is necessary to achieve optimal performance.
An example of the use of this subroutine in a thermal diffusion application program is shown in Appendix B. "Sample Programs". See subroutine fourier in "Module Fourier (Message Passing)".

Error Conditions

Computational Errors

None

Resource Errors

Unable to allocate work space

Input-Argument and Miscellaneous Errors

Stage 1

icontxt is invalid

Stage 2

Process grid is not 1 × q
The subroutine was called from outside the process grid.

Stage 3

n1 > 37748736
n2 > 37748736
The length of n1 or n2 is not an allowable transform length.
isign = 0
scale = 0.0
IP(1) <> 0 and IP(2) is neither 0 nor 1
IP(1) <> 0 and IP(20) <> 0 and IP(20) < n1 (that is, ldx < n1)
IP(1) <> 0 and IP(2) = 1 (for normal mode) and IP(21) <> 0 and IP(21) < n1 (that is, ldy < n1)
IP(1) <> 0 and IP(2) = 0 (for transpose mode) and IP(21) <> 0 and IP(21) < n2 (that is, ldy < n2)

Example 1

This example shows how to compute a two-dimensional transform. In this example, IP(1) is set to 0, so the default values are used: array Y is returned in transposed form, ldx = n1, and ldy = n2. The data is block-column distributed over a 1 × 2 process grid. The arrays are declared as follows:

  COMPLEX*16 X(0:7,0:2), Y(0:5,0:3)
  INTEGER*4  IP(40)
  REAL*8     SCALE

Call Statements and Input

ORDER = 'R'
NPROW = 1
NPCOL = 2
CALL BLACS_GET(0, 0, ICONTXT)
CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
IP(1) = 0
 
             X   Y   N1  N2  ISIGN     SCALE       ICONTXT   IP
             |   |   |   |     |         |            |      |
CALL PDCFT2( X , Y , 8 , 6 ,  -1  , 1.0D0/48.0D0 , ICONTXT , IP)

Global matrix X of order 8 × 6:

B,D                    0                                     1
     *                                                                         *
     |  (48.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0) |
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0) |
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0) |
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0) |
 0   |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0) |
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0) |
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0) |
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0) |
     *                                                                         *

The following is the 1 × 2 process grid:

B,D	0	1
0	P₀₀	P₀₁

Local arrays for X:

p,q  |                 0                  |                  1
-----|------------------------------------|------------------------------------
     |  (48.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0)
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0)
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0)
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0)
 0   |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0)
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0)
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0)
     |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |    (0.0,0.0)  (0.0,0.0)  (0.0,0.0)

Output

Global matrix for Y:

B,D                         0                                                1
     *                                                                                               *
     |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0) |
     |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0) |
     |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0) |
 0   |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0) |
     |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0) |
     |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0) |
     *                                                                                               *

The following is the 1 × 2 process grid:

B,D	0	1
0	P₀₀	P₀₁

Local arrays for Y:

p,q  |                      0                        |                       1
-----|-----------------------------------------------|-----------------------------------------------
     |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)
     |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)
     |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)
 0   |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)
     |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)
     |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |    (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)

Example 2

This example shows how to compute a two-dimensional transform when the data is unevenly block-column distributed over a 1 × 3 process grid. In this example, IP(1) is set to 0, so the default values are used: array Y is returned in transposed form, ldx = n1, and ldy = n2. The arrays are declared as follows:

  COMPLEX*16 X(0:7,0:2), Y(0:7,0:2)
  INTEGER*4  IP(40)
  REAL*8     SCALE

Call Statements and Input

ORDER = 'R'
NPROW = 1
NPCOL = 3
CALL BLACS_GET(0, 0, ICONTXT)
CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
IP(1) = 0
 
             X   Y   N1  N2  ISIGN     SCALE       ICONTXT   IP
             |   |   |   |     |         |            |      |
CALL PDCFT2( X , Y , 8 , 8 ,   1  , 1.0D0/16.0D0 , ICONTXT , IP)

Global matrix X of order 8 × 8:

B,D                    0                                     1                              2
   *                                                                                                     *
   |  (0.0,98.0) (67.0,27.0) (67.0,82.0) | (84.0,99.0) (26.0,41.0) (24.0,15.0) | (27.0,55.0)  (48.0,9.0) |
   | (13.0,49.0) (93.0,91.0)  (0.0,12.0) | (52.0,88.0)  (4.0,84.0) (98.0,57.0) | (43.0,89.0) (89.0,27.0) |
   | (75.0,26.0) (38.0,52.0)  (38.0,1.0) |  (9.0,23.0) (73.0,26.0) (72.0,80.0) | (76.0,62.0)  (90.0,0.0) |
   |  (45.0,9.0) (51.0,46.0)  (6.0,68.0) | (65.0,30.0) (32.0,41.0)  (75.0,3.0) | (47.0,84.0)  (6.0,41.0) |
 0 | (53.0,94.0) (83.0,94.0) (41.0,86.0) | (41.0,35.0) (63.0,53.0) (65.0,53.0) | (23.0,15.0)  (90.0,2.0) |
   |  (21.0,7.0)   (3.0,5.0) (68.0,62.0) | (70.0,51.0) (75.0,46.0)  (7.0,49.0) | (27.0,21.0) (50.0,70.0) |
   |  (4.0,50.0)  (5.0,76.0) (58.0,73.0) | (91.0,59.0) (99.0,28.0) (63.0,95.0) | (35.0,71.0) (51.0,93.0) |
   | (67.0,38.0) (52.0,77.0) (93.0,72.0) | (76.0,84.0) (36.0,17.0) (88.0,74.0) | (16.0,13.0) (31.0,23.0) |
   *                                                                                                     *

The following is the 1 × 3 process grid:

B,D	0	1	2
0	P₀₀	P₀₁	P₀₂

Local arrays for X:

p,q  |                 0                  |                  1                     |             2
-----|------------------------------------|----------------------------------------|--------------------------
     | (0.0,98.0) (67.0,27.0) (67.0,82.0) |  (84.0,99.0)  (26.0,41.0)  (24.0,15.0) | (27.0,55.0)   (48.0,9.0)
     |(13.0,49.0) (93.0,91.0)  (0.0,12.0) |  (52.0,88.0)   (4.0,84.0)  (98.0,57.0) | (43.0,89.0)  (89.0,27.0)
     |(75.0,26.0) (38.0,52.0)  (38.0,1.0) |   (9.0,23.0)  (73.0,26.0)  (72.0,80.0) | (76.0,62.0)   (90.0,0.0)
     | (45.0,9.0) (51.0,46.0)  (6.0,68.0) |  (65.0,30.0)  (32.0,41.0)   (75.0,3.0) | (47.0,84.0)   (6.0,41.0)
 0   |(53.0,94.0) (83.0,94.0) (41.0,86.0) |  (41.0,35.0)  (63.0,53.0)  (65.0,53.0) | (23.0,15.0)   (90.0,2.0)
     | (21.0,7.0)   (3.0,5.0) (68.0,62.0) |  (70.0,51.0)  (75.0,46.0)   (7.0,49.0) | (27.0,21.0)  (50.0,70.0)
     | (4.0,50.0)  (5.0,76.0) (58.0,73.0) |  (91.0,59.0)  (99.0,28.0)  (63.0,95.0) | (35.0,71.0)  (51.0,93.0)
     |(67.0,38.0) (52.0,77.0) (93.0,72.0) |  (76.0,84.0)  (36.0,17.0)  (88.0,74.0) | (16.0,13.0)  (31.0,23.0)

Output

Global matrix for Y:

B,D                         0                                       1                                2
   *                                                                                                                *
   |(198.6,200.1)  (-10.6,9.8)     (0.8,7.2)  |   (5.8,-5.2)   (11.2,9.1) (-38.3,-18.7) |(-10.2,-1.9)   (14.0,12.6) |
   |  (-0.3,-6.8)  (19.3,-18.7)  (28.7,-3.6)  |   (-7.2,2.5)   (1.5,14.6) (-22.0,-20.7) |(29.8,-15.0)   (-10.7,0.8) |
   |  (11.3,-6.2)  (-24.0,-8.1)   (8.6,11.6)  |  (-29.9,6.5)  (13.7,13.5)  (-16.7,-4.4) |(-26.6,-0.8)    (-3.3,9.5) |
   |   (5.7,17.1)    (3.7,-7.0)  (-2.5,13.9)  |(-19.5,-15.9) (-18.4,20.1)   (11.6,-1.8) | (-0.3,-8.2)   (26.8,30.0) |
0  | (-29.8,-3.4)    (-0.5,7.4) (-17.1,27.5)  |  (18.5,32.6)    (9.4,9.6)    (7.6,-8.0) |(-13.1,13.9) (-26.6,-16.5) |
   |  (-10.2,1.6)   (-5.0,28.8)  (-5.0,25.0)  |   (5.0,12.1)  (-13.5,9.9)     (2.5,0.6) |  (0.0,-5.6)  (-11.8,-8.3) |
   | (-8.7,-13.6)   (10.0,11.1)    (0.6,9.4)  | (12.2,-21.2)  (-9.3,-0.9)  (14.5,-15.6) |  (2.4,11.1)   (-22.7,0.2) |
   | (-27.7,-3.1) (-21.8,-21.3)  (-22.6,6.0)  |   (0.2,11.6)   (-1.6,6.6)   (-7.2,-0.4) |  (0.5,25.6)   (20.3,23.8) |
   *                                                                                                                *

The following is the 1 × 3 process grid:

B,D	0	1	2
0	P₀₀	P₀₁	P₀₂

Local arrays for Y:

p,q|                      0                   |                     1                   |              2
---|------------------------------------------|-----------------------------------------|--------------------------
   |(198.6,200.1)   (-10.6,9.8)    (0.8,7.2)  |   (5.8,-5.2)   (11.2,9.1) (-38.3,-18.7) |(-10.2,-1.9)   (14.0,12.6)
   |  (-0.3,-6.8)  (19.3,-18.7)  (28.7,-3.6)  |   (-7.2,2.5)   (1.5,14.6) (-22.0,-20.7) |(29.8,-15.0)   (-10.7,0.8)
   |  (11.3,-6.2)  (-24.0,-8.1)   (8.6,11.6)  |  (-29.9,6.5)  (13.7,13.5)  (-16.7,-4.4) |(-26.6,-0.8)    (-3.3,9.5)
   |   (5.7,17.1)    (3.7,-7.0)  (-2.5,13.9)  |(-19.5,-15.9) (-18.4,20.1)   (11.6,-1.8) | (-0.3,-8.2)   (26.8,30.0)
0  | (-29.8,-3.4)    (-0.5,7.4) (-17.1,27.5)  |  (18.5,32.6)    (9.4,9.6)    (7.6,-8.0) |(-13.1,13.9) (-26.6,-16.5)
   |  (-10.2,1.6)   (-5.0,28.8)  (-5.0,25.0)  |   (5.0,12.1)  (-13.5,9.9)     (2.5,0.6) |  (0.0,-5.6)  (-11.8,-8.3)
   | (-8.7,-13.6)   (10.0,11.1)    (0.6,9.4)  | (12.2,-21.2)  (-9.3,-0.9)  (14.5,-15.6) |  (2.4,11.1)   (-22.7,0.2)
   | (-27.7,-3.1) (-21.8,-21.3)  (-22.6,6.0)  |   (0.2,11.6)   (-1.6,6.6)   (-7.2,-0.4) |  (0.5,25.6)   (20.3,23.8)

PSRCFT2 and PDRCFT2--Real-to-Complex Fourier Transforms in Two Dimensions

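These subroutines compute the mixed-radix two-dimensional complex conjugate even discrete Fourier transform of real data:

y_k1,k2 = scale × Σ(j1 = 0, ..., n1-1) Σ(j2 = 0, ..., n2-1) x_j1,j2 × W_n1^(Isign × j1 × k1) × W_n2^(Isign × j2 × k2)

for:

k1 = 0, 1, ..., n1-1

k2 = 0, 1, ..., n2-1

where:

W_n1 = e^(-2πi/n1)

W_n2 = e^(-2πi/n2)

and where: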

x_j1,j2 are elements of array X.

y_k1,k2 are elements of array Y.

Isign is + or - (determined by argument isign).

scale is a scalar value.

For scale = 1 and isign being positive, you obtain the discrete Fourier transform. For scale = 1/((n1)(n2)) and isign being negative, you obtain the inverse Fourier transform.

See references [1] and [3].

Table 102. Data Types

X, scale	Y	Subroutine
Short-precision real	Short-precision complex	PSRCFT2
Long-precision real	Long-precision complex	PDRCFT2

Syntax

Fortran	CALL PSRCFT2 | PDRCFT2 (x, y, n1, n2, isign, scale, icontxt, ip)
C and C++	psrcft2 | pdrcft2 (x, y, n1, n2, isign, scale, icontxt, ip);

On Entry

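x

is the local array X, containing the two-dimensional data to be transformed that has been block-column distributed over a 1 × q process grid, where q is the number of processes. (The value of ldx is set in the IP array.)

Scope: local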

Specified as: an array of (at least) length ldx × LOCq(n2), containing numbers of the data type indicated in Table 102. This array must be aligned on a doubleword boundary.

y

See 'On Return'.

n1

is the length of the first dimension of the two-dimensional data in the array to be transformed.

Scope: global

Specified as: a fullword integer; n1 <= 37748736 and must be one of the values listed in Figure 12.

n2

is the length of the second dimension of the two-dimensional data in the array to be transformed.

Scope: global

Specified as: a fullword integer; n2 <= 37748736 and must be one of the values listed in Figure 12.

isign

controls the direction of the transform, determining the sign, Isign, of the exponent of W_n, where:

If isign = positive value, Isign = + (transforming time to frequency).

If isign = negative value, Isign = - (transforming frequency to time).

Scope: global

Specified as: a fullword integer; where isign > 0 or isign < 0.

scale

is the scaling constant scale.

Scope: global

Specified as: a number of the data type indicated in Table 102, where scale > 0.0 or scale < 0.0.

icontxt

is the BLACS context parameter.

Scope: global

Specified as: the fullword integer that was returned by a prior call to BLACS_GRIDINIT or BLACS_GRIDMAP.

ip

is an array of parameters, IP(i), where:

IP(1) indicates whether the default values for ip are used or you set the values for ip.
If IP(1) = 0, then the following default values are used:
- ldx, the leading dimension of the array specified for X, equals n1
- ldy, the leading dimension of the array specified for Y, equals n2
The remaining parameters of the array IP are ignored.
If IP(1) <> 0, then you set the remaining values of ip to indicate values for ldx and ldy.
IP(2-19) are reserved.
IP(20) indicates the value of the leading dimension, ldx, of the array specified for X, where:
If IP(20) = 0, then ldx = n1.
If IP(20) <> 0, then ldx is this value of IP(20).
IP(21) indicates the value of the leading dimension, ldy, of the array specified for Y, where:
If IP(21) = 0, then ldy = n2.
If IP(21) <> 0, then ldy is this value of IP(21).
IP(22-40) are reserved.

Scope: global

Specified as: a one-dimensional array of (at least) length 40, containing fullword integers, where:

IP(1) is any integer

IP(20) >= n1 or IP(20) = 0

IP(21) >= n2 or IP(21) = 0

On Return

y

is the local array Y, stored in FFT-packed storage mode, containing the results of the computation that are block-column distributed, where:

If IP(1) = 0, the local array Y has dimensions n2 × LOCq(n1/2).

If IP(1) <> 0 and IP(21) = 0, the local array Y has dimensions n2 × LOCq(n1/2).

If IP(1) <> 0 and IP(21) <> 0, the local array Y has dimensions ldy × LOCq(n1/2).

Scope: local

Returned as: an ldy × LOCq(n1/2) array, containing the numbers of the data type indicated in Table 102. This array must be aligned on a doubleword boundary.
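The ip rules for the real-to-complex transform are simpler than for the complex transform, because Y is always returned in transposed form. The following C sketch resolves ldx and ldy and reports the global number of columns of the packed output, n1/2. It is illustrative only: the type rcft2_layout and the function resolve_rcft2_layout are hypothetical, and the sketch assumes the C array element ip[i-1] corresponds to IP(i).

```c
/* Hypothetical result type (not part of the library). */
typedef struct {
    int ldx;        /* effective leading dimension of X (at least n1) */
    int ldy;        /* effective leading dimension of Y (at least n2) */
    int y_columns;  /* global number of columns of packed Y: n1/2 */
} rcft2_layout;

/* Hypothetical helper: applies the IP(1), IP(20), and IP(21) rules for
   the real-to-complex two-dimensional transform.
   Assumption: ip[0] is IP(1), so IP(20) is ip[19] and IP(21) is ip[20].
   Returns 0 on success, -1 if a leading dimension is too small. */
int resolve_rcft2_layout(const int ip[40], int n1, int n2, rcft2_layout *out)
{
    int ldx = n1;   /* default when IP(1) = 0 or IP(20) = 0 */
    int ldy = n2;   /* default when IP(1) = 0 or IP(21) = 0 */

    if (ip[0] != 0) {
        if (ip[19] != 0)
            ldx = ip[19];
        if (ip[20] != 0)
            ldy = ip[20];
    }

    if (ldx < n1 || ldy < n2)
        return -1;

    out->ldx = ldx;
    out->ldy = ldy;
    out->y_columns = n1 / 2;
    return 0;
}
```

For n1 = 8 and n2 = 4 with IP(1) = 1, IP(20) = 12, and IP(21) = 7, the sketch gives ldx = 12, ldy = 7, and four global columns for Y, so each of two processes holds a 7 × 2 local array.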

Notes and Coding Rules

These subroutines always return Y in transposed form.
For the output array Y, these subroutines may use any extra space available when ldy is greater than its minimum value.
You may specify the same array for X and Y. In this case, output overwrites input. If you specify different arrays X and Y, they must have no common elements; otherwise, results are unpredictable.
For more information on LOCq(_), and how sequences are block-column distributed and stored in FFT-packed storage mode, see "Two-Dimensional Sequence".
In general, distributing your data evenly provides the best work load balance among the processes and allows the use of the most efficient collective communication. However, for your specific problem size and number of processes available, experimentation is necessary to achieve optimal performance.

Error Conditions

Computational Errors

None

Resource Errors

Unable to allocate work space

Input-Argument and Miscellaneous Errors

Stage 1

icontxt is invalid

Stage 2

Process grid is not 1 × q
The subroutine was called from outside the process grid.

Stage 3

n1 > 37748736
n2 > 37748736
The length of n1 or n2 is not an allowable transform length.
isign = 0
scale = 0.0
IP(1) <> 0 and IP(20) <> 0 and IP(20) < n1 (that is, ldx < n1)
IP(1) <> 0 and IP(21) <> 0 and IP(21) < n2 (that is, ldy < n2)

Example

This example shows how to compute a two-dimensional transform. The data is block-column distributed over a 1 × 2 process grid. The arrays are declared as follows:

  REAL*8 X(0:11,0:1)
  COMPLEX*16 Y(0:6,0:1)
  INTEGER*4 IP(40)
  REAL*8 SCALE

Call Statements and Input

ORDER = 'R'
NPROW = 1
NPCOL = 2
CALL BLACS_GET(0, 0, ICONTXT)
CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
IP(1) = 1
IP(20) = 12 (that is, ldx = 12)
IP(21) = 7  (that is, ldy = 7)
 
              X   Y   N1  N2  ISIGN  SCALE   ICONTXT   IP
              |   |   |   |     |      |        |      |
CALL PDRCFT2( X , Y , 8 , 4 ,   1  , 1.0D0 , ICONTXT , IP)

Global matrix X of order 8 × 4:

B,D        0              1
     *                          *
     |  1.0   0.0  |   0.0  0.0 |
     |  0.0   0.0  |   0.0  0.0 |
     |  0.0   0.0  |   0.0  0.0 |
     |  0.0   0.0  |   0.0  0.0 |
     |  0.0   0.0  |   0.0  0.0 |
     |  0.0   0.0  |   0.0  0.0 |
 0   |  0.0   0.0  |   0.0  0.0 |
     |  0.0   0.0  |   0.0  0.0 |
     |   .     .   |    .    .  |
     |   .     .   |    .    .  |
     |   .     .   |    .    .  |
     |   .     .   |    .    .  |
     *                          *

The following is the 1 × 2 process grid:

B,D	0	1
0	P₀₀	P₀₁

Local arrays for X:

p,q  |     0      |      1
-----|------------|------------
     |  1.0  0.0  |   0.0  0.0
     |  0.0  0.0  |   0.0  0.0
     |  0.0  0.0  |   0.0  0.0
     |  0.0  0.0  |   0.0  0.0
     |  0.0  0.0  |   0.0  0.0
     |  0.0  0.0  |   0.0  0.0
 0   |  0.0  0.0  |   0.0  0.0
     |  0.0  0.0  |   0.0  0.0
     |   .    .   |    .    .
     |   .    .   |    .    .
     |   .    .   |    .    .
     |   .    .   |    .    .

Output

The following global matrix Y is returned in transposed form and stored in FFT-packed storage mode:

B,D              0                          1
     *                                                   *
     |  (1.0,1.0)  (1.0,0.0)  |     (1.0,0.0)  (1.0,0.0) |
     |  (1.0,0.0)  (1.0,0.0)  |     (1.0,0.0)  (1.0,0.0) |
     |  (1.0,1.0)  (1.0,0.0)  |     (1.0,0.0)  (1.0,0.0) |
 0   |  (1.0,0.0)  (1.0,0.0)  |     (1.0,0.0)  (1.0,0.0) |
     |      .          .      |         .          .     |
     |      .          .      |         .          .     |
     |      .          .      |         .          .     |
     *                                                   *

The following is the 1 × 2 process grid:

B,D	0	1
0	P₀₀	P₀₁

The following local arrays for Y are returned in transposed form and stored in FFT-packed storage mode:

p,q  |           0             |            1
-----|-------------------------|-------------------------
     |  (1.0,1.0)   (1.0,0.0)  |   (1.0,0.0)   (1.0,0.0)
     |  (1.0,0.0)   (1.0,0.0)  |   (1.0,0.0)   (1.0,0.0)
     |  (1.0,1.0)   (1.0,0.0)  |   (1.0,0.0)   (1.0,0.0)
 0   |  (1.0,0.0)   (1.0,0.0)  |   (1.0,0.0)   (1.0,0.0)
     |      .           .      |       .           .
     |      .           .      |       .           .
     |      .           .      |       .           .

PSCRFT2 and PDCRFT2--Complex-to-Real Fourier Transforms in Two Dimensions

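These subroutines compute the mixed-radix two-dimensional real discrete Fourier transform of complex conjugate even data:

y_k1,k2 = scale × Σ(j1 = 0, ..., n1-1) Σ(j2 = 0, ..., n2-1) x_j1,j2 × W_n1^(Isign × j1 × k1) × W_n2^(Isign × j2 × k2)

for:

k1 = 0, 1, ..., n1-1

k2 = 0, 1, ..., n2-1

where:

W_n1 = e^(-2πi/n1)

W_n2 = e^(-2πi/n2)

and where: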

x_j1,j2 are elements of array X.

y_k1,k2 are elements of array Y.

Isign is + or - (determined by argument isign).

scale is a scalar value.

For scale = 1 and isign being positive, you obtain the discrete Fourier transform. For scale = 1/((n1)(n2)) and isign being negative, you obtain the inverse Fourier transform.

See references [1] and [3].

Table 103. Data Types

X	Y, scale	Subroutine
Short-precision complex	Short-precision real	PSCRFT2
Long-precision complex	Long-precision real	PDCRFT2

Syntax

Fortran	CALL PSCRFT2 | PDCRFT2 (x, y, n1, n2, isign, scale, icontxt, ip)
C and C++	pscrft2 | pdcrft2 (x, y, n1, n2, isign, scale, icontxt, ip);

On Entry

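x

is the local array X, containing the two-dimensional data to be transformed, stored in FFT-packed storage mode, that has been block-column distributed over a 1 × q process grid, where q is the number of processes. (The value of ldx is set in the IP array.)

Scope: local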

Specified as: an array of (at least) length ldx × LOCq(n1/2), containing numbers of the data type indicated in Table 103. This array must be aligned on a doubleword boundary.

y

See 'On Return'.

n1

is the length of the second dimension of the two-dimensional data of the array to be transformed.

Scope: global

Specified as: a fullword integer; n1 <= 37748736 and must be one of the values listed in Figure 12.

n2

is the length of the first dimension of the two-dimensional data of the array to be transformed.

Scope: global

Specified as: a fullword integer; n2 <= 37748736 and must be one of the values listed in Figure 12.

isign

controls the direction of the transform, determining the sign, Isign, of the exponent of W_n, where:

If isign = positive value, Isign = + (transforming time to frequency).

If isign = negative value, Isign = - (transforming frequency to time).

Scope: global

Specified as: a fullword integer; where isign > 0 or isign < 0.

scale

is the scaling constant scale.

Scope: global

Specified as: a number of the data type indicated in Table 103, where scale > 0.0 or scale < 0.0.

icontxt

is the BLACS context parameter.

Scope: global

Specified as: the fullword integer that was returned by a prior call to BLACS_GRIDINIT or BLACS_GRIDMAP.

ip

is an array of parameters, IP(i), where:

IP(1) indicates whether the default values for ip are used or you set the values for ip.
If IP(1) = 0, then the following default values are used:
- ldx, the leading dimension of the array specified for X, equals n2
- ldy, the leading dimension of the array specified for Y, equals n1
The remaining parameters of the array IP are ignored.
If IP(1) <> 0, then you set the remaining values of ip to indicate values for ldx and ldy.
IP(2-19) are reserved.
IP(20) indicates the value of the leading dimension, ldx, of the array specified for X, where:
If IP(20) = 0, then ldx = n2.
If IP(20) <> 0, then ldx is this value of IP(20).
IP(21) indicates the value of the leading dimension, ldy, of the array specified for Y, where:
If IP(21) = 0, then ldy = n1.
If IP(21) <> 0, then ldy is this value of IP(21).
IP(22-40) are reserved.

Scope: global

Specified as: a one-dimensional array of (at least) length 40, containing fullword integers, where:

IP(1) is any integer

IP(20) >= n2 or IP(20) = 0

IP(21) >= n1 or IP(21) = 0
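The rules for IP(1), IP(20), and IP(21) reduce to a small lookup. The following Python sketch is illustrative only: resolve_ld_crft2 is a hypothetical helper, not a library routine, and it uses 0-based indexing, so ip[0] stands for IP(1), ip[19] for IP(20), and ip[20] for IP(21).

```python
def resolve_ld_crft2(ip, n1, n2):
    """Return (ldx, ldy) following the IP rules described above.

    ip is a sequence of at least 40 integers (0-based here, 1-based in Fortran).
    """
    if len(ip) < 40:
        raise ValueError("IP must contain at least 40 integers")
    if ip[0] == 0:
        # Defaults apply and the rest of IP is ignored.
        return n2, n1
    ldx = ip[19] if ip[19] != 0 else n2   # IP(20)
    ldy = ip[20] if ip[20] != 0 else n1   # IP(21)
    if ldx < n2:
        raise ValueError("IP(20) must be 0 or at least n2")
    if ldy < n1:
        raise ValueError("IP(21) must be 0 or at least n1")
    return ldx, ldy


# Settings used in the PDCRFT2 example of this section: n1 = 8, n2 = 4.
ip = [0] * 40
ip[0], ip[19], ip[20] = 1, 7, 12
print(resolve_ld_crft2(ip, 8, 4))          # explicit leading dimensions
print(resolve_ld_crft2([0] * 40, 8, 4))    # defaults: ldx = n2, ldy = n1
```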

On Return

y

is the local array Y that is block-column distributed and contains the results of the computation, where:

If IP(1) = 0, the local array Y is stored in normal form and has dimensions n1 × LOCq(n2).

If IP(1) <> 0 and IP(21) = 0, the local array Y is stored in normal form and has dimensions n1 × LOCq(n2).

If IP(1) <> 0 and IP(21) <> 0, the local array Y is stored in normal form and has dimensions ldy × LOCq(n2).

Scope: local

Returned as: an ldy × LOCq(n2) array, containing the numbers of the data type indicated in Table 103. This array must be aligned on a doubleword boundary.

Notes and Coding Rules

These subroutines always return Y in normal form.
For the output array Y, these subroutines may use any extra space available when ldy is greater than its minimum value.
For more information on LOCq(_), and how sequences are block-column distributed and stored in FFT-packed storage mode, see "Two-Dimensional Sequence".
In general, distributing your data evenly provides the best work load balance among the processes and allows the use of the most efficient collective communication. However, for your specific problem size and number of processes available, experimentation is necessary to achieve optimal performance.
You may specify the same array for X and Y. In this case, output overwrites input. If you specify different arrays X and Y, they must have no common elements; otherwise, results are unpredictable.
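As a rough illustration of LOCq(_), the Python sketch below computes how many columns (or planes) each process holds. It assumes that the data is dealt out in contiguous blocks of ceil(n/q) in process-column order; that assumption is consistent with the local array sizes shown in the examples of this chapter, but the definition in "Two-Dimensional Sequence" is authoritative. The name locq is hypothetical.

```python
def locq(n, q, mycol):
    """Number of columns (or planes) held by process column mycol.

    Assumes contiguous blocks of ceil(n / q) assigned in grid order,
    so trailing processes may hold fewer items, or none at all.
    """
    block = -(-n // q)            # ceiling division
    start = mycol * block         # first global index owned by mycol
    return max(0, min(block, n - start))


for n, q in ((4, 2), (6, 3), (4, 3)):
    print(n, q, [locq(n, q, col) for col in range(q)])
```

Under this assumption, an even split (for example 4 over 2 processes) gives every process the same amount of data, which is the balanced case the note above recommends.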

Error Conditions

Computational Errors

None

Resource Errors

Unable to allocate work space.

Input-Argument and Miscellaneous Errors

Stage 1

icontxt is invalid

Stage 2

Process grid is not 1 × q
The subroutine was called from outside the process grid.

Stage 3

n1 > 37748736
n2 > 37748736
The length of n1 or n2 is not an allowable transform length.
isign = 0
scale = 0.0
IP(1) <> 0 and IP(20) <> 0 and IP(20) < n2 (that is, ldx < n2)
IP(1) <> 0 and IP(21) <> 0 and IP(21) < n1 (that is, ldy < n1)

Example

This example shows how to compute a two-dimensional transform. The data is block-column distributed over a 1 × 2 process grid. The arrays are declared as follows:

  COMPLEX*16 X(0:6,0:1)
  REAL*8 Y(0:11,0:1)
  INTEGER*4 IP(40)
  REAL*8 SCALE

Call Statements and Input

ORDER = 'R'
NPROW = 1
NPCOL = 2
CALL BLACS_GET(0, 0, ICONTXT)
CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
IP(1) = 1
IP(20) = 7 (that is, ldx = 7)
IP(21) = 12 (that is, ldy = 12)
 
              X   Y   N1  N2  ISIGN     SCALE       ICONTXT   IP
              |   |   |   |     |         |            |      |
CALL PDCRFT2( X , Y , 8 , 4 ,  -1  , 1.0D0/32.0D0 , ICONTXT , IP)

The following global matrix X is stored in FFT-packed storage mode:

B,D              0                          1
     *                                                   *
     |  (1.0,1.0)  (1.0,0.0)  |     (1.0,0.0)  (1.0,0.0) |
     |  (1.0,0.0)  (1.0,0.0)  |     (1.0,0.0)  (1.0,0.0) |
     |  (1.0,1.0)  (1.0,0.0)  |     (1.0,0.0)  (1.0,0.0) |
 0   |  (1.0,0.0)  (1.0,0.0)  |     (1.0,0.0)  (1.0,0.0) |
     |      .          .      |         .          .     |
     |      .          .      |         .          .     |
     |      .          .      |         .          .     |
     *                                                   *

The following is the 1 × 2 process grid:

B,D 0 1
0 P₀₀ P₀₁

The following local arrays for X are stored in FFT-packed storage mode:

p,q  |           0            |            1
-----|------------------------|------------------------
     |  (1.0,1.0)  (1.0,0.0)  |   (1.0,0.0)  (1.0,0.0)
     |  (1.0,0.0)  (1.0,0.0)  |   (1.0,0.0)  (1.0,0.0)
     |  (1.0,1.0)  (1.0,0.0)  |   (1.0,0.0)  (1.0,0.0)
 0   |  (1.0,0.0)  (1.0,0.0)  |   (1.0,0.0)  (1.0,0.0)
     |      .          .      |       .          .
     |      .          .      |       .          .
     |      .          .      |       .          .

Output

Global matrix Y:

B,D        0              1
     *                           *
     |  1.0   0.0  |   0.0   0.0 |
     |  0.0   0.0  |   0.0   0.0 |
     |  0.0   0.0  |   0.0   0.0 |
     |  0.0   0.0  |   0.0   0.0 |
     |  0.0   0.0  |   0.0   0.0 |
     |  0.0   0.0  |   0.0   0.0 |
 0   |  0.0   0.0  |   0.0   0.0 |
     |  0.0   0.0  |   0.0   0.0 |
     |   .     .   |    .     .  |
     |   .     .   |    .     .  |
     |   .     .   |    .     .  |
     |   .     .   |    .     .  |
     *                           *

The following is the 1 × 2 process grid:

B,D 0 1
0 P₀₀ P₀₁

Local arrays for Y:

p,q  |     0      |      1
-----|------------|------------
     |  1.0  0.0  |   0.0  0.0
     |  0.0  0.0  |   0.0  0.0
     |  0.0  0.0  |   0.0  0.0
     |  0.0  0.0  |   0.0  0.0
     |  0.0  0.0  |   0.0  0.0
     |  0.0  0.0  |   0.0  0.0
 0   |  0.0  0.0  |   0.0  0.0
     |  0.0  0.0  |   0.0  0.0
     |   .    .   |    .    .
     |   .    .   |    .    .
     |   .    .   |    .    .
     |   .    .   |    .    .

PSCFT3 and PDCFT3--Complex Fourier Transforms in Three Dimensions

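These subroutines compute the mixed-radix three-dimensional discrete Fourier transform of complex data:

$$y_{k1,k2,k3} = \mathit{scale} \sum_{j1=0}^{n1-1} \sum_{j2=0}^{n2-1} \sum_{j3=0}^{n3-1} x_{j1,j2,j3} \, W_{n1}^{(\mathit{Isign})\, j1\, k1} \, W_{n2}^{(\mathit{Isign})\, j2\, k2} \, W_{n3}^{(\mathit{Isign})\, j3\, k3}$$

for:

k1 = 0, 1, ..., n1-1

k2 = 0, 1, ..., n2-1

k3 = 0, 1, ..., n3-1

where:

$$W_n = e^{-2\pi i / n}$$

and where: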

x_j1,j2,j3 are elements of array X.

y_k1,k2,k3 are elements of array Y.

Isign is + or - (determined by argument isign).

scale is a scalar value.

For scale = 1 and isign being positive, you obtain the discrete Fourier transform. For scale = 1/((n1)(n2)(n3)) and isign being negative, you obtain the inverse Fourier transform.

See references [1] and [3].
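The two scale conventions described above can be checked on a small case by evaluating the transform directly from its definition. The Python sketch below is for illustration only and is not how the library computes the result: dft3 is a hypothetical name, the data is held on a single process with no block-plane distribution, and W_n = exp(-2πi/n) is assumed, which agrees with the Example 2 values later in this section.

```python
import cmath
from itertools import product


def dft3(x, n1, n2, n3, isign=1, scale=1.0):
    """Evaluate the three-dimensional transform directly from its definition.

    x[j1][j2][j3] holds the input; the result is indexed y[k1][k2][k3].
    Cost grows as (n1*n2*n3)**2, so this suits small checks only.
    """
    s = 1 if isign > 0 else -1
    y = [[[0j] * n3 for _ in range(n2)] for _ in range(n1)]
    for k1, k2, k3 in product(range(n1), range(n2), range(n3)):
        acc = 0j
        for j1, j2, j3 in product(range(n1), range(n2), range(n3)):
            # Combined exponent of W_n1, W_n2, W_n3 (assumed W_n = exp(-2*pi*i/n)).
            phase = j1 * k1 / n1 + j2 * k2 / n2 + j3 * k3 / n3
            acc += x[j1][j2][j3] * cmath.exp(-2j * cmath.pi * s * phase)
        y[k1][k2][k3] = scale * acc
    return y


# Unit impulse at the origin, n1 = 4, n2 = 4, n3 = 2.
n1, n2, n3 = 4, 4, 2
x = [[[0j] * n3 for _ in range(n2)] for _ in range(n1)]
x[0][0][0] = 1 + 0j

# Forward transform: scale = 1, isign positive.
y = dft3(x, n1, n2, n3, isign=1, scale=1.0)
print(all(abs(v - 1) < 1e-12 for plane in y for row in plane for v in row))

# Inverse transform: scale = 1/(n1*n2*n3), isign negative, recovers x.
x_back = dft3(y, n1, n2, n3, isign=-1, scale=1.0 / (n1 * n2 * n3))
print(abs(x_back[0][0][0] - 1) < 1e-12)
```

The impulse input transforms to an array in which every element is (1.0, 0.0), and applying the inverse settings returns the impulse.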

Table 104. Data Types

X, Y	scale	Subroutine
Short-precision complex	Short-precision real	PSCFT3
Long-precision complex	Long-precision real	PDCFT3

Syntax

Fortran	CALL PSCFT3 | PDCFT3 (`x`, `y`, `n1`, `n2`, `n3`, `isign`, `scale`, `icontxt`, `ip`)
C and C++	pscft3 | pdcft3 (`x`, `y`, `n1`, `n2`, `n3`, `isign`, `scale`, `icontxt`, `ip`);

On Entry

x

is the local array X, containing the three-dimensional data to be transformed that has been block-plane distributed over a 1 × q process grid, where q is the number of processes. (The values of ldx1 and ldx2 are set in the IP array.)

Scope: local

Specified as: an array of (at least) length ldx1 × ldx2 × LOCq(n3), containing numbers of the data type indicated in Table 104. This array must be aligned on a doubleword boundary.

y

See 'On Return'.

n1

is the length of the first dimension of the three-dimensional data of the array to be transformed.

Scope: global

Specified as: a fullword integer; n1 <= 37748736 and must be one of the values listed in Figure 12.

n2

is the length of the second dimension of the three-dimensional data of the array to be transformed.

Scope: global

Specified as: a fullword integer; n2 <= 37748736 and must be one of the values listed in Figure 12.

n3

is the length of the third dimension of the three-dimensional data of the array to be transformed.

Scope: global

Specified as: a fullword integer; n3 <= 37748736 and must be one of the values listed in Figure 12.

isign

controls the direction of the transform, determining the sign, Isign, of the exponent of W_n, where:

If isign is a positive value, Isign = + (transforming time to frequency).

If isign is a negative value, Isign = - (transforming frequency to time).

Scope: global

Specified as: a fullword integer; where isign > 0 or isign < 0.

scale

is the scaling constant scale.

Scope: global

Specified as: a number of the data type indicated in Table 104, where scale > 0.0 or scale < 0.0.

icontxt

is the BLACS context parameter.

Scope: global

Specified as: the fullword integer that was returned by a prior call to BLACS_GRIDINIT or BLACS_GRIDMAP.

ip

is an array of parameters, IP(i), where:

IP(1) indicates whether the default values for ip are used or you set the values for ip.
If IP(1) = 0, then the following default values are used:
- y is returned in transposed form; that is, global Y has dimensions n3 × n2 × n1.
- ldx1 and ldx2, the leading dimensions of the array specified for X, equal n1 and n2, respectively.
- ldy1 and ldy2, the leading dimensions of the array specified for Y, equal n3 and n2, respectively.
The remaining parameters of the array IP are ignored.
If IP(1) <> 0, then you set the remaining values of ip to indicate whether y is stored in normal or transposed form, and indicate values for the leading dimensions.
IP(2) indicates whether y is to be stored in normal or transposed form.
If IP(2)=0, then y is to be stored in transposed form on output.
If IP(2)=1, then y is to be stored in normal form on output.
IP(3-19) are reserved.
IP(20) indicates the value of the leading dimension, ldx1, of the array specified for X, where:
If IP(20) = 0, then ldx1 = n1.
If IP(20) <> 0, then ldx1 is this value of IP(20).
IP(21) indicates the value of the leading dimension, ldx2, of the array specified for X, where:
If IP(21) = 0, then ldx2 = n2.
If IP(21) <> 0, then ldx2 is this value of IP(21).
IP(22) indicates the value of the leading dimension, ldy1, of the array specified for Y, where:
If IP(22) = 0 and IP(2) = 1, then ldy1 = n1.
If IP(22) = 0 and IP(2) = 0, then ldy1 = n3.
If IP(22) <> 0, then ldy1 is this value of IP(22).
IP(23) indicates the value of the leading dimension, ldy2, of the array specified for Y, where:
If IP(23) = 0, then ldy2 = n2.
If IP(23) <> 0, then ldy2 is this value of IP(23).
IP(24-40) are reserved.

Scope: global

Specified as: a one-dimensional array of (at least) length 40, containing fullword integers, where:

IP(1) is any integer

IP(2) = 0 or 1

IP(20) >= n1 or IP(20)=0

IP(21) >= n2 or IP(21)=0

IP(22) >= n1 (for normal form) or IP(22)=0

IP(22) >= n3 (for transposed form) or IP(22) = 0

IP(23) >= n2 or IP(23)=0

On Return

y

is the local array Y that is block-plane distributed and contains the results of the computation, where:

If IP(1) = 0, the local array Y is stored in transposed form and has dimensions n3 × n2 × LOCq(n1).

If IP(1) <> 0 and IP(2)=0, then the local array Y is stored in transposed form and has dimensions ldy1 × ldy2 × LOCq(n1).

If IP(1) <> 0 and IP(2)=1, then the local array Y is stored in normal form and has dimensions ldy1 × ldy2 × LOCq(n3).

Scope: local

Returned as: an ldy1 × ldy2 × LOCq(n3) array (for normal form) or an ldy1 × ldy2 × LOCq(n1) array (for transposed form), containing the numbers of the data type indicated in Table 104. This array must be aligned on a doubleword boundary.
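The interaction between IP(1), IP(2), and IP(20) through IP(23) determines both the storage form of Y and its leading dimensions. The Python sketch below summarizes those rules for quick reference. It is illustrative only: resolve_cft3 is a hypothetical helper, it performs no validation, and it uses 0-based indexing (ip[0] is IP(1), ip[19] is IP(20), and so on).

```python
def resolve_cft3(ip, n1, n2, n3):
    """Summarize how IP selects the form and leading dimensions for the
    complex three-dimensional transform (0-based ip, 1-based IP)."""
    if ip[0] == 0:
        # Defaults: transposed output with the minimum leading dimensions.
        return {"form": "transposed", "ldx1": n1, "ldx2": n2,
                "ldy1": n3, "ldy2": n2, "distributed_length": n1}
    normal = ip[1] == 1                     # IP(2): 1 = normal, 0 = transposed
    return {
        "form": "normal" if normal else "transposed",
        "ldx1": ip[19] or n1,               # IP(20), 0 means n1
        "ldx2": ip[20] or n2,               # IP(21), 0 means n2
        "ldy1": ip[21] or (n1 if normal else n3),   # IP(22)
        "ldy2": ip[22] or n2,               # IP(23), 0 means n2
        # Local Y has dimensions ldy1 x ldy2 x LOCq(distributed_length).
        "distributed_length": n3 if normal else n1,
    }


# All defaults with n1 = 4, n2 = 2, n3 = 6.
print(resolve_cft3([0] * 40, 4, 2, 6))

# Normal form with explicit leading dimensions, n1 = 4, n2 = 4, n3 = 2.
ip = [0] * 40
ip[0], ip[1] = 1, 1
ip[19] = ip[20] = ip[21] = ip[22] = 4
print(resolve_cft3(ip, 4, 4, 2))
```

Note that the distributed dimension changes with the form: n3 for normal form and n1 for transposed form, which is why the two cases can place different amounts of data on each process.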

Notes and Coding Rules

For the output array Y, these subroutines may use any extra space available when ldy1 and ldy2 are greater than their minimum values.
You may specify the same array for X and Y. In this case, output overwrites input. If you specify different arrays X and Y, they must have no common elements; otherwise, results are unpredictable.
For more information on LOCq(_) and how sequences are block-plane distributed, see "Three-Dimensional Sequences".
In general, distributing your data evenly provides the best work load balance among the processes and allows the use of the most efficient collective communication. However, for your specific problem size and number of processes available, experimentation is necessary to achieve optimal performance.

Error Conditions

Computational Errors

None

Resource Errors

Unable to allocate work space.

Input-Argument and Miscellaneous Errors

Stage 1

icontxt is invalid

Stage 2

Process grid is not 1 × q
The subroutine was called from outside the process grid.

Stage 3

n1 > 37748736
n2 > 37748736
n3 > 37748736
The length of n1, n2, or n3 is not an allowable transform length.
isign = 0
scale = 0.0
IP(1) <> 0 and IP(2) is neither 0 nor 1
IP(1) <> 0 and IP(20) <> 0 and IP(20) < n1 (that is, ldx1 < n1)
IP(1) <> 0 and IP(21) <> 0 and IP(21) < n2 (that is, ldx2 < n2)
IP(1) <> 0 and IP(2)=0 (for transpose mode) and IP(22) <> 0 and IP(22) < n3 (that is, ldy1 < n3)
IP(1) <> 0 and IP(2)=1 (for normal mode) and IP(22) <> 0 and IP(22) < n1 (that is, ldy1 < n1)
IP(1) <> 0 and IP(23) <> 0 and IP(23) < n2 (that is, ldy2 < n2)
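A calling program may want to test the same conditions before making the call. The Python sketch below mirrors the Stage 3 list above, except for the allowable-length test, which it leaves out. The function name stage3_errors_cft3 is hypothetical, and ip is 0-based (ip[0] is IP(1)).

```python
MAX_LENGTH = 37748736


def stage3_errors_cft3(n1, n2, n3, isign, scale, ip):
    """Return the Stage 3 conditions (as strings) that the arguments violate.

    The allowable-transform-length test is intentionally not included.
    """
    errors = []
    for name, n in (("n1", n1), ("n2", n2), ("n3", n3)):
        if n > MAX_LENGTH:
            errors.append(f"{name} > {MAX_LENGTH}")
    if isign == 0:
        errors.append("isign = 0")
    if scale == 0.0:
        errors.append("scale = 0.0")
    if ip[0] != 0:
        if ip[1] not in (0, 1):
            errors.append("IP(2) is neither 0 nor 1")
        if ip[19] != 0 and ip[19] < n1:
            errors.append("ldx1 < n1")
        if ip[20] != 0 and ip[20] < n2:
            errors.append("ldx2 < n2")
        if ip[1] == 0 and ip[21] != 0 and ip[21] < n3:
            errors.append("ldy1 < n3")      # transposed form
        if ip[1] == 1 and ip[21] != 0 and ip[21] < n1:
            errors.append("ldy1 < n1")      # normal form
        if ip[22] != 0 and ip[22] < n2:
            errors.append("ldy2 < n2")
    return errors


ip = [0] * 40
ip[0], ip[1], ip[21] = 1, 1, 3              # normal form, ldy1 too small for n1 = 4
print(stage3_errors_cft3(4, 4, 2, 1, 1.0, ip))
```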

Example 1

This example shows how to compute a three-dimensional transform. The data is block-plane distributed over a 1 × 2 process grid. The arrays are declared as follows:

  COMPLEX*16 X(0:3,0:3,0:0)
  COMPLEX*16 Y(0:3,0:3,0:0)
  INTEGER*4 IP(40)
  REAL*8 SCALE

Call Statements and Input

ORDER = 'R'
NPROW = 1
NPCOL = 2
CALL BLACS_GET(0, 0, ICONTXT)
CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
IP(1) = 1
IP(2) = 1
IP(20) = 4
IP(21) = 4
IP(22) = 4
IP(23) = 4
 
             X   Y   N1  N2  N3  ISIGN  SCALE   ICONTXT   IP
             |   |   |   |   |     |      |        |      |
CALL PDCFT3( X , Y , 4 , 4 , 2  ,  1  , 1.0D0 , ICONTXT , IP)

Global matrix X:

Plane 0:

B,D                          0
     *                                                *
     |  (1.0,0.0)   (0.0,0.0)   (0.0,0.0)   (0.0,0.0) |
     |  (0.0,0.0)   (0.0,0.0)   (0.0,0.0)   (0.0,0.0) |
 0   |  (0.0,0.0)   (0.0,0.0)   (0.0,0.0)   (0.0,0.0) |
     |  (0.0,0.0)   (0.0,0.0)   (0.0,0.0)   (0.0,0.0) |
     *                                                *

Plane 1:

B,D                          1
     *                                                *
     |  (0.0,0.0)   (0.0,0.0)   (0.0,0.0)   (0.0,0.0) |
     |  (0.0,0.0)   (0.0,0.0)   (0.0,0.0)   (0.0,0.0) |
 0   |  (0.0,0.0)   (0.0,0.0)   (0.0,0.0)   (0.0,0.0) |
     |  (0.0,0.0)   (0.0,0.0)   (0.0,0.0)   (0.0,0.0) |
     *                                                *

The following is the 1 × 2 process grid:

B,D 0 1
0 P₀₀ P₀₁

Local arrays for X:

p,q  |                      0                       |                       1
-----|----------------------------------------------|----------------------------------------------
     |  (1.0,0.0)  (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  (0.0,0.0)
     |  (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  (0.0,0.0)
 0   |  (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  (0.0,0.0)
     |  (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  |   (0.0,0.0)  (0.0,0.0)  (0.0,0.0)  (0.0,0.0)

Output

Global matrix Y:

Plane 0:

B,D                          0
     *                                                *
     |  (1.0,0.0)   (1.0,0.0)   (1.0,0.0)   (1.0,0.0) |
     |  (1.0,0.0)   (1.0,0.0)   (1.0,0.0)   (1.0,0.0) |
 0   |  (1.0,0.0)   (1.0,0.0)   (1.0,0.0)   (1.0,0.0) |
     |  (1.0,0.0)   (1.0,0.0)   (1.0,0.0)   (1.0,0.0) |
     *                                                *

Plane 1:

B,D                          1
     *                                                *
     |  (1.0,0.0)   (1.0,0.0)   (1.0,0.0)   (1.0,0.0) |
     |  (1.0,0.0)   (1.0,0.0)   (1.0,0.0)   (1.0,0.0) |
 0   |  (1.0,0.0)   (1.0,0.0)   (1.0,0.0)   (1.0,0.0) |
     |  (1.0,0.0)   (1.0,0.0)   (1.0,0.0)   (1.0,0.0) |
     *                                                *

The following is the 1 × 2 process grid:

B,D 0 1
0 P₀₀ P₀₁

Local arrays for Y:

p,q  |                      0                       |                       1
-----|----------------------------------------------|----------------------------------------------
     |  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)
     |  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)
 0   |  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)
     |  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)

Example 2

This example shows how to compute a three-dimensional transform. In this example, IP(1) is set to 0, so the default values are used: array Y is returned in transposed form, ldx1 = n1, ldx2 = n2, ldy1 = n3, and ldy2 = n2. This is an example of uneven block-plane distribution over a 1 × 3 process grid. The arrays are declared as follows:

  COMPLEX*16 X(0:3,0:1,0:1), Y(0:5,0:1,0:1)
  INTEGER*4  IP(40)
  REAL*8     SCALE

Call Statements and Input

ORDER = 'R'
NPROW = 1
NPCOL = 3
CALL BLACS_GET(0, 0, ICONTXT)
CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
IP(1) = 0
 
             X   Y   N1  N2  N3  ISIGN     SCALE     ICONTXT   IP
             |   |   |   |   |     |         |          |      |
CALL PDCFT3( X , Y , 4 , 2 , 6  ,  1  , 1.0D0/8.0D0, ICONTXT , IP)

Global matrix X:

          Plane 0:                 Plane 1:
----------------------------------------------------
B,D                          0
----------------------------------------------------
     *                                             *
     | (4.9,3.9)  (5.9,3.1) | (2.2,5.1)  (5.9,1.5) |
0    | (6.8,4.6)  (6.7,9.8) | (2.1,0.5)  (3.5,2.9) |
     | (4.9,1.9)  (1.6,4.9) | (7.0,6.8)  (6.4,1.1) |
     | (2.9,7.6)  (7.5,5.5) | (0.6,4.6)  (9.0,7.6) |
     *                                             *
 
          Plane 2:                 Plane 3:
----------------------------------------------------
B,D                         1
----------------------------------------------------
     *                                             *
     | (8.3,2.3)  (7.3,7.3) | (4.6,5.9)  (3.3,3.0) |
0    | (4.5,8.9)  (0.2,0.9) | (3.6,5.0)  (3.4,0.4) |
     | (1.8,6.5)  (2.5,1.7) | (8.5,9.3)  (0.2,1.7) |
     | (6.4,5.2)  (7.8,5.3) | (8.7,9.1)  (9.1,5.5) |
     *                                             *
 
          Plane 4:                 Plane 5:
----------------------------------------------------
B,D                         2
----------------------------------------------------
     *                                             *
     | (1.0,1.0)  (2.2,8.2) | (5.4,3.6)  (4.8,4.0) |
0    | (8.8,1.0)  (8.9,5.8) | (2.0,2.6)  (5.2,9.5) |
     | (3.3,0.0)  (8.9,7.5) | (3.1,9.7)  (5.4,7.5) |
     | (8.7,8.3)  (6.3,5.4) | (1.5,9.6)  (4.4,4.4) |
     *                                             *

The following is the 1 × 3 process grid:

B,D 0 1 2
0 P₀₀ P₀₁ P₀₂

Local arrays for X:

p,q|                   0                    |                   1                    |                   2
---|----------------------------------------|----------------------------------------|---------------------------------------
   |(4.9,3.9) (5.9,3.1) (2.2,5.1) (5.9,1.5) |(8.3,2.3) (7.3,7.3) (4.6,5.9) (3.3,3.0) |(1.0,1.0) (2.2,8.2) (5.4,3.6) (4.8,4.0)
   |(6.8,4.6) (6.7,9.8) (2.1,0.5) (3.5,2.9) |(4.5,8.9) (0.2,0.9) (3.6,5.0) (3.4,0.4) |(8.8,1.0) (8.9,5.8) (2.0,2.6) (5.2,9.5)
 0 |(4.9,1.9) (1.6,4.9) (7.0,6.8) (6.4,1.1) |(1.8,6.5) (2.5,1.7) (8.5,9.3) (0.2,1.7) |(3.3,0.0) (8.9,7.5) (3.1,9.7) (5.4,7.5)
   |(2.9,7.6) (7.5,5.5) (0.6,4.6) (9.0,7.6) |(6.4,5.2) (7.8,5.3) (8.7,9.1) (9.1,5.5) |(8.7,8.3) (6.3,5.4) (1.5,9.6) (4.4,4.4)

Output

Global matrix Y:

          Plane 0:                    Plane 1:
--------------------------------------------------------
B,D                           0
--------------------------------------------------------
     *                                                   *
     | (29.8,29.7)  (-1.9,1.1) |  (-3.0,0.9) (-3.0,-3.8) |
     |  (-3.3,1.0)  (0.4,-2.5) |  (4.1,-3.9) (-4.6,-1.4) |
0    | (-1.7,-1.2)   (0.1,3.4) |   (0.9,5.0)  (-0.6,1.6) |
     |  (2.3,-0.5)  (1.0,-4.6) |   (3.1,0.8)   (1.7,0.4) |
     |   (3.0,1.9)   (4.5,0.6) |  (0.5,-3.8) (-3.0,-0.1) |
     |   (1.0,0.1) (-5.7,-1.9) | (-1.4,-1.2)   (0.8,2.6) |
     *                                                   *
 
          Plane 2:                      Plane 3:
--------------------------------------------------------
B,D                            1
--------------------------------------------------------
     *                                                   *
     | (-2.4,-2.8)   (2.0,0.1) |  (3.6,-3.4)   (1.4,0.0) |
     |  (2.1,-3.5)   (2.8,1.3) | (-1.9,-0.1)   (2.3,6.3) |
0    |  (-1.7,0.7)   (2.8,1.0) |   (2.0,1.4)  (-0.6,1.0) |
     | (-3.3,-2.2) (-3.2,-5.1) |  (-0.3,3.3)   (0.8,0.5) |
     | (-1.5,-3.1)   (1.4,0.1) | (-1.3,-1.7)  (-2.7,0.7) |
     |   (1.8,0.6)  (-0.7,3.2) |   (0.2,2.9)  (1.0,-2.1) |
     *                                                   *

The following is the 1 × 3 process grid:

B,D 0 1 2
0 P₀₀ P₀₁ P₀₂

Local arrays for Y:

p,q  |                      0                          |                       1
-----|-------------------------------------------------|-----------------------------------------------
     | (29.8,29.7)  (-1.9,1.1)  (-3.0,0.9) (-3.0,-3.8) | (-2.4,-2.8)   (2.0,0.1)  (3.6,-3.4)  (1.4,0.0)
     |  (-3.3,1.0)  (0.4,-2.5)  (4.1,-3.9) (-4.6,-1.4) |  (2.1,-3.5)   (2.8,1.3) (-1.9,-0.1)  (2.3,6.3)
 0   | (-1.7,-1.2)   (0.1,3.4)   (0.9,5.0)  (-0.6,1.6) |  (-1.7,0.7)   (2.8,1.0)   (2.0,1.4) (-0.6,1.0)
     |  (2.3,-0.5)  (1.0,-4.6)   (3.1,0.8)   (1.7,0.4) | (-3.3,-2.2) (-3.2,-5.1)  (-0.3,3.3)  (0.8,0.5)
     |   (3.0,1.9)   (4.5,0.6)  (0.5,-3.8) (-3.0,-0.1) | (-1.5,-3.1)   (1.4,0.1) (-1.3,-1.7) (-2.7,0.7)
     |   (1.0,0.1) (-5.7,-1.9) (-1.4,-1.2)   (0.8,2.6) |   (1.8,0.6)  (-0.7,3.2)   (0.2,2.9) (1.0,-2.1)

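No data is located on P₀₂: the four planes of the transposed array Y (one for each value of k1, because n1 = 4) are held two each by P₀₀ and P₀₁.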
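A few of the output values in this example can be reproduced by evaluating the transform definition directly on the global input. The Python sketch below is a check under stated assumptions, not the library's algorithm: it assumes W_n = exp(-2πi/n), reads the transposed output so that plane k1, row k3, column k2 of global Y holds y(k1,k2,k3), and uses the hypothetical names PLANES and y_at.

```python
import cmath

# Global X from this example as PLANES[j3][j1][j2] = (re, im); n1 = 4, n2 = 2, n3 = 6.
PLANES = [
    [[(4.9, 3.9), (5.9, 3.1)], [(6.8, 4.6), (6.7, 9.8)],
     [(4.9, 1.9), (1.6, 4.9)], [(2.9, 7.6), (7.5, 5.5)]],
    [[(2.2, 5.1), (5.9, 1.5)], [(2.1, 0.5), (3.5, 2.9)],
     [(7.0, 6.8), (6.4, 1.1)], [(0.6, 4.6), (9.0, 7.6)]],
    [[(8.3, 2.3), (7.3, 7.3)], [(4.5, 8.9), (0.2, 0.9)],
     [(1.8, 6.5), (2.5, 1.7)], [(6.4, 5.2), (7.8, 5.3)]],
    [[(4.6, 5.9), (3.3, 3.0)], [(3.6, 5.0), (3.4, 0.4)],
     [(8.5, 9.3), (0.2, 1.7)], [(8.7, 9.1), (9.1, 5.5)]],
    [[(1.0, 1.0), (2.2, 8.2)], [(8.8, 1.0), (8.9, 5.8)],
     [(3.3, 0.0), (8.9, 7.5)], [(8.7, 8.3), (6.3, 5.4)]],
    [[(5.4, 3.6), (4.8, 4.0)], [(2.0, 2.6), (5.2, 9.5)],
     [(3.1, 9.7), (5.4, 7.5)], [(1.5, 9.6), (4.4, 4.4)]],
]
N1, N2, N3 = 4, 2, 6
SCALE = 1.0 / 8.0      # as in the call statement
ISIGN = 1


def y_at(k1, k2, k3):
    """Evaluate one output element y(k1,k2,k3) from the definition."""
    acc = 0j
    for j3 in range(N3):
        for j1 in range(N1):
            for j2 in range(N2):
                re, im = PLANES[j3][j1][j2]
                phase = j1 * k1 / N1 + j2 * k2 / N2 + j3 * k3 / N3
                acc += complex(re, im) * cmath.exp(-2j * cmath.pi * ISIGN * phase)
    return SCALE * acc


# Compare with the printed Y: plane 0 row 0, plane 1 row 0, plane 0 row 1 (column 0).
for k in ((0, 0, 0), (1, 0, 0), (0, 0, 1)):
    v = y_at(*k)
    print(k, round(v.real, 2), round(v.imag, 2))
```

The three elements agree, to the one decimal place printed, with (29.8,29.7), (-3.0,0.9), and (-3.3,1.0) in the output above.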

PSRCFT3 and PDRCFT3--Real-to-Complex Fourier Transforms in Three Dimensions

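These subroutines compute the mixed-radix three-dimensional complex conjugate even discrete Fourier transform of real data:

$$y_{k1,k2,k3} = \mathit{scale} \sum_{j1=0}^{n1-1} \sum_{j2=0}^{n2-1} \sum_{j3=0}^{n3-1} x_{j1,j2,j3} \, W_{n1}^{(\mathit{Isign})\, j1\, k1} \, W_{n2}^{(\mathit{Isign})\, j2\, k2} \, W_{n3}^{(\mathit{Isign})\, j3\, k3}$$

for:

k1 = 0, 1, ..., n1-1

k2 = 0, 1, ..., n2-1

k3 = 0, 1, ..., n3-1

where:

$$W_n = e^{-2\pi i / n}$$

and where: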

x_j1,j2,j3 are elements of array X.

y_k1,k2,k3 are elements of array Y.

Isign is + or - (determined by argument isign).

scale is a scalar value.

See references [1] and [3].
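The term "complex conjugate even" means that the transform of real data satisfies y(k1,k2,k3) = conj(y(-k1,-k2,-k3)), with indices taken modulo n1, n2, and n3, so roughly half of the output determines the rest. The Python sketch below illustrates that property on the full, unpacked result. It is a check only: dft3 and is_conjugate_even are hypothetical names, W_n = exp(-2πi/n) is assumed, and FFT-packed storage is not modeled.

```python
import cmath
from itertools import product


def dft3(x, n1, n2, n3, isign=1, scale=1.0):
    """Direct evaluation of the three-dimensional transform (small sizes only)."""
    s = 1 if isign > 0 else -1
    y = [[[0j] * n3 for _ in range(n2)] for _ in range(n1)]
    for k1, k2, k3 in product(range(n1), range(n2), range(n3)):
        acc = 0j
        for j1, j2, j3 in product(range(n1), range(n2), range(n3)):
            phase = j1 * k1 / n1 + j2 * k2 / n2 + j3 * k3 / n3
            acc += x[j1][j2][j3] * cmath.exp(-2j * cmath.pi * s * phase)
        y[k1][k2][k3] = scale * acc
    return y


def is_conjugate_even(y, n1, n2, n3, tol=1e-9):
    """True if every element equals the conjugate of the element at the
    negated indices (taken modulo n1, n2, n3)."""
    return all(
        abs(y[k1][k2][k3] - y[-k1 % n1][-k2 % n2][-k3 % n3].conjugate()) <= tol
        for k1, k2, k3 in product(range(n1), range(n2), range(n3))
    )


# Arbitrary real test data, n1 = 4, n2 = 2, n3 = 2.
n1, n2, n3 = 4, 2, 2
x_real = [[[float(1 + j1 + 2 * j2 + 3 * j1 * j3) for j3 in range(n3)]
           for j2 in range(n2)] for j1 in range(n1)]
y = dft3(x_real, n1, n2, n3)
print(is_conjugate_even(y, n1, n2, n3))
```

This symmetry is the reason the output array needs only LOCq(n1/2) planes rather than a full set of n1.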

Table 105. Data Types

X, scale	Y	Subroutine
Short-precision real	Short-precision complex	PSRCFT3
Long-precision real	Long-precision complex	PDRCFT3

Syntax

Fortran	CALL PSRCFT3 | PDRCFT3 (`x`, `y`, `n1`, `n2`, `n3`, `isign`, `scale`, `icontxt`, `ip`)
C and C++	psrcft3 | pdrcft3 (`x`, `y`, `n1`, `n2`, `n3`, `isign`, `scale`, `icontxt`, `ip`);

On Entry

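x

is the local array X, containing the three-dimensional data to be transformed that has been block-plane distributed over a 1 × q process grid, where q is the number of processes. (The values of ldx1 and ldx2 are set in the IP array.)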

Scope: local

Specified as: an array of (at least) length ldx1 × ldx2 × LOCq(n3), containing numbers of the data type indicated in Table 105. This array must be aligned on a doubleword boundary.

y

See 'On Return'.

n1

is the length of the first dimension of the three-dimensional data in the array to be transformed.

Scope: global

Specified as: a fullword integer; n1 <= 37748736 and must be one of the values listed in Figure 12.

n2

is the length of the second dimension of the three-dimensional data in the array to be transformed.

Scope: global

Specified as: a fullword integer; n2 <= 37748736 and must be one of the values listed in Figure 12.

n3

is the length of the third dimension of the three-dimensional data in the array to be transformed.

Scope: global

Specified as: a fullword integer; n3 <= 37748736 and must be one of the values listed in Figure 12.

isign

controls the direction of the transform, determining the sign, Isign, of the exponent of W_n, where:

If isign is a positive value, Isign = + (transforming time to frequency).

If isign is a negative value, Isign = - (transforming frequency to time).

Scope: global

Specified as: a fullword integer; where isign > 0 or isign < 0.

scale

is the scaling constant scale.

Scope: global

Specified as: a number of the data type indicated in Table 105, where scale > 0.0 or scale < 0.0.

icontxt

is the BLACS context parameter.

Scope: global

Specified as: the fullword integer that was returned by a prior call to BLACS_GRIDINIT or BLACS_GRIDMAP.

ip

is an array of parameters, IP(i), where:

IP(1) indicates whether the default values for ip are used or you set the values for ip.
If IP(1) = 0, then the following default values are used:
- ldx1 and ldx2, the leading dimensions of the array specified for X, equal n1 and n2, respectively.
- ldy1 and ldy2, the leading dimensions of the array specified for Y, equal n3 and n2, respectively.
The remaining parameters of the array IP are ignored.
If IP(1) <> 0, then you set the remaining values of ip to indicate values for the leading dimensions.
IP(2-19) are reserved.
IP(20) indicates the value of the leading dimension, ldx1, of the array specified for X, where:
If IP(20) = 0, then ldx1 = n1.
If IP(20) <> 0, then ldx1 is this value of IP(20).
IP(21) indicates the value of the leading dimension, ldx2, of the array specified for X, where:
If IP(21) = 0, then ldx2 = n2.
If IP(21) <> 0, then ldx2 is this value of IP(21).
IP(22) indicates the value of the leading dimension, ldy1, of the array specified for Y, where:
If IP(22) = 0, then ldy1 = n3.
If IP(22) <> 0, then ldy1 is this value of IP(22).
IP(23) indicates the value of the leading dimension, ldy2, of the array specified for Y, where:
If IP(23) = 0, then ldy2 = n2.
If IP(23) <> 0, then ldy2 is this value of IP(23).
IP(24-40) are reserved.

Scope: global

Specified as: a one-dimensional array of (at least) length 40, containing fullword integers, where:

IP(1) is any integer

IP(20) >= n1 or IP(20) = 0

IP(21) >= n2 or IP(21) = 0

IP(22) >= n3 or IP(22) = 0

IP(23) >= n2 or IP(23) = 0

On Return

y

is the local array Y, stored in FFT-packed storage mode, containing the results of the computation that are block-plane distributed, where:

If IP(1) = 0, the local array Y has dimensions n3 × n2 × LOCq(n1/2).

If IP(1) <> 0, the local array Y has dimensions ldy1 × ldy2 × LOCq(n1/2).

Scope: local

Returned as: an ldy1 × ldy2 × LOCq(n1/2) array, containing the numbers of the data type indicated in Table 105. This array must be aligned on a doubleword boundary.

Notes and Coding Rules

These subroutines always return Y in transposed form.
For the output array Y, these subroutines may use any extra space available when ldy1 and ldy2 are greater than their minimum values.
You may specify the same array for X and Y. In this case, output overwrites input. If you specify different arrays X and Y, they must have no common elements; otherwise, results are unpredictable.
For more information on LOCq(_), and how sequences are block-plane distributed and stored in FFT-packed storage mode, see "Three-Dimensional Sequences".
In general, distributing your data evenly provides the best work load balance among the processes and allows the use of the most efficient collective communication. However, for your specific problem size and number of processes available, experimentation is necessary to achieve optimal performance.

Error Conditions

Computational Errors

None

Resource Errors

Unable to allocate work space.

Input-Argument and Miscellaneous Errors

Stage 1

icontxt is invalid

Stage 2

Process grid is not 1 × q
The subroutine was called from outside the process grid.

Stage 3

n1 > 37748736
n2 > 37748736
n3 > 37748736
The length of n1, n2, or n3 is not an allowable transform length.
isign = 0
scale = 0.0
IP(1) <> 0 and IP(20) <> 0 and IP(20) < n1 (that is, ldx1 < n1)
IP(1) <> 0 and IP(21) <> 0 and IP(21) < n2 (that is, ldx2 < n2)
IP(1) <> 0 and IP(22) <> 0 and IP(22) < n3 (that is, ldy1 < n3)
IP(1) <> 0 and IP(23) <> 0 and IP(23) < n2 (that is, ldy2 < n2)

Example

This example shows how to compute a three-dimensional transform. The data is block-plane distributed over a 1 × 2 process grid. The arrays are declared as follows:

  REAL*8 X(0:8,0:3,0:1)
  COMPLEX*16 Y(0:4,0:3,0:0)
  INTEGER*4 IP(40)
  REAL*8 SCALE

Call Statements and Input

ORDER = 'R'
NPROW = 1
NPCOL = 2
CALL BLACS_GET(0, 0, ICONTXT)
CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
IP(1) = 1
IP(20) = 9
IP(21) = 4
IP(22) = 5
IP(23) = 4
 
              X   Y   N1  N2  N3  ISIGN  SCALE   ICONTXT   IP
              |   |   |   |   |     |      |        |      |
CALL PDRCFT3( X , Y , 4 , 4 , 4  ,  1  , 1.0D0 , ICONTXT , IP)

Global matrix X:

          Plane 0:                Plane 1:
----------------------------------------------------
B,D                         0
----------------------------------------------------
     *                                             *
     |  1.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0 |
     |  0.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0 |
     |  0.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0 |
     |  0.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0 |
 0   |   .    .    .    .   |    .    .    .    .  |
     |   .    .    .    .   |    .    .    .    .  |
     |   .    .    .    .   |    .    .    .    .  |
     |   .    .    .    .   |    .    .    .    .  |
     |   .    .    .    .   |    .    .    .    .  |
     *                                             *
 
          Plane 2:                Plane 3:
----------------------------------------------------
B,D                         1
----------------------------------------------------
     *                                             *
     |  0.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0 |
     |  0.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0 |
     |  0.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0 |
     |  0.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0 |
 0   |   .    .    .    .   |    .    .    .    .  |
     |   .    .    .    .   |    .    .    .    .  |
     |   .    .    .    .   |    .    .    .    .  |
     |   .    .    .    .   |    .    .    .    .  |
     |   .    .    .    .   |    .    .    .    .  |
     *                                             *

The following is the 1 × 2 process grid:

B,D 0 1
0 P₀₀ P₀₁

Local arrays for X:

p,q  |                    0                     |                     1
-----|------------------------------------------|------------------------------------------
     |  1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
     |  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
     |  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
     |  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0   |   .    .    .    .    .    .    .    .   |    .    .    .    .    .    .    .    .
     |   .    .    .    .    .    .    .    .   |    .    .    .    .    .    .    .    .
     |   .    .    .    .    .    .    .    .   |    .    .    .    .    .    .    .    .
     |   .    .    .    .    .    .    .    .   |    .    .    .    .    .    .    .    .
     |   .    .    .    .    .    .    .    .   |    .    .    .    .    .    .    .    .

Output

The following global matrix Y is returned in transposed form and stored in FFT-packed storage mode:

Plane 0:

B,D                          0
     *                                                *
     |  (1.0,1.0)   (1.0,0.0)   (1.0,1.0)   (1.0,0.0) |
     |  (1.0,0.0)   (1.0,0.0)   (1.0,0.0)   (1.0,0.0) |
 0   |  (1.0,1.0)   (1.0,0.0)   (1.0,1.0)   (1.0,0.0) |
     |  (1.0,0.0)   (1.0,0.0)   (1.0,0.0)   (1.0,0.0) |
     |      .           .           .           .     |
     *                                                *

Plane 1:

B,D                          1
     *                                                *
     |  (1.0,0.0)   (1.0,0.0)   (1.0,0.0)   (1.0,0.0) |
     |  (1.0,0.0)   (1.0,0.0)   (1.0,0.0)   (1.0,0.0) |
 0   |  (1.0,0.0)   (1.0,0.0)   (1.0,0.0)   (1.0,0.0) |
     |  (1.0,0.0)   (1.0,0.0)   (1.0,0.0)   (1.0,0.0) |
     |      .           .           .           .     |
     *                                                *

The following is the 1 × 2 process grid:

B,D 0 1
0 P₀₀ P₀₁

The following local arrays for Y are returned in transposed form and stored in FFT-packed storage mode:

p,q  |                      0                       |                       1
-----|----------------------------------------------|----------------------------------------------
     |  (1.0,1.0)  (1.0,0.0)  (1.0,1.0)  (1.0,0.0)  |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)
     |  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)
 0   |  (1.0,1.0)  (1.0,0.0)  (1.0,1.0)  (1.0,0.0)  |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)
     |  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)
     |      .          .          .          .      |       .          .          .          .

PSCRFT3 and PDCRFT3--Complex-to-Real Fourier Transforms in Three Dimensions

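These subroutines compute the mixed-radix three-dimensional real discrete Fourier transform of complex conjugate even data:

$$y_{k_1,k_2,k_3} = scale \sum_{j_1=0}^{n_1-1} \sum_{j_2=0}^{n_2-1} \sum_{j_3=0}^{n_3-1} x_{j_1,j_2,j_3} \, W_{n_1}^{(Isign) j_1 k_1} \, W_{n_2}^{(Isign) j_2 k_2} \, W_{n_3}^{(Isign) j_3 k_3}$$

for:

k1 = 0, 1, ..., n1-1

k2 = 0, 1, ..., n2-1

k3 = 0, 1, ..., n3-1

where:

$$W_n = e^{-2\pi i / n}$$

and where: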

x_j1,j2,j3 are elements of array X.

y_k1,k2,k3 are elements of array Y.

Isign is + or - (determined by argument isign).

scale is a scalar value.

See references [1] and [3].
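The definition above can be checked on small inputs with a direct triple sum. The following is a minimal sketch, not the library's algorithm: it assumes the transform has the form y = scale * sum of x * W^(Isign*j*k) over all three dimensions with W_n = exp(-2*pi*i/n), it works on the full (unpacked, undistributed) array, and the name `dft3` is hypothetical.

```python
import cmath

def dft3(x, isign, scale):
    """Direct evaluation of a 3-D DFT on a nested list x[j1][j2][j3].

    Assumes W_n = exp(-2*pi*i/n); Isign is +1 if isign > 0, else -1.
    Cost grows as (n1*n2*n3)**2, so use this for tiny arrays only.
    """
    n1, n2, n3 = len(x), len(x[0]), len(x[0][0])
    sign = 1 if isign > 0 else -1
    y = [[[0j] * n3 for _ in range(n2)] for _ in range(n1)]
    for k1 in range(n1):
        for k2 in range(n2):
            for k3 in range(n3):
                acc = 0j
                for j1 in range(n1):
                    for j2 in range(n2):
                        for j3 in range(n3):
                            frac = j1 * k1 / n1 + j2 * k2 / n2 + j3 * k3 / n3
                            acc += x[j1][j2][j3] * cmath.exp(-2j * cmath.pi * sign * frac)
                y[k1][k2][k3] = scale * acc
    return y

# A 4 x 4 x 4 array of ones is conjugate even, so the result is real
# up to rounding: a single nonzero value at the origin.
ones = [[[1 + 0j] * 4 for _ in range(4)] for _ in range(4)]
y = dft3(ones, isign=-1, scale=1.0 / 64.0)
```

For conjugate even input the imaginary parts of `y` vanish up to rounding, which is why the subroutines can return Y as a real array.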

Table 106. Data Types

X	Y, scale	Subroutine
Short-precision complex	Short-precision real	PSCRFT3
Long-precision complex	Long-precision real	PDCRFT3

Syntax

Fortran	CALL PSCRFT3 | PDCRFT3 (x, y, n1, n2, n3, isign, scale, icontxt, ip)
C and C++	pscrft3 | pdcrft3 (x, y, n1, n2, n3, isign, scale, icontxt, ip);

On Entry

x

Scope: local

Specified as: an array of (at least) length ldx1 × ldx2 × LOCq(n1/2), containing numbers of the data type indicated in Table 106. This array must be aligned on a doubleword boundary.

y

See 'On Return'.

n1

is the length of the first dimension of the three-dimensional data in the array to be transformed.

Scope: global

Specified as: a fullword integer; n1 <= 37748736 and must be one of the values listed in Figure 12.

n2

is the length of the second dimension of the three-dimensional data in the array to be transformed.

Scope: global

Specified as: a fullword integer; n2 <= 37748736 and must be one of the values listed in Figure 12.

n3

is the length of the third dimension of the three-dimensional data in the array to be transformed.

Scope: global

Specified as: a fullword integer; n3 <= 37748736 and must be one of the values listed in Figure 12.
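Whether a candidate length appears in Figure 12 can be tested before the call by applying the formula from "Acceptable Lengths for the Transforms" directly. This is a small sketch of that check; the function name `is_acceptable_length` is hypothetical and is not part of the library.

```python
MAX_LENGTH = 37748736  # upper bound on n1, n2, and n3

def is_acceptable_length(n):
    """Return True if n = 2**h * 3**i * 5**j * 7**k * 11**m with
    h = 1..25, i = 0..2, j, k, m = 0 or 1, and n <= MAX_LENGTH."""
    if n < 2 or n > MAX_LENGTH:
        return False
    # Strip the odd prime factors, each up to its maximum exponent.
    for prime, max_exp in ((3, 2), (5, 1), (7, 1), (11, 1)):
        exp = 0
        while n % prime == 0 and exp < max_exp:
            n //= prime
            exp += 1
    # What remains must be a power of two with exponent 1..25.
    h = 0
    while n % 2 == 0:
        n //= 2
        h += 1
    return n == 1 and 1 <= h <= 25

candidates = [4, 26, 36, 54, 37748736]
acceptable = [n for n in candidates if is_acceptable_length(n)]
```

Here 26 is rejected because 13 is not an allowed factor, and 54 is rejected because it needs three factors of 3.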

isign

controls the direction of the transform, determining the sign, Isign, of the exponent of W_n, where:

If isign = positive value, Isign = + (transforming time to frequency).

If isign = negative value, Isign = - (transforming frequency to time).

Scope: global

Specified as: a fullword integer; where isign > 0 or isign < 0.

scale

is the scaling constant scale.

Scope: global

Specified as: a number of the data type indicated in Table 106, where scale > 0.0 or scale < 0.0.
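The way isign and scale pair up is easiest to see in one dimension. The sketch below assumes the same convention as the transform definition (W_n = exp(-2*pi*i/n), with Isign taken from the sign of isign); `dft1` is a hypothetical helper, not a library routine. A forward transform with scale = 1.0 followed by a reverse transform with scale = 1/n should reproduce the input.

```python
import cmath

def dft1(x, isign, scale):
    """Direct 1-D DFT: y[k] = scale * sum_j x[j] * W_n**(Isign*j*k),
    assuming W_n = exp(-2*pi*i/n)."""
    n = len(x)
    sign = 1 if isign > 0 else -1
    return [
        scale * sum(x[j] * cmath.exp(-2j * cmath.pi * sign * j * k / n)
                    for j in range(n))
        for k in range(n)
    ]

signal = [1.0, 2.0, 0.0, -1.0]
spectrum = dft1(signal, isign=+1, scale=1.0)                   # time to frequency
restored = dft1(spectrum, isign=-1, scale=1.0 / len(signal))   # frequency to time
```

In three dimensions the corresponding reverse scale is 1/(n1*n2*n3), which is the value used in the example for this subroutine.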

icontxt

is the BLACS context parameter.

Scope: global

Specified as: the fullword integer that was returned by a prior call to BLACS_GRIDINIT or BLACS_GRIDMAP.

ip

is an array of parameters, IP(i), where:

IP(1) indicates whether the default values for ip are used or you set the values for ip.
If IP(1) = 0, then the following default values are used:
- ldx1 and ldx2, the leading dimensions of the array specified for X, equal n3 and n2, respectively.
- ldy1 and ldy2, the leading dimensions of the array specified for Y, equal n1 and n2, respectively.
The remaining parameters of the array IP are ignored.
If IP(1) <> 0, then you set the remaining values of ip to indicate values for the leading dimensions.
IP(2-19) are reserved.
IP(20) indicates the value of the leading dimension, ldx1, of the array specified for X, where:
If IP(20) = 0, then ldx1 = n3.
If IP(20) <> 0, then ldx1 is this value of IP(20).
IP(21) indicates the value of the leading dimension, ldx2, of the array specified for X, where:
If IP(21) = 0, then ldx2 = n2.
If IP(21) <> 0, then ldx2 is this value of IP(21).
IP(22) indicates the value of the leading dimension, ldy1, of the array specified for Y, where:
If IP(22) = 0, then ldy1 = n1.
If IP(22) <> 0, then ldy1 is this value of IP(22).
IP(23) indicates the value of the leading dimension, ldy2, of the array specified for Y, where:
If IP(23) = 0, then ldy2 = n2.
If IP(23) <> 0, then ldy2 is this value of IP(23).
IP(24-40) are reserved.

Scope: global

Specified as: a one-dimensional array of (at least) length 40, containing fullword integers, where:

IP(1) is any integer

IP(20) >= n3 or IP(20) = 0

IP(21) >= n2 or IP(21) = 0

IP(22) >= n1 or IP(22) = 0

IP(23) >= n2 or IP(23) = 0
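The rules for IP(1) and IP(20) through IP(23) can be summarized as a small resolver. This is an illustrative sketch only: the function name `resolve_leading_dims` and the dictionary it returns are hypothetical, and the list index is shifted by one because Python lists start at 0 while IP(i) starts at 1.

```python
def resolve_leading_dims(ip, n1, n2, n3):
    """Return the effective ldx1, ldx2, ldy1, ldy2 for a given IP array.

    ip is a list of at least 40 integers; ip[i - 1] holds IP(i).
    """
    if len(ip) < 40:
        raise ValueError("IP must have at least 40 elements")
    defaults = {"ldx1": n3, "ldx2": n2, "ldy1": n1, "ldy2": n2}
    if ip[0] == 0:
        # IP(1) = 0: defaults are used and the rest of IP is ignored.
        return dict(defaults)
    slots = {"ldx1": 20, "ldx2": 21, "ldy1": 22, "ldy2": 23}
    dims = {}
    for name, slot in slots.items():
        value = ip[slot - 1]
        if value == 0:
            dims[name] = defaults[name]
        elif value < defaults[name]:
            # Mirrors the Stage 3 input-argument checks on IP(20)..IP(23).
            raise ValueError(f"IP({slot}) = {value} is below the minimum {defaults[name]}")
        else:
            dims[name] = value
    return dims

ip = [0] * 40
ip[0] = 1                                # IP(1) <> 0: use the values below
ip[19], ip[20], ip[21], ip[22] = 5, 4, 9, 4  # IP(20), IP(21), IP(22), IP(23)
dims = resolve_leading_dims(ip, n1=4, n2=4, n3=4)
```

With these settings the resolver yields ldx1 = 5, ldx2 = 4, ldy1 = 9, and ldy2 = 4.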

On Return

y

is the local array Y that is block-plane distributed and contains the results of the computation, where:

If IP(1) = 0, the local array Y has dimensions n1 × n2 × LOCq(n3).

If IP(1) <> 0, the local array Y has dimensions ldy1 × ldy2 × LOCq(n3).

Scope: local

Returned as: an ldy1 × ldy2 × LOCq(n3) array, containing the numbers of the data type indicated in Table 106. This array must be aligned on a doubleword boundary.

Notes and Coding Rules

These subroutines always return Y in normal form.
For the output array Y, these subroutines may use any extra space available when ldy1 and ldy2 are greater than their minimum values.
You may specify the same array for X and Y. In this case, output overwrites input. If you specify different arrays X and Y, they must have no common elements; otherwise, results are unpredictable.
For more information on LOCq(_), and how sequences are block-plane distributed and stored in FFT-packed storage mode, see "Three-Dimensional Sequences".
In general, distributing your data evenly provides the best work load balance among the processes and allows the use of the most efficient collective communication. However, for your specific problem size and number of processes available, experimentation is necessary to achieve optimal performance.
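To size the local array Y, the number of locally held planes, LOCq(n3), is needed. The sketch below assumes that planes are assigned to the q process columns in contiguous blocks of ceil(n/q) planes; that block size is an assumption made for illustration, and the authoritative definition is the one in "Three-Dimensional Sequences". The names `loc_q` and `local_y_elements` are hypothetical.

```python
def loc_q(n, q, my_col):
    """Number of planes held by process column my_col, ASSUMING a
    contiguous block distribution with ceil(n / q) planes per block."""
    block = -(-n // q)              # integer ceiling of n / q
    start = my_col * block          # first global plane on this column
    return max(0, min(block, n - start))

def local_y_elements(ldy1, ldy2, n3, q, my_col):
    """Minimum number of elements in the local array Y."""
    return ldy1 * ldy2 * loc_q(n3, q, my_col)

# Values taken from a 4 x 4 x 4 transform on a 1 x 2 process grid
# with ldy1 = 9 and ldy2 = 4.
planes_per_process = [loc_q(4, 2, col) for col in range(2)]
elements_on_p00 = local_y_elements(9, 4, 4, 2, 0)
```

Under this assumption each process holds two planes, so a declaration such as Y(0:8,0:3,0:1) provides exactly the 72 elements required.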

Error Conditions

Resource Errors

Unable to allocate work space

Input-Argument and Miscellaneous Errors

Stage 1

icontxt is invalid

Stage 2

Process grid is not 1 × q
The subroutine was called from outside the process grid.

Stage 3

n1 > 37748736
n2 > 37748736
n3 > 37748736
n1, n2, or n3 is not an acceptable transform length (see Figure 12).
isign = 0
scale = 0.0
IP(1) <> 0 and IP(20) <> 0 and IP(20) < n3 (that is, ldx1 < n3)
IP(1) <> 0 and IP(21) <> 0 and IP(21) < n2 (that is, ldx2 < n2)
IP(1) <> 0 and IP(22) <> 0 and IP(22) < n1 (that is, ldy1 < n1)
IP(1) <> 0 and IP(23) <> 0 and IP(23) < n2 (that is, ldy2 < n2)

Example

This example shows how to compute a three-dimensional transform. The data is block-plane distributed over a 1 × 2 process grid. The arrays are declared as follows:

  COMPLEX*16 X(0:4,0:3,0:0)
  REAL*8 Y(0:8,0:3,0:1)
  INTEGER*4 IP(40)
  REAL*8 SCALE

Call Statements and Input

ORDER = 'R'
NPROW = 1
NPCOL = 2
CALL BLACS_GET(0, 0, ICONTXT)
CALL BLACS_GRIDINIT(ICONTXT, ORDER, NPROW, NPCOL)
CALL BLACS_GRIDINFO(ICONTXT, NPROW, NPCOL, MYROW, MYCOL)
SCALE = 1.0D0/(4*4*4)
IP(1) = 1
IP(20) = 5
IP(21) = 4
IP(22) = 9
IP(23) = 4
 
              X   Y   N1  N2  N3  ISIGN      SCALE      ICONTXT   IP
              |   |   |   |   |     |          |           |      |
CALL PDCRFT3( X , Y , 4 , 4 , 4  , -1  , 1.0D0/64.0D0 , ICONTXT , IP)

The following global matrix X is stored in FFT-packed storage mode:

Plane 0:

B,D                         0
     *                                             *
     |  (1.0,1.0)  (1.0,0.0)  (1.0,1.0)  (1.0,0.0) |
     |  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0) |
 0   |  (1.0,1.0)  (1.0,0.0)  (1.0,1.0)  (1.0,0.0) |
     |  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0) |
     |      .          .          .          .     |
     *                                             *

Plane 1:

B,D                         1
     *                                             *
     |  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0) |
     |  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0) |
 0   |  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0) |
     |  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0) |
     |      .          .          .          .     |
     *                                             *

The following is the 1 × 2 process grid:

B,D 0 1
0 P₀₀ P₀₁

The following local arrays for X are stored in FFT-packed storage mode:

p,q  |                      0                       |                       1
-----|----------------------------------------------|----------------------------------------------
     |  (1.0,1.0)  (1.0,0.0)  (1.0,1.0)  (1.0,0.0)  |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)
     |  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)
 0   |  (1.0,1.0)  (1.0,0.0)  (1.0,1.0)  (1.0,0.0)  |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)
     |  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  |   (1.0,0.0)  (1.0,0.0)  (1.0,0.0)  (1.0,0.0)
     |      .          .          .          .      |       .          .          .          .

Output:

Global matrix Y:

          Plane 0:                Plane 1:
----------------------------------------------------
B,D                         0
----------------------------------------------------
     *                                             *
     |  1.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0 |
     |  0.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0 |
     |  0.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0 |
     |  0.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0 |
 0   |   .    .    .    .   |    .    .    .    .  |
     |   .    .    .    .   |    .    .    .    .  |
     |   .    .    .    .   |    .    .    .    .  |
     |   .    .    .    .   |    .    .    .    .  |
     |   .    .    .    .   |    .    .    .    .  |
     *                                             *
 
          Plane 2:                Plane 3:
----------------------------------------------------
B,D                         1
----------------------------------------------------
     *                                             *
     |  0.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0 |
     |  0.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0 |
     |  0.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0 |
     |  0.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0 |
 0   |   .    .    .    .   |    .    .    .    .  |
     |   .    .    .    .   |    .    .    .    .  |
     |   .    .    .    .   |    .    .    .    .  |
     |   .    .    .    .   |    .    .    .    .  |
     |   .    .    .    .   |    .    .    .    .  |
     *                                             *

The following is the 1 × 2 process grid:

B,D 0 1
0 P₀₀ P₀₁

Local arrays for Y:

p,q  |                    0                     |                     1
-----|------------------------------------------|------------------------------------------
     |  1.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
     |  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
     |  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
     |  0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0  |   0.0  0.0  0.0  0.0  0.0  0.0  0.0  0.0
 0   |   .    .    .    .    .    .    .    .   |    .    .    .    .    .    .    .    .
     |   .    .    .    .    .    .    .    .   |    .    .    .    .    .    .    .    .
     |   .    .    .    .    .    .    .    .   |    .    .    .    .    .    .    .    .
     |   .    .    .    .    .    .    .    .   |    .    .    .    .    .    .    .    .
     |   .    .    .    .    .    .    .    .   |    .    .    .    .    .    .    .    .

`X`, `Y`	`scale`	Subroutine
Short-precision complex	Short-precision real	PSCFT2
Long-precision complex	Long-precision real	PDCFT2

`X`, `scale`	`Y`	Subroutine
Short-precision real	Short-precision complex	PSRCFT2
Long-precision real	Long-precision complex	PDRCFT2

`X`	`Y`, `scale`	Subroutine
Short-precision complex	Short-precision real	PSCRFT2
Long-precision complex	Long-precision real	PDCRFT2