XL Fortran for AIX 8.1

Language Reference

+-------------------------------IBM Extension--------------------------------+


Chapter 16. OpenMP Execution Environment Routines and Lock Routines

The OpenMP specification provides a number of routines that allow you to control and query the parallel execution environment.

Parallel threads created by the run-time environment through the OpenMP interface are considered independent of the threads you create and control using calls to the Fortran Pthreads library module. References within the following descriptions to "serial portions of the program" refer to portions of the program that are executed by only one of the threads that have been created by the run-time environment. For example, you can create multiple threads by using f_pthread_create. However, if you then call omp_get_num_threads from outside of an OpenMP parallel block, or from within a serialized nested parallel region, the function will return 1, regardless of the number of threads that are currently executing.

The OpenMP execution environment routines are:
  • omp_get_dynamic
  • omp_get_max_threads
  • omp_get_nested
  • omp_get_num_procs
  • omp_get_num_threads
  • omp_get_thread_num
  • omp_in_parallel
  • omp_set_dynamic
  • omp_set_nested
  • omp_set_num_threads

Included in the OpenMP run-time library are two routines that support a portable wall-clock timer. The OpenMP timing routines are:
  • omp_get_wtick
  • omp_get_wtime

The OpenMP run-time library also supports a set of simple and nestable lock routines. You must manipulate lock variables only through these routines. A simple lock may not be set if it is already in a locked state. Simple lock variables are associated with simple locks and may only be passed to simple lock routines. A nestable lock may be set multiple times by the same thread. Nestable lock variables are associated with nestable locks and may only be passed to nestable lock routines.

For all the routines listed below, the lock variable is an integer whose KIND type parameter is denoted by the symbolic constant omp_lock_kind or omp_nest_lock_kind. Note that these kind type parameters are defined in the omp_lib module; they are not intrinsic data types.

This variable is sized according to the compilation mode: the kind value is 4 for 32-bit applications, and 8 for 64-bit applications, both Large Data Type (LDT) and non-LDT.

OpenMP provides the following simple lock routines:
  • omp_init_lock
  • omp_destroy_lock
  • omp_set_lock
  • omp_unset_lock
  • omp_test_lock

OpenMP provides the following nestable lock routines:
  • omp_init_nest_lock
  • omp_destroy_nest_lock
  • omp_set_nest_lock
  • omp_unset_nest_lock
  • omp_test_nest_lock

Note:
You can define and implement your own versions of the OpenMP routines. However, by default, the compiler will substitute the XL Fortran versions of the OpenMP routines regardless of the existence of other implementations, unless you specify the -qnoswapomp compiler option. For more information, see the User's Guide.

Table 17. OpenMP Execution Environment Routines, Lock Routines and Timing Routines

Procedures
omp_destroy_lock

This subroutine disassociates a given lock variable from all locks. Before using a variable again as a lock variable after it has been destroyed with a call to omp_destroy_lock, you must reinitialize it with omp_init_lock.

Note:
If you call omp_destroy_lock with a lock variable that has not been initialized, the result of the call is undefined.

Format/ Example

      USE omp_lib
      INTEGER(kind=omp_lock_kind) LOCK
      CALL omp_destroy_lock(LOCK)

For an example of how to use omp_destroy_lock, see omp_init_lock.

omp_destroy_nest_lock

This subroutine uninitializes a nestable lock variable, causing the lock variable to become undefined. The variable nvar must be an unlocked and initialized nestable lock variable.

Note:
If you call omp_destroy_nest_lock using a variable that is not initialized, the result is undefined.

Format/ Example

      USE omp_lib
      INTEGER(kind=omp_nest_lock_kind) LOCK
      CALL omp_destroy_nest_lock(LOCK)
omp_get_dynamic

The omp_get_dynamic function returns .TRUE. if dynamic thread adjustment by the run-time environment is enabled, and .FALSE. otherwise.

Format/ Example

      USE omp_lib
      LOGICAL LVAR
      LVAR = omp_get_dynamic()
omp_get_max_threads

This function returns the maximum number of threads that can execute concurrently in a single parallel region. The return value is equal to the maximum value that can be returned by the omp_get_num_threads function. If you use omp_set_num_threads to change the number of threads, subsequent calls to omp_get_max_threads will return the new value.

The function has global scope, which means that the maximum value it returns applies to all functions, subroutines, and compilation units in the program. It returns the same value whether executing from a serial or parallel region.

You can use omp_get_max_threads to allocate maximum-sized data structures for each thread when you have enabled dynamic thread adjustment by passing omp_set_dynamic an argument that evaluates to .TRUE.

Format/ Example

      USE omp_lib
      INTEGER MAX_THREADS
      MAX_THREADS = omp_get_max_threads()
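
As noted above, one use of omp_get_max_threads is to allocate data structures sized to the largest possible team. The following sketch is illustrative only (the array PARTIAL and the work stored in it are not part of the interface description); it allocates one element per possible thread, which remains safe even if dynamic thread adjustment later runs the region with fewer threads:

      USE omp_lib
      INTEGER MAX_THREADS
      REAL, ALLOCATABLE :: PARTIAL(:)
!     One slot per possible thread; indices match omp_get_thread_num()
      MAX_THREADS = omp_get_max_threads()
      ALLOCATE(PARTIAL(0:MAX_THREADS - 1))
      PARTIAL = 0.0
!$OMP PARALLEL
      PARTIAL(omp_get_thread_num()) = 1.0
!$OMP END PARALLEL
      DEALLOCATE(PARTIAL)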
omp_get_nested

The omp_get_nested function returns .TRUE. if nested parallelism is enabled and .FALSE. if nested parallelism is disabled.

Format/ Example

      USE omp_lib
      LOGICAL LVAR
      LVAR = omp_get_nested()
omp_get_num_procs

The omp_get_num_procs function returns the number of online processors on the machine.

Format/ Example

      USE omp_lib
      INTEGER NUM_PROCS
      NUM_PROCS = omp_get_num_procs()
omp_get_num_threads

The omp_get_num_threads function returns the number of threads in the team currently executing the parallel region from which it is called. The function binds to the closest enclosing PARALLEL directive.

The omp_set_num_threads subroutine and the OMP_NUM_THREADS environment variable control the number of threads in a team. If you do not explicitly set the number of threads, the run-time environment will use the number of online processors on the machine by default.

If you call omp_get_num_threads from a serial portion of your program or from a nested parallel region that is serialized, the function returns 1.

Format/ Example

      USE omp_lib
      INTEGER N1, N2
 
      N1 = omp_get_num_threads()
      PRINT *, N1
!$OMP PARALLEL PRIVATE(N2)
      N2 = omp_get_num_threads()
      PRINT *, N2
!$OMP END PARALLEL

The omp_get_num_threads call returns 1 in the serial section of the code, so N1 is assigned the value 1. N2 is assigned the number of threads in the team executing the parallel region, so each thread's PRINT statement outputs a number less than or equal to the value returned by omp_get_max_threads.

omp_get_thread_num

This function returns the number of the currently executing thread within the team. The number returned will always be between 0 and the number of threads in the team minus one. The master thread of the team returns a value of 0.

If you call omp_get_thread_num from within a serial region, from within a serialized nested parallel region, or from outside the dynamic extent of any parallel region, this function will return a value of 0.

This function binds to the closest PARALLEL, PARALLEL DO, or PARALLEL SECTIONS directive that encloses it.

Format/ Example

      USE omp_lib
      INTEGER NP
 
!$OMP PARALLEL PRIVATE(NP)
      NP = omp_get_thread_num()
      CALL WORK(NP)
!$OMP MASTER
      NP = omp_get_thread_num()
      CALL WORK(NP)
!$OMP END MASTER
!$OMP END PARALLEL
      END
 
      SUBROUTINE WORK(THD_NUM)
      INTEGER THD_NUM
      PRINT *, THD_NUM
      END
omp_get_wtick

The omp_get_wtick function returns a double precision value equal to the number of seconds between consecutive clock ticks.

Format/ Example

      USE omp_lib
      DOUBLE PRECISION WTICKS
      WTICKS = omp_get_wtick()
      PRINT *, 'The clock ticks ', 10 / WTICKS, &
      ' times in 10 seconds.'
 
omp_get_wtime

The omp_get_wtime function returns a double precision value equal to the number of seconds of elapsed wall-clock time since the initial value of the operating system real-time clock. This starting point is guaranteed not to change during execution of the program.

The value returned by the omp_get_wtime function is not consistent across all threads in the team.

Format/ Example

      USE omp_lib
      DOUBLE PRECISION START, END
      START = omp_get_wtime()
!     Work to be timed
      END = omp_get_wtime()
      PRINT *, 'Stuff took ', END - START, ' seconds.'
 
omp_in_parallel

The omp_in_parallel function returns .TRUE. if you call it from the dynamic extent of a region executing in parallel and returns .FALSE. otherwise. If you call omp_in_parallel from a region that is serialized but nested within the dynamic extent of a region executing in parallel, the function will still return .TRUE.. (Nested parallel regions are serialized by default. See omp_set_nested and the environment variable OMP_NESTED in XL Fortran for AIX User's Guide, Chapter 4, for more information.)

Format/ Example

      USE omp_lib
      INTEGER N, M
      N = 4
      M = 3
      PRINT*, omp_in_parallel()
!$OMP PARALLEL DO
      DO I = 1,N
!$OMP   PARALLEL DO
        DO J=1, M
          PRINT *, omp_in_parallel()
        END DO
!$OMP   END PARALLEL DO
      END DO
!$OMP END PARALLEL DO

The first call to omp_in_parallel returns .FALSE. because the call is outside the dynamic extent of any parallel region. The second call returns .TRUE., even if the nested PARALLEL DO loop is serialized, because the call is still inside the dynamic extent of the outer PARALLEL DO loop.

omp_init_lock

The omp_init_lock subroutine initializes a lock and associates it with the lock variable passed in as a parameter. After the call to omp_init_lock, the initial state of the lock variable is unlocked.

Note:
If you call this routine with a lock variable that you have already initialized, the result of the call is undefined.

Format/ Example

      USE omp_lib
      INTEGER(kind=omp_lock_kind) LCK
      INTEGER ID
      CALL omp_init_lock(LCK)
!$OMP PARALLEL SHARED(LCK), PRIVATE(ID)
      ID = omp_get_thread_num()
      CALL omp_set_lock(LCK)
      PRINT *,'MY THREAD ID IS', ID
      CALL omp_unset_lock(LCK)
!$OMP END PARALLEL
      CALL omp_destroy_lock(LCK)

In the above example, one at a time, the threads gain ownership of the lock associated with the lock variable LCK, print the thread ID, and release ownership of the lock.

omp_init_nest_lock

The omp_init_nest_lock subroutine allows you to initialize a nestable lock and associate it with the lock variable you specify. The initial state of the lock variable is unlocked, and the initial nesting count is zero. The value of nvar must be an uninitialized nestable lock variable.

Note:
If you call omp_init_nest_lock using a variable that is already initialized, the result is undefined.

Format/ Example

      USE omp_lib
      INTEGER(kind=omp_nest_lock_kind) LCK
      CALL omp_init_nest_lock(LCK)

For an example of how to use omp_init_nest_lock, see omp_set_nest_lock.

omp_set_dynamic

The omp_set_dynamic subroutine enables or disables dynamic adjustment, by the run-time environment, of the number of threads available to execute parallel regions.

If you call omp_set_dynamic with a scalar_logical_expression that evaluates to .TRUE., the run-time environment can automatically adjust the number of threads that are used to execute subsequent parallel regions to obtain the best use of system resources. The number of threads you specify using omp_set_num_threads becomes the maximum, not exact, thread count.

If you call the subroutine with a scalar_logical_expression which evaluates to .FALSE., dynamic adjustment of the number of threads is disabled. The run-time environment cannot automatically adjust the number of threads used to execute subsequent parallel regions. The value you pass to omp_set_num_threads becomes the exact thread count.

By default, dynamic thread adjustment is enabled. If your code depends on a specific number of threads for correct execution, you should explicitly disable dynamic threads.

Note:
The number of threads remains fixed for each parallel region. The omp_get_num_threads function returns that number.

This subroutine has precedence over the OMP_DYNAMIC environment variable.

Format/ Example

      USE omp_lib
      LOGICAL LVAR
      CALL omp_set_dynamic(LVAR)
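
If a section of code requires an exact thread count for correctness, you can disable dynamic adjustment before requesting the count. A minimal sketch (the count of 4 is arbitrary):

      USE omp_lib
!     With dynamic adjustment disabled, the requested number of
!     threads becomes the exact team size for the next region.
      CALL omp_set_dynamic(.FALSE.)
      CALL omp_set_num_threads(4)
!$OMP PARALLEL
!$OMP MASTER
      PRINT *, omp_get_num_threads()
!$OMP END MASTER
!$OMP END PARALLEL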
omp_set_lock

The omp_set_lock subroutine forces the calling thread to wait until the specified lock is available before executing subsequent instructions. The calling thread is given ownership of the lock when it becomes available.

Note:
If you call this routine with a lock variable that has not been initialized, the result of the call is undefined. Also, if a thread that owns a lock tries to lock it again by issuing a call to omp_set_lock, it will produce a deadlock.

Format/ Example

      USE omp_lib
      INTEGER(kind=omp_lock_kind) LCK_X
      CALL omp_init_lock (LCK_X)
!$OMP PARALLEL PRIVATE (I), SHARED (A, X)
!$OMP DO
      DO I = 3, 100
        A(I) = I * 10
        CALL omp_set_lock (LCK_X)
        X = X + A(I)
        CALL omp_unset_lock (LCK_X)
      END DO
!$OMP END DO
!$OMP END PARALLEL
      CALL omp_destroy_lock (LCK_X)

In this example, the lock variable LCK_X is used to avoid race conditions when updating the shared variable X. By setting the lock before each update to X and unsetting it after the update, you ensure that only one thread is updating X at a given time.

omp_set_nested

The omp_set_nested subroutine enables or disables nested parallelism.

If you call the subroutine with a scalar_logical_expression that evaluates to .FALSE., nested parallelism is disabled. Nested parallel regions are serialized, and they are executed by the current thread. This is the default setting.

If you call the subroutine with a scalar_logical_expression that evaluates to .TRUE., nested parallelism is enabled. Parallel regions that are nested can deploy additional threads to the team. It is up to the run-time environment to determine whether additional threads should be deployed. Therefore, the number of threads used to execute parallel regions may vary from one nested region to the next.

This subroutine takes precedence over the OMP_NESTED environment variable.

Format/ Example

      USE omp_lib
      LOGICAL LVAR
      CALL omp_set_nested(LVAR)
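
For illustration, the following sketch enables nested parallelism and prints the team size inside an inner region. Whether the run-time environment actually deploys additional threads for the inner region is up to the implementation, so the output may vary:

      USE omp_lib
      CALL omp_set_nested(.TRUE.)
!$OMP PARALLEL
!$OMP PARALLEL
!     Each thread of the outer team may now start an inner team;
!     the inner team size is chosen by the run-time environment.
      PRINT *, omp_get_num_threads()
!$OMP END PARALLEL
!$OMP END PARALLEL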
omp_set_nest_lock

The omp_set_nest_lock subroutine allows you to set a nestable lock. The thread executing the subroutine will wait until a lock becomes available and then set that lock, incrementing the nesting count. A nestable lock is available if it is owned by the thread executing the subroutine, or is unlocked.

Format/ Example

      USE omp_lib
      INTEGER P
      INTEGER A
      INTEGER B
      INTEGER(kind=omp_nest_lock_kind) LCK
 
      CALL omp_init_nest_lock(LCK)
 
!$OMP PARALLEL SECTIONS
!$OMP SECTION
      CALL omp_set_nest_lock(LCK)
      P = P + A
      CALL omp_set_nest_lock(LCK)
      P = P + B
      CALL omp_unset_nest_lock(LCK)
      CALL omp_unset_nest_lock(LCK)
!$OMP SECTION
      CALL omp_set_nest_lock(LCK)
      P = P + B
      CALL omp_unset_nest_lock(LCK)
!$OMP END PARALLEL SECTIONS
 
      CALL omp_destroy_nest_lock(LCK)
omp_set_num_threads

The omp_set_num_threads subroutine tells the run-time environment how many threads to use in the next parallel region. The scalar_integer_expression that you pass to the subroutine is evaluated, and its value is used as the number of threads. If you have enabled dynamic adjustment of the number of threads (see omp_set_dynamic), omp_set_num_threads sets the maximum number of threads to use for the next parallel region. The run-time environment then determines the exact number of threads to use. However, when dynamic adjustment of the number of threads is disabled, omp_set_num_threads sets the exact number of threads to use in the next parallel region.

This subroutine takes precedence over the OMP_NUM_THREADS environment variable.

Note:
If you call this subroutine from the dynamic extent of a region executing in parallel, the behavior of the subroutine is undefined.

Format/ Example

      USE omp_lib
      INTEGER(8) NUM_THREADS
      CALL omp_set_num_threads(NUM_THREADS)
omp_test_lock

The omp_test_lock function attempts to set the lock associated with the specified lock variable. It returns .TRUE. if it was able to set the lock and .FALSE. otherwise. In either case, the calling thread will continue to execute subsequent instructions in the program.

Note:
If you call omp_test_lock with a lock variable that has not yet been initialized, the result of the call is undefined.

Format/ Example

      USE omp_lib
      INTEGER(kind=omp_lock_kind) LCK
      INTEGER ID
      CALL omp_init_lock (LCK)
!$OMP PARALLEL SHARED(LCK), PRIVATE(ID)
      ID = omp_get_thread_num()
      DO WHILE (.NOT. omp_test_lock(LCK))
        CALL WORK_A (ID)
      END DO
      CALL WORK_B (ID)
!$OMP END PARALLEL
      CALL omp_destroy_lock (LCK)

In this example, a thread repeatedly executes WORK_A until it can set LCK, the lock variable. When it succeeds in setting the lock variable, it executes WORK_B.

omp_test_nest_lock

The omp_test_nest_lock function attempts to set a nestable lock in the same way as omp_set_nest_lock, except that the executing thread does not wait for the lock to become available. If the lock is successfully set, the function increments the nesting count and returns the new nesting count. If the lock is unavailable, the function returns a value of zero. The result value is always a default integer.

Format/ Example

      USE omp_lib
      INTEGER(kind=omp_nest_lock_kind) LOCK
      INTEGER NCOUNT
      NCOUNT = omp_test_nest_lock(LOCK)
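
Because omp_test_nest_lock returns the new nesting count on success and zero on failure, a typical use tests the result before entering the protected code. The following is a sketch only; the work performed while holding the lock is omitted:

      USE omp_lib
      INTEGER(kind=omp_nest_lock_kind) LCK
      INTEGER NCOUNT
      CALL omp_init_nest_lock(LCK)
!$OMP PARALLEL PRIVATE(NCOUNT)
      NCOUNT = omp_test_nest_lock(LCK)
      IF (NCOUNT .GT. 0) THEN
!       The lock was acquired; NCOUNT holds the new nesting count.
        CALL omp_unset_nest_lock(LCK)
      END IF
!$OMP END PARALLEL
      CALL omp_destroy_nest_lock(LCK)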
omp_unset_lock

This subroutine causes the executing thread to release ownership of the specified lock. The lock can then be set by another thread as required.

Note:
The behavior of the omp_unset_lock subroutine is undefined if either:
  • The calling thread does not own the lock specified, or
  • The routine is called with a lock variable that has not been initialized.

Format/ Example

      USE omp_lib
      INTEGER(kind=omp_lock_kind) LOCK
      CALL omp_unset_lock(LOCK)

For an example of how to use omp_unset_lock, see omp_set_lock.

omp_unset_nest_lock

The omp_unset_nest_lock subroutine allows you to release ownership of a nestable lock. The subroutine decrements the nesting count and releases the associated thread from ownership of the nestable lock.

Format/ Example

      USE omp_lib
      INTEGER(kind=omp_nest_lock_kind) LOCK
      CALL omp_unset_nest_lock(LOCK)

For an example of how to use omp_unset_nest_lock, see omp_set_nest_lock.

+----------------------------End of IBM Extension----------------------------+

