IBM Books

Operation and Use, Volume 2, Part 2 Profiling


Appendix C. Profiling Programs with the AIX prof and gprof Commands

The difference between profiling serial and parallel applications with the AIX profilers is that a serial application produces a single profile data file when it runs, while a parallel application produces one profile data file for each task.

You request parallel profiling by specifying the -p or -pg compiler flag, as you would for a serial compilation. The parallel profiling capability of PE creates a monitor output file for each task. The files are created in the current directory and are named mon.out.taskid or gmon.out.taskid, where taskid is a number between 0 and one less than the number of tasks.
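The naming rule above can be sketched in a few lines of shell. This is an illustration only: the function name expected_profile_files is hypothetical and is not part of PE or AIX. It prints the data file names you would expect to find in the current directory after a run with a given number of tasks.

```shell
# Hypothetical helper (not part of PE or AIX): print the profile data file
# names expected after a parallel run.
# $1 = base name (mon.out for -p, gmon.out for -pg)
# $2 = number of tasks in the run
expected_profile_files() {
    base=$1
    ntasks=$2
    taskid=0
    # Task identifiers run from 0 to one less than the number of tasks.
    while [ "$taskid" -lt "$ntasks" ]; do
        printf '%s.%s\n' "$base" "$taskid"
        taskid=$((taskid + 1))
    done
}

expected_profile_files gmon.out 4
```

For a four-task run compiled with -pg, the call shown lists gmon.out.0 through gmon.out.3, one name per line.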

In the traditional method of profiling on the AIX operating system, you compile a serial application and run it to produce a single profile data file, which you then process using either the prof or gprof command. With a parallel application, you compile and run it to produce a profile data file for each parallel task. You can then process one, some, or all of the data files produced using either the prof or gprof command. The following steps describe how to profile parallel programs. For comparison, each step gives the procedure for a serial program first, followed by the procedure for a parallel program.

Step 1

To profile a serial program: Compile the application source code using the cc command with either the -p or -pg flag.

To profile a parallel program: Compile the application source code using the command mpcc (for C programs), mpCC (for C++ programs), or mpxlf (for Fortran programs) as described in IBM Parallel Environment for AIX: Operation and Use, Volume 1, Using the Parallel Operating Environment. Use one of the standard profiling compiler options, either -p or -pg, on the compiler command. For more information on the -p and -pg compiler options, refer to their use on the cc command as described in IBM AIX Version 4 Commands Reference and IBM AIX Version 4 General Programming Concepts: Writing and Debugging Programs.

Step 2

To profile a serial program: Run the executable program to produce a profile data file. If you compiled the source code with the -p option, the data file produced is named mon.out. If you compiled the source code with the -pg option, the data file produced is named gmon.out.

To profile a parallel program: Before you run the parallel program, set the environment variable MP_EUILIBPATH=/usr/lpp/ppe.poe/lib/profiled:/usr/lib/profiled:/lib/profiled:/usr/lpp/ppe.poe/lib. If your message passing library is not in /usr/lpp/ppe.poe/lib, substitute your message passing library path. Run the parallel program. When the program ends, it generates a profile data file for each parallel task. The system gives the data files unique names by appending each task's identifying number to mon.out or gmon.out. If you compiled the source code with the -p option, the data files produced take the form:
     mon.out.taskid

If the source code has been compiled with the -pg option, the data files produced take the form:

     gmon.out.taskid

Note: The current directory must be writable from all remote nodes. Otherwise, you must move the profile data files manually to the home node for analysis with prof and gprof. You can also use the mcpgath command to move the files. See IBM Parallel Environment for AIX: Operation and Use, Volume 1, Using the Parallel Operating Environment for more about mcpgath.
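The environment setup for the parallel run in Step 2 might look like the following sketch. The function name profiled_euilibpath and the variable msg_lib_dir are hypothetical names introduced for illustration. The directory list is the one given in Step 2, and the optional argument stands in for your own message passing library path if it is not /usr/lpp/ppe.poe/lib.

```shell
# Hypothetical helper: build the MP_EUILIBPATH value for a profiled run.
# $1 (optional) = directory that holds your message passing library;
#                 defaults to /usr/lpp/ppe.poe/lib, the location named in Step 2.
profiled_euilibpath() {
    msg_lib_dir=${1:-/usr/lpp/ppe.poe/lib}
    # The profiled library directories come first, then the message passing library.
    printf '%s:%s\n' \
        "/usr/lpp/ppe.poe/lib/profiled:/usr/lib/profiled:/lib/profiled" \
        "$msg_lib_dir"
}

MP_EUILIBPATH=$(profiled_euilibpath)
export MP_EUILIBPATH
# Run the parallel program after this point, as you normally would.
```

Building the value in one place keeps the colon-separated list free of stray spaces, which would otherwise become part of a directory name.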

Step 3

To profile a serial program: Use either the prof or the gprof command to process the profile data file. The prof command processes the mon.out data file, and the gprof command processes the gmon.out data file.

To profile a parallel program: Use either the prof or the gprof command to process the profile data files. The prof command processes the mon.out data files, and the gprof command processes the gmon.out data files. You can process one, some, or all of the data files created during the run. You must specify the name(s) of the profile data file(s) to read, however, because by default the prof and gprof commands read only mon.out or gmon.out. On the prof command, use the -m flag to specify the name(s) of the profile data file(s) it should read. For example, to specify the profile data file for task 0 with the prof command:

ENTER
prof -m mon.out.0

You can also specify that the prof command should take profile data from some or all of the profile data files produced. For example, to specify three different profile data files - the ones associated with tasks 0, 1, and 2 - on the prof command:

ENTER
prof -m mon.out.0 mon.out.1 mon.out.2
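For a larger range of tasks, typing each file name becomes tedious. The following sketch builds the same kind of prof command for a contiguous range of task numbers. The function name prof_command_for_tasks is hypothetical, and the function only prints the command so that you can inspect it before running it.

```shell
# Hypothetical helper: print (do not run) a prof command that reads the
# mon.out.<taskid> files for a contiguous range of tasks.
# $1 = first task number, $2 = last task number (inclusive)
prof_command_for_tasks() {
    first=$1
    last=$2
    cmd="prof -m"
    taskid=$first
    while [ "$taskid" -le "$last" ]; do
        # Append one data file name per task.
        cmd="$cmd mon.out.$taskid"
        taskid=$((taskid + 1))
    done
    printf '%s\n' "$cmd"
}

prof_command_for_tasks 0 2
# prints: prof -m mon.out.0 mon.out.1 mon.out.2
```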

On the gprof command, you simply specify the name(s) of the profile data file(s) it should read on the command line. You must also specify the name of the program on the gprof command, but no option flag is needed. For example, to specify the profile data file for task 0 with the gprof command:

ENTER
gprof program gmon.out.0

As with the prof command, you can also specify that the gprof command should take profile data from some or all of the profile data files produced. For example, to specify three different profile data files - the ones associated with tasks 0, 1, and 2 - on the gprof command:

ENTER
gprof program gmon.out.0 gmon.out.1 gmon.out.2
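To process every data file from a run without listing the task numbers, you can let the shell find the files. The following is a minimal sketch under the assumption that the current directory contains only the gmon.out.taskid files from one run. The function name gprof_command_for_all is hypothetical, and the function prints the command rather than running it.

```shell
# Hypothetical helper: print (do not run) a gprof command that covers every
# gmon.out.<taskid> file found in the current directory.
# $1 = name of the profiled program
gprof_command_for_all() {
    program=$1
    files=
    for f in gmon.out.*; do
        # When no data files exist the pattern is left unexpanded; skip it.
        [ -f "$f" ] || continue
        files="$files $f"
    done
    if [ -z "$files" ]; then
        echo "no gmon.out.<taskid> files found in the current directory" >&2
        return 1
    fi
    printf 'gprof %s%s\n' "$program" "$files"
}

# Example call (fails with a message if no data files are present):
#   gprof_command_for_all program
```

The shell lists the matching files in lexical order, so gmon.out.10 appears before gmon.out.2. This does not affect the result, because the output shows the sum over all the files given.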

You can also use the parallel utility mp_profile() to profile selected portions of a program. To start profiling, call mp_profile(1). To suspend profiling, call mp_profile(0). The final profile data set contains counts and CPU times for the program lines that are delimited by the start and stop calls. In C, the calls are mpc_profile(1) and mpc_profile(0). By default, profiling is active at the start of the user's executable.
Note: As with the sequential version of prof and gprof, if you specify more than one profile data file, the output of the parallel version of the prof or gprof command shows the sum of the profile information in the given files. No statistical analysis is conducted across the multiple profile files.
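Because the combined output is a sum, one way to compare tasks is to process each data file on its own. The following sketch prints one gprof command per task. The function name per_task_gprof_commands and the report file names gprof.report.taskid are hypothetical and are used here only for illustration.

```shell
# Hypothetical helper: print (do not run) one gprof command per task, so that
# each task's profile is reported separately instead of being summed.
# $1 = name of the profiled program, $2 = number of tasks in the run
per_task_gprof_commands() {
    program=$1
    ntasks=$2
    taskid=0
    while [ "$taskid" -lt "$ntasks" ]; do
        # gprof.report.<taskid> is an illustrative output file name.
        printf 'gprof %s gmon.out.%s > gprof.report.%s\n' \
            "$program" "$taskid" "$taskid"
        taskid=$((taskid + 1))
    done
}

per_task_gprof_commands program 2
```

You can then compare the per-task reports yourself, for example to look for tasks that spend noticeably more time in a given routine.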

