To analyze the correctness and performance of a program, information about the dynamic behavior of all participating processes is needed. This behavior can be modeled as a stream of events, each carrying the attributes required for later analysis. Based on this idea, KOJAK, a trace-based toolkit for performance analysis, records and analyzes the activities of MPI-1 point-to-point and collective communication.
On platforms with special hardware providing efficient remote memory access (RMA) support, one-sided communication is often made available to the programmer in the form of libraries, for example SHMEM (Cray) or LAPI (IBM). However, these libraries are typically platform-specific or at least vendor-specific; the exception is SHMEM, which is offered by a group of vendors. To support RMA hardware in a portable way, MPI-2 introduced a standardized interface for remote memory access. The potential performance gains, however, come at the expense of more complex semantics: from a programmer's point of view, a data transfer is completed only after a sequence of communication and associated synchronization calls.
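The completion rule stated above can be illustrated with a small, purely hypothetical sketch. It is not KOJAK's event model or trace format. The `Event` record, its fields, and the `completion_times` helper are assumptions made for illustration. The sketch also simplifies the semantics: it considers only fence-style synchronization and only completion at the origin process, treating a put as complete at the next fence that the same process issues on the same window.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Event:
    """One record in a hypothetical event stream (not KOJAK's format)."""
    time: float    # timestamp of the event
    process: int   # rank of the process that generated the event
    kind: str      # "put" or "fence" in this toy model
    window: int    # hypothetical identifier of the RMA window


def completion_times(
    events: List[Event],
) -> Tuple[List[Tuple[Event, float]], List[Event]]:
    """Pair each put with the time of the next fence on the same
    (process, window); also return puts with no closing fence."""
    pending: Dict[Tuple[int, int], List[Event]] = {}
    completed: List[Tuple[Event, float]] = []
    for ev in sorted(events, key=lambda e: e.time):
        key = (ev.process, ev.window)
        if ev.kind == "put":
            # The transfer has been issued but is not yet complete.
            pending.setdefault(key, []).append(ev)
        elif ev.kind == "fence":
            # The synchronization call completes all puts pending so far.
            for put in pending.pop(key, []):
                completed.append((put, ev.time))
    unfinished = [put for puts in pending.values() for put in puts]
    return completed, unfinished


# Example trace for one process and one window.
trace = [
    Event(0.0, 0, "fence", 1),
    Event(1.0, 0, "put", 1),
    Event(2.0, 0, "put", 1),
    Event(5.0, 0, "fence", 1),
    Event(6.0, 0, "put", 1),   # no fence follows, so it stays unfinished
]
completed, unfinished = completion_times(trace)
```

In this trace both puts issued between the two fences are attributed to the closing fence, which is the kind of relationship between communication and synchronization that a trace analysis has to recover from the event stream.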
We describe the integration of performance measurement and analysis methods for RMA communication into the KOJAK toolkit. Special emphasis is placed on the underlying event model used to represent the dynamic behavior of MPI-2 RMA operations. We show that our model reflects the relationships between communication and synchronization more accurately than existing models. In addition, the model is general enough to also cover alternative but simpler RMA interfaces, such as SHMEM and Co-Array Fortran.