Application work vs. MPI communication

The parallel Gauss elimination program was executed on a 300 by 300 symmetric matrix using 3, 4 and 5 processors in turn to compare performance. Optimizing a parallel program means speeding up the computation, which requires balancing the number of processors involved (more processors share the work) against the communication time consumed (fewer processors communicate less). The following data show that, for a 300 by 300 symmetric matrix, 4 processors give satisfactory results, whereas the computation slows down if 5 processors are used, since much more time is spent on MPI communication. We expect more processors to become worthwhile if the matrix size is enlarged.

Case I: 3 processors

Program running time (in seconds):


processor 0 = 0.3388819695,
processor 1 = 0.3543879986,
processor 2 = 0.3539350033.


The Activity Chart display window: With the default pie chart display you can recognize load imbalance in the traced program at a glance by comparing the time each process spends in each activity. VAMPIR can visualize the same trace data in different chart modes; depending on the user's preference, the activity chart can be switched to the so-called "Histogram" mode. To focus on a single activity, for instance MPI, open the context menu with a right mouse click inside the Activity Chart and select the activity MPI from the Display menu cascade. A new set of pie charts is drawn, showing only the symbols of the selected activity MPI. In this way you can compare individual activities or symbols across all processes.
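The quantity behind each pie chart is the share of a process's time spent in each activity. The sketch below computes it in Python from a hand-made list of (process, activity, start, end) intervals. The records, their values and the function name are hypothetical; they are not VAMPIR's trace format or API, and they are not taken from the measured runs.

```python
from collections import defaultdict

# Hypothetical trace records: (process, activity, start_s, end_s).
# Illustrative values only, not taken from the measured runs.
trace = [
    (0, "Application", 0.00, 0.20), (0, "MPI", 0.20, 0.30),
    (1, "Application", 0.00, 0.10), (1, "MPI", 0.10, 0.30),
]

def activity_shares(records):
    """Return {process: {activity: fraction of that process's traced time}}."""
    totals = defaultdict(lambda: defaultdict(float))
    for proc, activity, start, end in records:
        totals[proc][activity] += end - start
    shares = {}
    for proc, per_activity in totals.items():
        whole = sum(per_activity.values())
        shares[proc] = {a: t / whole for a, t in per_activity.items()}
    return shares

shares = activity_shares(trace)
for proc in sorted(shares):
    print(proc, {a: round(f, 2) for a, f in shares[proc].items()})
```

In this made-up example process 1 spends about two thirds of its time in MPI while process 0 spends about one third, which is the kind of imbalance the pie charts make visible.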



The Global Timeline display window: For each process, the display shows the different states and their changes over execution time along a horizontal time axis. Messages between processes are drawn as lines connecting the sending and receiving processes. By default, the timeline view shows the whole execution trace. Even with small traces this leads to a cluttered display like that shown in the figure below. To concentrate on a particular part of the trace file, use the VAMPIR zooming function: move the mouse pointer to the start of the interval you want to zoom into, press the left mouse button, and drag the mouse to the end of the interval while keeping the button down (only the x-coordinate matters). VAMPIR indicates the marked region with rubber bands. Finally, release the mouse button. The timeline display is redrawn showing just the selected time interval, with the contents magnified accordingly. You can zoom in repeatedly to arbitrary levels of detail. Zooming out step by step is done with the Undo Zoom function of the context menu. The Global Displays/Timeline view is the central display of VAMPIR because all other global and process-specific statistics displays can be configured to use only the portion of time that the timeline view displays. This option is enabled for a single display by selecting Timeline from the appropriate context menu.
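Restricting a statistics display to the zoomed interval amounts to clipping every traced interval to the selected time window before summing. The following Python sketch illustrates the idea on hypothetical records; the data and the helper names `clip_to_window` and `mpi_time` are assumptions made for illustration and do not reflect how VAMPIR is implemented.

```python
# Hypothetical trace records: (process, activity, start_s, end_s).
# Illustrative values only, not taken from the measured runs.
trace = [
    (0, "Application", 0.0, 0.2), (0, "MPI", 0.2, 0.3),
    (1, "Application", 0.0, 0.1), (1, "MPI", 0.1, 0.3),
]

def clip_to_window(records, t0, t1):
    """Keep only the part of each interval that lies inside [t0, t1]."""
    clipped = []
    for proc, activity, start, end in records:
        lo, hi = max(start, t0), min(end, t1)
        if lo < hi:  # drop intervals that fall entirely outside the window
            clipped.append((proc, activity, lo, hi))
    return clipped

def mpi_time(records):
    """Total time per process spent in the activity named "MPI"."""
    out = {}
    for proc, activity, start, end in records:
        if activity == "MPI":
            out[proc] = out.get(proc, 0.0) + (end - start)
    return out

zoomed = clip_to_window(trace, 0.15, 0.25)  # the "zoomed" interval
print("whole trace:", mpi_time(trace))
print("zoomed view:", mpi_time(zoomed))
```

Statistics computed from `zoomed` cover only the selected interval, in the same way a display switched to Timeline mode follows the zoom.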



Case II: 4 processors

Program running time (in seconds):


processor 0 = 0.2572540045,
processor 1 = 0.2717200518,
processor 2 = 0.2717679739,
processor 3 = 0.2718409300.


The Activity Chart display window:



The Global Timeline display window:



Case III: 5 processors

Program running time (in seconds):


processor 0 = 1.801327109,
processor 1 = 1.814256072,
processor 2 = 1.823287964,
processor 3 = 1.821319938,
processor 4 = 1.822461009.
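The three sets of timings can be compared with a few lines of Python. This is a minimal sketch: the numbers are copied from Cases I to III, the variable names are illustrative, and the slowest processor is taken as the running time of each case (an assumption, since a run finishes only when its last process does). No single-processor run was measured, so the 3-processor case serves as the baseline.

```python
# Per-processor running times in seconds, copied from Cases I-III.
timings = {
    3: [0.3388819695, 0.3543879986, 0.3539350033],
    4: [0.2572540045, 0.2717200518, 0.2717679739, 0.2718409300],
    5: [1.801327109, 1.814256072, 1.823287964, 1.821319938, 1.822461009],
}

# Assumption: a run is as slow as its slowest processor.
wall = {p: max(ts) for p, ts in timings.items()}

baseline = wall[3]  # no 1-processor measurement exists, so compare against 3
for p in sorted(wall):
    ratio = baseline / wall[p]
    print(f"{p} processors: {wall[p]:.4f} s, {ratio:.2f}x relative to 3 processors")
```

With these numbers, 4 processors come out roughly 1.3 times faster than 3, while 5 processors are about five times slower.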


The Activity Chart display window:



The Global Timeline display window: