Application work vs. MPI communication
To compare performance, the parallel Gauss elimination program was executed on a 300 by 300 symmetric matrix with 3, 4, and 5 processors, respectively. The optimization task for a parallel program is to speed up the computation, which requires balancing the work assigned to each processor (which favors more processors) against the communication time consumed (which favors fewer processors). From the following data we can see that for a 300 by 300 symmetric matrix, 4 processors give satisfactory results, whereas the computation is slowed down if 5 processors are used, since much more time is spent on MPI communication. It is clear that more processors should be used if the matrix size is enlarged.
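The per-processor running times listed below separate total time from time spent in MPI communication. As a rough illustration of how such measurements can be obtained, the following is a minimal sketch using MPI_Wtime, assuming a row-cyclic distribution of the matrix and a broadcast of each pivot row; the actual program's partitioning scheme is not specified in this section, so these details are illustrative only.

/* Minimal sketch: separate total running time from MPI communication time
 * with MPI_Wtime. Assumptions (not from the original program): each
 * processor holds a full copy of the matrix, owns the rows r with
 * r % size == rank, and the pivot row is broadcast at every step. */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define N 300   /* matrix order used in the experiments */

int main(int argc, char **argv)
{
    int rank, size;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &size);

    /* Simple diagonally dominant symmetric test matrix (illustrative). */
    double *a = malloc((size_t)N * N * sizeof(double));
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            a[i * N + j] = (i == j) ? (double)N : 1.0;

    double t_total = MPI_Wtime();
    double t_comm  = 0.0;

    for (int k = 0; k < N - 1; k++) {
        int owner = k % size;

        /* MPI communication: broadcast the pivot row to all processors. */
        double t0 = MPI_Wtime();
        MPI_Bcast(&a[k * N + k], N - k, MPI_DOUBLE, owner, MPI_COMM_WORLD);
        t_comm += MPI_Wtime() - t0;

        /* Application work: eliminate the rows this processor owns. */
        for (int i = k + 1; i < N; i++) {
            if (i % size != rank) continue;
            double m = a[i * N + k] / a[k * N + k];
            for (int j = k; j < N; j++)
                a[i * N + j] -= m * a[k * N + j];
        }
    }

    t_total = MPI_Wtime() - t_total;
    printf("processor %d: total %.10f s, MPI communication %.10f s\n",
           rank, t_total, t_comm);

    free(a);
    MPI_Finalize();
    return 0;
}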
Case I: 3 processors
Program running time (in seconds):
processor 0 = 0.3388819695,
processor 1 = 0.3543879986,
processor 2 = 0.3539350033.
The overall display window:
The communication display window:
Case II: 4 processors
Program running time (in seconds):
processor 0 = 0.2572540045,
processor 1 = 0.2717200518,
processor 2 = 0.2717679739,
processor 3 = 0.2718409300.
The overall display window:
The communication display window:
Case III: 5 processors
Program running time (in seconds):
processor 0 = 1.801327109,
processor 1 = 1.814256072,
processor 2 = 1.823287964,
processor 3 = 1.821319938,
processor 4 = 1.822461009.
The overall display window:
The communication display window: